Data Science and Cybersecurity: The Equifax Breach

Unless you’ve been buried under graduate research or the like for the last month, you’ve probably heard about the Equifax data breach, which, according to the New York Times, ‘is one of the worst data breaches in terms of quantity and quality (how valuable the information may be).’ Since this blog is about data science, I want to start by looking at the case from that perspective.

The question we can ask is: ‘What role does the data scientist play in the prevention of a breach and/or the recovery of data?’ Like anyone in the organization who works with data (almost everyone), the data scientist has a responsibility to safeguard information to the best of their ability. I would argue that, unless the roles of cybersecurity expert and data scientist are one and the same, the data scientist in the Equifax case is not responsible for the breach.

This TechCrunch article talks about how data science can detect anomalies and play a crucial role in cybersecurity. It will be interesting to see whether Equifax discloses in the coming months if it had any automated algorithms in place to detect data breaches, or if it relied solely on keeping software such as Apache Struts (the web framework whose vulnerability was exploited) up to date. Perhaps if someone in the IT department had put this additional security measure in place, our data wouldn’t have been stolen.
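To make the idea of anomaly detection a bit more concrete, here is a minimal sketch of one simple approach: flagging days on which outbound data volume sits far above the recent average. This is purely illustrative and is not how Equifax or any particular vendor does it. The function name `flag_anomalies`, the 7-day window, the threshold of 3 standard deviations, and the traffic numbers are all assumptions I made up for the example.

```python
from statistics import mean, stdev


def flag_anomalies(daily_mb, window=7, threshold=3.0):
    """Return indices of days whose volume is more than `threshold`
    standard deviations above the mean of the preceding `window` days."""
    flagged = []
    for i in range(window, len(daily_mb)):
        history = daily_mb[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        if sigma == 0:
            # No variation in the history, so a z-score is undefined; skip.
            continue
        z = (daily_mb[i] - mu) / sigma
        if z > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily outbound traffic in MB (made-up numbers).
traffic = [102, 98, 105, 99, 101, 97, 103, 100, 950, 104]
print(flag_anomalies(traffic))  # prints [8]
```

A real system would look at many more signals than a single traffic number, but the principle is the same: learn what normal looks like, then raise a flag when something deviates sharply from it.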

To someone who was affected by the Equifax data breach, there are several disconcerting things not related to data science that are important in this conversation:

Equifax waited more than a month after discovering the breach to disclose it to the public.

Equifax isn’t disclosing exactly how many people were affected.

Not only did the cybersecurity team fail to keep the Apache Struts software patched against attack, the communications department at Equifax also failed miserably. If Equifax knew that ‘Criminals gained access to certain files in the company’s system from mid-May to July’ [1] and discovered the breach on July 29, why did they wait until September to disclose this crucial information to the public? Criminals could have been exploiting the stolen data during that window while Equifax kept quiet about it. I did a brief stint in a communications office, and rule #1 of ‘crisis communication’ was to communicate bad news as soon as the organization is aware of it. Equifax failed in their responsibility to communicate the breach in a timely fashion.

The NY Times and other outlets ran headlines stating that ‘Cyberattack May Have Affected 143 Million in the U.S.’ Does this wording mean that Equifax isn’t sure exactly how many people’s data was compromised? It is very worrisome if they don’t know the extent of the data breach. Or, if they do know whose data was taken, are they trying to cover up the extent of the problem?

Has our society done well enough in defining laws that keep up with technology and cybersecurity? The Apache Struts bug was documented as early as March 2017. I don’t think any sort of law will help prevent cybersecurity attacks, because laws are generally designed as a reactive measure after something happens. The Equifax database was an attractive target for hackers since it was one place that held a lot of sensitive information. Cybersecurity professionals should stay abreast of potential threats and have the proper tools to handle these threats before they happen. The public, Equifax and others now need to be asking the question: who’s next? By asking this question and focusing on future threats, preventative measures can be taken so this doesn’t happen again.