
Ethical AI needs to thrive in SecOps: 3 key guidelines

Stephan Jou, CTO Security Analytics, Interset, CyberRes
 

Security operations centers (SOCs) increasingly rely on network data flows as they collect telemetry from devices and monitor user behaviors. To make these massive data flows manageable, SOCs turn to rules, machine learning, and artificial (or augmented) intelligence to triage, de-duplicate, and add context to the alerts about potentially dangerous or malicious activity.

Pushing the boundaries of what machine learning can deliver when nourished by massive data has already led to significant invasions of privacy, especially when the efforts are driven by business demands. More often than not, ethics has taken a back seat when applying machine learning and AI. Companies such as Clearview AI and Cambridge Analytica have vastly overreached in their analysis of consumer data simply because they could, using that data without explicit permission and offering nothing in return.

The pushback against these abuses has fueled greater consideration of ethical issues in data collection, machine-learning models, and the pursuit of AI-augmented services. These issues are no less significant within the walled garden of an organization. Employees deserve the same consideration from the internal groups that collect data, most often security operations and human resources.

The ethical considerations of data collection, data analysis, machine-learning models, and AI analytics need to be a focus of every security operations team. With time pressures and stretched resources, security teams often pursue the simplest approach to an operations problem, but ethics should never be ignored.

Here are three key guidelines to help your company use data collection, machine learning, and AI responsibly in your security operations.

1. Responsible AI use is limited AI use

While today's technology is far from the visions of humanlike AI put forth in popular media, the combination of vast datasets and better analytical techniques does deliver significant benefits that we did not have a decade ago. Machine-learning models and well-managed data can give security operations a significant advantage in detecting attackers.

However, the technology must be used responsibly. Companies should have strict policies in place to narrow the focus of any use of employee data for machine learning. Alerts should be based on behaviors: Did a user access systems at unusual times, from an unknown location, or in an anomalous way? But no employees should be identified until enough evidence has been uncovered to suggest either that their device or identity has been compromised or that they are taking risky actions.
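For instance, a behavioral baseline can be computed per pseudonymous entity rather than per named employee. The sketch below is a minimal illustration of that idea, assuming hypothetical field names, thresholds, and a simple hour-of-day statistic; it is not a description of any particular product.

```python
from collections import defaultdict
from statistics import mean, stdev

# Historical login hours per pseudonymous entity (e.g., "entity-7f3a"),
# never per named employee.
history = defaultdict(list)

def record_login(entity_id: str, hour: int) -> None:
    """Store the hour-of-day of a successful login to build a baseline."""
    history[entity_id].append(hour)

def is_anomalous(entity_id: str, hour: int, z_threshold: float = 3.0) -> bool:
    """Flag a login whose hour deviates strongly from that entity's own baseline."""
    hours = history[entity_id]
    if len(hours) < 20:        # too little history to judge; stay quiet
        return False
    mu, sigma = mean(hours), stdev(hours)
    if sigma == 0:
        return hour != mu
    return abs(hour - mu) / sigma > z_threshold

# An alert built on is_anomalous() carries only the pseudonym and the behavior;
# resolving it to a person happens later, and only if the evidence warrants it.
```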

Similarly, the principle of data minimization is an important one for responsible AI: Collect only the minimum amount of data required to support your use case, and no more. Data that you do not store is data that cannot be breached, stolen, or compromised.
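In practice, data minimization can be enforced at ingest by allow-listing only the fields a detection use case actually needs and discarding everything else. The following sketch uses assumed field names; the point is that identity and other unneeded attributes never reach storage.

```python
# Allow-list of the fields the detection use case actually needs (assumed names).
ALLOWED_FIELDS = {"timestamp", "entity_id", "event_type", "src_ip", "dest_host"}

def minimize(event: dict) -> dict:
    """Return a copy of the event containing only allow-listed fields."""
    return {k: v for k, v in event.items() if k in ALLOWED_FIELDS}

raw_event = {
    "timestamp": "2021-06-01T03:12:44Z",
    "entity_id": "entity-7f3a",
    "event_type": "login",
    "src_ip": "203.0.113.5",
    "dest_host": "vpn-gw-2",
    "full_name": "Jane Doe",       # not needed for detection, so never stored
    "job_title": "Accountant",     # not needed for detection, so never stored
}
stored_event = minimize(raw_event)  # only the five allow-listed fields remain
```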

Increasingly, this sort of data collection will be covered by regulations, such as the EU's recently proposed regulation for ethical AI, often described as GDPR for AI. Security operations should treat violating such regulations as part of their threat model.

2. Understand your models

Companies need to understand the decision-making process of AI algorithms before they can trust their analysis. Do not make a model so complex that you cannot understand it. When purchasing technology from a vendor, make sure that its system can explain any alert it raises. Explainable AI is a trend that is critical to the development of ethical AI and to a company's ability to make an informed decision about whether the AI's recommendation is a good one.
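One way to make this concrete is to require that every alert carries the per-feature contributions that produced its score, so the analyst can see why it fired. The sketch below is an illustrative, simplified example; the feature names, weights, and threshold are assumptions.

```python
from typing import Optional

def build_alert(entity_id: str, contributions: dict, threshold: float = 0.7) -> Optional[dict]:
    """Emit an alert only when the combined score crosses the threshold,
    and always include the per-feature breakdown for the analyst."""
    score = sum(contributions.values())
    if score < threshold:
        return None
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    return {
        "entity_id": entity_id,
        "score": round(score, 2),
        "explanation": [f"{name}: +{value:.2f}" for name, value in ranked],
    }

alert = build_alert("entity-7f3a", {
    "login_at_unusual_hour": 0.45,
    "new_geolocation": 0.30,
    "rare_admin_tool_usage": 0.10,
})
# The explanation field lists exactly which behaviors pushed the score over the line.
```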

The corollary is that models that cannot be understood should either not be used at all or be used only with significant warnings to users and restrictions on usage. Many systems, for example, allow the user to specify a certain sensitivity. Using that control to reduce sensitivity until one or more results appear will likely produce unwarranted investigations, potentially alienating employees.

The machine-learning architecture may be limited by the need to understand the model's conclusion. Deep neural networks often produce results that are hard to explain, especially when the network ingests massive datasets. If the analyst cannot understand the reason for the alert, the neural network will be difficult to trust.

Another part of understanding the model is realizing how the training data can introduce bias and lead to poor conclusions. For example, if an insider-threat model is trained on another company's dataset in which the threats came entirely from the corporate finance department, then the model may learn this bias and apply it, unfairly, to your company's finance employees. While this bias is obvious, others are much more subtle.
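A simple pre-training check can surface this kind of bias: compare the rate of threat labels across departments (or any other sensitive grouping) before fitting a model. The sketch below is one illustrative way to do that; the skew ratio and field layout are arbitrary assumptions.

```python
from collections import Counter

def label_rates_by_group(rows):
    """rows: iterable of (department, is_threat) pairs from the training data."""
    totals, positives = Counter(), Counter()
    for dept, is_threat in rows:
        totals[dept] += 1
        if is_threat:
            positives[dept] += 1
    return {dept: positives[dept] / totals[dept] for dept in totals}

def flag_skew(rates, max_ratio: float = 5.0) -> bool:
    """Warn if threat labels are concentrated in one department."""
    nonzero = [r for r in rates.values() if r > 0]
    if not nonzero:
        return False                          # no positive labels; nothing to judge
    if len(nonzero) == 1 and len(rates) > 1:
        return True                           # every threat example came from one group
    return max(nonzero) / min(nonzero) > max_ratio

training_rows = [("finance", True), ("finance", True), ("finance", False),
                 ("engineering", False), ("sales", False)]
if flag_skew(label_rates_by_group(training_rows)):
    print("Training labels are skewed toward one department; review before training.")
```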

3. Anonymize users until the investigation gathers enough evidence

On many security teams, everyone seems to know about "that user," the one who keeps clicking on bad attachments and links. They may think that identifying the user is a good thing, but security operations can be effective without knowing the identity of users. Instead, you focus on behaviors, systems, and traffic flows, and you identify the user only when there is sufficient evidence to warrant an investigation.

Companies now have systems that profile users and learn all about them. Is that a good thing? It is neutral, but abusing such systems is easy and poses significant ethical issues. And while notification can mitigate legal issues, it does not fully address the ethical issues.

Companies should not identify individuals whose systems or credentials are flagged as possibly compromised until enough evidence has been collected to warrant the uncloaking of the individual's identity. Human resources should own the uncloaking process and the identity of people, so as to minimize abuses by security operations analysts who are not trained in the laws covering employees.
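One common way to implement this separation is keyed pseudonymization: security analysts correlate behavior against a stable pseudonym, while the mapping back to a real identity is held by HR and released only after an evidence and approval gate. The sketch below illustrates the pattern; the key handling, evidence threshold, and function names are assumptions, not a description of any specific product.

```python
import hashlib
import hmac

# Assumption: this key lives in an HR-controlled secret store, not with the SOC.
HR_HELD_KEY = b"example-key-held-by-hr"

def pseudonymize(username: str) -> str:
    """Deterministic pseudonym so the SOC can track behavior over time."""
    digest = hmac.new(HR_HELD_KEY, username.encode(), hashlib.sha256).hexdigest()
    return f"entity-{digest[:8]}"

# HR maintains the reverse mapping; security analysts never query it directly.
_hr_mapping = {}

def register(username: str) -> str:
    entity_id = pseudonymize(username)
    _hr_mapping[entity_id] = username
    return entity_id

def uncloak(entity_id: str, evidence_items: int, approved_by_hr: bool) -> str:
    """Reveal an identity only with HR approval and documented evidence."""
    if not approved_by_hr or evidence_items < 3:   # the threshold here is an assumption
        raise PermissionError("Uncloaking requires HR approval and sufficient evidence")
    return _hr_mapping[entity_id]
```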

Anonymization needs to be strong to prevent casual abuses. Despite the known limitations of anonymization methods, you have an ethical obligation to perform due diligence and respect the data privacy of the people in the company you protect.

Get on it

Just thinking about the ethics of AI is not enough. Companies need to implement due diligence and best practices, spell out policies, have a division of labor to reinforce those policies, and communicate the organization's policies effectively.
