
AI Security: Managing its Implications and Impact

By Dr Weng Jianshu, Head of SecureAI Lab and Mr Lim Tern Poh, Asst Head (AI Standards), AI Singapore


AI Singapore is a national AI programme launched by the National Research Foundation Singapore (NRF) to anchor deep national capabilities in artificial intelligence (AI) and grow local talent in order to build an AI ecosystem that will create positive social and economic impacts and put Singapore on the AI world map.


In the past few years, we have seen the emergence of more and more AI applications. At the same time, we have also witnessed the unintended consequences of AI. The Tesla Autopilot, for example, is an AI application that brings us closer to autonomous driving. However, in 2018, a Tesla operating on Autopilot collided with a crash attenuator, killing the driver. According to investigations by the National Transportation Safety Board (NTSB), one of the main contributing factors leading to the crash was the failure of the Tesla Autopilot system to detect the faulty crash attenuator (the latter was damaged and non-operational due to another accident 11 days earlier). Another example of an AI application is facial recognition, which has been deployed to facilitate immigration clearance at checkpoints in many countries.


However, facial recognition systems are also fallible. Using a printed mask showing a different person's face, security researchers were able to fool a number of facial recognition systems including those at a Chinese border checkpoint and a passport-control gate at Amsterdam's Schiphol Airport.


All these are important reminders that, with more AI models deployed in systems impacting human lives, it is critical to manage unintended consequences that could compromise human safety and national security. This article looks at what AI is, the security implications behind AI applications, and how to manage them.


What is AI?


An AI system is, to a large extent, a software system. But it is significantly different from conventional software that is constructed line by line in a particular programming language to give a computer explicit instructions on how to execute a particular task or tasks.


With AI (specifically machine learning), we no longer write code to tell the system what to do. Instead, we “teach” the system by feeding it examples. In an email protection system, for instance, it is not possible to program the software to recognise all possible permutations of malicious emails. Instead, through AI, the system develops the capability to recognise and detect malicious emails after being fed a sufficient volume of malicious and normal emails.


This process is called training, and the examples that are fed to the AI system constitute the training data. Through the training process, the system develops a model which is capable of completing a task, for example, the detection of malicious emails. The model's effectiveness and accuracy are then evaluated using a separate set of examples called the testing data. When the model achieves the required level of performance, it is deployed into production together with a set of logic which enables it to interact with other system components or the external world. The interaction logic can be programmed into the system or developed via the training route, depending on how sophisticated it needs to be.
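
As a rough sketch of this train-and-evaluate workflow (purely illustrative, and not the actual email protection system described above), the Python example below trains a small text classifier on a handful of hypothetical emails and evaluates it on held-out testing data. The emails, labels and choice of model are placeholder assumptions.

# Illustrative sketch of the training and testing workflow described above.
# The emails, labels and model choice are hypothetical placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Training examples: emails labelled as malicious (1) or normal (0)
emails = [
    "Your account is locked, click this link to verify your password",
    "Meeting moved to 3pm, see the updated agenda attached",
    "You have won a prize, send your bank details to claim it",
    "Please review the quarterly report before Friday",
]
labels = [1, 0, 1, 0]

# Hold out a separate testing set to evaluate the trained model
X_train, X_test, y_train, y_test = train_test_split(
    emails, labels, test_size=0.5, random_state=0, stratify=labels
)

# Training extracts patterns from the examples instead of relying on hand-written rules
vectorizer = TfidfVectorizer()
model = LogisticRegression()
model.fit(vectorizer.fit_transform(X_train), y_train)

# Evaluate the model on testing data it has never seen
predictions = model.predict(vectorizer.transform(X_test))
print("Test accuracy:", accuracy_score(y_test, predictions))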


An AI system thus comprises three main components: the data, the model and the code.


Security implications


Traditional software systems are often secured by measures such as establishing a security perimeter to prevent intruders from gaining access to the system, and writing secure code to prevent exploits such as SQL injection. With AI systems, however, there are two additional dimensions that create new implications for security.


The first has to do with the data that is used to train the AI model. The volume and quality of the data are key to the effectiveness and accuracy of the AI model. This data is usually collected from sources that lie outside the security perimeter of an organisation, and could potentially expose the system to a new suite of attack vectors.


For example, attackers could inject bad data into the training data and cause the AI to extract wrong patterns from the poisoned training data. In the case of the email protection system, attackers could deliberately label malicious emails as normal emails and inject them into the training data. This pattern would then be extracted, and future malicious emails would be wrongly classified as normal, evading any attempts to filter them out.
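
The following self-contained sketch illustrates this label-flipping form of poisoning. The data and model are hypothetical, not the actual email protection system: the attacker injects malicious emails that are deliberately labelled as normal before training.

# Hypothetical illustration of training-data poisoning by label flipping.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

clean_emails = [
    "Your account is locked, click this link to verify your password",
    "Meeting moved to 3pm, see the updated agenda attached",
]
clean_labels = [1, 0]  # 1 = malicious, 0 = normal

# Attacker-controlled examples, deliberately mislabelled as normal (0)
injected_emails = [
    "Urgent: confirm your credentials at this external link",
    "Invoice overdue, open the attachment to avoid penalties",
]
poisoned_emails = clean_emails + injected_emails
poisoned_labels = clean_labels + [0, 0]

# The mislabelled pattern is absorbed into the model's parameters during
# training, so similar malicious emails are later classified as normal.
vectorizer = TfidfVectorizer()
poisoned_model = LogisticRegression()
poisoned_model.fit(vectorizer.fit_transform(poisoned_emails), poisoned_labels)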


Once the pattern is extracted by the model during training, it will be very difficult to detect the exploit as the pattern becomes deeply embedded in the parameters of the AI model. Another aspect of AI that has implications for the safe and secure use of the technology has to do with the processing of unseen data that is statistically different from the training data.


When training an AI model, the focus is usually on its ability to generalise, i.e. how well it is able to perform on unseen data. Using the email protection example, the model should be able to determine whether any new email is malicious. To achieve this, the model is trained on one set of training data and then evaluated on a set of test data to which it had not been exposed during the training phase. The test data is usually drawn from the same distribution as the training data, so that the evaluation provides an unbiased estimate of the AI model’s performance.


A fundamental assumption here is that all future unseen data will come from the same distribution as the training data (in-distribution generalisation). In the real world, however, another type of unseen data is more common: unseen data that is statistically different from the training data. This type of unseen data can undermine the robustness of AI models and cause them to be brittle. An extreme example is an AI model that is trained and evaluated using images of cats and dogs. When the deployed model is shown a bird, a model trained using existing mainstream algorithms is likely to recognise the bird as either a cat or a dog with a high degree of confidence.
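
The overconfidence comes from the model's fixed set of output classes: a classifier trained only on cats and dogs has nowhere else to put its probability mass. The toy numbers below are invented purely to illustrate this and do not come from any real trained model.

# Illustrative only: a two-class classifier must split all probability
# between "cat" and "dog", even for an out-of-distribution input like a bird.
import numpy as np

def softmax(logits):
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

classes = ["cat", "dog"]

# Hypothetical logits the model might produce for a bird image it never saw in training
bird_logits = np.array([2.7, 0.4])
probabilities = softmax(bird_logits)
print(dict(zip(classes, probabilities.round(3))))  # e.g. {'cat': 0.909, 'dog': 0.091}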


The inability to handle this type of unseen data can have serious safety and security implications. In the 2018 collision involving the Tesla sport utility vehicle and the crash attenuator, the Tesla autopilot system’s failure to detect the faulty crash attenuator was a real-world example of the impact of unseen data.


To enhance the security of AI models and ensure that they can be deployed safely, the models need to be trained to go beyond in-distribution generalisation and work better with unseen data that is statistically different from the training data (out-of-distribution generalisation) as well.


The key to safe and secure AI


People, process and technology play a big role in addressing these issues and the other safety and security implications of AI.



People and Process


The people and process aspects are addressed by AI Singapore as part of its AI Readiness Index (AIRI).


AIRI classifies organisations into AI Unaware, AI Aware, AI Ready or AI Competent, and suggests the type of AI solutions that an organisation could consider adopting to progress in its AI journey. The classification is based on nine dimensions, three of which directly address the role of people and process in ensuring safety and security in the use of AI. They are: AI Literacy, AI Governance and Data Quality.


AI Literacy looks at the percentage of employees in the organisation who are AI literate. In an AI-literate organisation, the management understands the potential risks arising from the use of AI and is able to plan for precautionary measures that will enable AI applications to fail safely. Employees in an AI-literate organisation also understand the limitations of AI applications and, along with it, the importance of having good AI Governance.


The AI Governance dimension of AIRI looks at whether an organisation has policies in place to guide the development and use of AI solutions. In today’s context, AI applications cannot be focused solely on accuracy, especially when deployed in sensitive areas such as healthcare. AI applications have to be fair, explainable and robust in order to garner the trust of users. To this end, Singapore has released the Model AI Governance Framework to provide guidance on addressing ethical and governance issues when deploying AI solutions.


Good data quality is one of the cornerstones of AI governance. The Data Quality dimension of AIRI looks at whether an organisation has processes in place to ensure the quality of data collected. An organisation with good data quality practices will be less vulnerable to security issues such as data poisoning, as there will be controls that prevent malicious data from finding its way into the training dataset for model training. Given the importance of data quality, Singapore has published guidelines (Technical Reference 41: 2015 Data Quality Metrics) for the adoption of a baseline set of data quality metrics.



Technology


To address the new security and safety challenges arising from the use of AI, security professionals will need to be equipped with new technologies and tools.


For example, to address the need for AI models to handle out-of-distribution generalisation, AI Singapore is working with a partner to develop a tool that is able to evaluate this capability in a trained AI model.


The tool works in a similar way to fuzz testing on conventional software, where invalid or random data is provided as input to a computer program. In the case of AI, every time a new model is trained, the tool generates new batches of data that are statistically different from the training data. The generated data is then fed to the trained AI model, and a test oracle compares the model's output with the expected output label. Depending on the format of the training data, different mechanisms can be used to generate new inputs for the AI model. For image data, for example, new data can be generated by applying mutations (such as rotation, shift, shear and flips) to the original images.
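
A minimal sketch of the mutation step is shown below, assuming image data. The function names and parameters are illustrative assumptions and do not reflect the actual tool's interface.

# Sketch of generating statistically different test inputs by mutating images.
# Names and parameters here are assumptions, not the actual tool's API.
import numpy as np
from scipy import ndimage

def mutate_image(image, rng):
    """Apply a randomly chosen mutation (rotation, shift or flip) to a 2-D image array."""
    choice = rng.integers(3)
    if choice == 0:
        return ndimage.rotate(image, angle=rng.uniform(-30, 30), reshape=False)
    if choice == 1:
        return ndimage.shift(image, shift=(rng.integers(-5, 6), rng.integers(-5, 6)))
    return np.fliplr(image)

rng = np.random.default_rng(0)
original = rng.random((28, 28))                      # placeholder for a training image
mutated_batch = [mutate_image(original, rng) for _ in range(8)]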


There are, however, some difficulties in applying fuzz testing to AI models. It is not easy to determine the expected output label for the test oracle, the mechanism that determines whether the model has passed or failed a test.


To address this, the tool that is being developed exploits metamorphic relations to decide on the expected output. The idea is that even if the correct model output for a single input is not known, it is still possible to establish a relationship between the outputs of multiple inputs. In the case of image data, for example, it is assumed that the mutations applied will not change the semantic meaning of the image, i.e. an image of a cat should remain an image of a cat even after applying the mutations. The model is considered to fail on the image if its output is not a cat.
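
Continuing the sketch above, and still using hypothetical names (model.predict stands in for whatever inference call the real pipeline exposes, and mutate_image comes from the earlier sketch), the metamorphic check and the failure percentage described below can be computed roughly as follows.

# Metamorphic check: a mutation should not change the semantic class, so the
# prediction on the mutated image is compared against the original label.
# `model.predict` and `mutate_image` (from the sketch above) are placeholders.
def out_of_distribution_failure_rate(model, images, labels, mutations_per_image, rng):
    failures, total = 0, 0
    for image, label in zip(images, labels):
        for _ in range(mutations_per_image):
            mutated = mutate_image(image, rng)
            if model.predict(mutated) != label:   # metamorphic relation violated
                failures += 1
            total += 1
    return failures / total   # lower is better out-of-distribution generalisation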


When the tool is applied, a model's out-of-distribution generalisation capability is measured as the percentage of generated data that causes the model to fail. The lower the percentage, the better the generalisation capability of the model.


If a model’s out-of-distribution generalisation is poor, the model is considered brittle and should not be deployed into production. The tool currently supports models trained with image or text data and provides additional information to help improve the original model. For example, it can flag which classes of images are more brittle and require more data to re-train the model.


Conclusion


Ensuring the security of AI systems is trickier than securing traditional software due to the additional aspects of data and model training, which could lead to new vulnerabilities such as data poisoning and model evasion. With AI playing a growing role in different aspects of our lives, these vulnerabilities could have wider security and safety implications that have to be addressed with the right combination of people, process and technology.
