Me, myself and AI
When AI meets personal data
29 November 2017
According to the Artificial General Intelligence Society, Artificial Intelligence is “an emerging field aiming at the building of thinking machines; that is, general-purpose systems with intelligence comparable to that of the human mind (and perhaps ultimately well beyond human general intelligence).”
Machines rely on a vast input of data in order to effectively mimic (and sometimes surpass) human intelligence, and more often than not this data is personal data. It could be data relating to an individual’s age, gender, web browsing history, online shopping habits or the route taken on a weekly 10km run; the list goes on. In the connected world we live in, the amount of personal data that each of us generates is phenomenal, and most likely far greater than we are actively aware of.
A simple example demonstrates how AI uses our personal data in seemingly innocuous ways: the music app, Spotify, feeds data relating to music listened to, playlists selected, artists browsed (all of which is attributable to an individual user and therefore personal data) into an algorithm which will then automatically suggest to the user new albums, playlists and similar artists.
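To make the Spotify example concrete, the sketch below shows one common way such suggestions can be generated: item recommendations based on the listening habits of similar users (collaborative filtering). The user names, track names and play counts are invented for illustration, and a real service would use far richer signals; this is a minimal sketch of the general technique, not Spotify's actual algorithm.

```python
from math import sqrt

# Hypothetical play-count data: user -> {track: plays}. Every value here
# is attributable to a user account, i.e. personal data.
plays = {
    "alice": {"track_a": 10, "track_b": 5, "track_c": 0},
    "bob":   {"track_a": 8,  "track_b": 4, "track_c": 1},
    "carol": {"track_a": 0,  "track_b": 2, "track_c": 9},
}

def cosine(u, v):
    """Cosine similarity between two play-count vectors."""
    dot = sum(u[t] * v[t] for t in u)
    nu = sqrt(sum(x * x for x in u.values()))
    nv = sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(user, k=1):
    """Suggest up to k tracks the user has not played, taken from the
    listening history of the most similar other user."""
    others = sorted(
        (u for u in plays if u != user),
        key=lambda u: cosine(plays[user], plays[u]),
        reverse=True,
    )
    nearest = others[0]
    unseen = [t for t, n in plays[user].items() if n == 0]
    unseen.sort(key=lambda t: plays[nearest][t], reverse=True)
    return unseen[:k]

print(recommend("alice"))  # alice's listening most resembles bob's
```

Even in this toy form, the point is visible: the suggestion is driven entirely by behavioural data tied to identifiable users.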
It is therefore important that we know how our personal data is being used in the algorithms which allow machines to be artificially intelligent. The General Data Protection Regulation (GDPR), which replaces the current Data Protection Act 1998 (DPA), will have direct effect in the UK from 25 May 2018. It aims to achieve a harmonised set of rules across Europe which are focussed on protecting the rights of individuals and ensuring fair and transparent processing of personal data. This includes specific language targeted at automated decision-making. Article 22(1) states:
"The data subject shall have the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her."
Whilst the concepts of AI can, and likely will, be considered in the light of this “automated decision-making” provision, the provision itself is not necessarily fit for purpose. AI is a new and rapidly developing technology, and the GDPR’s ability to deal with its implications for the use of personal data may therefore be limited.
‘Automated’ vs ‘autonomous’ decision-making
The problem with Article 22 is that, whilst it provides a blanket prohibition on “automated decision-making” which produces legal effects concerning the data subject or similarly significantly affects the data subject, it fails to define the term, leaving it open to interpretation. Is it correct that decisions made by AI systems are considered to be automated? Or is this just a matter of semantics? The issue was discussed in the House of Commons Science and Technology Committee report on robotics and artificial intelligence (although in the context of the definition of Robotics and Autonomous Systems, one of the “Eight Great Technologies” identified by the UK government in 2013 to form part of the UK's industrial strategy). The report noted that, whilst both terms refer to processes which may be executed independently, from start to finish, without human intervention, “automated” refers to processes which involve well-defined tasks and produce consistent and predictable outcomes. “Autonomous”, on the other hand, refers to the capacity to learn, respond and adapt to situations that were not pre-programmed or anticipated in the design, and it is within this term that AI squarely fits.
It is not clear whether legislators considered this distinction when drafting the GDPR – it is likely they did not, and that the intention is for “automated decision-making” also to include decisions made by AI systems; as the House of Commons report notes, the two concepts are sometimes used interchangeably, and it does not identify any particular problem with this in practice. However, it is useful to explore the distinction between “automated” and “autonomous” because it brings into focus a key aspect of AI which makes it harder to regulate than other forms of automated processing: it is unpredictable, reactive and often produces results that humans cannot understand.
"Meaningful information about the logic involved"
One of the key ways that the GDPR promotes fair and transparent processing is by requiring data controllers to provide data subjects with specified information about the data processing (Articles 13 and 14). This includes ensuring that when automated decision-making, including profiling, is carried out, the data subject is provided with “meaningful information about the logic involved”. With some forms of automated decision-making this may be a relatively simple obligation to fulfil; for example, when a decision is made using pre-determined rules, the outputs can be traced back by following a logical trail. However, when AI is being used by a data controller to make decisions and profile individuals based on their personal data, it is possible (and, as AI continues to develop at unprecedented rates, increasingly likely) that humans will not be able to explain the logic behind a decision. As the distinction between "automated" and "autonomous" has illustrated, AI's ability to mimic human "thinking" means that it can create its own logic, resulting in an inevitable opacity that makes it very difficult to understand the reasons for the decisions it makes.
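The "logical trail" available for rule-based decisions can be sketched as follows. Each rule that fires is recorded alongside its outcome, so the controller can hand the data subject a step-by-step account of how the decision was reached; this is exactly what an opaque, self-trained model cannot readily produce. The rule names and thresholds below are invented for illustration.

```python
# Pre-determined decision rules: (name, test). A real controller's rules
# would be far more elaborate; these are hypothetical.
RULES = [
    ("age_over_18",  lambda a: a["age"] >= 18),
    ("income_check", lambda a: a["income"] >= 20_000),
    ("debt_ratio",   lambda a: a["debt"] / a["income"] < 0.4),
]

def decide(applicant):
    """Apply each rule in order, recording the trail. The trail is the
    'meaningful information about the logic involved'."""
    trail = []
    for name, rule in RULES:
        passed = rule(applicant)
        trail.append((name, passed))
        if not passed:
            return {"approved": False, "logic": trail}
    return {"approved": True, "logic": trail}

result = decide({"age": 30, "income": 25_000, "debt": 5_000})
print(result["logic"])  # every rule applied, in order
```

With a trained neural network in place of `RULES`, no such trail exists: the "rules" are implicit in millions of learned weights, which is the opacity problem described above.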
This is not an abstract problem. It is currently rare for AI systems to be set up to provide a reason for reaching a particular decision. For example, Google's DeepMind has developed software called AlphaGo Zero to play the game Go. Unlike its predecessor, AlphaGo, AlphaGo Zero was given no pre-set moves; it instead taught itself by playing millions of games against itself. By teaching itself the rules and principles of Go in this way, AlphaGo Zero was able to beat AlphaGo in just three days (AlphaGo having beaten the (human) world champion Lee Sedol in 2016). AlphaGo Zero developed intriguing new strategies of its own and had "genuine moments of creativity". Humans could not explain some of the moves made by AlphaGo Zero, and at times the moves made seemed illogical and wrong.
The ability for AI to learn and surpass human thinking is what makes it so exciting. But this also has huge implications for privacy, especially in relation to the security of personal data. How the GDPR is interpreted in the context of AI remains to be seen, but it is clear that this is an area which will require a balancing act from regulators, to encourage innovation and investment in AI whilst also ensuring fairness and transparency in the processing of personal data.
"Suitable measures to safeguard the data subject’s rights and freedoms and legitimate interests"
Provided that the use of automated decision-making falls into one of the excepted situations as set out in the GDPR (Article 22(2)), the blanket prohibition will not apply. However, such automated decision-making is subject to some safeguarding requirements, including that the data controller must “implement suitable measures to safeguard the data subject’s rights and freedoms and legitimate interests” including “at least the right to obtain human intervention on the part of the controller”. Determining what will constitute "suitable measures" will require a case-by-case analysis by the data controller and may vary depending on the amount of personal data processed and the level of risk involved for the data subject's rights. In the context of decision-making by an AI system this exercise is particularly complex because traditional measures of verification and validation, including those which rely on human intervention, are not easily applied to AI. The verification and validation of AI systems has been described as "extremely challenging", partly due to the fact that the focus has been on developing the algorithms to produce results, rather than on interpreting the algorithms themselves.
One potential form of human intervention which could be applied to AI is the use of a kill switch. This is code that would ensure that an AI system could "be repeatedly and safely interrupted by human overseers without the machines learning how to avoid or manipulate these interventions". Whilst still in the early stages of development, a kill switch would give a data controller ultimate control over how personal data is used and would likely be viewed as a measure which effectively safeguards the data subject's rights and freedoms.
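The basic mechanism can be illustrated very simply: the system checks an externally held flag on every cycle, so a human overseer can halt it at any point. This toy sketch shows only the interruption plumbing, not the harder research problem quoted above (ensuring the system does not learn to resist the interruption); the overseer's intervention is simulated at a fixed step so the behaviour is reproducible.

```python
import threading

# The overseer's switch lives outside the learner: the learner cannot
# unset it, only observe it.
stop_flag = threading.Event()

def training_loop(max_steps):
    """Run learning updates until finished or until a human interrupts."""
    steps = 0
    while steps < max_steps and not stop_flag.is_set():
        steps += 1                # stand-in for one learning update
        if steps == 10:           # simulate the overseer intervening here
            stop_flag.set()
    return steps

print(training_loop(1_000_000))   # prints 10: halted long before max_steps
```

The design point is that the flag is checked by the loop's control structure, not by the learned model itself, so stopping does not depend on the system's own (possibly opaque) reasoning.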
Another measure which could be implemented is algorithmic auditing. This is the idea that auditability is 'baked in' to algorithms at the development stage, enabling third parties to check, monitor, review and critique their behaviour.
A non-technical approach to ensuring the transparency of AI systems is the use of ethics boards. These can be used to specifically appraise and make decisions on the development and application of machine learning algorithms. Regular reporting to a board on the outcomes of an AI system and the effects on a data subject will give companies the opportunity to raise relevant questions and implement adjustments. For example, Google set up its own AI ethics board when it acquired the UK company DeepMind in 2014.
AI is a new technology which is difficult to understand, explain and control. We should therefore be wary when combining AI and personal data; as the implementation of the GDPR shows us, the world is waking up to the importance of the privacy of personal data, and the numerous systems used by companies, including AI, should be built to take this into consideration. It remains to be seen how the specific requirements set out in Articles 13, 14 and 22 are dealt with in practice, but it is clear that there are exciting and turbulent times ahead as regulators and data controllers across all industries start to grapple with these issues.