In the long-established field of artificial intelligence (AI), a new dawn has broken with the advent of generative AI: a branch of AI that creates new content, such as images or text, that appears to have been created by humans. This revolutionary technology has captured the attention of researchers, developers, and the wider public, and sparked a myriad of legal and ethical considerations.
Currently there is little to no regulation in this area – I have heard the current landscape described as ‘the wild west’ more than once. However, as generative AI continues its exponential rise, the time has come for us to navigate the complex web of legal uncertainties and ethical implications that accompany its rapid ascent. Whilst the legal implications (intellectual property, liability, privacy etc.) are becoming clear, the ethical considerations might be more nuanced.
Generative AI is ‘trained’ on data, which consists predominantly of sources from the internet, and mimics the style, patterns, and characteristics of that training data. These data sets are vast and reflect patterns and information up until the period in which they were acquired. For example, the current version of ChatGPT is trained on data until September 2021. Unlike Google, it doesn’t access the internet ‘live’. If you were to ask it who won the Premier League this year it would know as much as I would (absolutely nothing if that wasn’t clear) but could probably tell you how successful a 4-4-2 formation was in 2020 (ok, my daughter plays football, I know a little).
Where technology is trained on a data set the dangers are clear – what if the training data itself is biased? This isn’t a new concept; we have seen many examples of this already, including sampling bias in facial recognition technologies and medical testing. However, as we find ourselves on the precipice of an AI revolution this has the potential to infiltrate ever more aspects of our everyday lives.
Biased models will amplify and perpetuate existing societal biases, leading to discriminatory outcomes, and surely the last thing we want to do is increase inequality. If I’m right, AI will eventually be incorporated into everything from recruitment to teaching to criminal justice.
Consider, for example, a scenario where AI associates your name with negative outcomes. Again, we don’t need to imagine this, as it has already happened: think of the algorithm used by officials in Florida that systematically judged white offenders as less likely to reoffend than their black counterparts, or the time Amazon had to bin an AI recruitment tool that didn’t rate women much. In fact, the Amazon recruitment tool is a good example of bias in the training data. The tool was much less likely to identify female candidates for technical jobs because, historically, Amazon had many more male applicants and employees in these roles. This was reflected in the training data and therefore in the output.
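To see how skewed historical data flows straight through to skewed output, here is a deliberately crude sketch (not Amazon’s actual system, and the numbers are invented): a naive “model” that scores candidates by how often similar candidates appear among past hires. With a 90/10 historical split, equally qualified candidates from the under-represented group are scored lower by construction.

```python
# Toy illustration of training-data bias: the "model" simply learns
# the frequencies in the (hypothetical, skewed) historical data.
from collections import Counter

# Hypothetical historical hires, heavily skewed towards one group.
past_hires = ["male"] * 90 + ["female"] * 10

counts = Counter(past_hires)
total = sum(counts.values())

def score(candidate_group: str) -> float:
    """Score a candidate by the fraction of past hires from their group."""
    return counts[candidate_group] / total

print(score("male"))    # 0.9
print(score("female"))  # 0.1 - equally qualified, but scored lower
```

Real recruitment models are far more complex, but the mechanism is the same: if the historical record is imbalanced, a model that faithfully learns its patterns reproduces the imbalance.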
To be clear, I’m not for a minute suggesting that humans are not biased. Even those of us with the purest intent will have some unconscious bias, including those whose decisions shape our everyday lives: judges, politicians, doctors, police officers and so on. The difference is that we have introduced legislation and developed systems of oversight to mitigate this as best we can.
Temporal bias also needs to be considered, as societal norms shift over time. If the training data does not accurately represent the current context, then the relevance of the output will be compromised. For example, a model trained pre-Covid and one trained post-Covid would probably diverge on some topics.
Another area of concern from an ethical standpoint is misinformation. In this post-truth era where bots are used to influence presidential elections and many people genuinely believe that Covid vaccines are a cover for Bill Gates to microchip us all, the consequences of advancing AI have to be seriously considered. If an AI model is trained on a dataset that includes false or misleading information, it could inadvertently generate or spread misinformation, with the potential to reach a vast audience undetected. Algorithmic bias can also lead to the magnification of misinformation that aligns with particular ideologies and exacerbate social divisions.
Remember that the AI does not have morals in and of itself and it doesn’t recognise the concept of ‘truth’ in the same way that humans do. When I asked ChatGPT if it understood truth, it told me that it does not ‘possess personal beliefs, emotions, or subjective understanding’ and that its ‘responses are generated by analysing the patterns in the data [it has] been trained on, rather than by having a personal understanding of truth’.
So where do we go from here? How do we address these issues? The fact that a considerable part of AI development is contained within large organisations puts the moral onus on these companies to acknowledge and mitigate these issues. It is perhaps a little ominous then that Microsoft seem to have laid off their entire department for ethical AI development.
Addressing biases in training data requires careful data collection, processing, and ongoing evaluation. It will involve techniques such as diverse sampling, bias-aware learning, and regular auditing of model outputs. Furthermore, by actively addressing temporal bias, AI models can better reflect the current reality and maintain their relevance over time. This can be achieved through more regular training data updates.
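One of the auditing techniques mentioned above can be sketched in a few lines: a demographic parity check, which compares a model’s selection rates across groups. The audit sample, group names, and the 10% threshold below are all hypothetical, and in practice the threshold is a policy choice rather than a universal standard.

```python
# Minimal sketch of auditing model outputs for demographic parity.
# All data and thresholds here are hypothetical.

def selection_rate(outcomes):
    """Fraction of positive (1) outcomes in a list of 0/1 decisions."""
    return sum(outcomes) / len(outcomes)

def demographic_parity_gap(outcomes_by_group):
    """Largest difference in selection rate between any two groups."""
    rates = {g: selection_rate(o) for g, o in outcomes_by_group.items()}
    return max(rates.values()) - min(rates.values())

# 1 = model recommended the candidate, 0 = rejected (hypothetical audit sample).
audit_sample = {
    "group_a": [1, 1, 1, 0, 1, 1, 0, 1],  # 75% selected
    "group_b": [1, 0, 0, 1, 0, 0, 0, 1],  # 37.5% selected
}

gap = demographic_parity_gap(audit_sample)
print(f"parity gap: {gap:.3f}")
if gap > 0.1:  # threshold is a policy choice, not a universal standard
    print("Audit flag: outputs differ substantially between groups")
```

Parity of selection rates is only one fairness metric among several, and the right one depends on context; the point is that regular, automated checks like this make biased output visible rather than leaving it buried in the model.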
It will be crucial going forward to build a gold standard for developing responsible AI. Developers will need to recognise and fulfil their role as the moral gatekeepers of this technology, whether by following best practice or (inevitably) by adhering to legislation.