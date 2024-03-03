Data annotation, a crucial phase in the AI value chain, involves labelling data with useful information about its contents so that an AI model trained on it can interpret and process the data. It typically relies on severely underpaid workers in parts of the majority world, such as Kenya, Pakistan and India, and it can be highly subjective. Recent research suggests that annotation may be affected by the annotator’s gender, their race, and their proximity to the data being annotated (based on their identity).

In a 2021 academic study involving 291 racially diverse annotators from Amazon Mechanical Turk and college classrooms, researchers found that, depending on the topic covered, race played a role in how annotators labelled racially charged tweets.

On the topic of police brutality, a tweet saying, “Lorenzo Clerkley, a 14-year-old black kid who was with friends playing with a BB gun in broad daylight was shot 4 times by an officer after being given 0.6 second warnings” was considered “moderately positive” by White raters but, on average, considered “neutral” by non-White raters. Such subjectivities are bound to seep through at the level of annotation, algorithm optimisation, and application design, because humans (and their biases) make up a core part of AI. The fix for bias is neither straightforward nor foolproof.

While ‘de-biasing’ strategies exist, there is no guarantee that the model, or the humans in the loop, will abide by those parameters or, if they do, that they will abide by them in a way that makes sense. Recently, Google paused the generation of images of people in its AI model, Gemini, after netizens pointed out that it generated “woke” images of Black men and non-white women when asked for images of a pope or a Founding Father of America. To pre-empt how users might react to historically accurate images, Google employed de-biasing techniques that produced images that “erred towards [a] ‘dream world’ approach”. The issue is that, alongside a historically inaccurate but innocent image of a Black pope, you might also get racially diverse Nazis.

Margaret Mitchell, former co-lead of Ethical AI at Google and Chief Ethics Scientist at Hugging Face, suggested that Google probably “[added] ethnic diversity terms to user prompts ‘under-the-hood’” so that the range of images generated would be diverse. According to Mitchell, Google could also have tweaked the model to prioritise showing users images with darker skin tones. Google’s attempts at de-biasing the model backfired because these are “post-hoc solutions”, implemented after the model has been trained, that neglect the data issue.

Other de-biasing strategies include augmenting training data with counterfactual data, such as changing “The doctor is a man.” to “The doctor is a woman.” in the original dataset and then fine-tuning the model; de-biasing word embeddings, i.e. the associations between representations of words (like the association between ‘woman’ and ‘housework’); and reducing bias amplification in algorithms by setting constraints around a model’s optimisation function to course-correct for existing bias. While these strategies target various parts of the model and its outputs, they might not be enough. Research suggests that they seem to work only in limited contexts, that they merely hide or distort bias (as with Google’s Gemini), that they can add “noise” to the models, and that they might not be wholly effective on multilingual (Indic) models. More importantly, these strategies tend to circumvent the data issue and go straight to the output issue, as though adding buttercream frosting to a salty cake will save it.
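To make the first of these strategies concrete, here is a minimal sketch of counterfactual data augmentation in Python. It is only an illustration under stated assumptions: the swap list `SWAPS` and the helper names `counterfactual` and `augment` are hypothetical and do not come from any particular library or study, and a real pipeline would need a far larger vocabulary and careful handling of ambiguous words.

```python
import re

# Hypothetical, deliberately tiny swap list (an assumption for illustration).
# A real list would be much larger and would have to handle ambiguous words
# such as "her", which can map to either "him" or "his".
SWAPS = {
    "man": "woman", "woman": "man",
    "he": "she", "she": "he",
    "father": "mother", "mother": "father",
}

# Match any listed word as a whole word, ignoring case.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, SWAPS)) + r")\b", re.IGNORECASE
)


def counterfactual(sentence: str) -> str:
    """Return the sentence with each listed gendered word swapped."""
    def swap(match: re.Match) -> str:
        word = match.group(0)
        replacement = SWAPS[word.lower()]
        # Keep a leading capital if the original word had one.
        return replacement.capitalize() if word[0].isupper() else replacement

    return _PATTERN.sub(swap, sentence)


def augment(dataset: list[str]) -> list[str]:
    """Return the original sentences plus any counterfactuals that differ."""
    augmented = list(dataset)
    for sentence in dataset:
        flipped = counterfactual(sentence)
        if flipped != sentence and flipped not in augmented:
            augmented.append(flipped)
    return augmented


if __name__ == "__main__":
    for line in augment(["The doctor is a man.", "The nurse is kind."]):
        print(line)
```

The augmented dataset would then be used for fine-tuning. Even this toy version hints at the limits discussed above: it only touches the words on the list, so any bias expressed in other ways passes through unchanged.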

If we are training models on existing data that is biased, and can do little to cut down the various human touch points along the AI value chain, where does that leave us in our endeavour to innovate for tomorrow? Maybe a better question is ‘How must we endeavour to innovate for tomorrow?’ Perhaps a more strategic alternative is to accept that these biases, and their repercussions, are all but guaranteed, to take full ownership of them, and to strive to be more discerning and realistic about how we develop and deploy AI.

Urvashi Aneja, Founder & Director, Digital Futures Lab

Sasha John, Research Associate, Digital Futures Lab

(Views are personal)