The tractor rolled off the chair and fell on the floor, bouncing several times before coming to a screeching halt. “Papa, tractor dhamaaaal!” My two-year-old burst into laughter. It followed our 80-20 rule (80% Artificial Laughter). We have mastered the art of over-laughing. He jumps around the house until 2 in the morning. Unsurprisingly, the residents immediately below our apartment have often raised concerns about the noise. I caution my son, “Neeche uncle sleeping”, pointing my hand downwards. He repeats, “Neeche uncle sleeping”, touching the floor with one of his fingers. We laugh again, following the same 80-20 rule. I am pretty sure he thinks a tiny uncle is sleeping on our floor, invisible to our naked eyes, and that he should not jump on him.

This is what he has learned from earlier occasions. For him, the AL (Artificial Laugh), playing at 2 in the morning, and a tiny invisible uncle sleeping on our floor are standard. He expects me to repeat the AL at social occasions. I hesitate. It will be another few years before his behaviour becomes more aligned with society’s expectations.

An LLM has been fed an enormous volume of training records collected online. And what does that content contain? Reflections on our society, narrations of our behaviour, and examples of our thoughts. It contains everything, including the biases we have always held. It includes financial, social, geographical, intellectual, and gendered inequalities. These biases get carried forward to the LLMs. The context and the relationships among words are stored numerically as vectors. If the word “women” is closely associated with “homemaking” in many online stories, new content generated through various Generative AI tools will follow the same pattern. 1) Men protect women, 2) Score well to learn well, 3) Be rich to be good, 4) My religion is better than yours, 5) My country is better than yours, and 6) Being heterosexual is the only orientation are a few other examples of biases prevalent in society.
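To see how word associations end up encoded in vectors, here is a minimal sketch. The four 3-dimensional vectors are hand-picked toy values, not real embeddings; in an actual model they would be learned from co-occurrence statistics in the training text, which is exactly how the association gets baked in.

```python
import math

# Hypothetical toy "embeddings" illustrating how co-occurrence in
# training text can encode a social bias into vector geometry.
vectors = {
    "woman":     [0.9, 0.1, 0.3],
    "man":       [0.1, 0.9, 0.3],
    "homemaker": [0.8, 0.2, 0.4],
    "engineer":  [0.2, 0.8, 0.4],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# If the training text pairs "woman" with "homemaker" more often,
# the learned vectors sit closer together, and generation follows suit.
print(cosine(vectors["woman"], vectors["homemaker"]))  # higher
print(cosine(vectors["woman"], vectors["engineer"]))   # lower
```

Because the model samples words by proximity in this space, the skewed geometry quietly becomes skewed output.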

This means that LLMs are inherently as biased and stereotyped as us. The only entities capable of acting differently from the past are humans. One cannot expect LLMs to do that. They are trained to be us. We have to place a hell of a lot of barricades to reroute LLMs on specific instances. Adding to the problem, we do not have a complete list of such instances. We find out only when something escalates out of hand.
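A barricade, at its crudest, looks something like the sketch below. The pattern list is illustrative and hypothetical; production guardrails use trained classifiers and fine-tuning rather than string matching. But the structural weakness is the same one described above: the list is never complete.

```python
# Toy sketch of an output guardrail: a post-filter that reroutes a
# response when it matches a known problem pattern. BLOCKED_PATTERNS
# is a made-up, deliberately incomplete list.
BLOCKED_PATTERNS = ["women belong in", "my religion is better"]

def guardrail(response: str) -> str:
    """Return the response unless it trips a known pattern."""
    lowered = response.lower()
    if any(pattern in lowered for pattern in BLOCKED_PATTERNS):
        return "I can't help with that."
    return response

print(guardrail("Here is a recipe for dal."))
print(guardrail("Women belong in the kitchen."))
```

Every bias not yet on the list passes straight through, which is why we only learn about gaps after something escalates.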

This is no different from the way classic AIML models work. They give an impression of understanding real-world phenomena because they are trained on a vast number of past events. The best model would try to mimic something that happened in the past. Introduce one additional actor into the event, and the model breaks. Bad decisions made in the past will lead the AIML model to recommend bad choices for the future. The bias passed on to ML models this way is called algorithmic bias.
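The mechanism is easy to demonstrate. Below is a deliberately simplistic stand-in for a classifier, using made-up records: it recommends whatever decision was most common in the past for a given group. Real models are far more sophisticated, but when the historical labels themselves are biased, they inherit the same skew.

```python
# Hypothetical past decisions, skewed against group_b.
past_loans = [
    ("group_a", "approved"), ("group_a", "approved"), ("group_a", "approved"),
    ("group_b", "rejected"), ("group_b", "rejected"), ("group_b", "approved"),
]

def majority_decision(group):
    """Recommend whatever outcome was most common for this group historically."""
    outcomes = [outcome for g, outcome in past_loans if g == group]
    return max(set(outcomes), key=outcomes.count)

print(majority_decision("group_a"))  # approved
print(majority_decision("group_b"))  # rejected
```

The model never decided to discriminate; it faithfully learned that we did.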

Our complete focus is on making the classical AIML or the Generative AI models as good as our past. How can we expect them to create a better future for humankind? My only answer to this question now is human-in-the-loop, barricades, guardrails, and regulations – a lot of them. Not to forget: humans must shed their egos, biases, and inequalities as a society, so that the training data we use to build LLMs ten years from now tells stories of happier, fairer humans. I look forward to those LLMs.

I am not a naysayer. I am pretty excited that we are making this technological progress. My passion and livelihood converge in data science. After all, you try to fix what you care about – human beings and, hence, Generative AI.

Disclaimer

Views expressed above are the author's own.