ChatGPT is mythical but not magical

Recently, I saw someone on an online forum desperately asking whether he could use ChatGPT to help him schedule a work roster, given a long list of planned jobs along with the available hours and skills of each worker. As mean as people online can be, he was humiliated. A simple algorithm could solve this; using Artificial Intelligence (AI) for it is overkill, and ChatGPT is probably not good at this task anyway.

But this reveals how the uninformed see modern AI as a magic silver bullet for everything. Only it is not.

Understanding human language

We have been seeking ways to make computers understand human language for a long time. As early as the 1960s, we built ELIZA, an early Natural Language Processing (NLP) computer program*, to let computers talk to humans using pattern matching and substitution. Now we have ChatGPT, which has achieved that decades-old goal. It has been a long journey, and many techniques have been attempted along the way. A few years ago, we learned about a deep learning model called the Transformer (which the ‘T’ in GPT stands for), and more recently we discovered that a transformer model, at a large enough size, can handle human language remarkably well. That is how ChatGPT came to be.
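ELIZA's trick can be shown in a few lines. The sketch below is a toy, assuming a couple of hand-written rules rather than the original ELIZA script, but it captures the idea: match a pattern in the user's sentence and substitute the captured words into a canned reply.

```python
import re

# A toy, ELIZA-style exchange: a few hand-written patterns and canned replies.
# These rules are illustrative only; the original ELIZA used a much richer script.
RULES = [
    (re.compile(r"\bI need (.+)", re.IGNORECASE), "Why do you need {0}?"),
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "How long have you been {0}?"),
    (re.compile(r"\bbecause (.+)", re.IGNORECASE), "Is that the real reason?"),
]

def eliza_reply(utterance: str) -> str:
    """Reply by matching the first applicable pattern and substituting
    the captured words into its template."""
    for pattern, template in RULES:
        match = pattern.search(utterance)
        if match:
            return template.format(*match.groups())
    return "Please tell me more."  # default when nothing matches

print(eliza_reply("I need a holiday"))       # Why do you need a holiday?
print(eliza_reply("I am feeling stressed"))  # How long have you been feeling stressed?
```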

At its core, the ‘GPT’ family of models is merely a word predictor: presented with a sentence, it predicts what word comes next. Feed the growing text back to the model and ask it to predict the next word again and again, and it can generate a long article, as if it were writing autonomously. However, like all machine learning models, it is trained with data first. ChatGPT was trained with a vast amount of text data, public and private. That is how ChatGPT learned our language and which words are more likely to appear together. It is also why ChatGPT would normally answer in English when an English sentence is presented, since we rarely see, in any book, a dialogue conducted half in English and half in German.
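The repeat-and-predict loop is easy to sketch. Below, a toy bigram table built from a made-up one-line corpus stands in for the real transformer (the corpus and its statistics are invented purely for illustration); the generation loop, however, works the same way: predict the next word, append it, and predict again.

```python
import random

# A minimal sketch of how a next-word predictor generates text. The "model"
# here is just a table of which word followed which in a tiny made-up corpus;
# a real transformer learns vastly richer statistics but is used the same way.
corpus = "the cat sat on the mat and the cat slept on the sofa".split()

# "Training": count the observed continuations of each word.
next_words = {}
for current, following in zip(corpus, corpus[1:]):
    next_words.setdefault(current, []).append(following)

def generate(start: str, length: int = 8) -> str:
    """Repeatedly predict the next word and feed it back into the input."""
    words = [start]
    for _ in range(length):
        candidates = next_words.get(words[-1])
        if not candidates:  # dead end: no observed continuation
            break
        words.append(random.choice(candidates))
    return " ".join(words)

print(generate("the"))  # e.g. "the cat sat on the mat and the cat"
```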

Therefore, the new AI technology you hear everyone talking about is not magical. You ask what you can make for dinner, and it can give you a list of 10 dishes because it learned that ‘dinner’ is related to ‘food’, and it was trained on many recipe books, which are all about food. It merely digs up which words are correlated and presents them based on a mathematical model. If, as Émile Borel imagined in the early 20th century, a vast number of monkeys hitting typewriter keys at random might eventually type out the works of Shakespeare**, then the large transformer model we have today is that monkey. We should not be surprised by it.
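Even the notion of ‘correlated words’ can be sketched with simple counting. The snippet below tallies which words appear in the same sentence across a tiny invented corpus (the sentences are mine, chosen only so that ‘dinner’ lands near food-related words); real models learn far subtler statistics, but the spirit is the same.

```python
from collections import Counter
from itertools import combinations

# Toy corpus, invented for illustration only.
sentences = [
    "what should I cook for dinner tonight",
    "this dinner recipe uses fresh food from the market",
    "good food makes a good dinner",
    "the train to the office was late",
]

# Count how often each pair of words shows up in the same sentence.
co_occurrence = Counter()
for sentence in sentences:
    words = set(sentence.split())
    for pair in combinations(sorted(words), 2):
        co_occurrence[pair] += 1

# Words seen together with "dinner", most frequent first.
related = [(pair, n) for pair, n in co_occurrence.items() if "dinner" in pair]
for pair, n in sorted(related, key=lambda item: -item[1]):
    print(pair, n)
```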

Context, not magic

Adrian S.W. Tam, Ph.D.,
Director of Data Science,
Synechron

The better question is how the model knows which words correlate with the input. Amazingly, the input to the model can run from a few words to a few thousand words, and the model can understand them equally well. These transformer models remember the ‘context’ of your input rather than the exact words you provided. This is how the key ideas are distilled so that the response can be built upon them. We have seen that the model can react accurately and naturally in a context. It should be attractive to epistemologists (those who study how we know things), since we now have evidence that ideas can be represented mathematically in the ‘context vector’. Mysteriously, because the deep learning model of the transformer is trained from data rather than crafted from a human design, we do not fully know how, or why, it works.
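A toy picture of a ‘context vector’ looks something like the sketch below. The word vectors are numbers I made up for illustration, and averaging them is a crude stand-in for what a transformer actually computes, but it shows the core idea: a whole sentence becomes a single vector, and sentences about similar things end up with similar vectors.

```python
import numpy as np

# Made-up word vectors for illustration; a real model learns these from data.
word_vectors = {
    "dinner": np.array([0.9, 0.1, 0.0]),
    "food":   np.array([0.8, 0.2, 0.1]),
    "recipe": np.array([0.7, 0.3, 0.0]),
    "train":  np.array([0.1, 0.9, 0.2]),
    "late":   np.array([0.0, 0.8, 0.3]),
}

def context_vector(sentence: str) -> np.ndarray:
    """Summarise a sentence by averaging the vectors of its known words."""
    vectors = [word_vectors[w] for w in sentence.split() if w in word_vectors]
    return np.mean(vectors, axis=0)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Similarity between two vectors, from -1 (opposite) to 1 (identical)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

a = context_vector("dinner recipe food")
b = context_vector("food for dinner")
c = context_vector("the train is late")

print(cosine(a, b))  # high: similar contexts
print(cosine(a, c))  # low: different contexts
```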

We are all amused by what the new AI models, such as ChatGPT, can do. What is truly amazing is that the context vector can hold the meaning of a paragraph, and that finite-sized transformer models can talk as if they possessed an immeasurable amount of knowledge.

However, we should not treat it like a magic silver bullet before we learn how it works. These models cannot truly answer our questions, look things up, or recall facts. They are as innocent as the mathematical equations behind them; if they respond impolitely or insultingly, it is because they have been spoiled by the data we provided.

Disclaimer: The views expressed in this article are those of the author and do not necessarily reflect the views of ET Edge Insights, its management, or its members
