State of the art AI, part II: meet ChatGPT!
28 February 2023, 10:00
In an article that appeared on this blog on February 14th, we let a new acquaintance of ours present Valentine’s Day celebrations. The article was signed by ChatGPT, an Artificial Intelligence product developed by OpenAI, an American company specializing in the field.
Usage of the tool has skyrocketed since we interacted with it, but in principle OpenAI offers a free version that anybody can try by going to the dedicated webpage through the link we have posted. You could, for instance, ask it “Tell me something about Valentine’s day in roughly 30 lines of text with a playful tone.”, which is exactly what we asked in order to get our previous article, the one signed ChatGPT.
What is a Large Language Model (LLM)?
The fun does not stop here. ChatGPT is an LLM (Large Language Model) trained and optimized on millions of web pages and several manuals, especially about programming languages, thus mustering much of the knowledge publicly shared in written form out there on the internet.
What does this mean?
An LLM is a member of a class of mathematical algorithms known as Neural Networks. These are, very roughly speaking, mathematical structures with a huge number of parameters that can be tuned. The edge they have over other kinds of programs, which gives them the flexibility and pattern-recognition ability that have made them so popular, is that the tuning is performed by exposing the mathematical structures to properly formatted real-world data. I said something more about this when presenting Patrice Kiener’s talk at the first-ever R+ meeting, which I wrote about here.
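To make the idea of "tuning parameters by exposing them to data" a bit more concrete, here is a minimal sketch in Python. It is not how ChatGPT is trained: it is a toy model with just two parameters, `w` and `b`, adjusted by plain gradient descent on invented data, whereas a real neural network has billions of parameters. The function name `train` and the numbers are assumptions made up for this illustration.

```python
# Toy illustration (hypothetical, not ChatGPT's training code):
# a "model" with two tunable parameters, w and b, which predicts y = w * x + b.
# The parameters are tuned by repeatedly nudging them so that the
# predictions get closer to the example data.

def train(data, steps=2000, lr=0.05):
    w, b = 0.0, 0.0  # start with parameters that know nothing
    n = len(data)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        # Move each parameter a small step in the direction that reduces the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Invented "real-world" data that secretly follows y = 3x + 1.
data = [(x, 3.0 * x + 1.0) for x in range(5)]
w, b = train(data)
print(round(w, 3), round(b, 3))
```

After training, `w` and `b` end up very close to 3 and 1: nobody wrote those values into the program, they were recovered from the data alone.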
Long story short: the data that an LLM (and thus ChatGPT) is trained on are all human-written texts. The program literally reads and sorts the content of millions of web pages about all sorts of topics, including code documentation, and learns to predict the next word in a sentence in a way that is broadly similar to what humans do with their brains.
This is not strictly true, but it is true enough for menial intellectual tasks, by which we mean acts of intelligence that do not require significant amounts of creativity and lateral thinking.
ChatGPT is superb at learning by heart, though it is not sentimental at all (pun intended), and has already passed several university-grade tests. Its capabilities include reorganizing what it has learnt into written language suited to the question asked of it by the end user. Whether this means that our university education system is broken, and that students are being asked to just stuff their brains with pre-packaged notions instead of proving truly human skills in solving real-world problems, is up for debate.
What remains true is that such achievements are unprecedented for a computer model.
Harvesting knowledge by scraping the internet: what does ChatGPT know about Dentoni?
Speaking about this in abstract terms does too little justice to the magnificence of the OpenAI-trademarked accomplishment. It is much better to offer an instance of the model’s performance in a real, short dialogue, so please meet ChatGPT answering Mirko’s question about Dentoni:
And also a follow-up question, about Mirko’s personal obsession:
As you can see for yourself, the program knows very little about our business; moreover, what it knows is mixed up with knowledge about some other confectionery. It certainly cannot have read anywhere that we have been open since 1820, so this claim must come from mixing up information collected from several places out there on the internet.
This is a hint we can use to briefly explain how the model works. The few lines we are about to jot down are a real injustice (necessarily so, for a blog of this sort) if we compare their meager content to the momentous effort which has gone into creating the model, but they will give the reader an essential grasp, starting from a very concrete example.
Next word prediction algorithms: in what sense does ChatGPT know what it says?
The title of this section contains the essential intuition needed to understand the behind-the-scenes workings of ChatGPT. In its bare essence, it is a (very sophisticated) statistical tool which is able to do the following:
- Take the input text and split it up into sub-units, which are ranked according to their importance with respect to the general context.
- Draw on what it learned during its training phase, which involved hundreds of gigabytes of text collected from all over the place and supplemented with human knowledge, structuring it into a hierarchy of relevance for the user-submitted prompt.
- Use this material to generate an answer, which is produced through a statistical analysis of the collected hierarchy of relevant concepts, lining words up on the basis of how likely they are to appear next to each other.
If it sounds sophisticated, that is because it is. It is also far from perfect, just as human-written text is far from perfect. Indeed, the statistical nature of ChatGPT’s approach to context is apparent from the answer it gave to Mirko’s question about Pasticceria Dentoni: it contains some true information mixed with false data.
Boiled down to its core, the reason for this is that the amount of information about our small business which you can find on the internet is pretty scarce. Since ChatGPT (and Artificial Intelligence in general) does not tell concepts apart in the sense a human does, something I will elaborate on in my next post, it is only natural for the algorithm to mix this scarce information up with something which resembles it and which it judges likely to increase the relevance of the answer, at least on average.
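The next-word idea described in this section can be sketched with a toy example. What follows is a simple word-pair ("bigram") counter in Python, a far cruder technique than the Neural Network behind ChatGPT, and offered only as an illustration. The tiny corpus, and the function names `train_bigrams` and `predict_next`, are invented for this sketch.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent follower of `word`, or None if it was never seen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Invented corpus: a lot of text about a famous restaurant,
# and a single sentence about a small bakery.
corpus = (
    "the famous restaurant serves pasta . "
    "the famous restaurant serves pasta . "
    "the famous restaurant serves risotto . "
    "the small bakery serves cake ."
)
model = train_bigrams(corpus)

print(predict_next(model, "bakery"))  # serves
print(predict_next(model, "serves"))  # pasta
```

Completing "the small bakery serves" one word at a time, this toy model answers "pasta", because that is the word it has seen most often after "serves". The single sentence about the bakery is outvoted by the many sentences about the restaurant, which is the same kind of mix-up, on a microscopic scale, as the one in the answer about our business.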
To provide a contrasting example in support of my reasoning, please meet ChatGPT answering a question about the best restaurant in the world:
Much more on point, isn’t it…? It even knows its most famous dish out of the box, whereas it is totally oblivious to our flagship cake, even when I ask about it explicitly.
And rightly so, because Bottura’s venue is one of the best and most publicized restaurants in the entire world. There are therefore far more web pages discussing it and, as a consequence, far more data on which ChatGPT has had a chance to adjust the statistical weights of its Neural Network.
How may knowledge work change because of ChatGPT?
As blurry and imprecise as ChatGPT’s knowledge may be when questioned about details of the world, the fact that it does get the gist of several basic things right, and can write articles about any general knowledge topic in the span of a few minutes (I encourage you to try this for yourself), is potentially a revolution. A dangerous one in at least one sense, to be sure.
It takes a few seconds to observe that all present knowledge work which is based on menial and repetitive tasks is at risk of being swept away by advances in this line of technologies. Armies of people filling in forms all day in administration jobs will potentially be replaced by a few of them supervising AI models toiling at such tasks more efficiently and rapidly than any human can. All it would take is training the AI on massive amounts of specialized data, with the training supervised by experienced humans first.
Customer service based on WhatsApp messaging, which companies providing highly standardized services have already automated to a large extent through rougher chatbots, could equally be swept away, for the very same reason.
Even the job of reviewing academic papers could change dramatically: a first step towards rejecting a paper could be to try to generate its content with such an AI model, trained on millions of articles in the subject concerned. That would actually bring potentially beneficial effects, allowing the community to get rid of copycats very easily and easing a plague which has been affecting academia for some decades now.
Even tellers could be replaced by the combination of automatic cash machines and an LLM-based AI, which would provide customers with all the information they may request before making a purchase.
We surely live in interesting times from the point of view of technological advancement. As I have argued elsewhere, this is not necessarily good for well-being: we are not trained to fend off the creepiest side effects of technology, since we are evolutionarily biased to binge on what it offers; but abundance does not rhyme with happiness and enjoyment.
Still, there is absolutely nothing we can do to opt out of the game, at least as far as our lives require our involvement with society. Technology is not a force we can fight, because it gains momentum, irrespective of its initial disruptiveness, through massive numbers of individual adoptions, each driven by an unquestionable increase in marginal utility for the individual.
Put simply: everybody gets a phone because of the unquestionable comfort it brings to individual life; fast-forward twenty years, and half of the world is hooked on, and often enslaved to, their phones. But now try to turn back! Technology thrives on the temporal asymmetry of the spread of habits through complex human systems.
It is a force to be reckoned with.
As small as we are on the world scale, our business is investing very heavily in technology, automating everything that can be automated, from the filing of take-away orders, to self-service cash machines, to fourth-generation, computer-supervised pastry-making machines. Qualified human labor is getting harder and harder to find, especially on a seasonal basis, but a positive P&L is as essential as ever to stay on top of the game.
Mirko Serino and Francesco Panarese