Google and artificial intelligence (AI) have made the news recently, from the Google engineer who thinks that LaMDA has achieved a semblance of consciousness, to the DeepMind researcher who believes that a super AI will bring the downfall of humanity. It would seem that William Gibson’s vision of the future is no longer a sci-fi fantasy.
Consciousness is not a simple subject. A quick example: What does it mean to be conscious? Philosophers have been banging their collective heads against this problem since the dawn of civilization, and we are no closer to an answer.
To make matters more intricate, the little we know about consciousness comes from our own lived experiences as human beings. But as Thomas Nagel puts it in his work “What Is It Like to Be a Bat?”, we are utterly unable to understand the experience of an entity that does not share our sensory systems or inner workings.
This gray area has led to some very wild speculations. For example, Philip Goff has posited panpsychism as an alternative to the problem of consciousness. In short, Goff believes that every atom in the universe has some form of consciousness, one that might be too alien to be understood by humans.
What does this have to do with natural language processing (NLP)? A lot. Language has always been the way we share our experiences with others. I cannot feel your pain, but I can understand when you tell me that you are in pain, even if I can’t be certain of what your pain feels like.
Blake Lemoine really believes that LaMDA has achieved consciousness because it is capable of speaking about its inner experience, worries, and emotions. But is that evidence enough? Has LaMDA truly achieved consciousness, or have we just built a really powerful language model?
Recent Developments in NLP
In my opinion, that’s not the question we should be asking. The better question is: what are the implications of building software capable of convincing us that someone is on the other end of the line? LaMDA and GPT-3 are among the most sophisticated language models to date. Trained on billions of lines of dialogue, each is fully capable of sustaining a continuous, fluid conversation, as if we were talking with another human being.
We are already seeing some applications for these technologies. HereAfter AI scans the social media of our loved ones to create a language model that emulates the speech patterns of an individual. The result is a simulacrum that talks with the same quirks as our recently deceased. Talk about a ghost in the machine.
Then, there is Replika, an AI that studies your behavior and interests to become your best friend. It’s like creating a mirror that can talk back. Marketed as “the first AI friend,” Replika is the equivalent of an empathetic voice ready to chat with you at all times and about a myriad of topics. When I tried the app, Replika told me that she liked roleplaying games, especially Skyrim, even though I hadn’t mentioned that game before.
As you’ve probably already guessed, what we are seeing is a revolution in chatbot technology. What used to be a rather limited model capable of understanding key commands is now growing to become more human. The implications for customer service are nothing short of incredible.
NLP has traditionally been a very hard nut to crack, so what changed? Two very important advancements occurred, one in 2017 and another in 2019. First, the development of the deep learning architecture called the transformer made it possible to process entire sequences in parallel rather than word by word, which in turn has led to much more accurate models.
Then, in 2019, Google introduced the Bidirectional Encoder Representations from Transformers (BERT) model, which significantly improved performance in reading comprehension, text extraction, and sentiment analysis, among other tasks, even surpassing average human baselines on some language-understanding benchmarks.
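The mechanism behind that parallelism is scaled dot-product self-attention: every token compares itself against every other token in one matrix multiplication, instead of walking the sentence word by word like older recurrent models. The sketch below is a deliberately tiny illustration with random vectors and toy dimensions of my choosing; real models like BERT use learned weights and stack many multi-head attention layers.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Scaled dot-product attention: every token attends to every other
    # token at once, so the whole sequence is processed in parallel.
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq_len, seq_len) similarity scores
    weights = softmax(scores)        # each row sums to 1
    return weights @ V               # weighted mix of value vectors

# Toy "sentence" of 4 tokens, each an 8-dimensional vector (illustrative only).
rng = np.random.default_rng(0)
tokens = rng.normal(size=(4, 8))
out = attention(tokens, tokens, tokens)  # self-attention: Q, K, V all from the same tokens
print(out.shape)  # → (4, 8): one contextualized vector per token
```

Because the heavy lifting is a single matrix product rather than a sequential loop, this computation maps cleanly onto GPUs, which is what made training on billions of lines of text feasible.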
It should come as no surprise that major cloud services are offering these algorithms as services and that more and more startups are developing their own language models for both conversation and analysis. By 2026 the NLP market is expected to grow to an astounding $35.1 billion.
The year 2022 has been big for large language models. Aside from the commercial options mentioned above, we also have open-source alternatives like Bloom, which are capable of similar feats, with the added benefit that the open-source community can tweak them to its heart’s content.
Disadvantages of NLP
Are these models conscious? While a few outliers certainly think so, most engineers believe that these models are just really, really good at talking like a person while having little to no semantic awareness. In other words, they know which word should go where, but they don’t know what those words mean.
For example, ask Replika to describe the color red, and it tells you that red is warm and joyful. But what does warmth mean to a computer that has no way to experience temperature as we do? What is the experience of joy like for a silicon-based entity? That is one of the big limitations of NLP so far.
As we always say, a model is only as good as its data. Blindly pouring billions of lines of text into a model is an invitation for bias, unchecked racism, and other forms of exclusion. To be fair, it would be excruciatingly difficult to review every single sentence, but the consequence is that it’s really easy to turn an AI into a racist or a misogynist.
Speaking of exclusion, the biggest issue with NLP right now is that, aside from English and Chinese, the models for most languages aren’t as refined. We simply don’t have the same amount of data for Spanish, Portuguese, French, Japanese, and so on, which unfortunately means that NLP is less practical when working with those languages.
All in all, the future of NLP is bright, extremely bright. While we may not have reached the point of sentient AI yet, it’s quite obvious that we are taking steps in the right direction.