Back in 2008, when the first Iron Man was released, audiences were blown away by it. It was a great origin story that combined action, character development, and funny one-liners into one gripping film. Years later, the movie is still a fan favorite (at least among those who love Marvel movies). But even with all the love and praise (and all the time and subsequent films that have gone by since), there are people (like me!) who still watch it and think: “could we develop all of this tech?”
That’s especially true when it comes to J.A.R.V.I.S., that cocky know-it-all digital butler that runs Tony Stark’s house. Having an assistant as powerful as that A.I. has been a dream of many superhero fans since the movie came out, mainly because it would be so awesome to have a digital system so capable and witty (and voiced by Paul Bettany, of course).
In fact, many people have been chasing that dream ever since. If you do a quick Google search, you’ll find people discussing the J.A.R.V.I.S. program code and even building something like it. Of course, in those discussions, you’ll quickly find developers who point out the many problems that come with trying to take on such a project.
Fortunately, we have come a long way in artificial intelligence development since 2008, so we have more expertise, experience, and tools to make it happen. Still, some questions remain: how many J.A.R.V.I.S. tasks could we actually code into a similar AI? How many can’t we? What are the challenges? What are some of the considerations from the technical perspective? And, the most important question of all: could we build it?
Let’s try to answer all of these questions.
How many J.A.R.V.I.S. tasks could we code now?
It would be easy to answer the question in the title with the (many) challenges involved in building a J.A.R.V.I.S.-like system, say “not right now”, and move on. But focusing on the negative would prevent us from seeing the many things that we could actually do – which aren’t few! A quick recap of J.A.R.V.I.S. capabilities can help us with that.
In the beginning, J.A.R.V.I.S. (an acronym for Just A Rather Very Intelligent System) was a natural-language user interface. That means that Tony Stark could use his voice to issue commands that J.A.R.V.I.S. executed. That definitely should sound familiar to you, especially since Siri (considered the first digital virtual assistant) was released in 2011 along with the iPhone 4S.
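To make that idea a bit more concrete, here is a minimal Python sketch of the “command in, action out” part of a natural-language interface. It assumes the speech has already been turned into text by some other component. The phrases, the function names (handle_utterance, lights_on, set_alarm), and the replies are all hypothetical, made up purely for illustration. Real assistants rely on trained language models rather than simple phrase matching, so take this as a toy, not as how Siri or J.A.R.V.I.S. would work.

```python
from typing import Callable, Dict


def lights_on() -> str:
    # Placeholder action: a real system would call a smart-home API here.
    return "Lights are on."


def set_alarm() -> str:
    # Placeholder action with a hard-coded, made-up wake-up time.
    return "Alarm set for 7:00."


# Hypothetical phrase-to-action table (an assumption for this sketch).
COMMANDS: Dict[str, Callable[[], str]] = {
    "turn on the lights": lights_on,
    "set an alarm": set_alarm,
}


def handle_utterance(text: str) -> str:
    """Run the first action whose trigger phrase appears in the text."""
    normalized = text.lower().strip()
    for phrase, action in COMMANDS.items():
        if phrase in normalized:
            return action()
    return "Sorry, I can't do that yet."
```

With this table, handle_utterance("Jarvis, turn on the lights please") runs the lights_on handler, while anything outside the table falls through to the fallback reply.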
Today, virtual assistants are extremely common. In fact, virtually all smartphones have one. The rise in popularity of these assistants has impacted our digital lives so much that voice-activated interfaces are now a trend in the UI world. What I’m trying to say is that we arguably already have that primitive version of J.A.R.V.I.S. (albeit less sophisticated and funny).
J.A.R.V.I.S. didn’t stay a user interface, though. Over time, it was upgraded into a fully-fledged artificial intelligence system that ran all of Stark Industries’ businesses and the security for Stark Tower and Tony Stark’s mansion. It was also eventually uploaded into all of the Iron Man armors to assist the hero wherever he went.
In doing so, it tackled a lot of things. There are scenes in the Iron Man movies that show J.A.R.V.I.S. alerting Tony about problems with his armor and suggesting improvements. There’s a scene in which it diagnoses Tony with an anxiety attack. There’s also the scene in which it functions as an alarm clock, turns on the lights in Tony’s mansion, and prevents unauthorized access to specific parts of the home.
Some of those things are already a reality, though they are fairly limited when compared to J.A.R.V.I.S. Smart homes are becoming a thing now and will be more popular as 5G and IoT devices become more common. We can now enjoy many smart appliances, from fridges and lights to thermostats and windows. We can even install smart locks on our doors to increase our security (even though the jury is still out on that one).
The combination of 5G and the Internet of Things is also making it possible to smarten up pretty much every piece of tech we have around us. Though we are still very, very far from having flying suits, driverless cars are closer to market than they have ever been. In a sense, the onboard systems in them are J.A.R.V.I.S.-like in that they take care of the driving, diagnose the vehicle, choose the best route to a destination, and more.
Finally, the healthcare industry is making real strides with AI. Researchers are developing algorithms that work with data coming from wearable devices to help with diagnoses and treatments. It’s not crazy to think that an AI could pinpoint a panic attack based on vital signs, especially because we already have some algorithms that can help analyze and treat patients with psychological conditions based on their symptoms.
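Just to illustrate the idea of flagging something from wearable data, here is a tiny Python sketch. Everything in it is an assumption on my part: the VitalSigns fields, the function name looks_like_acute_stress, and especially the thresholds, which are arbitrary placeholders and not clinical guidance. Real diagnostic algorithms are trained and validated on medical data, which this sketch makes no attempt to do.

```python
from dataclasses import dataclass


@dataclass
class VitalSigns:
    heart_rate_bpm: float
    breaths_per_min: float
    is_exercising: bool


# Arbitrary illustrative thresholds (assumptions, NOT medical guidance).
HEART_RATE_LIMIT = 120.0
BREATHING_LIMIT = 24.0


def looks_like_acute_stress(v: VitalSigns) -> bool:
    """Flag readings where both vitals are elevated while the wearer is at rest."""
    if v.is_exercising:
        # High readings during a workout are expected, so never flag them.
        return False
    return (v.heart_rate_bpm > HEART_RATE_LIMIT
            and v.breaths_per_min > BREATHING_LIMIT)
```

A rule like this only raises a flag worth a closer look. Anything resembling an actual diagnosis would need far more signals and context.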
As you can see, there are plenty of things we already have that are similar to what J.A.R.V.I.S. can do. But we are still far from having a J.A.R.V.I.S.-like system.
What J.A.R.V.I.S. tasks can’t we code?
A super-advanced system like J.A.R.V.I.S. has many complex sides that we still can’t replicate. Let’s start with the most obvious one: we can’t code a personality as strong as J.A.R.V.I.S.’s. Sure, we can infuse some jokes and some witty responses and train a machine-learning algorithm to talk back sarcastically, but the result is more like a collection of personality fragments than a full-on personality.
Without getting too much into psychology, let’s just say that human personality is extremely complex (and not completely understood, even now). Trying to artificially recreate it would be an impossible task, mainly because we don’t truly know all of the components that make up a personality. We can mimic it, but we still can’t create an algorithm that develops a personality of its own. If we don’t train an AI to have a personality (providing it with references and training datasets), we won’t get a witty (or a shy, lovable, whatever) personality.
And then there’s the human-like capability of J.A.R.V.I.S. that allows it to understand its surroundings, assess situations, and talk with humans as if it were a human itself. Even with all the advances in deep learning and neural networks, we still can’t get to that level of reasoning.
Again, we can imitate the parts we know of, but there’s more to human reasoning than that. There’s more than just experience and analysis of potential outcomes: there’s intuition, social cues, emotions, and other complex human aspects that we can’t (deliberately) code into an AI.
Just those two things are a massive challenge for anyone trying to recreate J.A.R.V.I.S., especially because we’re not that accustomed to taking psychology into consideration in software development. And they matter, because what separates J.A.R.V.I.S. from other intelligent systems is how it is one more member of the gang. Even though it lives in the cloud and doesn’t have a face (well, at least until it becomes Vision), it’s just another character in the movie.
In a sense, J.A.R.V.I.S. is closer to the AI in the movie Her or to Genisys in that awful Terminator movie: they are human-like in virtually every sense of the word, something we (especially us developers) can’t fully understand just yet.
What are the challenges of building a J.A.R.V.I.S.-like system?
Aside from those things that we can’t code, we could take a shot at building a system like J.A.R.V.I.S. Doing so would require a well-thought-out planning stage that boils the technical capabilities of the AI down to its core features, so we can later decide how to create each and every one of them.
I’ll discuss some technical considerations below, so here I’ll focus on the parts of building J.A.R.V.I.S. that feel the most difficult to me. First, there’s the massive processing power it’ll take to run such a comprehensive system. It’s not something unthinkable or impossible to achieve, mind you, as we could rely on cloud computing to gather enough of it to run a system that powers multiple devices across different locations.
It’s the incredible speed at which J.A.R.V.I.S. acts that is hard to get. Tony talks to it and he gets immediate and thorough assessments of anything he wants. J.A.R.V.I.S. always has a quick response, many times adorned with sarcastic remarks or jokes. Such a response time needs a combination of a highly comprehensive database and impressive processing power that guarantees real-time answers.
A combination of edge computing and 5G might do the trick but, in the current landscape, we’d still suffer from lags in the response times. This brings me to the other big challenge in building J.A.R.V.I.S.: its amazing, glitch-free connectivity. I know that this is a challenge that goes beyond the creation of the system itself (after all, connectivity doesn’t just depend on how good the connected system actually is but also on the infrastructure).
But that’s sort of the point. Even if we are capable of creating an amazing “receiving end” to embed in the different devices our J.A.R.V.I.S. would control, we’d still have to rely on our current mobile networks. That means that we would need a widely deployed 5G network that lives up to its full potential to (maybe) guarantee that sort of connectivity. That, of course, is not the case today.
Our current 5G still lacks range and density, while its underlying technology doesn’t deliver the massive speeds and the reduced latency that it has promised. Naturally, this is a challenge we might sort out later on but, as of today, a J.A.R.V.I.S.-like system would suffer greatly from the networks we have available.
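One way to reason about the response-time problem described above is as a simple latency budget: add up the time each stage takes and check whether the total still feels “instant”. The Python sketch below does just that. The stage names and every number in it are placeholders I made up for illustration, not measurements of any real network or system, and the helper names (total_latency_ms, within_budget, slowest_stage) are hypothetical.

```python
from typing import Dict

# Hypothetical per-stage latencies in milliseconds (placeholders, not measurements).
STAGES_MS: Dict[str, float] = {
    "network_round_trip": 60.0,
    "speech_to_text": 150.0,
    "reasoning": 200.0,
    "text_to_speech": 100.0,
}


def total_latency_ms(stages: Dict[str, float]) -> float:
    """Total time from spoken request to spoken answer."""
    return sum(stages.values())


def within_budget(stages: Dict[str, float], budget_ms: float) -> bool:
    """True when the whole pipeline fits inside the response-time budget."""
    return total_latency_ms(stages) <= budget_ms


def slowest_stage(stages: Dict[str, float]) -> str:
    """Name of the stage that contributes the most latency."""
    return max(stages, key=lambda name: stages[name])
```

With these made-up numbers, the pipeline adds up to 510 ms and misses a 500 ms budget. Cutting the network round trip to 10 ms (the kind of gain edge computing and a mature 5G network are supposed to bring) would get it to 460 ms, which shows why infrastructure matters as much as the AI itself.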
Some technical considerations
Finally, I’d like to go over some of the technical considerations surrounding such an endeavor. I’ve seen some discussions around the best programming languages to use when building J.A.R.V.I.S., its architecture, and the processing concerns that come with it. I think some of those are somewhat obsolete discussions, mainly because of one powerful reason: we have IBM Watson out there showing us the way.
I haven’t mentioned Watson up until now on purpose, as I think that IBM’s AI is the best path we have to build J.A.R.V.I.S. We could go on and on about the programming languages and the benefits of using C, C++, Java, Ruby, or Python (all of which can be used to develop AI solutions). We could discuss processing architectures (including clustered CPUs with arrays of GPUs) to power the system. But I firmly believe that we could see IBM as some sort of parent to a J.A.R.V.I.S.-like system.
That’s because IBM’s AI has enough tools, embedded know-how, and processing capabilities on which we can try to build J.A.R.V.I.S. With that in mind, I think it’s more valuable to discuss the development process that would go into doing so. To begin with, I’d divide the project into three parts: hardware, networking, and machine learning. Each of those parts calls for a different language and has its own technical considerations that go beyond the scope of this article.
It should be enough to say that the AI part should go through a process in which we create a simple core program that can learn, feed it curated databases related to the tasks we want it to do, and follow a looping cycle of observation, analysis, and adjustment to refine its knowledge. Creating the core program is a technical challenge in itself, but I feel it’s nothing compared to preparing the training databases for it. Depending on the tasks we want our J.A.R.V.I.S. to do (and we basically want a human in algorithm form), we would need massive databases covering everything from natural language processing to optical recognition.
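To show what that cycle could look like at its very smallest, here is a Python sketch of a “core program” that learns to spot requests about lights. It’s a classic perceptron over word counts, which is my own choice for illustration and not something prescribed by Watson or any J.A.R.V.I.S. project. The dataset, the labels, and the names predict and train are all hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical curated dataset: (utterance, label), 1 = request about lights.
CURATED_EXAMPLES: List[Tuple[str, int]] = [
    ("turn on the lights", 1),
    ("lights off please", 1),
    ("dim the lights", 1),
    ("what time is it", 0),
    ("play some music", 0),
    ("set an alarm", 0),
]


def predict(weights: Dict[str, float], text: str) -> int:
    """Sum the learned word weights; a positive score means 'lights request'."""
    score = sum(weights.get(word, 0.0) for word in text.lower().split())
    return 1 if score > 0 else 0


def train(examples: List[Tuple[str, int]], max_rounds: int = 20) -> Dict[str, float]:
    weights: Dict[str, float] = {}
    for _ in range(max_rounds):
        mistakes = 0
        for text, label in examples:
            guess = predict(weights, text)   # observation
            error = label - guess            # analysis: -1, 0, or +1
            if error != 0:
                mistakes += 1
                for word in text.lower().split():
                    # adjustment: nudge each word's weight toward the right answer
                    weights[word] = weights.get(word, 0.0) + error
        if mistakes == 0:
            break                            # nothing left to correct
    return weights
```

After calling train(CURATED_EXAMPLES), the returned weights classify all six examples correctly. The gap between this and J.A.R.V.I.S. is, of course, the whole point: the loop is easy to write, while the curated data needed for anything human-like is the hard part.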
Finally, I’d like to bring to the table something I read in a comment on a J.A.R.V.I.S.-like project. It said that we should also consider the AI effect. According to Tesler’s Theorem, “AI is whatever hasn’t been done yet.” That means that the things our AI algorithms are already capable of doing (such as optical character recognition) aren’t considered AI anymore, as they are now routine technology. This means that J.A.R.V.I.S., a super-advanced AI, needs to go beyond the current capabilities we already know of.
That reinforces my opinion of Watson as a foundation for a J.A.R.V.I.S.-like system. We already know what we’re capable of. Developing J.A.R.V.I.S. (or something similar) should take those capabilities as a stepping stone to jump to the next level. So, to answer the question in the title: yes, we could develop a J.A.R.V.I.S., but we are still in the learning process to get there.