Interview
“Understanding language is the next big challenge for AI,” says Amnon Shashua
The latest startup by Mobileye’s founder, AI21, is working to revolutionize written communication by creating artificial intelligence software that can understand and create written text
These generators are still unripe and, at times, fidgety, but they demonstrate some of the capabilities AI21 is working to develop. Most of all, they are testimonies to the grand ambitions of this young company, which strives to give computers what no one could before—the ability to truly understand written text.
The goal is transitioning from a computer that fixes your grammar to a computer that can write a whole text for you, AI21’s chairman Amnon Shashua said in a recent interview with Calcalist. “You just need to throw your ideas at it in the right order and it creates a text that preserves their original meaning,” he said.
It is difficult to exaggerate how presumptuous this actually is. Algorithms that attempt to communicate with humans have been around since the 1960s, with the development of ELIZA, a mock psychotherapist, by MIT researchers.
Nowadays, numerous websites have chatbots, meaning algorithms that interact with users, but their main function is to collect information that can later be handled by a human operator. These new bots are not meant to understand what users are actually saying.
Even the automatic translation service offered by Google—one of the biggest pioneers in the field of AI—still has trouble translating a paragraph coherently and a perfect translation of a whole page is completely out of the question.
To put things simply, a computer’s ability to process abstract notions has hardly evolved since AI’s early days.
Shashua and his partners are going even further than adequate translation. They are trying to decipher the user’s thoughts and put them down on virtual paper. If it works—and you have every right to be skeptical—it will not be just another technological advancement but a huge leap forward.
"Nowadays, the computer doesn’t write for you, at best, it offers you a synonym,” Shashua said. “Some software also corrects your spelling and grammar or lets you know when you have used the same word twice but that is about it,” he added. “The computer does not write for you because existing automation tools cannot understand what it is you are trying to write.”
To reach the ultimate goal of a computer that understands text, AI21 gathered an impressive collection of local and international experts.
The company was founded in 2017 by Yoav Shoham, a computer scientist and Professor Emeritus at Stanford University. Shoham already has five AI-related exits under his belt. His partner, Ori Goshen, was among the co-founders of cellular communication radiation measurement startup CrowdX, incorporated as Tawkon Ltd., which was acquired by Singaporean mobile network company Cellwize Wireless Technologies Pte. Ltd. in 2016.
AI21’s team of external experts includes Shai Shalev-Shwartz, the chief technology officer of Jerusalem-headquartered automotive chipmaker Mobileye and a renowned machine learning scholar; Omri Abend, a member of the faculty at the Department of Cognitive Science at the Hebrew University of Jerusalem and a leading researcher of computational linguistics and natural language processing; and Daniel Jurafsky, a professor of Linguistics and Computer Science at Stanford and one of the developers of the first automatic system for semantic role labeling.
AI21 has already raised $9.5 million in seed funding, most of it from Shoham and Shashua, who is also the founder and CEO of Mobileye. The rest of the funding came from Pitango Venture Capital, VC8, and TPY Capital.
Shashua joined AI21 after meeting with Shoham to promote a completely different venture he chairs, called WeCode, a coding boot camp for underprivileged populations in Israel. The conversation found its way to Shoham’s new project and Shashua, the serial entrepreneur that he is, decided to join as a partner and main backer.
Since Mobileye’s $15.3 billion acquisition by Intel in 2017, Shashua has been the hottest name in Israeli tech. On top of leading Mobileye, Shashua also found the time to co-found visual aid company OrCam Technologies Ltd., advise the Israeli Prime Minister on dealing with the coronavirus (Covid-19) crisis, and establish the first digital bank to be licensed in Israel.
If you examine what people are doing at the office, you realize that most of the time we are reading and writing, Shashua said. “The writing part could become dramatically more efficient if only we had a partner that could write,” he said, “the computer can be this partner.”
Who needs a tool that can write?
“Anyone who needs to write, and it is a very wide spectrum, at the very end of which are poets and playwrights, but we have a long way to go before we reach that level. At the closer end of the spectrum are, say, insurance agents. I, for example, entered the digital banking sector a year ago. Instead of me having to read 100 essays and summarize them, which could take a month, I could skim through them for two hours, highlighting the interesting segments, and the computer would connect the dots and turn them into a coherent document, saving me most of the technical labor.”
According to Shashua, the system could summarize a single essay or a series of publications in a certain field, saving us the need to read them. But AI21’s vision goes much further. The idea is that the system could function as a ghostwriter, especially for technical texts one is required to write as part of their job.
All they would have to do is provide the system with a series of ideas and it would automatically turn them into an organized, coherent text that even maintains their personal writing style. The user could also control the relevant parameters: need a short and concise text fit for a Facebook post? No sweat. Want a more detailed text to send to the CEO? Just say the word.
Mobileye did similar things as it brought AI into the automotive industry but, still, even that technology has yet to mature and there are no fully autonomous vehicles available. And driving is a relatively simple technical act. Understanding text, not to mention writing it, is far more difficult.
"Understanding language is the next big challenge for AI,” Shashua said, adding he believes they will get there in two to three years. Or, maybe, five.
If the previous decade was all about computer vision and its various applications, including facial recognition technologies and self-driving cars, this decade will be language’s turn to shine, Shoham said. “The developments in this field are nothing short of amazing,” he said, “we are at the forefront but there are still significant barriers that need to be broken to turn the computer into a reading and writing collaborator.”
To exemplify how difficult that is, Shoham pointed to the way children speak. “A five-year-old comes home and says: ‘Danny hit me at the kindergarten, I hit him back, then the teacher saw me hitting and punished me, it isn’t fair.’ Now, any person would understand this sentence but no computer in the world can even begin to comprehend it,” Shoham said. “Consider how many layers of meaning it has: there is a series of events happening over time, people taking action, and a causal link between these actions, one person knew something unknown by someone else, there are feelings and abstract notions, such as justice and fairness,” he listed. “These are things that, to us, or to a five-year-old, are very intuitive, but that is not the case for a computer.”
The biggest breakthrough in the field, Shoham said, happened when Google introduced a new algorithm called BERT (Bidirectional Encoder Representations from Transformers) that was meant to improve search results by helping the computer decipher what users mean.
BERT, Shoham said, lets computers understand the context and differentiate between the different meanings of a specific word, offering different algebraic representations according to context. “It is nothing short of a revolution in an AI’s ability to understand text,” he added.
Similar systems developed by other tech giants like Facebook and Microsoft have also played a part and since then algorithms continue to become more and more sophisticated, Shoham said. Everyone uses the same basic engine, he explained, but each company fine-tunes it towards its own mission. “For Google, this means search, translation, or answering questions,” he said, “for us, it is reading and phrasing.” According to Shoham, AI21 has built upon BERT to add the aspect of semantic representation.
To explain what semantic representation is, Shoham offers the following example: I want to sleep in my bed versus, I walked by the river bed. The word bed appears in both sentences but with completely different meanings. “Our system codes the word bed differently for each sentence also noting the semantic difference between them,” Shoham said.
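The idea that one word gets a different representation in each sentence can be illustrated with a toy sketch. This is not BERT and not AI21’s system: it is a deliberately crude stand-in in which a word is represented only by a count of its neighboring words. The function names (context_vector, cosine), the window size, and the third example sentence are assumptions made for illustration.

```python
import math
from collections import Counter


def context_vector(sentence: str, target: str, window: int = 3) -> Counter:
    """Toy 'contextual representation': count the words that appear
    within `window` positions of the first occurrence of `target`."""
    tokens = sentence.lower().split()
    idx = tokens.index(target)
    neighbours = tokens[max(0, idx - window):idx] + tokens[idx + 1:idx + 1 + window]
    return Counter(neighbours)


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# The two sentences from the article, plus one hypothetical extra sentence.
sleep_bed = context_vector("I want to sleep in my bed", "bed")
river_bed = context_vector("I walked by the river bed", "bed")
other_sleep_bed = context_vector("The cat likes to sleep on my bed", "bed")

# Same word, different surroundings, different vectors.
sim_same_sense = cosine(sleep_bed, other_sleep_bed)
sim_diff_sense = cosine(sleep_bed, river_bed)
print(sim_same_sense > sim_diff_sense)
```

In this sketch the two furniture uses of bed share neighbors such as sleep and my, while the river use shares none, so the first pair scores higher. Real systems learn dense vectors from very large text collections rather than counting neighbors, but the principle of letting context shape the representation is the same.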
"If a certain sentence refers to a transaction, it is not going to be enough just to know what each word means,” Shashua explained. “You need to understand that these words, put together, describe a situation that includes a buyer, a seller, a product, and a price,” he said. “It is not enough to just identify words, every word in the paragraph must remain within a context that describes the situation.”
According to Shoham, language is the hottest area of AI. “Even Apple’s Siri and Amazon’s Alexa are just systems trying to decipher text,” he said.
And what makes you think a small company, such as AI21, could improve on an idea concocted by Google, a giant company with unlimited resources that lives off this very field of text analysis?
"Why did Mobileye make it? Google is a glorious company but there is room for startups in a field where there are also giants,” Shashua said. “Like every other startup, we rely on our innovation and our workforce. Combine these with focus, drive, experience, and entrepreneurship and you could have a miracle,” he added.
"Google has product developers that must reach certain benchmarks in terms of revenue,” Shoham said, “which means they can’t really think long term. I know Google well, I sold it two companies. It is a wonderful company with very bright people, but their ability to truly innovate is close to nonexistent. That is why startups are so successful and then big companies buy them.”
"I won’t say anything bad about Google,” Shashua said, “but innovation is for smaller teams. A big company has clear financial goals and is less open to it.”
In the future, AI21 intends to support multiple languages but right now it is focused solely on English, Goshen said. “There are 700 million people in the world who are not native English speakers but have to communicate in it,” he said. “When we are talking about help with phrasing, that is where the difficulty is most prominent.”
According to Shashua, HAIM and HAIMKE are building blocks and not actual products. “We released them for the public to play with and we get feedback on their output,” he said.
AI21 intends to launch its first product by the end of the year, Shashua said, adding its target audience will be people who need to produce documents at work. The challenge, he explained, is for the computer to be able to not just add volume but to stay in the right semantic field, creating a text that makes sense and has value.
Shashua believes that computers that understand language and can express complex notions will make people more creative. “This is what happened with electronic calculators,” he said, “instead of wasting energy on extracting roots, our creativity shifted to solving equations that are far more significant. In much the same way, if we could mechanize large portions of reading and writing, we can use the time saved to become more creative.”
To concerns that a writing computer will diminish human literary accomplishments, Shashua replied with an example from a different creative field. “A few decades ago, in order to produce a musical piece, you needed a very high level of musical education,” he said. “Now, you have a lot of options to use existing musical segments: you have mixing tools and you can cut and paste, and this lets people express themselves. You no longer have to start drawing notes on a blank piece of paper,” he explained.
It is likely that true musical genius cannot be mechanized and perhaps these works will not become canonical, but it democratizes creativity, Shashua said. “You only get one singer like Adele in a generation, and that isn’t going to change,” he said, “but, what about everyone else? Don’t they deserve to express themselves?”
It isn’t really expressing their creativity, more like their copying skills.
“It’s not copying, it is expressing your limited creativity, but you get to start from something—in this case mixes of various tunes—and add your creative input on top of it,” Shashua said.
"How many people out there have beautiful ideas, but they cannot turn them into a coherent text? Let the computer write the story and you democratize writing. This means suppressing things that were once significant barriers to expressing yourself in writing, such as an extensive education. On top of that, the practical implication is that when you want to write an email to your boss you can do it in just one minute.”
Some people really love to drive, Shalev-Shwartz said, as an example, but when they are on the highway during rush hour, they would be very happy not to have to drive and have a computer do it. The same goes for writing, he said. “Say, I am writing an email to my boss, a report to investors, or an assignment for school. Now, I’m no Shakespeare and I would love to automate this process because it is like driving in heavy traffic at 10 kilometers per hour. When you want to have fun with it, you can always do it manually and enjoy the process.”
Even if mechanized writing can be achieved, there is still something disturbing about this notion, because writing is unlike other tasks. It is a highly creative act and understanding languages is perhaps the thing that defines us most as humans.
Shashua is quick to alleviate our fears, bringing our expectations of the capabilities of AI21’s system back to the realms of reality. “The creative part of writing will remain creative,” he said. “The sentences I input into HAIMKE are the creation, the rest can be mechanized,” he added. The assumption, according to Shashua, is that 1% of a book is human ingenuity and the rest can be done by the computer.
"Take Harry Potter creator J. K. Rowling,” Shashua suggested, “our tool can accelerate her writing, she could write the ideas and the computer would write the rest, in her own style or in a Shakespearian style, if she so wished.”
And what if I woke up sad this morning and I want to create a melancholic text? Or, perhaps, an optimistic or angry one?
“Your mechanical ghostwriter will do whatever you tell it to do,” Shashua said. For the text to represent the user accurately, he explained, it needs to be given specific guidance to be angrier or calmer when making a certain point. “An angry tone is far more generic than personal style,” he added.
But imitation is no revolution. To express reality, you need more than a simulation of understanding. You need a sense of what it is to be human.
“Our computer doesn’t understand text at the intelligence level of a human being,” Shashua said. “In much the same way, Mobileye’s system does not understand the visual world to the extent of how even a two-year-old comprehends it. The gap between human capabilities and what the computer is able to do is still enormous,” he said.
According to Shashua, the only instances when computers can do better than humans are when they are given a narrow and well-defined task, for example playing chess or recognizing faces. “When Mobileye started in 2012, autonomous driving seemed like a preposterous notion, but now, autonomous vehicles can think, act, and plan ahead on the road.
"In much the same way, you can take the technology we have today and define linguistic tasks that are needed for understanding text. A computer that takes bullet points and connects them will revolutionize the way language is consumed, without the pretension of mimicking human intelligence.”
So, is it artificial intelligence or is it not?
“Intelligence is a tricky concept. For years, they said that if a computer could solve a certain problem it would be considered intelligent, for example, if it could beat a chess master. But, then, in 1997, chess-playing computer Deep Blue beat world champion Garry Kasparov and, yet, it was not intelligent.
“In 2016, AlphaGo beat the world champion at the board game Go, which is more difficult than chess. This was a huge advancement but this computer still could not carry a conversation. A computer that read Harry Potter and could answer questions about it is closer to being intelligent. So, yes, we are trying to create intelligence, but it isn’t our goal; it just derives from it.
"We are on a journey. With some journeys, you never know how and when they are going to end. I cannot define the map that would lead to computers having intelligence that is similar to that of humans, but I can define some stops in this journey, for example, software that can automatically summarize 100 essays or read them and offer possible research angles.”
Shashua is not blind to the possible misuse of technologies such as AI21’s. “A computer that could write would make social media bots even stronger, as they will no longer be limited to writing a comment that is a sentence and a half long on Facebook but could write entire essays,” he said.
This could mean inflation of fake news. “I already have bots that, given the right parameters, can create a fluent text using the correct semantics,” Shashua said. “Today, we still assume the information we have is correct unless proven otherwise, but that is going to change and humanity needs to rise up to this challenge.”
Another challenge is the jobs that could be lost due to this technology. As journalists, should we be concerned?
“Not necessarily. Fewer teammates will be needed and some would move on to do different things, that is what automation does. It is true that, at first, it eliminates certain types of Sisyphean labor, but it paves the way to new jobs and more of them.
“In World War II, a lot of women were employed making calculations that are now done entirely by computers. Technology has made these jobs redundant but it also created an immeasurably larger number of new jobs. Also, for now, we are only developing the system in English, which is the most common language, and there is a whole list of other languages that will come before Hebrew, so your jobs in the Israeli media are still safe.”
What’s the next step in this revolution?
“A computer that is a friend. For now, it is a tool that we use to surf the web or create data charts and presentations, it is a work tool. In the future, we could talk to it. There will be software for an adventurous friend, a philosopher friend, or a psychologist friend for when you are feeling down, that would make you feel as if you were talking to a person. You would tell it about your day, about your distresses and passions, as you would a friend. Today, we communicate solely with humans, but, in the future, we could do so with computerized beings that would be so good that you could have a great conversation. The future is, basically, conversational intelligence.”