With more than 225 indigenous languages, Europe is rich in multilingualism, which is why the Council of Europe proclaimed today the European Day of Languages. The day aims to promote Europe’s linguistic and cultural diversity and to encourage lifelong language learning.
However, an interest in languages (and in human language in general) is close to the heart not only of institutions, but also – perhaps somewhat unexpectedly – of the global technology industry.
Historically, computers worked only with very narrowly defined input, such as digits on punch cards. Over time, they evolved to manage simple structured data. Our human world, full of fuzzy facts and uncertainty, thus had to be translated into rigid structured representations. The reverse conversion, from “computer language” back into the “natural” one, was needed as well. Either way, it was people who had to adapt.
To deal with large amounts of data, we need a new protocol for communicating with computers, one that comes closer to human communication and leaves the adaptation to the machine.
Natural language has evolved over centuries, and not only as a tool for sharing information. Reflecting the complex nature of our world, it carries much more than facts: multiple meanings, fuzziness, irony, humor and a wide range of social functions. There is also a gap between the spoken and written forms, and our communication tends to be multimodal, accompanied by nonverbal signals such as facial expressions and gestures.
For the first time, we can see human language becoming a component of the computer world, after logic and mathematical computation. Humanity’s greatest invention of all time, human language, has now been borrowed by IT experts and made into an information protocol for the upcoming big data era.
When working with more than one language, we must be aware of numerous linguistic differences. In some languages, for example, word order may be flexible, while in others it tends to be rigid. In the latter case, one must pay attention to how the sentence is structured, otherwise it sounds a little strange – just think of Yoda in Star Wars: “Careful you must be.” Some languages, like Czech, work with only three basic tenses (past, present and future), while others, including English and the Romance languages, have finer and more complex tense systems for expressing the succession of events.
At IBM’s R&D Lab in Prague, we are focusing on conversational interactions between people and machines:
- One area is dialogue systems used for safe information access and communication while driving a car. One example is GetHomeSafe, a European Commission FP7 collaborative project carried out together with partners including Daimler, Nuance, the university KTH and the research center DFKI. The project explores in-car systems that work not only in passive mode (answering the driver’s questions), but also take the initiative and proactively serve information useful for the current driving context, based on the user’s profile and content learning. Such systems must balance several key metrics, such as driver safety, short task completion time, user satisfaction and driving comfort.
- Other examples include multimodal dialogue applications for smartphones, conversational information kiosks with talking avatars, education tools for improving language skills (Reading Companion), and brain-jogging voice interfaces for senior citizens.
The development of linguistic interfaces owes much to the proliferation of smartphones and tablets: in the absence of keyboards, speech appears to be the prime choice for information access and transactions. Speech and natural language understanding technologies have been maturing in labs for many years, but it was the emergence of intelligent personal search devices that made them so popular today.
New solutions must be easily adjustable to various language environments. This will be the era of human language, which also means a great boom for linguists and multidisciplinary experts: computer scientists with machine learning skills, psychologists, statisticians and others. And this new wave of linguistic computing certainly reaches beyond the borders of Europe and applies to the whole world.