
Mt Fuji photo source: Unsplash
Google’s annual I/O event is currently underway. It’s being held online this year, but as always it’s an opportunity to showcase new advances and work in progress.
This year Google previewed a new technology called MUM (Multitask Unified Model), which it described as “a new AI milestone” for understanding information.
Like BERT, it’s built on a Transformer architecture, but Google says it is 1,000 times more powerful than BERT. MUM’s key feature is multitasking, and the aim is to help users get answers to complex queries with fewer searches. Google says it takes people eight searches on average to complete complex tasks.
MUM is trained across 75 different languages and many different tasks at once, enabling it to develop a more comprehensive understanding of information than previous models. It is multimodal, so it can understand information across text and images. Google says this could expand in the future to other formats, such as video and audio.
Google says it has started internal pilots with MUM, but gave no dates for a rollout.
Google’s Prabhakar Raghavan gave the example of a query such as “I’ve hiked Mt. Adams and now want to hike Mt. Fuji next fall, what should I do differently to prepare?” A query like this would stump search engines today: users have to enter a number of separate queries to get answers on individual elements, such as elevation, temperature and the right gear.
If you were talking to a hiking expert, you could ask the same question and get a nuanced answer that takes these aspects into consideration. Google says that with MUM, it’s “getting closer” to being able to meet these types of complex needs.
Understanding that you’re comparing two mountains, MUM would know that information about temperature, elevation and trails might be relevant. It would also have a wider contextual understanding of what preparation means for hiking, including, for example, fitness training as well as finding the right gear.

Multilingual
MUM has the potential to transfer information across different languages. If there’s useful information about Mt Fuji in Japanese, currently you would only see this if you searched in Japanese. Google says MUM could “transfer knowledge from sources across languages, and use those insights to find the most relevant results in your preferred language.”
Multimodal
MUM can understand information across different formats simultaneously. According to Google, you might eventually be able to upload a photo of your hiking boots and ask whether they are suitable for Mount Fuji. “MUM would understand the image and connect it with your question to let you know your boots would work just fine. It could then point you to a blog with a list of recommended gear.”

MUM is currently in development, but looks to be a big advance in understanding search queries. If Google does manage to deliver all the above, it has the potential to transform the way we search.
For more information, see Google's blog post.
