- The annual Google I/O developer conference in California gave us a glimpse into everything the company’s been up to lately—most of it unsurprisingly revolves around AI.
- Google’s native chatbot Gemini has also gotten some interesting updates such as integration with Google Maps and Gmail. The latest 1.5 version of the chatbot was also announced.
- The company is also planning to launch Gemini Nano—its smallest-ever AI model.
On Tuesday (May 14), at its annual Google I/O developer conference in California, Google announced a bunch of updates. During the two-hour presentation, CEO Sundar Pichai shared the company’s future plans, most of which include dabbling in AI.
There’s a lot of chatter about the Google I/O updates on the internet, but if you’re looking for a one-stop guide to find out every big and small highlight of the keynote, this is the place to be. Read on.
Given that we’re well into 2024 now, it’s only understandable that every update from a major tech company will come with a bunch of AI-related changes. In fact, in the 110 minutes Google’s keynote lasted, the word “AI” was used a whopping 121 times.
Here’s what Google is doing with AI right now:
Gemma 2
For those who don’t know, Google Gemma is a database of lightweight open-source generative AI models for researchers and developers.
While it does come in several sizes, developers were long asking for a bigger model—and it is among us now. Welcome Gemma 2, with a new 27-billion-parameter model. It’s expected to launch this June.
Circle to Search
This is easily one of the most interesting features. Instead of switching apps to search for something, this lets you circle your query and then provides you with an answer almost instantaneously.
It doesn’t matter if it’s a text or an image; just circle it and you’ll have an accurate answer for your query.
The feature has already been expanded to over 100 million Samsung and Google devices and the number is likely to double by the end of the year.
Firebase Genkit
This will be another super helpful tool for developers. Added to the Firebase platform, it will basically help developers create AI-powered apps in JavaScript/TypeScript, with Go support in the pipelines and expected to come soon, too.
Also, since it’s open-source and uses the Apache 2.0 license, developers will be able to create those apps much faster. Also, it will support a number of third-party tools. For example, in addition to using Gemini, Firebase users can also work with other open-source models such as Ollama.
Veo
In order to compete with OpenAI’s Sora, Google has launched its own AI tool that can create 1080p video clips from just a simple text prompt.
Although the video length is capped at one minute for now, it makes up for the short duration by allowing you to add multiple cinematic styles such as landscapes and time lapses.
What’s more, once the video is ready, you can also edit it according to your needs and preferences.
Imagen 3
Imagen is Google’s native text-to-image generator and it just got its newest version, which is expected to be more accurate in interpreting prompts.
Pictures generated by Imagen 3 will also be more realistic and detail-oriented, and compared to Imagen 2, will have fewer distractions and errors.
Talking about the update, Demis Hassabis, CEO of DeepMind said: “This is also our best model yet for rendering text, which has been a challenge for image generation models.”
AI in Search
It looks like Google has decided to infuse AI into everything it does. Now, its popular search engine is also getting an AI makeover. For starters, users will finally get access to AI Overview, which is essentially a summary snippet of whatever topic you’re looking up.
Google is also planning to use AI to organize search results, deciding which page should be ranked where.
Last but not least, a certain plan to integrate Gemini (Google’s AI chatbot) with Google Search is also on the cards. Gemini will then double up as your agent and help you plan your meals and trips.
For example, you can type something like “Plan a dinner meal for Mother’s Day” and it will provide you with relevant recipe links. Exciting!
Ask Photos
This feature is scheduled to roll out later this summer and will allow you to search across Google Photos using voice commands.
It’s worth noting that Google Photos already allows you to simplify your search using keywords. For instance, if you want to check all mountain-related photos in your collection, you can just type in that keyword in the search box.
However, Ask Photos will make the process even easier. You could ask it “best photo from my trip to the Eiffel Tower” and it will go through all relevant images and pick the best one after analyzing the lighting and quality of the pictures.
Google Play
While AI-related news dominated the announcements this year, the tech giants had some interesting announcements for Google Play, too.
Developers will benefit from a new feature called Engage SDK that will allow them to showcase their apps in full-screen to potential users, thus creating an immersive and personalized experience.
The Play Store will also be offering custom store listings, under which developers can optimize their listings for different audience groups.
Other significant Google Play Store changes include:
- New app discovery feature
- Update to the Play Integrity API
- Gemini integration that can be used to write custom app descriptions
6th Generation Tensor Processing Units AI Chips
The 6th generation of TPU AI chips, nicknamed Trillium, is also set to be launched by Google later this year. Compared to the 5th generation, the latest version offers a 4.7X boost in computational performance per chip.
What makes these chips even more special is that they’re energy efficient—perhaps more energy efficient than anything Google has ever created before. At a time when the demand for AI chips is on the rise and the AI industry is facing a shortage of powerful chips, it’s good to see Google focusing on sustainable products whose energy needs can be met in the long term.
Furthermore, these chips are equipped with the 3rd generation of SparseCore, which in Google’s words is a “specialized accelerator for processing ultra-large embeddings common in advanced ranking and recommendation workloads.” Simply put, it will help the chips train the models faster and with lower latency.
Project IDX
Project IDX is officially available in beta mode. Previously, it was available on an invite-only basis. This new update will also come with the integration of Google Maps into the IDE, which will help add geolocation features to its app.
The platform will also be integrating with Chrome Dev Tools and Lighthouse to simplify debugging applications.
💡Note: Project IDX is an AI-assisted workspace for full-stack, multi-platform app development in the cloud.
Gemini Updates
Coming to where most of the fun is, let’s now take a quick look at all the Gemini (Google’s proprietary AI-powered chatbot) updates:
Gemini 1.5 Pro
The most important announcement is that Gemini is getting updated, and it will now be able to analyze longer documents, audio recordings, video recordings, codebases, and loads more thanks to the world’s longest context window.
To give you an idea of how much better its input capacity is now, Gemini Advanced can take up to 1 million tokens at once. Meta’s Claude 3 can take 300K tokens and GPT-4 128K.
Important: Token stands for the smallest budding unit of an input. For example, when processing a word, it’ll be broken into syllables. ‘Fantastic’ will be broken down as ‘fan,’ ‘tas,’ and ‘tic’ – that’s three tokens.
Gemini and Gmail
Google is trying to simplify communication by integrating Gemini with Gmail. You can use the AI tool from within Gmail to draft, search, and summarize emails.
It will also be able to handle more complex tasks. For example, if you have to return an item you bought online, it can help you search for the receipt and fill out the return form if needed.
Gemini and Google Maps
Google has decided to add Gemini to Google Maps, starting with the Places API. This will allow developers to show a summary of places in their own apps and websites, helping them save time which they’d have to otherwise spend on writing custom descriptions.
Gemini on Android
Android’s beloved Google Assistant will soon be replaced by Gemini, which will offer a more integrated experience. For example, images created by Gemini can be directly dragged and dropped into Gmail instead of having to separately download and attach it.
Similarly, on YouTube, you will be able to use the “Ask This Video” feature to find specific information within a video instead of having to watch the entire video.
Gemini Live
If you want to communicate better with Gemini, Gemini Live will fancy your wits. It allows you to carry out in-depth voice conversations with the chatbot. The experience is so realistic and human-like that you can even interrupt the tool while it’s speaking to ask a follow-up question.
What’s more, you can also take photos and videos of your surroundings and share them with the tool. It will be able to analyze it and carry out a full conversation based on it.
Gemini Nano
Google is all set to introduce its smallest AI model ever. Gemini Nano will directly integrate into Chrome’s desktop version, starting with Chrome 126. This will mostly be of use to developers, who can use this model to enhance their custom-built AI features.
Gemini Nano is also expected to power existing tools such as the “help me write” tool from Workspace Lab in Gmail. Furthermore, it will also soon be used to detect spam calls.
For instance, if someone calls you and asks for your login credentials or passwords, it will initiate a trigger, and you’ll get a notification stating that you might be falling prey to a scam.
While there’s no specific release date for this feature yet, we do know that it’s going to be optional—you can choose if you want to use it or not.
Quiz Master
If you use YouTube to watch educational videos, this feature can be useful for you. It lets you “raise your hand,” ask follow-up questions on the topic, and take a quiz on the subject.
Plus, thanks to the Gemini model’s long-context capabilities, it works perfectly well with longer videos as well. However, it’s disappointing that this feature is only rolling out to select Android users in the US, at least for now.
Generative AI for Learning
Google has introduced a new line of generative AI models called LearnLM, which will be specifically focused on learning and education. It’s basically a conversational tutor that can guide you on a variety of subjects.
This is not just going to be beneficial for students; it can also help teachers come up with more content and activities to make their lessons more interesting.
New ‘Web’ Filter for Search Results
The ‘Web’ search filter allows you to see only text-based links the same way you can opt to see only images, videos or shopping links on Google.
On mobile devices, the ‘Web’ filter will be a default addition alongside other filters. But on the desktop, the filters you see will depend on the topic you’ve looked up.
For context, Google decided to add this feature after it got a bunch of first-hand feedback from users that sometimes they just want to see the classic, blue text-based links.
Open AI and Meta are some of the biggest competitors of Google. While Google was busy making a slew of announcements at the I/O conference, both of its rivals have been up to interesting projects as well.
OpenAI has just launched its newest AI model GPT-4o. It offers seamless, human-like conversations, where you can talk to the tool back and forth without waiting for it to finish a statement. Hold on a minute! This is almost exactly what the Gemini Live update says it will do. It looks like the battle lines have been drawn then!
GPT-4o also lets you run web searches from within the tool. Basically, when it doesn’t have the answer to a prompt, it creates a query and enters it into a search engine using your keyword and then retrieves the most relevant results.
Furthermore, you can also use it to create your own versions of ChatGPT. It’s worth noting that a lot of m new features have been added in addition to those mentioned above. I’ve covered all of them in my in-depth coverage of the GPT-4o launch.
Speaking of Meta, it released a new AI assistant powered by Llama 3 last month. This is being integrated across all major Meta platforms, including WhatsApp, Instagram, and Facebook.