‘Sophisticated,’ ‘Nuanced’ Google Gemini Supersedes AWS, ChatGPT
Did the search giant and cloud provider just rack up a major victory in the AI wars? Google says the model outperformed human experts on a benchmark spanning 57 subjects.
The AI wars are intensifying with Wednesday’s official unveiling of Google Gemini 1.0.
A week after rival Amazon Web Services launched its hopeful enterprise ChatGPT-killer, Amazon Q, Google is bringing something to market that promises to supersede typical generative AI, even the company’s own Bard. That’s largely due to how the search giant and world’s third-largest cloud computing provider built and trained Google Gemini — and how it expects the tool to eventually underpin all of its products and services, all with a more human touch than AI has so far achieved.
On top of that, Google Gemini will have immediate impact for channel partners and their end users.
“We are developing Gemini in a way that it is going to be available at various sizes and capabilities, and we’ll be using it immediately across all our products internally as well as bringing it out to both developers and cloud customers,” Sundar Pichai, CEO of Alphabet, parent company of Google and Google Cloud, said on the company's 2023 third-quarter call with analysts.
We have more insight into the availability date below.
More on Google Gemini
The existence of Google Gemini has been no secret. Pichai first announced it this past May during the Google I/O developer conference. What has remained a question is exactly what it would encompass and when it would be available.
We now have answers to those questions.
First, think of Google Gemini as a human brain on otherworldly steroids. (And, if the real-world use cases for Gemini come to fruition as predicted, Google Cloud probably just racked up a major victory in the AI wars.) Over the last six months, in training, Gemini “was starting to show impressive capabilities that we hadn't seen in prior models,” Eli Collins, vice president of product at Google DeepMind, told media in a Dec. 5 briefing. (DeepMind is one of the teams behind Google Gemini; Brain Team is the other.)
“For a long time we wanted to build a new generation of AI models inspired by the way people understand and interact with the world — an AI that feels more like a helpful collaborator and less like a smart piece of software,” Collins continued. “Gemini brings us a step closer to that vision. With a score of over 90%, Gemini is the first AI model to outperform human experts on the industry-standard benchmark, MMLU (Massive Multitask Language Understanding).”
Google Gemini combines different AI models into one, emerging with genuinely multimodal capabilities. In other words, the platform can process and generate data far beyond text, to include images, audio and more.
“It has sophisticated reasoning capabilities and it can code at an advanced level,” Collins said.
Because of its inherent multimodal approach, Google Gemini can “seamlessly understand and reason about all kinds of inputs far better than existing models,” Collins added. “Gemini can understand nuanced information from text, images, audio and code, and it can answer questions relating to complicated topics and reasoning in math and physics.”
Think about your kid’s homework assignments. If you’re feeling bad because you can’t help your child as much as you’d like, Google Gemini can come to the rescue. The tool can solve problems, read answers, understand what was right and wrong, and explain concepts that need clarification, Google said in a video played for media.
Such in-depth capabilities apply across domains, from science to business.
The 3 Flavors of Google Gemini
Google is breaking out Google Gemini into three sizes so that it can run on everything from sprawling data centers to small mobile devices.
There’s Gemini Ultra, for highly complex tasks, including advanced coding. Then there’s Gemini Pro, best for scaling across a range of activities.
To that point, as of Dec. 6, Bard contains a version of Gemini Pro to deliver more advanced reasoning, planning, understanding and more, Google said. Early next year, however, Google will release Bard Advanced, giving users “access to our best models and capabilities, starting with Gemini Ultra,” Demis Hassabis, CEO of Google DeepMind, wrote in a Dec. 6 blog.
Finally, there’s Gemini Nano, delivering AI in manageable pieces to mobile devices, which lack the computing power of their larger counterparts. Look for Gemini functionality in the Pixel 8 Pro, including Summarize in the Recorder app and Smart Reply in Gboard. WhatsApp will be the first messaging app to get the update. Other Android devices will get Gemini at some point, too.
Along those lines, Google expects to make Gemini available in more products and services, including Search, Ads, Chrome and Duet AI.
Developers and enterprise users may access Gemini Pro starting Dec. 13. Find the Gemini API in Google AI Studio or Google Cloud Vertex AI.
Google Gemini Ultra: You Can’t Touch This
Meanwhile, Gemini Ultra remains off limits.
“[W]e’re currently completing extensive trust and safety checks, including red-teaming by trusted external parties, and further refining the model using fine-tuning and reinforcement learning from human feedback before making it broadly available,” Hassabis said.
As with any AI, the debut of Google Gemini raises questions around ethical use. Those issues go beyond the scope of this article but Google says it’s on top of the matter. Only select customers, developers, partners and “safety and responsibility experts” will get to conduct early experimentation and send feedback to Google before Gemini Ultra rolls out fully in 2024.
“As we innovate, we're committed to advancing our responsible AI techniques to address new challenges as they arise,” Collins said. “I've personally been able to experiment with Gemini for the past few months and seeing what we've already been able to build with it, I’m in awe of what it's capable of. And what we've learned today is actually just a glimpse. This is the start of a new era for us at Google as we continue to rapidly innovate and advance the capabilities of our models.”
Google Gemini Miscellany
Alongside the Dec. 6 unveiling of Google Gemini, Google is talking up its associated processing power. The company’s latest tensor processing unit, version 5, lets teams train models nearly three times faster than its predecessor and with lower latency, per Google.
That’s important, given how much energy and time AI models can consume.
“One of the key points that we're really excited about with Gemini Ultra is, despite being our largest model, it's significantly cheaper to serve than our previous larger model,” Collins said. “So it's not just more capable, it's also far more efficient. We think that's important on a number of levels, not just for ESG, but for product applications. And one of the other points … is that we also made Gemini much more efficient to train. We still require significant compute to train Gemini but we're getting much more efficient in terms of our ability to train these large models.”
As for the accuracy of Google Gemini, hallucinations remain a possibility, if not a probability. Despite Gemini’s knowledge of 57 subjects, including history, law and medicine, it does not have a 100% accuracy rate.
“We made a ton of progress in what's called factuality with Gemini, so Gemini is our best model in that regard,” Collins told us on Dec. 5. “But it's still, I would say, an unsolved research problem.”
As for how Google Gemini compares to GPT-4, the latest model behind ChatGPT, Collins didn’t quite answer the question.
“Gemini is state-of-the-art across a wide range of benchmarks − as I mentioned 30 out of 32 of the widely used ones in the ML research community − and so we do see it setting new kind of frontiers across the board,” he said.
Google is not specifying how many parameters have gone into Google Gemini. Parameters are the learned values inside a model that shape the result an end user receives.
“We're not sharing the specific technical details on the parameters but we are going to publish the technical white paper,” Collins said.
However, Collins did say that Gemini models train on data that come from the open web and that feature different languages.
“Obviously, we pass our datasets for quality filtering, and perform security filtering on inappropriate content,” he said.
In fact, Google Gemini trained on more than 100 languages, Collins noted. That’s helpful for everything from ingesting information to translating it.
In terms of how Google plans to make money off Gemini beyond adding it to products and services, that’s still up in the air.
“[O]ur focus is delivering the best product experience for people with Gemini,” said Sissie Hsiao, vice president and general manager of Assistant and Bard at Google. “We'll explore how monetization may look but we don't have anything specific on that to share right now.”
For channel partners themselves, the secret to monetizing AI lies in putting internal data to use.
Pichai on AI Momentum: ‘Profound,’ ‘Incredible’
2023 will go down in the tech annals as the year of generative AI. Google, however, may have just redefined (or even upended) the genre as the industry reviews a historic 12 months of innovation. Pichai, for his part, would probably agree.
“I believe the transition we are seeing right now with AI will be the most profound in our lifetimes, far bigger than the shift to mobile or to the web before it,” he said in a Dec. 6 blog. “AI has the potential to create opportunities — from the everyday to the extraordinary — for people everywhere. It will bring new waves of innovation and economic progress and drive knowledge, learning, creativity, and productivity on a scale we haven’t seen before.”
And the pace of progress will just keep accelerating, Pichai added. Momentum has been “incredible,” he said, “and yet, we’re only beginning to scratch the surface of what’s possible.”
Google Gemini, Pichai said, stands out as “the first realization of the vision we had when we formed Google DeepMind earlier this year. This new era of models represents one of the biggest science and engineering efforts we’ve undertaken as a company. I’m genuinely excited for what’s ahead, and for the opportunities Gemini will unlock for people everywhere.”