
What We Know So Far
The information available so far indicates that Gemini could represent a significant advancement in natural language processing.
More clues about Gemini’s progress emerged this week: The Information reported that Google gave a small group of outside developers early access to Gemini.
Pichai said it is designed from the ground up to be multimodal, integrating text, images, and other data types, which could allow for more natural conversational abilities.
Meta most recently announced the release of Llama 2, an open-source AI model, in partnership with Microsoft. The company appears committed to developing AI responsibly and making it more accessible.
The Countdown To Google Gemini
He revealed that Gemini builds on DeepMind’s multimodal work, such as the image captioning system Flamingo.
At the Google I/O developer conference in May 2023, CEO Sundar Pichai announced the company’s upcoming artificial intelligence (AI) system, Gemini.
In an interview with Wired published a few days later, Pichai gave the clearest indication yet of how Gemini fits into Google’s product roadmap.
There was no official response to Elon Musk’s follow-up question about whether the numbers provided by SemiAnalysis were correct.
Select Companies Have Early Access To Gemini
— Elon Musk (@elonmusk) August 30, 2023
Pichai stated that Gemini combines the strengths of DeepMind’s AlphaGo system, known for mastering the complex game Go, with extensive language modeling capabilities.
Hassabis stated Gemini is a “series of models” that will be made available in different sizes and with different capabilities.
He stated conversational AI systems like Bard are “not the end state” but waypoints leading towards more advanced chatbots.
In June, he told Wired that techniques from AlphaGo, such as reinforcement learning and tree search, may give Gemini new abilities like reasoning and problem-solving.
While the news about Gemini is promising thus far, Google isn’t the only company reportedly ready to launch a new LLM to compete with OpenAI.
He also mentioned Gemini may utilize memory, fact-checking against sources like Google Search, and improved reinforcement learning to enhance accuracy and reduce hallucinated and potentially harmful content.
Early Gemini Results Are Promising
Are the numbers wrong?
In an update to his professional bio over the summer, Google Chief Scientist Jeffrey Dean said Gemini is one of the “next-generation multimodal models” he is co-leading.
The latest news from Meta and Google comes a few days after the first AI Insight Forum, where tech CEOs met privately with members of the United States Senate to discuss the future of AI.
This suggests Gemini may soon be ready for a beta release and integration into services like Google Cloud Vertex AI.
Meta Working On LLM To Compete With OpenAI
Featured image: VDB Photos/Shutterstock
The large language model (LLM) is being developed by Google DeepMind, the division formed by combining the Google Brain team and DeepMind. It could compete with AI systems like OpenAI’s ChatGPT and possibly outperform them.
Hassabis also stated Gemini may employ retrieval methods to output entire blocks of information, rather than generating text word by word, to improve factual consistency.
Additional details came from Demis Hassabis, CEO of DeepMind.
In a September Time interview, Hassabis reiterated that Gemini aims to combine scale and innovation.
Pichai said Gemini and future iterations will ultimately become “incredible universal personal assistants” integrated throughout people’s daily lives in areas like travel, work, and entertainment.
He stated it will utilize Pathways, Google’s new AI infrastructure, to enable scaling up training on diverse datasets.
This hints that Gemini could be the largest language model created to date, likely exceeding GPT-3, which has 175 billion parameters.
It Will Come With Various Sizes And Capabilities
If Gemini lives up to expectations, it could drive a change in interactive AI, aligning with Google’s ambitions to “bring AI in responsible ways to billions of people.”
The fusion of DeepMind’s latest AI research with Google’s vast computational resources makes its potential impact difficult to overstate.
He said incorporating planning and memory is in the early exploratory stages.
OpenAI CEO Sam Altman tweeted what appeared to be a response to a paywalled article reporting that Google Gemini could outperform GPT-4.
He reiterated that Gemini will combine the strengths of text and images, stating that today’s chatbots will “look trivial” by comparison within a few years.
Competitors Are Interested In Gemini’s Performance
Overall, Hassabis said Gemini is showing “very promising early results.”
Advanced Chatbots As Universal Personal Assistants
While details remain scarce, here is what we can piece together from the latest interviews and reports about Google Gemini.
Google Gemini Will Be Multimodal
Pichai also hinted at future capabilities like memory and planning that could enable tasks requiring reasoning.
Gemini Can Use Tools And APIs
According to the Wall Street Journal, Meta is also working on an AI model that would compete with the GPT model that powers ChatGPT.