Date: 2024-04-22

Gemini has marked a groundbreaking milestone in Google's commitment to democratising AI and empowering businesses across various sectors with transformative capabilities. Launched in December 2023, the cutting-edge platform has ushered in a new era of innovation across text, image, audio, and video benchmarks, revolutionising how organisations harness AI for efficiency, creativity, and growth.

Naren Kachroo, Head GTM AI, Google Cloud India, and Anirudh Murali, Customer Engineer, Google Cloud, India, showcased the various capabilities of Gemini at TechSparks Mumbai 2024, India’s premier technology and startup summit.

The demonstration offered a glimpse of what Gemini and the entire Google Cloud AI platform could do. It highlighted Google's achievements and advancements in generative AI, paving the way for a new era in AI technology where innovation meets responsibility, efficiency, and transformative capabilities that help businesses move towards unparalleled success.

For businesses eager to embrace the power of AI and position themselves at the forefront of innovation, this was an opportunity to experience cutting-edge technology and groundbreaking ideas first-hand and become part of the AI revolution that has reshaped industries worldwide.

From founding fathers to Gemini's revolution

The journey of innovation and the invention of generative AI began at Google, positioning the company as one of the founding fathers of generative AI large language models. This transformative journey started in 2017 with transformers, extending contributions to diffusion models and driving continuous innovation.

In December 2023, Google proudly announced Gemini, its most capable, robust, and well-trained multimodal model, developed through a collaboration between various teams within Google, including Google Research and DeepMind, among others.

Gemini represents a new generation of generative AI, designed with three core principles in mind. “Firstly, it is natively multimodal, allowing seamless interaction across language, images, video, and code without being stitched together at the top layer. Secondly, it is highly optimised and has excelled in benchmark tests, showcasing exceptional performance in various scenarios.

“Thirdly, at its core it has the Google principle of responsible AI, making it ethical and sustainable. It is built to run efficiently on Google's Tensor Processing Units (TPUs), minimising its carbon footprint and aligning with Google's commitment to environmental sustainability,” Kachroo said.

Beyond LLMs: The evolving AI landscape

For practitioners of AI, particularly in generative AI, it is evident that a large language model (LLM) alone is not sufficient for leveraging this technology effectively in enterprise or business contexts, Kachroo said. Customers demand more than just an LLM; they seek a comprehensive platform that enables them to build, train, and integrate models seamlessly within their environment.

They require a platform capable of hosting and scaling applications powered by large language models and generative AI. Additionally, they aim to streamline developer tasks and coding processes, from architecture design to application deployment. Furthermore, customers are interested in a platform that collaborates with a broader ecosystem of partners, including system integrators and infrastructure providers.

“At Google Cloud, we collaborate with various partners, including both large GSI partners and born-in-the-cloud partners to give our partners a host of options best suited to their individual needs,” Kachroo said.

Gemini's advanced AI capabilities and real-world impact

At the demonstration, Murali showcased several of Gemini's capabilities: recognising specific, localised content; extracting data; processing and understanding diverse conversational content; and comprehending video content and responding to it conversationally.

“All of this is part of our Vertex AI Studio, available on the Google Cloud Console for experimentation and exploration of different models and parameters. Gemini's versatility in handling text, images, and videos underscores its advanced AI capabilities, making it a valuable tool for various applications,” Murali said.

Other practical applications of Gemini include catalogue management for ecommerce retailers, image-based search capabilities, working with limited data, and extraction of dynamic information from external domains. “These capabilities represent what Gemini is enabling for the GenAI era – natural conversations grounded in your data, empowering users and enhancing customer experiences,” he said.

Kachroo emphasised the uniqueness of Google's capabilities, stating, “The multimodal nature, understanding text, images, and video, along with their contents, natively in a model, is unique to us.” He added that this has helped Google collaborate with enterprise customers on various use cases, such as compliance checks on CCTV feeds, highlighting the breadth of applications at an enterprise level.

He concluded the demonstration by emphasising how seriously enterprises are adopting transformative technologies like Gemini: “Enterprises all over the world, including in India, are taking this very seriously. Solving serious business problems, making customer experiences magical, and making user journeys seamless.”