Google Cloud Platform Technology Nuggets - June 16-30, 2025 Edition
Welcome to the June 16–30, 2025 edition of Google Cloud Platform Technology Nuggets. The nuggets are also available in the form of a Podcast.
AI and Machine Learning
Gemini 2.5 models have seen some significant updates. These include:
Gemini 2.5 Flash and 2.5 Pro are generally available
New Gemini 2.5 Flash-Lite in public preview
New Supervised Fine-Tuning (SFT) for Gemini 2.5 Flash is generally available
Updated Live API with native audio in public preview
Veo 3 is now available for all Google Cloud customers and partners in public preview on Vertex AI. Get started with Veo 3 to create near-cinematic quality generative video. Check out the blog post to get inspired.
Looking to optimize Gemini models for specific video-based analytical tasks? Here is a post on how to fine-tune Gemini 2.5 models using video inputs via Vertex AI. The expanded tuning capabilities now allow you to include image, audio, and video, moving beyond traditional text-based customisation. The tuning would help you to specifically address use cases for automated video summarisation, event recognition, content moderation, and improved video captioning. Check out the blog post that shares best practices on effective tuning techniques.
Large-scale AI model training jobs need to address key challenges like frequent hardware failures. Google Cloud’s new multi-tier checkpointing solution tackles this problem by saving training progress (checkpoints) faster and more frequently. Checkpoints are stored across multiple tiers, including in-cluster memory and Cloud Storage, which enables rapid recovery from failures and reduces lost progress. Check out how this is made possible.
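To make the tiering idea concrete, here is a minimal sketch in Python. It is not Google Cloud's implementation: the class name, the two local directories standing in for the fast tier (in-cluster memory) and the durable tier (Cloud Storage), and the `durable_every` parameter are all assumptions made for illustration. It saves every step to the fast tier, only every Nth step to the durable tier, and restores from the newest checkpoint it can find.

```python
import json
import shutil
import tempfile
from pathlib import Path
from typing import Optional


class TieredCheckpointer:
    """Illustrative only: save often to a fast tier, less often to a durable tier."""

    def __init__(self, fast_dir: Path, durable_dir: Path, durable_every: int = 5):
        self.fast_dir = fast_dir
        self.durable_dir = durable_dir
        self.durable_every = durable_every
        fast_dir.mkdir(parents=True, exist_ok=True)
        durable_dir.mkdir(parents=True, exist_ok=True)

    def save(self, step: int, state: dict) -> None:
        payload = json.dumps({"step": step, "state": state})
        name = f"ckpt_{step:08d}.json"  # zero-padded so names sort by step
        (self.fast_dir / name).write_text(payload)
        if step % self.durable_every == 0:
            (self.durable_dir / name).write_text(payload)

    def restore(self) -> Optional[dict]:
        # Pick the newest checkpoint that survives in any tier.
        candidates = sorted(
            list(self.fast_dir.glob("ckpt_*.json"))
            + list(self.durable_dir.glob("ckpt_*.json")),
            key=lambda p: p.name,
        )
        if not candidates:
            return None
        return json.loads(candidates[-1].read_text())


def simulate_failure_and_recover() -> tuple:
    """Return the restored step before and after losing the fast tier."""
    with tempfile.TemporaryDirectory() as root:
        fast, durable = Path(root) / "fast", Path(root) / "durable"
        ckpt = TieredCheckpointer(fast, durable, durable_every=5)
        for step in range(1, 8):
            ckpt.save(step, {"loss": 1.0 / step})
        before = ckpt.restore()["step"]
        shutil.rmtree(fast)  # simulate losing the fast tier
        fast.mkdir()
        after = ckpt.restore()["step"]
    return before, after
```

In this toy run, recovery falls back from step 7 to step 5 once the fast tier is lost, which illustrates the trade-off: the fast tier keeps lost progress small, while the durable tier bounds it when the fast tier is gone.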
Containers and Kubernetes
When you run your workloads on GKE, you ideally want them to be cost-effective while remaining flexible and performant. That is a tough task, and autoscaling in GKE plays a crucial role in determining workload scheduling. GKE offers multiple ways to autoscale: Horizontal Pod Autoscaler, Vertical Pod Autoscaler, Cluster Autoscaler, and Node Auto Provisioning. How do these work, and under what conditions should each be applied to get the optimal result? The blog post has some terrific details on how this is handled under different scenarios, ranging from sudden spikes in demand to limited capacity, unavailable node types, resources not available in a specific region, and more.
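As a small illustration of one of these autoscalers, the Kubernetes documentation describes the Horizontal Pod Autoscaler's core calculation as desiredReplicas = ceil(currentReplicas × currentMetricValue / desiredMetricValue), with scaling skipped when the ratio is within a tolerance (0.1 by default). The Python sketch below reproduces only that arithmetic. The function name and the min/max defaults are assumptions for illustration, and the real controller also handles details such as missing metrics and stabilization windows.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,
    max_replicas: int = 10,
    tolerance: float = 0.1,
) -> int:
    """Simplified HPA calculation; ignores readiness and stabilization."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Within tolerance of the target: leave the replica count alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the configured bounds.
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 replicas averaging 90% CPU against a 60% target would scale to 6, while 62% against the same target stays at 4 because the ratio falls within the tolerance.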
Identity and Security
The 2nd CISO Bulletin for June is out. This edition primarily covers the escalating cyber threats targeting European healthcare organizations.
Mandiant Consulting notes in its M-Trends 2025 report that stolen credentials “are now the second-highest initial infection vector, making up 16% of our investigations”. How often do developers end up storing secrets in files that get checked into public source repositories? Google Cloud has developed a tool to scan open-source artifacts for leaked Google Cloud credentials, which helps to identify and report exposures rapidly, even retrospectively. Check out the blog post for more details.
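To give a feel for what credential scanning involves, here is a minimal Python sketch. It is not the tool described in the post. The three regular expressions are assumptions based on commonly seen shapes (API keys beginning with `AIza`, service account JSON files carrying `"type": "service_account"`, and PEM private key headers), and a real scanner would need far more patterns plus validation to cut false positives.

```python
import re
from typing import List, Tuple

# Assumed, illustrative patterns; not an exhaustive or official list.
PATTERNS = {
    "google_api_key": re.compile(r"AIza[0-9A-Za-z_\-]{35}"),
    "service_account_key": re.compile(r'"type"\s*:\s*"service_account"'),
    "private_key_block": re.compile(r"-----BEGIN (?:RSA )?PRIVATE KEY-----"),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings


def demo_findings() -> List[Tuple[int, str]]:
    # Synthetic input: the "key" is just the prefix plus 35 filler characters.
    sample = (
        'api_key = "AIza' + "x" * 35 + '"\n'
        '{"type": "service_account"}\n'
        "nothing sensitive here\n"
    )
    return scan_text(sample)
```

Running the same function over every file in a package archive, including older published versions, is one way the retrospective part could be approached.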
Keeping your AI systems in sync with organization compliance requirements is a challenge. Are data access controls enforced across the entire AI lifecycle? How do we validate the integrity of the models? These and many other questions arise. Enter an automated solution, the Recommended AI Controls framework, available now as a standalone service and as part of Security Command Center. The service is available from the console and is straightforward to use: select the framework, select the scope (organization, folder, project) and run the assessment. Check out the post for more details.
We have a new thing to learn in IAM: IAM Deny policies. These explicitly prohibit certain actions, regardless of other permissions granted. This is in contrast to IAM Allow policies, and the key thing to note is that Deny policies take precedence and are evaluated first. Check out the blog post that explains IAM Deny with practical use cases, including restricting high-privilege actions and enforcing organizational standards, and recommends using it in conjunction with complementary security tools like Organization Policies and Policy Simulator.
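The evaluation order can be pictured with a small Python model. This is a deliberately simplified sketch, not the IAM API. The `DenyRule` class, the `is_allowed` function, and the principals are hypothetical, and real deny policies also support conditions, exception permissions, and inheritance through the resource hierarchy.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class DenyRule:
    """Hypothetical, simplified stand-in for a deny policy rule."""
    denied_principals: Set[str]
    denied_permissions: Set[str]
    exception_principals: Set[str] = field(default_factory=set)


def is_allowed(
    principal: str,
    permission: str,
    deny_rules: List[DenyRule],
    allow_bindings: Dict[str, Set[str]],
) -> bool:
    # Step 1: deny rules are checked first and win over any allow.
    for rule in deny_rules:
        if (
            principal in rule.denied_principals
            and permission in rule.denied_permissions
            and principal not in rule.exception_principals
        ):
            return False
    # Step 2: only then do allow bindings decide.
    return permission in allow_bindings.get(principal, set())


DELETE = "cloudresourcemanager.googleapis.com/projects.delete"  # example permission
deny_rules = [
    DenyRule(
        denied_principals={"alice", "break-glass-admin"},
        denied_permissions={DELETE},
        exception_principals={"break-glass-admin"},
    )
]
allow_bindings = {"alice": {DELETE}, "break-glass-admin": {DELETE}}
```

Here `alice` is blocked from deleting projects even though an allow binding grants the permission, while the excepted `break-glass-admin` principal falls through to the allow check and succeeds.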
Google has been named a Strong Performer in The Forrester Wave™: Security Analytics Platforms, Q2 2025. Check out the blog post and download the complimentary report.
Data Analytics
Consider the following requirement of your data warehouse: “of the customers who complained about performance issues during interactions last month, show me the top 10 by revenue”. The customers could have complained over email, over chat, over a phone call (you have the audio recordings) and more. You definitely have this data, but you would need to build multiple data pipelines to handle the different formats (text, audio), save the normalized and extracted data in a structured format (e.g., in a BigQuery table), and join it with each customer’s revenue data. Sounds good, but what if you could do all of that via a single BigQuery SQL query?
The magic behind this is the introduction of ObjectRef, a new data type that supports multi-modality, has SQL and Python support, plus more. Check out the blog to dive deeper into this data type and understand how it works.
Looking to read up on all the news around databases, data analytics and related services on Google Cloud? Check out the “What’s new with Google Data Cloud” summary.
Gartner® has named Google a Leader in the 2025 Magic Quadrant™ for Analytics and Business Intelligence, for the second consecutive year. Check out the blog post and download the complimentary 2025 Gartner Magic Quadrant for Analytics and Business Intelligence Platforms.
Databases
Google Spanner has been recognized with the 2025 ACM SIGMOD Systems Award for its approach to globally distributed data management. One feature that stands out is TrueTime, which enables external consistency and serializability at scale, effectively resolving the long-standing dilemma between ACID transactions and horizontal scalability. Check out the blog post for more on Spanner’s features and, specifically, the features of Spanner in Google Cloud.
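For readers curious about how TrueTime supports external consistency, the original Spanner paper describes a "commit wait" rule: TrueTime reports the current time as an interval [earliest, latest], a transaction takes its commit timestamp from that interval, and the commit is not made visible until TrueTime's earliest bound has passed that timestamp. The Python sketch below simulates the rule with a fake clock. All class and function names, the uncertainty value and the tick size are assumptions for illustration, not Spanner's actual code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TTInterval:
    earliest: float
    latest: float


class FakeClock:
    """Manually advanced clock so the example needs no real sleeping."""

    def __init__(self, start: float):
        self.t = start

    def advance(self, dt: float) -> None:
        self.t += dt


class SimulatedTrueTime:
    """Reports time as an interval of width 2 * epsilon around the clock."""

    def __init__(self, clock: FakeClock, epsilon: float):
        self._clock = clock
        self._epsilon = epsilon

    def now(self) -> TTInterval:
        t = self._clock.t
        return TTInterval(t - self._epsilon, t + self._epsilon)


def commit_with_wait(tt: SimulatedTrueTime, clock: FakeClock, tick: float = 0.001) -> float:
    # Choose a timestamp no earlier than the true current time.
    commit_ts = tt.now().latest
    # Commit wait: hold the commit until commit_ts is definitely in the past.
    while tt.now().earliest <= commit_ts:
        clock.advance(tick)
    return commit_ts


def demo_commit_wait() -> tuple:
    clock = FakeClock(start=100.0)
    tt = SimulatedTrueTime(clock, epsilon=0.005)
    commit_ts = commit_with_wait(tt, clock)
    return commit_ts, tt.now().earliest, clock.t - 100.0
```

In this simulation the wait lasts roughly twice the clock uncertainty, which suggests why keeping that uncertainty small matters for commit latency.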
Application Development
The John Lewis Digital Platform team had a vision: “empower development teams and arm them with the tools and processes they needed to go to market fast, with full ownership of their own business services.” The context here is their journey, in which they first converted their monolithic application to microservices and then doubled down on a platform approach to enable multiple teams to build their own services and create delightful experiences for their customers. Check out a two-part series titled “Using Platform Engineering to simplify the developer experience”, which dives into this journey and provides excellent details on how Google Cloud, GKE and other services played a key role in helping them architect this platform. This is essential reading to understand Platform Engineering in the trenches. Here are the parts: Part 1 and Part 2.
Looking to develop a KYC (Know Your Customer) implementation based on Agents and using the Agent Development Kit (ADK)? Here is a reference architecture and solution on how to do that using ADK. It describes how to use a root agent that works with child agents that handle tasks such as checking resumes, searching online for information, calculating wealth and more.
Developers & Practitioners
Model Context Protocol (MCP) Servers are now considered the standard way to integrate additional tools into AI Agents. One of the best ways to make your MCP Server available to others is to host it remotely, and what better place to do that than Cloud Run. Check out this article that gives an overview of MCP and its transport protocols, and then demonstrates how you can write your first MCP Server and host it on Cloud Run.
And while we are on the topic of Tools for AI Agents, there has been a clear path as to how we got to MCP Servers. The point is that not every tool made available to an AI Agent needs to be an MCP Tool. It could start off as a simple function in the code itself. This is a must-read article for developers building Agents, covering how tools have evolved and which kinds remain available today in Agent Frameworks, from Function Tools and In-built tools to MCP Servers and more.
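To show how little is needed for the "simple function" starting point, here is a Python sketch using only the standard library. Agent frameworks that support function tools typically inspect a function's name, docstring and type hints to tell the model what the tool does. The `get_order_status` tool and the `describe_tool` helper are hypothetical, and the output format is an assumption rather than any framework's actual schema.

```python
import inspect
from typing import Any, Callable, Dict


def get_order_status(order_id: str, include_history: bool = False) -> dict:
    """Look up the status of an order."""
    # Hypothetical tool body; a real tool would call a backend service.
    return {
        "order_id": order_id,
        "status": "shipped",
        "history": [] if include_history else None,
    }


def describe_tool(fn: Callable[..., Any]) -> Dict[str, Any]:
    """Build a simple tool description from a function's signature."""
    params = {}
    for name, param in inspect.signature(fn).parameters.items():
        params[name] = {
            # Use the annotation's name when it is a plain class like str.
            "type": getattr(param.annotation, "__name__", str(param.annotation)),
            # A parameter without a default must be supplied by the model.
            "required": param.default is inspect.Parameter.empty,
        }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": params,
    }
```

Calling `describe_tool(get_order_status)` yields the tool's name, its one-line description and the two parameters, which is roughly the information an MCP Server would otherwise advertise over the protocol.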
Infrastructure
If you are running Kafka consumer workloads on Cloud Run, there are a couple of features that would be of interest:
Cloud Run worker pools: This is a new Cloud Run offering, distinct from Cloud Run services and jobs, introduced as a cost-effective, purpose-built environment for continuous, non-HTTP, pull-based background processing. Workloads of this kind, such as Apache Kafka consumers, were previously difficult to run on Cloud Run.
Cloud Run Kafka Autoscaler (Open Source): This works with worker pools to dynamically adjust consumer instances based on actual Kafka metrics like offset lag, even scaling down to zero.
Check out the blog post for more details.
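To illustrate what scaling on offset lag can look like, here is a minimal Python sketch. It is not the algorithm used by the open-source Cloud Run Kafka Autoscaler. The function name, the `target_lag_per_consumer` parameter and the rounding rule are assumptions for illustration. The cap at the partition count reflects how Kafka consumer groups work: within a group, consumers beyond the number of partitions sit idle.

```python
import math
from typing import Dict


def desired_consumers(
    lag_by_partition: Dict[int, int],
    target_lag_per_consumer: int,
    max_instances: int,
) -> int:
    """Pick a consumer count from total offset lag (illustrative heuristic)."""
    if target_lag_per_consumer <= 0:
        raise ValueError("target_lag_per_consumer must be positive")
    total_lag = sum(lag_by_partition.values())
    if total_lag == 0:
        return 0  # nothing to consume: scale to zero
    wanted = math.ceil(total_lag / target_lag_per_consumer)
    # More consumers than partitions would sit idle, so cap there too.
    return min(wanted, len(lag_by_partition), max_instances)
```

With three partitions lagging by 500, 700 and 300 messages and a target of 1,000 messages per consumer, this heuristic would ask for two instances, and it would drop to zero once the lag clears.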
Google Cloud’s new C4D virtual machine family, powered by 5th Gen AMD EPYC processors (Turin) and Google Titanium technology, is now generally available (GA). These VMs are well suited to a wide range of workloads including databases, AI inference, web and application servers, and mission-critical business applications. Check out the blog post for more details and performance benchmarks.
Backup vaults, which were introduced last year, provide immutable and indelible backup capabilities for mission-critical VMs, covering both VM metadata and all attached disks. This feature has received two key updates: the addition of support for standalone Persistent Disk (PD) and Hyperdisk backups, and the general availability of multi-region backup vault creation. Check out the post to understand how this works.
Networking
Cloud CDN is a high-performance edge caching solution, and it has recently added support for Service Extensions plugins. The blog post describes this as “allowing you to run your own custom code directly in the request path in a fully managed Google environment with optimal latency. This allows you to customize the behavior of Cloud CDN and the Application Load Balancer to meet your business requirements.” These plugins are built on WebAssembly (Wasm) and enable several use cases like custom security, logging, traffic steering and more. Check out the post for more details, including a quickstart.
Learn more about Google Cloud
It is often said that the network is the magic behind Google Cloud. Understanding your cloud provider’s network infrastructure is an essential requirement, as per this post. As the post states, “Whether you’re working on-premises or leveraging the vast power of the Google Cloud, mastering fundamental networking concepts will empower you to build, deploy, and optimize your applications with confidence”. Check out the post that covers key networking concepts in Google Cloud, with references to hands-on Codelabs too.
Write for Google Cloud Medium publication
If you would like to share your Google Cloud expertise with your fellow practitioners, consider becoming an author for the Google Cloud Medium publication. Reach out to me via comments and/or fill out this form and I’ll be happy to add you as a writer.
Stay in Touch
Have questions, comments, or other feedback on this newsletter? Please send Feedback.
If any of your peers are interested in receiving this newsletter, send them the Subscribe link.