Cloud compliance can present significant regulatory and technical challenges for organizations. These complexities often include delineating compliance responsibilities and accountabilities between the customer and cloud provider.
At Google Cloud, we understand these challenges faced by our customers’ cloud engineering, compliance, and audit teams, and want to help make them easier to manage. That’s why we’re pleased to announce that our Audit Manager service, which can digitize and help streamline the compliance auditing process, is now generally available.
Understanding compliance across layers in Google Cloud.
Traditional compliance methodologies, reliant on manual processes for evidence collection, are inefficient, prone to errors, and resource-intensive. According to the Gartner® Audit Survey, “When surveyed on their key priorities for 2024, 75% of chief audit executives (CAEs) cited audit’s ability to keep up with the fast-evolving cybersecurity landscape as their top priority — making it the most commonly cited priority.”
Introducing Audit Manager
Audit Manager can help organizations accelerate compliance efforts by providing:
Clear shared responsibility outlines: A matrix of shared responsibilities that delineates compliance duties between the cloud provider and customers, offering actionable recommendations tailored to your workloads.
Automated compliance assessments: Evaluation of your workloads against industry-standard technical control requirements in a simple and automated manner. Audit Manager already supports popular industry and regulatory frameworks including NIST 800-53, ISO, SOC, and CSA-CCM. You can see the full list of supported frameworks here.
Audit-ready evidence: Automated generation of comprehensive, verifiable evidence reports to support your compliance claims and overarching governance activity. Audit Manager provides a quick execution summary of compliance at the framework level, plus the ability to deep-dive using control-level reports.
Actionable remediation guidance: Insights to swiftly address each compliance gap that is identified.
The compliance audit journey with Audit Manager
The cloud compliance audit process involves defining responsibilities, identifying and mitigating risks, collecting supporting data, and generating a final report. This process requires collaboration between Governance, Risk, and Compliance analysts, compliance managers, developers, and auditors, each with their own specific tasks. Audit Manager streamlines this process for all involved roles, which can help simplify their work and improve efficiency.
Shift left your compliance audit process with Audit Manager.
Customer case study: Deutsche Börse Group
Deutsche Börse Group, an international stock exchange organization and innovative market infrastructure provider, began their strategic partnership with Google Cloud in 2022. Their cloud transformation journey is well under way, which brings with it the challenge of achieving and documenting compliance in their environment.
Florian Rodeit, head of cloud governance for Google Cloud at Deutsche Börse Group, first heard about Audit Manager during a session at Google Cloud Next 2024 in Las Vegas.
“The Audit Manager product promises a level of automation and audit control that has a lot of potential. At Deutsche Börse Group, we were excited to access the preview, explore the functionality further and build out a joint solution,” he said.
Following the European preview launch of Audit Manager, Deutsche Börse Group and Google Cloud set up a collaborative project to explore automating cloud controls via Audit Manager. Deutsche Börse Group had already created a comprehensive control catalog to manage their cloud control requirements across the organization. They analyzed the Cloud Security Alliance’s Cloud Controls Matrix against their written rules framework to create inputs for Audit Manager, and set out ownership and implementation guidelines for cloud-specific controls.
Now, Deutsche Börse Group can use Audit Manager to check whether any resources are configured in ways that deviate from the control framework, such as resources that have been set up outside of approved regions. This provides automated, auditable evidence to support their specific requirements for compliant usage of Google Cloud resources.
Benjamin Möller, vice president and cloud governance expert, Deutsche Börse Group, has been leading the collaborative project. “Moving forward, we hope that Audit Manager will allow us to automate many of our technical controls — giving us robust assurance that we are compliant, enabling us to quickly identify and rectify non-compliance, and minimizing the manual overhead of audit evidence. We are excited to continue making progress on our joint venture,” he said.
Take the next step
To use Audit Manager, access the tool directly from your Google Cloud console. Navigate to the Compliance tab in your Google Cloud console, and select Audit Manager. For a comprehensive guide on using Audit Manager, please refer to our detailed product documentation. We encourage you to share your feedback on this service to help us improve Audit Manager’s user experience.
We are thrilled to announce new capabilities that make running Dataproc Serverless even faster, easier, and more intelligent.
Elevate your Spark experience with:
Native query execution: Experience significant performance gains with the new Native query execution in the Premium tier.
Seamless monitoring with Spark UI: Track job progress in real time with a built-in Spark UI available by default for all Spark batches and sessions.
Streamlined investigation: Troubleshoot batch jobs from a central “Investigate” tab displaying all the essential metrics highlights and logs filtered by errors automatically.
Proactive autotuning and assisted troubleshooting with Gemini: Let Gemini minimize failures and autotune performance based on historical patterns. Quickly resolve issues using Gemini-powered insights and recommendations.
Accelerate your Spark jobs with native query execution
You can unlock considerable speed improvements for your Spark batch jobs in the Premium tier on Dataproc Serverless Runtimes 2.2.26+ or 1.2.26+ by enabling native query execution — no application changes required.
This new feature in the Dataproc Serverless Premium tier improved query performance by ~47% in our tests on queries derived from TPC-DS and TPC-H benchmarks.
Start now by running the native query execution qualification tool, which can help you identify eligible jobs and estimate potential performance gains. Once you have identified the batch jobs suited to native query execution, you can enable it to run those jobs faster and potentially reduce costs.
Seamless monitoring with Spark UI
Tired of wrestling with setting up the persistent history server (PHS) clusters and maintaining them just to debug your Spark batches? Wouldn’t it be easier if you could avoid the ongoing costs of the history server and yet see the Spark UI in real-time?
Until now, monitoring and troubleshooting Spark jobs in Dataproc Serverless required setting up and managing a separate Spark persistent history server. Crucially, each batch job had to be configured to use the history server; otherwise, the open-source UI would be unavailable for analyzing that batch job. Additionally, the open-source UI suffered from slow navigation between applications.
We’ve heard you, loud and clear. We’re excited to announce a fully managed Spark UI in Dataproc Serverless that makes monitoring and troubleshooting a breeze.
The new Spark UI is built-in and automatically available for every batch job and session in both Standard and Premium tiers of Dataproc Serverless at no additional cost. Simply submit your job and start analyzing performance in real time with the Spark UI right away.
Here’s why you’ll love the Serverless Spark UI:
| | Traditional approach | The new Dataproc Serverless Spark UI |
| --- | --- | --- |
| Effort | Create and manage a Spark history server cluster. Configure each batch job to use the cluster. | No cluster setup or management required. The Spark UI is available by default for all your batches without any extra configuration, and can be accessed directly from the Batch / Session details page in the Google Cloud console. |
| Latency | UI performance can degrade with increased load. Requires active resource management. | Enjoy a responsive UI that automatically scales to handle even the most demanding workloads. |
| Availability | The UI is only available as long as the history server cluster is running. | Access your Spark UI for 90 days after your batch job is submitted. |
| Data freshness | Wait for a stage to complete to see its events in the UI. | View regularly updated data without waiting for the stage to complete. |
| Functionality | Basic UI based on open-source Spark. | Enhanced UI with ongoing improvements based on user feedback. |
| Cost | Ongoing cost for the PHS cluster. | No additional charge. |
Accessing the Spark UI
To gain deeper insights into your Spark batches and sessions — whether they’re still running or completed — simply navigate to the Batch Details or Session Details page in the Google Cloud console. You’ll find a “VIEW SPARK UI” link in the top right corner.
The new Spark UI provides the same powerful features as the open-source Spark History Server, giving you deep insights into your Spark job performance. Easily browse both running and completed applications, explore jobs, stages, and tasks, and analyze SQL queries for a comprehensive understanding of the execution of your application. Quickly identify bottlenecks and troubleshoot issues with detailed execution information. For even deeper analysis, the ‘Executors’ tab provides direct links to the relevant logs in Cloud Logging, allowing you to quickly investigate issues related to specific executors.
You can still use the “VIEW SPARK HISTORY SERVER” link to view the Persistent Spark History Server if you had already configured one.
A new “Investigate” tab in the Batch details screen gives you instant diagnostic highlights collected in a single place.
In the “Metrics highlights” section, the essential metrics are automatically displayed, giving you a clear picture of your batch job’s health. You can further create a custom dashboard if you need more metrics.
Below the metrics highlights, a “Job Logs” widget shows the logs filtered by errors, so you can instantly spot and address problems. If you would like to dig further into the logs, you can go to the Logs Explorer.
Proactive autotuning and assisted troubleshooting with Gemini (Preview)
Last but not least, Gemini in BigQuery can help reduce the complexity of optimizing hundreds of Spark properties in your batch job configurations when you submit a job. If the job fails or runs slowly, Gemini can save you the effort of wading through gigabytes of logs to troubleshoot it.
Optimize performance: Gemini can automatically fine-tune the Spark configurations of your Dataproc Serverless batch jobs for optimal performance and reliability.
Simplify troubleshooting: You can quickly diagnose and resolve issues with slow or failed jobs by clicking “Ask Gemini” for AI-powered analysis and guidance.
Sign up here for a free preview of the Gemini features and “Investigate” tab for Dataproc Serverless.
Climate change is the biggest challenge our society faces. As scientists, governments, and industry leaders gather in Baku, Azerbaijan for the 2024 United Nations Climate Change Conference, a.k.a. COP29, it’s incumbent upon all of us to find innovative solutions that can drive impact at a global scale.
The gravity of climate change requires solutions that go beyond incremental change. To find those solutions, we need the ability to make better decisions about how to approach climate mitigation and adaptation across every human activity — from transport, industry, and agriculture to communications, finance, and housing. This requires processing vast volumes of data generated by these industries. The combination of AI and cloud technologies offer the potential to unlock climate change solutions that can be both transformational and global in scale.
We already have a lot of examples that we can draw from.
Today, for example, Google Earth Engine is being used by the Forest Data Partnership, a collaboration for global monitoring of commodity-driven deforestation, to monitor every oil palm plantation around the globe, providing participating companies live early-warning signals for deforestation risks, and dramatically reducing the costs involved in forest monitoring. Similarly, NGIS is using Google Earth Engine to power TraceMark, helping businesses deliver traceability and transparency across global supply chains.
Another example is Global Fishing Watch, an international nonprofit co-founded by Google that is using geospatial analytics and AI to understand how human activity impacts the seas, global industries, climate, biodiversity and more. The datasets map global ocean infrastructure and vessels that don’t publicly broadcast their positions. This helps to advance policy conversations about offshore renewables development, provides insight into carbon dioxide emissions from maritime vessels, and enables marine protection.
It’s clear that AI can process large volumes of data, optimize complex systems, and drive the development of new business models. We see businesses harnessing the technology in the fight against climate change in four ways:
1. Measuring business performance
Businesses are using AI-powered insights to help monitor their progress toward sustainability targets, which ultimately contributes to building business resilience.
In today’s business landscape, this is of paramount importance as companies face growing demands for transparency and accountability regarding their environmental and social impact.
We are seeing cloud and AI being used to de-risk investments, improve transparency, and increase profitability through the use of large-scale datasets, machine learning, and generative AI. These technologies allow companies to analyze their ESG performance, gain insights into climate risks, and monitor supplier behaviors.
For example, Palo Alto Networks partnered with Watershed, a Google Cloud Ready – Sustainability Partner, to measure and track their carbon emissions across their entire business using Google Cloud. This partnership enabled them to gain a comprehensive understanding of their environmental impact and set actionable targets for reducing emissions.
Another example is HSBC, which developed a new credit ranking tool on Google Cloud that allows them to run multiple climate risk scenarios simultaneously. This tool empowers HSBC to make more informed investment decisions while considering the potential impact of climate change on their portfolio.
2. Optimizing operations and supply chains
Businesses are using AI to optimize their operations and supply chains for energy and resource efficiency, as well as to cut costs.
This is crucial for companies seeking to enhance their sustainability performance while simultaneously improving their bottom line. Through the use of AI and machine learning, cloud technologies empower organizations to optimize their existing operations, improve cost efficiency, and minimize waste.
For example, Geotab, another Google Cloud Ready – Sustainability partner, is managing 75 billion data records in BigQuery for 4 million commercial fleet vehicles every day to optimize vehicle routes, increase driver safety behaviors and accelerate the path to fleet electrification.
3. Identifying cleaner business models
As the world shifts toward more sustainable practices, businesses must adapt and identify new avenues for growth. Cloud and AI are helping businesses do just that, allowing organizations to reimagine their business models, explore new markets, and create innovative products and services that align with their sustainability goals.
Recykal, for instance, has partnered with Google Cloud to build Asia’s largest circular economy marketplace. By leveraging Google Cloud’s AI and machine learning capabilities, Recykal is revolutionizing waste management and promoting sustainable practices in Asia.
Another example is Einride, a company that is reimagining freight transport by using electric, self-driving vehicles and an AI-powered platform. Their innovative approach to logistics is disrupting the transportation industry and contributing to a more sustainable future.
More recently, Climate Engine and Robeco have started combining AI and geospatial technologies with their scientific expertise and investment knowledge to understand how publicly traded companies’ actions impact biodiversity. You can read their joint thought leadership paper here.
4. Building more sustainably
Finally, and very importantly, businesses want to ensure that the actual use of cloud and AI technologies doesn’t lead to increased climate impacts. From the get-go, developers need to take concrete steps towards reducing the carbon footprint and cost of their applications in the cloud.
This is why, through our Carbon Sense suite, we provide developers with the tools and resources they need to build and deploy applications in a way that minimizes their environmental impact, all while maintaining cost efficiency.
L’Oréal, for example, leverages Google Cloud’s Carbon Footprint tool to track the gross carbon emissions associated with their cloud usage. This allows L’Oréal to understand the environmental impact of their technology decisions and implement strategies to reduce their footprint.
Google also takes its own carbon footprint very seriously, and is pursuing an ambitious goal to achieve net-zero emissions across all of its operations and value chain, supported by a goal to run on 24/7 carbon-free energy on every grid where it operates by 2030.
Google Cloud is committed to helping organizations of all sizes achieve their sustainability goals. With cloud, data analytics, and AI, we’re delivering new ways to build resilience, reduce costs, and unlock sustainable growth, while also accelerating the impact of organizations’ sustainability initiatives through the smarter use of data. This is an opportunity to drive tangible business results and create a more sustainable future for all.
Crafting the perfect prompt for generative AI models can be an art in itself. The difference between a useful and a generic AI response can sometimes be a well-crafted prompt. But, getting there often requires time-consuming tweaking, iteration, and a learning curve. That’s why we’re thrilled to announce new updates to the AI-powered prompt writing tools in Vertex AI, designed to make prompting easier and more accessible for all developers.
We’re introducing two powerful features designed to streamline your prompt engineering workflow: Generate prompt and Refine prompt.
Imagine you need a prompt to summarize customer reviews about your latest product. Instead of crafting the prompt yourself, you can simply tell the Generate prompt feature your goal. It will then create a comprehensive prompt, including placeholders for the reviews, which you can easily populate with your own data later. Generate prompt takes the guesswork out of prompt engineering by:
Turning simple objectives into tailor-made, effective prompts. This way, you don’t need to agonize over phrasing and keywords.
Generating placeholders for context, like customer reviews, news articles, or code snippets. This allows you to quickly add your specific data and get immediate results.
Speeding up the prompt writing process. Focus on your core tasks, not on perfecting prompt syntax.
Refine prompt: Iterate and improve with AI-powered suggestions
Once you have a prompt, either crafted by Generate prompt or one you’ve written yourself, Refine prompt helps you modify it for optimal performance. Here’s how it works:
Provide feedback: After running your prompt, simply provide feedback on the response, the same way you would critique a writer.
Instant suggestions: Vertex AI generates a new, suggested prompt in one step, taking your feedback into account.
Iterate and improve: You can accept or reject the suggestion and continue iterating by running the refined prompt and providing further feedback.
Prompt refinement boosts the quality of the prompt while also saving significant time during prompt design. Quality typically improves because the prompt instructions are augmented in a way that Gemini can better understand.
Below are some sample prompts that were revised with Refine prompt:
| Original prompts | After using Prompt Refinement |
| --- | --- |
| Suggest engaging lesson plan ideas for art class | Suggest 3 engaging lesson plan ideas for a high school art class, each focusing on a different art form. Be concise and only include the most relevant information, such as the art form, target age group, and key activity. |
| Plan a schedule for a week with focus time and meeting time. Take in account that there are 2 teams with 6 hour delay | Create a detailed weekly schedule for a team with a 6-hour time difference. The schedule should include: specific time blocks for focus time and meetings; consideration of overlapping work hours to ensure effective communication and collaboration; a balance of individual work and team interactions; and suggestions for time zone conversion tools or strategies to facilitate scheduling. |
A powerful duo: Generate prompt meets Refine prompt
These two features work in tandem to help you craft the most effective prompt for your objective – irrespective of your skill level. Generate prompt gets you started quickly, while Refine prompt allows for iterative improvement in five steps:
Define your objective: Tell Generate prompt what you want to achieve.
Generate a prompt: Generate prompt creates a ready-to-use prompt, often with helpful placeholders for context.
Run the prompt and review the output: Execute the prompt with your chosen LLM in Vertex AI.
Refine with feedback: Use Refine prompt to provide feedback on the output and receive AI-powered suggestions for prompt improvement.
Iterate until ideal performance: Continue refining and rerunning your prompt until you achieve your desired results.
How to get started
Go ahead and try out AI-assisted prompt writing through our interactive critiquing workflow. Vertex AI’s easy-to-use UI for refining prompts can be tested without setting up a Google Cloud account through this link (to demo without a Google Cloud account, be sure you are logged out of your Google account in your web browser or use incognito mode). For those with an account, you’ll have the ability to save, manage, and fine-tune your prompts.
Generative AI presents both immense opportunities and challenges for the Department of Defense (DoD). The potential to enhance situational awareness, streamline tasks, and improve decision-making is significant. However, the DoD’s unique requirements, especially their stringent security standards for cloud services (IL5), necessitate carefully crafted AI solutions that balance innovation with security.
The DoD’s 2023 Data, Analytics, and Artificial Intelligence Adoption Strategy report emphasizes the need to “strengthen the organizational environment” for AI deployment. This underscores the importance of solutions that seamlessly integrate into existing infrastructure, prioritize data security, and enable responsible and intelligent use of AI.
Google Public Sector’s 4 AI pillars: A framework for DoD AI adoption
To meet the DoD’s unique challenges, Google AI for Public Sector has focused on 4 areas when designing solutions to help empower the DoD:
Adaptive: AI solutions must seamlessly integrate into the DoD’s existing complex and evolving technology ecosystem. Google prioritizes adaptable solutions that minimize disruption and enable rapid adoption, aligning with the DoD’s focus on agile innovation.
Secure: Protecting sensitive DoD data is paramount. Google’s AI solutions are engineered with robust security measures, including Zero Trust architecture and adherence to IL5 requirements, ensuring the confidentiality and integrity of critical information.
Intelligent: Google’s AI capabilities are designed to deliver actionable insights from vast and diverse datasets. By harnessing the power of machine learning and natural language processing, our solutions enable the DoD to make data-driven decisions with greater speed and accuracy.
Responsible: Google is committed to developing and deploying AI in a responsible and ethical manner. Our AI Principles guide our research, product development, and deployment decisions, ensuring that AI is used for good and avoids harmful applications.
Breaking down data silos and delivering insights with enterprise search
Google Cloud’s solution for enterprise search is a powerful tool designed to help organizations overcome the challenges of data fragmentation. It acts as a central hub, seamlessly connecting to diverse data sources across the department, including structured and unstructured data.
Intelligent Information Retrieval: Leveraging advanced AI and natural language processing, enterprise search delivers precise and contextually relevant answers to queries, even when dealing with unstructured data like documents, images, and reports.
Seamless Integration: Federated search combined with Retrieval Augmented Generation (RAG) provides relevant query responses without the need to move data or train a custom Large Language Model (LLM).
Enhanced Transparency and Trust: The solution provides links to source documents alongside AI-generated responses, allowing users to verify information and build confidence in the system.
Robust Security: With all services used in the solution submitted for IL5 accreditation, enterprise search incorporates industry-leading security measures, including Role-Based Access Control (RBAC) and Common Access Card (CAC) compatibility, to safeguard sensitive DoD data.
Future-Proof Flexibility: The solution supports a wide range of Large Language Models (LLMs), including Google’s Gemini family of models and Gemma, our family of lightweight, state-of-the-art open models. Google offers choice and adaptability while avoiding vendor lock-in, allowing the DoD to leverage the latest AI advancements without extensive redevelopment.
Google Cloud’s generative AI infused solution directly supports the DoD’s mission by consolidating data access, enhancing discoverability, and providing rapid, accurate insights, leading to improved decision-making and a strategic advantage.
Google Cloud is committed to supporting the DoD’s AI journey by providing solutions that are not only powerful and innovative, but also secure, responsible, and adaptable. By empowering the DoD to harness the full potential of its data, we are helping to enable more agile, informed, and effective service members. Learn more about how Google Public Sector’s AI solutions can empower your agency and visit Google AI for Public Sector for examples of how we are helping accelerate mission impact with AI.
Welcome to the first Cloud CISO Perspectives for November 2024. Today I’m joined by Andy Wen, Google Cloud’s senior director of product management for Google Workspace, to discuss a new Google survey into the high security costs of legacy tech.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
–Phil Venables, VP, TI Security & CISO, Google Cloud
Confronting the high security cost of legacy tech
By Phil Venables, VP, TI Security & CISO, Google Cloud, and Andy Wen, senior director, product management, Google Workspace
From a business perspective, it’s easy to understand why many organizations continue to rely on outdated technology. Replacing older systems can be expensive, but relying on them comes with hidden costs that can far outstrip the benefits.
Legacy technology can greatly increase the business and security risks that an organization faces, a serious concern given that the global average total cost of a security breach in 2024 was $4.88 million. Despite the availability of a plethora of more modern solutions, we’re still seeing too many organizations rely on defenses that were designed for the desktop era, according to a new Google Workspace global cybersecurity survey of more than 2,000 security and IT decision-makers.
Phil Venables, VP, TI Security & CISO, Google Cloud
The numbers paint a dire picture of the security impact of operating legacy systems:
71% said that legacy technology has left organizations less prepared for the future.
63% believe that their organization’s technology landscape is less secure than it was in the past.
More than 66% told us that their organizations are investing more time and money than ever in securing their environments — but still experience costly security incidents.
81% of organizations experience at least one security incident per year.
Organizations experience eight security incidents on average per year.
We know many security leaders have convinced the business to invest in more security tools, because the survey also found that 61% of organizations are using more security tools than they did two years ago. Yet while more than two-thirds of organizations are investing more time and money in securing their environments, many are still experiencing expensive security incidents.
Andy Wen, senior director, product management, Google Workspace
Environments with more security tools often attempt to compensate for legacy platforms that continue to be vulnerable to security incidents. Meanwhile, 81% of security leaders believe cloud-first platforms are safer than legacy platforms.
Organizations with 10 or more security tools reported an average of 14 security incidents per year, with 34% of them spending more than $250,000 on incidents per year.
Organizations with fewer than 10 tools reported an average of six incidents per year, with 19% of them spending more than $250,000 on incidents per year.
“The solution is not more security tools, but more secure tools,” said CISA Director Jen Easterly at her mWISE Conference keynote in September.
We have also made this point often. To be truly resilient in today’s security landscape, organizations must consider an IT overhaul and rethink their strategy toward solutions with modern, secure-by-design architectures that nullify classes of vulnerabilities and attack vectors.
It may be daunting to take on an overhaul, especially for large organizations, but security leaders need to look at investing in a cloud-first solution to be resilient. The change can be made in small steps to minimize disruption and evaluate return on investment, such as using Chrome Enterprise for secure browsing and providing Google Workspace to specific teams.
The bottom line is that adopting modern technology can help eliminate entire classes of threats, as well as improve business outcomes.
We’d like to highlight three customer interactions that underscore organizational value gained by modernizing. Organizations need a centralized solution that can evolve, especially as attacks continue to increase in quantity and sophistication. We recently did some work with the cybersecurity company Trellix, which did a complete overhaul of its security infrastructure.
Trellix was running into issues where its old software stack felt stagnant and didn’t connect into new things they were doing or building. These older solutions made it hard to control where data was sitting and who was accessing it. They’ve since fully migrated to Google Workspace, adopted the Zero Trust capabilities we’ve built in, and augmented them with their own security solutions, including a security operations console, email security, and endpoint protection.
Employees can now chat, email, view files, edit documents, and join meetings from their device of choice without worrying about security and access permissions. All these capabilities live within the same platform, making it easier and simpler for security admins to oversee data safety with features like endpoint management and Zero Trust access controls in Workspace — without slowing down employee collaboration.
Similarly, the city of Dearborn, Mich., replaced its legacy email solution. After making the switch to Gmail, users noticed a meaningful decrease in spam, phishing, and malware, which helped reduce their cybersecurity risks.
Humana’s dilemma was driven by a legacy suite of desktop-based office applications that its IT team needed to spend 70% of its time maintaining. Humana’s IT team rolled out Google Workspace to 13,000 Humana employees in the field and in the office in four months, migrating 22 terabytes of data. Workspace’s built-in security features and browser-based apps saved the team time and reduced costs, and also led to a steady reduction in help desk tickets during and after rollout.
For more leadership guidance from Google Cloud experts, please see our CISO Insights hub.
In case you missed it
Here are the latest updates, products, services, and resources from our security teams so far this month:
Join our upcoming Security Talks to unlock the Defender’s Advantage: Our next Security Talks is coming on Nov. 19, and will focus on the Defender’s Advantage. This free, day-long virtual event is packed with insights and strategies to help you proactively secure your cloud environment. Register today.
Cyber risk top 5: What every board should know: Boards should learn about security and digital transformation to better manage their organizations. Here are the five top risks they need to know — and prepare for. Read more.
Mandatory MFA is coming to Google Cloud. Here’s what you need to know: To help keep our customers secure, starting in 2025 we will require them to use MFA when accessing Google Cloud. Read more.
Google Cloud expands CVE program: As part of our commitment to security and transparency on vulnerabilities found in our products and services, we now will issue CVEs for critical Google Cloud vulnerabilities. Read more.
Our 2025 Forecast report: Get ready for the next year in cybersecurity with our 2025 Forecast report, now available. Read more.
From AI to Zero Trust, Google Cloud Security delivers comprehensive public sector solutions: Google Cloud Security is committed to helping government agencies and organizations strengthen their defenses, and we recently made several announcements at the Google Public Sector Summit. Read more.
FedRAMP High development in the cloud: Code with Cloud Workstations: A Forrester Total Economic Impact™ (TEI) study found that Google Cloud Workstations enhance consistency, agility, and security while reducing costs and risks. Read more.
Please visit the Google Cloud blog for more security stories published this month.
(In)tuned to take-overs: Abusing Intune permissions for lateral movement and privilege escalation: Learn how the Mandiant Red Team was able to move laterally from a customer’s on-premises environment to their Microsoft Entra ID tenant, and obtained privileges to compromise existing Entra ID service principals installed in the tenant. Also learn how to defend against it. Read more.
Flare-On 11 Challenge solutions: The latest Flare-On challenge is over, and it proved a doozy: Only 275 players out of 5,300 completed all 10 stages. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Google Cloud Security and Mandiant podcasts
Gen AI security: Unseen attack surfaces and pentesting lessons: What’s the current state of gen AI security? From common mistakes to novel attack surfaces to unique challenges, podcast hosts Anton Chuvakin and Tim Peacock discuss with Ante Gojsalic, co-founder and CTO, SplxAI, today’s gen AI security concerns and their potential impact on tomorrow’s tech. Listen here.
Get the Google Security Operations perspective on SIEM and security data lakes: What’s a disassembled SIEM, and why you should care: Travis Lanham, uber tech lead for Security Operations Engineering, Google Cloud, goes SIEM-deep with Anton and Tim. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in two weeks with more security-related updates from Google Cloud.
It’s an exciting time in the world of data and analytics, with more organizations harnessing the power of data and AI to help transform and grow their businesses. But in a threat landscape with increasingly sophisticated attacks around every corner, ensuring the security and integrity of that data is critical.
Google Cloud offers a comprehensive suite of tools to help protect your data while unlocking its potential. In our new ebook, Building a Secure Data Platform with Google Cloud, we dig into the many data security capabilities within Google Cloud and share how they can help support data-based innovation strategies.
Take a peek inside the ebook, then download the full version here.
Unlock data platform-level security with BigQuery

BigQuery, Google Cloud’s unified data platform, offers a robust set of integrated security features to help you safeguard your data. The platform automatically encrypts all data at rest, which provides a foundational layer of defense against unauthorized access. For data sharing, BigQuery Analytics Hub and data clean rooms allow you to efficiently, securely, and easily share data across organizational boundaries. The platform also includes Dataplex, which enables you to implement comprehensive policies to govern how data is accessed, used, and shared within your organization.
Shield assets with granular access controls and guardrails

With Cloud Identity and Access Management (IAM), you can manage access to critical data across BigQuery, Cloud Run, Cloud Run functions, and Google Kubernetes Engine (GKE) resources. Organization restrictions place further limits on which users can access resources in your organization. Combined with Cloud IAM, this feature supports your organization policies and helps you maintain a secure perimeter around your Google Cloud environment.
Create security boundaries with perimeter and data protection

Google Cloud offers several ways to reinforce your perimeter. VPC Service Controls help prevent data exfiltration from cloud resources, providing precise control over access and movement by external entities or by insiders.
Meanwhile, with Sensitive Data Protection, you can identify and classify your sensitive data within BigQuery, which can help you implement targeted protection measures such as masking, tokenization, and redaction. You can also gain even more granular control over your encryption keys with Customer-managed encryption keys for BigQuery.
Strengthen data security posture with automated monitoring and compliance

Establishing robust security controls for your data is essential for improving your security posture, but it’s just as important to monitor your environment for threats and maintain compliance with industry standards. Security Command Center gives you a comprehensive view of your security posture with direct visibility into your BigQuery datasets. With Cloud Logging, you can collect, store, and analyze logs to gain insights into system activities, detect anomalies, and respond to security incidents. Assured Workloads further simplifies compliance, providing peace of mind that you’ve established strong baseline controls and compliant configurations.
All-in-one data security with integrated solutions from Google Cloud
Building a secure data ecosystem requires a multi-layered approach. With comprehensive security features from Google Cloud, you can safeguard your sensitive data, comply with industry regulations, and discover the full potential of your data. Dive deeper into these tools, solutions, and strategies in the full ebook — Building a Secure Data Platform with Google Cloud — to ensure the safety and integrity of your organization’s most valuable asset. Download the full version here.
As open-source large language models (LLMs) become increasingly popular, developers are looking for better ways to access new models and deploy them on Cloud Run with GPUs. That’s why Cloud Run now offers fully managed NVIDIA GPUs, removing the complexity of driver installations and library configurations. This means you’ll benefit from the same on-demand availability and effortless scalability that you love with Cloud Run’s CPU and memory, with the added power of NVIDIA GPUs. When your application is idle, your GPU-equipped instances automatically scale down to zero, optimizing your costs.
In this blog post, we’ll guide you through deploying the Meta Llama 3.2 1B Instruction model on Cloud Run. We’ll also share best practices to streamline your development process using local model testing with Text Generation Inference (TGI) Docker image, making troubleshooting easy and boosting your productivity.
Why Cloud Run with GPU?
There are four critical reasons developers benefit from deploying open models on Cloud Run with GPU:
Fully managed: No need to worry about drivers, libraries, or infrastructure.
On-demand scaling: Scale up or down automatically based on demand.
Cost effective: Only pay for what you use, with automatic scaling down to zero when idle.
Performance: NVIDIA GPU-optimized for Meta Llama 3.2.
Initial Setup
First, create a Hugging Face token.
Second, check that your Hugging Face token has permission to access and download the Llama 3.2 model weights here. Keep your token handy for the next step.
Third, use Google Cloud’s Secret Manager to store your Hugging Face token securely. In this example, we will be using Google user credentials. You may need to authenticate the gcloud CLI, set a default project ID, enable the necessary APIs, and grant access to Secret Manager and Cloud Storage.
# Authenticate the gcloud CLI
gcloud auth login

# Set the default project
gcloud config set project <your_project_id>

# Create a new secret; remember to update <your_huggingface_token>
gcloud secrets create HF_TOKEN --replication-policy="automatic"
echo -n <your_huggingface_token> | gcloud secrets versions add HF_TOKEN --data-file=-

# Retrieve the key
HF_TOKEN=$(gcloud secrets versions access latest --secret="HF_TOKEN")
Local debugging
Install the huggingface_hub Python package (which provides the huggingface-cli tool) in your virtual environment.
Run huggingface-cli login to set up a Hugging Face credential.
Use the TGI Docker image to test your model locally. This allows you to iterate and debug your model locally before deploying it to Cloud Run.
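If you want a quick way to exercise the locally running TGI container before deploying, a small script along these lines can help. This is a minimal sketch: it assumes the container is already running and mapped to localhost:8080 and that the default TGI /generate endpoint is available; adjust the URL and parameters to match your setup.

import requests

# Hypothetical local endpoint; adjust the port to match your `docker run` mapping.
TGI_URL = "http://localhost:8080/generate"

payload = {
    "inputs": "What is Cloud Run?",
    "parameters": {"max_new_tokens": 128, "temperature": 0.7},
}

# Send a single generation request and print the model's reply.
response = requests.post(TGI_URL, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["generated_text"])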
Now, create a new Cloud Run service with your deployment script. Remember to update BUCKET_NAME, and you may also need to update the network and subnet names.
New solutions, old problems. Artificial intelligence (AI) and large language models (LLMs) are here to signal a new day in the cybersecurity world, but what does that mean for us—the attackers and defenders—and our battle to improve security through all the noise?
Data is everywhere. For most organizations, the access to security data is no longer the primary issue. Rather, it is the vast quantities of it, the noise in it, and the disjointed and spread-out nature of it. Understanding and making sense of it—THAT is the real challenge.
When we conduct adversarial emulation (red team) engagements, making sense of all the network, user, and domain data available to us is how we find the path forward. From a defensive perspective, efficiently finding the sharpest and most dangerous needles in the haystack—for example, easily accessible credentials on fileshares—is how we prioritize, improve, and defend.
How do you make sense of this vast amount of structured and unstructured data, and give yourself the advantage?
Data permeates the modern organization. This data can be challenging to parse, process, and understand from a security implication perspective, but AI might just change all that.
This blog post will focus on a number of case studies where we obtained data during our complex adversarial emulation engagements with our global clients, and how we innovated using AI and LLM systems to process this into structured data that could be used to better defend organizations. We will showcase the lessons learned and key takeaways for all organizations and highlight other problems that can be solved with this approach for both red and blue teams.
Approach
Data parsing and understanding is one of the biggest early benefits of AI. We have seen many situations where AI can help process data at a fast rate. Throughout this post, we use an LLM to process unstructured data, meaning that the data did not have a structure or format that we knew about before parsing the data.
If you want to try these examples out yourself, please make sure you use either a local model, or you have permission to send the data to an external service.
Getting Structured Data Out of an LLM
Step one is to get the data into a format we can use. If you have ever used an LLM, you will have noticed that it outputs a story or prose text, especially if you use chat-based versions. For a lot of use cases this is fine; however, we want to analyze the data and get structured output. Thus, the first problem we have to solve is getting the LLM to output the data in a format we can specify. The simple method is to ask the LLM to output the data in a machine-readable format like JSON, XML, or CSV. However, you will quickly notice that you have to be quite specific with the data format, and the LLM can easily output data in another format, ignoring your instructions.
Luckily for us, other people have encountered this problem and have solved it with something called Guardrails. One of the projects we have found is called guardrails-ai. It is a Python library that allows you to create guardrails—specific requirements—for a model based on Pydantic.
To illustrate, take a simple Python class from the documentation to validate a pet from the output of the LLM:
from pydantic import BaseModel, Field

class Pet(BaseModel):
    pet_type: str = Field(description="Species of pet")
    name: str = Field(description="a unique pet name")
You can use the next code from the Guardrails documentation to process the output of the LLM into a structured object:
from guardrails import Guard
import openai

prompt = """
What kind of pet should I get and what should I name it?
${gr.complete_json_suffix_v2}
"""

guard = Guard.from_pydantic(output_class=Pet, prompt=prompt)

raw_output, validated_output, *rest = guard(
    llm_api=openai.completions.create,
    engine="gpt-3.5-turbo-instruct"
)

print(validated_output)
If we look at what this library generates under the hood for this prompt, we see that it adds a structured-output section with instructions for the LLM to output data in a specific way. This streamlines the way you can get structured data from an LLM.
Figure 1: The generated prompt from the Pydantic model
For the next use case, we will show the Pydantic models we’ve created to process the output.
Red Team Use Cases
The next sections contain some use cases where we can use an LLM to get structured data out of data obtained. The use cases are divided into three categories of the attack lifecycle:
Initial Reconnaissance
Escalate Privileges
Internal Reconnaissance
Figure 2: Attack lifecycle
Initial Reconnaissance
Open Source Intelligence (OSINT) is an important part of red teaming. It includes gathering data about the target organization from news articles, social media, and corporate reports.
This information can then be used in other red team phases, such as phishing. For defenders, it helps them understand which parts of their organization are exposed to the internet, anticipating a possible future attack. In the next use case, we talk about processing social media information to classify roles and extract useful information.
Use Case 1: Social Media Job Functions Information
During OSINT, we often try to get information from employees about their function in their company. This helps with performing phishing attacks, as we do not want to target IT professionals, especially those that work in cybersecurity.
Social media sites allow their users to write about their job titles in a free format. This means that the information is unstructured and can be written in any language and any format.
We can try to extract the information from the title with simple matches; however, because the users can fill in anything and in any language, this problem can be better solved with an LLM.
Data Model
First, we create a Pydantic model for the Guardrail:
class RoleOutput(BaseModel):
    role: str = Field(description="Role being analyzed")
    it: bool = Field(description="The role is related to IT")
    cybersecurity: bool = Field(description="The role is related to CyberSecurity")
    experience_level: str = Field(
        description="Experience level of the role.",
    )
This model has two Boolean fields that indicate whether the role is IT or cybersecurity related. Additionally, we would like to know the experience level of the role.
Prompt
Next, let’s create a prompt to instruct the LLM to extract the requested information from the role. This prompt is quite simple and just asks the LLM to fill in the data.
Given the following role, answer the following questions.
If the answer doesn't exist in the role, enter ``.
${role}
${gr.complete_xml_suffix_v2}
The two last lines are placeholders used by guardrails-ai.
Results
To test the model, we scraped the titles that employees use on social media; the dataset contained 235 entries. For testing, we used the gemini-1.0-pro model.
Gemini managed to parse 232 entries. The results are shown in Table 1.
| | Not IT | IT | Cybersecurity |
| --- | --- | --- | --- |
| Gemini | 183 | 49 | 5 |
| Manual evaluation by a red team operator | 185 | 47 | 5 |
| False positive | 1 | 3 | 0 |

Table 1: Results of Gemini parsing 232 job title entries
In the end, Gemini processed the roles quite on par with a human. Most of the false positives were questionable because it is not very clear if the role was actually IT related. The experience level did not perform well, as the model deemed the experience level as “unknown” or “none” for most of the entries. To resolve this issue, the field was changed so that the experience level should be a number from 1 to 10. After running the analysis again, this yielded better results for the experience level. The lowest experience levels (1–4) contained function titles like “intern,” “specialist,” or “assistant.” This usually indicated that the person had been employed at that role for a shorter period of time. The updated data model is shown as follows:
class RoleOutput(BaseModel):
    role: str = Field(description="Role being analyzed")
    it: bool = Field(description="The role is related to IT")
    cybersecurity: bool = Field(description="The role is related to CyberSecurity")
    experience_level: int = Field(
        description="Estimate of the experience level of the role on a scale of 1-10. Where 1 is low experience and 10 is high.",
    )
This approach helped us to sort through a large dataset of phishing targets by identifying employees that did not have IT and cybersecurity roles, and sorting them by experience level. This can speed up target selection for large organizations and may allow us to better emulate attackers by changing the prompts or selection criteria. To defend against this, data analysis is more difficult. In theory, you can instruct all your employees to include “Cybersecurity” in their role, but that does not scale well or solve the underlying phishing problem. The best approach with regards to phishing is, in our experience, to invest into phishing resistant multifactor authentication (MFA) and application allowlisting. If applied well, these solutions can mitigate phishing attacks as an initial access vector.
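As an illustration of that sorting step, here is a minimal sketch that filters and ranks parsed results. It assumes the RoleOutput model defined above; the people and role entries are made up for demonstration.

# Rank potential phishing targets from parsed role data.
# RoleOutput is the Pydantic model defined above; these entries are illustrative only.
example_roles = [
    ("alice", RoleOutput(role="Security Engineer", it=True, cybersecurity=True, experience_level=8)),
    ("bob", RoleOutput(role="Accounting Intern", it=False, cybersecurity=False, experience_level=1)),
    ("carol", RoleOutput(role="HR Specialist", it=False, cybersecurity=False, experience_level=3)),
]

# Drop anyone whose role looks IT- or cybersecurity-related, then prefer
# lower experience levels (1 = intern/assistant-type roles).
candidates = [(name, r) for name, r in example_roles if not r.it and not r.cybersecurity]
for name, r in sorted(candidates, key=lambda pair: pair[1].experience_level):
    print(f"{name}: {r.role} (experience {r.experience_level})")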
Escalate Privileges
Once attackers establish a foothold into an organization, one of their first acts is often to improve their level of access or control through privilege escalation. There are quite a few methods that can be used for this. It comes in a local system-based variety as well as wider domain-wide types, with some based on exploits or misconfigurations, and others based on finding sensitive information when searching through files.
Our focus will be on the final aspect, which aligns with our challenge of identifying the desired information within the vast amount of data, like finding a needle in a haystack.
Use Case 2: Credentials in Files
After gaining initial access to the target network, one of the more common enumeration methods employed by attackers is to perform share enumeration and try to locate interesting files. There are quite a few tools that can do this, such as Snaffler.
After you identify files that potentially contain credentials, you can go through them manually to find useful ones. However, if you do this in a large organization, there is a chance that you will have hundreds to thousands of hits. In that case, there are some tools that can help with finding and classifying credentials like TruffleHog and Nosey Parker. Additionally, the Python library detect-secrets can help with this task.
Most of these tools look for common patterns or file types that they understand. To cover unknown file types or credentials in emails or other formats, it might instead be valuable to use an LLM to analyze the files to find any unknown or unrecognized formats.
Technically, we can just run all tools and use a linear regression model to combine the results into one. An anonymized example of a file with a password that we encountered during our tests is shown as follows:
@Echo Off
Net Use /Del * /Yes
Set /p Path=<"path.txt"
Net Use %Path% Welcome01@ /User:CHAOS.LOCAL\WorkstationAdmin
If Not Exist "C:\Data" MKDIR "C:\Data"
Copy %Path%\. C:\Data
Timeout 02
Data Model
We used the following Python classes to instruct Gemini to retrieve credentials with an optional domain. One file can contain multiple credentials, so we use a list of credentials to instruct Gemini to optionally retrieve multiple credentials from one file.
class Credential(BaseModel):
    password: str = Field(description="Potential password of an account")
    username: str = Field(description="Potential username of an account")
    domain: Optional[str] = Field(
        description="Optional domain of an account", default=""
    )

class ListOfCredentials(BaseModel):
    credentials: list[Credential] = []
Prompt
In the prompt, we describe what we want Gemini to look for and once again request JSON output:
Given the following file, check if there are credentials in the file.
Only include results if there is at least one username and password.
If the domain doesn't exist in the file, enter `` as a default value.
${file}
${gr.complete_xml_suffix_v2}
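As a rough illustration of how the prompt and the Pydantic models above could be wired together, here is a minimal sketch. It assumes the google-generativeai Python SDK, a hypothetical input file path, and plain JSON output parsed with Pydantic; the guardrails-style ${gr.complete_xml_suffix_v2} suffix from the original workflow is omitted here.

import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-pro")  # model name is illustrative

# Hypothetical candidate file identified during share enumeration.
file_contents = open("suspect_file.bat", encoding="utf-8", errors="ignore").read()

prompt = (
    "Given the following file, check if there are credentials in the file.\n"
    "Only include results if there is at least one username and password.\n"
    "If the domain doesn't exist in the file, enter `` as a default value.\n\n"
    f"{file_contents}\n\n"
    "Return JSON matching this schema: "
    f"{ListOfCredentials.model_json_schema()}"
)

response = model.generate_content(
    prompt,
    generation_config={"response_mime_type": "application/json"},
)

# Validate the response against the Pydantic model defined above.
found = ListOfCredentials.model_validate_json(response.text)
for cred in found.credentials:
    print(cred.domain, cred.username, cred.password)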
Results
We tested on 600 files, of which 304 contain credentials and 296 do not. Testing was done with the gemini-1.5 model, and each file took about five seconds to process.
To compare results with other tools, we also tested Nosey Parker and TruffleHog. Both are built to find credentials in files, including repositories, in a structured way; their typical use case is known file formats rather than arbitrarily structured files.
The results are summarized in Table 2.
Tool
True Negative
False Positive
False Negative
True Positive
Nosey Parker
284 (47%)
12 (2%)
136 (23%)
168 (28%)
TruffleHog
294 (49%)
2 (<1%)
180 (30%)
124 (21%)
Gemini
278 (46%)
18 (3%)
23 (4%)
281 (47%)
Table 2: Results of testing for credentials in files, where 304 contain them and 296 do not
In this context, the definitions of true negative, false positive, false negative, and true positive are as follows:
True Negative: A file does not contain any credentials, and the tool correctly indicates that there are no credentials.
False Positive: The tool incorrectly indicates that a file contains credentials when it does not.
False Negative: The tool incorrectly indicates that a file does not contain any credentials when it does.
True Positive: The tool correctly indicates that a file contains credentials.
In conclusion, Gemini finds the most files with credentials, at the cost of a slightly higher false positive rate. TruffleHog has the lowest false positive rate, but also finds the fewest true positives. This is expected: a higher true positive rate usually comes with a higher false positive rate. The current dataset has an almost equal number of files with and without credentials; in real-world scenarios this ratio can differ wildly, which means the false positive rate remains important even though the percentages here are quite close.
To optimize this approach, you can use all three tools, combine the output signals to a single signal, and then sort the potential files based on this combined signal.
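One possible way to combine the signals, sketched below under the assumption that each tool's verdict has been reduced to a 0/1 feature per file and that a small, manually verified sample is available for fitting (all values here are placeholders):

import numpy as np
from sklearn.linear_model import LinearRegression

# Each row is one file: [nosey_parker_hit, trufflehog_hit, gemini_hit].
# X_labeled / y_labeled are a small manually verified sample (placeholder data).
X_labeled = np.array([[1, 1, 1], [0, 0, 1], [0, 0, 0], [1, 0, 1], [0, 1, 0]])
y_labeled = np.array([1, 1, 0, 1, 0])

reg = LinearRegression().fit(X_labeled, y_labeled)

# Score every file found during share enumeration and review the highest scores first.
X_all = np.array([[0, 0, 1], [1, 1, 1], [0, 0, 0]])  # placeholder tool outputs
scores = reg.predict(X_all)
review_order = np.argsort(scores)[::-1]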
Defenders can, and should, use the same techniques previously described to enumerate the internal file shares and remove or limit access to files that contain credentials. Make sure to check what file shares each server and workstation exposes to the network, because in some cases file shares are exposed accidentally or were forgotten about.
Internal Reconnaissance
When attackers have gained a better position in the network, the next step in their playbooks is understanding the domain in which they have landed so they can construct a path to their ultimate goal. This could be full domain control or access to specific systems or users, depending on the threat actor’s mission. From a red team perspective, we need to be able to emulate this. From a defender’s perspective, we need to find these paths before the attackers exploit them.
The main tool that red teamers use to analyze Active Directory is BloodHound, which uses a graph database to find paths in the Active Directory. BloodHound is executed in two steps. First, an ingester retrieves the data from the target Active Directory. Second, this data is ingested and analyzed by BloodHound to find attack paths.
Some tools that can gather data to be used in BloodHound are:
SharpHound
BloodHound.py
RustHound
ADExplorer
BOFHound
SOAPHound
These tools gather data from the Active Directory and other systems and output it in a format that BloodHound can read. In theory, if we have all the information about the network in the graph, then we can just query the graph to figure out how to achieve our objective.
To improve the data in BloodHound, we have thought of additional use cases. Use Case 3 is about finding high-value systems. Discovering more hidden edges in BloodHound is part of Use Case 4 and Use Case 5.
Use Case 3: High-Value Target Detection in Active Directory
By default, BloodHound deems some groups and computers as high value. One of the main activities in internal reconnaissance is figuring out which systems in the client’s network are high-value targets. Some examples of systems that we are interested in, and that can lead to domain compromise, are:
Backup systems
SCCM
Certificate services
Exchange
WSUS systems
How a server's function is indicated varies widely and depends on how the IT administrators have configured their domain. Several Active Directory fields may contain data, in various forms, that indicates what a system is used for. This is a prime example of unstructured data that an LLM might be able to analyze.
The following fields in the Active Directory might contain the relevant information:
Name
Samaccountname
Description
Distinguishedname
SPNs
Data Model
In the end, we would like to have a list of names of the systems the LLM has deemed high value. During development, we noticed that LLM results improved dramatically if you asked it to specify a reason. Thus, our Pydantic model looks like this:
class HighValueSystem(BaseModel):
    name: str = Field(description="Name of this system")
    reason: str = Field(description="Reason why this system is high value", default="")

class HighValueResults(BaseModel):
    systems: list[HighValueSystem] = Field(description="high value systems", default=[])
Prompt
In the prompt, we give some examples of what kind of systems we are looking for:
Given the data, identify which systems are high value targets,
look for: sccm servers, jump systems, certificate systems, backup
systems and other valuable systems. Use the first (name) field to
identify the systems.
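Because all of the computer objects are sent to the model in a single request, the relevant Active Directory fields have to be serialized into the prompt. A minimal sketch, assuming the BloodHound computer data is available as a list of dicts with the fields listed earlier (the field names and helper are illustrative); the response can then be validated against HighValueResults in the same way as the credentials example:

import json

def build_high_value_prompt(computers: list[dict]) -> str:
    # Keep only the fields that tend to describe what a system is used for.
    rows = [
        {
            "name": c.get("name"),
            "samaccountname": c.get("samaccountname"),
            "description": c.get("description"),
            "distinguishedname": c.get("distinguishedname"),
            "serviceprincipalnames": c.get("serviceprincipalnames", []),
        }
        for c in computers
    ]
    return (
        "Given the data, identify which systems are high value targets, "
        "look for: sccm servers, jump systems, certificate systems, backup "
        "systems and other valuable systems. Use the first (name) field to "
        "identify the systems.\n\n" + json.dumps(rows, indent=2)
    )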
Results
We tested this prompt on a dataset of 400 systems and executed it five times. All systems were sent to the model in a single query; to accommodate this, we used the gemini-1.5 model because of its large context window. Here are some examples of reasons Gemini provided, and what we think each reason was based on:
Domain controller: Looks like this was based on the “OU=Domain Controllers” distinguishedname field of BloodHound
Jumpbox: Based on the “OU=Jumpboxes,OU=Bastion Servers” distinguishedname
Lansweeper: Based on the description field of the computer
Backup Server: Based on “OU=Backup Servers” distinguishedname
Some of the high-value targets are valid but already known, like domain controllers. Others are good finds, like the jumpbox and backup servers. This method can also handle system names in other languages and more verbose system descriptions when determining which systems may be high value. Additionally, the query can be adapted to a specific client environment, for example:
Given the data, identify which systems are related to
SWIFT. Use the first (name) field to identify the systems.
In this case, the LLM will look for SWIFT servers and may save you time searching for them manually. This approach can work even better when you combine the Active Directory data with internal documentation, which can yield results even if the Active Directory itself contains little information about what a system is used for.
For defenders, there are some ways to deal with this situation:
Limit the amount of information stored in the Active Directory, and keep system descriptions in your documentation instead
Limit the amount of information a regular user can retrieve from the Active Directory
Monitor LDAP queries to detect when large amounts of data are being retrieved
Use Case 4: User Clustering
After gaining an initial strong position and understanding the systems in the network, attackers often need to find the right users to compromise to gain further privileges in the domain. For defenders, legacy user accounts and administrators with too many rights are a common security issue.
Administrators often have multiple user accounts: one for day-to-day operations like reading email on their workstation, and one or more administrator accounts. This separation is meant to make it harder for attackers to compromise the administrator account.
There are some common implementation flaws that can make it possible to bypass this separation. Most of these methods require the attacker to cluster accounts together to see which ones belong to the same employee. In many cases, this can be done by inspecting the Active Directory objects and searching for patterns in the display name, description, or other fields. To automate this, we tried to find these patterns with Gemini.
Data Model
For this use case, we would like Gemini to return the names of the accounts it clusters together. During initial testing the results were quite random, but after adding a "reason" field the results improved dramatically. We therefore used the following Pydantic model:
class User(BaseModel):
    accounts: list[Account] = Field(
        description="accounts that probably belongs to this user", default=[]
    )
    reason: str = Field(
        description="Reason why these accounts belong to this user", default=""
    )

class UserAccountResults(BaseModel):
    users: list[User] = Field(description="users with multiple accounts", default=[])
Prompt
In the prompt, we describe how we want the accounts to be clustered:
Given the data, cluster the accounts that belong to a single person
by checking for similarities in the name, displayname and sam.
Only include results that are likely to be the same user. Only include
results when there is a user with multiple accounts. It is possible
that a user has more than two accounts. Please specify a reason
why those accounts belong to the same user. Use the first (name)
field to identify the accounts.
Results
The test dataset had about 900 users. We manually determined that some users have two to four accounts with various permissions. Some of these accounts had the same pattern like “user@test.local” and “adm-user@test.local.” However, other accounts had patterns where the admin account was based on the first couple of letters. For example, their main account had the pattern matthijs.gielen@test.local, and the admin account was named: adm-magi@test.local. To keep track of those accounts, the description of the admin account contained some text similar to “admin account of Matthijs Gielen.”
With this prompt, Gemini clustered 50 groups of accounts in our dataset. After manual verification, some of the results were discarded because the cluster contained only one account, leaving 43 correct clusters. Manual analysis found the same correlations, but where Gemini produced this information in a couple of minutes, it took quite a bit longer to analyze and correlate all the accounts by hand. This information was used in preparation for further attacks, as shown in the next use case.
Use Case 5: Correlation Between Users and Their Machines
Knowing which users to target or defend is often not enough. We also need to find them within the network in order to compromise them. Domain administrators are (usually) physical people; they need somewhere to type in their commands and perform administrative actions. This means that we need to correlate which domain administrator is working from which workstation. This is called session information, and BloodHound uses this information in an edge called “HasSession.”
In the past, it was possible to get all session information with a regular user during red teaming.
Using the technique in Use Case 4, we can correlate the different user accounts that one employee may have. The next step is to figure out which workstation belongs to that employee. Then we can target that workstation, and from there, hopefully recover the passwords of their administrator accounts.
In this case, employees have corporate laptops, and the company needs to keep track of which laptop belongs to which employee. This information is often stored in one of the fields of the computer object in the Active Directory. However, administrators record it in many different ways, and parsing that unstructured data is another task where Gemini can help.
Data Model
This model is quite simple: we just want to correlate machines to their users, and have Gemini give a reason why, which improves the quality of the output. Because we send all users and all computers at once, we need a list of results.
class UserComputerCorrelation(BaseModel):
    user: str = Field(description="name of the user")
    computer: str = Field(description="name of the computer")
    reason: str = Field(
        description="Reason why these accounts belong to this user", default=""
    )

class CorrelationResults(BaseModel):
    results: list[UserComputerCorrelation] = Field(
        description="users and computers that correlate", default=[]
    )
Prompt
In the prompt, we describe the correlation we are looking for:
Given the two data sets, find the computer that correlates
to a user by checking for similarities in the name, displayname
and sam. Only include results that are likely to correspond.
Please specify a reason why that user and computer correlates.
Use the first (name) field to identify the users and computers.
Results
The dataset contains around 900 users and 400 computers. During the engagement, we determined that the administrators correlated users and their machines via the description field of the computer, which closely matched the display name of the user. Gemini picked up this connection and correctly correlated around 120 users to their respective laptops (Figure 3).
Figure 3: Connections between user and laptop as correlated by Gemini
Gemini helped us select an appropriate workstation, enabling lateral movement to it and recovery of an administrator's password, getting us closer to our goal.
To defend against these threats, it can be valuable to run tools like BloodHound in the network. As discussed, BloodHound might not find all the “hidden” edges in your network, but you can add these yourself to the graph. This will allow you to find more Active Directory-based attack paths that are possible in your network and mitigate these before an attacker has an opportunity to exploit those attack paths.
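For example, session information recovered through the correlation above can be written back into the BloodHound graph so that it shows up in path queries. A minimal sketch, assuming a local BloodHound neo4j instance, illustrative credentials, and node names that follow BloodHound's usual naming conventions:

from neo4j import GraphDatabase

# Illustrative connection details; adjust to your BloodHound neo4j instance.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "bloodhound"))

# Hypothetical (user, computer) pairs produced by the correlation in Use Case 5.
correlations = [
    ("ADM-MAGI@TEST.LOCAL", "WS0001.TEST.LOCAL"),
]

# In BloodHound, session edges run from the computer to the user.
query = """
MATCH (c:Computer {name: $computer}), (u:User {name: $user})
MERGE (c)-[:HasSession]->(u)
"""

with driver.session() as session:
    for user, computer in correlations:
        session.run(query, user=user, computer=computer)
driver.close()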
Conclusion
In this blog post, we looked at processing red team data using LLMs to aid in adversarial emulation or improving defenses. These use cases were related to processing human-generated, unstructured data. Table 3 summarizes the results.
Use Case | Accuracy of the Results | Usefulness
Roles | High: There were a few false positives that were in the gray area. | High: Especially when going through a large list of user roles, this approach provides fairly fast results.
Credentials in files | High: Found more credentials than comparable tools. More testing should look into the false-positive rate in real scenarios. | Medium: This approach finds a lot more results; however, processing with Gemini is much slower (about five seconds per file) than many alternatives.
High-value targets | Medium: Not all results were new, nor were all of them high-value targets. | Medium: Some of the results were useful; however, all of them still require manual verification.
Account clustering | High: After discarding the clusters containing only one account, the remaining ones were clustered well. | High: Clustering users manually is usually a tedious process, and this approach gives fairly reliable results once single-account clusters are filtered out.
Computer correlation | High: All results correctly correlated users to their computers. | High: This approach produces accurate results, potentially providing insight into additional attack paths.
Table 3: The results of our experiments of data processing with Gemini
As the results show, using an LLM like Gemini can help in converting this type of data into structured data to aid attackers and defenders. However, keep in mind that LLMs are not a silver bullet and have limitations. For example, they can sometimes produce false positives or be slow to process large amounts of data.
There are quite a few use cases we have not covered in this blog post. Some other examples where you can use this approach are:
Correlating user groups to administrator privileges on workstations and servers
Summarizing internal website content or documentation to search for target systems
Ingesting documentation to generate password candidates for cracking passwords
The Future
This is just an initial step that we, the Advanced Capabilities team on the Mandiant Red Team, have taken in using LLMs for adversarial emulation and defense. Next, we plan to improve the models and prompts by testing variations, and to investigate other data sources to see whether Gemini can help analyze them. We are also looking at using linear regression models, as well as clustering and pathfinding algorithms, to enable cybersecurity practitioners to quickly evaluate attack paths that may exist in a network.
Managing applications across multiple Kubernetes clusters is complex, especially when those clusters span different environments or even cloud providers. One powerful and secure solution combines Google Kubernetes Engine (GKE) fleets and Argo CD, a declarative, GitOps continuous delivery tool for Kubernetes. The solution is further enhanced with Connect Gateway and Workload Identity.
This blog post guides you through setting up a robust, team-centric multi-cluster infrastructure with these offerings. We use a sample GKE fleet with application clusters for your workloads and a control cluster to host Argo CD. To streamline authentication and enhance security, we leverage Connect Gateway and Workload Identity, enabling Argo CD to securely manage clusters without having to manage cumbersome Kubernetes service accounts.
On top of this, we incorporate GKE Enterprise Teams to manage access and resources, helping to ensure that each team has the right permissions and namespaces within this secure framework.
Finally, we introduce the fleet-argocd-plugin, a custom Argo CD generator designed to simplify cluster management within this sophisticated setup. This plugin automatically imports your GKE Fleet cluster list into Argo CD and maintains synchronized cluster information, making it easier for platform admins to manage resources and for application teams to focus on deployments.
Follow along as we:
Create a GKE fleet with application and control clusters
Deploy Argo CD on the control cluster, configured to use Connect Gateway and Workload Identity
Configure GKE Enterprise Teams for granular access control
Install and leverage the fleet-argocd-plugin to manage your secure, multi-cluster fleet with team awareness
By the end, you’ll have a powerful and automated multi-cluster system using GKE Fleets, Argo CD, Connect Gateway, Workload Identity, and Teams, ready to support your organization’s diverse needs and security requirements. Let’s dive in!
Set up multi-cluster infrastructure with GKE fleet and Argo CD
Setting up a sample GKE fleet is a straightforward process:
1. Enable the required APIs in the desired Google Cloud Project. We use this project as the fleet host project.
a. gcloud SDK must be installed, and you must be authenticated via gcloud auth login.
# Create a frontend team.
gcloud container fleet scopes create frontend

# Add your application clusters to the frontend team.
gcloud container fleet memberships bindings create app-cluster-1-b \
  --membership app-cluster-1 \
  --scope frontend \
  --location us-central1

gcloud container fleet memberships bindings create app-cluster-2-b \
  --membership app-cluster-2 \
  --scope frontend \
  --location us-central1

# Create a fleet namespace for webserver.
gcloud container fleet scopes namespaces create webserver --scope=frontend

# [Optional] Verify your fleet team setup.
# Check member clusters in your fleet.
gcloud container fleet memberships list
# Verify member clusters have been added to the right team (`scope`).
gcloud container fleet memberships bindings list --membership=app-cluster-1 --location=us-central1
gcloud container fleet memberships bindings list --membership=app-cluster-2 --location=us-central1
4. Now, set up Argo CD and deploy it to the control cluster. Create a new GKE cluster to serve as the Argo CD control cluster, and enable Workload Identity on it.
5. Install the Argo CD CLI to interact with the Argo CD API server. Version 2.8.0 or higher is required. Detailed installation instructions can be found via the CLI installation documentation.
Now you’ve got your GKE fleet up and running, and you’ve installed Argo CD on the control cluster. In Argo CD, application clusters are registered with the control cluster by storing their credentials (like API server address and authentication details) as Kubernetes Secrets within the Argo CD namespace. We’ve got a way to make this whole process a lot easier!
8. To make sure the fleet-argocd-plugin works as it should, give it the right permissions for fleet management.
a. Create an IAM service account in your Argo CD control cluster and grant it the appropriate permissions. The setup follows the official onboarding guide of GKE Workload Identity Federation.
gcloud iam service-accounts create argocd-fleet-admin \
  --project=$FLEET_PROJECT_ID

gcloud projects add-iam-policy-binding $FLEET_PROJECT_ID \
  --member "serviceAccount:argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/container.developer"

gcloud projects add-iam-policy-binding $FLEET_PROJECT_ID \
  --member "serviceAccount:argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/gkehub.gatewayEditor"

gcloud projects add-iam-policy-binding $FLEET_PROJECT_ID \
  --member "serviceAccount:argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/gkehub.viewer"

# Allow ArgoCD application controller and fleet-argocd-plugin to impersonate this IAM service account.
gcloud iam service-accounts add-iam-policy-binding argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:$FLEET_PROJECT_ID.svc.id.goog[argocd/argocd-application-controller]"
gcloud iam service-accounts add-iam-policy-binding argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:$FLEET_PROJECT_ID.svc.id.goog[argocd/argocd-fleet-sync]"

# Annotate the Kubernetes ServiceAccount so that GKE sees the link between the service accounts.
kubectl annotate serviceaccount argocd-application-controller \
  --namespace argocd \
  iam.gke.io/gcp-service-account=argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com
b. You also need to allow the Google Compute Engine service account to access images from your artifacts repository.
Let’s do a quick check to make sure the GKE fleet and Argo CD are playing nicely together. You should see that the secrets for your application clusters have been automatically generated.
kubectl get secret -n argocd

# Example Output: TYPE DATA AGE
# app-cluster-1.us-central1.141594892609 Opaque 3 64m
# app-cluster-2.us-central1.141594892609 Opaque 3 64m
Demo 1: Automatic fleet management in Argo CD
Okay, let’s see how this works! We’ll use the guestbook example app. First, we deploy it to the clusters that the frontend team uses. You should then see the guestbook app running on your application clusters, and you won’t have to deal with any cluster secrets manually!
export TEAM_ID=frontend
envsubst '$FLEET_PROJECT_NUMBER $TEAM_ID' < applicationset-demo.yaml | kubectl apply -f - -n argocd

kubectl config set-context --current --namespace=argocd
argocd app list -o name
# Example Output:
# argocd/app-cluster-1.us-central1.141594892609-webserver
# argocd/app-cluster-2.us-central1.141594892609-webserver
Demo 2: Evolving your fleet is easy with fleet-argocd-plugin
Suppose you decide to add another cluster to the frontend team. Create a new GKE cluster and assign it to the frontend team. Then, check to see if your guestbook app has been deployed on the new cluster.
gcloud container clusters create app-cluster-3 --enable-fleet --region=us-central1
gcloud container fleet memberships bindings create app-cluster-3-b \
  --membership app-cluster-3 \
  --scope frontend \
  --location us-central1

argocd app list -o name
# Example Output: a new app shows up!
# argocd/app-cluster-1.us-central1.141594892609-webserver
# argocd/app-cluster-2.us-central1.141594892609-webserver
# argocd/app-cluster-3.us-central1.141594892609-webserver
Closing thoughts
In this blog post, we’ve shown you how to combine the power of GKE fleets, Argo CD, Connect Gateway, Workload Identity, and GKE Enterprise Teams to create a robust and automated multi-cluster platform. By leveraging these tools, you can streamline your Kubernetes operations, enhance security, and empower your teams to efficiently manage and deploy applications across your fleet.
However, this is just the beginning! There’s much more to explore in the world of multi-cluster Kubernetes. Here are some next steps to further enhance your setup:
Deep dive into GKE Enterprise Teams: Explore the advanced features of GKE Enterprise Teams to fine-tune access control, resource allocation, and namespace management for your teams. Learn more in the official documentation.
Secure your clusters with Connect Gateway: Delve deeper into Connect Gateway and Workload Identity to understand how they simplify and secure authentication to your clusters, eliminating the need for VPNs or complex network configurations. Check out this blog post for a detailed guide.
Master advanced deployment strategies: Explore advanced deployment strategies with Argo CD, such as blue/green deployments, canary releases, and automated rollouts, to achieve zero-downtime deployments and minimize risk during updates. This blog post provides a great starting point.
As you continue your journey with multi-cluster Kubernetes, remember that GKE fleets and Argo CD provide a solid foundation for building a scalable, secure, and efficient platform. Embrace the power of automation, GitOps principles, and team-based management to unlock the full potential of your Kubernetes infrastructure.
As AI models increase in sophistication, there’s increasingly large model data needed to serve them. Loading the models and weights along with necessary frameworks to serve them for inference can add seconds or even minutes of scaling delay, impacting both costs and the end-user’s experience.
For example, inference servers such as Triton, Text Generation Inference (TGI), or vLLM are packaged as containers that are often over 10GB in size; this can make them slow to download, and extend pod startup times in Kubernetes. Then, once the inference pod starts, it needs to load model weights, which can be hundreds of GBs in size, further adding to the data loading problem.
This blog explores techniques to accelerate data loading for both inference serving containers and downloading models + weights, so you can accelerate the overall time to load your AI/ML inference workload on Google Kubernetes Engine (GKE).
1. Accelerating container load times using secondary boot disks to cache container images with your inference engine and applicable libraries directly on the GKE node.
The image above shows a secondary boot disk (1) that stores the container image ahead of time, avoiding the image download process during pod/container startup. And for AI/ML inference workloads with demanding speed and scale requirements, Cloud Storage Fuse (2) and Hyperdisk ML (3) are options to connect the pod to model + weight data stored in Cloud Storage or a network-attached disk. Let's look at each of these approaches in more detail below.
Accelerating container load times with secondary boot disks
GKE lets you pre-cache your container image into a secondary boot disk that is attached to your node at creation time. The benefit of loading your containers this way is that you skip the image download step and can begin launching your containers immediately, which drastically improves startup time. The diagram below shows container image download times grow linearly with container image size. Those times are then compared with using a cached version of the container image that is pre-loaded on the node.
Caching a 16GB container image ahead of time on a secondary boot disk has shown reductions in load time of up to 29x when compared with downloading the container image from a container registry. Additionally, this approach lets you benefit from the acceleration independent of container size, allowing for large container images to be loaded predictably fast!
To use secondary boot disks, first create a disk that contains all of your container images, create a disk image from it, and then specify that disk image as a secondary boot disk when creating your GKE node pools. For more, see the documentation.
Accelerating model weights load times
Many ML frameworks output their checkpoints (snapshots of model weights) to object storage such as Google Cloud Storage, a common choice for long-term storage. Using Cloud Storage as the source of truth, there are two main products to retrieve your data at the GKE-pod level: Cloud Storage Fuse and Hyperdisk ML (HdML).
When selecting one product or the other there are two main considerations:
Performance – how quickly can the data be loaded by the GKE node
Operational simplicity – how easy is it to update this data
Cloud Storage Fuse provides a direct link to Cloud Storage for model weights that reside in object storage buckets. Additionally there is a caching mechanism for files that need to be read multiple times to prevent additional downloads from the source bucket (which adds latency). Cloud Storage Fuse is appealing because there are no pre-hydration operational activities for a pod to do to download new files in a given bucket. It’s important to note that if you switch buckets that the pod is connected to, you will need to restart the pod with an updated Cloud Storage Fuse configuration. To further improve performance, you can enable parallel downloads, which spawns multiple workers to download a model, significantly improving model pull performance.
Hyperdisk ML gives you better performance and scalability than downloading files directly to the pod from Cloud Storage or another online location. Additionally, you can attach up to 2,500 nodes to a single Hyperdisk ML instance, with aggregate bandwidth of up to 1.2 TiB/s. This makes it a strong choice for inference workloads that span many nodes and where the same data is downloaded repeatedly in a read-only fashion. To use Hyperdisk ML, load your data onto the Hyperdisk ML disk before using it, and again upon each update. Note that this adds operational overhead if your data changes frequently.
Which model + weight loading product you use depends on your use case. For example, Hyperdisk ML volumes are zonal, although data can be made regional with an automated GKE clone feature that makes it available across zones, and updating the data means creating a new persistent volume, loading the new data, and redeploying the pods whose PVC references the new volume. As you can see, there are other considerations besides throughput to take into account when architecting a performant model-loading strategy.
Conclusion
Loading large AI models, weights, and container images for GKE-based AI workloads can delay startup times. By using a combination of the methods described above, secondary boot disks for container images and Hyperdisk ML or Cloud Storage Fuse for models and weights, you can accelerate data load times for your AI/ML inference applications.
As generative AI evolves, we’re beginning to see the transformative potential it is having across industries and our lives. And as large language models (LLMs) increase in size — current models are reaching hundreds of billions of parameters, and the most advanced ones are approaching 2 trillion — the need for computational power will only intensify. In fact, training these large models on modern accelerators already requires clusters that exceed 10,000 nodes.
With support for 15,000-node clusters — the world’s largest — Google Kubernetes Engine (GKE) has the capacity to handle these demanding training workloads. Today, in anticipation of even larger models, we are introducing support for 65,000-node clusters.
With support for up to 65,000 nodes, we believe GKE offers more than 10X larger scale than the other two largest public cloud providers.
Unmatched scale for training or inference
Scaling to 65,000 nodes provides much-needed capacity to the world’s most resource-hungry AI workloads. Combined with innovations in accelerator computing power, this will enable customers to reduce model training time or scale models to multi-trillion parameters or more. Each node is equipped with multiple accelerators (e.g., Cloud TPU v5e node with four chips), giving the ability to manage over 250,000 accelerators in one cluster.
To develop cutting-edge AI models, customers need to be able to allocate computing resources across diverse workloads. This includes not only model training but also serving, inference, conducting ad hoc research, and managing auxiliary tasks. Centralizing computing power within the smallest number of clusters provides customers the flexibility to quickly adapt to changes in demand from inference serving, research and training workloads.
With support for 65,000 nodes, GKE now allows running five jobs in a single cluster, each matching the scale of Google Cloud’s previous world record for the world’s largest training job for LLMs.
Customers on the cutting edge of AI welcome these developments. Anthropic is an AI safety and research company that’s working to build reliable, interpretable, and steerable AI systems, and is excited for GKE’s expanded scale.
“GKE’s new support for larger clusters provides the scale we need to accelerate our pace of AI innovation.” – James Bradbury, Head of Compute, Anthropic
Innovations under the hood
This achievement is made possible by a variety of enhancements. For one, we are transitioning GKE from etcd, the open-source distributed key-value store, to a new, more robust key-value store based on Spanner, Google's distributed database that delivers virtually unlimited scale. On top of the ability to support larger GKE clusters, this change will usher in new levels of reliability for GKE users, providing improved latency of cluster operations (e.g., cluster startup and upgrades) and a stateless cluster control plane. By implementing the etcd API for our Spanner-based storage, we help ensure backward compatibility and avoid having to make changes in core Kubernetes to adopt the new technology.
In addition, thanks to a major overhaul of the GKE infrastructure that manages the Kubernetes control plane, GKE now scales significantly faster, meeting the demands of your deployments with fewer delays. This enhanced cluster control plane delivers multiple benefits, including the ability to run high-volume operations with exceptional consistency. The control plane now automatically adjusts to these operations, while maintaining predictable operational latencies. This is particularly important for large and dynamic applications such as SaaS, disaster recovery and fallback, batch deployments, and testing environments, especially during periods of high churn.
We’re also constantly innovating on IaaS and GKE capabilities to make Google Cloud the best place to build your AI workloads. Recent innovations in this space include:
Secondary boot disk, which provides faster workload startups through container image caching
Custom compute classes, which offer greater control over compute resource allocation and scaling
Support for Trillium, our sixth-generation TPU, the most performant and most energy-efficient TPU to date
Support for A3 Ultra VM powered by NVIDIA H200 Tensor Core GPUs with our new Titanium ML network adapter, which delivers non-blocking 3.2 Tbps of GPU-to-GPU traffic with RDMA over Converged Ethernet (RoCE). A3 Ultra VMs will be available in preview next month.
A continued commitment to open source
Guided by Google’s long-standing and robust open-source culture, we make substantial contributions to the open-source community, including when it comes to scaling Kubernetes. With support for 65,000-node clusters, we made sure that all necessary optimizations and improvements for such scale are part of the core open-source Kubernetes.
Our investments to make Kubernetes the best foundation for AI platforms go beyond scalability. Here is a sampling of our contributions to the Kubernetes project over the past two years:
Incubated the K8S Batch Working Group to build a community around research, HPC and AI workloads, producing tools like Kueue.sh, which is becoming the de facto standard for job queueing on Kubernetes
Created the JobSet operator, which is being integrated into the Kubeflow ecosystem to help run heterogeneous jobs (e.g., driver-executor)
For multihost inference use cases, created the Leader Worker Set controller
Published JetStream, a highly optimized internal model server
Incubated the Kubernetes Serving Working Group, which is driving multiple efforts including model metrics standardization, Serving Catalog and Inference Gateway
At Google Cloud, we’re dedicated to providing the best platform for running containerized workloads, consistently pushing the boundaries of innovation. These new advancements allow us to support the next generation of AI technologies. For more, listen to the Kubernetes podcast, where Maciek Rozacki and Wojtek Tyczynski join host Kaslin Fields to talk about GKE’s support for 65,000 nodes. You can also see a demo on 65,000 nodes on a single GKE cluster here.
Rapidly evolving generative AI models place unprecedented demands on the performance and efficiency of hardware accelerators. Last month, we launched our sixth-generation Tensor Processing Unit (TPU), Trillium, to address the demands of next-generation models. Trillium is purpose-built for performance at scale, from the chip to the system to our Google data center deployments, to power ultra-large scale training.
Today, we present our first MLPerf training benchmark results for Trillium. The MLPerf 4.1 training benchmarks show that Trillium delivers up to 1.8x better performance-per-dollar compared to prior-generation Cloud TPU v5p and an impressive 99% scaling efficiency (throughput).
In this blog, we offer a concise analysis of Trillium’s performance, demonstrating why it stands out as the most performant and cost-efficient TPU training system to date. We begin with a quick overview of system comparison metrics, starting with traditional scaling efficiency. We introduce convergence scaling efficiency as a crucial metric to consider in addition to scaling efficiency. We assess these two metrics along with performance per dollar and present a comparative view of Trillium against Cloud TPU v5p. We conclude with guidance that you can use to make an informed choice for your cloud accelerators.
Traditional performance metrics
Accelerator systems can be evaluated and compared across multiple dimensions, ranging from peak throughput to effective throughput to throughput scaling efficiency. Each of these metrics is a helpful indicator, but none of them takes convergence time into consideration.
Hardware specifications and peak performance
Traditionally, comparisons have focused on hardware specifications like peak throughput, memory bandwidth, and network connectivity. While these peak values establish theoretical boundaries, they are poor predictors of real-world performance, which depends heavily on architectural design and software implementation. Since modern ML workloads typically span hundreds or thousands of accelerators, the key metric is the effective throughput of an appropriately sized system for a specific workload.
Utilization performance
System performance can be quantified through utilization metrics like effective model FLOPS utilization (EMFU) and memory bandwidth utilization (MBU), which measure achieved throughput versus peak capacity. However, these hardware efficiency metrics don’t directly translate to business-value measures like training time or model quality.
Scaling efficiency and trade-offs
A system’s scalability is evaluated through both strong scaling (performance improvement with system size for fixed workloads) and weak scaling (efficiency when increasing both workload and system size proportionally). While both metrics are valuable indicators, the ultimate goal is to achieve high-quality models quickly, sometimes making it worthwhile to trade scaling efficiency for faster training time or better model convergence.
The need for convergence scaling efficiency
While hardware utilization and scaling metrics provide important system insights, convergence scaling efficiency focuses on the fundamental goal of training: reaching model convergence efficiently. Convergence refers to the point where a model’s output stops improving and the error rate becomes constant. Convergence scaling efficiency measures how effectively additional computing resources accelerate the training process to completion.
We define convergence scaling efficiency using two key measurements: the base case, where a cluster of N₀ accelerators achieves convergence in time T₀, and a scaled case with N₁ accelerators taking time T₁ to converge. The ratio of the speedup in convergence time to the increase in cluster size gives us:
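convergence scaling efficiency = (T₀ / T₁) / (N₁ / N₀)

where T₀ / T₁ is the speedup in time to convergence and N₁ / N₀ is the increase in cluster size, matching the definitions above. For example, a 2.4x speedup on a cluster 3x the size gives 2.4 / 3 = 0.8.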
A convergence scaling efficiency of 1 indicates that time-to-solution improves by the same ratio as the cluster size. It is therefore desirable to have convergence scaling efficiency as close to 1 as possible.
Now let's apply these concepts to understand our MLPerf submission for the GPT3-175b training task using Trillium and Cloud TPU v5p.
Trillium performance
We submitted GPT3-175b training results for four different Trillium configurations, and three different Cloud TPU v5p configurations. In the following analysis, we group the results by cluster sizes with the same total peak flops for comparison purposes. For example, the Cloud TPU v5p-4096 configuration is compared to 4xTrillium-256, and Cloud TPU v5p-8192 is compared with 8xTrillium-256, and so on.
All results presented in this analysis are based on MaxText, our high-performance reference implementation for Cloud TPUs and GPUs.
Weak scaling efficiency
For increasing cluster sizes with proportionately larger batch-sizes, both Trillium and TPU v5p deliver near linear scaling efficiency:
Figure-1: Weak scaling comparison for Trillium and Cloud TPU v5p. v5p-4096 and 4xTrillium-256 are considered as base for scaling factor measurement. n x Trillium-256 corresponds to n Trillium pods with 256 chips in one ICI domain. v5p-n corresponds to n/2 v5p chips in a single ICI domain.
Figure 1 demonstrates relative throughput scaling as cluster sizes increase from the base configuration. Trillium achieves 99% scaling efficiency even when operating across data-center networks using Cloud TPU multislice technology, outperforming the 94% scaling efficiency of Cloud TPU v5p cluster within a single ICI domain. For these comparisons, we used a base configuration of 1024 chips (4x Trillium-256 pods), establishing a consistent baseline with the smallest v5p submission (v5p-4096; 2048 chips). When measured against our smallest submitted configuration of 2x Trillium-256 pods, Trillium maintains a strong 97.6% scaling efficiency.
Convergence scaling efficiency
As stated above, weak scaling is useful but not a sufficient indicator of value, while convergence scaling efficiency brings time-to-solution into consideration.
Figure-2: Convergence scaling comparison for Trillium and Cloud TPU v5p.
For the largest cluster size, we observed comparable convergence scaling efficiency for Trillium and Cloud TPU v5p. In this example, a CSE of 0.8 means that for the rightmost configuration, the cluster size was 3x the (base) configuration, while the time to convergence improved by 2.4x with respect to the base configuration (2.4/3 = 0.8).
While the convergence scaling efficiency is comparable between Trillium and TPU v5p, where Trillium really shines is by delivering the convergence at a lower cost, which brings us to the last metric.
Cost-to-train
While weak scaling efficiency and convergence scaling efficiency indicate scaling properties of systems, we’ve yet to look at the most crucial metric: the cost to train.
Figure-3: Comparison of cost-to-train based on the wall-clock time and the on-demand list price for Cloud TPU v5p and Trillium.
Trillium lowers the cost to train by up to 1.8x (45% lower) compared to TPU v5p while delivering convergence to the same validation accuracy.
Making informed cloud accelerator choices
In this article, we explored the complexities of comparing accelerator systems, emphasizing the importance of looking beyond simple metrics to assess true performance and efficiency. We saw that while peak performance metrics provide a starting point, they often fall short in predicting real-world utility. Instead, metrics like Effective Model Flops Utilization (EMFU) and Memory Bandwidth Utilization (MBU) offer more meaningful insights into an accelerator’s capabilities.
We also highlighted the critical importance of scaling characteristics — both strong and weak scaling — in evaluating how systems perform as workloads and resources grow. However, the most objective measure we identified is the convergence scaling efficiency, which ensures that we’re comparing systems based on their ability to achieve the same end result, rather than just raw speed.
Applying these metrics to our benchmark submission with GPT3-175b training, we demonstrated that Trillium achieves comparable convergence scaling efficiency to Cloud TPU v5p while delivering up to 1.8x better performance per dollar, thereby lowering the cost-to-train. These results highlight the importance of evaluating accelerator systems through multiple dimensions of performance and efficiency.
For ML-accelerator evaluation, we recommend a comprehensive analysis combining resource utilization metrics (EMFU, MBU), scaling characteristics, and convergence scaling efficiency. This multi-faceted approach enables you to make data-driven decisions based on your specific workload requirements and scale.
Every November, we start sharing forward-looking insights on threats and other cybersecurity topics to help organizations and defenders prepare for the year ahead. The Cybersecurity Forecast 2025 report, available today, plays a big role in helping us accomplish this mission.
This year’s report draws on insights directly from Google Cloud’s security leaders, as well as dozens of analysts, researchers, responders, reverse engineers, and other experts on the frontlines of the latest and largest attacks.
Built on trends we are already seeing today, the Cybersecurity Forecast 2025 report provides a realistic outlook of what organizations can expect to face in the coming year. The report covers a lot of topics across all of cybersecurity, with a focus on various threats such as:
Attacker Use of Artificial Intelligence (AI): Threat actors will increasingly use AI for sophisticated phishing, vishing, and social engineering attacks. They will also leverage deepfakes for identity theft, fraud, and bypassing security measures.
AI for Information Operations (IO): IO actors will use AI to scale content creation, produce more persuasive content, and enhance inauthentic personas.
The Big Four: Russia, China, Iran, and North Korea will remain active, engaging in espionage operations, cyber crime, and information operations aligned with their geopolitical interests.
Ransomware and Multifaceted Extortion: Ransomware and multifaceted extortion will continue to be the most disruptive form of cyber crime, impacting various sectors and countries.
Infostealer Malware: Infostealer malware will continue to be a major threat, enabling data breaches and account compromises.
Democratization of Cyber Capabilities: Increased access to tools and services will lower barriers to entry for less-skilled actors.
Compromised Identities: Compromised identities in hybrid environments will pose significant risks.
Web3 and Crypto Heists: Web3 and cryptocurrency organizations will increasingly be targeted by attackers seeking to steal digital assets.
Faster Exploitation and More Vendors Targeted: The time to exploit vulnerabilities will continue to decrease, and the range of targeted vendors will expand.
Be Prepared for 2025
Read the Cybersecurity Forecast 2025 report for a more in-depth look at these and other threats, as well as other security topics such as post-quantum cryptography, and insights unique to the JAPAC and EMEA regions.
For an even deeper look at the threat landscape next year, register for our Cybersecurity Forecast 2025 webinar, which will be hosted once again by threat expert Andrew Kopcienski.
For even more insights, hear directly from our security leaders: Charles Carmakal, Sandra Joyce, Sunil Potti, and Phil Venables.
Have you heard of the monkey and the pedestal? Astro Teller, the head of Google’s X “moonshot factory,” likes to use this metaphor to describe tackling the biggest challenge first, despite being tempted by the endorphin boost of completing more familiar tasks.
It’s a challenge startups know well. When you’re re-inventing the industry standard, it’s all about failing fast. You’re looking for the quickest way to get to a “no” so you’re another step closer to reaching a “yes.” Every day you gain back from abandoning trivial features in favor of focusing on the biggest challenge becomes a day closer to your goal.
Fortunately, AI is not only playing an increasing role in the offerings of startups but also how they build those offerings, accelerating their execution and giving them new insights to act faster and iterate better.
What's the fastest way you're going to get your product launched? Piecing together data across your front and back ends in yet another platform only creates latency and a poor user experience. Many of the successful, funded gen AI startups — more than 60% of whom are building on Google Cloud — are using Vertex AI as their development host and production backbone to accelerate innovation. In this moment of rapid transformation, every day matters.
Our mission at Google Cloud is to support ambitious startups, like the three profiled below who are driving innovation in customer service, healthcare research, and identity verification. Abstrakt, NextNet, and Ferret are among the long list of startups using Google Cloud’s AI-optimized infrastructure and Vertex AI platform to accelerate their innovation.
NextNet
NextNet is a specialized search engine for life sciences and pharmaceutical researchers that uses AI to analyze vast amounts of biomedical data. Leveraging Google Cloud Vertex AI and Gemini, it identifies hidden relationships and patterns within scientific literature, allowing researchers to ask complex questions in plain language and receive accurate answers. This accelerates research and drives innovation in medicine by facilitating a deeper understanding of complex biomedical information.
Specifically, NextNet uses Gemini for natural language processing and knowledge extraction, outperforming other commercial AI models in this domain. It also utilizes Vertex AI and other managed services to efficiently develop SaaS offerings and scale its knowledge base.
“Gemini, as a production platform, has been incredibly useful and allowed us to evaluate scientific research with subtlety and clarity,” Steven Banerjee, the CEO of NextNet, said. “On our specific language tasks, Gemini has equaled or outperformed other commercial AI models. We are extracting scientific insights now that would not have been possible 12 or 18 months ago. And the iteration speed of Google’s generative models has meant that we are staying state of the art.”
Abstrakt
Abstrakt focuses on enhancing contact center customer experiences through the use of generative AI. They leverage Google Cloud’s robust infrastructure and the Vertex AI suite to transcribe calls in real-time while simultaneously evaluating sentiment.
Their mission is to empower teams to have more meaningful and effective conversations with customers in real time, helping both call center workers and their customers resolve issues faster, so even more can get the help they need. Abstrakt aims to achieve this by providing instantaneous guidance and insights during calls, transparent progress tracking, and AI-guided coaching, leading to continued improvement for workers and customers alike.
Ferret.ai
Ferret.ai is using AI to offer transparent insights about the backgrounds of people in your personal and professional network. In a world where reputational risks seem to be growing and rarely go away thanks to digital “receipts,” Ferret is using world-class global data alongside AI to provide a curated relationship intelligence and monitoring solution to help avoid risk and identify opportunities.
The unique platform built by Ferret.ai pieces together information and finds patterns by using generative AI to analyze information, verify the source, assess its credibility, and achieve contextual understanding that identifies sentiment. They also use pattern recognition to analyze vast datasets to uncover potential red flags or inconsistencies that could be missed by human analysts. This is valuable for investors, businesses, and individuals who want to avoid scams, make smart partnerships, and ensure their safety.
Faster innovation, faster time to market
These founders saw significant pain points and directed all of their resources to solving these problems for their customers. Deploying packaged back-end solutions, like Vertex AI’s unified development platform, benefited their speed to market. When Google Cloud takes care of model accuracy and performance, you’re freed up to own what you do best.
Your needs as a startup can evolve quickly based on the dynamics of the market. Importantly, our open ecosystem of models and APIs offers flexibility as you adapt and grow.
Go tackle your biggest challenges and let Google Cloud provide you with the most secure, fast, scalable platform so you can focus on the solutions that matter most to your users. For help getting started, you can apply for the Google Cloud for Startups Program or reach out to one of our startup specialists today.
At Google Cloud, we’re fortunate to partner with organizations that employ some of the world’s most talented and innovative professionals. Together, we’re reshaping industries, driving customer success, and pushing the boundaries of what’s possible. Our partners are more than collaborators — they’re the change-makers defining the future of business.
The Google Cloud Partner All-stars program celebrates these remarkable people. Each year, we recognize those who go above and beyond, leading with passion, innovation, and a commitment to excellence. These are the people driving our industry forward, and we’re thrilled to honor them for 2024.
2024 Spotlight: Artificial Intelligence
For 2024, we’re excited to introduce a new category that highlights the power and potential of Artificial Intelligence (AI). As AI redefines the business and technology landscape, we’re proud to recognize those who are not just using AI, but actively shaping its future.
The Artificial Intelligence category honors those visionary leaders spearheading AI initiatives with bold ideas, experimentation, and ethical stewardship. They’re bringing AI from concept to reality, unlocking new possibilities, and driving meaningful results for their clients. These Partner All-stars are building the future, one breakthrough at a time.
What sets Partner All-stars apart?
The following attributes define the standout qualities of a Partner All-star:
Artificial Intelligence
Provides a clear vision for AI’s transformative potential in the business
Champions AI initiatives by securing resources, driving adoption, and promoting collaboration
Leads experimentation with AI, generating innovative solutions and tangible results for clients
Demonstrates a commitment to ethical AI practices, ensuring responsible and fair use
Delivery excellence
Top-ranked individuals on Google Cloud’s Delivery Readiness Portal (DRP)
Demonstrates commitment to technical excellence by passing advanced delivery challenge labs and other advanced technical training
Demonstrates excellent knowledge and adoption of Google Cloud delivery enablement methodologies, assets, and offerings
Exhibits expertise through customer project and deployment experience
Marketing
Drives strategic programs and key events that address customer concerns and interests
Works across cross-functional teams to ensure the success of key campaigns and events
Takes a data-driven approach to marketing, investing time and resources in programs that drive the biggest impact
Always exploring areas of opportunity and improvement in order to uplevel future work
Sales
Consistently meets and exceeds sales goals and targets
Aligns on shared goals to deliver amazing end-to-end customer experiences
Prioritizes long-term customer-relationship building over short-term selling
Demonstrates commitment to the customer transformation journey
Solutions engineering
Delivers superior customer experiences by keeping professional skills up to date, earning at least one Google technical certification
Embraces customer challenges head-on, taking responsibility for end-to-end solutioning
Works with purpose, providing deliverables in a timely manner while never compromising quality
Works effectively across joint product areas, leveraging technology in new and innovative ways to address customer needs
Celebrating excellence in 2024
On behalf of the entire Google Cloud team, I want to extend our heartfelt congratulations to the 2024 Google Cloud Partner All-stars, whom we have notified of this distinction. Their dedication, innovation, and leadership continue to inspire us and drive success for our customers.
Stay tuned as we celebrate this year’s Partner All-stars and join the conversation by following #PartnerAllstars across social media.
Leveraging first-party data, and improving data quality in general, are major priorities for online retailers. While first-party data certainly comes with challenges, it also offers a great opportunity to increase transparency, redefine customer interactions, and create more meaningful user experiences.
Here at PUMA, we’re already taking steps to seize the opportunities presented by signal loss as organizations embrace privacy-preserving technologies. Our motto “Forever.Faster.” isn’t just about athletic performance; it also describes our rapid response to market changes. To that end, we’re partnering with Google Cloud to leverage the capabilities of machine learning (ML) for greater customer engagement via advanced audience segmentation.
Moving from manual segmentation to advanced audiences
In August 2022 we decided to test Google Cloud’s machine-learning capabilities to create advanced audiences based on high purchase propensity with different data sets in BigQuery. While Google Analytics offers predictive audiences, we used this pilot to build a custom ML model tailored to our specific needs, deepening our expertise and giving us more control over the underlying data. Designing our own machine learning model allows us to analyze and extract valuable insights from first-party data, enable predictive analytics, and attribute conversions and interactions to the right touchpoints.
The core products used in the process included Cloud Shell for framework setup, Instant BQML as the quick-start tool for audience configuration, CRMint for orchestration, and BigQuery for advanced analytics capabilities. The modeling and machine learning occur within BigQuery, while CRMint aids in data integration and audience creation within Google Analytics. When Google Analytics is linked to Google Ads, audience segments are shared automatically with Google Ads, where they can be activated in a number of strategic ways.
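For readers curious what the modeling step can look like in practice, here is a minimal sketch of a purchase-propensity model built with BigQuery ML and driven from Python. It is not our actual pipeline: the project, dataset, table, and feature names are hypothetical, and in our setup the orchestration and audience export to Google Analytics were handled by Instant BQML and CRMint rather than hand-written code.

```python
# A minimal sketch of a purchase-propensity pipeline in BigQuery ML, run via
# the google-cloud-bigquery client. The project, dataset, table, and column
# names are hypothetical placeholders; the label column "purchased" is
# assumed to be a 0/1 integer.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # assumed project ID

# Train a logistic-regression propensity model on exported analytics features.
train_sql = """
CREATE OR REPLACE MODEL `my-project.audiences.purchase_propensity`
OPTIONS (model_type = 'LOGISTIC_REG', input_label_cols = ['purchased']) AS
SELECT
  sessions_last_30d,
  items_viewed_last_30d,
  engagement_time_msec,
  purchased
FROM `my-project.analytics_export.training_features`
"""
client.query(train_sql).result()  # wait for training to complete

# Score current users and keep the high-propensity segment for activation.
predict_sql = """
SELECT
  user_pseudo_id,
  (SELECT prob FROM UNNEST(predicted_purchased_probs) WHERE label = 1) AS propensity
FROM ML.PREDICT(
  MODEL `my-project.audiences.purchase_propensity`,
  TABLE `my-project.analytics_export.current_features`)
"""
for row in client.query(predict_sql).result():
    if row.propensity >= 0.8:  # illustrative threshold for the advanced audience
        print(row.user_pseudo_id, round(row.propensity, 3))
```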
The Google Cloud and gTech Ads teams worked closely with us throughout the set-up and deployment, which was fast and efficient. Generally speaking, we were impressed with the support we received throughout the process, which was highly collaborative from initiation to execution. The Google teams offered guidance and resources throughout, and their support enabled us to leverage the advanced analytics capabilities of BigQuery to build our own predictive audience model and identify the users most likely to make a purchase. We also appreciated the amount of available documentation, which made things much easier for our developers.
Engaging the right users with advanced analytics
This was one of the first ML marketing analytics use cases at PUMA, and it turned out to be a very positive experience. Within the first six months, the click-through rate (CTR) of our advanced audience segments was significantly higher than that of our other website visitor audiences, and indeed any other audience.
Among the 10 designated audiences, the top three showed a 149.8% increase in click-through rate compared to other audiences used for advertising. Additionally, we observed a 4.6% increase in conversion rate and a 6% increase in average order value (AOV) compared to the previous setup.
In addition to these results, which are helping us take steps towards increasing revenue, the new solutions are also enabling us to optimize and predict costs. Pricing is well structured, flexible, and transparent, and we can easily identify exactly where we’re spending money.
We’re looking forward to continuing to partner with Google Cloud as we work to adapt our advertising strategy to signal loss, which has been happening for years.
Our next step is to explore the development of advanced audiences using PUMA’s internal data, such as offline purchase information or other data not captured by Google Ads or Google Analytics. This opens up new opportunities to reach consumers we’re currently missing, while expanding the size of our audiences. At the same time, we’ll be scaling advanced audiences to all of our 20+ international entities.
We’re also exploring server-side tagging with Tag Manager, and in one market we’re experimenting with real-time reporting based on server-side data collection, with promising results so far.
We’re looking to implement an event-driven architecture leveraging Google Cloud’s services, as part of a broader strategy to reorganize and restructure our data-management processes so they better support and operationalize AI use cases for both our organization and our customers.
This project has opened our eyes to the possibilities of data-driven, ML-automated audience creation. The fact that it was so easy to deploy has also bolstered our confidence when it comes to machine-learning projects in general. We look forward to a long-term partnership with Google Cloud and are excited to see where the future will take us.
In today’s data- and AI-driven world, organizations are grappling with an ever-growing volume of structured and unstructured data. This growth makes it increasingly challenging to locate the right data at the right time, and a significant portion of enterprise data remains undiscovered or underutilized — what’s often referred to as “dark data.” In fact, a staggering 66% of organizations report that at least half of their data falls into this category.
To address this challenge, today we’re announcing automatic discovery and cataloging of Google Cloud Storage data with Dataplex, part of BigQuery’s unified platform for intelligent data to AI governance. This powerful capability empowers organizations to:
Automatically discover valuable data assets residing within Cloud Storage, including structured and unstructured data such as documents, files, PDFs, images, and more.
Harvest and catalog metadata for your discovered assets, keeping schema definitions up to date as data evolves with built-in compatibility checks and partition detection.
Enable analytics for data science and AI use cases at scale with auto-created BigLake, external or object tables, eliminating the need for data duplication or manually creating table definitions.
How it works
The automatic discovery and cataloging process in Dataplex is designed to be integrated and efficient, and performs the following steps:
Discovery scan: You configure a discovery scan using the BigQuery Studio UI or the gcloud CLI; the scan then crawls your Cloud Storage bucket, which can contain up to millions of files, identifying and classifying data assets.
Metadata extraction: Relevant metadata, including schema definitions and partition information, is extracted from the discovered assets.
Creation of dataset and tables in BigQuery: A new dataset with numerous BigLake, external, or object tables (for unstructured data) is automatically created in BigQuery with accurate, up-to-date table definitions. For scheduled scans, these tables are updated as the data in the Cloud Storage bucket evolves.
Analytics and AI preparation: The published dataset and tables are available for analysis, processing, data science, and AI use cases in BigQuery, as well as open-source engines like Spark, Hive, and Pig (a minimal query sketch follows this list).
Catalog integration: All BigLake tables are integrated into the Dataplex catalog, making them easily searchable and accessible.
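As a minimal sketch of step 4, once a scan has published its dataset, the auto-created tables behave like any other BigQuery tables. The project, dataset, and table names below are hypothetical placeholders; the actual names depend on your scan configuration and bucket layout.

```python
# Minimal sketch: inspect and query tables that a Dataplex discovery scan has
# auto-created in its published BigQuery dataset. Names are placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# List the BigLake / external / object tables the scan created.
for table in client.list_tables("my-project.discovered_bucket_dataset"):
    print(table.table_id, table.table_type)

# Query one of the auto-created tables directly; no manual DDL is required.
query = """
SELECT *
FROM `my-project.discovered_bucket_dataset.sales_csv`
LIMIT 10
"""
for row in client.query(query).result():
    print(dict(row))
```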
Key benefits
Dataplex’s automatic discovery and cataloging feature offers a multitude of benefits for organizations:
Enhanced data visibility: Gain a clear understanding of your data and AI assets across Google Cloud, eliminating the guesswork and reducing the time spent searching for relevant information.
Reduced manual effort: Cut back on the toil and effort of creating table definitions manually by letting Dataplex scan the bucket and create numerous BigLake tables that correspond to your data in Cloud Storage.
Accelerated analytics and AI: Integrate the data that’s discovered into your analytics and AI workflows, unlocking valuable insights and driving informed decision-making.
Simplified data access: Provide authorized users with easy access to the data they need, while maintaining appropriate security and control measures.
Automatic discovery and cataloging in Dataplex marks a significant step forward in helping organizations unlock the full potential of their data. By eliminating the challenges associated with dark data and providing a comprehensive, searchable catalog of your Cloud Storage assets, Dataplex empowers you to make data-driven decisions with confidence.
We encourage you to explore this powerful new feature and experience the benefits firsthand. To learn more and get started, please visit the Dataplex documentation or contact our team for assistance.
At Google Cloud, we recognize that helping customers and government agencies keep tabs on vulnerabilities plays a critical role in securing consumers, enterprises, and software vendors.
We have seen the Common Vulnerabilities and Exposure (CVE) system evolve into an essential part of building trust across the IT ecosystem. CVEs can help users of software and services identify vulnerabilities that require action, and they have become a global, standardized tracking mechanism that includes information crucial to identifying and prioritizing each vulnerability.
As part of our continued commitment to security and transparency on vulnerabilities found in our products and services, effective today we will be issuing CVEs for critical Google Cloud vulnerabilities, even when we do not require customer action or patching.
To help users easily recognize that a Google Cloud vulnerability does not require customer action, we will annotate the CVE record with the “exclusively-hosted-service” tag. No action is required by customers in relation to this announcement at this time.
“Transparency and shared action, to learn from and mitigate whole classes of vulnerability, is a vital part of countering bad actors. We will continue to lead and innovate across the community of defenders,” said Phil Venables, CISO, Google Cloud.
Our commitment to vulnerability transparency
The Cyber Safety Review Board (CSRB) has found that a lack of a strong commitment to security creates preventable errors and serious breaches, a pressing concern for major platform providers, who have a responsibility to advance security best practices. We can see why the CSRB emphasized best practices for cloud service providers in its report on Storm-0558, which detailed how the APT group used forged authentication tokens to gain access to email accounts at around 25 organizations, including government agencies.
By partnering with the industry through programs including Cloud VRP, and driving visibility on vulnerabilities with CVEs, we believe we are advancing security best practices at scale. CVEs are publicly disclosed and can be used by anyone to track and identify vulnerabilities, which has helped our customers to understand their security posture better. Ultimately, issuing CVEs helps us build your trust in Google Cloud as a secure cloud partner for your enterprise and business needs.
As we noted in our Secure By Design paper, Google has a 20-year history of collaborating with external security researchers, whose independent work discovering vulnerabilities has been helpful to Google. Our vulnerability reporting process encourages direct engagement as part of our community-based approach to addressing security concerns.
This same community-focused journey took us down the path of launching our first CVE Numbering Authority in 2011. Since then, we’ve issued more than 8,000 CVEs across our consumer and enterprise products. We’ve since expanded our partnership with MITRE, and Google became one of their four Top-Level Roots in 2022.
Today’s announcement marks an important step Google Cloud is making to normalize a culture of transparency around security vulnerabilities, and aligns with our shared fate model, in which we work with our customers to continuously improve security.
While the Google Cloud VRP has a specific focus on strengthening Google Cloud products and services, and brings together our engineers with external security researchers to further the security posture for all our customers, CVEs enable us to help our customers and security researchers track publicly-known vulnerabilities.
Earlier this year, Google Cloud launched the highly anticipated C4 machine series, built on the latest Intel Xeon Scalable processors (5th Gen Emerald Rapids) and setting a new industry-leading performance standard for both Google Compute Engine (GCE) and Google Kubernetes Engine (GKE) customers. C4 VMs deliver exceptional performance improvements and are designed to handle your most performance-sensitive workloads, delivering up to a 25% price-performance improvement over the previous-generation general-purpose VMs, C3 and N2.
C4 VMs are already delivering impressive results for businesses. Companies like Verve, a creator of digital advertising solutions, are integrating C4 into their core infrastructure; in Verve’s case, the result has been a remarkable 37% improvement in performance. For Verve, C4 isn’t only about better performance — it’s actually fueling their revenue growth.
Read on to discover how Verve leveraged C4 to achieve this success, including their evaluation process and the key metrics that demonstrate C4’s impact on their business.
Verve’s Challenge and Business Opportunity
Verve delivers digital ads across the internet with a platform that connects ad buyers to other ad-delivery platforms and allows these advertisers to bid on ad space through a proprietary real-time auction platform. Real-time is the key here, and it’s also why C4 has made such a big impact on their business.
A marketplace for ad bidding is an incredibly latency- and performance-sensitive workload. About 95% of the traffic hitting their marketplace, which runs on GKE, is not revenue generating, because the average ad fill rate is only 5-7% of bids.
It takes a lot of cloud spend to fill bid requests that never generate revenue, and so any increase in performance or reduction in latency can have a tremendous impact on their business. In fact, the more performance Verve can get out of GKE, the more revenue they generate because the fill-rate for ads (successful bid/ask matching) grows exponentially.
Verve’s GKE Architecture and C4 Evaluation Plan
Verve’s marketplace ran on N2D machines, leveraging an Envoy-based reverse proxy (Contour) for ingress and egress. Verve handles a high volume of traffic, with hundreds of millions of events daily (clicks, impressions, actions, in-app events, and more).
This means they need to be able to scale their servers fast to handle traffic spikes and to control who has access to their servers and with which permissions. Verve has built its infrastructure on top of Kubernetes for elasticity and scalability, and relies heavily on spot pricing to be cost effective.
To set up the benchmark, Verve ran a canary (one pod of the main application per node type) and measured two values: a performance metric exported from the application, the 99th-percentile vCPU time per ad request in milliseconds, and a cost metric, the spot price given by the total compute price (vCPU + GB RAM).
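A rough sketch of how such a comparison can be tallied is shown below. The figures are made-up placeholders rather than Verve’s benchmark data, and the scoring rule (the product of the two metrics, lower is better) is just one reasonable way to rank price-performance.

```python
# Sketch of ranking candidate node types by the two canary metrics described
# above. All numbers are invented placeholders, not Verve's results.
from dataclasses import dataclass

@dataclass
class CanaryResult:
    machine_type: str
    p99_vcpu_ms_per_request: float  # performance metric exported by the app
    spot_price_per_hour: float      # total compute price (vCPU + GB RAM)

results = [
    CanaryResult("n2d-standard-32", 1.30, 0.55),
    CanaryResult("c3-standard-44", 1.10, 0.60),
    CanaryResult("c4-standard-48", 0.95, 0.62),
]

# Lower is better for both metrics, so rank by their product.
for r in sorted(results, key=lambda r: r.p99_vcpu_ms_per_request * r.spot_price_per_hour):
    score = r.p99_vcpu_ms_per_request * r.spot_price_per_hour
    print(f"{r.machine_type:18s} p99={r.p99_vcpu_ms_per_request:.2f} ms  "
          f"spot=${r.spot_price_per_hour:.2f}/h  score={score:.3f}")
```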
Leveraging GKE Gateway to Save Costs and Improve Latency
Verve needs to scale its servers fast to handle traffic spikes with the lowest latency possible, and for this the team relies on GKE Gateway, which leverages Google’s Envoy-based global load balancers. Their solution optimizes real-time bidding for ads, boosting revenue through improved response times and efficient cost management, in a market where latency is correlated with bids and revenue, somewhat similar to high-frequency trading (HFT) in financial markets.
By migrating to GKE Gateway, Verve improved its total cost of ownership (TCO). Google only charges for traffic going through the Gateway, so Verve saw significant compute cost savings by not having to spin up GKE nodes for the proxies. Verve also saw a notable reduction in the burden of maintaining the GKE Gateway-based solution compared to an Ingress-based one, which further improved their TCO. On top of it all, latency for the traffic itself improved by 20-25%, which generated 7.5% more revenue.
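To make the migration concrete, here is a minimal sketch, not Verve’s actual configuration, of the two resources involved: a Gateway backed by GKE’s Envoy-based global external Application Load Balancer and an HTTPRoute pointing at a bidding Service. The names, namespace, and port are hypothetical; the Python below simply prints the Kubernetes manifests as YAML for kubectl apply.

```python
# Minimal sketch (not Verve's actual config): a GKE Gateway plus an HTTPRoute
# for a bidding backend, emitted as YAML for `kubectl apply -f -`. Resource
# names, namespace, and ports are hypothetical placeholders.
import yaml

gateway = {
    "apiVersion": "gateway.networking.k8s.io/v1",
    "kind": "Gateway",
    "metadata": {"name": "marketplace-gateway", "namespace": "bidding"},
    "spec": {
        # GKE's Envoy-based global external Application Load Balancer.
        "gatewayClassName": "gke-l7-global-external-managed",
        "listeners": [{"name": "http", "protocol": "HTTP", "port": 80}],
    },
}

route = {
    "apiVersion": "gateway.networking.k8s.io/v1",
    "kind": "HTTPRoute",
    "metadata": {"name": "bid-route", "namespace": "bidding"},
    "spec": {
        "parentRefs": [{"name": "marketplace-gateway"}],
        "rules": [{"backendRefs": [{"name": "bidder-service", "port": 8080}]}],
    },
}

# Emit both manifests as a multi-document YAML stream.
print(yaml.safe_dump_all([gateway, route], sort_keys=False))
```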
Saving Costs While Achieving Better Performance with Custom Compute Classes
Ahead of their high season, Verve worked with their Google Cloud Technical Account Manager to join the early access program for Custom Compute Classes, a new feature the team had been eagerly anticipating for years.
Custom Compute Classes (CCC) is a Kubernetes-native, declarative API that can be used to define fallback priorities for autoscaled nodes in case a top priority is unavailable (e.g. a spot VM). It also has an optional automatic reconciliation feature which can move workloads to higher priority node shapes if and when they become available.
This lets GKE customers define a prioritized list of compute preferences by key metrics like price/performance, while GKE automatically handles scale-up and consolidation onto the best options available at any time. Verve uses CCC to establish C4 as its preferred machine family, but also specifies other machine families as fallbacks to maximize obtainability.
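Below is a minimal sketch of what such a compute class declaration can look like, assuming a Spot-first priority list similar to the one Verve describes. It is not Verve’s actual manifest, and the field names reflect our reading of the Custom Compute Classes API; check the current GKE reference before relying on them.

```python
# Sketch of a GKE ComputeClass that prefers C4 Spot capacity and falls back to
# older families when C4 is unavailable, migrating workloads back when it
# returns. Field names are our assumption of the Custom Compute Classes API;
# verify them against the GKE documentation.
import yaml

compute_class = {
    "apiVersion": "cloud.google.com/v1",
    "kind": "ComputeClass",
    "metadata": {"name": "cost-optimized-spot"},
    "spec": {
        # Ordered fallback list: GKE tries each entry until capacity is found.
        "priorities": [
            {"machineFamily": "c4", "spot": True},
            {"machineFamily": "c3", "spot": True},
            {"machineFamily": "n2d", "spot": True},
        ],
        # Optional reconciliation: move workloads back onto higher-priority
        # shapes when they become available again.
        "activeMigration": {"optimizeRulePriority": True},
    },
}

print(yaml.safe_dump(compute_class, sort_keys=False))
# Workloads opt in via the node selector label
# cloud.google.com/compute-class: cost-optimized-spot
```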
Pablo Loschi, Principal Systems Engineer at Verve, was impressed with the versatility his team was able to achieve:
“With Custom Compute Classes,” Loschi said, “we are closing the circle of cost-optimization. Based on our benchmarks, we established a priority list of spot machine types based on price/performance, and CCC enables us to maximize obtainability and efficiency by providing fall-back compute priorities as a list of preferred machines. We love how when out-of-stock machines become available again CCC reconciles to preferential infrastructure, finally eliminating the false dichotomy of choosing between saving costs and machine availability, even in the busy season.”
Verve’s Results and Business Impact
Verve benchmarked their marketplace running on GKE across several GCE machine types. Their marketplace had been running on N2D machines, and by switching to C4 they saw a 37% improvement in performance.
They also switched from a self-managed Contour-Envoy proxy to GKE Gateway, which brought a dramatic 20% to 25% improvement in latency and translated into 7.5% more revenue, since more bids are auctioned. GKE Gateway also allowed them to save substantial compute costs, because the load balancer is charged for network traffic rather than for compute. Additionally, they benefited from a reduced manual burden of managing, updating, and scaling the solution.
“We were able to directly attribute the reduced latency to revenue growth — more bids are being accepted because they are coming faster,” Ken Snider, Verve VP of Cloud Infrastructure, said.
The combination of switching to C4 and GKE Gateway is driving their business’ revenue growth. “We started on a path a year ago talking with the product team from Google to help solve this problem, and now we are seeing it come together,” Snider said.
The next phase of Verve’s optimization journey is to improve compute utilization, ensuring maximal usage of all deployed GKE nodes. GKE features such as Node Autoprovisioning and Custom Compute Classes will continue to play an important role in the team’s efforts to drive top-line growth for the business while being good stewards of their cloud costs.
C4 Brings Unparalleled Performance
C4 VMs are built on the latest Intel Xeon Scalable processors (5th Gen Emerald Rapids), delivering a significant performance leap for mission-critical and performance-sensitive workloads such as databases, gaming, financial modeling, data analytics, and inference.
Leveraging Google’s custom-designed Titanium infrastructure, C4 VMs provide high-bandwidth, low-latency networking with up to 200 Gbps of bandwidth, as well as high-performance storage with Hyperdisk. With C4, storage is offloaded to the Titanium adapter, reserving host resources for running your workloads. And by leveraging hitless upgrades and live migration, the vast majority of infrastructure maintenance updates are performed with near-zero impact on your workloads, minimizing disruptions and providing predictable performance. For real-time workloads, C4 offers up to 80% better CPU responsiveness compared to previous generations, resulting in faster trades and a smoother gaming and streaming experience.
But C4 offers more than just powerful hardware; it’s a complete solution for performance-critical workloads. C4 VMs integrate seamlessly with Google Kubernetes Engine (GKE), enabling you to easily deploy and manage containerized applications at scale.
A range of machine types with varying vCPU and memory configurations are available to match your specific needs. And with its superior price-performance, C4 VMs deliver exceptional value, helping you optimize your cloud spend without compromising on performance.
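As a minimal sketch of getting started, here is how a single C4 VM with a Hyperdisk Balanced boot disk might be provisioned with the google-cloud-compute Python client. The project, zone, image, and machine shape are placeholders; for GKE, you would instead select a C4 machine type for your node pools (or let Custom Compute Classes choose it for you).

```python
# Sketch: provision a C4 VM with a Hyperdisk Balanced boot disk using the
# google-cloud-compute client. Project, zone, image, and sizing are placeholders.
from google.cloud import compute_v1

project_id, zone = "my-project", "us-central1-a"

boot_disk = compute_v1.AttachedDisk(
    boot=True,
    auto_delete=True,
    initialize_params=compute_v1.AttachedDiskInitializeParams(
        source_image="projects/debian-cloud/global/images/family/debian-12",
        disk_type=f"zones/{zone}/diskTypes/hyperdisk-balanced",  # C4 uses Hyperdisk
        disk_size_gb=50,
    ),
)

instance = compute_v1.Instance(
    name="c4-demo",
    machine_type=f"zones/{zone}/machineTypes/c4-standard-8",
    disks=[boot_disk],
    network_interfaces=[compute_v1.NetworkInterface(network="global/networks/default")],
)

operation = compute_v1.InstancesClient().insert(
    project=project_id, zone=zone, instance_resource=instance
)
operation.result()  # block until the create operation completes
print(f"Created {instance.name}")
```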