GCP – Announcing partner-delivered professional services on Google Cloud Marketplace
Read More for the details.
Amazon EC2 now supports automated recovery of Microsoft SQL Server databases from Volume Shadow Copy Services (VSS) based EBS snapshots. Customers can use an AWS Systems Manager Automation Runbook and specify a restore point in time to automate the recovery process without needing to stop a running Microsoft SQL Server database.
Volume Shadow Copy Services (VSS) allows application data to be backed up while applications are still running. This new feature allows customers to automate the recovery from VSS-based EBS snapshots and ensure rapid recovery of large databases within minutes. This feature also offers customers the flexibility to restore to a new database or achieve point-in-time recovery.
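As a rough sketch of how such an automation might be started programmatically, the boto3 call below kicks off an SSM automation execution. The runbook document name and parameter names shown here are placeholders rather than the actual AWS-published runbook; see the linked user guide for the exact runbook and inputs.

import boto3

ssm = boto3.client("ssm", region_name="us-east-1")

# Start the VSS-based restore automation. The document and parameter names
# below are illustrative placeholders -- use the runbook and inputs from the
# Microsoft SQL Server on Amazon EC2 User Guide.
response = ssm.start_automation_execution(
    DocumentName="AWSEC2-ExampleRestoreSqlServerFromVssSnapshot",  # placeholder
    Parameters={
        "InstanceId": ["i-0123456789abcdef0"],
        "RestoreDateTime": ["2025-02-01T03:00:00Z"],  # desired restore point in time
    },
)
print(response["AutomationExecutionId"])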
This feature is available in all commercial AWS Regions and the AWS GovCloud (US) Regions.
To learn more, visit this technical document in the Microsoft SQL Server on Amazon EC2 User Guide.
Read More for the details.
AWS CodeBuild now offers native support for self-hosted Buildkite runners, enabling you to execute Buildkite pipeline jobs within the CodeBuild environment. AWS CodeBuild is a fully managed continuous integration service that compiles source code, runs tests, and produces software packages ready for deployment.
Buildkite is a continuous integration and continuous delivery platform. With this feature, your Buildkite jobs can access CodeBuild’s comprehensive suite of instance types and managed images, and utilize native integrations with AWS services. You have control over the build environment, without the overhead of manually provisioning and scaling the compute resources.
The Buildkite runner feature is available in all regions where CodeBuild is offered. For more information about the AWS Regions where CodeBuild is available, see the AWS Regions page.
To use the self-hosted Buildkite runners, follow the tutorial to set up a runner project in CodeBuild. To learn more about how to get started with CodeBuild, visit the AWS CodeBuild product page.
Read More for the details.
A new minor version of Microsoft SQL Server is now available on Amazon RDS for SQL Server, providing performance enhancements and security fixes. Amazon RDS for SQL Server now supports this latest minor version of SQL Server 2019 across the Express, Web, Standard, and Enterprise editions.
We encourage you to upgrade your Amazon RDS for SQL Server database instances at your convenience. You can upgrade with just a few clicks in the Amazon RDS Management Console or by using the AWS CLI. Learn more about upgrading your database instances from the Amazon RDS User Guide. The new minor version is SQL Server 2019 CU30 – 15.0.4415.2.
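As an illustration of the CLI/SDK path, the boto3 sketch below lists the available engine versions and then requests the minor version upgrade. The engine version string shown (15.00.4415.2.v1, the usual RDS format for this CU) is an assumption; confirm it against the describe call before applying.

import boto3

rds = boto3.client("rds", region_name="us-east-1")

# Confirm the exact engine version string for SQL Server 2019 CU30.
versions = rds.describe_db_engine_versions(Engine="sqlserver-se")
print([v["EngineVersion"] for v in versions["DBEngineVersions"]])

# Request the in-place minor version upgrade; set ApplyImmediately=False to
# defer it to the next maintenance window instead.
rds.modify_db_instance(
    DBInstanceIdentifier="my-sqlserver-instance",
    EngineVersion="15.00.4415.2.v1",  # assumed identifier for CU30 -- verify above
    ApplyImmediately=True,
)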
This minor version is available in all AWS commercial Regions where Amazon RDS for SQL Server databases are available, as well as in the AWS GovCloud (US) Regions.
Amazon RDS for SQL Server makes it simple to set up, operate, and scale SQL Server deployments in the cloud. See Amazon RDS for SQL Server Pricing for pricing details and regional availability.
Read More for the details.
Amazon Connect now provides the ability to choose which states an agent can be in when adhering to their schedule, making it easier for you to customize adherence tracking to match your unique operational needs. With this launch, you can now define custom mappings between agent statuses and schedule activities. For example, schedule activity “Work” can be mapped to multiple agent statuses such as “Available” and “Back-office work.” An agent scheduled for “Work” from 8 AM to 10 AM will be considered adherent if they are either in “Available” or “Back-office work” status. Additionally, you can now view the actual name of the scheduled activity in the real-time adherence dashboard (as opposed to only Productive/Non-productive). With custom mappings and an enhanced real-time dashboard, this launch provides more accurate and flexible agent adherence monitoring.
This feature is available in all AWS Regions where Amazon Connect agent scheduling is available. To learn more about Amazon Connect agent scheduling, click here.
Read More for the details.
Written By: Jacob Paullus, Daniel McNamara, Jake Rawlins, Steven Karschnia
Mandiant exploited flaws in the Microsoft Software Installer (MSI) repair action of Lakeside Software’s SysTrack installer to obtain arbitrary code execution.
An attacker with low-privilege access to a system running the vulnerable version of SysTrack could escalate privileges locally.
Mandiant responsibly disclosed this vulnerability to Lakeside Software, and the issue has been addressed in version 11.0.
Building upon the insights shared in a previous Mandiant blog post, Escalating Privileges via Third-Party Windows Installers, this case study explores the ongoing challenge of securing third-party Windows installers. These vulnerabilities are rooted in insecure coding practices when creating Microsoft Software Installer (MSI) Custom Actions and can be caused by references to missing files, broken shortcuts, or insecure folder permissions. These oversights create gaps that inadvertently allow attackers the ability to escalate privileges.
As covered in our previous blog post, after software is installed with an MSI file, Windows caches the MSI file in the C:\Windows\Installer folder for later use. This allows users on the system to access and use the “repair” feature, which is intended to address various issues that may be impacting the installed software. During execution of an MSI repair, several operations (such as file creation or execution) may be triggered from an NT AUTHORITY\SYSTEM context, even if initiated by a low-privilege user, thereby creating privilege escalation opportunities.
This blog post specifically focuses on the discovery and exploitation of CVE-2023-6080, a local privilege escalation vulnerability that Mandiant identified in Lakeside Software’s SysTrack Agent version 10.7.8.
Mandiant began by using Microsoft’s Process Monitor (ProcMon) to analyze and review file operations executed during the repair process of SysTrack’s MSI. While running the repair process as a low-privileged user, Mandiant observed file creation and execution within the user’s %TEMP% folder from MSIExec.exe.
Figure 1: MSIExec.exe copying and executing .tmp file in user’s %TEMP% folder
Each time Mandiant ran the repair functionality, MSIExec.exe wrote a new .tmp file to the %TEMP% folder using a formula-based name, and then executed it. Mandiant discovered, through dynamic analysis of the installer, that the name generated by the repair function would consist of the string “wac” followed by four randomly chosen hex characters (0-9, A-F). With this naming scheme, there were 65,536 possible filename options.
Due to the %TEMP% folder being writable by a low-privilege user, Mandiant tested the behavior of the repair tool when all possible filenames already existed within the %TEMP% folder. Mandiant created a PowerShell script to copy an arbitrary test executable to each possible file name in the range of wac0000.tmp to wacFFFF.tmp.
# Path to the permutations file
$csvFilePath = '.\permutations.csv'
# Path to the executable
$exePath = '.\test.exe'
# Target directory (using the system's temp directory)
$targetDirectory = [System.IO.Path]::GetTempPath()
# Read the CSV file content
$csvContent = Get-Content -Path $csvFilePath
# Split the content into individual values
$values = $csvContent -split ","
# Loop through each value and copy the exe to the target directory with the new name
foreach ($value in $values) {
    $newFilePath = Join-Path -Path $targetDirectory -ChildPath ($value + ".tmp")
    Copy-Item -Path $exePath -Destination $newFilePath
}
Write-Output "Copy operation completed to $targetDirectory"
Figure 2: Creating all possible .tmp files in %TEMP%
Figure 3: Excerpt of .tmp files created in %TEMP%
After filling the previously identified namespace, Mandiant reran the MSI repair function to observe its subsequent behavior. Upon review of the ProcMon output, Mandiant observed that when the namespace was filled, the application would fail over to an incrementing filename pattern. The pattern began with wac1.tmp and incremented the number each time in a predictable pattern, if the previous file existed. To prove this theory, Mandiant manually created wac1.tmp and wac2.tmp, then observed the MSI repair action in ProcMon. When running the MSI repair function, the resulting filename was wac3.tmp.
Figure 4: MSIExec.exe writing and executing a predicted .tmp file
Additionally, Mandiant observed that there was a small delay between the file write action and the file execution action, which could potentially result in a race condition vulnerability. Since Mandiant could now force the program to use a predetermined filename, Mandiant wrote another PowerShell script designed to attempt to win the race condition by copying a file (test.exe) to the %TEMP% folder, using the predicted filename, between the file write and execution in order to overwrite the file created by MSIExec.exe. In this test, test.exe was a simple proof-of-concept executable that would start notepad.exe.
while ($true) {
    if (Test-Path -Path "C:\Users\USER\AppData\Local\Temp\wac3.tmp") {
        Copy-Item -Path "C:\Users\USER\Desktop\test.exe" -Destination "C:\Users\USER\AppData\Local\Temp\wac3.tmp" -Force
    }
}
Figure 5: PowerShell race condition script to copy arbitrary file into %TEMP%
With the %TEMP% folder prepared with the wac1.tmp and wac2.tmp files created, Mandiant ran both the PowerShell script and the MSI repair action targeting wac3.tmp. With the race condition script running, execution of the repair action resulted in the test.exe file overwriting the intended binary and subsequently being executed by MSIExec.exe, opening cmd.exe as NT AUTHORITY\SYSTEM.
Figure 6: Obtaining NT AUTHORITY\SYSTEM command prompt
As discussed in Mandiant’s previous blog post, misconfigured Custom Actions can be trivial to find and exploit, making them a significant security risk for organizations. It is essential for software developers to follow secure coding practices and review their implemented Custom Actions to prevent attackers from hijacking high-privilege operations triggered by the MSI repair functionality. Refer to the original blog post for general best practices when configuring Custom Actions. During the discovery of CVE-2023-6080, Mandiant identified several misconfigurations and oversights that allowed for privilege escalation to NT AUTHORITY\SYSTEM.
The SysTrack MSI performed file operations including creation and execution in the user’s %TEMP% folder, which provides a low-privilege user the opportunity to alter files being actively used in a high-privilege context. Software developers should keep folder permissions in mind and ensure all privileged file operations are performed from folders that are appropriately secured. This can include altering the read/write permissions for the folder, or using built-in folders such as C:\Program Files or C:\Program Files (x86), which are inherently protected from low-privilege users.
Additionally, the software’s filename generation schema included a failover mechanism that allowed an attacker to force the application into using a predetermined filename. When using randomized filenames, developers should use a sufficiently large length to ensure that an attacker cannot exhaust all possible filenames and force the application into unexpected behavior. In this case, knowing the target filename before execution made it significantly easier to beat the race condition, as opposed to dynamically identifying and replacing the target file between the time of its creation by MSIExec.exe and the time of its execution.
Something security professionals must also consider is the safety of the programs running on corporate machines. Many approved applications may inadvertently contain security vulnerabilities that increase the risk in our environments. Mandiant recommends that companies consider auditing the security of their individual endpoints to ensure that defense in depth is maintained at an organizational level. Furthermore, where possible, companies should monitor the spawning of administrative shells such as cmd.exe and powershell.exe in an elevated context to alert on possible privilege escalation attempts.
Domain privilege escalation is often the focus of security vendors and penetration tests, but it is not the only avenue for privilege escalation or compromise of data integrity in a corporate environment. Compromise of integrity on a single system can allow an attacker to mount further attacks throughout the network; for example, the Network Access Account used by SCCM can be compromised through a single workstation and, when misconfigured, can be used to escalate privileges within the domain and pivot to additional systems within the network.
Mandiant offers dedicated endpoint security assessments, during which customer endpoints are tested from multiple contexts, including the perspective of an adversary with low-privilege access attempting to escalate privileges. For more information about Mandiant’s technical consulting services, including comprehensive endpoint security assessments, visit our website.
We would like to extend a special thanks to Andrew Oliveau, who was a member of the testing team that discovered this vulnerability during his time at Mandiant.
June 13, 2024 – Vulnerability reported to Lakeside Software
July 1, 2024 – Lakeside Software confirmed the vulnerability
August 7, 2024 – Confirmed vulnerability fixed in version 11.0
Read More for the details.
AWS Transfer Family web apps are now available in the following additional Regions: North America (N. California, Canada West, Canada Central), South America (São Paulo), Europe (London, Paris, Zurich, Milan, Spain), Africa (Cape Town), Israel (Tel Aviv), Middle East (Bahrain, UAE), and Asia Pacific (Osaka, Hong Kong, Hyderabad, Jakarta, Melbourne, Seoul, Mumbai). This expansion allows you to create Transfer Family web apps in additional commercial Regions where Transfer Family is available.
AWS Transfer Family web apps provide a simple interface for accessing your data in Amazon S3 through a web browser. With Transfer Family web apps, you can provide your workforce with a fully managed, branded, and secure portal for your end users to browse, upload, and download data in S3.
To learn more about AWS Transfer Family web apps, read our blog and visit the Transfer Family User Guide. For complete regional availability information, see the AWS Region Table.
Read More for the details.
Dashboard Q&A by Amazon Q in QuickSight enables QuickSight Authors to add data Q&A to their dashboards in one click. With dashboard Q&A, QuickSight users can ask and answer questions about their data using natural language.
Dashboard Q&A automatically extracts the semantic information presented in a dashboard and uses it to enable Q&A over that dashboard’s data. It also improves existing Topic-based Q&A experiences by applying those dashboard semantics to Q&A answers. With dashboard Q&A, Authors can quickly deliver self-service access to customized data insights for the entire organization.
Dashboard Q&A is launching in all regions where QuickSight’s generative data Q&A is available today, as documented here.
To learn more, visit our documentation.
Read More for the details.
Amazon Elastic Block Store (Amazon EBS) now supports additional resource-level permissions for creating EBS volumes from snapshots. With this launch, you now have more granular controls to set resource-level permissions for the creation of a volume and the selection of the source snapshot when calling the CreateVolume action in your IAM policy. This allows you to control which IAM identities can create EBS volumes from source snapshots, and the conditions under which they can use those snapshots to create EBS volumes.
To meet your specific permission needs on the source snapshots, you can also specify any of 5 EC2-specific condition keys in your IAM policy: ec2:Encrypted, ec2:VolumeSize, ec2:Owner, ec2:ParentVolume, and ec2:SnapshotTime. Additionally, you can use global condition keys for the source snapshot.
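As a sketch of how these keys might be combined (the account ID, Region, and size cap below are made up for illustration, and the exact statement shape should be checked against the launch blog), an identity-based policy could allow CreateVolume only for encrypted volumes under a size limit, created from snapshots owned by a trusted account:

import json
import boto3

# Illustrative policy: the volume statement constrains encryption and size,
# while the snapshot statement constrains which snapshots may be used.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "ec2:CreateVolume",
            "Resource": "arn:aws:ec2:us-east-1:111122223333:volume/*",
            "Condition": {
                "Bool": {"ec2:Encrypted": "true"},
                "NumericLessThanEquals": {"ec2:VolumeSize": "500"},
            },
        },
        {
            "Effect": "Allow",
            "Action": "ec2:CreateVolume",
            "Resource": "arn:aws:ec2:us-east-1::snapshot/*",
            "Condition": {
                "StringEquals": {"ec2:Owner": "111122223333"},
            },
        },
    ],
}

iam = boto3.client("iam")
iam.create_policy(
    PolicyName="restrict-createvolume-from-snapshots",
    PolicyDocument=json.dumps(policy_document),
)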
This new resource-level permission model is available in all AWS Regions where EBS volumes are available. To learn more about using resource-level permissions to create EBS volumes, or about transitioning from the previous permission model to the new one, please visit the launch blog. For more information about Amazon EBS, please visit the product page.
Read More for the details.
Today, Amazon Q Developer announces an improved software development agent capable of running build and test scripts on generated code to validate the code before developers review it. This new capability detects errors, ensures generated code is in sync with the project’s current state, and accelerates the development process by producing higher-quality code on the first iteration.
With the developer’s natural language input request and project-specific context, the Amazon Q Developer agent is designed to assist in implementing complex multi-file features and bug fixes. The agent will analyze the existing codebase, make necessary code changes, and run the selected build and test commands to ensure the code is working as expected. Where errors are found, the agent will iterate on the code prior to requesting the developer’s review. Throughout the process, the agent maintains a real-time connection with the developer, providing updates as changes are made. With control over what commands Amazon Q runs through a Devfile, you can customize the development process for better accuracy.
The Amazon Q Developer agent for software development is available for JetBrains and Visual Studio Code IDEs in all AWS regions where Q Developer is available.
To learn more about Amazon Q Developer, visit the service overview page. For more details about this announcement and how to get started using the Amazon Q Developer agent for software development, read the AWS DevOps & Developer Productivity blog.
Read More for the details.
AWS Deadline Cloud now includes the ability to specify a limit for a specific resource, like a floating license, and also constrain the maximum number of workers that work on a job. AWS Deadline Cloud is a fully managed service that simplifies render management for teams creating computer-generated graphics and visual effects, for films, television and broadcasting, web content, and design.
By adding a limit to your Deadline Cloud farm, you can specify a maximum amount of concurrent usage of resources by workers in your farm. Capping resource usage ensures tasks don’t start until the resources they need are available. For example, if you have 50 floating licenses for a particular plugin required by your rendering workflow, a Deadline Cloud limit lets you ensure that no more than 50 tasks requiring that license run at once, preventing tasks from failing because a license is unavailable. Additionally, setting a maximum number of workers on a job enables you to prevent any single job from consuming all the available workers, so that you can efficiently run multiple jobs concurrently when only a limited number of workers are available.
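A rough boto3 sketch of defining such a limit is shown below; the operation exists in the Deadline Cloud API, but the parameter names used here (notably amountRequirementName and maxCount) are assumptions to verify against the API reference.

import boto3

deadline = boto3.client("deadline", region_name="us-east-1")

# Cap concurrent usage of a floating plugin license at 50 across the farm.
# Parameter names are assumptions -- check the Deadline Cloud API reference.
response = deadline.create_limit(
    farmId="farm-0123456789abcdef0123456789abcdef",  # illustrative farm ID
    displayName="render-plugin-floating-licenses",
    amountRequirementName="amount.limit.render-plugin-license",
    maxCount=50,
)
print(response["limitId"])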
Limits are available in all AWS Regions where Deadline Cloud is available.
To learn more, visit the AWS Deadline Cloud documentation.
Read More for the details.
Amazon Connect now includes the ability for agents to schedule time off up to 24 months in the future, making it easier for managers and agents to plan ahead of time. With this launch, agents can now book time off in Connect up to 24 months ahead of time (an increase from 13 months). Additionally, you can now upload pre-approved time off windows for a scheduling group (group allowance) for up to 27 months at a time (an increase from 13 months). These increased limits provide agents more flexibility to plan their personal time and also provide managers better visibility into future staffing needs, thus enabling more efficient resource allocation.
This feature is available in all AWS Regions where Amazon Connect agent scheduling is available. To learn more about Amazon Connect agent scheduling, click here.
Read More for the details.
Amazon AppStream 2.0 now allows administrators to control whether admin consent is required when users link their OneDrive for Business accounts as a persistent storage option.
The new capability simplifies the management of AppStream 2.0 persistent storage and the admin consent process. After enabling OneDrive for Business for an AppStream 2.0 stack and specifying the OneDrive domains, administrators can now configure whether admin consent is needed for each OneDrive domain. If admin consent is required, administrators must approve users’ OneDrive connections within their Azure Active Directory environment when users attempt to link their account to AppStream 2.0.
This feature is available at no additional cost in all AWS Regions where AppStream 2.0 is offered. It is supported only on AppStream stacks using single-session Windows fleets.
To get started, open the AppStream 2.0 console and create a stack. In the Enable storage step, enable OneDrive for Business and configure the admin consent settings. For more details, refer to Administer OneDrive for Business. You can also programmatically manage the setting using AppStream 2.0 APIs. For API details, see the CreateStack API documentation.
Read More for the details.
AWS Glue announces 14 new connectors for applications, expanding its connectivity portfolio. Customers can now use AWS Glue native connectors to ingest data from Blackbaud Raiser’s Edge NXT, CircleCI, Docusign Monitor, Domo, Dynatrace, Kustomer, Mailchimp, Microsoft Teams, Monday, Okta, Pendo, Pipedrive, Productboard and Salesforce Commerce Cloud.
As enterprises increasingly rely on data-driven decisions, they need to integrate with data from various applications. With 14 new connectors, customers have more options to easily establish a connection to their applications using the AWS Glue console or AWS Glue APIs without the need to learn application-specific APIs. Glue native connectors provide the scalability and performance of the AWS Glue Spark engine along with support for standard authorization and authentication methods like OAuth 2. With these connectors, customers can test connections, validate their connection credentials, preview data, and browse metadata.
AWS Glue native connectors to Blackbaud, CircleCI, Docusign Monitor, Domo, Dynatrace, Kustomer, Mailchimp, Microsoft Teams, Monday, Okta, Pendo, Pipedrive, Productboard, Salesforce Commerce Cloud are available in all AWS commercial regions.
To get started, create new AWS Glue connections with these connectors and use them as a source in AWS Glue Studio. To learn more, visit the AWS Glue documentation for connectors.
Read More for the details.
Amazon RDS Custom for SQL Server now offers enhanced storage and performance capabilities, supporting up to 64 TiB of storage and 256,000 I/O operations per second (IOPS) with io2 Block Express volumes. This represents an improvement from the previous limit of 16 TiB and 64,000 IOPS with io2 Block Express. These enhancements enable transactional databases and data warehouses to handle larger workloads on a single Amazon RDS Custom for SQL Server database instance.
Support for 64 TiB and 256,000 IOPS with io2 Block Express for Amazon RDS Custom for SQL Server is now generally available in all AWS Regions where both Amazon RDS io2 Block Express volumes and Amazon RDS Custom for SQL Server are currently supported.
Amazon RDS Custom for SQL Server is a managed database service that allows customization of the underlying operating system and includes the ability to bring your own licensed SQL Server media or use SQL Server Developer Edition while providing the time-savings, durability, and scalability benefits of a managed database service. To get started, visit the Amazon RDS Custom for SQL Server User Guide. See Amazon RDS Custom Pricing for up-to-date pricing of instances, storage, data transfer and regional availability.
Read More for the details.
Amazon Connect Cases now allows agents and supervisors to filter cases in the agent workspace by custom field values, making it easier to narrow down search results and find relevant cases. Users can also customize the case list view and search results layout by adding custom columns, hiding or rearranging existing columns, and adjusting the number of cases per page. These enhancements enable users to tailor the case list view to meet their needs and manage their case workloads more effectively.
For region availability, please see the availability of Amazon Connect features by Region. To learn more and get started, visit the Amazon Connect Cases webpage and documentation.
Read More for the details.
The Amazon EventBridge console now displays the source and detail type of all available AWS service events when you create a rule in the EventBridge console. This makes it easier for customers to discover and utilize the full range of AWS service events when building event-driven architectures. Additionally, the EventBridge documentation now includes an automatically updated list of all AWS service events, facilitating access to the most current information.
Amazon EventBridge Event Bus is a serverless event router that enables you to create highly scalable event-driven applications by routing events between your own applications, third-party SaaS applications, and other AWS services. With this update, developers can quickly search and filter all available AWS service events, including their detail types, when configuring event patterns in the EventBridge console's sandbox and rules, as well as in the documentation, making it easier to create event-driven integrations and reduce misconfiguration.
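For example, a rule that matches a well-known AWS service event pairs the service's source with one of its detail types. The boto3 sketch below uses the EC2 instance state-change event; the rule name and the state filter are arbitrary choices for illustration.

import json
import boto3

events = boto3.client("events", region_name="us-east-1")

# "source" and "detail-type" are the same fields the console now surfaces
# for every AWS service event.
event_pattern = {
    "source": ["aws.ec2"],
    "detail-type": ["EC2 Instance State-change Notification"],
    "detail": {"state": ["stopped", "terminated"]},
}

events.put_rule(
    Name="ec2-state-change-rule",
    EventBusName="default",
    EventPattern=json.dumps(event_pattern),
    State="ENABLED",
)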
This feature in the EventBridge console is available in all commercial AWS Regions. To learn more about discovering and using AWS service events in Amazon EventBridge, see the updated list of AWS service events in the documentation here.
Read More for the details.
Amazon Managed Service for Prometheus collector, a fully-managed agentless collector for Prometheus metrics, adds support for cross-account ingestion. Starting today, you can agentlessly scrape metrics from Amazon Elastic Kubernetes Service clusters in different accounts than your Amazon Managed Service for Prometheus workspace.
While it was previously possible to apply AWS multi-account best practices for centralized observability with Amazon Managed Service for Prometheus workspaces, you had to use self-managed collection. This meant that you had to run, scale, and patch telemetry agents yourself to scrape metrics from Amazon Elastic Kubernetes Service clusters in various accounts in order to ingest them into a central Amazon Managed Service for Prometheus workspace in a different account. With this launch, you can use the Amazon Managed Service for Prometheus collector to remove this heavy lifting and ingest metrics in a cross-account setup without having to run a collector yourself. In addition, you can now also use the Amazon Managed Service for Prometheus collector to scrape metrics from Amazon Elastic Kubernetes Service clusters and ingest them into Amazon Managed Service for Prometheus workspaces created with customer managed keys.
Amazon Managed Service for Prometheus collector is available in all regions where Amazon Managed Service for Prometheus is available. To learn more about Amazon Managed Service for Prometheus collector, visit the user guide or product page.
Read More for the details.
For developers who want to use the PyTorch deep learning framework with Cloud TPUs, the PyTorch/XLA Python package is key, offering developers a way to run their PyTorch models on Cloud TPUs with only a few minor code changes. It does so by leveraging OpenXLA, developed by Google, which gives developers the ability to define their model once and run it on many different types of machine learning accelerators (i.e., GPUs, TPUs, etc.).
The latest release of PyTorch/XLA comes with several improvements that boost performance and the developer experience:
A new experimental scan operator to speed up compilation for repetitive blocks of code (i.e., for loops)
Host offloading to move TPU tensors to the host CPU’s memory to fit larger models on fewer TPUs
Improved goodput for tracing-bound models through a new base Docker image compiled with the C++ 2011 Standard application binary interface (C++ 11 ABI) flags
In addition to these improvements, we’ve also re-organized the documentation to make it easier to find what you’re looking for!
Let’s take a look at each of these features in greater depth.
Have you ever experienced long compilation times, for example when working with large language models and PyTorch/XLA — especially when dealing with models with numerous decoder layers? During graph tracing, where we traverse the graph of all the operations being performed by the model, these iterative loops are completely “unrolled” — i.e., each loop iteration is copied and pasted for every cycle — resulting in large computation graphs. These larger graphs lead directly to longer compilation times. But now there’s a new solution: the new experimental scan function, inspired by jax.lax.scan.
The scan operator works by changing how loops are handled during compilation. Instead of compiling each iteration of the loop independently, which creates redundant blocks, scan compiles only the first iteration. The resulting compiled high-level operation (HLO) is then reused for all subsequent iterations, so far less HLO or intermediate code is generated for the rest of the loop. Compared to a for loop, scan compiles in a fraction of the time since it only compiles the first loop iteration. This improves developer iteration time when working on models with many homogeneous layers, such as LLMs.
Building on top of torch_xla.experimental.scan, the torch_xla.experimental.scan_layers function offers a simplified interface for looping over sequences of nn.Modules. Think of it as a way to tell PyTorch/XLA “these modules are all the same, just compile them once and reuse them!” For example:
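The sketch below is a minimal illustration rather than a full model; it assumes scan_layers can be imported from torch_xla.experimental.scan_layers and called with a list of structurally identical modules plus an input tensor.

import torch
import torch.nn as nn
import torch_xla
from torch_xla.experimental.scan_layers import scan_layers

class DecoderBlock(nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.ffn = nn.Linear(hidden_size, hidden_size)

    def forward(self, x):
        return torch.relu(self.ffn(x))

device = torch_xla.device()
layers = [DecoderBlock(1024).to(device) for _ in range(64)]
x = torch.randn(8, 1024, device=device)

# A Python for-loop over the 64 blocks would trace and compile each block
# separately; scan_layers compiles the first block once and reuses the
# resulting HLO for the rest.
output = scan_layers(layers, x)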
One thing to note is that custom Pallas kernels do not yet support scan. Here is a complete example of using scan_layers in an LLM for reference.
Another powerful tool for memory optimization in PyTorch/XLA is host offloading. This technique allows you to temporarily move tensors from the TPU to the host CPU’s memory, freeing up valuable device memory during training. This is especially helpful for large models where memory pressure is a concern. You can use torch_xla.experimental.stablehlo_custom_call.place_to_host to offload a tensor and torch_xla.experimental.stablehlo_custom_call.place_to_device to retrieve it later. A typical use case involves offloading intermediate activations during the forward pass and then bringing them back during the backward pass. Here’s an example of host offloading for reference.
Strategic use of host offloading, such as when you’re working with limited memory and are unable to use the accelerator continuously, may significantly improve your ability to train large and complex models within the memory constraints of your hardware.
Have you ever encountered a situation where your TPUs are sitting idle while your host CPU is heavily loaded tracing your model execution graph for just-in-time compilation? This suggests your model is “tracing bound,” meaning performance is limited by the speed of tracing operations.
The C++11 ABI image offers a solution. Starting with this release, PyTorch/XLA offers a choice of C++ ABI flavors for both Python wheels and Docker images. This gives you a choice for which version of C++ you’d like to use with PyTorch/XLA. You’ll now find builds with both the pre-C++11 ABI, which remains the default to match PyTorch upstream, and the more modern C++11 ABI.
Switching to the C++11 ABI wheels or Docker images can lead to noticeable improvements in the above-mentioned scenarios. For example, we observed a 20% relative improvement in goodput with the Mixtral 8x7B model on v5p-256 Cloud TPU (with a global batch size of 1024) when we switched from the pre-C++11 ABI to the C++11 ABI! ML Goodput gives us an understanding of how efficiently a given model utilizes the hardware. So if we have a higher goodput measurement for the same model on the same hardware, that indicates better performance of the model.
An example of using a C++11 ABI docker image in your Dockerfile might look something like:
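(The registry path and tag below are illustrative assumptions; substitute the exact C++11 ABI image name from the 2.6 release notes.)

# Base image: a PyTorch/XLA 2.6 build with the C++11 ABI (tag is illustrative).
FROM us-central1-docker.pkg.dev/tpu-pytorch-releases/docker/xla:r2.6.0_3.10_tpuvm_cxx11
# Layer your training code and dependencies on top of the base image.
WORKDIR /workspace
COPY . /workspace
RUN pip install -r requirements.txt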
Alternatively, if you are not using Docker images, because you’re testing locally for instance, you can install the C++11 ABI wheels for version 2.6 using the following command (Python 3.10 example):
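(The wheel URLs below follow the release's published naming pattern but are assumptions; verify them against the 2.6 release notes before installing.)

pip install torch==2.6.0+cpu.cxx11.abi \
  https://storage.googleapis.com/pytorch-xla-releases/wheels/tpuvm/torch_xla-2.6.0%2Bcxx11-cp310-cp310-manylinux_2_28_x86_64.whl \
  'torch_xla[tpu]' \
  -f https://storage.googleapis.com/libtpu-releases/index.html \
  -i https://download.pytorch.org/whl/cpu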
The above command works for Python 3.10. We have instructions for other versions within our documentation.
The flexibility to choose between C++ ABIs lets you choose the optimal build for your specific workload and hardware, ultimately leading to better performance and efficiency in your PyTorch/XLA projects!
So, what are you waiting for, go try out the latest version of PyTorch/XLA! For additional information check out the latest release notes.
A note on GPU support
We aren’t offering a PyTorch/XLA:GPU wheel in the PyTorch/XLA 2.6 release. We understand this is important and plan to reinstate GPU support by the 2.7 release. PyTorch/XLA remains an open-source project and we welcome contributions from the community to help maintain and improve the project. To contribute, please start with the contributors guide.
The latest stable version where a PyTorch/XLA:GPU wheel is available is torch_xla 2.5.
Read More for the details.
Modern AI workloads require powerful accelerators and high-speed interconnects to run sophisticated model architectures on an ever-growing diverse range of model sizes and modalities. In addition to large-scale training, these complex models need the latest high-performance computing solutions for fine-tuning and inference.
Today, we’re excited to bring the highly-anticipated NVIDIA Blackwell GPUs to Google Cloud with the preview of A4 VMs, powered by NVIDIA HGX B200. The A4 VM features eight Blackwell GPUs interconnected by fifth-generation NVIDIA NVLink, and offers a significant performance boost over the previous generation A3 High VM. Each GPU delivers 2.25 times the peak compute and 2.25 times the HBM capacity, making A4 VMs a versatile option for training and fine-tuning for a wide range of model architectures, while the increased compute and HBM capacity makes it well-suited for low-latency serving.
The A4 VM integrates Google’s infrastructure innovations with Blackwell GPUs to bring the best cloud experience for Google Cloud customers, from scale and performance, to ease-of-use and cost optimization. Some of these innovations include:
Hudson River Trading, a multi-asset-class quantitative trading firm, will leverage A4 VMs to train its next generation of capital market model research. The A4 VM, with its enhanced inter-GPU connectivity and high-bandwidth memory, is ideal for the demands of larger datasets and sophisticated algorithms, accelerating Hudson River Trading’s ability to react to the market.
Effectively scaling AI model training requires precise and scalable orchestration of infrastructure resources. These workloads often stretch across thousands of VMs, pushing the limits of compute, storage, and networking.
Hypercompute Cluster enables you to deploy and manage these large clusters of A4 VMs with compute, storage and networking as a single unit. This makes it easy to manage complexity while delivering exceptionally high performance and resilience for large distributed workloads. Hypercompute Cluster is engineered to:
We’re excited to be the first hyperscaler to announce preview availability of an NVIDIA Blackwell B200-based offering. Together, A4 VMs and Hypercompute Cluster make it easier for organizations to create and deliver AI solutions across all industries. If you’re interested in learning more, please reach out to your Google Cloud representative.
Read More for the details.