Cloud compliance can present significant regulatory and technical challenges for organizations. These complexities often include delineating compliance responsibilities and accountabilities between the customer and cloud provider.
At Google Cloud, we understand these challenges faced by our customers’ cloud engineering, compliance, and audit teams, and want to help make them easier to manage. That’s why we’re pleased to announce that our Audit Manager service, which can digitize and help streamline the compliance auditing process, is now generally available.
Understanding compliance across layers in Google Cloud.
Traditional compliance methodologies, reliant on manual processes for evidence collection, are inefficient, prone to errors, and resource-intensive. According to the Gartner® Audit Survey, “When surveyed on their key priorities for 2024, 75% of chief audit executives (CAEs) cited audit’s ability to keep up with the fast-evolving cybersecurity landscape as their top priority — making it the most commonly cited priority.”
Introducing Audit Manager
Audit Manager can help organizations accelerate compliance efforts by providing:
Clear shared responsibility outlines: A matrix of shared responsibilities that delineates compliance duties between the cloud provider and customers, offering actionable recommendations tailored to your workloads.
Automated compliance assessments: Evaluation of your workloads against industry-standard technical control requirements in a simple and automated manner. Audit Manager already supports popular industry and regulatory frameworks including NIST 800-53, ISO, SOC, and CSA-CCM. You can see the full list of supported frameworks here.
Audit-ready evidence: Automated generation of comprehensive, verifiable evidence reports to support your compliance claims and overarching governance activity. Audit Manager provides a quick execution summary of compliance at the framework level, plus the ability to deep-dive using control-level reports.
Actionable remediation guidance: Insights to swiftly address each compliance gap that is identified.
The compliance audit journey with Audit Manager
The cloud compliance audit process involves defining responsibilities, identifying and mitigating risks, collecting supporting data, and generating a final report. This process requires collaboration between Governance, Risk, and Compliance analysts, compliance managers, developers, and auditors, each with their own specific tasks. Audit Manager streamlines this process for all involved roles, which can help simplify their work and improve efficiency.
Shift left your compliance audit process with Audit Manager.
Customer case study: Deutsche Börse Group
Deutsche Börse Group, an international stock exchange organization and innovative market infrastructure provider, began their strategic partnership with Google Cloud in 2022. Their cloud transformation journey is well under way, which brings with it the challenge of achieving and documenting compliance in their environment.
Florian Rodeit, head of cloud governance for Google Cloud at Deutsche Börse Group, first heard about Audit Manager during a session at Google Cloud Next 2024 in Las Vegas.
“The Audit Manager product promises a level of automation and audit control that has a lot of potential. At Deutsche Börse Group, we were excited to access the preview, explore the functionality further and build out a joint solution,” he said.
Following the European preview launch of Audit Manager, Deutsche Börse Group and Google Cloud set up a collaborative project to explore automating cloud controls via Audit Manager. Deutsche Börse Group had already created a comprehensive control catalog to manage their cloud control requirements across the organization. They analyzed the Cloud Security Alliance’s Cloud Controls Matrix against their written rules framework to create inputs for Audit Manager, and set out ownership and implementation guidelines for cloud-specific controls.
Now, Deutsche Börse Group can use Audit Manager to check whether any resources are configured in ways that deviate from the control framework, such as resources that have been set up outside of approved regions. This provides automated, auditable evidence to support their specific requirements for compliant usage of Google Cloud resources.
Benjamin Möller, vice president and cloud governance expert, Deutsche Börse Group, has been leading the collaborative project. “Moving forward, we hope that Audit Manager will allow us to automate many of our technical controls — giving us robust assurance that we are compliant, enabling us to quickly identify and rectify non-compliance, and minimizing the manual overhead of audit evidence. We are excited to continue making progress on our joint venture,” he said.
Take the next step
To use Audit Manager, access the tool directly from your Google Cloud console. Navigate to the Compliance tab in your Google Cloud console, and select Audit Manager. For a comprehensive guide on using Audit Manager, please refer to our detailed product documentation. We encourage you to share your feedback on this service to help us improve Audit Manager’s user experience.
We are thrilled to announce new capabilities that make running Dataproc Serverless even faster, easier, and more intelligent.
Elevate your Spark experience with:
Native query execution: Experience significant performance gains with the new Native query execution in the Premium tier.
Seamless monitoring with Spark UI: Track job progress in real time with a built-in Spark UI available by default for all Spark batches and sessions.
Streamlined investigation: Troubleshoot batch jobs from a central “Investigate” tab displaying all the essential metrics highlights and logs filtered by errors automatically.
Proactive autotuning and assisted troubleshooting with Gemini: Let Gemini minimize failures and autotune performance based on historical patterns. Quickly resolve issues using Gemini-powered insights and recommendations.
Accelerate your Spark jobs with native query execution
You can unlock considerable speed improvements for your Spark batch jobs in the Premium tier on Dataproc Serverless Runtimes 2.2.26+ or 1.2.26+ by enabling native query execution — no application changes required.
This new feature in the Dataproc Serverless Premium tier improved query performance by ~47% in our tests on queries derived from TPC-DS and TPC-H benchmarks.
Start now by running the native query execution qualification tool, which can help you identify eligible jobs and estimate potential performance gains. Once you have identified the batch jobs suited to native query execution, you can enable it to run those jobs faster and potentially reduce costs.
Seamless monitoring with Spark UI
Tired of wrestling with setting up the persistent history server (PHS) clusters and maintaining them just to debug your Spark batches? Wouldn’t it be easier if you could avoid the ongoing costs of the history server and yet see the Spark UI in real-time?
Until now, monitoring and troubleshooting Spark jobs in Dataproc Serverless required setting up and managing a separate Spark persistent history server. Crucially, each batch job had to be configured to use the history server; otherwise, the open-source UI would be unavailable for analyzing that batch job. Additionally, the open-source UI suffered from slow navigation between applications.
We’ve heard you, loud and clear. We’re excited to announce a fully managed Spark UI in Dataproc Serverless that makes monitoring and troubleshooting a breeze.
The new Spark UI is built-in and automatically available for every batch job and session in both Standard and Premium tiers of Dataproc Serverless at no additional cost. Simply submit your job and start analyzing performance in real time with the Spark UI right away.
Here’s why you’ll love the Serverless Spark UI:
| | Traditional approach | The new Dataproc Serverless Spark UI |
| --- | --- | --- |
| Effort | Create and manage a Spark history server cluster. Configure each batch job to use the cluster. | No cluster setup or management required. The Spark UI is available by default for all your batches without any extra configuration, and can be accessed directly from the Batch / Session details page in the Google Cloud console. |
| Latency | UI performance can degrade with increased load. Requires active resource management. | Enjoy a responsive UI that automatically scales to handle even the most demanding workloads. |
| Availability | The UI is only available as long as the history server cluster is running. | Access your Spark UI for 90 days after your batch job is submitted. |
| Data freshness | Wait for a stage to complete to see its events in the UI. | View regularly updated data without waiting for the stage to complete. |
| Functionality | Basic UI based on open-source Spark. | Enhanced UI with ongoing improvements based on user feedback. |
| Cost | Ongoing cost for the PHS cluster. | No additional charge. |
Accessing the Spark UI
To gain deeper insights into your Spark batches and sessions — whether they’re still running or completed — simply navigate to the Batch Details or Session Details page in the Google Cloud console. You’ll find a “VIEW SPARK UI” link in the top right corner.
The new Spark UI provides the same powerful features as the open-source Spark History Server, giving you deep insights into your Spark job performance. Easily browse both running and completed applications, explore jobs, stages, and tasks, and analyze SQL queries for a comprehensive understanding of the execution of your application. Quickly identify bottlenecks and troubleshoot issues with detailed execution information. For even deeper analysis, the ‘Executors’ tab provides direct links to the relevant logs in Cloud Logging, allowing you to quickly investigate issues related to specific executors.
You can still use the “VIEW SPARK HISTORY SERVER” link to view the Persistent Spark History Server if you had already configured one.
A new “Investigate” tab in the Batch details screen gives you instant diagnostic highlights collected in a single place.
In the “Metrics highlights” section, the essential metrics are automatically displayed, giving you a clear picture of your batch job’s health. You can further create a custom dashboard if you need more metrics.
Below the metrics highlights, a “Job Logs” widget shows the logs filtered by errors, so you can instantly spot and address problems. If you would like to dig further into the logs, you can go to the Logs Explorer.
Proactive autotuning and assisted troubleshooting with Gemini (Preview)
Last but not least, Gemini in BigQuery can help reduce the complexity of optimizing hundreds of Spark properties in your batch job configurations when you submit a job. If the job fails or runs slowly, Gemini can save you the effort of wading through gigabytes of logs to troubleshoot it.
Optimize performance: Gemini can automatically fine-tune the Spark configurations of your Dataproc Serverless batch jobs for optimal performance and reliability.
Simplify troubleshooting: You can quickly diagnose and resolve issues with slow or failed jobs by clicking “Ask Gemini” for AI-powered analysis and guidance.
Sign up here for a free preview of the Gemini features and “Investigate” tab for Dataproc Serverless.
Climate change is the biggest challenge our society faces. As scientists, governments, and industry leaders gather in Baku, Azerbaijan for the 2024 United Nations Climate Change Conference, a.k.a. COP29, it’s incumbent upon all of us to find innovative solutions that can drive impact at a global scale.
The gravity of climate change requires solutions that go beyond incremental change. To find those solutions, we need the ability to make better decisions about how to approach climate mitigation and adaptation across every human activity — from transport, industry, and agriculture to communications, finance, and housing. This requires processing vast volumes of data generated by these industries. The combination of AI and cloud technologies offer the potential to unlock climate change solutions that can be both transformational and global in scale.
We already have a lot of examples that we can draw from.
Today, for example, Google Earth Engine is being used by the Forest Data Partnership, a collaboration for global monitoring of commodity-driven deforestation, to monitor every oil palm plantation around the globe, providing participating companies live early-warning signals for deforestation risks, and dramatically reducing the costs involved in forest monitoring. Similarly, NGIS is using Google Earth Engine to power TraceMark, helping businesses deliver traceability and transparency across global supply chains.
Another example is Global Fishing Watch, an international nonprofit co-founded by Google that is using geospatial analytics and AI to understand how human activity impacts the seas, global industries, climate, biodiversity and more. The datasets map global ocean infrastructure and vessels that don’t publicly broadcast their positions. This helps to advance policy conversations about offshore renewables development, provides insight into carbon dioxide emissions from maritime vessels, and enables marine protection.
It’s clear that AI can process large volumes of data, optimize complex systems, and drive the development of new business models. We see businesses harnessing the technology in the fight against climate change in four ways:
1. Measuring business performance
Businesses are using AI-powered insights to help monitor their progress toward sustainability targets, which ultimately contributes to building business resilience.
In today’s business landscape, this is of paramount importance as companies face growing demands for transparency and accountability regarding their environmental and social impact.
We are seeing cloud and AI being used to de-risk investments, improve transparency, and increase profitability through the use of large-scale datasets, machine learning, and generative AI. These technologies allow companies to analyze their ESG performance, gain insights into climate risks, and monitor supplier behaviors.
For example, Palo Alto Networks partnered with Watershed, a Google Cloud Ready – Sustainability Partner, to measure and track their carbon emissions across their entire business using Google Cloud. This partnership enabled them to gain a comprehensive understanding of their environmental impact and set actionable targets for reducing emissions.
Another example is HSBC, which developed a new credit ranking tool on Google Cloud that allows them to run multiple climate risk scenarios simultaneously. This tool empowers HSBC to make more informed investment decisions while considering the potential impact of climate change on their portfolio.
2. Optimizing operations and supply chains
Businesses are using AI to optimize their operations and supply chains for energy and resource efficiency, as well as to cut costs.
This is crucial for companies seeking to enhance their sustainability performance while simultaneously improving their bottom line. Through the use of AI and machine learning, cloud technologies empower organizations to optimize their existing operations, improve cost efficiency, and minimize waste.
For example, Geotab, another Google Cloud Ready – Sustainability partner, is managing 75 billion data records in BigQuery for 4 million commercial fleet vehicles every day to optimize vehicle routes, increase driver safety behaviors and accelerate the path to fleet electrification.
3. Identifying cleaner business models
As the world shifts toward more sustainable practices, businesses must adapt and identify new avenues for growth. Cloud and AI are helping businesses do just that, allowing organizations to reimagine their business models, explore new markets, and create innovative products and services that align with their sustainability goals.
Recykal, for instance, has partnered with Google Cloud to build Asia’s largest circular economy marketplace. By leveraging Google Cloud’s AI and machine learning capabilities, Recykal is revolutionizing waste management and promoting sustainable practices in Asia.
Another example is Einride, a company that is reimagining freight transport by using electric, self-driving vehicles and an AI-powered platform. Their innovative approach to logistics is disrupting the transportation industry and contributing to a more sustainable future.
More recently, Climate Engine and Robeco have started combining AI and geospatial technologies with their scientific expertise and investment knowledge to understand how publicly traded companies’ actions impact biodiversity. You can read their joint thought leadership paper here.
4. Building more sustainably
Finally, and very importantly, businesses want to ensure that the actual use of cloud and AI technologies doesn’t lead to increased climate impacts. From the get-go, developers need to take concrete steps towards reducing the carbon footprint and cost of their applications in the cloud.
This is why, through our Carbon Sense suite, we provide developers with the tools and resources they need to build and deploy applications in a way that minimizes their environmental impact, all while maintaining cost efficiency.
L’Oréal, for example, leverages Google Cloud’s Carbon Footprint tool to track the gross carbon emissions associated with their cloud usage. This allows L’Oréal to understand the environmental impact of their technology decisions and implement strategies to reduce their footprint.
Google also takes its own carbon footprint very seriously, and is pursuing an ambitious goal to achieve net-zero emissions across all of its operations and value chain, supported by a goal to run on 24/7 carbon-free energy on every grid where it operates by 2030.
Google Cloud is committed to helping organizations of all sizes achieve their sustainability goals. With cloud, data analytics, and AI, we’re delivering new ways to build resilience, reduce costs, and unlock sustainable growth, while also accelerating the impact of organizations’ sustainability initiatives through the smarter use of data. This is an opportunity to drive tangible business results and create a more sustainable future for all.
Crafting the perfect prompt for generative AI models can be an art in itself. The difference between a useful and a generic AI response can sometimes be a well-crafted prompt. But, getting there often requires time-consuming tweaking, iteration, and a learning curve. That’s why we’re thrilled to announce new updates to the AI-powered prompt writing tools in Vertex AI, designed to make prompting easier and more accessible for all developers.
We’re introducing two powerful features designed to streamline your prompt engineering workflow: Generate prompt and Refine prompt.
Imagine you need a prompt to summarize customer reviews about your latest product. Instead of crafting the prompt yourself, you can simply tell the Generate prompt feature your goal. It will then create a comprehensive prompt, including placeholders for the reviews, which you can easily populate with your own data later. Generate prompt takes the guesswork out of prompt engineering by:
Turning simple objectives into tailor-made, effective prompts. This way, you don’t need to agonize over phrasing and keywords.
Generating placeholders for context, like customer reviews, news articles, or code snippets. This allows you to quickly add your specific data and get immediate results.
Speeding up the prompt writing process. Focus on your core tasks, not on perfecting prompt syntax.
Refine prompt: Iterate and improve with AI-powered suggestions
Once you have a prompt, either crafted by Generate prompt or one you’ve written yourself, Refine prompt helps you modify it for optimal performance. Here’s how it works:
Provide feedback: After running your prompt, simply provide feedback on the response, the same way you would critique a writer.
Instant suggestions: Vertex AI generates a new, suggested prompt in one step, taking your feedback into account.
Iterate and improve: You can accept or reject the suggestion and continue iterating by running the refined prompt and providing further feedback.
Prompt refinement boosts the quality of the prompt while also saving significant time during prompt design. Quality typically improves because the prompt instructions are augmented in a way that Gemini can better understand.
Below are some sample prompts that were revised with Refine prompt:
| Original prompts | After using Prompt Refinement |
| --- | --- |
| Suggest engaging lesson plan ideas for art class | Suggest 3 engaging lesson plan ideas for a high school art class, each focusing on a different art form. Be concise and only include the most relevant information, such as the art form, target age group, and key activity. |
| Plan a schedule for a week with focus time and meeting time. Take in account that there are 2 teams with 6 hour delay | Create a detailed weekly schedule for a team with a 6-hour time difference. The schedule should include: specific time blocks for focus time and meetings; consideration of overlapping work hours to ensure effective communication and collaboration; a balance of individual work and team interactions; and suggestions for time zone conversion tools or strategies to facilitate scheduling. |
A powerful duo: Generate prompt meets Refine prompt
These two features work in tandem to help you craft the most effective prompt for your objective – irrespective of your skill level. Generate prompt gets you started quickly, while Refine prompt allows for iterative improvement in five steps:
Define your objective: Tell Generate prompt what you want to achieve.
Generate a prompt: Generate prompt creates a ready-to-use prompt, often with helpful placeholders for context.
Run the prompt and review the output: Execute the prompt with your chosen LLM in Vertex AI.
Refine with feedback: Use Refine prompt to provide feedback on the output and receive AI-powered suggestions for prompt improvement.
Iterate until ideal performance: Continue refining and rerunning your prompt until you achieve your desired results.
How to get started
Go ahead and try out AI-assisted prompt writing through our interactive critiquing workflow. Vertex AI’s easy-to-use UI for refining prompts can be tested without setting up a Google Cloud account through this link (to demo without a Google Cloud account, be sure you are logged out of your Google account in your web browser or use incognito mode). For those with an account, you’ll have the ability to save, manage, and fine-tune your prompts.
Generative AI presents both immense opportunities and challenges for the Department of Defense (DoD). The potential to enhance situational awareness, streamline tasks, and improve decision-making is significant. However, the DoD’s unique requirements, especially their stringent security standards for cloud services (IL5), necessitate carefully crafted AI solutions that balance innovation with security.
The DoD’s 2023 Data, Analytics, and Artificial Intelligence Adoption Strategy report emphasizes the need to “strengthen the organizational environment” for AI deployment. This underscores the importance of solutions that seamlessly integrate into existing infrastructure, prioritize data security, and enable responsible and intelligent use of AI.
Google Public Sector’s 4 AI pillars: A framework for DoD AI adoption
To meet the DoD’s unique challenges, Google AI for Public Sector has focused on 4 areas when designing solutions to help empower the DoD:
Adaptive: AI solutions must seamlessly integrate into the DoD’s existing complex and evolving technology ecosystem. Google prioritizes adaptable solutions that minimize disruption and enable rapid adoption, aligning with the DoD’s focus on agile innovation.
Secure: Protecting sensitive DoD data is paramount. Google’s AI solutions are engineered with robust security measures, including Zero Trust architecture and adherence to IL5 requirements, ensuring the confidentiality and integrity of critical information.
Intelligent: Google’s AI capabilities are designed to deliver actionable insights from vast and diverse datasets. By harnessing the power of machine learning and natural language processing, our solutions enable the DoD to make data-driven decisions with greater speed and accuracy.
Responsible: Google is committed to developing and deploying AI in a responsible and ethical manner. Our AI Principles guide our research, product development, and deployment decisions, ensuring that AI is used for good and avoids harmful applications.
Breaking down data silos and delivering insights with enterprise search
Google Cloud’s solution for enterprise search is a powerful tool designed to help organizations overcome the challenges of data fragmentation. It acts as a central hub, seamlessly connecting to diverse data sources across the department, including structured and unstructured data.
Intelligent Information Retrieval: Leveraging advanced AI and natural language processing, enterprise search delivers precise and contextually relevant answers to queries, even when dealing with unstructured data like documents, images, and reports.
Seamless Integration: Federated search combined with Retrieval Augmented Generation (RAG) provides relevant query responses without the need to move data or train a custom Large Language Model (LLM).
Enhanced Transparency and Trust: The solution provides links to source documents alongside AI-generated responses, allowing users to verify information and build confidence in the system.
Robust Security: With all services used in the solution submitted for IL5 accreditation, enterprise search incorporates industry-leading security measures, including Role-Based Access Control (RBAC) and Common Access Card (CAC) compatibility, to safeguard sensitive DoD data.
Future-Proof Flexibility: The solution supports a wide range of Large Language Models (LLMs), including Google’s Gemini family of models and Gemma, our family of lightweight, state-of-the-art open models. Google offers choice and adaptability while avoiding vendor lock-in, allowing the DoD to leverage the latest AI advancements without extensive redevelopment.
Google Cloud’s generative AI infused solution directly supports the DoD’s mission by consolidating data access, enhancing discoverability, and providing rapid, accurate insights, leading to improved decision-making and a strategic advantage.
Google Cloud is committed to supporting the DoD’s AI journey by providing solutions that are not only powerful and innovative, but also secure, responsible, and adaptable. By empowering the DoD to harness the full potential of its data, we are helping to enable more agile, informed, and effective service members. Learn more about how Google Public Sector’s AI solutions can empower your agency and visit Google AI for Public Sector for examples of how we are helping accelerate mission impact with AI.
Welcome to the first Cloud CISO Perspectives for November 2024. Today I’m joined by Andy Wen, Google Cloud’s senior director of product management for Google Workspace, to discuss a new Google survey into the high security costs of legacy tech.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
–Phil Venables, VP, TI Security & CISO, Google Cloud
Confronting the high security cost of legacy tech
By Phil Venables, VP, TI Security & CISO, Google Cloud, and Andy Wen, senior director, product management, Google Workspace
From a business perspective, it’s easy to understand why many organizations continue to rely on outdated technology. Replacing older systems can be expensive, but relying on them comes with hidden costs that can far outstrip the benefits.
Legacy technology can greatly increase the business and security risks that an organization faces, a serious concern given that the global average total cost of a security breach in 2024 was $4.88 million. Despite the availability of a plethora of more modern solutions, we’re still seeing too many organizations rely on defenses that were designed for the desktop era, according to a new Google Workspace global cybersecurity survey of more than 2,000 security and IT decision-makers.
Phil Venables, VP, TI Security & CISO, Google Cloud
The numbers paint a dire picture of the security impact of operating legacy systems:
71% said that legacy technology has left organizations less prepared for the future.
63% believe that their organization’s technology landscape is less secure than it was in the past.
More than 66% told us that their organizations are investing more time and money than ever in securing their environments — but still experience costly security incidents.
81% of organizations experience at least one security incident per year.
Organizations experience eight security incidents on average per year.
We know many security leaders have convinced the business to invest in more security tools, because the survey also found that 61% of organizations are using more security tools than they did two years ago. Yet while more than two-thirds of organizations are investing more time and money in securing their environments, many are still experiencing expensive security incidents.
Andy Wen, senior director, product management, Google Workspace
Environments with more security tools often attempt to compensate for legacy platforms that continue to be vulnerable to security incidents. Meanwhile, 81% of security leaders believe cloud-first platforms are safer than legacy platforms.
Organizations with 10 or more security tools reported an average of 14 security incidents per year, with 34% of them spending more than $250,000 on incidents per year.
Organizations with fewer than 10 tools reported an average of six incidents per year, with 19% of them spending more than $250,000 on incidents per year.
“The solution is not more security tools, but more secure tools,” said CISA Director Jen Easterly at her mWISE Conference keynote in September.
We have also made this point often. To be truly resilient in today’s security landscape, organizations must consider an IT overhaul and rethink their strategy toward solutions with modern, secure-by-design architectures that nullify classes of vulnerabilities and attack vectors.
It may be daunting to take on an overhaul, especially for large organizations, but security leaders need to look at investing in a cloud-first solution to be resilient. The change can be made in small steps to minimize disruption and evaluate return on investment, such as using Chrome Enterprise for secure browsing and providing Google Workspace to specific teams.
The bottom line is that adopting modern technology can help eliminate entire classes of threats, as well as improve business outcomes.
We’d like to highlight three customer interactions that underscore organizational value gained by modernizing. Organizations need a centralized solution that can evolve, especially as attacks continue to increase in quantity and sophistication. We recently did some work with the cybersecurity company Trellix, which did a complete overhaul of its security infrastructure.
Trellix was running into issues where its old software stack felt stagnant and didn’t connect into new things they were doing or building. These older solutions made it hard to control where data was sitting and who was accessing it. They’ve since fully migrated to Google Workspace, adopted the Zero Trust capabilities we’ve built in, and augmented them with their own security solutions, including a security operations console, email security, and endpoint protection.
Employees can now chat, email, view files, edit documents, and join meetings from their device of choice without worrying about security and access permissions. All these capabilities live within the same platform, making it easier and simpler for security admins to oversee data safety with features like endpoint management and Zero Trust access controls in Workspace — without slowing down employee collaboration.
Similarly, the city of Dearborn, Mich., replaced its legacy email solution. After making the switch to Gmail, users noticed a meaningful decrease in spam, phishing, and malware, which helped reduce their cybersecurity risks.
Humana’s dilemma was driven by a legacy suite of desktop-based office applications that its IT team needed to spend 70% of its time maintaining. Humana’s IT team rolled out Google Workspace to 13,000 Humana employees in the field and in the office in four months, migrating 22 terabytes of data. Workspace’s built-in security features and browser-based apps saved the team time and reduced costs, and also led to a steady reduction in help desk tickets during and after rollout.
For more leadership guidance from Google Cloud experts, please see our CISO Insights hub.
In case you missed it
Here are the latest updates, products, services, and resources from our security teams so far this month:
Join our upcoming Security Talks to unlock the Defender’s Advantage: Our next Security Talks is coming on Nov. 19, and will focus on the Defender’s Advantage. This free, day-long virtual event is packed with insights and strategies to help you proactively secure your cloud environment. Register today.
Cyber risk top 5: What every board should know: Boards should learn about security and digital transformation to better manage their organizations. Here are the five top risks they need to know — and prepare for. Read more.
Mandatory MFA is coming to Google Cloud. Here’s what you need to know: To help keep our customers secure, starting in 2025 we will require them to use MFA when accessing Google Cloud. Read more.
Google Cloud expands CVE program: As part of our commitment to security and transparency on vulnerabilities found in our products and services, we now will issue CVEs for critical Google Cloud vulnerabilities. Read more.
Our 2025 Forecast report: Get ready for the next year in cybersecurity with our 2025 Forecast report, now available. Read more.
From AI to Zero Trust, Google Cloud Security delivers comprehensive public sector solutions: Google Cloud Security is committed to helping government agencies and organizations strengthen their defenses, and we recently made several announcements at the Google Public Sector Summit. Read more.
FedRAMP High development in the cloud: Code with Cloud Workstations: A Forrester Total Economic Impact™ (TEI) study found that Google Cloud Workstations enhance consistency, agility, and security while reducing costs and risks. Read more.
Please visit the Google Cloud blog for more security stories published this month.
(In)tuned to take-overs: Abusing Intune permissions for lateral movement and privilege escalation: Learn how the Mandiant Red Team was able to move laterally from a customer’s on-premises environment to their Microsoft Entra ID tenant, and obtained privileges to compromise existing Entra ID service principals installed in the tenant. Also learn how to defend against it. Read more.
Flare-On 11 Challenge solutions: The latest Flare-On challenge is over, and it proved a doozy: Only 275 players out of 5,300 completed all 10 stages. Read more.
Please visit the Google Cloud blog for more threat intelligence stories published this month.
Now hear this: Google Cloud Security and Mandiant podcasts
Gen AI security: Unseen attack surfaces and pentesting lessons: What’s the current state of gen AI security? From common mistakes to novel attack surfaces to unique challenges, podcast hosts Anton Chuvakin and Tim Peacock discuss with Ante Gojsalic, co-founder and CTO, SplxAI, today’s gen AI security concerns and their potential impact on tomorrow’s tech. Listen here.
Get the Google Security Operations perspective on SIEM and security data lakes: What’s a disassembled SIEM, and why you should care: Travis Lanham, uber tech lead for Security Operations Engineering, Google Cloud, goes SIEM-deep with Anton and Tim. Listen here.
To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in two weeks with more security-related updates from Google Cloud.
It’s an exciting time in the world of data and analytics, with more organizations harnessing the power of data and AI to help transform and grow their businesses. But in a threat landscape with increasingly sophisticated attacks around every corner, ensuring the security and integrity of that data is critical.
Google Cloud offers a comprehensive suite of tools to help protect your data while unlocking its potential. In our new ebook, Building a Secure Data Platform with Google Cloud, we dig into the many data security capabilities within Google Cloud and share how they can help support data-based innovation strategies.
Take a peek inside the ebook, then download the full version here.
Unlock data platform-level security with BigQuery

BigQuery, Google Cloud’s unified data platform, offers a robust set of integrated security features to help you safeguard your data. The platform automatically encrypts all data at rest, which provides a foundational layer of defense against unauthorized access. For data sharing, BigQuery Analytics Hub and data clean rooms allow you to efficiently, securely, and easily share data across organizational boundaries. The platform also includes Dataplex, which enables you to implement comprehensive policies to govern how data is accessed, used, and shared within your organization.
Shield assets with granular access controls and guardrails

With Cloud Identity and Access Management (IAM), you can manage access to critical data across BigQuery, Cloud Run, Cloud Run functions, and Google Kubernetes Engine (GKE) resources. Organization restrictions place further limits on which users can access resources in your organization. Combined with Cloud IAM, this feature supports your organization policies and helps you maintain a secure perimeter around your Google Cloud environment.
Create security boundaries with perimeter and data protection

Google Cloud offers several ways to reinforce your perimeter. VPC Service Controls help prevent data exfiltration from cloud resources, providing precise control over access and movement by external entities or by insiders.
Meanwhile, with Sensitive Data Protection, you can identify and classify your sensitive data within BigQuery, which can help you implement targeted protection measures such as masking, tokenization, and redaction. You can also gain even more granular control over your encryption keys with Customer-managed encryption keys for BigQuery.
Strengthen data security posture with automated monitoring and compliance

Establishing robust security controls for your data is essential for improving your security posture, but it’s just as important to monitor your environment for threats and maintain compliance with industry standards. Security Command Center gives you a comprehensive view of your security posture with direct visibility into your BigQuery datasets. With Cloud Logging, you can collect, store, and analyze logs to gain insights into system activities, detect anomalies, and respond to security incidents. Assured Workloads further simplifies compliance, providing peace of mind that you’ve established strong baseline controls and compliant configurations.
All-in-one data security with integrated solutions from Google Cloud
Building a secure data ecosystem requires a multi-layered approach. With comprehensive security features from Google Cloud, you can safeguard your sensitive data, comply with industry regulations, and discover the full potential of your data. Dive deeper into these tools, solutions, and strategies in the full ebook — Building a Secure Data Platform with Google Cloud — to ensure the safety and integrity of your organization’s most valuable asset. Download the full version here.
As open-source large language models (LLMs) become increasingly popular, developers are looking for better ways to access new models and deploy them on Cloud Run with GPUs. That’s why Cloud Run now offers fully managed NVIDIA GPUs, removing the complexity of driver installations and library configurations. This means you’ll benefit from the same on-demand availability and effortless scalability that you love with Cloud Run’s CPU and memory, with the added power of NVIDIA GPUs. When your application is idle, your GPU-equipped instances automatically scale down to zero, optimizing your costs.
In this blog post, we’ll guide you through deploying the Meta Llama 3.2 1B Instruction model on Cloud Run. We’ll also share best practices to streamline your development process using local model testing with Text Generation Inference (TGI) Docker image, making troubleshooting easy and boosting your productivity.
Why Cloud Run with GPU?
There are four critical reasons developers benefit from deploying open models on Cloud Run with GPU:
Fully managed: No need to worry about drivers, libraries, or infrastructure.
On-demand scaling: Scale up or down automatically based on demand.
Cost effective: Only pay for what you use, with automatic scaling down to zero when idle.
Performance: NVIDIA GPU-optimized for Meta Llama 3.2.
Initial Setup
First, create a Hugging Face token.
Second, check that your Hugging Face token has permission to access and download the Llama 3.2 model weights here. Keep your token handy for the next step.
Third, use Google Cloud’s Secret Manager to store your Hugging Face token securely. In this example, we will be using Google user credentials. You may need to authenticate the gcloud CLI, set a default project ID, enable the necessary APIs, and grant access to Secret Manager and Cloud Storage.
# Authenticate the gcloud CLI
gcloud auth login

# Set the default project
gcloud config set project <your_project_id>

# Create a new secret; remember to update <your_huggingface_token>
gcloud secrets create HF_TOKEN --replication-policy="automatic"
echo -n <your_huggingface_token> | gcloud secrets versions add HF_TOKEN --data-file=-

# Retrieve the key
HF_TOKEN=$(gcloud secrets versions access latest --secret="HF_TOKEN")
Local debugging
Install the huggingface_hub Python package (which provides the huggingface-cli tool) in your virtual environment.
Run huggingface-cli login to set up a Hugging Face credential.
Use the TGI Docker image to test your model locally. This allows you to iterate and debug your model locally before deploying it to Cloud Run.
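If you want a quick way to exercise the locally running TGI container before deploying, a small script along these lines can help. This is a minimal sketch: it assumes the container is already running and mapped to localhost:8080 and that the default TGI /generate endpoint is available; adjust the URL and parameters to match your setup.

import requests

# Hypothetical local endpoint; adjust the port to match your `docker run` mapping.
TGI_URL = "http://localhost:8080/generate"

payload = {
    "inputs": "What is Cloud Run?",
    "parameters": {"max_new_tokens": 128, "temperature": 0.7},
}

# Send a single generation request and print the model's reply.
response = requests.post(TGI_URL, json=payload, timeout=60)
response.raise_for_status()
print(response.json()["generated_text"])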
Now, create a new Cloud Run service with your deployment script. Remember to update BUCKET_NAME, and you may also need to update the network and subnet names.
New solutions, old problems. Artificial intelligence (AI) and large language models (LLMs) are here to signal a new day in the cybersecurity world, but what does that mean for us—the attackers and defenders—and our battle to improve security through all the noise?
Data is everywhere. For most organizations, the access to security data is no longer the primary issue. Rather, it is the vast quantities of it, the noise in it, and the disjointed and spread-out nature of it. Understanding and making sense of it—THAT is the real challenge.
When we conduct adversarial emulation (red team) engagements, making sense of all the network, user, and domain data available to us is how we find the path forward. From a defensive perspective, efficiently finding the sharpest and most dangerous needles in the haystack—for example, easily accessible credentials on fileshares—is how we prioritize, improve, and defend.
How do you make sense of this vast amount of structured and unstructured data, and give yourself the advantage?
Data permeates the modern organization. This data can be challenging to parse, process, and understand from a security implication perspective, but AI might just change all that.
This blog post will focus on a number of case studies where we obtained data during our complex adversarial emulation engagements with our global clients, and how we innovated using AI and LLM systems to process this into structured data that could be used to better defend organizations. We will showcase the lessons learned and key takeaways for all organizations and highlight other problems that can be solved with this approach for both red and blue teams.
Approach
Data parsing and understanding is one of the biggest early benefits of AI. We have seen many situations where AI can help process data at a fast rate. Throughout this post, we use an LLM to process unstructured data, meaning that the data did not have a structure or format that we knew about before parsing the data.
If you want to try these examples out yourself, please make sure you use either a local model, or you have permission to send the data to an external service.
Getting Structured Data Out of an LLM
Step one is to get the data into a format we can use. If you have ever used an LLM, you will have noticed that it outputs a story or prose text, especially if you use chat-based versions. For a lot of use cases this is fine; however, we want to analyze the data and get structured output. Thus, the first problem we have to solve is getting the LLM to output the data in a format we can specify. The simple method is to ask the LLM to output the data in a machine-readable format like JSON, XML, or CSV. However, you will quickly notice that you have to be quite specific with the data format, and the LLM can easily output data in another format, ignoring your instructions.
Luckily for us, other people have encountered this problem and have solved it with something called Guardrails. One of the projects we have found is called guardrails-ai. It is a Python library that allows you to create guardrails—specific requirements—for a model based on Pydantic.
To illustrate, take a simple Python class from the documentation to validate a pet from the output of the LLM:
from pydantic import BaseModel, Field

class Pet(BaseModel):
    pet_type: str = Field(description="Species of pet")
    name: str = Field(description="a unique pet name")
You can use the next code from the Guardrails documentation to process the output of the LLM into a structured object:
from guardrails import Guard
import openai

prompt = """
What kind of pet should I get and what should I name it?
${gr.complete_json_suffix_v2}
"""

guard = Guard.from_pydantic(output_class=Pet, prompt=prompt)

raw_output, validated_output, *rest = guard(
    llm_api=openai.completions.create,
    engine="gpt-3.5-turbo-instruct"
)

print(validated_output)
If we look at what this library generates under the hood for this prompt, we see that it adds a structured-output section with instructions for the LLM to output data in a specific way. This streamlines the way you can get structured data from an LLM.
Figure 1: The generated prompt from the Pydantic model
For the next use case, we will show the Pydantic models we’ve created to process the output.
Red Team Use Cases
The next sections contain some use cases where we can use an LLM to get structured data out of data obtained. The use cases are divided into three categories of the attack lifecycle:
Initial Reconnaissance
Escalate Privileges
Internal Reconnaissance
Figure 2: Attack lifecycle
Initial Reconnaissance
Open Source Intelligence (OSINT) is an important part of red teaming. It includes gathering data about the target organization from news articles, social media, and corporate reports.
This information can then be used in other red team phases, such as phishing. For defenders, it helps them understand which parts of their organization are exposed to the internet, anticipating a possible future attack. In the next use case, we talk about processing social media information to classify roles and extract useful information.
Use Case 1: Social Media Job Functions Information
During OSINT, we often try to get information from employees about their function in their company. This helps with performing phishing attacks, as we do not want to target IT professionals, especially those that work in cybersecurity.
Social media sites allow their users to write about their job titles in a free format. This means that the information is unstructured and can be written in any language and any format.
We can try to extract the information from the title with simple matches; however, because the users can fill in anything and in any language, this problem can be better solved with an LLM.
Data Model
First, we create a Pydantic model for the Guardrail:
class RoleOutput(BaseModel):
    role: str = Field(description="Role being analyzed")
    it: bool = Field(description="The role is related to IT")
    cybersecurity: bool = Field(description="The role is related to CyberSecurity")
    experience_level: str = Field(
        description="Experience level of the role.",
    )
This model has two Boolean fields that indicate whether the role is IT or cybersecurity related. Additionally, we would like to know the experience level of the role.
Prompt
Next, let’s create a prompt to instruct the LLM to extract the requested information from the role. This prompt is quite simple and just asks the LLM to fill in the data.
Given the following role, answer the following questions.
If the answer doesn't exist in the role, enter ``.
${role}
${gr.complete_xml_suffix_v2}
The two last lines are placeholders used by guardrails-ai.
Results
To test the model, we scraped the titles that employees use on social media; the dataset contained 235 entries. For testing, we used the gemini-1.0-pro model.
Gemini managed to parse 232 entries. The results are shown in Table 1.
| | Not IT | IT | Cybersecurity |
| --- | --- | --- | --- |
| Gemini | 183 | 49 | 5 |
| Manual evaluation by a red team operator | 185 | 47 | 5 |
| False positive | 1 | 3 | 0 |

Table 1: Results of Gemini parsing 232 job title entries
In the end, Gemini processed the roles quite on par with a human. Most of the false positives were questionable because it is not very clear if the role was actually IT related. The experience level did not perform well, as the model deemed the experience level as “unknown” or “none” for most of the entries. To resolve this issue, the field was changed so that the experience level should be a number from 1 to 10. After running the analysis again, this yielded better results for the experience level. The lowest experience levels (1–4) contained function titles like “intern,” “specialist,” or “assistant.” This usually indicated that the person had been employed at that role for a shorter period of time. The updated data model is shown as follows:
class RoleOutput(BaseModel):
    role: str = Field(description="Role being analyzed")
    it: bool = Field(description="The role is related to IT")
    cybersecurity: bool = Field(description="The role is related to CyberSecurity")
    experience_level: int = Field(
        description="Estimate of the experience level of the role on a scale of 1-10. Where 1 is low experience and 10 is high.",
    )
This approach helped us to sort through a large dataset of phishing targets by identifying employees that did not have IT and cybersecurity roles, and sorting them by experience level. This can speed up target selection for large organizations and may allow us to better emulate attackers by changing the prompts or selection criteria. To defend against this, data analysis is more difficult. In theory, you can instruct all your employees to include “Cybersecurity” in their role, but that does not scale well or solve the underlying phishing problem. The best approach with regards to phishing is, in our experience, to invest into phishing resistant multifactor authentication (MFA) and application allowlisting. If applied well, these solutions can mitigate phishing attacks as an initial access vector.
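As an illustration of that sorting step, here is a minimal sketch that filters and ranks parsed results. It assumes the RoleOutput model defined above; the people and role entries are made up for demonstration.

# Rank potential phishing targets from parsed role data.
# RoleOutput is the Pydantic model defined above; these entries are illustrative only.
example_roles = [
    ("alice", RoleOutput(role="Security Engineer", it=True, cybersecurity=True, experience_level=8)),
    ("bob", RoleOutput(role="Accounting Intern", it=False, cybersecurity=False, experience_level=1)),
    ("carol", RoleOutput(role="HR Specialist", it=False, cybersecurity=False, experience_level=3)),
]

# Drop anyone whose role looks IT- or cybersecurity-related, then prefer
# lower experience levels (1 = intern/assistant-type roles).
candidates = [(name, r) for name, r in example_roles if not r.it and not r.cybersecurity]
for name, r in sorted(candidates, key=lambda pair: pair[1].experience_level):
    print(f"{name}: {r.role} (experience {r.experience_level})")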
Escalate Privileges
Once attackers establish a foothold into an organization, one of their first acts is often to improve their level of access or control through privilege escalation. There are quite a few methods that can be used for this. It comes in a local system-based variety as well as wider domain-wide types, with some based on exploits or misconfigurations, and others based on finding sensitive information when searching through files.
Our focus will be on the final aspect, which aligns with our challenge of identifying the desired information within the vast amount of data, like finding a needle in a haystack.
Use Case 2: Credentials in Files
After gaining initial access to the target network, one of the more common enumeration methods employed by attackers is to perform share enumeration and try to locate interesting files. There are quite a few tools that can do this, such as Snaffler.
After you identify files that potentially contain credentials, you can go through them manually to find useful ones. However, if you do this in a large organization, there is a chance that you will have hundreds to thousands of hits. In that case, there are some tools that can help with finding and classifying credentials like TruffleHog and Nosey Parker. Additionally, the Python library detect-secrets can help with this task.
Most of these tools look for common patterns or file types that they understand. To cover unknown file types or credentials in emails or other formats, it might instead be valuable to use an LLM to analyze the files to find any unknown or unrecognized formats.
Technically, we can just run all tools and use a linear regression model to combine the results into one. An anonymized example of a file with a password that we encountered during our tests is shown as follows:
@Echo Off
Net Use /Del * /Yes
Set /p Path=<"path.txt"
Net Use %Path% Welcome01@ /User:CHAOS.LOCAL\WorkstationAdmin
If Not Exist "C:\Data" MKDIR "C:\Data"
Copy %Path%\. C:\Data
Timeout 02
Data Model
We used the following Python classes to instruct Gemini to retrieve credentials with an optional domain. One file can contain multiple credentials, so we use a list of credentials to instruct Gemini to optionally retrieve multiple credentials from one file.
class Credential(BaseModel):
    password: str = Field(description="Potential password of an account")
    username: str = Field(description="Potential username of an account")
    domain: Optional[str] = Field(
        description="Optional domain of an account", default=""
    )

class ListOfCredentials(BaseModel):
    credentials: list[Credential] = []
Prompt
In the prompt, we describe what we want Gemini to look for and once again request JSON output:
Given the following file, check if there are credentials in the file.
Only include results if there is at least one username and password.
If the domain doesn't exist in the file, enter `` as a default value.
${file}
${gr.complete_xml_suffix_v2}
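As a rough illustration of how the prompt and the Pydantic models above could be wired together, here is a minimal sketch. It assumes the google-generativeai Python SDK, a hypothetical input file path, and plain JSON output parsed with Pydantic; the guardrails-style ${gr.complete_xml_suffix_v2} suffix from the original workflow is omitted here.

import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-pro")  # model name is illustrative

# Hypothetical candidate file identified during share enumeration.
file_contents = open("suspect_file.bat", encoding="utf-8", errors="ignore").read()

prompt = (
    "Given the following file, check if there are credentials in the file.\n"
    "Only include results if there is at least one username and password.\n"
    "If the domain doesn't exist in the file, enter `` as a default value.\n\n"
    f"{file_contents}\n\n"
    "Return JSON matching this schema: "
    f"{ListOfCredentials.model_json_schema()}"
)

response = model.generate_content(
    prompt,
    generation_config={"response_mime_type": "application/json"},
)

# Validate the response against the Pydantic model defined above.
found = ListOfCredentials.model_validate_json(response.text)
for cred in found.credentials:
    print(cred.domain, cred.username, cred.password)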
Results
We tested on 600 files, of which 304 contain credentials and 296 do not. Testing was done with the gemini-1.5 model, and each file took about five seconds to process.
To compare results with other tools, we also tested Nosey Parker and TruffleHog. Both are built to find credentials in files, including repositories, in a structured way; their typical use case is known file formats rather than arbitrarily structured files.
The results are summarized in Table 2.
Tool
True Negative
False Positive
False Negative
True Positive
Nosey Parker
284 (47%)
12 (2%)
136 (23%)
168 (28%)
TruffleHog
294 (49%)
2 (<1%)
180 (30%)
124 (21%)
Gemini
278 (46%)
18 (3%)
23 (4%)
281 (47%)
Table 2: Results of testing for credentials in files, where 304 contain them and 296 do not
In this context, the definitions of true negative, false positive, false negative, and true positive are as follows:
True Negative: A file does not contain any credentials, and the tool correctly indicates that there are no credentials.
False Positive: The tool incorrectly indicates that a file contains credentials when it does not.
False Negative: The tool incorrectly indicates that a file does not contain any credentials when it does.
True Positive: The tool correctly indicates that a file contains credentials.
In conclusion, Gemini finds the most files with credentials, at the cost of a slightly higher false positive rate. TruffleHog has the lowest false positive rate, but also finds the fewest true positives. This is expected: a higher true positive rate usually comes with a higher false positive rate. The current dataset has an almost equal number of files with and without credentials; in real-world scenarios this ratio can differ wildly, which means the false positive rate remains important even though the percentages here are quite close.
To optimize this approach, you can use all three tools, combine the output signals to a single signal, and then sort the potential files based on this combined signal.
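One possible way to combine the signals, sketched below under the assumption that each tool's verdict has been reduced to a 0/1 feature per file and that a small, manually verified sample is available for fitting (all values here are placeholders):

import numpy as np
from sklearn.linear_model import LinearRegression

# Each row is one file: [nosey_parker_hit, trufflehog_hit, gemini_hit].
# X_labeled / y_labeled are a small manually verified sample (placeholder data).
X_labeled = np.array([[1, 1, 1], [0, 0, 1], [0, 0, 0], [1, 0, 1], [0, 1, 0]])
y_labeled = np.array([1, 1, 0, 1, 0])

reg = LinearRegression().fit(X_labeled, y_labeled)

# Score every file found during share enumeration and review the highest scores first.
X_all = np.array([[0, 0, 1], [1, 1, 1], [0, 0, 0]])  # placeholder tool outputs
scores = reg.predict(X_all)
review_order = np.argsort(scores)[::-1]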
Defenders can, and should, use the same techniques previously described to enumerate the internal file shares and remove or limit access to files that contain credentials. Make sure to check what file shares each server and workstation exposes to the network, because in some cases file shares are exposed accidentally or were forgotten about.
Internal Reconnaissance
When attackers have gained a better position in the network, the next step in their playbooks is understanding the domain in which they have landed so they can construct a path to their ultimate goal. This could be full domain control or access to specific systems or users, depending on the threat actor’s mission. From a red team perspective, we need to be able to emulate this. From a defender’s perspective, we need to find these paths before the attackers exploit them.
The main tool that red teamers use to analyze Active Directory is BloodHound, which uses a graph database to find paths in the Active Directory. BloodHound is executed in two steps. First, an ingester retrieves the data from the target Active Directory. Second, this data is ingested and analyzed by BloodHound to find attack paths.
Some tools that can gather data to be used in BloodHound are:
SharpHound
BloodHound.py
RustHound
ADExplorer
BOFHound
SOAPHound
These tools gather data from the Active Directory and other systems and output it in a format that BloodHound can read. In theory, if we have all the information about the network in the graph, then we can just query the graph to figure out how to achieve our objective.
To improve the data in BloodHound, we have thought of additional use cases. Use Case 3 is about finding high-value systems. Discovering more hidden edges in BloodHound is part of Use Case 4 and Use Case 5.
Use Case 3: High-Value Target Detection in Active Directory
By default, BloodHound deems some groups and computers as high value. One of the main activities in internal reconnaissance is figuring out which systems in the client’s network are high-value targets. Some examples of systems that we are interested in, and that can lead to domain compromise, are:
Backup systems
SCCM
Certificate services
Exchange
WSUS systems
How a server's function is indicated varies widely and depends on how the IT administrators have configured their domain. Several Active Directory fields may contain data, in various forms, that indicates what a system is used for. This is a prime example of unstructured data that an LLM might be able to analyze.
The following fields in the Active Directory might contain the relevant information:
Name
Samaccountname
Description
Distinguishedname
SPNs
Data Model
In the end, we would like to have a list of names of the systems the LLM has deemed high value. During development, we noticed that LLM results improved dramatically if you asked it to specify a reason. Thus, our Pydantic model looks like this:
class HighValueSystem(BaseModel):
    name: str = Field(description="Name of this system")
    reason: str = Field(description="Reason why this system is high value", default="")

class HighValueResults(BaseModel):
    systems: list[HighValueSystem] = Field(description="high value systems", default=[])
Prompt
In the prompt, we give some examples of what kind of systems we are looking for:
Given the data, identify which systems are high value targets,
look for: sccm servers, jump systems, certificate systems, backup
systems and other valuable systems. Use the first (name) field to
identify the systems.
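Because all of the computer objects are sent to the model in a single request, the relevant Active Directory fields have to be serialized into the prompt. A minimal sketch, assuming the BloodHound computer data is available as a list of dicts with the fields listed earlier (the field names and helper are illustrative); the response can then be validated against HighValueResults in the same way as the credentials example:

import json

def build_high_value_prompt(computers: list[dict]) -> str:
    # Keep only the fields that tend to describe what a system is used for.
    rows = [
        {
            "name": c.get("name"),
            "samaccountname": c.get("samaccountname"),
            "description": c.get("description"),
            "distinguishedname": c.get("distinguishedname"),
            "serviceprincipalnames": c.get("serviceprincipalnames", []),
        }
        for c in computers
    ]
    return (
        "Given the data, identify which systems are high value targets, "
        "look for: sccm servers, jump systems, certificate systems, backup "
        "systems and other valuable systems. Use the first (name) field to "
        "identify the systems.\n\n" + json.dumps(rows, indent=2)
    )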
Results
We tested this prompt on a dataset of 400 systems and executed it five times. All systems were sent to the model in a single query; to accommodate this, we used the gemini-1.5 model because of its large context window. Here are some examples of reasons Gemini provided, and what we think each reason was based on:
Domain controller: Looks like this was based on the “OU=Domain Controllers” distinguishedname field of BloodHound
Jumpbox: Based on the “OU=Jumpboxes,OU=Bastion Servers” distinguishedname
Lansweeper: Based on the description field of the computer
Backup Server: Based on “OU=Backup Servers” distinguishedname
Some of the high-value targets are valid but already known, like domain controllers. Others are good finds, like the jumpbox and backup servers. This method can also handle system names in other languages and more verbose system descriptions when determining which systems may be high value. Additionally, the query can be adapted to a specific client environment, for example:
Given the data, identify which systems are related to
SWIFT. Use the first (name) field to identify the systems.
In this case, the LLM will look for SWIFT servers and may save you time searching for them manually. This approach can work even better when you combine the Active Directory data with internal documentation, which can yield results even if the Active Directory itself contains little information about what a system is used for.
For defenders, there are some ways to deal with this situation:
Limit the amount of information stored in the Active Directory, and keep system descriptions in your documentation instead
Limit the amount of information a regular user can retrieve from the Active Directory
Monitor LDAP queries to detect when large amounts of data are being retrieved
Use Case 4: User Clustering
After gaining an initial strong position and understanding the systems in the network, attackers often need to find the right users to compromise to gain further privileges in the domain. For defenders, legacy user accounts and administrators with too many rights are a common security issue.
Administrators often have multiple user accounts: one for day-to-day operations like reading email on their workstation, and one or more administrator accounts. This separation is meant to make it harder for attackers to compromise the administrator account.
There are some common implementation flaws that can make it possible to bypass this separation. Most of these methods require the attacker to cluster accounts together to see which ones belong to the same employee. In many cases, this can be done by inspecting the Active Directory objects and searching for patterns in the display name, description, or other fields. To automate this, we tried to find these patterns with Gemini.
Data Model
For this use case, we would like Gemini to return the names of the accounts it clusters together. During initial testing the results were quite random, but after adding a "reason" field the results improved dramatically. We therefore used the following Pydantic model:
class User(BaseModel):
    accounts: list[Account] = Field(
        description="accounts that probably belongs to this user", default=[]
    )
    reason: str = Field(
        description="Reason why these accounts belong to this user", default=""
    )

class UserAccountResults(BaseModel):
    users: list[User] = Field(description="users with multiple accounts", default=[])
Prompt
In the prompt, we describe how we want the accounts to be clustered:
Given the data, cluster the accounts that belong to a single person
by checking for similarities in the name, displayname and sam.
Only include results that are likely to be the same user. Only include
results when there is a user with multiple accounts. It is possible
that a user has more than two accounts. Please specify a reason
why those accounts belong to the same user. Use the first (name)
field to identify the accounts.
Results
The test dataset had about 900 users. We manually determined that some users have two to four accounts with various permissions. Some of these accounts had the same pattern like “user@test.local” and “adm-user@test.local.” However, other accounts had patterns where the admin account was based on the first couple of letters. For example, their main account had the pattern matthijs.gielen@test.local, and the admin account was named: adm-magi@test.local. To keep track of those accounts, the description of the admin account contained some text similar to “admin account of Matthijs Gielen.”
With this prompt, Gemini clustered 50 groups of accounts in our dataset. After manual verification, some of the results were discarded because the cluster contained only one account, leaving 43 correct clusters. Manual analysis found the same correlations, but where Gemini produced this information in a couple of minutes, it took quite a bit longer to analyze and correlate all the accounts by hand. This information was used in preparation for further attacks, as shown in the next use case.
Use Case 5: Correlation Between Users and Their Machines
Knowing which users to target or defend is often not enough. We also need to find them within the network in order to compromise them. Domain administrators are (usually) physical people; they need somewhere to type in their commands and perform administrative actions. This means that we need to correlate which domain administrator is working from which workstation. This is called session information, and BloodHound uses this information in an edge called “HasSession.”
In the past, it was possible to get all session information with a regular user during red teaming.
Using the technique in Use Case 4, we can correlate the different user accounts that one employee may have. The next step is to figure out which workstation belongs to that employee. Then we can target that workstation, and from there, hopefully recover the passwords of their administrator accounts.
In this case, employees have corporate laptops, and the company needs to keep track of which laptop belongs to which employee. This information is often stored in one of the fields of the computer object in the Active Directory. However, administrators record it in many different ways, and parsing that unstructured data is another task where Gemini can help.
Data Model
This model is quite simple: we just want to correlate machines to their users, and have Gemini give a reason why, which improves the quality of the output. Because we send all users and all computers at once, we need a list of results.
class UserComputerCorrelation(BaseModel):
    user: str = Field(description="name of the user")
    computer: str = Field(description="name of the computer")
    reason: str = Field(
        description="Reason why these accounts belong to this user", default=""
    )

class CorrelationResults(BaseModel):
    results: list[UserComputerCorrelation] = Field(
        description="users and computers that correlate", default=[]
    )
Prompt
In the prompt, we describe the correlation we are looking for:
Given the two data sets, find the computer that correlates
to a user by checking for similarities in the name, displayname
and sam. Only include results that are likely to correspond.
Please specify a reason why that user and computer correlates.
Use the first (name) field to identify the users and computers.
Results
The dataset contains around 900 users and 400 computers. During the engagement, we determined that the administrators correlated users and their machines via the description field of the computer, which closely matched the display name of the user. Gemini picked up this connection and correctly correlated around 120 users to their respective laptops (Figure 3).
Figure 3: Connections between user and laptop as correlated by Gemini
Gemini helped us select an appropriate workstation, enabling lateral movement to it and recovery of an administrator's password, getting us closer to our goal.
To defend against these threats, it can be valuable to run tools like BloodHound in the network. As discussed, BloodHound might not find all the “hidden” edges in your network, but you can add these yourself to the graph. This will allow you to find more Active Directory-based attack paths that are possible in your network and mitigate these before an attacker has an opportunity to exploit those attack paths.
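For example, session information recovered through the correlation above can be written back into the BloodHound graph so that it shows up in path queries. A minimal sketch, assuming a local BloodHound neo4j instance, illustrative credentials, and node names that follow BloodHound's usual naming conventions:

from neo4j import GraphDatabase

# Illustrative connection details; adjust to your BloodHound neo4j instance.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "bloodhound"))

# Hypothetical (user, computer) pairs produced by the correlation in Use Case 5.
correlations = [
    ("ADM-MAGI@TEST.LOCAL", "WS0001.TEST.LOCAL"),
]

# In BloodHound, session edges run from the computer to the user.
query = """
MATCH (c:Computer {name: $computer}), (u:User {name: $user})
MERGE (c)-[:HasSession]->(u)
"""

with driver.session() as session:
    for user, computer in correlations:
        session.run(query, user=user, computer=computer)
driver.close()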
Conclusion
In this blog post, we looked at processing red team data using LLMs to aid in adversarial emulation or improving defenses. These use cases were related to processing human-generated, unstructured data. Table 3 summarizes the results.
Use Case | Accuracy of the Results | Usefulness
Roles | High: There were a few false positives that were in the gray area. | High: Especially when going through a large list of user roles, this approach provides fairly fast results.
Credentials in files | High: Found more credentials than comparable tools. More testing should look into the false-positive rate in real scenarios. | Medium: This approach finds a lot more results; however, processing with Gemini is much slower (about five seconds per file) than many alternatives.
High-value targets | Medium: Not all results were new, nor were all of them high-value targets. | Medium: Some of the results were useful; however, all of them still require manual verification.
Account clustering | High: After discarding the clusters containing only one account, the remaining ones were clustered well. | High: Clustering users manually is usually a tedious process, and this approach gives fairly reliable results once single-account clusters are filtered out.
Computer correlation | High: All results correctly correlated users to their computers. | High: This approach produces accurate results, potentially providing insight into additional attack paths.
Table 3: The results of our experiments of data processing with Gemini
As the results show, using an LLM like Gemini can help in converting this type of data into structured data to aid attackers and defenders. However, keep in mind that LLMs are not a silver bullet and have limitations. For example, they can sometimes produce false positives or be slow to process large amounts of data.
There are quite a few use cases we have not covered in this blog post. Some other examples where you can use this approach are:
Correlating user groups to administrator privileges on workstations and servers
Summarizing internal website content or documentation to search for target systems
Ingesting documentation to generate password candidates for cracking passwords
The Future
This is just an initial step that we, the Advanced Capabilities team on the Mandiant Red Team, have taken in using LLMs for adversarial emulation and defense. Next, we plan to improve the models and prompts by testing variations, and to investigate other data sources to see whether Gemini can help analyze them. We are also looking at using linear regression models, as well as clustering and pathfinding algorithms, to enable cybersecurity practitioners to quickly evaluate attack paths that may exist in a network.
Managing applications across multiple Kubernetes clusters is complex, especially when those clusters span different environments or even cloud providers. One powerful and secure solution combines Google Kubernetes Engine (GKE) fleets and Argo CD, a declarative, GitOps continuous delivery tool for Kubernetes. The solution is further enhanced with Connect Gateway and Workload Identity.
This blog post guides you through setting up a robust, team-centric multi-cluster infrastructure with these offerings. We use a sample GKE fleet with application clusters for your workloads and a control cluster to host Argo CD. To streamline authentication and enhance security, we leverage Connect Gateway and Workload Identity, enabling Argo CD to securely manage clusters without having to manage cumbersome Kubernetes service accounts.
On top of this, we incorporate GKE Enterprise Teams to manage access and resources, helping to ensure that each team has the right permissions and namespaces within this secure framework.
Finally, we introduce the fleet-argocd-plugin, a custom Argo CD generator designed to simplify cluster management within this sophisticated setup. This plugin automatically imports your GKE Fleet cluster list into Argo CD and maintains synchronized cluster information, making it easier for platform admins to manage resources and for application teams to focus on deployments.
Follow along as we:
Create a GKE fleet with application and control clusters
Deploy Argo CD on the control cluster, configured to use Connect Gateway and Workload Identity
Configure GKE Enterprise Teams for granular access control
Install and leverage the fleet-argocd-plugin to manage your secure, multi-cluster fleet with team awareness
By the end, you’ll have a powerful and automated multi-cluster system using GKE Fleets, Argo CD, Connect Gateway, Workload Identity, and Teams, ready to support your organization’s diverse needs and security requirements. Let’s dive in!
Set up multi-cluster infrastructure with GKE fleet and Argo CD
Setting up a sample GKE fleet is a straightforward process:
1. Enable the required APIs in the desired Google Cloud Project. We use this project as the fleet host project.
a. gcloud SDK must be installed, and you must be authenticated via gcloud auth login.
# Create a frontend team.
gcloud container fleet scopes create frontend

# Add your application clusters to the frontend team.
gcloud container fleet memberships bindings create app-cluster-1-b \
  --membership app-cluster-1 \
  --scope frontend \
  --location us-central1

gcloud container fleet memberships bindings create app-cluster-2-b \
  --membership app-cluster-2 \
  --scope frontend \
  --location us-central1

# Create a fleet namespace for webserver.
gcloud container fleet scopes namespaces create webserver --scope=frontend

# [Optional] Verify your fleet team setup.
# Check member clusters in your fleet.
gcloud container fleet memberships list
# Verify member clusters have been added to the right team (`scope`).
gcloud container fleet memberships bindings list --membership=app-cluster-1 --location=us-central1
gcloud container fleet memberships bindings list --membership=app-cluster-2 --location=us-central1
4. Now, set up Argo CD and deploy it to the control cluster. Create a new GKE cluster to serve as the Argo CD control cluster, and enable Workload Identity on it.
5. Install the Argo CD CLI to interact with the Argo CD API server. Version 2.8.0 or higher is required. Detailed installation instructions can be found via the CLI installation documentation.
Now you’ve got your GKE fleet up and running, and you’ve installed Argo CD on the control cluster. In Argo CD, application clusters are registered with the control cluster by storing their credentials (like API server address and authentication details) as Kubernetes Secrets within the Argo CD namespace. We’ve got a way to make this whole process a lot easier!
8. To make sure the fleet-argocd-plugin works as it should, give it the right permissions for fleet management.
a. Create an IAM service account in your Argo CD control cluster and grant it the appropriate permissions. The setup follows the official onboarding guide of GKE Workload Identity Federation.
gcloud iam service-accounts create argocd-fleet-admin \
  --project=$FLEET_PROJECT_ID

gcloud projects add-iam-policy-binding $FLEET_PROJECT_ID \
  --member "serviceAccount:argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/container.developer"

gcloud projects add-iam-policy-binding $FLEET_PROJECT_ID \
  --member "serviceAccount:argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/gkehub.gatewayEditor"

gcloud projects add-iam-policy-binding $FLEET_PROJECT_ID \
  --member "serviceAccount:argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com" \
  --role "roles/gkehub.viewer"

# Allow ArgoCD application controller and fleet-argocd-plugin to impersonate this IAM service account.
gcloud iam service-accounts add-iam-policy-binding argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:$FLEET_PROJECT_ID.svc.id.goog[argocd/argocd-application-controller]"
gcloud iam service-accounts add-iam-policy-binding argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:$FLEET_PROJECT_ID.svc.id.goog[argocd/argocd-fleet-sync]"

# Annotate the Kubernetes ServiceAccount so that GKE sees the link between the service accounts.
kubectl annotate serviceaccount argocd-application-controller \
  --namespace argocd \
  iam.gke.io/gcp-service-account=argocd-fleet-admin@$FLEET_PROJECT_ID.iam.gserviceaccount.com
b. You also need to allow the Google Compute Engine service account to access images from your artifacts repository.
Let’s do a quick check to make sure the GKE fleet and Argo CD are playing nicely together. You should see that the secrets for your application clusters have been automatically generated.
kubectl get secret -n argocd

# Example Output: TYPE DATA AGE
# app-cluster-1.us-central1.141594892609 Opaque 3 64m
# app-cluster-2.us-central1.141594892609 Opaque 3 64m
Demo 1: Automatic fleet management in Argo CD
Okay, let’s see how this works! We’ll use the guestbook example app. First, we deploy it to the clusters that the frontend team uses. You should then see the guestbook app running on your application clusters, and you won’t have to deal with any cluster secrets manually!
export TEAM_ID=frontend
envsubst '$FLEET_PROJECT_NUMBER $TEAM_ID' < applicationset-demo.yaml | kubectl apply -f - -n argocd

kubectl config set-context --current --namespace=argocd
argocd app list -o name
# Example Output:
# argocd/app-cluster-1.us-central1.141594892609-webserver
# argocd/app-cluster-2.us-central1.141594892609-webserver
Demo 2: Evolving your fleet is easy with fleet-argocd-plugin
Suppose you decide to add another cluster to the frontend team. Create a new GKE cluster and assign it to the frontend team. Then, check to see if your guestbook app has been deployed on the new cluster.
gcloud container clusters create app-cluster-3 --enable-fleet --region=us-central1
gcloud container fleet memberships bindings create app-cluster-3-b \
  --membership app-cluster-3 \
  --scope frontend \
  --location us-central1

argocd app list -o name
# Example Output: a new app shows up!
# argocd/app-cluster-1.us-central1.141594892609-webserver
# argocd/app-cluster-2.us-central1.141594892609-webserver
# argocd/app-cluster-3.us-central1.141594892609-webserver
Closing thoughts
In this blog post, we’ve shown you how to combine the power of GKE fleets, Argo CD, Connect Gateway, Workload Identity, and GKE Enterprise Teams to create a robust and automated multi-cluster platform. By leveraging these tools, you can streamline your Kubernetes operations, enhance security, and empower your teams to efficiently manage and deploy applications across your fleet.
However, this is just the beginning! There’s much more to explore in the world of multi-cluster Kubernetes. Here are some next steps to further enhance your setup:
Deep dive into GKE Enterprise Teams: Explore the advanced features of GKE Enterprise Teams to fine-tune access control, resource allocation, and namespace management for your teams. Learn more in the official documentation.
Secure your clusters with Connect Gateway: Delve deeper into Connect Gateway and Workload Identity to understand how they simplify and secure authentication to your clusters, eliminating the need for VPNs or complex network configurations. Check out this blog post for a detailed guide.
Master advanced deployment strategies: Explore advanced deployment strategies with Argo CD, such as blue/green deployments, canary releases, and automated rollouts, to achieve zero-downtime deployments and minimize risk during updates. This blog post provides a great starting point.
As you continue your journey with multi-cluster Kubernetes, remember that GKE fleets and Argo CD provide a solid foundation for building a scalable, secure, and efficient platform. Embrace the power of automation, GitOps principles, and team-based management to unlock the full potential of your Kubernetes infrastructure.
As AI models increase in sophistication, there’s increasingly large model data needed to serve them. Loading the models and weights along with necessary frameworks to serve them for inference can add seconds or even minutes of scaling delay, impacting both costs and the end-user’s experience.
For example, inference servers such as Triton, Text Generation Inference (TGI), or vLLM are packaged as containers that are often over 10GB in size; this can make them slow to download, and extend pod startup times in Kubernetes. Then, once the inference pod starts, it needs to load model weights, which can be hundreds of GBs in size, further adding to the data loading problem.
This blog explores techniques to accelerate data loading for both inference serving containers and downloading models + weights, so you can accelerate the overall time to load your AI/ML inference workload on Google Kubernetes Engine (GKE).
1. Accelerating container load times using secondary boot disks to cache container images with your inference engine and applicable libraries directly on the GKE node.
The image above shows a secondary boot disk (1) that stores the container image ahead of time, avoiding the image download process during pod/container startup. And for AI/ML inference workloads with demanding speed and scale requirements, Cloud Storage Fuse (2) and Hyperdisk ML (3) are options to connect the pod to model + weight data stored in Cloud Storage or a network-attached disk. Let's look at each of these approaches in more detail below.
Accelerating container load times with secondary boot disks
GKE lets you pre-cache your container image into a secondary boot disk that is attached to your node at creation time. The benefit of loading your containers this way is that you skip the image download step and can begin launching your containers immediately, which drastically improves startup time. The diagram below shows container image download times grow linearly with container image size. Those times are then compared with using a cached version of the container image that is pre-loaded on the node.
Caching a 16GB container image ahead of time on a secondary boot disk has shown reductions in load time of up to 29x when compared with downloading the container image from a container registry. Additionally, this approach lets you benefit from the acceleration independent of container size, allowing for large container images to be loaded predictably fast!
To use secondary boot disks, first create a disk that contains all of your container images, create a disk image from it, and then specify that disk image as a secondary boot disk when creating your GKE node pools. For more, see the documentation.
Accelerating model weights load times
Many ML frameworks output their checkpoints (snapshots of model weights) to object storage such as Google Cloud Storage, a common choice for long-term storage. Using Cloud Storage as the source of truth, there are two main products to retrieve your data at the GKE-pod level: Cloud Storage Fuse and Hyperdisk ML (HdML).
When selecting one product or the other there are two main considerations:
Performance – how quickly can the data be loaded by the GKE node
Operational simplicity – how easy is it to update this data
Cloud Storage Fuse provides a direct link to Cloud Storage for model weights that reside in object storage buckets. Additionally there is a caching mechanism for files that need to be read multiple times to prevent additional downloads from the source bucket (which adds latency). Cloud Storage Fuse is appealing because there are no pre-hydration operational activities for a pod to do to download new files in a given bucket. It’s important to note that if you switch buckets that the pod is connected to, you will need to restart the pod with an updated Cloud Storage Fuse configuration. To further improve performance, you can enable parallel downloads, which spawns multiple workers to download a model, significantly improving model pull performance.
Hyperdisk ML gives you better performance and scalability than downloading files directly to the pod from Cloud Storage or another online location. Additionally, you can attach up to 2,500 nodes to a single Hyperdisk ML instance, with aggregate bandwidth of up to 1.2 TiB/s. This makes it a strong choice for inference workloads that span many nodes and where the same data is downloaded repeatedly in a read-only fashion. To use Hyperdisk ML, load your data onto the Hyperdisk ML disk before using it, and again upon each update. Note that this adds operational overhead if your data changes frequently.
Which model + weight loading product you use depends on your use case. For example, Hyperdisk ML volumes are zonal, although data can be made regional with an automated GKE clone feature that makes it available across zones, and updating the data means creating a new persistent volume, loading the new data, and redeploying the pods whose PVC references the new volume. As you can see, there are other considerations besides throughput to take into account when architecting a performant model-loading strategy.
Conclusion
Loading large AI models, weights, and container images for GKE-based AI workloads can delay startup times. By using a combination of the methods described above, secondary boot disks for container images and Hyperdisk ML or Cloud Storage Fuse for models and weights, you can accelerate data load times for your AI/ML inference applications.
As generative AI evolves, we’re beginning to see the transformative potential it is having across industries and our lives. And as large language models (LLMs) increase in size — current models are reaching hundreds of billions of parameters, and the most advanced ones are approaching 2 trillion — the need for computational power will only intensify. In fact, training these large models on modern accelerators already requires clusters that exceed 10,000 nodes.
With support for 15,000-node clusters — the world’s largest — Google Kubernetes Engine (GKE) has the capacity to handle these demanding training workloads. Today, in anticipation of even larger models, we are introducing support for 65,000-node clusters.
With support for up to 65,000 nodes, we believe GKE offers more than 10X larger scale than the other two largest public cloud providers.
Unmatched scale for training or inference
Scaling to 65,000 nodes provides much-needed capacity to the world’s most resource-hungry AI workloads. Combined with innovations in accelerator computing power, this will enable customers to reduce model training time or scale models to multi-trillion parameters or more. Each node is equipped with multiple accelerators (e.g., Cloud TPU v5e node with four chips), giving the ability to manage over 250,000 accelerators in one cluster.
To develop cutting-edge AI models, customers need to be able to allocate computing resources across diverse workloads. This includes not only model training but also serving, inference, conducting ad hoc research, and managing auxiliary tasks. Centralizing computing power within the smallest number of clusters provides customers the flexibility to quickly adapt to changes in demand from inference serving, research and training workloads.
With support for 65,000 nodes, GKE now allows running five jobs in a single cluster, each matching the scale of Google Cloud’s previous world record for the world’s largest training job for LLMs.
Customers on the cutting edge of AI welcome these developments. Anthropic is an AI safety and research company that’s working to build reliable, interpretable, and steerable AI systems, and is excited for GKE’s expanded scale.
“GKE’s new support for larger clusters provides the scale we need to accelerate our pace of AI innovation.” – James Bradbury, Head of Compute, Anthropic
Innovations under the hood
This achievement is made possible by a variety of enhancements. For one, we are transitioning GKE from etcd, the open-source distributed key-value store, to a new, more robust key-value store based on Spanner, Google's distributed database that delivers virtually unlimited scale. On top of the ability to support larger GKE clusters, this change will usher in new levels of reliability for GKE users, providing improved latency of cluster operations (e.g., cluster startup and upgrades) and a stateless cluster control plane. By implementing the etcd API for our Spanner-based storage, we help ensure backward compatibility and avoid having to make changes in core Kubernetes to adopt the new technology.
In addition, thanks to a major overhaul of the GKE infrastructure that manages the Kubernetes control plane, GKE now scales significantly faster, meeting the demands of your deployments with fewer delays. This enhanced cluster control plane delivers multiple benefits, including the ability to run high-volume operations with exceptional consistency. The control plane now automatically adjusts to these operations, while maintaining predictable operational latencies. This is particularly important for large and dynamic applications such as SaaS, disaster recovery and fallback, batch deployments, and testing environments, especially during periods of high churn.
We’re also constantly innovating on IaaS and GKE capabilities to make Google Cloud the best place to build your AI workloads. Recent innovations in this space include:
Secondary boot disk, which provides faster workload startups through container image caching
Custom compute classes, which offer greater control over compute resource allocation and scaling
Support for Trillium, our sixth-generation TPU, the most performant and most energy-efficient TPU to date
Support for A3 Ultra VM powered by NVIDIA H200 Tensor Core GPUs with our new Titanium ML network adapter, which delivers non-blocking 3.2 Tbps of GPU-to-GPU traffic with RDMA over Converged Ethernet (RoCE). A3 Ultra VMs will be available in preview next month.
A continued commitment to open source
Guided by Google’s long-standing and robust open-source culture, we make substantial contributions to the open-source community, including when it comes to scaling Kubernetes. With support for 65,000-node clusters, we made sure that all necessary optimizations and improvements for such scale are part of the core open-source Kubernetes.
Our investments to make Kubernetes the best foundation for AI platforms go beyond scalability. Here is a sampling of our contributions to the Kubernetes project over the past two years:
Incubated the K8S Batch Working Group to build a community around research, HPC and AI workloads, producing tools like Kueue.sh, which is becoming the de facto standard for job queueing on Kubernetes
Created the JobSet operator, which is being integrated into the Kubeflow ecosystem to help run heterogeneous jobs (e.g., driver-executor)
For multihost inference use cases, created the Leader Worker Set controller
Published JetStream, a highly optimized internal model server
Incubated the Kubernetes Serving Working Group, which is driving multiple efforts including model metrics standardization, Serving Catalog and Inference Gateway
At Google Cloud, we’re dedicated to providing the best platform for running containerized workloads, consistently pushing the boundaries of innovation. These new advancements allow us to support the next generation of AI technologies. For more, listen to the Kubernetes podcast, where Maciek Rozacki and Wojtek Tyczynski join host Kaslin Fields to talk about GKE’s support for 65,000 nodes. You can also see a demo on 65,000 nodes on a single GKE cluster here.
Rapidly evolving generative AI models place unprecedented demands on the performance and efficiency of hardware accelerators. Last month, we launched our sixth-generation Tensor Processing Unit (TPU), Trillium, to address the demands of next-generation models. Trillium is purpose-built for performance at scale, from the chip to the system to our Google data center deployments, to power ultra-large scale training.
Today, we present our first MLPerf training benchmark results for Trillium. The MLPerf 4.1 training benchmarks show that Trillium delivers up to 1.8x better performance-per-dollar compared to prior-generation Cloud TPU v5p and an impressive 99% scaling efficiency (throughput).
In this blog, we offer a concise analysis of Trillium’s performance, demonstrating why it stands out as the most performant and cost-efficient TPU training system to date. We begin with a quick overview of system comparison metrics, starting with traditional scaling efficiency. We introduce convergence scaling efficiency as a crucial metric to consider in addition to scaling efficiency. We assess these two metrics along with performance per dollar and present a comparative view of Trillium against Cloud TPU v5p. We conclude with guidance that you can use to make an informed choice for your cloud accelerators.
Traditional performance metrics
Accelerator systems can be evaluated and compared across multiple dimensions, ranging from peak throughput to effective throughput to throughput scaling efficiency. Each of these metrics is a helpful indicator, but none of them takes convergence time into consideration.
Hardware specifications and peak performance
Traditionally, comparisons have focused on hardware specifications like peak throughput, memory bandwidth, and network connectivity. While these peak values establish theoretical boundaries, they are poor predictors of real-world performance, which depends heavily on architectural design and software implementation. Since modern ML workloads typically span hundreds or thousands of accelerators, the key metric is the effective throughput of an appropriately sized system for a specific workload.
Utilization performance
System performance can be quantified through utilization metrics like effective model FLOPS utilization (EMFU) and memory bandwidth utilization (MBU), which measure achieved throughput versus peak capacity. However, these hardware efficiency metrics don’t directly translate to business-value measures like training time or model quality.
Scaling efficiency and trade-offs
A system’s scalability is evaluated through both strong scaling (performance improvement with system size for fixed workloads) and weak scaling (efficiency when increasing both workload and system size proportionally). While both metrics are valuable indicators, the ultimate goal is to achieve high-quality models quickly, sometimes making it worthwhile to trade scaling efficiency for faster training time or better model convergence.
The need for convergence scaling efficiency
While hardware utilization and scaling metrics provide important system insights, convergence scaling efficiency focuses on the fundamental goal of training: reaching model convergence efficiently. Convergence refers to the point where a model’s output stops improving and the error rate becomes constant. Convergence scaling efficiency measures how effectively additional computing resources accelerate the training process to completion.
We define convergence scaling efficiency using two key measurements: the base case, where a cluster of N₀ accelerators achieves convergence in time T₀, and a scaled case with N₁ accelerators taking time T₁ to converge. The ratio of the speedup in convergence time to the increase in cluster size gives us:
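convergence scaling efficiency = (T₀ / T₁) / (N₁ / N₀)

where T₀ / T₁ is the speedup in time to convergence and N₁ / N₀ is the increase in cluster size, matching the definitions above. For example, a 2.4x speedup on a cluster 3x the size gives 2.4 / 3 = 0.8.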
A convergence scaling efficiency of 1 indicates that time-to-solution improves by the same ratio as the cluster size. It is therefore desirable to have convergence scaling efficiency as close to 1 as possible.
Now let's apply these concepts to understand our MLPerf submission for the GPT3-175b training task using Trillium and Cloud TPU v5p.
Trillium performance
We submitted GPT3-175b training results for four different Trillium configurations, and three different Cloud TPU v5p configurations. In the following analysis, we group the results by cluster sizes with the same total peak flops for comparison purposes. For example, the Cloud TPU v5p-4096 configuration is compared to 4xTrillium-256, and Cloud TPU v5p-8192 is compared with 8xTrillium-256, and so on.
All results presented in this analysis are based on MaxText, our high-performance reference implementation for Cloud TPUs and GPUs.
Weak scaling efficiency
For increasing cluster sizes with proportionately larger batch-sizes, both Trillium and TPU v5p deliver near linear scaling efficiency:
Figure-1: Weak scaling comparison for Trillium and Cloud TPU v5p. v5p-4096 and 4xTrillium-256 are considered as base for scaling factor measurement. n x Trillium-256 corresponds to n Trillium pods with 256 chips in one ICI domain. v5p-n corresponds to n/2 v5p chips in a single ICI domain.
Figure 1 demonstrates relative throughput scaling as cluster sizes increase from the base configuration. Trillium achieves 99% scaling efficiency even when operating across data-center networks using Cloud TPU multislice technology, outperforming the 94% scaling efficiency of Cloud TPU v5p cluster within a single ICI domain. For these comparisons, we used a base configuration of 1024 chips (4x Trillium-256 pods), establishing a consistent baseline with the smallest v5p submission (v5p-4096; 2048 chips). When measured against our smallest submitted configuration of 2x Trillium-256 pods, Trillium maintains a strong 97.6% scaling efficiency.
Convergence scaling efficiency
As stated above, weak scaling is useful but not a sufficient indicator of value, while convergence scaling efficiency brings time-to-solution into consideration.
Figure-2: Convergence scaling comparison for Trillium and Cloud TPU v5p.
For the largest cluster size, we observed comparable convergence scaling efficiency for Trillium and Cloud TPU v5p. In this example, a CSE of 0.8 means that for the rightmost configuration, the cluster size was 3x the (base) configuration, while the time to convergence improved by 2.4x with respect to the base configuration (2.4/3 = 0.8).
While the convergence scaling efficiency is comparable between Trillium and TPU v5p, where Trillium really shines is by delivering the convergence at a lower cost, which brings us to the last metric.
Cost-to-train
While weak scaling efficiency and convergence scaling efficiency indicate scaling properties of systems, we’ve yet to look at the most crucial metric: the cost to train.
Figure-3: Comparison of cost-to-train based on the wall-clock time and the on-demand list price for Cloud TPU v5p and Trillium.
Trillium lowers the cost to train by up to 1.8x (45% lower) compared to TPU v5p while delivering convergence to the same validation accuracy.
Making informed cloud accelerator choices
In this article, we explored the complexities of comparing accelerator systems, emphasizing the importance of looking beyond simple metrics to assess true performance and efficiency. We saw that while peak performance metrics provide a starting point, they often fall short in predicting real-world utility. Instead, metrics like Effective Model Flops Utilization (EMFU) and Memory Bandwidth Utilization (MBU) offer more meaningful insights into an accelerator’s capabilities.
We also highlighted the critical importance of scaling characteristics — both strong and weak scaling — in evaluating how systems perform as workloads and resources grow. However, the most objective measure we identified is the convergence scaling efficiency, which ensures that we’re comparing systems based on their ability to achieve the same end result, rather than just raw speed.
Applying these metrics to our benchmark submission with GPT3-175b training, we demonstrated that Trillium achieves comparable convergence scaling efficiency to Cloud TPU v5p while delivering up to 1.8x better performance per dollar, thereby lowering the cost-to-train. These results highlight the importance of evaluating accelerator systems through multiple dimensions of performance and efficiency.
For ML-accelerator evaluation, we recommend a comprehensive analysis combining resource utilization metrics (EMFU, MBU), scaling characteristics, and convergence scaling efficiency. This multi-faceted approach enables you to make data-driven decisions based on your specific workload requirements and scale.
Every November, we start sharing forward-looking insights on threats and other cybersecurity topics to help organizations and defenders prepare for the year ahead. The Cybersecurity Forecast 2025 report, available today, plays a big role in helping us accomplish this mission.
This year’s report draws on insights directly from Google Cloud’s security leaders, as well as dozens of analysts, researchers, responders, reverse engineers, and other experts on the frontlines of the latest and largest attacks.
Built on trends we are already seeing today, the Cybersecurity Forecast 2025 report provides a realistic outlook of what organizations can expect to face in the coming year. The report covers a lot of topics across all of cybersecurity, with a focus on various threats such as:
Attacker Use of Artificial Intelligence (AI): Threat actors will increasingly use AI for sophisticated phishing, vishing, and social engineering attacks. They will also leverage deepfakes for identity theft, fraud, and bypassing security measures.
AI for Information Operations (IO): IO actors will use AI to scale content creation, produce more persuasive content, and enhance inauthentic personas.
The Big Four: Russia, China, Iran, and North Korea will remain active, engaging in espionage operations, cyber crime, and information operations aligned with their geopolitical interests.
Ransomware and Multifaceted Extortion: Ransomware and multifaceted extortion will continue to be the most disruptive form of cyber crime, impacting various sectors and countries.
Infostealer Malware: Infostealer malware will continue to be a major threat, enabling data breaches and account compromises.
Democratization of Cyber Capabilities: Increased access to tools and services will lower barriers to entry for less-skilled actors.
Compromised Identities: Compromised identities in hybrid environments will pose significant risks.
Web3 and Crypto Heists: Web3 and cryptocurrency organizations will increasingly be targeted by attackers seeking to steal digital assets.
Faster Exploitation and More Vendors Targeted: The time to exploit vulnerabilities will continue to decrease, and the range of targeted vendors will expand.
Be Prepared for 2025
Read the Cybersecurity Forecast 2025 report for a more in-depth look at these and other threats, as well as other security topics such as post-quantum cryptography, and insights unique to the JAPAC and EMEA regions.
For an even deeper look at the threat landscape next year, register for our Cybersecurity Forecast 2025 webinar, which will be hosted once again by threat expert Andrew Kopcienski.
For even more insights, hear directly from our security leaders: Charles Carmakal, Sandra Joyce, Sunil Potti, and Phil Venables.
Have you heard of the monkey and the pedestal? Astro Teller, the head of Google’s X “moonshot factory,” likes to use this metaphor to describe tackling the biggest challenge first, despite being tempted by the endorphin boost of completing more familiar tasks.
It’s a challenge startups know well. When you’re re-inventing the industry standard, it’s all about failing fast. You’re looking for the quickest way to get to a “no” so you’re another step closer to reaching a “yes.” Every day you gain back from abandoning trivial features in favor of focusing on the biggest challenge becomes a day closer to your goal.
Fortunately, AI is not only playing an increasing role in the offerings of startups but also how they build those offerings, accelerating their execution and giving them new insights to act faster and iterate better.
What's the fastest way you're going to get your product launched? Piecing together data across your front and back ends in yet another platform only creates latency and a poor user experience. Many of the successful, funded gen AI startups — more than 60% of whom are building on Google Cloud — are using Vertex AI as their development host and production backbone to accelerate innovation. In this moment of rapid transformation, every day matters.
Our mission at Google Cloud is to support ambitious startups, like the three profiled below who are driving innovation in customer service, healthcare research, and identity verification. Abstrakt, NextNet, and Ferret are among the long list of startups using Google Cloud’s AI-optimized infrastructure and Vertex AI platform to accelerate their innovation.
NextNet
NextNet is a specialized search engine for life sciences and pharmaceutical researchers that uses AI to analyze vast amounts of biomedical data. Leveraging Google Cloud Vertex AI and Gemini, it identifies hidden relationships and patterns within scientific literature, allowing researchers to ask complex questions in plain language and receive accurate answers. This accelerates research and drives innovation in medicine by facilitating a deeper understanding of complex biomedical information.
Specifically, NextNet uses Gemini for natural language processing and knowledge extraction, outperforming other commercial AI models in this domain. It also utilizes Vertex AI and other managed services to efficiently develop SaaS offerings and scale its knowledge base.
“Gemini, as a production platform, has been incredibly useful and allowed us to evaluate scientific research with subtlety and clarity,” Steven Banerjee, the CEO of NextNet, said. “On our specific language tasks, Gemini has equaled or outperformed other commercial AI models. We are extracting scientific insights now that would not have been possible 12 or 18 months ago. And the iteration speed of Google’s generative models has meant that we are staying state of the art.”
Abstrakt
Abstrakt focuses on enhancing contact center customer experiences through the use of generative AI. They leverage Google Cloud’s robust infrastructure and the Vertex AI suite to transcribe calls in real-time while simultaneously evaluating sentiment.
Their mission is to empower teams to have more meaningful and effective conversations with customers in real time, helping both call center workers and their customers resolve issues faster, so even more can get the help they need. Abstrakt aims to achieve this by providing instantaneous guidance and insights during calls, transparent progress tracking, and AI-guided coaching, leading to continued improvement for workers and customers alike.
Ferret.ai
Ferret.ai is using AI to offer transparent insights about the backgrounds of people in your personal and professional network. In a world where reputational risks seem to be growing and rarely go away thanks to digital “receipts,” Ferret is using world-class global data alongside AI to provide a curated relationship intelligence and monitoring solution to help avoid risk and identify opportunities.
The unique platform built by Ferret.ai pieces together information and finds patterns by using generative AI to analyze information, verify the source, assess its credibility, and achieve contextual understanding that identifies sentiment. They also use pattern recognition to analyze vast datasets to uncover potential red flags or inconsistencies that could be missed by human analysts. This is valuable for investors, businesses, and individuals who want to avoid scams, make smart partnerships, and ensure their safety.
Faster innovation, faster time to market
These founders saw significant pain points and directed all of their resources to solving these problems for their customers. Deploying packaged back-end solutions, like Vertex AI’s unified development platform, benefited their speed to market. When Google Cloud takes care of model accuracy and performance, you’re freed up to own what you do best.
Your needs as a startup can evolve quickly based on the dynamics of the market. Importantly, our open ecosystem of models and APIs offers flexibility as you adapt and grow.
Go tackle your biggest challenges and let Google Cloud provide you with the most secure, fast, scalable platform so you can focus on the solutions that matter most to your users. For help getting started, you can apply for the Google Cloud for Startups Program or reach out to one of our startup specialists today.
At Google Cloud, we’re fortunate to partner with organizations that employ some of the world’s most talented and innovative professionals. Together, we’re reshaping industries, driving customer success, and pushing the boundaries of what’s possible. Our partners are more than collaborators — they’re the change-makers defining the future of business.
The Google Cloud Partner All-stars program celebrates these remarkable people. Each year, we recognize those who go above and beyond, leading with passion, innovation, and a commitment to excellence. These are the people driving our industry forward, and we’re thrilled to honor them for 2024.
2024 Spotlight: Artificial Intelligence
For 2024, we’re excited to introduce a new category that highlights the power and potential of Artificial Intelligence (AI). As AI redefines the business and technology landscape, we’re proud to recognize those who are not just using AI, but actively shaping its future.
The Artificial Intelligence category honors those visionary leaders spearheading AI initiatives with bold ideas, experimentation, and ethical stewardship. They’re bringing AI from concept to reality, unlocking new possibilities, and driving meaningful results for their clients. These Partner All-stars are building the future, one breakthrough at a time.
What sets Partner All-stars apart?
The following attributes define the standout qualities of a Partner All-star:
Artificial Intelligence
Provides a clear vision for AI’s transformative potential in the business
Champions AI initiatives by securing resources, driving adoption, and promoting collaboration
Leads experimentation with AI, generating innovative solutions and tangible results for clients
Demonstrates a commitment to ethical AI practices, ensuring responsible and fair use
Delivery excellence
Top-ranked individuals on Google Cloud’s Delivery Readiness Portal (DRP)
Demonstrates commitment to technical excellence by passing advanced delivery challenge labs and other advanced technical training
Demonstrates excellent knowledge and adoption of Google Cloud delivery enablement methodologies, assets, and offerings
Exhibits expertise through customer project and deployment experience
Marketing
Drives strategic programs and key events that address customer concerns and interests
Works across cross-functional teams to ensure the success of key campaigns and events
Takes a data-driven approach to marketing, investing time and resources in programs that drive the biggest impact
Always exploring areas of opportunity and improvement in order to uplevel future work
Sales
Consistently meets and exceeds sales goals and targets
Aligns on shared goals to deliver amazing end-to-end customer experiences
Prioritizes long-term customer-relationship building over short-term selling
Demonstrates commitment to the customer transformation journey
Solutions engineering
Delivers superior customer experiences by keeping professional skills up to date, earning at least one Google technical certification
Embraces customer challenges head-on, taking responsibility for end-to-end solutioning
Works with purpose, providing deliverables in a timely manner while never compromising quality
Works effectively across joint product areas, leveraging technology in new and innovative ways to address customer needs
Celebrating excellence in 2024
On behalf of the entire Google Cloud team, I want to extend our heartfelt congratulations to the 2024 Google Cloud Partner All-stars, whom we have notified of this distinction. Their dedication, innovation, and leadership continue to inspire us and drive success for our customers.
Stay tuned as we celebrate this year’s Partner All-stars and join the conversation by following #PartnerAllstars across social media.
Leveraging first-party data, and improving data quality in general, are major priorities for online retailers. While first-party data certainly comes with challenges, it also offers a great opportunity to increase transparency, redefine customer interactions, and create more meaningful user experiences.
Here at PUMA, we’re already taking steps to seize the opportunities presented by signal loss as organizations embrace privacy-preserving technologies. Our motto “Forever.Faster.” isn’t just about athletic performance; it also describes our rapid response to market changes. To that end, we’re partnering with Google Cloud to leverage the capabilities of machine learning (ML) for greater customer engagement via advanced audience segmentation.
Moving from manual segmentation to advanced audiences
In August 2022 we decided to test Google Cloud’s machine-learning capabilities to create advanced audiences based on high purchase propensity with different data sets in BigQuery. While Google Analytics offers predictive audiences, we used this pilot to build a custom ML model tailored to our specific needs, deepening our expertise and giving us more control over the underlying data. Designing our own machine learning model allows us to analyze and extract valuable insights from first-party data, enable predictive analytics, and attribute conversions and interactions to the right touchpoints.
The core products used in the process included Cloud Shell for framework setup, Instant BQML as the quick-start tool for audience configuration, CRMint for orchestration, and BigQuery for advanced analytics capabilities. The modeling and machine learning occur within BigQuery, while CRMint aids in data integration and audience creation within Google Analytics. When Google Analytics is linked to Google Ads, audience segments are shared automatically with Google Ads, where they can be activated in a number of strategic ways.
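For readers curious what the modeling step can look like in practice, here is a minimal sketch of a purchase-propensity model built with BigQuery ML and driven from Python. It is not our actual pipeline: the project, dataset, table, and feature names are hypothetical, and in our setup the orchestration and audience export to Google Analytics were handled by Instant BQML and CRMint rather than hand-written code.

```python
# A minimal sketch of a purchase-propensity pipeline in BigQuery ML, run via
# the google-cloud-bigquery client. The project, dataset, table, and column
# names are hypothetical placeholders; the label column "purchased" is
# assumed to be a 0/1 integer.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # assumed project ID

# Train a logistic-regression propensity model on exported analytics features.
train_sql = """
CREATE OR REPLACE MODEL `my-project.audiences.purchase_propensity`
OPTIONS (model_type = 'LOGISTIC_REG', input_label_cols = ['purchased']) AS
SELECT
  sessions_last_30d,
  items_viewed_last_30d,
  engagement_time_msec,
  purchased
FROM `my-project.analytics_export.training_features`
"""
client.query(train_sql).result()  # wait for training to complete

# Score current users and keep the high-propensity segment for activation.
predict_sql = """
SELECT
  user_pseudo_id,
  (SELECT prob FROM UNNEST(predicted_purchased_probs) WHERE label = 1) AS propensity
FROM ML.PREDICT(
  MODEL `my-project.audiences.purchase_propensity`,
  TABLE `my-project.analytics_export.current_features`)
"""
for row in client.query(predict_sql).result():
    if row.propensity >= 0.8:  # illustrative threshold for the advanced audience
        print(row.user_pseudo_id, round(row.propensity, 3))
```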
The Google Cloud and gTech Ads teams worked closely with us throughout the set-up and deployment, which was fast and efficient. Generally speaking, we were impressed with the support we received throughout the process, which was highly collaborative from initiation to execution. The Google teams offered guidance and resources throughout, and their support enabled us to leverage the advanced analytics capabilities of BigQuery to build our own predictive audience model and identify the users most likely to make a purchase. We also appreciated the amount of available documentation, which made things much easier for our developers.
Engaging the right users with advanced analytics
This was one of the first ML marketing analytics use cases at PUMA, and it turned out to be a very positive experience. Within the first six months, the click-through rate (CTR) of our advanced audience segments was significantly higher than that of our other website visitor audiences, and indeed any other audience.
Among the 10 designated audiences, the top three showed a 149.8% increase in click-through rate compared to other audiences used for advertising. Additionally, we observed a 4.6% increase in conversion rate and a 6% increase in average order value (AOV) compared to the previous setup.
In addition to these results, which are helping us take steps towards increasing revenue, the new solutions are also enabling us to optimize and predict costs. Pricing is well structured, flexible, and transparent, and we can easily identify exactly where we’re spending money.
We’re looking forward to continuing to partner with Google Cloud as we work to adapt our advertising strategy to signal loss, which has been happening for years.
Our next step is to explore the development of advanced audiences using PUMA’s internal data, such as offline purchase information or other data not captured by Google Ads or Google Analytics. This opens up new opportunities to reach consumers we’re currently missing, while expanding the size of our audiences. At the same time, we’ll be scaling advanced audiences to all of our 20+ international entities.
We’re also exploring server-side tagging with Tag Manager, and in one market we’re experimenting with real-time reporting based on server-side data collection, with promising results so far.
We’re looking to implement an event-driven architecture leveraging Google Cloud’s services, as part of a broader strategy to reorganize and restructure our data-management processes so they better support and operationalize AI use cases for both our organization and our customers.
This project has opened our eyes to the possibilities of data-driven, ML-automated audience creation. The fact that it was so easy to deploy has also bolstered our confidence when it comes to machine-learning projects in general. We look forward to a long-term partnership with Google Cloud and are excited to see where the future will take us.
In today’s data- and AI-driven world, organizations are grappling with an ever-growing volume of structured and unstructured data. This growth makes it increasingly challenging to locate the right data at the right time, and a significant portion of enterprise data remains undiscovered or underutilized — what’s often referred to as “dark data.” In fact, a staggering 66% of organizations report that at least half of their data falls into this category.
To address this challenge, today we’re announcing automatic discovery and cataloging of Google Cloud Storage data with Dataplex, part of BigQuery’s unified platform for intelligent data to AI governance. This powerful capability empowers organizations to:
Automatically discover valuable data assets residing within Cloud Storage, including structured and unstructured data such as documents, files, PDFs, images, and more.
Harvest and catalog metadata for your discovered assets, keeping schema definitions up to date as data evolves with built-in compatibility checks and partition detection.
Enable analytics for data science and AI use cases at scale with auto-created BigLake, external or object tables, eliminating the need for data duplication or manually creating table definitions.
How it works
The automatic discovery and cataloging process in Dataplex is designed to be integrated and efficient, and performs the following steps:
Discovery scan: You configure a discovery scan using the BigQuery Studio UI or the gcloud CLI; the scan then crawls your Cloud Storage bucket, which can contain up to millions of files, identifying and classifying data assets.
Metadata extraction: Relevant metadata, including schema definitions and partition information, is extracted from the discovered assets.
Creation of dataset and tables in BigQuery: A new dataset with numerous BigLake, external, or object tables (for unstructured data) is automatically created in BigQuery with accurate, up-to-date table definitions. For scheduled scans, these tables are updated as the data in the Cloud Storage bucket evolves.
Analytics and AI preparation: The published dataset and tables are available for analysis, processing, data science, and AI use cases in BigQuery, as well as open-source engines like Spark, Hive, and Pig (a minimal query sketch follows this list).
Catalog integration: All BigLake tables are integrated into the Dataplex catalog, making them easily searchable and accessible.
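As a minimal sketch of step 4, once a scan has published its dataset, the auto-created tables behave like any other BigQuery tables. The project, dataset, and table names below are hypothetical placeholders; the actual names depend on your scan configuration and bucket layout.

```python
# Minimal sketch: inspect and query tables that a Dataplex discovery scan has
# auto-created in its published BigQuery dataset. Names are placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")

# List the BigLake / external / object tables the scan created.
for table in client.list_tables("my-project.discovered_bucket_dataset"):
    print(table.table_id, table.table_type)

# Query one of the auto-created tables directly; no manual DDL is required.
query = """
SELECT *
FROM `my-project.discovered_bucket_dataset.sales_csv`
LIMIT 10
"""
for row in client.query(query).result():
    print(dict(row))
```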
Key benefits
Dataplex’s automatic discovery and cataloging feature offers a multitude of benefits for organizations:
Enhanced data visibility: Gain a clear understanding of your data and AI assets across Google Cloud, eliminating the guesswork and reducing the time spent searching for relevant information.
Reduced manual effort: Cut back on the toil and effort of creating table definitions manually by letting Dataplex scan the bucket and create numerous BigLake tables that correspond to your data in Cloud Storage.
Accelerated analytics and AI: Integrate the data that’s discovered into your analytics and AI workflows, unlocking valuable insights and driving informed decision-making.
Simplified data access: Provide authorized users with easy access to the data they need, while maintaining appropriate security and control measures.
Automatic discovery and cataloging in Dataplex marks a significant step forward in helping organizations unlock the full potential of their data. By eliminating the challenges associated with dark data and providing a comprehensive, searchable catalog of your Cloud Storage assets, Dataplex empowers you to make data-driven decisions with confidence.
We encourage you to explore this powerful new feature and experience the benefits firsthand. To learn more and get started, please visit the Dataplex documentation or contact our team for assistance.
At Google Cloud, we recognize that helping customers and government agencies keep tabs on vulnerabilities plays a critical role in securing consumers, enterprises, and software vendors.
We have seen the Common Vulnerabilities and Exposure (CVE) system evolve into an essential part of building trust across the IT ecosystem. CVEs can help users of software and services identify vulnerabilities that require action, and they have become a global, standardized tracking mechanism that includes information crucial to identifying and prioritizing each vulnerability.
As part of our continued commitment to security and transparency on vulnerabilities found in our products and services, effective today we will be issuing CVEs for critical Google Cloud vulnerabilities, even when we do not require customer action or patching.
To help users easily recognize that a Google Cloud vulnerability does not require customer action, we will annotate the CVE record with the “exclusively-hosted-service” tag. No action is required by customers in relation to this announcement at this time.
“Transparency and shared action, to learn from and mitigate whole classes of vulnerability, is a vital part of countering bad actors. We will continue to lead and innovate across the community of defenders,” said Phil Venables, CISO, Google Cloud.
Our commitment to vulnerability transparency
The Cyber Safety Review Board (CSRB) has found that a lack of a strong commitment to security creates preventable errors and serious breaches, a pressing concern for major platform providers, who have a responsibility to advance security best practices. We can see why the CSRB emphasized best practices for cloud service providers in its report on Storm-0558, which detailed how the APT group used forged authentication tokens to gain access to email accounts at around 25 organizations, including government agencies.
By partnering with the industry through programs including Cloud VRP, and driving visibility on vulnerabilities with CVEs, we believe we are advancing security best practices at scale. CVEs are publicly disclosed and can be used by anyone to track and identify vulnerabilities, which has helped our customers to understand their security posture better. Ultimately, issuing CVEs helps us build your trust in Google Cloud as a secure cloud partner for your enterprise and business needs.
As we noted in our Secure By Design paper, Google has a 20-year history of collaborating with external security researchers, whose independent work discovering vulnerabilities has been helpful to Google. Our vulnerability reporting process encourages direct engagement as part of our community-based approach to addressing security concerns.
This same community-focused journey took us down the path of launching our first CVE Numbering Authority in 2011. Since then, we’ve issued more than 8,000 CVEs across our consumer and enterprise products. We’ve since expanded our partnership with MITRE, and Google became one of their four Top-Level Roots in 2022.
Today’s announcement marks an important step Google Cloud is making to normalize a culture of transparency around security vulnerabilities, and aligns with our shared fate model, in which we work with our customers to continuously improve security.
While the Google Cloud VRP has a specific focus on strengthening Google Cloud products and services, and brings together our engineers with external security researchers to further the security posture for all our customers, CVEs enable us to help our customers and security researchers track publicly-known vulnerabilities.
Earlier this year, Google Cloud launched the highly anticipated C4 machine series, built on the latest Intel Xeon Scalable processors (5th Gen Emerald Rapids) and setting a new industry-leading performance standard for both Google Compute Engine (GCE) and Google Kubernetes Engine (GKE) customers. C4 VMs deliver exceptional performance improvements and are designed to handle your most performance-sensitive workloads, delivering up to a 25% price-performance improvement over the previous-generation general-purpose VMs, C3 and N2.
C4 VMs are already delivering impressive results for businesses. Companies like Verve, a creator of digital advertising solutions, are integrating C4 into their core infrastructure; in Verve’s case, the result has been a remarkable 37% improvement in performance. For Verve, C4 isn’t only about better performance — it’s actually fueling their revenue growth.
Read on to discover how Verve leveraged C4 to achieve this success, including their evaluation process and the key metrics that demonstrate C4’s impact on their business.
Verve’s Challenge and Business Opportunity
Verve delivers digital ads across the internet with a platform that connects ad buyers to other ad-delivery platforms and allows these advertisers to bid on ad space through a proprietary real-time auction platform. Real-time is the key here, and it’s also why C4 has made such a big impact on their business.
A marketplace for ad bidding is an incredibly latency- and performance-sensitive workload. About 95% of the traffic hitting their marketplace, which runs on GKE, is not revenue generating, because the average ad fill rate is only 5-7% of bids.
It takes a lot of cloud spend to fill bid requests that never generate revenue, and so any increase in performance or reduction in latency can have a tremendous impact on their business. In fact, the more performance Verve can get out of GKE, the more revenue they generate because the fill-rate for ads (successful bid/ask matching) grows exponentially.
Verve’s GKE Architecture and C4 Evaluation Plan
Verve’s marketplace ran on N2D machines, leveraging an Envoy-based reverse proxy (Contour) for ingress and egress. Verve handles a high volume of traffic, with hundreds of millions of events daily (clicks, impressions, actions, in-app events, and more).
This means they need to be able to scale their servers fast to handle traffic spikes and to control who has access to their servers and with which permissions. Verve has built its infrastructure on top of Kubernetes for elasticity and scalability, and relies heavily on spot pricing to be cost effective.
To set up the benchmark, Verve ran a canary (one pod of the main application per node type) and measured two values: a performance metric exported from the application, the 99th-percentile vCPU time per ad request in milliseconds, and a cost metric, the spot price given by the total compute price (vCPU + GB RAM).
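A rough sketch of how such a comparison can be tallied is shown below. The figures are made-up placeholders rather than Verve’s benchmark data, and the scoring rule (the product of the two metrics, lower is better) is just one reasonable way to rank price-performance.

```python
# Sketch of ranking candidate node types by the two canary metrics described
# above. All numbers are invented placeholders, not Verve's results.
from dataclasses import dataclass

@dataclass
class CanaryResult:
    machine_type: str
    p99_vcpu_ms_per_request: float  # performance metric exported by the app
    spot_price_per_hour: float      # total compute price (vCPU + GB RAM)

results = [
    CanaryResult("n2d-standard-32", 1.30, 0.55),
    CanaryResult("c3-standard-44", 1.10, 0.60),
    CanaryResult("c4-standard-48", 0.95, 0.62),
]

# Lower is better for both metrics, so rank by their product.
for r in sorted(results, key=lambda r: r.p99_vcpu_ms_per_request * r.spot_price_per_hour):
    score = r.p99_vcpu_ms_per_request * r.spot_price_per_hour
    print(f"{r.machine_type:18s} p99={r.p99_vcpu_ms_per_request:.2f} ms  "
          f"spot=${r.spot_price_per_hour:.2f}/h  score={score:.3f}")
```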
Leveraging GKE Gateway to Save Costs and Improve Latency
Verve needs to scale its servers fast to handle traffic spikes with the lowest latency possible, and for this the team relies on GKE Gateway, which leverages Google’s Envoy-based global load balancers. Their solution optimizes real-time bidding for ads, boosting revenue through improved response times and efficient cost management, in a market where latency is correlated with bids and revenue, somewhat similar to high-frequency trading (HFT) in financial markets.
By migrating to GKE Gateway, Verve improved its total cost of ownership (TCO). Google only charges for traffic going through the Gateway, so Verve saw significant compute cost savings by not having to spin up GKE nodes for the proxies. Verve also saw a notable reduction in the burden of maintaining the GKE Gateway-based solution compared to an Ingress-based one, which further improved their TCO. On top of it all, latency for the traffic itself improved by 20-25%, which generated 7.5% more revenue.
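To make the migration concrete, here is a minimal sketch, not Verve’s actual configuration, of the two resources involved: a Gateway backed by GKE’s Envoy-based global external Application Load Balancer and an HTTPRoute pointing at a bidding Service. The names, namespace, and port are hypothetical; the Python below simply prints the Kubernetes manifests as YAML for kubectl apply.

```python
# Minimal sketch (not Verve's actual config): a GKE Gateway plus an HTTPRoute
# for a bidding backend, emitted as YAML for `kubectl apply -f -`. Resource
# names, namespace, and ports are hypothetical placeholders.
import yaml

gateway = {
    "apiVersion": "gateway.networking.k8s.io/v1",
    "kind": "Gateway",
    "metadata": {"name": "marketplace-gateway", "namespace": "bidding"},
    "spec": {
        # GKE's Envoy-based global external Application Load Balancer.
        "gatewayClassName": "gke-l7-global-external-managed",
        "listeners": [{"name": "http", "protocol": "HTTP", "port": 80}],
    },
}

route = {
    "apiVersion": "gateway.networking.k8s.io/v1",
    "kind": "HTTPRoute",
    "metadata": {"name": "bid-route", "namespace": "bidding"},
    "spec": {
        "parentRefs": [{"name": "marketplace-gateway"}],
        "rules": [{"backendRefs": [{"name": "bidder-service", "port": 8080}]}],
    },
}

# Emit both manifests as a multi-document YAML stream.
print(yaml.safe_dump_all([gateway, route], sort_keys=False))
```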
Saving Costs While Achieving Better Performance with Custom Compute Classes
Ahead of their high season, Verve worked with their Google Cloud Technical Account Manager to join the early access program for Custom Compute Classes, a new feature the team had been eagerly anticipating for years.
Custom Compute Classes (CCC) is a Kubernetes-native, declarative API that can be used to define fallback priorities for autoscaled nodes in case a top priority is unavailable (e.g. a spot VM). It also has an optional automatic reconciliation feature which can move workloads to higher priority node shapes if and when they become available.
This lets GKE customers define a prioritized list of compute preferences by key metrics like price/performance, while GKE automatically handles scale-up and consolidation onto the best options available at any time. Verve uses CCC to establish C4 as its preferred machine family, but also specifies other machine families as fallbacks to maximize obtainability.
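Below is a minimal sketch of what such a compute class declaration can look like, assuming a Spot-first priority list similar to the one Verve describes. It is not Verve’s actual manifest, and the field names reflect our reading of the Custom Compute Classes API; check the current GKE reference before relying on them.

```python
# Sketch of a GKE ComputeClass that prefers C4 Spot capacity and falls back to
# older families when C4 is unavailable, migrating workloads back when it
# returns. Field names are our assumption of the Custom Compute Classes API;
# verify them against the GKE documentation.
import yaml

compute_class = {
    "apiVersion": "cloud.google.com/v1",
    "kind": "ComputeClass",
    "metadata": {"name": "cost-optimized-spot"},
    "spec": {
        # Ordered fallback list: GKE tries each entry until capacity is found.
        "priorities": [
            {"machineFamily": "c4", "spot": True},
            {"machineFamily": "c3", "spot": True},
            {"machineFamily": "n2d", "spot": True},
        ],
        # Optional reconciliation: move workloads back onto higher-priority
        # shapes when they become available again.
        "activeMigration": {"optimizeRulePriority": True},
    },
}

print(yaml.safe_dump(compute_class, sort_keys=False))
# Workloads opt in via the node selector label
# cloud.google.com/compute-class: cost-optimized-spot
```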
Pablo Loschi, Principal Systems Engineer at Verve, was impressed with the versatility his team was able to achieve:
“With Custom Compute Classes,” Loschi said, “we are closing the circle of cost-optimization. Based on our benchmarks, we established a priority list of spot machine types based on price/performance, and CCC enables us to maximize obtainability and efficiency by providing fall-back compute priorities as a list of preferred machines. We love how when out-of-stock machines become available again CCC reconciles to preferential infrastructure, finally eliminating the false dichotomy of choosing between saving costs and machine availability, even in the busy season.”
Verve’s Results and Business Impact
Verve benchmarked their marketplace running on GKE across several GCE machine types. Their marketplace had been running on N2D machines, and by switching to C4 they saw a 37% improvement in performance.
They also switched from a self-managed Contour-Envoy proxy to GKE Gateway, which brought a dramatic 20% to 25% improvement in latency and translated into 7.5% more revenue, since more bids are auctioned. GKE Gateway also allowed them to save substantial compute costs, because the load balancer is charged for network traffic rather than for compute. Additionally, they benefited from a reduced manual burden of managing, updating, and scaling the solution.
“We were able to directly attribute the reduced latency to revenue growth — more bids are being accepted because they are coming faster,” Ken Snider, Verve VP of Cloud Infrastructure, said.
The combination of switching to C4 and GKE Gateway is driving their business’ revenue growth. “We started on a path a year ago talking with the product team from Google to help solve this problem, and now we are seeing it come together,” Snider said.
The next phase of Verve’s optimization journey is to improve compute utilization, ensuring maximal usage of all deployed GKE nodes. GKE features such as Node Autoprovisioning and Custom Compute Classes will continue to play an important role in the team’s efforts to drive top-line growth for the business while being good stewards of their cloud costs.
C4 Brings Unparalleled Performance
C4 VMs are built on the latest Intel Xeon Scalable processors (5th Gen Emerald Rapids), delivering a significant performance leap for mission-critical and performance-sensitive workloads such as databases, gaming, financial modeling, data analytics, and inference.
Leveraging Google’s custom-designed Titanium infrastructure, C4 VMs provide high-bandwidth, low-latency networking with up to 200 Gbps of bandwidth, as well as high-performance storage with Hyperdisk. With C4, storage is offloaded to the Titanium adapter, reserving host resources for running your workloads. And by leveraging hitless upgrades and live migration, the vast majority of infrastructure maintenance updates are performed with near-zero impact on your workloads, minimizing disruptions and providing predictable performance. For real-time workloads, C4 offers up to 80% better CPU responsiveness compared to previous generations, resulting in faster trades and a smoother gaming and streaming experience.
But C4 offers more than just powerful hardware; it’s a complete solution for performance-critical workloads. C4 VMs integrate seamlessly with Google Kubernetes Engine (GKE), enabling you to easily deploy and manage containerized applications at scale.
A range of machine types with varying vCPU and memory configurations are available to match your specific needs. And with its superior price-performance, C4 VMs deliver exceptional value, helping you optimize your cloud spend without compromising on performance.
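As a minimal sketch of getting started, here is how a single C4 VM with a Hyperdisk Balanced boot disk might be provisioned with the google-cloud-compute Python client. The project, zone, image, and machine shape are placeholders; for GKE, you would instead select a C4 machine type for your node pools (or let Custom Compute Classes choose it for you).

```python
# Sketch: provision a C4 VM with a Hyperdisk Balanced boot disk using the
# google-cloud-compute client. Project, zone, image, and sizing are placeholders.
from google.cloud import compute_v1

project_id, zone = "my-project", "us-central1-a"

boot_disk = compute_v1.AttachedDisk(
    boot=True,
    auto_delete=True,
    initialize_params=compute_v1.AttachedDiskInitializeParams(
        source_image="projects/debian-cloud/global/images/family/debian-12",
        disk_type=f"zones/{zone}/diskTypes/hyperdisk-balanced",  # C4 uses Hyperdisk
        disk_size_gb=50,
    ),
)

instance = compute_v1.Instance(
    name="c4-demo",
    machine_type=f"zones/{zone}/machineTypes/c4-standard-8",
    disks=[boot_disk],
    network_interfaces=[compute_v1.NetworkInterface(network="global/networks/default")],
)

operation = compute_v1.InstancesClient().insert(
    project=project_id, zone=zone, instance_resource=instance
)
operation.result()  # block until the create operation completes
print(f"Created {instance.name}")
```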