Google Cloud

2024 02 20

GCP – Unlocking New Frontiers: The Synergy of of Audio Transcripts using Video Intelligence API and Generative AI

The potential of video analytics and generative AI to revolutionize industries is immense. These technologies are opening new frontiers in automated insights, decision-making, and content generation. By marrying AI insight and audio data, organizations are realizing benefits across the span of business, from increased sales and revenue to enhanced customer experiences and reduced costs. Any organization not exploring these tools risks falling behind the competition.

This post offers a glimpse into the tremendous potential at the intersection of visual and linguistic intelligence. It reveals how we can harness video analytics and AI to unlock game-changing insights, create bespoke buyer experiences, and elevate customer relationships. The possibilities stretch as far as the imagination; and the time to start this exploration is now.

The demand for advanced video analysis and content generation is skyrocketing in today’s digital age. In fact, per Precedence & Straits research, the video analytics industry is expected to reach $50.7 billion by 2032, while the gen AI industry is expected to reach $118.06 billion by 2032. It’s clear that organizations are searching for AI solutions to extract insights from audio transcripts within videos and streamline processes through automation.

Combining the power of analyzing video content at scale and leveraging large language models to generate contextual narratives allows organizations to tap into a new realm of possibilities, including:

Sports Analytics: Leverage gen AI to transform post-game interviews into blog articles. These insights enable coaches to optimize training, strategize more effectively, and spot star talent.

Real Estate Retail: Analyze property walkthrough videos with computer vision, extracting critical details about layout, condition, and amenities. Feed transcripts into generative models to effortlessly construct detailed listing descriptions. AI can then analyze and compare against similar listings to accurately estimate market value.

Retail: Ingredients extracted in real-time from cooking show transcripts. As the chef talks through the recipe, matching items automatically land in your cart. Streamlining the shopping experience for aspiring home cooks.

The Video Intelligence API from Google Cloud and gen AI on Google Cloud offer complimentary video analysis and content creation capabilities. The Video Intelligence API enables businesses to efficiently analyze video content at scale, with features for labeling, speech transcription, shot detection, and more. Meanwhile, gen AI allows for the automated generation and summarization of text, images, audio, and video. Together, these technologies provide a comprehensive solution – the Video Intelligence API extracts vital insights from video assets. At the same time, gen AI leverages those insights to create novel experiences like chatbots, listings, and articles. This end-to-end pipeline from analysis to content creation delivers tremendous value.

Exploring the Synergy:

Imagine a real estate company using the Video Intelligence API to analyze video recordings of home tours performed by a real estate retail agent. The API first analyzes home tour footage, identifying property features and conditions. Transcripts are then fed into a generative model which produces custom listing descriptions for each home. By pairing advanced video analytics with AI-generated copy, properties can be indexed with rich detail – including visual details like layout and amenities and customized descriptive text for each listing. The end result is enhanced exposure on real estate search platforms and a better experience for potential buyers.

The potential applications of video analytics paired with gen AI extend far beyond real estate listings. This powerful combo delivers value across sectors:

Personalized home search: Leverage multi-modal analytics to personalize the home search for potential buyers. By utilizing Google’s foundation models, organizations can generate text descriptions of homes that match a buyer’s specific criteria. These personalized descriptions can then curate a customized list of homes that align with the buyer’s preferences, significantly enhancing their search experience.Search powered by gen AI: Video analytics and Chirp, Google Cloud’s speech model, can be leveraged to automatically transcribe video walkthroughs of homes. This transcription makes it easier for customers to search for specific features or details within the videos. For instance, customers can search for homes with “hardwood floors” or “granite countertops” and quickly find videos mentioning these features. Now, customers are empowered to find their ideal homes more efficiently and effectively.Customer insights: The combination of Video Intelligence API and gen AI allows organizations to analyze customer feedback from home walkthroughs. By extracting insights from this feedback, businesses gain a deeper understanding of customers’ likes, dislikes, and preferences. This valuable information can be used to personalize marketing and sales efforts, optimize the customer experience, and even enhance virtual reality walkthroughs for customers who are unable to visit a property physically.Question and Answer (QnA) on house features based on video summaries: The Video Intelligence API can summarize the key features of a property from home tour videos. By integrating this summary with gen AI-powered question-answering capabilities, organizations can create an interactive QnA system. Potential buyers can ask questions about the property’s features, condition, or any other aspect, and the system responds with accurate and informative answers based on the video summary. This enhances the customer experience by providing instant access to relevant information and streamlining the decision-making process.

The combinations of the Video Intelligence API and gen AI unlock endless possibilities and together, their analytical and creative strengths pave the way for innovation across sectors. In real estate, this fusion delivers tangible benefits:

Increased Sales: Realtors, freed from manual tasks, can invest more time building client relationships and closing deals.Enhanced Customer Experience: With greater availability, realtors can provide personalized guidance tailored to each client.Cost Savings: By automating rote tasks, firms reduce expenses and labor costs.

Architecture:

Sample Video used for this demo can be found here

The fusion of visual and linguistic intelligence is transforming customer engagement across industries, unlocking revolutionary methods to uncover insights and craft contextualized content. By automating repetitive tasks, employees can now focus on delivering white-glove service that forges lasting bonds with customers. These technologies paint a vivid portrait of human behavior – analyzing filters, walkthroughs, and searches to reveal preferences, needs, and even unspoken wishes. As videos and transcripts reveal nuanced narratives of customer journeys, boundless opportunities emerge to optimize each touchpoint. Ultimately, this synergy of AI capabilities promises more meaningful personal connections, richer insights, and enhanced experiences for businesses and their customers.

References :

Video Analytics Market Size by Straits Research Generative AI Market Size by Precedence Research Google Cloud Video Intelligence API

Learn more about how Generative AI on Google Cloud can help your organization.

Read More for the details.

2024 02 16

GCP – How Gov Market helps to create economic opportunity in Germany with digital sovereignty

Cloud, Google Cloud gcp

In Germany, public procurement can be a highly complex and time-consuming process. More than 30,000 procurement entities vie for contracts across a fragmented regulatory landscape, with different rules applying at the federal, state, and municipal levels. Since public procurement procedures are often time-consuming, particularly for large or high-value contracts, small and medium-sized enterprises (SMEs) often can’t compete for public contracts.

While procurement itself is supported by digital procurement platforms (such as eVergabe) planning and processes during tendering and contract execution are largely not standardized, require many manual steps, and are not provided by universal digital marketplaces as in other European countries. There are also challenges related to lack of training and development to upskill procurement officials, who are increasingly required to have a wide range of skills, including legal and technical expertise. Not surprisingly, there’s a shortage of qualified procurement professionals in Germany.

As a result, on average it takes German government agencies 22 months to purchase technology. This lack of timely access to new technology can be a major impediment to innovation in the public sector, and led to the creation of GovMarket, a new digital marketplace that can help drive innovation in procurement for public institutions in Germany.

A solution for a more efficient procurement process

GovMarket is a PwC and Public Germany joint venture aimed at fostering collaboration between government agencies and technology companies that specialize in the public sector. The platform’s goal is to streamline the public procurement process, making it easier for businesses to compete for public contracts. GovMarket also offers a range of tools and services to help businesses comply with public procurement regulations.

GovMarket has a strong focus on data sovereignty and protection. All data stored on the platform is encrypted and securely stored in Germany. GovMarket is also committed to complying with all applicable data protection laws and regulations.

A key enabler for sovereignty solutions

GovMarket’s success lies in its ability to address both the technological and cultural challenges of public procurement in Germany. Technologically, GovMarket provides an online platform that is easy to use and that meets the needs of both public buyers and suppliers.

The mission of GovMarket is to transform the way that procurement law is applied in Germany, moving away from an error-prone, paper-based process to an efficient digital and easily accessible process. This process can allow public organizations to use the platform as a reliable market research source designed to support compliance and curated solutions.

GovMarket chose to run on T-Systems Sovereign Cloud, powered by Google Cloud, in part because it can use Google Cloud’s innovative data residency and access controls to help ensure that GovMarket customer data remains under the control of their respective governments (directly or through trusted regional partners). It also can help adhere to strict data privacy regulations. This commitment to data sovereignty is further reinforced by Google Cloud’s security measures, which include:

Data encryption at rest and in transit: Google Cloud employs industry-leading encryption techniques to protect data from unauthorized access, both when it’s stored and when it’s being transmitted.Zero Trust-based access controls: Google Cloud is enforcing granular access controls to help ensure that only authorized personnel can access sensitive data (such as only EU based personnel).Threat detection and prevention: Google Cloud’s security infrastructure continuously monitors for and mitigates threats, helping keep data safe from evolving cyberattacks.

In addition to data sovereignty and security, GovMarket uses other benefits provided by the Sovereign Cloud. The platform’s scalability and performance can help GovMarket to meet the demands of its growing customer base, while its cost-effectiveness has helped GovMarket optimize its IT expenditures.

GovMarket has the potential to revolutionize public procurement in Germany. By addressing the challenges of complexity, time-consuming, and lack of digitization processes, GovMarket can make it easier for start-ups and SMEs to compete for public contracts. The platform’s focus on sovereignty and data protection can help build trust and confidence among public buyers and suppliers. GovMarket is available today to help unlock the potential of public procurement in Germany.

Read More for the details.

2024 02 16

GCP – Why mainframe migration is more than technology transformation

Cloud, Google Cloud gcp

British author and novelist L.P. Hartley once said, “The past is a foreign country: They do things differently there.” This sentiment is especially true in the realm of mainframe migrations, which often trigger unforeseen changes across an organization.

Processes established in mainframe environments tend to be delivered as centralized internal services with critical capabilities, such as maintenance, security, and support. However, these centralized processes and delivery are also bound to transform when migrating applications and workloads to cloud environments.

Centralized delivery has proven effective at lowering costs through consistency, shared engineering resources, bulk discounts, and specialized understanding of organizational requirements. At the same time, finite resources also hindered scalability, flexibility, and speed; any initiative that tangentially touched the mainframe had to be coordinated centrally.

But as cloud migrations near the finish line, many organizations suddenly realize they will need to replicate the same work once carried out by their mainframe teams — and they have new power to make changes. Here, deep expertise from a cloud provider or partner can prove invaluable for helping organizations get the most out of their transformations.

At Google Cloud Consulting, we’ve helped organizations navigate this complex journey countless times, witnessing firsthand that transformation is not just about technology but also people and processes. In this post, we’ll share three areas where you can expect the biggest impact and change when migrating from mainframes to the cloud.

Maintenance: The difference is in the design

We frequently advise customers to design away the need for processes, so they don’t become a bolt-on activity that the new cloud team needs to worry about. For example, opting for a Google Cloud managed database, such as AlloyDB, CloudSQL or BigQuery, can remove many of the utilities and maintenance tasks associated with on-premises databases or data stores. Instead of worrying about nightly backups or annual database reorganizations, these processes are automated and managed by Google Cloud.

Similarly, Google Cloud can automatically perform OS upgrades and patches by rotating GKE instances on a regular basis (e.g., 30-45 days). Organizations can define this in their policies and integrate rolling out the new OS images into their CI/CD pipelines. In this way, these processes become part of the normal development lifecycle rather than a separate activity that must be monitored and managed.

Of course, teams will still need to test backup and emergency restore processes regularly. However, testing doesn’t have to be particularly onerous. The intrinsic network isolation combined with on-demand capacity means it is quite possible to spin up a disaster recovery environment, restore a database, run the tests, and tear it down again — all without any disruption to the production environment.

Compared to the centralized processes they replace, maintaining and updating cloud-based applications generally takes a fraction of the work and time associated with the same tasks for on-premises solutions.

Security: Everyone gets a key (well maybe not)

Dealing with security is often another challenging transition during migrations. Google Cloud offers encryption at rest by default, but in isolation, that is not always enough. There are additional steps, such as rotating keys, to consider. With the move to a cloud environment, rotating the keys becomes the responsibility of the organization.

Where a mainframe may have been grandfathered in with the minimum acceptable solution, many organizations take the opportunity to step up their security controls when migrating to the cloud, which can create new burdens. In mainframes, the same key might be used for all data. In the cloud, applying the same policy to everything, everywhere, is illogical.

A common pattern we see for success is pushing the responsibility of rotating keys down to each product or application owner. This allows for different rotation policies based on the data; keys for data that require the highest levels of PCI and PII protection might be rotated every 90 days, while public data may be every 12 months. This pattern also shrinks the blast radius related to compromised keys. The goal is to tailor the policy for the data and any local laws, improving the overall security posture with nominal additional effort.

Similarly, cloud compliance is a crucial issue to address. Compliance audits are time-consuming, but they also play a critical role in helping organizations ensure transparency and safeguard their data and cloud assets. The trick to a successful audit is supporting policies and documentation with an electronic trail that shows compliance. The beauty of the cloud is that it’s easy to layer organizational policies on top of existing cloud documentation and use logs from products and services as proof of meeting requirements.

Support: New responsibilities, better solutions

There’s nothing worse than a 2AM call because something has failed (bad) or is limping along (worse). The tightly integrated nature of the mainframe used to mean that support primarily fell to a core on-call team with the skills and security permissions to troubleshoot mainframe problems from the ground up.

With a migration to Google Cloud, we are responsible for providing the instructure, but we have little to no insight into what is being executed. Therefore, support will shift to individual product teams — a new responsibility, but one that comes with a lot of help.

Whether it is a Single Pane of Glass or a “Single Glass of Pain”, Google Cloud Operations contains a fully integrated suite of metrics to assist troubleshooting. These metrics can be optimized with plug-ins to reflect an organization’s particular technology stack.

In addition, the cloud also offers new options for resolving issues — ones that don’t require debugging on a call in the middle of night. For instance, it’s now possible to simply isolate the instance having the problem from the cluster and spin up a replacement in seconds, leaving customers to get on with their lives and teams to solve problems in the morning with a cup of coffee.

Final thoughts

Embarking on a mainframe migration journey is both challenging and rewarding. Taking on new roles and responsibilities can be exciting, but it’s likely to be a bumpy transition for existing teams that have spent decades honing their skills in specific roles. Organizations should accept this up front and plan accordingly, eliminating as much as possible through design and automation.

At Google Cloud Consulting, we understand the intricacies of this process. Our team is dedicated to helping clients navigate these changes, leveraging our deep expertise to streamline the transition, automate processes, and empower organizations to embrace the full potential of the cloud.

Read more about how Google Cloud approaches mainframe modernization, or drop us a line if you want to chat with one of our experts. Our invitation is open: connect with us at Google Cloud Consulting to explore how we can tailor a migration strategy that aligns with your organization’s unique needs and goals. Together, we can turn the challenge of mainframe migration into an opportunity for growth and innovation.

Read More for the details.

2024 02 16

GCP – Cloud CISO Perspectives: Guidance from our latest Threat Horizons report

Cloud, Google Cloud gcp

Welcome to the first Cloud CISO Perspectives for February 2024. Today I’ll be looking at our latest Threat Horizons report, which provides a forward-thinking view of cloud security with intelligence on emerging threats and actionable recommendations from Google’s security experts.

As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.

–Phil Venables, VP, TI Security & CISO, Google Cloud

aside_block
<ListValue: [StructValue([(‘title’, ‘Board of Directors Insights Hub’), (‘body’, <wagtail.rich_text.RichText object at 0x3e184afe7670>), (‘btn_text’, ‘Visit the Hub’), (‘href’, ‘https://cloud.google.com/solutions/security/board-of-directors’), (‘image’, <GAEImage: GCAT-replacement-logo-A>)])]>

Latest Threat Horizons report: Emerging threats, actionable recommendations

By Phil Venables, VP, TI Security & CISO, Google Cloud

Google Cloud’s new Threat Horizons report for the first half of 2024 concludes that we saw threats increase across information technology environments, including on-premise, mobile, operational technology, and the cloud in 2023. However, while some sectors were more plagued by underlying vulnerabilities, security concerns for cloud providers were more often driven by security hygiene and misconfigurations. We expect those trends to continue in 2024, as the report notes, which means that organizations may also have a better sense of the kinds of threats they face (and how to prepare for them) in the coming year.

Phil Venables, VP, TI Security & CISO, Google Cloud

To help organizations prepare for the latest threats, we advise that they focus on credential issues and use cloud-first features such as write-once read-only backups, logging, and encryption at rest; that they move to multi-factor authentication if they haven’t already; and that they strengthen their cloud security governance.

Among Google Cloud customers, the report found that credential issues were the most commonly observed security oversight. Even though cloud platforms invest in security, abuse detection, and mitigation, organizations should also monitor their cloud computing resources for suspicious activity.

At Google Cloud, we take credential abuse seriously, so we offer many security capabilities to support customers and protect their environments, including two-factor authentication, strong password policies, identity and access management (IAM) policies, cloud audit logs, and a centralized view of threats and vulnerabilities in Security Command Center. We also now offer a credential leak monitoring service through Mandiant Digital Threat Monitoring.

As more industries and organizations move to the cloud, the cloud will become an ever-more appealing target.

The report also confirmed that data theft in general and ransomware specifically remain challenges for cloud customers. In 2023, threat actors targeting cloud environments began prioritizing data exfiltration over data encryption and stolen data advertisements. While this could suggest that they are focused on monetary gain, we noted that they “increasingly seek to profit by selling the data (or access to the data) rather than expecting victims to pay the ransom for decryption keys.”

To better secure cloud instances, we recommend deploying cloud-specific backup strategies that include testing configurations and templates of stored assets; using technologies such as WORM (Write Once Read Many) and the Bucket Lock feature on Google Cloud to provide immutable and policy compliant backup storage, and implementing resilient architecture, such as multi-region cloud use and backup mirroring, to reduce risk of data loss or inaccessibility.

Security logs can often capture and make visible to defenders a level of contextual and historical background data that enables them to better observe threat actors at work. Exporting relevant logs to a centralized, well-governed repository can help monitor threat activity on your network and in your systems.

As more industries and organizations move to the cloud, it will become an ever-more appealing target. The full report provides more details on these topics as well as on nation-state threat activity, ransomware, and the positive impact of security partnerships, which you can read here. You can read previous Threat Horizons reports here.

aside_block
<ListValue: [StructValue([(‘title’, ‘Join the Google Cloud CISO Community’), (‘body’, <wagtail.rich_text.RichText object at 0x3e184afe78e0>), (‘btn_text’, ‘Learn more’), (‘href’, ‘https://rsvp.withgoogle.com/events/ciso-community-interest?utm_source=cgc-blog&utm_medium=blog&utm_campaign=2024-cloud-ciso-newsletter-events-ref&utm_content=-&utm_term=-‘), (‘image’, <GAEImage: GCAT-replacement-logo-A>)])]>

In case you missed it

Here are the latest updates, products, services, and resources from our security teams:

Gen AI governance: 10 tips to level up your AI program: To help organizations navigate AI governance challenges, we’ve outlined 10 best practices to streamline and operationalize AI implementation at scale. Read more.Scaling security with AI, from detection to solution: In alignment with Google’s Secure AI Framework (SAIF), we’ve been using AI itself to automate and streamline routine and manual security tasks, including fixing security bugs. Last year, we wrote about how LLMs can expand vulnerability testing coverage, and we’re excited to share some updates. Read more.U.N. cybercrime treaty could endanger web security: Earlier this month, the United Nations convened member states to continue its years-long negotiations on the U.N. Cybercrime Treaty. Google takes the threat of cybercrime very seriously, and dedicates significant resources to combating it. We urge member states to heed calls from civil society groups to address critical gaps in the treaty and revise the text to protect users and security professionals. Read more.Simplify DORA compliance with Google Cloud’s updated contracts: Google Cloud is committed to supporting our customers’ compliance with EU DORA. Here’s the latest on our DORA support. Read more.A recipe for scaling security: In 2023, Google was able to modernize and improve our code at scale — no small feat given our vast portfolio of services. It’s important for us to share how we did it, what security benefits the changes brought to our users, and importantly, how other interested organizations can adopt the same strategy. Read more.Introducing Policy Analyzer: We are pleased to announce the general availability launch of Policy Analyzer for Org Policy and Custom Org Policy, which can help customers identify which resources are governed by which org policy constraints, empowering customers to understand and strengthen their security posture, and satisfy attestation and auditing requirements. Read more.Synthesized uses gen AI for compliant BigQuery dataset snapshots: Learn how Synthesized uses generative AI to create privacy-preserving snapshots of BigQuery datasets. Read more.

News from Mandiant

Latest research on Ivanti Connect Secure VPN zero-day exploitation: We follow our first blog on two zero days impacting VPN provider Ivanti with additional tactics, techniques, and procedures used by UNC5221 and other threat groups against Ivanti customers. We also detail new malware families and variants to previously-identified malware families being used by UNC5221. Read more.Uncovering USB malware’s hidden depths: Mandiant Managed Defense has been tracking UNC4990, a threat actor who heavily relies on USB devices for initial infection. Since at least 2020, they have primarily targeted users based in Italy and are likely motivated by financial gain. Read more.Introducing credential leak monitoring from Mandiant: Mandiant Digital Threat Monitoring now includes the capability to monitor your credential leaks on the deep and dark web, which will automatically alert you if your employees’ accounts have appeared in our compromised credential data. Read more.Dynamic capa: Exploring executable run-time behavior with CAPE sandbox: We are excited to announce that capa v7.0 can now identify program capabilities from dynamic analysis reports generated from the CAPE sandbox, an expansion of capa’s original static analysis approach. Read more.

Now hear this: Google Cloud Security and Mandiant podcasts

Cloud-ghostbusters: Who you gonna call for cloud forensics: How does cybersecurity forensics happen in the cloud? Google security engineer Jason Solomon sits down with our Cloud Security podcast hosts Anton Chuvakin and Tim Peacock to talk about next steps when there’s something strange in your cloud neighborhood. Listen here.Learn how Google Workspace security is built for modern threats: Workspace makes the claim that, unlike other productivity suites available today, it’s designed for the modern threat landscape. What gives Google the ability to make this claim? Engineering managers Emre Kanlikilicer and Sophia Gu discuss with Anton and Tim what sets Google Workspace apart, from a security perspective. Listen here.Prescriptions for a healthy cybersecurity future: Unfortunately, 2023 was a banner year for healthcare and life sciences cybersecurity incidents. Google Cloud Office of the CISO Director Taylor Lehmann and Security Architect Bill Reid join host Luke McNamara to discuss their takeaways from the last year of threat activity witnessed by enterprises within healthcare and life sciences. Listen here.Is the CTI lifecycle due for an update: Mandiant Intelligence Advisor Renze Jongman joins Luke to discuss his blog post on the CTI Process Hyperloop, and how we can do better at applying threat intelligence to the needs of the security organization and larger enterprise. Listen here.

To have our Cloud CISO Perspectives post delivered twice a month to your inbox, sign up for our newsletter. We’ll be back in two weeks with more security-related updates from Google Cloud.

Read More for the details.

2024 02 15

GCP – Building trust in crypto payments with Google Maps Platform

Cloud, Google Cloud gcp

Editor’s note: Today’s blog post comes from Pavel Matveev, CEO and Co-Founder, Wirex. Wirex is on a mission to take crypto mainstream while offering customers clear insights into every transaction with Google Maps Platform.

At Wirex, we’ve made it our mission to bring cryptocurrency into the mainstream. As the creator of the world’s first crypto-enabled payment card, we work directly with both Visa and MasterCard to allow our six million users to spend cryptocurrency with more than 50 million merchants around the world. And with our ‘cryptoback’ feature, users earn cryptocurrency every time they make a traditional currency payment, bridging the gap between traditional finance and digital currency.

Building trust with our customers is really important. As a fully licensed and regulated fintech company, with over 20 industry awards, we work hard to establish trust in our brand as a financial institution customers can depend on. We want to provide full transparency around the funds our customers invest with us. So when we realized back in 2019 that our customer service team was receiving a lot of calls from customers trying to understand where certain transactions had been made, we were quick to react.

Building trust with Enriched Transactions from Google Maps Platform

We worked with Google Maps Platform partner Snowdrop Solutions to implement its Merchant Reconciliation System (MRS), which uses Enriched Transactions from Google Maps Platform to augment a customer’s transaction with contextual location information.

Now, our customers can use our app to see clear, detailed transaction information, including merchant names, categories and logos, as well as the address and map location of where the transaction took place. Customers can instantly recognize their own transactions making it easy to identify any fraudulent transactions, so they can quickly freeze their card or claim a chargeback. This has led to a 50% reduction in the number of transaction-related inquiries to our customer service team leading to reduced customer services overhead, and it has helped us to build even greater trust and transparency with our customers.

Customers can view the exact location of transactions from within the Wirex app providing peace of mind and transparency

Growing our business with global transaction data

MRS and Google Maps Platform have helped us to scale our business, too. Not only has the reduction in customer service workload helped us to refocus resources where they’re most needed, the global reach of Google Maps Platform has allowed us to expand geographically, offering customers the same level of transaction data wherever they are.

Since implementing Enriched Transactions, we have doubled our user base from three to six million, and expanded from our London base into both Asia-Pacific and the U.S.

Clearer data for smarter business decisions

Our customers aren’t the only ones benefiting from enriched transaction data. With unprecedented insights into the spending habits of our users, we are able to make data-driven decisions about our products and business. Understanding, for example, that the two main spending categories for customers in Singapore are traveling and restaurants could, for instance, allow us to provide more cashback for those categories, to boost spending volume.

All of which puts us in a strong position to achieve our goal of reaching 10 million customers within the next 18 months. We’re confident that MRS and Enriched Transactions from Google Maps Platform are helping us to scale our business and further build the trust of our customers, so we can keep building trust in digital currency.

For more information on Google Maps Platform, visit our website.

Read More for the details.

2024 02 15

GCP – Google Cloud expands access to Gemini models for Vertex AI customers

Cloud, Google Cloud gcp

In December, Google announced Gemini, our most capable and general model yet. Since December, select customers like Samsung and Palo Alto Networks have been building sophisticated AI agents with Gemini models in Vertex AI, unlocking new levels of productivity, personalized learning, and more for their users. Today, we’re bringing more Gemini models to our customers with new updates and expanded availability:

Gemini 1.0 Pro, our best model for scaling across AI tasks, is now generally available to all Vertex AI customers. Starting today, any developer can start building with Gemini Pro in production. 1.0 Pro offers the best balance of quality, performance, and cost for most AI tasks, like content generation, editing, summarization, and classification.Gemini 1.0 Ultra, our most sophisticated and capable model for complex tasks, is now generally available on Vertex AI for customers via allowlist. 1.0 Ultra is designed for complex tasks, showing especially strong performance in areas such as complex instruction, code, reasoning, and multilinguality, and is optimized for high quality output.

In addition, we’re excited to introduce a new generation of Gemini models with Gemini 1.5, which delivers improved performance on a more efficient architecture.

The first Gemini 1.5 model we’re releasing for early testing is Gemini 1.5 Pro, which is now in private preview on Vertex AI. It’s a mid-size multimodal model, optimized for scaling across a wide-range of tasks, and performs at a similar level to 1.0 Ultra, our largest model to date. 1.5 Pro introduces a new breakthrough experimental feature in long-context understanding — the longest context window of any large-scale foundation model yet. Apps can now run up to 1 million tokens in production. This means 1.5 Pro can process vast amounts of information in one go — including 1 hour of video, 11 hours of audio, codebases with over 30,000 lines of code or over 700,000 words.

Larger context windows enable models to reference more information, grasp narrative flow, maintain coherence over longer passages, and generate more contextually rich responses. For example, with 1.5 Pro, enterprises can:

Accurately analyze an entire code library in a single prompt, without the need to fine-tune the model, including understanding and reasoning over small details that a developer might easily miss, such as errors, inefficiencies, and inconsistencies in code.Reason across very long documents, from comparing details across contracts to synthesizing and analyzing themes and opinions across analyst reports, research studies, or even a series of books.Analyze and compare content across hours of video, such as finding specific details in sports footage or getting caught up on detailed information from video meeting summaries that support precise question-answer.Enable chatbots to hold long conversations without forgetting details, even over complex tasks or many follow-up interactions.Enable hyper-personalized experiences by pulling relevant user information into the prompt, without the complexity of fine-tuning a model.

How customers are innovating with Gemini models

Vertex AI has seen strong adoption with API requests increasing nearly 6X from H1 to H2 last year. We are really impressed with the amazing things customers are doing with Gemini models particularly because they are multimodal and can handle complex reasoning so well.

Samsung: Samsung recently announced that their Galaxy S24 series is the first smartphone equipped with Gemini models. Starting with Samsung-native applications, customers can take advantage of summarization features across Notes and Voice Recorder. Samsung is confident their end users are protected with built-in security, safety, and privacy in Vertex AI.

Palo Alto Networks: Palo Alto Networks is testing Gemini models across a variety of use cases including intelligent product agents that let its customers interact with their product portfolio in a more intuitive way and reduce the time spent with customer support.

Jasper: Jasper, an AI offering that helps enterprise marketing teams create and repackage content, is using Gemini models to automatically generate blog content and product descriptions for their customers. Teams can now move faster while maintaining a high quality bar for content, ensuring it adheres to brand voice and marketing guidelines.

Quora: Quora, the popular question and answer platform, is using Gemini to help power creator monetization on their AI chat platform, Poe, where users can explore a wide-variety of AI-powered bots. Gemini is enabling Poe creators to build custom bots across a variety of use cases including writing assistance, generating code, personalized learning, and more.

Build production-ready applications with the Gemini API in Vertex AI

The Gemini API in Vertex AI empowers developers to build the next generation of AI agents and apps — ones that can simultaneously process information across modalities like text, code, images, and video. To harness the power of the Gemini models, organizations and developers need to be able to build enterprise-grade applications and take them to production. Vertex AI is the only cloud AI platform to offer a single, integrated platform for models, tooling and infrastructure, ensuring that once applications are built with Gemini models, they can be easily deployed and maintained. With Vertex AI, customers can:

Customize Gemini models for specific business needs. The Gemini API in Vertex AI now supports adapter based tuning such as Low-Rank Adaptation (LoRA), which allows developers to customize the model in an efficient, lower-cost way. Additional customization techniques like reinforcement learning from human feedback (RLHF) and distillation are coming to the Gemini API in the coming months.

Augment the Gemini models response with up to the minute information and enable the model to take action in the real world. With support for fully-managed grounding, developers can improve the accuracy and relevance of the Gemini model’s answers using their enterprise’s own data. With function calling, now generally available, developers can connect the Gemini model to external APIs for transactions and other actions.

Manage and scale Gemini in production with purpose-built tools to help ensure that once applications are built, they can be easily deployed and maintained. Vertex AI offers an automated evaluation tool for generative AI models: Automatic Side by Side. This feature compares models responses by a standard set of criteria, which helps developers understand Gemini’s performance and adjust prompts and tuning based on that feedback.

Build search and conversational agents with Gemini models with minimal coding expertise required, in hours and days instead of weeks and months:

Vertex AI Search provides developers with an out of the box, Google Search-quality information retrieval and answer generation system. With support for Gemini models, developers can build search applications with even more robust grounding, accurate citations, and satisfying answers.Vertex AI Conversation now offers developers the ability to build sophisticated gen AI powered conversational chatbots using Gemini models. With the advanced reasoning and multimodal capabilities of Gemini, developers can drive more personalized, informative and engaging conversational AI experiences in their applications.

The Gemini era is just beginning — stay on the cutting edge

Developers can build production grade applications on Vertex AI, which offers enterprise-grade model augmentation, testing, deployment, and management tools. In addition, developers can experience the Gemini models with the API in Google AI Studio, a free, web-based developer tool to prototype and launch apps quickly with an API key. With all of our new Gemini models now in our customers’ hands, we can’t wait to see the new generation of intelligent apps and agents they’ll create. The Gemini era is just beginning, however — if your organization wants to stay on the cutting edge, work with your account team to ensure you’re signed up to be a trusted tester of upcoming Gemini models. Be sure to join us in Las Vegas in April at Google Cloud Next ‘24 for our latest gen AI news and to explore our upcoming events for deep dives into products and strategies.

Read More for the details.

2024 02 15

GCP – Coming of age in the fifth epoch of distributed computing, accelerated by machine learning

Cloud, Google Cloud gcp

Editor’s note: Today, we hear from Google Fellow Amin Vahdat, who is the VP & GM for ML, Systems, and Cloud AI at Google. Amin originally delivered this as a keynote in 2023 at the University of Washington for The Allen School’s Distinguished Lecture Series. This post captures Amin’s reflections on the history of distributed computing, where we are today, and what we can expect for the next-generation of computing services.

Over the past fifty years, computing and communication have transformed society with sustained exponential growth in capacity, efficiency, and capability. Over that time, we have, as a community, delivered a 50-million-fold increase in transistor count per CPU and grown the Internet from 4 nodes to 5.39 billion.

While these advances are impressive, the human capabilities that result from these advances are even more compelling, sometimes bordering on what was previously the domain of science fiction. We now have near-instantaneous access to the evolving state of human knowledge, limited only by our ability to make sense of it. We can now perform real-time language translation, breaking down fundamental barriers to human communication. Commensurate improvements in sensing and network speeds are delivering real-time holographic projections that will begin to support meaningful interaction at a distance. This explosion in computing capability is also powering next-generation AI systems that are solving some of the hardest scientific and engineering challenges of our time, for example, predicting the 3D structure of a protein, almost instantly, down to atomic accuracy, or unlocking advanced text-to-image diffusion technology, delivering high-quality, photorealistic outputs that are consistent with a user’s prompt.

Maintaining the pace of underlying technological progress has not been easy. Every 10-15 years, we encounter fundamental challenges that require foundational inventions and breakthroughs to sustain the exponential growth of the efficiency and scale of our infrastructure, which in turn power entirely new categories of services. It is as if every factor of a thousand exposes some new fundamental, progressively more challenging limit that must be overcome and creates some transformative opportunity. We are in one of those watershed moments, a once-in-a-generation challenge and opportunity to maintain and accelerate the awe-inspiring rate of progress at a time when the underlying, seemingly insatiable demand for computing is only accelerating.

A look back on the brief history of computing suggests that we have worked through four such major transitions, each defining an ‘epoch’ of computing. We offer a historical taxonomy that points to a manifest need to define and to drive a fifth epoch of computing, one that is data-centric, declarative, outcome-oriented, software-defined, and centered on proactively bringing insights to people. While each previous epoch made the previously unimaginable routine, this fifth epoch will bring about the largest transformation thus far, promising to democratize access to knowledge and opportunity. But at the same time, it will require overcoming some of the most intrinsically difficult, and cross-stack challenges in computing.

We begin our look back at Epoch 0. Purists will correctly argue that we could look back thousands of years further, but we choose to start with some truly landmark and foundational developments in computer science that took place between 1947-1969, laying the basis for modern computing and communication.

1947: Bardeen, Brattain and Shockley invent the first working transistor.
1948: Shannon introduces Information Theory, the basis for all network communication.
1949: Stored programs in computers become operational.
1956: High-level programming languages are invented.
1964: Instruction Set Architectures, common across different hardware generations, emerge.
1965: Moore’s Law introduced, positing that transistor count per integrated circuit will double every 18-24 months.
1967: Multi-user operating systems provide protected sharing of resources.
1969: Introduction of the ARPANet, the basis for the modern Internet.

These breakthroughs became the basis for modern computing at the end of Epoch 0: four computers based on integrated circuits running stable instruction set architectures and a multi-user, time-shared operating system connected to a packet-switched internet. This seemingly humble baseline laid the foundation for exponential progress in subsequent epochs.

In the first Epoch, computer networks were largely used in an asynchronous manner: transfer data across the network (e.g., via FTP), operate on it, and then transfer results back.

Notable developments: SQL, FTP, email, and Telnet
Interaction time among computers: 100 milliseconds
Characteristics:
• Low-bandwidth, high-latency networks
• Rare pairwise interactions between expensive computers
• Character keystroke interactions with humans
• The emergence of open source software
Breakthrough: Personal computers

Aided by increasing network speeds, prevalence of personal computers/workstations, and widespread, interoperable protocols (IP, TCP, NFS, HTTP), synchronous, transparent computation and communication became widespread in Epoch 2.

Notable developments: Remote Procedure Call, client/server computing, LANs, leader election and consensus
Interaction time among computers: 10 milliseconds
Characteristics:
• 10 Mbps networks
• Internet Architecture scales globally thanks to TCP/IP
• Full 32-bit CPU fits on a chip
• Shared resources between multiple computers
Breakthrough: The World Wide Web

In Epoch 3, the true breakthrough of HTTP and the World Wide Web brought network computing to the masses, breaking the confines of personal computing. To keep pace with continued exponential growth in the Internet and the needs of a global user population, many of the design patterns of modern computing were established during this period.

One of the drivers of Epoch 3 was the end of Dennard scaling, which essentially limited the maximum clock frequency of a single CPU core. This limitation led the industry to adopt multi-core architectures, necessitating a move toward asynchronous, multi-threaded, and concurrent development environments.

Notable developments: HTTP, three-tier services, massive clusters, web search
Interaction time among computers: 1 millisecond
Characteristics:
• 100 Mbps–1Gbs networks
• Autonomous Systems / BGP
• Complex apps no longer fit on a single server; scaling to many servers
• Web indexing and search, population-scale email
Breakthrough: Cluster-based Internet services, mobile-first design, multithreading and instruction-level parallelism

Epoch 4 established planetary-scale services available to billions of people through ubiquitous cellular devices. In parallel, a renaissance in machine learning drove more real-time control and insights. All of this was powered by warehouse-scale clusters of commodity computers interconnected by high-speed networks, which together processed vast datasets in real-time.

Notable developments: Global cellular data coverage, planet-scale services, ubiquitous video
Interaction time among computers: 100 microseconds
Characteristics:
• 10-100 Gbps networks, flash
• Multiple cores per CPU socket
• Infrastructure that scales out across LANs (e.g., GFS, MapReduce, Hadoop)
• Mobile apps, global cellular data coverage
Breakthroughs: Mainstream machine learning, readily available specialized compute hardware, cloud computing.

Today, we have transitioned to the fifth Epoch, which is marked by a superposition of two opposing trends. First, while transistor count per ASIC continues to increase at exponential rates, clock rates are flat and the cost of each transistor is now nearly flat, both limited by the increasing complexity and investment required to achieve smaller feature sizes. The implication is that performance normalized to cost improvements, or performance efficiency, of all of compute, DRAM, storage, and network infrastructure, is flattening. At the same time, ubiquitous network coverage, broadly deployed sensors, and data-hungry machine learning applications are accelerating the demand for raw computing infrastructure exponentially.

Notable developments: Machine learning, generative AI, privacy, sustainability, societal infrastructure
Interaction time among computers: 10 microseconds
Featuring:
• 200Gbps–1+Tb/s networks
• Ubiquitous, power-efficient, and high-speed wireless network coverage
• Increasingly specialized accelerators: TPUs, GPUs, Smart NICs
• Socket-level fabrics, optics, federated architectures
• Connected spaces, vehicles, appliances, wearables, etc…
Breakthroughs: Many coming…

Without fundamental breakthroughs in computing design and organization, our ability as a community to meet societal demands for computing infrastructure will falter. Coming up with new architectures to overcome these limitations, new hardware and increasingly, software architectures, will define the fifth epoch of computing.

While we cannot predict the breakthroughs that will be delivered in this fifth epoch of computing, we do know that each previous epoch has been characterized by a factor of 100x improvement in scale, efficiency, and cost-performance, all while improving security and reliability. The demand for scale and capability is only increasing, so delivering such gains without the tailwinds of Moore’s Law and Dennard scaling at our backs will be daunting. We imagine, however, the broad strokes will involve:

Declarative programming models: The Von Neumann model of sequential code execution on a dedicated processor has been incredibly useful for developers for decades. However, the rise of distributed and multi-threaded computing has broken the abstraction to the point where much of modern imperative code focuses on defensive, and often inefficient, constructs to manage asynchrony, heterogeneity, tail latency, optimistic concurrency, and failures. Complexity will only increase in the years ahead, essentially requiring new declarative programming models focused on intent, the user, and business logic. At the same time, managing execution flow and responding to shifting deployment conditions will need to be delegated to increasingly sophisticated compilers and ML-powered runtimes.
Hardware segmentation: In earlier epochs, a general-purpose server architecture with a system balance of CPU, memory, storage, and networking could efficiently meet workload needs throughout the data center. However, when designing for specialized computing needs, ML training, inference, video processing, the conflicting requirements for storage, memory capacity, latency, bandwidth and communication is causing a proliferation of heterogeneous designs. When general-purpose compute performance was improving at 1.5x/year, pursuing even a 5x improvement for 10% of workloads did not make sense given the complexity. Today, such improvements can no longer be ignored. Addressing this gap will require new approaches to designing, verifying, qualifying, and deploying composable hardware ASICs and memory units in months, not years.
Software-defined infrastructure: As underlying infrastructure has become more complex and more distributed, multiple layers of virtualization from memory to CPU have maintained the single server abstraction for individual applications. This trend will continue in the coming epoch as infrastructure continues to scale out and become more heterogeneous. The corollary of hardware segmentation, declarative programming models and distributed computing environments comprised of thousands of servers, will stretch virtualization beyond the confines of individual servers to include distributed computing on a single server, multiple servers, storage/memory arrays, and clusters — in some cases bringing resources across an entire campus together to efficiently deliver end results.
Provably secure computation: In the last epoch, the need to sustain compute efficiency inadvertently came at the cost of security and reliability. However, as our lives move increasingly online, the need for privacy and confidentiality increases exponentially for individuals, for business, and governments. Data sovereignty, or the need to restrict the physical location of data, even derived, will become increasingly important to adhere to government policies, but also to transparently show the lineage of increasingly ML-generated content. Despite some cost in baseline performance, these needs must be first-class requirements and constraints.
Sustainability: The first three epochs of computing delivered exponential improvements in performance for fixed power. With the end of Dennard scaling in the fourth epoch, global power consumption associated with computing has grown quickly, partially offset by the move to cloud-hosted infrastructure, which is 2-3x more power-efficient relative to earlier, on-premises designs. Further, cloud providers have made broad commitments to move to first carbon-neutral and then carbon-free power sources. However, the demand for data and compute will continue to grow and even likely accelerate in the fifth epoch. This will turn power-efficiency and carbon emissions into primary systems-evaluation metrics. Of particular note, embodied carbon over the entire lifecycle of infrastructure build and delivery will require both improved visibility and optimization.
Algorithmic innovation: The tailwinds of exponentially increasing performance have allowed software efficiency improvements to often go neglected. As improvement in underlying hardware components slows, the focus will turn to software and algorithmic opportunities. Studies indicate that opportunities for 2-10x improvement in software optimization abound in systems code. Efficiently identifying these software optimization opportunities and developing techniques to gracefully and reliably deliver these benefits to production systems at scale will be a critical opportunity. Leveraging recent breakthroughs in coding LLMs to partially automate this work would be a significant accelerant in the fifth epoch.

Integrating across the above, the fifth epoch will be ruled by measures of overall user-system efficiency (useful answers per second) rather than lower-level per-component measures such as cost per MIPS, cost per GB of DRAM, cost per Gb/s, etc. Further, the units of efficiency will not be simply measured in performance-per-unit-cost but will explicitly account for power consumption and carbon emissions, and will take security and privacy as primary metrics, all while enforcing reliability requirements for the infrastructure on which society increasingly depends. Taken together, there are many untapped opportunities to deliver the next generation of infrastructure:

A greater than 10x opportunity in scale-out efficiency of our distributed infrastructure across hardware and software.
Another 10x opportunity in matching application balance points — that is, the ratio between different system resources such as compute, accelerators, memory, storage, and network — through software-defined infrastructure.
A more than 10x opportunity in next-generation accelerators and segment-specific hardware components relative to traditional one-size-fits-all, general-purpose computing architectures.
Finally, there is a hard-to-quantify but absolutely critical opportunity to improve developer productivity while simultaneously delivering substantially improved reliability and security.

Combining these trends, we are on the cusp of yet another dramatic 1000x efficiency gain over the next epoch that will define the next generation of infrastructure services and enable the next generation of computing services, likely centering around breakthroughs in multimodal models and generative AI. The opportunity to define, design, and deploy what computing means for the next generation does not come along very often, and the tectonic shifts in this fifth epoch promise perhaps the biggest technical transformations and challenges to date, requiring a level of responsibility, collaboration and vision perhaps not seen since the earliest days of computing.

Read More for the details.

2024 02 14

GCP – Immersive Stream for XR: Performance and savings with NVIDIA L4 GPU and autoscaling

Cloud, Google Cloud gcp

NVIDIA L4 GPUs and Google Cloud deliver the next level of cloud GPU performance to handle the rendering of your most demanding projects, along with capacity autoscaling that automatically scales resources to match traffic surges and daily peaks and troughs. This ensures that you have only the capacity you need at all times. Today, we are excited to announce improvements in both performance and cost savings to Immersive Stream for XR, Google Cloud’s solution for hosting, rendering, and streaming 3D and eXtended reality (XR) experiences, which will help you deliver more immersive and cost-effective experiences via NVIDIA L4 Tensor Core GPU support and capacity autoscaling.

The NVIDIA L4 GPU is now generally available as a GPU class on Immersive Stream for XR for the 3D-only mode. NVIDIA L4 GPUs offer twice the performance compared with the existing NVIDIA T4 Tensor Core GPUs. The NVIDIA L4 GPU class on Immersive Stream for XR redefines the price-performance ratio for immersive experience providers.

“By once again collaborating with Google Cloud for its Immersive Stream for XR, now powered by NVIDIA L4 GPUs, we’re able to offer top performance at a lower cost to power the next-generation of immersive experiences.” – Anne Hecht, senior director of enterprise software products, NVIDIA

Further, a new capacity autoscaling feature dynamically matches capacity to real-time traffic needs. You only need to configure a range of regional capacities with a target buffer of standby available capacity. This helps ensure sufficient server capacity without the overhead of manual capacity planning, allowing you to focus on content creation, not infrastructure, and only pay for the resources you need. This new autoscaling feature is available for all service instances, using either the NVIDIA T4 or NVIDIA L4 GPUs.

The above new capabilities are available today. To get started, you can follow the quickstart guide to get started, select the NVIDIA L4 GPU class and 3D-only experience mode. Generally speaking, 3D-only is a cost-effective mode, priced significantly less than AR mode while still supporting augmented reality experiences. Please see this comparison between 3D and AR modes. You can also learn more about how to enable and configure autoscaling for your service instances in this guide to lower costs for your immersive experiences, immediately.

Read More for the details.

2024 02 14

GCP – Looker Hackathon 2023 results: Best hacks and more

Cloud, Google Cloud gcp

In December, the Looker team invited our developer and data community to collaborate, learn, and inspire each other at our annual Looker Hackathon. More than 400 participants from 93 countries joined together, hacked away for 48 hours and created 52 applications, tools, and data experiences. The hacks use Looker and Looker Studio’s developer features, data modeling, visualizations and other Google Cloud services like BigQuery and Cloud Functions.

For the first time in Looker Hackathon history, we had two hacks tie for the award of the Best Hack. See the winners below and learn about the other finalists from the event. In every possible case, we have included links to code repositories or examples to enable you to reproduce these hacks.

Best Hack winners

DashNotes: Persistent dashboard annotations

By Ryan J, Bartosz G, Tristan F

Have you ever wanted to take note of a juicy data point you found after cycling through multiple filterings of your data? You could write your notes in an external notes application, but then you might lose the dashboard and filter context important to your discovery. This Best Hack allows you to take notes right from within your Looker dashboard. Using the Looker Custom Visualization API, it creates a dashboard tile for you to create and edit text notes. Each note records the context around its creation, including the original dashboard and filter context. The hack stores the notes in BigQuery to persist the notes across sessions. Check out the GitHub repository for more details.

Document repository sync automation

By Mehul S, Moksh Akash M, Rutuja G, Akash

Does your organization struggle to maintain documentation on an increasing number of ever-changing dashboards? This Best Hack helps your organization automatically generate current detailed documentation on all your dashboards, for simplified administration. The automation uses the Looker SDK, the Looker API, and serverless Cloud Functions to parse your LookML for useful metadata, and stores it in BigQuery. Then the hack uses LookML to model and display the metadata inside a Looker dashboard. Checkout the GitHub repository for the backend service and the GitHub repository for the LookML for more details.

Nearly Best Hack winner

Querying Python services from a Looker dashboard

By Jacob B, Illya M

If your Looker dashboard had the power to query any external service, what would you build? This Nearly Best Hack explores how your Looker Dashboard can communicate with external Python services. It sets up a Python service to mimic a SQL server and serves it as a Looker database connection for your Looker dashboard to query. Then, clever LookML hacks enable your dashboard buttons to send data to the external Python service, creating a more interactive dashboard. This sets up a wide array of possibilities to enhance your Looker data experience. For example, with this hack, you can deploy a trained ML model from Google Cloud’s Vertex AI in your external service to deliver keen insights about your data. Check out the GitHub repository for more details.

Finalists

What do I watch?

By Hamsa N, Shilpa D

We’ve all had an evening when we didn’t know what movie to watch. You can now tap into a Looker dashboard that recommends ten movies you might like based on your most liked movie from IMDB’s top 1000 movies. The hack analyzes a combination of genre, director, stars, and movie descriptions, using natural language processing techniques. The resulting processed data resides in BigQuery, with LookML modeling the data. Check out the GitHub repository for more details.

Template analytics

By Ehsan S

If you need to determine which customer segment will be most effective to market to, check out this hack, which performs Recency, Frequency, Monetary (RFM) analysis on data from a Google Sheet to help you segment customers based on their last transaction recency, how often they’ve purchased, and how much they’ve spent over time. You provide the custom Looker Studio Community Connector, along with a Google Sheet, and the connector performs RFM analysis on your Google Sheet’s data. The hack’s Looker Studio report visualizes the results to give an overview of your customer segments and behavior. Check out the Google Apps Script code for more details.

LOV filter app

By Markus B

This hack implements a List of Values (LOV) filter that enables you to have the values of one dimension filter a second dimension. For example, take two related dimensions: “id” and “name”. The “name” dimension may change, while the “id” dimension always stays constant.

This hack uses Looker’s Extension Framework and Looker Components to show “name” values in the LOV filter that translate to “id” values in an embedded dashboard’s filter. This helps your stakeholders filter on values they’re familiar with and keeps your data model flexible and robust. Check out the GitLab repository for more details.

Looker accelerator

By Dmitri S, Joy S, Oleksandr K

This collection of open-source LookML dashboard templates provides insight into Looker project performance and usage. The dashboards use Looker’s System Activity data and are a great example of using LookML to create reusable dashboards. In addition, you can conveniently install the Looker Block of seven dashboards through the Looker Marketplace (pending approval) to help your Looker developer or admin to optimize your Looker usage. Check out the GitHub repository for more details.

The SuperViz Earth Explorer

By Ralph S

With this hack, you can visually explore the population and locations of cities across the world on an interactive 3D globe, and can filter the size of the cities in real time as the globe spins. This custom visualization uses the Looker Studio Community Visualization framework with the clever combination of three.js, a 3D Javascript library, and clever graphics hacks to create a visual experience. Check out the GitHub repository for more details.

dbt exposure generator

By Dana H.

Are you using dbt models with Looker? This hack automatically generates dbt exposures to help you debug and identify how your dbt models are used by Looker dashboards. This hack serves as a great example of how our Looker SDK and Looker API can help solve a common pain point for developers. Check out the GitHub repository for more details.

Hacking Looker for fun and community

At Looker Hackathon 2023, our developer community once again gave us a look into how talented, creative, and collaborative they are. We saw how our developer features like Looker Studio Community Visualizations, LookML, and Looker API, in combination with Google Cloud services like Cloud Functions and BigQuery, enable our developer community to build powerful, useful — and sometimes entertaining — tools and data experiences.

We hope these hackathon projects inspire you to build something fun, innovative, or useful for you. Tap into our linked documentation and code in this post to get started, and we will see you at the next hackathon!

Read More for the details.

2024 02 14

GCP – Leveraging Backup for GKE (BfG) for Effortless Volume Migration: From In-tree to CSI

Cloud, Google Cloud gcp

In Kubernetes, persistent volumes were initially managed by in-tree plugins, but this approach hindered development and feature implementation since in-tree plugins were compiled and shipped as part of k8s source code. To address this, the Container Storage Interface (CSI) was introduced, standardizing storage system exposure to containerized workloads. CSI drivers for standard volumes like Google Cloud PersistentDisk were developed and are continuously evolving. The implementation for in-tree plugins is being transitioned to CSI drivers.

If you have a Google Kubernetes Engine (GKE) cluster(s) that is still using the in-tree volumes, please follow the instructions below to learn how to migrate to CSI provisioned volumes.

Why migrate?

There are various benefits to using a gce-pd CSI Driver, including improved deployment automation, customer managed keys, volume snapshots and more.

In GKE version 1.22 and later, CSI Migration is enabled. Existing volumes that use the gce-pd provider managed through CSI drivers via transparent migration in the kubernetes controller backend. No changes are required to any StorageClass. You must use the pd.csi.storage.gke.io provider in the StorageClass to enable features like CMEK or volume snapshots.

An example of a storage Class with an in-tree storage plugin and a CSI driver.

code_block
<ListValue: [StructValue([(‘code’, ‘apiVersion: storage.k8s.io/v1rnkind: StorageClassrn…rnprovisioner: kubernetes.io/gce-pd <— in-treernprovisioner: pd.csi.storage.gke.io <— CSI provisioner’), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e9a35cc4b20>)])]>

[Please perform the below actions in your test/dev environment first]

Before you begin:

To test migration, create a GKE cluster. Once the cluster is ready, check the provisioner of your default storage class. If it’s already a CSI provisioner pd.csi.storage.gke.io then change it to gce-pd (in-tree) by following these instructions

Refer to this page if you want to deploy a stateful PostgreSQL database application in a GKE cluster.. We will refer to this sample application throughout this blog.

Again, make sure that a storage class (standard) with gce-pd provisioner creates the volumes (PVCs) attached to the pods.

As a next step, we will backup this application using Backup for GKE (BfG) and restore the application while changing the provisioner from gce-pd (in-tree) to pd.csi.storage.io (the CSI driver).

Create a backup Plan

Please follow this page to ensure you have BfG enabled on your cluster.

When you enable the BfG agent in your GKE cluster, BfG provides a CustomResourceDefinition that introduces a new kind of Kubernetes resource: the ProtectedApplication. For more on ProtectedApplication, please visit this page.

A sample manifest file:

code_block
<ListValue: [StructValue([(‘code’, ‘kind: ProtectedApplicationrnapiVersion: gkebackup.gke.io/v1alpha2rnmetadata:rn name: postgresqlrn namespace: blogrnspec:rn resourceSelection:rn type: Selectorrn selector:rn matchLabels:rn app.kubernetes.io/name: postgresql-harn components:rn – name: postgresqlrn resourceKind: StatefulSetrn resourceNames: [“db-postgresql-ha-postgresql”]rn strategy:rn type: BackupAllRestoreAllrn backupAllRestoreAll: {}’), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e9a35cc4640>)])]>

If Ready to backup status shows as true, your application is ready for backup.

code_block
<ListValue: [StructValue([(‘code’, ‘❯ kubectl describe protectedapplication postgresqlrn……rnStatus:rn Ready To Backup: true’), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e9a35cc4e20>)])]>

Let’s create a backup plan following these instructions.

Up until now, we have only created a backup plan and haven’t taken an actual backup. But before we start the backup process, we have to bring down the application.

Bring down the Application

We have to bring down the application right before taking its backup (This is where the Application downtime starts). We are doing it to prevent any data loss during this migration.

My application is currently exposed via a service db-postgresql-ha-pgpool with the following selectors:

We’ll patch this service by overriding above selectors with a null value so that no new request can reach the database.

Save this file as patch.yaml and apply it using kubectl.

code_block
<ListValue: [StructValue([(‘code’, ‘spec:rn selector:rn app.kubernetes.io/instance: “”rn app.kubernetes.io/name: “”rn app.kubernetes.io/component: “”‘), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e9a35cc46a0>)])]>

code_block
<ListValue: [StructValue([(‘code’, ‘❯ kubectl patch service db-postgresql-ha-pgpool –patch-file patch.yamlrnservice/db-postgresql-ha-pgpool patched’), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e9a35cc4160>)])]>

You should no longer be able to connect to your app (i.e., database) now.

Start a Backup Manually

Navigate to the GKE Console → Backup for GKE → Backup Plans

Click Start a backup as shown below.

Restore from the Backup

We will restore this backup to a Target Cluster. Please note that you do have an option to select the same cluster as your source and your target cluster. The recommendation is to use a new GKE cluster as your target cluster.

Restore process completes in following two steps:

Create a restore plan

Restore a backup using the restore plan

Create a restore plan

You can follow these instructions to create a restore plan.

While adding the transformation rule(s) , we will change the storage class from standard to standard-rwo.

Add transformation rules → Add Rule (Rename a PVC’s Storage Class)

Please see this page for more details.

Next, review the configuration and create a plan.

Restore backup using the (previously created) restore plan

When a backup is restored, the Kubernetes resources are re-created in the target cluster.

Navigate to the GKE Console → Backup for GKE → BACKUPS tab to see the latest backup(s). Select the backup you took before bringing down the application to view the details and click on SET UP A RESTORE. Fill all the mandatory fields and click RESTORE.

Once done, switch the context to the target cluster and see how BfG has restored the application successfully in the same namespace.

The data was restored into new PVCs (verify with kubectl -n blog get pvc). Their storageclass is gce-pd-gkebackup-de, which is a special storageclass used to provision volumes from the backup.

Let’s get the details of one of the restored volumes to confirm BfG has successfully changed the provisioner from in-tree to CSI

New volumes are created by the CSI provisioner. Great!

Bring up the application

Let’s patch the service db-postgresql-ha-pgpool back with the original selectors to bring our application up. Save this patch file as new_patch.yaml and apply using kubectl.

We are able to connect to our database application now.

Note: This downtime will depend on your application size. For more information, please see this link.

Use it today

Backup for GKE can help you reduce the overhead of this migration with a minimal downtime. It can also help you prepare for disaster recovery.

Read More for the details.

2024 02 14

GCP – Unlock Web3 data with BigQuery and Subsquid

Cloud, Google Cloud gcp

Editor’s note: The post is part of a series showcasing partner solutions that are Built with BigQuery.

Blockchains generate a lot of data with every transaction. The beauty of Web3 is that all of that data is publicly available. But the multichain and modular expansion of the space has increased the complexity of accessing data, where any project looking to build cross-chain decentralized apps (DApps) has to figure out how to tap into on-chain data that is stored in varying locations and formats.

Meanwhile, running indexers to extract the data and make it readable is a time-consuming, resource-intensive endeavor often beyond small Web3 teams’ capabilities, since proficiency in coding smart contracts and building indexers are entirely different skills.

Having recognized the challenges for builders to leverage one of the most valuable pieces of Web3 (its data!), the Subsquid team set out to build a fully decentralized solution that drastically increases access to data in a permissionless manner.

Subsquid explained

One way to think about the Subsquid Network is as Web3’s largest decentralized data lake — existing to ingest, normalize, and structure data from over 100 Ethereum Virtual Machines (EVM) and non-EVM chains. It allows devs to quickly access (‘query’) data more granularly — and vastly more efficiently — than via legacy RPC node infrastructure.

Subsquid Network is horizontally scalable, meaning it can grow alongside archival blockchain data storage. Its query engine is optimized to extract large amounts of data and is a perfect fit for both dApp development (indexing) and for analytics. In fact, a total of over 11 billion dollars in decentralized application and L1/L2 value depends on Subsquid indexing.

Since September, Subsquid has been shifting from its initial architecture to a permissionless and decentralized format. So far during the testnet, 30,000 participants — including tens of thousands of developers — have built and deployed over 40,000 indexers. Now, the Subsquid team is determined to bring this user base and its data to Google BigQuery.

BigQuery and blockchain

BigQuery is a powerful enterprise data warehouse solution that allows companies and individuals to store and analyze petabytes of data. Designed for large-scale data analytics, BigQuery supports multi-cloud deployments and offers built-in machine learning capabilities, enabling data scientists to create ML models with simple SQL.

BigQuery is also fully integrated with Google’s own suite of business intelligence and external tools, empowering users to run their own code inside BigQuery using Jupyter Notebooks or Apache Zeppelin.

Since 2018, Google has added support for blockchains like Ethereum and Bitcoin to BigQuery. Then, earlier this year, the on-chain data of 11 additional layer-1 blockchain architectures was integrated into BigQuery, including Avalanche, Fantom, NEAR, Polkadot, and Tron.

But while it’s great to be able to run analytics on public blockchain data, this might not always offer exactly the data a particular developer needs for their app. This is where Subsquid comes in.

Data superpowers for Web3 devs and analysts

Saving custom-curated data to BigQuery lets developers leverage Google’s analytics tools to gain insights into how their product is used, beyond the context of one chain or platform.

Multi-chain projects can leverage Subsquid in combination with BigQuery to quickly analyze their usage on different chains and gain insights into fees, operating costs, and trends. With BigQuery, they aren’t limited to on-chain data either. After all, Google is the company behind Google Analytics, an advanced analytics suite for website traffic.

Subsquid Developer relations engineer Daria A. demonstrates how to store data indexing using Subsquid to BigQuery and other tools

Analyzing across domains by combining sets of on-chain activity with social media data and website traffic can help projects understand retention and conversion in their projects while identifying points where users drop off, to further improve their workflows.

“BigQuery is quickly becoming an essential product in Web3, as it enables builders to query and analyze one’s own data, as well as to leverage a rich collection of datasets already compiled by others. Since it’s SQL based, it’s extremely easy to explore any data and then run more and more complex queries. With a rich API and complete developer toolkit, it can be connected to virtually anything,” writes Dmitry Zhelezov, Subsquid CEO and co-founder.

“Now, with the addition of Subsquid indexing, Web3 developers literally have data superpowers. They can build a squid indexer from scratch or use an existing one to get exactly the data they need extremely efficiently. We can’t wait to see what this unlocks for builders.”

Get started with Subsquid on BigQuery today

Subsquid’s support for BigQuery is already feature-complete. Are you interested in incorporating this tool into your Web3 projects? Find out more in the documentation. You can also view an example project demoed on YouTube and open-sourced on GitHub.

The Built with BigQuery advantage for Data Providers and ISVs

Built with BigQuery helps companies like Subsquid build innovative applications with Google Data and AI Cloud. Participating companies can:

Accelerate product design and architecture through access to designated experts who can provide insight into key use cases, architectural patterns, and best practices.Amplify success with joint marketing programs to drive awareness, generate demand, and increase adoption.

BigQuery gives Data Providers and ISVs the advantage of a powerful, highly scalable unified AI lakehouse that’s integrated with Google Cloud’s open, secure, sustainable platform. Click here to learn more about Built with BigQuery.

Read More for the details.

2024 02 14

GCP – Introducing vector search in BigQuery

Cloud, Google Cloud gcp

The advent of advanced AI and machine learning (ML) technologies has revolutionized the way organizations leverage their data, offering new opportunities to unlock its potential. Today, we’re announcing the public preview of vector search in BigQuery, which enables vector similarity search on BigQuery data. This functionality, also commonly referred to as approximate nearest-neighbor search, is key to empowering numerous new data and AI use cases such as semantic search, similarity detection, and retrieval-augmented generation (RAG) with a large language model (LLM).

Vector search is often performed on high-dimensional numeric vectors, a.k.a. embeddings, which incorporate a semantic representation for an entity and can be generated from numerous sources, including text, image, or video. BigQuery vector search relies on an index to optimize the lookups and distance computations required to identify closely matching embeddings.

Here is an overview of BigQuery vector search:

It offers a simple and intuitive CREATE VECTOR INDEX and VECTOR_SEARCH syntax that is similar to BigQuery’s familiar text search functionality. This simplifies combining vector search operations with other SQL primitives, enabling you to process all your data at BigQuery scale.It works with BigQuery’s embedding generation capabilities, notably via LLM-based or pre-trained models. Yet the generic interface allows you to use embeddings generated via other means as well.BigQuery vector indexes are automatically updated as the underlying table data mutates, with the ability to easily monitor indexing progress. This extensible framework can support multiple vector index types, with the first implemented type (IVF) combining an optimized clustering model with an inverted row locator in a two-piece index.The LangChain implementation simplifies Python-based integrations with other open-source and third-party frameworks.The VECTOR_SEARCH function is optimized for analytical use cases and can efficiently process large batches of queries (rows). It also delivers low-latency inference results when handling small input data. Faster, ultra-low-latency online prediction can be performed on the same data through our integration with Vertex AI.It’s integrated with BigQuery’s built-in governance capabilities, notably row-level, data masking, and column-level security policies.

Use cases

The combination of embedding generation and vector search enables many interesting use cases, with RAG being a canonical one. The examples below provide high-level algorithmic descriptions for what can be encoded in your data application or queries using vector search:

Given a new (batch of) support case(s), find ten closely-related previous cases, and pass them to an LLM as context to summarize and propose resolution suggestions.Given an audit log entry, find the most closely matching entries in the past 30 days.Generate embeddings from patient profile data (diagnosis, medical and medication history, current prescriptions, and other EMR data) to do similarity matching for patients with similar profiles and explore successful treatment plans prescribed to that patient cohort.Given the embeddings representing pre-accident moments from all the sensors and cameras in a fleet of school buses, find similar moments from all other vehicles in the fleet for further analysis, tuning, and re-training of the models governing the safety feature engagements.Given a picture, find the most closely-related images in the customer’s BigQuery object table, and pass them to a model to generate captions.

BigQuery-based RAG deep dive

BigQuery enables you to generate vector embeddings and perform vector similarity search to improve the quality of your generative AI deployments with RAG. You can find some some steps and tips below:

You can generate vector embeddings from text data using a range of supported models, including LLM-based ones. These models effectively understand the context and semantics of words and phrases, allowing them to encode the text into vectors that represent its meaning in a high-dimensional space.With BigQuery’s scale and ease of use, you can store these embeddings in a new column, right alongside the data it was generated from. You can then perform queries against these embeddings or build an index to improve retrieval performance.Efficient and scalable similarity search is crucial for RAG, as it allows the system to quickly find the most relevant pieces of information based on the query’s semantic meaning. Vector similarity search involves efficiently searching through millions or billions of vectors from the vector data store to find the most similar vectors. BigQuery vector search uses its indexes to efficiently find the closest matching vectors according to a distance measurement technique such as cosine or euclidean.When doing prompt engineering with RAG, the first step involves converting the input into a vector using the same (or a similar) model to that used for encoding the knowledge base. This ensures that the query and the stored information are in the same vector space, making it possible to measure similarity.The vectors identified as most similar to the query are then mapped back to their corresponding text data. This text data represents the pieces of information from the knowledge base that are most likely to be relevant to the query.The retrieved text data is then fed into a generative model. This model uses the additional context provided by the retrieved information to generate a response that is not only based on its pre-trained knowledge, but also enhanced by the specific information retrieved for the query. This is particularly useful for questions that require up-to-date information or detailed knowledge on specific topics.

The diagram below provides a simplified view of the RAG workflow in BigQuery:

Publication search and RAG examples

In the next three sections, we use the `patents-public-data.google_patents_research.publications` table in the Google Patents public dataset table as a running example to highlight three (of the many) use cases BigQuery vector search enables.

Case 1: Patent search using pre-generated embedding

One of the most basic use cases for BigQuery vector search is performing similarity search using data with pre-generated embeddings. This is common when you intend to use embeddings that are previously generated from proprietary or pre-trained models. As an example, if you store your data and queries in <my_patents_table> and <query_table> respectively, the search journey would consist index creation, followed by vector search:

code_block
<ListValue: [StructValue([(‘code’, “CREATE OR REPLACE VECTOR INDEX `<index_name>`rnON `<my_patents_table>`(embedding_v1)rnOPTIONS(distance_type=’COSINE’, index_type=’IVF’)”), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e9a35dab340>)])]>

code_block
<ListValue: [StructValue([(‘code’, “SELECT query.publication_number AS query_publication_number,rn query.title AS query_title, base.publication_number,rn base.title, distancernFROMrn VECTOR_SEARCH(rn TABLE `<my_patents_table>`, ’embedding_v1′,rn TABLE `<query_table>`, top_k => 5, distance_type => ‘COSINE’)”), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e9a35dabe80>)])]>

Note that indexing is mainly a performance optimization mechanism for approximate nearest-neighbor search, and vector search queries can succeed and return correct results without an index as well. For more details, including specifics on recall calculation, please see this tutorial.

Case 2: Patent search with BigQuery embedding generation

You can achieve a more complete end-to-end semantic search journey by using BigQuery’s capabilities to generate embeddings. More specifically, you can generate embeddings in BigQuery via LLM-based foundational or pre-trained models. The SQL snippet below assumes you have already created a BigQuery <LLM_embedding_model> that references a Vertex AI text embedding foundation model via BigQuery (see this tutorial for more details):

code_block
<ListValue: [StructValue([(‘code’, “CREATE TABLE `<patents_my_embeddings_table>` ASrn SELECT * FROM ML.GENERATE_TEXT_EMBEDDING(rn MODEL `<LLM_embedding_model>`,rn (SELECT *, abstract AS contentrn FROM `patents-public-data.google_patents_research.publications`rn WHERE LENGTH(abstract) > 0 AND LENGTH(title) > 0 AND country = ‘Singapore’))rn WHERE ARRAY_LENGTH(text_embedding) > 0;”), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e9a35dab130>)])]>

We skip demonstrating the index-creation step, as it is similar to “Case 1” above. After the index is created, we can use VECTOR_SEARCH combined with ML.GENERATE_TEXT_EMBEDDING to search related patents. Below is an example query to search patents related to “improving password security”:

code_block
<ListValue: [StructValue([(‘code’, “SELECT query.query, base.publication_number, base.title, base.abstractrnFROM VECTOR_SEARCH(rn TABLE `<patents_my_embeddings_table>`, ‘text_embedding’,rn (SELECT text_embedding, content AS queryrn FROM ML.GENERATE_TEXT_EMBEDDING(rn MODEL `<LLM_embedding_model>`,rn (SELECT ‘improving password security’ AS content))rn ), top_k => 5)”), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e9a35dab5b0>)])]>

Case 3: RAG via integration with generative models

BigQuery’s advanced capabilities allow you to easily extend the search cases covered above into full RAG journeys. More specifically, you can use the output from the VECTOR_SEARCH queries as context for invoking Google’s natural language foundation (LLM) models via BigQuery’s ML.GENERATE_TEXT function (see this tutorial for more details).

The sample query below demonstrates how you can ask the LLM to propose project ideas to improve user password security. It uses the top_k patents retrieved via semantic similarity vector search as context passed to the LLM model to ground its response:

code_block
<ListValue: [StructValue([(‘code’, ‘SELECT ml_generate_text_llm_result AS generated, promptrnFROM ML.GENERATE_TEXT(rn MODEL `<LLM_text_generation_model>`,rn (SELECT CONCAT(‘Propose some project ideas to improve user password security using the context below: ‘, STRING_AGG(FORMAT(“patent title: %s, patent abstract: %s”, base.title, base.abstract), ‘,\n’)) AS prompt,rn FROM VECTOR_SEARCH(rn TABLE `<patents_my_embeddings_table>`, ‘text_embedding’,rn (SELECT text_embedding, content AS queryrn FROM ML.GENERATE_TEXT_EMBEDDING(rn MODEL `<LLM_embedding_model>`,rn (SELECT ‘improving password security’ AS content))rn ), top_k => 5)rn ),rn STRUCT(0.4 AS temperature, 300 AS max_output_tokens, 0.5 AS top_p, 5 AS top_k,rn TRUE AS flatten_json_output));’), (‘language’, ”), (‘caption’, <wagtail.rich_text.RichText object at 0x3e9a35dab640>)])]>

Getting started

BigQuery vector search is now available in preview. Get started by following the documentation and tutorial.

Read More for the details.

2024 02 14

GCP – 11 Ways You Win with ChromeOS Flex

Cloud, Google Cloud gcp

It’s estimated that cyber crimes will cost $10.5 trillion annually by 2025.1 How are you planning to protect your business from this growing threat? Traditionally, upgrading to the latest available operating system has been a viable way to reduce the chance of an attack, but with hundreds of millions of Windows 10 devices losing support soon because they are ineligible for the newest version of Windows, finding a way to keep these PCs secure and sustainable should be a top priority. With ChromeOS Flex, businesses can install a new, auto-updating operating system on their existing fleet of Windows devices. This modernizes devices, extending their lifespan and keeping them out of landfills, reducing their attack surface, and saving on hardware refresh costs.

In this article, I’ll share how your organization can win in 11 different ways with ChromeOS Flex.

ChromeOS Flex is the cloud-first, fast, and secure operating system for PCs and Macs

1. It’s secure

ChromeOS Flex is built with security as a first principle, not an afterthought. Features like data encryption, automatic updates and sandboxing help protect against malware and other online threats. IT controls prevent data loss on lost or stolen devices, and regular security updates provide a continuous line of protection. Best of all, as of today, ChromeOS Flex has never had a successful reported ransomware attack, and upgrading your devices with ChromeOS Flex eliminates the need for antivirus software, saving your business on application and management costs!

2. It’s familiar

If you’ve used Chrome Browser or have Google Workspace, ChromeOS Flex will be a natural addition to your organization. They use the Chrome web browser as their principal user interface, and all three services can be managed from the same Google Admin console as well.

3. It increases productivity

ChromeOS Flex is designed to be fast and efficient, helping your team be more productive. ChromeOS Flex devices boot quickly, don’t slow down over time, and update in the background every four weeks. If you’re starting to notice your Windows or Mac devices slow down or struggle with compatibility issues, ChromeOS Flex can bring them back to life.

4. It’s easy to manage

Modernizing your Windows and Mac devices with ChromeOS Flex gives you the power to effortlessly manage them from the Google Admin console. IT admins can easily configure, deploy, and manage any number of devices remotely.

5. It works with your business applications

With ChromeOS Flex, you can feel confident deploying third party solutions. Gain access to a wide range of Chrome Enterprise Recommended solution partners that have been validated by Google, so you can choose third party products that are best for your ecosystem across security, kiosks, healthcare, and many more. You can also stream legacy Windows and productivity applications that are seamlessly integrated into the ChromeOS experience with virtual app delivery.

6. It’s flexible

The ChromeOS team has certified nearly 600 devices for ChromeOS Flex, meaning almost any device you have deployed will work seamlessly with ChromeOS Flex installed! Whether you need to modernize devices for your IT team, your back office, or front of house employees, ChromeOS Flex is a solution that can handle the job. You can boot up your device from the USB drive to experience ChromeOS Flex without erasing your existing operating system, and when you’re ready, ChromeOS Flex can be installed at no cost in as little as 5 minutes.

7. It reduces IT support costs

ChromeOS Flex is a low-maintenance, reliable operating system, reducing the need for IT support. Google provides regular software updates and security patches for ChromeOS Flex, ensuring your old devices remain supported for many years.

8. It reduces hardware costs

ChromeOS Flex can be installed on your existing hardware, reducing the need to purchase new devices. Whether your devices have run into security issues and become compromised, are starting to slow down, or are not eligible for upgrades to new OS versions, ChromeOS Flex can keep your devices up and running.

9. It prevents devices from becoming e-waste

ChromeOS Flex can help to extend the lifespan of existing hardware, reducing the need for frequent replacements and preventing devices from heading to landfills. Extending the usable lifespan of your older computers reduces electronic waste and is a sustainable choice.

10. It’s energy efficient

Devices running ChromeOS Flex consume 19% less energy on average than other comparable operating systems.2 That means installing ChromeOS Flex on your devices is not only good for the planet, it’s also good for your bottom line.

11. It’s perfect for businesses of any size

ChromeOS Flex, with its simple management and security features, is a scalable solution ideal for businesses looking for a cost-effective and easy-to-manage operating system. Whether you’re managing devices for a smaller business on the path to growth or an established enterprise, ChromeOS Flex can support you.

Providing a modern, reliable experience for your employees shouldn’t require you to replace your perfectly good device fleet. If you want to improve device security, keep your existing fleet out of landfills, and save your business on refresh costs in the process, you should give ChromeOS Flex a try! Best of all, you can get started in as little as 5 minutes and at no cost. Try ChromeOS Flex here.

1. PurpleSec. (2023). ‘2023 Cyber Security Statistics: The ultimate list of stats, data & trends’. https://purplesec.us/resources/cyber-security-statistics/

2. Sutton-Parker, J. (2022), ‘Quantifying greenhouse gas abatement delivered by alternative computer operating system displacement strategies’. 1877-0509. Procedia Computer Science, Volume 203, 2022, Pages 254-263. Amsterdam, the Netherlands: Science Direct, Elsevier B.V.

Read More for the details.

2024 02 13

GCP – How the DoD unified data organization-wide with Apigee

Cloud, Google Cloud gcp

Discovering data management solutions that align with the Department of Defense’s (DoD) rigorous requirements for scale, security, and stability presents a formidable challenge.

When DoD personnel make decisions, they need instant access to the data that supports them. Everything—from managing the DoD itself and executing Command and Control to making smarter decisions about their supply chain—relies on their access to reliable data. Retrieving the data to make these decisions is critical to ensuring the right people get the right resources at the right time. However, data silos within DoD were bogging down DoD’s ability to access and use data effectively for years. It was also inhibiting their ability to innovate. Without an effective way to manage their data, the DoD could not use artificial intelligence/machine learning (AI/ML) tools to their fullest potential, including the ability to search across all data within the Department.

In an effort to address this, DoD has for several years successfully managed and grown a dataAI/ML platform known as Advana that serves over 100,000 users with nearly 500 data sources made available centrally. Seeking to expand upon Advana’s data sharing capabilities, the DoD engaged Google for help given Google’s experience in managing vast volumes of data for analytics, enterprise search, and machine learning.

Data For Anyone, Anytime They Need It

CDAO sought to mature Advana’s data sharing capabilities in pursuit of a decentralized data mesh using an application programming interface (API). The challenge for the CDAO was managing the volume of APIs at DoD scale. Google Cloud’s API management software, Apigee, offered an ideal solution. Google Public Sector began working with the DoD in 2022 to help ensure DoD personnel could manage their APIs by leveraging Apigee in Advana. In September 2023, Apigee achieved General Availability (GA) for their API developer portal. This portal ensures any sector of the DoD can build their own Apigee or API instance and interface with Advana’s data mesh. This keeps the data flowing and helps the DoD streamline processes and communications so they can meet their mission. Ultimately, the DoD will now be able to easily build APIs and bring together a network of 1,200 systems into a mesh that allows data to be quickly searched and retrieved across the organization.

To learn more, contact the Google Public Sector Professional Services Organization.

Read More for the details.

2024 02 13

GCP – Announcing the general availability of Network Function Optimizer for GKE Enterprise

Cloud, Google Cloud gcp

Traditionally, customers have relied on VMs to run their network-intensive applications. But for organizations who want to migrate network-intensive VM-based applications to Kubernetes and achieve the same level of performance, they need to add advanced container networking capabilities to their Pods, for example the ability for Pods to have multiple interfaces, and high-performance acceleration technology for Pods.

Today, we’re excited to announce that Network Function Optimizer is generally available for GKE Enterprise, the premium edition of Google Kubernetes Engine. As part of GKE Enterprise, Network Function Optimizer delivers the enterprise scale and high data-plane performance for containerized applications that our customers have been looking for, including the functionality that we’ve developed as part of our Multi-network Kubernetes Enhancement Proposal and our Multi-network, New Level of Multi-tenancy presentation into the Kubernetes community. On top of that, we’ve packaged all these advanced container networking features to make it easy for you to migrate your network-intensive workloads into GKE Enterprise in one nifty solution!

Announced at the Mobile World Congress 2023, Network Function Optimizer in GKE Enterprise lets you apply the container networking enhancements directly from the GKE console, to support additional use cases in AI/ML, telco and containerized security. Some key use cases and benefits include:

Extending multi-network capabilities to Pods that run on the nodes. With multi-network support for Pods, you can enable multiple interfaces on nodes and Pods in a GKE cluster, allowing for data-plane and control-plane separation. This is a strict requirement for the deployment of containerized network functions (CNFs) such as containerized firewalls, containerized IDS/IPS, containerized proxies etc.

Delivering a high-performance data plane natively in software that’s comparable to those assisted by hardware, simplifying workload scheduling on any Pod and removing underlying hardware/NIC dependency. With this flexibility, you can now move your applications to containers and benefit from the advantages containers can bring such as autoscaling, bin-packing, and portability.

Workload-optimized infrastructure for AI

With the rapid adoption of AI/ML, enterprises are building AI platforms on Kubernetes to serve models and perform inference for development teams and users across their organization.

Two open-source projects in particular are enabling AI/ML at scale in the Kubernetes community: Kubeflow, which originated at Google, and is now a CNCF Incubator project, provides a multi-user environment and interactive notebook management. Ray orchestrates distributed computing workloads across the entire ML lifecycle, including training and serving. Although Ray is not a Kubernates-native project, the community’s kubeRay toolkit simplifies Ray deployment on Kubernetes; we provide a Terraform template for running Ray on GKE.

Network Function Optimizer running on GKE complements these solutions, providing low-latency, high performance network interfaces with which to run AI/ML workloads such as Kubeflow and Ray.

“Infrastructure services that make it easier to scale out are critical for our customers to fully take advantage of the new next generation AmpereOne™ powered C3A compute instances. C3A instances are positioned to be the most dynamically scalable compute instances on the market today, especially paired with services like GKE and Google’s new Network Function Optimizer services.” – Jeff Wittich, Chief Product Officer, Ampere

Market-leading threat protection with robust security ecosystem

Google Cloud customers want threat protection that’s built-in, but they also want the flexibility to bring in their preferred security vendors from on-premises to their cloud environments.

Deploying a single or several GKE clusters in a shared VPC can be relatively straightforward, but what happens if you need your Kubernetes clusters to communicate with multiple VPCs? Network Function Optimizer can help. Its multi-network Pod capabilities enable multiple VPC networks to communicate with the GKE cluster.

Another foundational feature is its high-performance data plane. As part of Network Function Optimizer, containerized security services like a firewall, intrusion detection, VPN, etc. combined with native Linux data-plane acceleration, have comparable performance to hardware-based acceleration technology — but with the abstraction, flexibility and portability inherent to software. With this, the security service can be scheduled on any Pod, removing the dependency between the containerized security service and the underlying NIC hardware.

This way, you can continue to use Google Cloud’s built-in network security solutions with the flexibility to bring in your choice of third-party security solutions. With Network Function Optimizer, you can create security services with multiple layers of inspection provided by your preferred firewall and deep-packet-inspection vendors, and balance the network-intensive security-inspection load across many Pods.

“In the cloud and data-centric world, threats are coming from internal and external sources, with a rapidly disappearing perimeter. Broadcom is partnering with Google to accelerate the delivery of advanced network functions that operate transparently in the data path on traffic as a bump in the wire using GKE and Network Function Optimizer. We are excited to work with Google Cloud in moving the needle forward for the whole industry and be a first adopter of these advanced functions. This enables us to improve our resiliency and increase our velocity, delivering features for our customers.” – Gary Tomic, Principal Architect, Fellow, Broadcom

Getting started

To learn how to deploy and validate the Network Function Optimizer capabilities into your own GKE deployment, watch this introduction video, demo videos, and visit this codelab, where you’ll learn to enable more than a single interface on Nodes and for Pods in your GKE clusters.

Read More for the details.

2024 02 13

GCP – Your RAGs powered by Google Search technology, part 2

Cloud, Google Cloud gcp

In the first post of this series, we explored the concept of retrieval augmented generation (RAG) and how the same technologies that power Google Search can greatly enhance the effectiveness of the information retrieval capabilities of a RAG system. In this follow-up post, we will now take a deeper look at the other critical technologies that are essential for building a successful RAG system to help ground large language models (LLM) when building applications.

Deep re-ranking with Google signals

Approximate nearest neighbors (ANN) algorithms like ScaNN are proficient at quickly retrieving results in semantic search, but they are not as strong at scoring and ordering them precisely. Anyone who has ever used Google Search knows how important it is for the top results to be the most relevant to our queries and carefully sorted by relevance.

That’s why modern semantic search engines use a two-stage retrieval approach to generate results. First, they use an ANN retriever to do a quick first pass and bring up results, and then apply a re-ranking model to fine-tune the results and make sure the most relevant ones are at the top — all in milliseconds. This same approach also applies in RAG systems to provide production-quality search.

The two staged retrieval approach

(from Scaling deep retrieval with TensorFlow Recommenders and Vertex AI Matching Engine)

For example, Google Search provides an advanced ranking system that looks at dozens of signals to determine the final ranking of results for every single query. Using deep learning models in its ranking system, RankBrain, Google Search can understand the relevance between queries and documents and re-rank the results with the signals from those models to optimize the user experience.

Vertex AI Search leverages the same cutting-edge technologies and unique ranking signals available in Google Search to combine results from embedding-based and keyword-based retrieval and power its advanced re-ranking model specifically designed for enterprise needs. As a result, there is no need for developers to spend precious time or additional resources on LLMs for re-ranking results.

Furthermore, while Vertex AI Search has already fine-tuned the retrieval and ranking models using Google’s proprietary data, customers can further enhance these models with their own data to better align them with their specific business use cases and data characteristics.

Filtering and boosting

Filtering and boosting search results is one of the most common business needs for search capabilities. Similarly, a RAG system also requires the ability to filter and boost results when retrieving information. For instance, a streaming service typically prefers to highlight premium content while avoiding the display of content that is no longer available. Likewise, a healthcare directory often prioritizes highly-rated providers and filters out those who are not accepting new patients.

Vertex AI Search provides a flexible filter expression syntax to meet these filtering and boosting requirements for both search and information retrieval engines. This syntax enables developers to define complex business logic, providing precise and dynamic control over search results with associated metadata. In addition, Vertex AI Search makes it easy to reduce the presence of irrelevant content in search results and enhance search precision, automatically extracting filters from natural language queries to align with metadata fields, such as category and price.

Extraction and generation

Once you have the re-ranked and filtered list of relevant documents, you might consider extracting the sections that best capture the essence of each document in relation to the query, for use as input for LLMs for answer generation or summarization. Otherwise, the results may contain too much content and too much extra noise.

Below, we’ll discuss these technologies in further detail.

Smart content extraction
Vertex AI Search provides multiple ways to extract highly relevant content from documents, helping to surface and isolate the most valuable information for an LLM from the sources. With this functionality, a RAG system can provide a spot-on answer to a query rather than generating results from an entire document that might include a lot of irrelevant, unnecessary information.

Each search result may contain the following types of content:

Snippet: A brief text extract from search result documents, providing a preview of a result’s content. Snippets are displayed below each result, similar to snippets in Google Search, and help users assess the relevance and usefulness of that result.Extractive answer: A verbatim text extract from the original document, similar to Google Search Featured Snippets, typically positioned at the top of pages to offer concise, contextually relevant answers to user queries.Extractive segments: A more detailed verbatim text extract from a document to provide more context for LLMs than an extractive answer. Extractive segments can be further processed or used as input for an LLM for various tasks, including, question answering, content summarization, or a conversation where the text extract relevant to the query is added with the context around it.

Let’s take a look at an example of each type of extracted content for the following query “What is Vertex AI Search and Conversation?”:

Snippet, Extractive answer, and Extractive Segment with Vertex AI Search

For even deeper customization, Vertex AI Search can pull extractive segments from a result document according to specific business requirements. For instance, a customer can adjust the number of segments to extract from each document, incorporate adjacent segments for extra context, and even select top segments or filter out less relevant ones based on segment relevant scores. These options are particularly useful for improving the performance of your RAG systems as you can extract the most important part of the documents to the LLM, so it can generate highly relevant and reliable outputs.

Summarization and multi-turn search
If the primary use case for your RAG system is to simply summarize the search results or provide a chatbot-style context-aware search interface, then you don’t even have to build a RAG system at all. Vertex AI Search comes with built-in LLMs that specialize in generating search result summaries and also supports follow-up questions with multi-turn search to provide a context-aware search.

These built-in LLMs are fine-tuned to work seamlessly with the extracted content, ensuring the creation of precise, well-grounded answers with citations. The following is an example response to the query “What are the benefits of BigQuery?”

Summarization with citations

Furthermore, you have the flexibility to customize summarization and perform multi-turn search to your specific needs, such as specifying a condition that triggers generating a response and defining a preamble to make prompts more effective. Preambles serve to influence the nature of the response by providing additional context or instructions to the LLMs.

For instance, you can include a preamble in a prompt like “explain like you would to a ten year old” or “give a very detailed answer.” The following is an example response to the same query used to generate the response above but with a preamble “please show the answer format in an ordered list” added.

Summarization with a customization instruction

Using a built-in LLM from Vertex AI Search to summarize and enable multi-turn search is a quick way to introduce the power of large models into your enterprise systems, rather than building your own RAG system from scratch.

Google-quality document processing

Another challenge when developing a gen AI application for tasks like question answering, content summarization, or conversation with proprietary enterprise data is effectively handling documents in various formats and lengths.

Before performing retrieval and ranking, it’s essential to understand the structure of each document and properly process them, making large volumes of textual information accessible and manageable for the retrieval and generation components of your RAG system.

Document processing in Vertex AI Search

Document understanding and text chunking
Vertex AI Search goes beyond only extracting text from documents. It also identifies structural and content elements, including titles, section headings, paragraphs, and tables, which define the organization and hierarchy of a wide variety of documents. This information is used to intelligently segment the document objects into smaller, retrievable segments (chunks), preserving the coherence of semantic elements while minimizing noise. This segmentation is more effective than the widely-used simple text chunking, which often fails to maintain semantic coherence.

Additionally, Vertex AI Search can extract information from each segment as annotations to enhance the search experience. The document segments are tokenized and embedded to create indexes for retrieval and ranking.

Once Vertex AI Search has identified the relevant document segments through retrieval and ranking, it can further process their content or use them as input for generating responses, resulting in higher quality and more relevant generated output.

Document and query annotation with Knowledge Graph
As we covered in our previous post, keyword search uses keywords to find relevant information while semantic search looks at similarity in the meaning of content. Knowledge Graph finds information by using the graph relationship between entities. Using Knowledge Graph can be more beneficial when you want to extract entities and their relationships from text, helping to create structure from text. This approach is similar to embeddings, except the result is structured as a graph, making it easier for humans to comprehend it. As a result, Knowledge Graph is another promising option for retrieving information in RAG systems.

At Google, we have been utilizing Knowledge Graph in Google Search since 2012, helping add more context to search queries by providing information about things, people, or places that Google already knows about. Google Search leverages Knowledge Graph to tap into its existing intelligence and understanding of the web to find and return results related to a user’s search query, such as landmarks, celebrities, cities, geographical features, movies, and more.

Knowledge Graph is also integrated with Vertex AI Search to enhance its search capabilities including web and media search. When documents and queries are processed or summarized, Vertex AI Search will also automatically identify relevant entities with Knowledge Graph and add it to any annotations.

For instance, imagine your document or query contains the keyword “Buffett,” and it is highly probable that the keyword refers to Warren Buffett. Vertex AI Search will automatically annotate documents with additional information about him from the Google Search Knowledge Graph and add related Buffettt keywords to the original query. This increases the likelihood of retrieving this document with other keywords or topics related to him.

Document and query annotation with Knowledge Graph relationships

Collect scattered enterprise data

While supporting a wide range of enterprise data sources with a single search engine may not seem like the most exciting or widely discussed topic, it’s a significant challenge that will need to be addressed — especially as LLMs and RAG systems get implemented into existing production applications and users discover the value they bring. More and more, you’ll likely encounter questions like: “Can I search documents across our databases, file servers, and web pages?” or “Will my search results also include documents from Google Drive, Salesforce, and Confluence?”.

Here are some of the ways Vertex AI Search can make it easier to bring these capabilities to real-word services.

Blended Search and web crawling
One recent accomplishment with Vertex AI Search is Blended Search, which allows you to run a single query across all your data, including:

Structured data, such as tables imported from BigQuery, databases, CSV, JSON, etc.Unstructured data like PDFs or web page content imported from storagePublic website content

This means you don’t have to issue queries on multiple search indexes for each data type and then integrate them. Instead, you can run a single query across different systems and data sources simultaneously and get merged results back from all of them.

Blended Search in Vertex AI Search

Vertex AI Search also enables you to crawl all your company’s websites and easily generate a website index that you can easily search with advanced website indexing. All you have to do is specify the URL or URL pattern and Vertex AI Search will instantly create a data store with all the relevant website pages. You can then use Google Search Central, to manage site crawling on your website.

Connectors
In addition to supporting and enabling search across various datasets, Vertex AI Search also offers a wide variety of Connectors for popular third-party data sources, including:

Available as Private Preview:

JiraSalesforceConfluence

Features under development:

Google DriveMicrosoft SharePoint and OneDriveSlackBoxDropBoxServiceNow

When paired with an RAG system, Vertex AI Search Connectors can quickly grab valuable information stored in various data stores scattered throughout the enterprise and across the web. Then, they use the human-level intelligence of LLM to digest the information, summarize and serve it to users in a way that is easy to understand and use.

Everything is fully managed

Now you might be thinking, “This is a lot to take on.” And if you plan to build a RAG system from scratch — you’re right. Implementing advanced search technologies that could reproduce search quality comparable to Google Search, whether you’re using them for gen AI apps or not, would take years of development and hiring a host of data scientists and engineers with specialized skills in ML, search engines, DevOps, and MLOps.

We believe that developers should be able to focus on utilizing the power of LLMs and RAGs to solve business-specific problems, instead of trying to solve the same problems that Google has been working on for the past 25+ years. That’s why Vertex AI Search shares most of its technology and infrastructure with Google Search, which is operated 24×7 for billions of users by our top-notch SRE team. You don’t need to worry about managing any server instances or containers because there are none. Everything is packaged as an out-of-the-box, fully managed serverless service — just like BigQuery.

Building RAG systems with Vertex AI Search

In this article, we have seen the benefits of using Vertex AI Search as an information retrieval engine in a RAG system. Even for developers who do not have any experience on building a RAG system, Vertex AI Search provides a great foundation to get started faster with new gen AI technologies.

For more information about how to build your own RAG systems with Vertex AI search, we recommend explore the following resources:

LangChain-based samples and documents: RAG sample notebooks using Vertex AI Search, PaLM, and LangChain and Vertex AI Search support in LangChain Grounding in Vertex AI: provides a quick and easy way for groundingCheck Grounding API provides a grounding score for an answer candidateVertex AI Conversation-based grounding: Vertex AI Search and Conversation: search with follow-ups How to use custom embedding with Vertex AI Search Vertex AI Search and Conversation product page Get started with Vertex AI Search Vertex AI Search sample notebooks on GitHub Gen AI repoVideo: Harnessing the power of generative AI to deliver next-gen search experiences

Read More for the details.

2024 02 13

GCP – Power self-serve analytics and generative AI with Sparkflows and Google Cloud

Cloud, Google Cloud gcp

Self-service analytics powered by ML and generative AI is the new holy grail for data-driven enterprises, enabling enhanced decision-making through predictive insights, and providing a significant boost in operational efficiency and innovation. C-level executives increasingly see self-service analytics as the key driver of employee productivity and business efficiency.

Today, technical practitioners employ a variety of open-source libraries, including Apache Spark, Ray, pandas, sk-learn, h20 and many more to create analytics and ML applications. This entails writing a lot of code, which has a steep learning curve. Additionally, developing front-end interfaces for business users to interact with the systems in a secure and scalable manner takes a long time.

Enterprises also face challenges in hiring and retaining data-science experts and incur overhead costs for managing a large number of heterogeneous tools and technologies. Handling a growing variety and volume of data from siloed sources is a huge barrier to analytics initiatives. Lack of seamless workload scaling slows business solutions development.

Democratizing analytics and building ML applications are best done when business users and IT teams are empowered with services offered by cloud technology through intuitive, easy-to-use workflows, analytical apps, and conversational interfaces.

This brings out the strong need for a unified self-service platform made for all users to create and launch business solutions powered by cloud.

Sparkflows

Sparkflows is a Google Cloud partner that provides a powerful platform packed with self-service analytics, ML and gen AI capabilities for building data products. Sparkflows help integrate diverse open-source technologies through intuitive user-driven interfaces.

With Sparkflows, data analytics teams can turbocharge the development of ETL, exploratory analytics, feature engineering, ML models and gen AI apps using 460+ no-code/ low-code processors, and various workbenches as shown below.

Various AI and gen AI workbenches in Sparkflows

Self-service with Sparkflows and Google Cloud

Sparkflows running on Google Cloud provides unified self-serve data science capabilities with connectivity top BigQuery, Vertex AI, AlloyDB and Cloud Storage. The solution automatically pushes down the computation to high-performance distributed job execution engines like Dataproc and BigQuery. These automated integrations scale business solutions for very large datasets.

Interaction diagram: Sparkflows and Google Cloud

Sparkflows has developed a large number of solutions for the sales and marketing, manufacturing and supply chain departments of retail and CPG customers.

Business scenarios using Sparkflows and Google Cloud

Let’s assume the engineering team of a retail company needs to empower the marketing team with a self-service analytics tool that can identify the customers who are likely to churn, and measure the effectiveness of the campaigns by analyzing the coupon responsiveness, sales, and demographic data.

The team needs to ingest and prepare data quickly, build ML models, analytics reports and gen AI apps in an automated fashion where Spark code will be generated and jobs will be submitted to a Dataproc cluster effortlessly.

Installation

As the first step, Sparkflows is installed inside the customer’s secure VPC network either on a virtual machine or in a container running in Google Cloud. Sparkflows runs securely with built-in SSO integration.

Configuration

Admin users configure the Dataproc Serverless Spark cluster and various types of LLM services like PaLM API in Sparkflows admin console.

Self-service solution design & execution

Sparkflows enables a unified experience for continuous machine learning.

Let’s now discuss the steps required to identify customers who are likely to churn and the ability to analyze the reviews by customers to measure satisfaction. This process involves:

Dataset exploration Data preparation ML model training ML model prediction Visualizations Creating analytical apps Generative AI apps

Sparkflows connects with various Google Cloud services for performing the above operations (Ref: Interaction diagram: Sparkflows and Google Cloud).

Datasets

In this example, the datasets (customer transactions, campaigns, coupons and demographic info) are stored in BigQuery and product review data is in Cloud Storage. Business users can select a domain like retail and then view all the datasets stored in Google Cloud within Sparkflows. Users can browse files in Cloud Storage, explore and query BigQuery tables. Sparkflows dataset explorer seamlessly connects with Data Catalog.

Data preparation

Users can rapidly design various workflows for ingesting the datasets and performing data profiling, automated quality checks, cleaning and exploratory analysis using 350+ no-code/low-code data preparation processors. All these workflows help automate the Spark code generation and functionality development for the current business solution, cutting down the engineering time from weeks to hours.

Each of the visual workflows results in the automatic creation of a Spark job which is launched on Dataproc Serverless. Dataproc Serverless is an ideal platform for running these jobs. It is a highly performant and cost-effective distributed computing platform that is able to quickly spin up additional compute resources as needed. The platform is also very cost-effective as customers are only billed for resources for the duration of the job execution.

ML model training

Data scientists and analysts can perform feature engineering to calculate various aggregated metrics from the data processed by workflows designed in previous steps. Developers can leverage 80+ No Code/Low Code ML processors to create an ML modeling workflow. The features are used for training a model which can predict customers most likely to churn.

The features based on purchase pattern and coupon redemption information are used for creating the segments of customers

ML model prediction

Below is an example of the Prediction workflow for churn prediction.

The Prediction workflow can be triggered manually, via the built-in scheduler, through the API, or using the Analytical App UI.

ML Model Prediction Workflow

Visualization – descriptive and predictive analytics

Business users can drag the nodes used in workflows in the report designer UI and create powerful reports, which allow data scientists to inspect profiling stats, data quality results, exploratory insights, training metrics and prediction outputs.

When the underlying workflows are executed in a Dataproc cluster, the reports are automatically refreshed.

Reports of descriptive and predictive analytics

Business analytical apps

Business analytical apps in Sparkflows let business users build front-end applications for data products. Business users interact with these apps using their browsers. The analytical apps are built with an interactive UI.

Gen AI apps

Now, let’s build a few gen AI apps to allow the business team perform the following operations:

Ask questions from the product review dataSummarize, extract topics and translate texts

The first step is to configure the Vertex PaLM API connection in the admin console and select the connection in the Analytical App.

Allow users to query product reviews and gain insights

Allow users to translate and query documents

This is how Sparkflows helps sales and marketing teams of a retail company identify potential customer churn, measure campaign effectiveness, find target customer segments, and analyze product reviews and business documents.

ML solutions

It enables a wide range of gen AI apps, from content synthesis, content generation, and NLQ-based reports, to prompt-based business solutions.

Generative AI solutions

Better together

Having the ability to move fast with AI and generative AI is of great value to all types of enterprises. The partnership between Sparkflows and Google Cloud puts powerful and affordable self-serve AI and gen AI capabilities in the hands of the users in a secure and scalable way. Building gen AI solutions using Sparkflows and Google Cloud is highly affordable, thanks to Vertex’s highly cost-effective gen-ai pricing model and Sparkflows’ discounted pricing package. Overall, Sparkflows with Google Cloud drives operational efficiencies, accelerates business solutions, and speeds up time to market thereby propelling business growth.

Try out Sparkflows

Here are a few links to get started with Sparkflows and Google Cloud:

Get a sandbox instance in Google Cloud Sign up for the playground Ask for a demo Learn more Tech blogs

We thank the many Google Cloud and Sparkflows team members who contributed to this collaboration, especially Kaniska Mandal and Deb Dasgupta for their guidance during the process.

Read More for the details.

2024 02 13

GCP – Context-aware code generation: Retrieval augmentation and Vertex AI Codey APIs

Cloud, Google Cloud gcp

Retrieval augmented generation, or RAG, is a way to use external data or information to improve the accuracy of large language models (LLMs). Today, we’ll explore how to use RAG to improve the output quality of Google Cloud AI models for code completion and generation on Vertex AI using its Codey APIs, a suite of code generation models that can help software developers complete coding tasks faster. There are three Codey APIs that help boost developer productivity:

Code completion: Get instant code suggestions based on your current context, making coding a seamless and efficient experience. This API is designed to be integrated into IDEs, editors, and other applications to provide low-latency code autocompletion suggestions as you write code.Code generation: Generate code snippets for functions, classes, and more in seconds by describing the code you need in natural language. This API can be helpful when you need to write a lot of code quickly or when you’re not sure how to start. It can be integrated into IDEs, editors, and other applications including CI/CD workflows.Code chat: Get help on your coding journey throughout the software development lifecycle, from debugging tricky issues to expanding your knowledge with insightful suggestions and answers. This multi-turn chat API can be integrated into IDEs, and editors as a chat assistant. It can also be used in batch workflows.

These models also integrate Responsible AI capabilities, such as source citation and toxicity checking, which automatically cite or block code based on Responsible AI guidelines set by Google.

The Codey APIs deliver far more than generic code generation, allowing you to tailor code output to your organization’s specific style and securely access private code repositories based on your organization’s guidelines. The ability to customize these models helps you generate code that complies with established coding standards and conventions while leveraging custom endpoints and proprietary codebases for code generation tasks.

To achieve this level of customization, you can tune models using specific datasets such as your company’s codebase. Alternatively, you can also utilize RAG to incorporate external knowledge sources into the code generation process, which we will now discuss in detail below.

What is RAG?

Traditional large language models are limited by their internal knowledge base, which can lead to responses that are irrelevant or lack context. RAG addresses this issue by integrating an external retrieval system into LLMs, enabling them to access and utilize relevant information on the fly.

This technique allows LLMs to retrieve information from an authoritative external source, augment their input with relevant context, and generate more informed, accurate responses. Code generation models, for instance, can use RAG to fetch relevant information from existing code repositories and use it to create accurate code, documentation, or even fix code errors.

How does RAG work?

Implementing RAG requires a robust retrieval system capable of delivering relevant documents based on user queries.

Here’s a quick overview of how a RAG system works for code generation:

The retrieval mechanism fetches relevant information from a data source. This information can be in the form of code, text, or other types of data.The generation mechanism — i.e., your code generation LLM — uses the retrieved information to generate its output.The generated code is now more relevant to the input query or question.

While you can employ various approaches, the most common RAG pattern involves generating embeddings for chunks of source information and indexing them in a vector database, such as Vertex AI Vector Search.

The diagram below shows a high-level RAG pattern for code generation with Codey APIs.

Figure1: High level information flow in RAG pattern for Code Generation

The first step is identifying source information. For code generation, this could be an API definition, code repositories, documentation, or similar. Next, you will need to determine the chunking scheme. Chunking information allows you to select and provide only the relevant content needed to address a query.

The best chunking approaches for RAG are those that preserve the contextual information required for text generation. For code, we recommend choosing chunking methods that respect natural code boundaries, such as function, class, or module borders. Techniques like random splits or mid-sentence/clauses could break the context and degrade your output.

After you create information chunks from the information source, you can generate embeddings and index them in a vector database. When a query is received, another embedding is generated for the query and used to help retrieve relevant information chunks.

From there, a prompt, the user question and relevant information chunks are sent to the Codey APIs to generate a response.

Using RAG with Codey APIs

Now that we understand what RAG is, let’s see how it works for code generation with the Codey models in Vertex AI.

For this demonstration, we utilized the sample code and Jupyter notebooks in Google Cloud’s generative AI GitHub repository as the data source. We crawled the entire repository and listed any Jupyter notebooks. Subsequently, we analyzed these notebooks and extracted code elements, which were then chunked and indexed within a vector database. For detailed instructions on how to do this, you can follow the steps in the following notebook.

The example below shows a generated response to a prompt without using RAG to add external context.

Prompt: “Create python function that takes a prompt and predicts using langchain.llms interface for VertexAI text-bison model”

Output without RAG:

Figure 2: Output from the model without any external context

In the above example, the LLM does not have pre-existing knowledge of the Langchain library. While the response may look convincing and coherent, the model has actually hallucinated and generated code that does not correctly instantiate the text-bison model and make a call to the Predict function.

Next, let’s try using RAG for the same prompt.

Output with RAG:

Figure 3: Output using Retrieval Augmented Generation

Using RAG, Codey was able to dynamically inject code from the Google Cloud Github repository and deliver code output that uses the correct syntax that will allow the Vertex AI text-bison model to call the Langchain API.

Common use cases and limitations of RAG

While RAG can be a helpful tool for improving the accuracy of LLM-generated code and text, it is important to note that RAG is not a perfect solution. There are still some cases where RAG can generate inaccurate or misleading results. This is because the knowledge base or other external source that RAG uses may not be accurate or up-to-date, or the LLM may not be able to correctly interpret the information from the knowledge base.

WIth that in mind, we recommend using RAG with Codey APIs when:

A model should be able to generate code variations within a supported language. For example, using RAG to explore different coding styles or adapting code to specific variations of SQL.You need transparency and citations for the sources you used to generate code.Your model should be able to analyze and learn from your latest codebase, ensuring code freshness.You want an existing code model with a deep understanding of diverse coding patterns and nuances to achieve superior code completion and targeted function generation.

Is RAG an alternative to fine-tuning?

RAG and supervised tuning are two different techniques for improving the performance of code models. They are complementary approaches with unique strengths and weaknesses and can be used together.

For instance, you could first use supervised tuning to tune a Codey model (eg. code-bison) on a specific domain or task, and then use RAG to augment the model’s knowledge with information from a large database. To learn more about fine-tuning your model, follow this guidance.

Limitations of using RAG

While RAG can be a helpful tool for improving the accuracy and informativeness of LLM-generated code and text, it is important to note that RAG is not a perfect solution.

There are still some cases where RAG can generate inaccurate or misleading results. This is because the knowledge base or other external source that RAG uses may not be accurate or up-to-date, or the LLM may not be able to correctly interpret the information from the knowledge base.

Getting started

To get started with Codey, you can sign up for a free trial of Vertex AI. Once you have a Vertex AI account, you can create a Codey instance and start using the code model APIs.

If you need a vector database, Vector Search has excellent performance, price, and industry-leading features. If you aren’t using code snippets and prefer an unstructured or structured (table) document RAG Search, Vertex AI Search makes this whole process easy.

Read More for the details.

2024 02 12

GCP – Cloud Monitoring now offers PromQL alerting and importing dashboards from Grafana

Cloud, Google Cloud gcp

The Cloud Monitoring team believes in the power of the open source community and open source interfaces. Our stance is that you should be able to get the near-unlimited scalability of our metrics platform without giving up the platform neutrality of open source projects or the battle-tested breadth of resources created by the community. This is why we’ve invested heavily in both the OpenTelemetry project and the Prometheus ecosystem with our Managed Service for Prometheus. Today, we’re excited to announce that Cloud Monitoring has achieved even greater compatibility with the open-source monitoring ecosystem with our GA release of PromQL-based alerting policies, and our command-line tool for importing dashboards from Grafana!

Cloud Monitoring’s PromQL querying has proven to be quite popular, with customers appreciating the tight integration with Google Cloud services and not having to run a separate application just to query metrics. However, until now, adopting the Cloud Monitoring UI meant manually porting over any pre-existing custom dashboards or alerts and sacrificing the ability to easily use the fantastic wealth of dashboards and alerts created by the community. With this release, you get the best of all worlds: your dashboard and alerting assets, the community’s recommended assets, and the power and scalability of Cloud Monitoring.

PromQL alerting in the console

While open-source PromQL alerting is very flexible and powerful, it traditionally suffers from usability issues. Writing alert expressions in YAML doesn’t provide you a good way to visually assert correctness, nor is there any way to write tests as an alternative to visual confirmation. Running the required standalone alertmanager instance inserts a potential infrastructure reliability risk into your critical alerting and notification routing path.

With Cloud Monitoring’s fully managed PromQL alerting, you get the universality of open-source alerting without having to run and manage your own open-source infrastructure. You can upload any alert rule written in standard Prometheus rule_files format using the gcloud command line migration tool or Terraform, you can create PromQL alerts from scratch in Cloud Monitoring’s alerting UI, or you can generate PromQL using the Metrics Explorer query builder and then copy-paste it into a new alerting policy.

Creating an alert on container high memory usage using standard cAdvisor metrics and a PromQL alert expression from samber’s Awesome Prometheus Alerts.

Fired alerts can be routed to any supported notification channel, including email, PagerDuty, Slack, SMS, and Pub/Sub. Other destinations can usually be supported by using Webhooks. Recording rules are not yet supported.

Import dashboards from Grafana

Cloud Monitoring provides a command-line tool that you can use to import dashboard files that use the Grafana JSON format. Most dashboards that are in Grafana’s dashboard library, that you find in a Github repo, or that you have created yourself can be displayed in the Cloud Monitoring UI.

Instrumento’s Kubernetes cluster monitoring (via Prometheus) dashboard can be imported into Cloud Monitoring.

Cloud Monitoring makes it even easier to pivot dashboards with automatically generated page-level filters. You can also add Cloud Logging widgets or charts that use Google Cloud system metrics to your dashboard for an even richer monitoring experience.

Getting started

PromQL is now generally available in Cloud Monitoring’s alerting. To learn more about creating policies using the API, UI, or Terraform, see the documentation for Alerting policies with PromQL or see the Public Preview blog post for a working example. You can also get started in the Cloud Console by navigating to the alert policy creation page, selecting Code editor, then selecting PromQL at the bottom of the code window.

Importing dashboards requires building and using a command-line tool. For instructions, see the documentation for Importing Grafana dashboards into Cloud Monitoring.

Read More for the details.

2024 02 12

GCP – Your RAGs powered by Google Search technology, part 1

Cloud, Google Cloud gcp

When a large language model (LLM) doesn’t have enough information or has no contextual knowledge of a topic, it is more likely to hallucinate and provide inaccurate or false responses. Developers are increasingly excited about generative AI and Retrieval Augmented Generation (RAG) — an architecture pattern that combines LLMs with backend information retrieval from other information sources. This approach can help overcome some of the most significant limitations of LLMs, such as knowledge limited to the scope of training data, lack of relevant context from enterprise data, and data that is not fresh or outdated.

As RAG-based approaches have grown in popularity, it’s become clear that a RAG system’s efficacy is completely dependent on the search quality of the backend retrieval system. LLMs are often smart enough to understand and answer questions. However, you can not fully utilize the power if the backend cannot provide high-quality search capabilities for scanning large amounts of proprietary information.

AI researchers and developers have been exploring numerous RAG technologies, including text chunking, query expansion, hybrid search, knowledge graph, reranking, and others. In short, an LLM needs the right data so that it doesn’t have to make things up, and you need the best search technology to get the best data.

Google Search technologies, backed by over two decades of experience pioneering information retrieval at scale, offer powerful information retrieval capabilities for RAG solutions. With Vertex AI Search, for instance, developers can easily combine LLMs with Google Search technology, letting them focus on building innovative AI apps instead of taking months to design and build their own advanced search engine for RAG systems.

Vertex AI Search is a fully-managed platform that lets you build AI-enabled search experiences for your public or internal websites, mobile applications, and various other enterprise search services. The product is the result of deep collaboration between the Google Search and Google Cloud teams to create a developer-friendly way to integrate LLM intelligence with Google Search technology.

In the first post of this two-part series, we’ll explore the key features that power Google-quality retrieval in LLM and RAG-based applications and how Vertex AI Search has inherited these capabilities from Google Search.

Google-quality retrieval and ranking

Semantic search using deep learning has become a crucial feature for most search engines, letting developers build systems that can understand the meaning of query texts rather than simply using keyword matching.

Despite these advancements, most RAG systems still use simple similarity search in vector databases to retrieve information. This approach can often lead to the return of low-quality, irrelevant results.

The primary reason for the lower search quality lies in the principle that “the question is not the answer.” A question like “Why is the sky blue?” and its answer, “The scattering of sunlight causes the blue color,” have distinctly different meanings. Why, then, do you use similarity search to find answers?

Semantic search is not just similarity search

In the Stack Overflow demo that we introduced in a previous post, simple similarity search was highly effective because the dataset had 8 million pairs of questions and answers. However, datasets do not usually contain pre-existing question-and-answer or query-and-candidate pairs in many real-world RAG scenarios. Therefore, it is vital for an AI model to learn and be able to predict the relationship between queries and their corresponding answers to deliver production-quality semantic search.

Google Search started incorporating semantic search in 2015, with the introduction of noteworthy AI search innovations like deep learning ranking system RankBrain. This innovation was quickly followed with neural matching to improve the accuracy of document retrieval in Search. Neural matching allows a retrieval engine to learn the relationships between a query’s intentions and highly relevant documents, allowing Search to recognize the context of a query instead of the simple similarity search.

“Neural matching helps us understand fuzzier representations of concepts in queries and pages, and match them to one another. It looks at an entire query or page rather than just keywords, developing a better understanding of the underlying concepts represented in them. Take the search “insights how to manage a green,” for example. If a friend asked you this, you’d probably be stumped. But with neural matching, we’re able to make sense of it. By looking at the broader representations of concepts in the query — management, leadership, personality and more — neural matching can decipher that this searcher is looking for management tips based on a popular, color-based personality guide.”—How AI powers great search results

Neural matching learns the relationships between queries and documents

Vertex AI Search uses the same RankBrain and neural matching processes to generate query and document embeddings, which are vectors that map semantic relationships and enable Google-quality semantic search. Instead of having to “homebrew” a way to fill the gap between questions and answers in a RAG system, developers can easily access the benefits of the semantic search technology that has been tested by billions of users over many years.

For example, with a popular keyword-based search engine loaded with all product catalogs from the Google Merchandise Store, a query “warm clothing for winter” will return the following results:

A typical keyword search result for a query “warm clothing for winter”

With Vertex AI Search loaded with the same product catalogs, the same query returns the following results:

Generative AI Summary: You can find warm clothing for winter at the Google Merchandise Store [1, 2, 5]. Some of the items include: Google Vail Unisex Grey Puffer Jacket, Google Denali Unisex Puffer Vest, Google Denali Womens Puffer Vest, Super G Tahoe Unisex Black Puffer Vest, Super G Glacier Unisex Puffer Jacket, Super G Glacier Womens Puffer Jacket [5].

Vertex AI Search results for a query “warm clothing for winter”

As you can see in the results above, the semantics of queries and answers are vastly different in many cases. Queries often represent the user’s intent (i.e., looking for warm clothing) rather than the answers (i.e., puffer jacket or vest) directly. A production-grade semantic search is not just a similarity search, but must provide smart recommendation to users.

Production-grade semantic search + LLM reasoning

In advanced RAG systems, LLM reasoning is commonly employed to overcome the limitations of simple similarity searches. By combining it with production-grade semantic search, it can greatly enhance the efficiency of an advanced RAG system.

As a basic example of LLM reasoning, you may dynamically build the following personalized prompt:

Given that it’s the beginning of winter, a customer is browsing for clothing on an e-commerce site. Winters are cold in their city. They entered “warm clothing for winter” as a search term on the site. What other search terms might they use to find related and cross-sell items?

Responses from an LLM may include the following queries:

Type-specific: Warm winter jackets, Cozy knitwear, Thermal leggings, Waterproof snow bootsActivity-specific: Ski clothing, Winter running gear, Work-appropriate winter outfits, Cozy homewearStyle-specific: Cashmere sweaters, Puffer vests, Statement scarves, Athleisure-inspired winter looks

By building a RAG system that can conduct searches on Vertex AI Search with these queries, you benefit from both the power of LLM reasoning and product-grade semantic search. The result is a system that can discover a broad array of relevant products that match different requirements and attributes, including type, activity, and style.

Vector search and AI processor serving billions

Another common misconception is that semantic search is a relatively new innovation that has gained popularity with the rise of LLMs. While semantic search may be among the current hot topics, it’s actually the result of years of research and development. Google has been at the forefront of semantic search development for nearly a decade, starting with a strategic decision to invest in developing its family of custom, in-house AI processors back in 2013 — the Tensor Processing Unit (TPU).

TPUs are specifically tailored to provide the underlying power needed to support machine learning and AI workloads, but their genesis is rooted in the goal of supporting the deep learning needed to deliver a production-grade semantic search experience. The first TPU was deployed to Google Search production serving infrastructure in 2015. This substantial investment has helped to reduce costs and latency, enabling us to bring production-grade semantic search experience to billions of users.

Google has spent years investing in and developing a powerful set of search technologies. For instance, Google Search processes semantic search with query and document embeddings with ScaNN, one of the world’s largest and fastest vector search infrastructures. ScaNN powers Google Search and many other Google services, quickly finding highly relevant documents and content to help users get the information they need in seconds. According to ANN benchmarking graphs, ScaNN is one of the industry’s core, state-of-the-art algorithms for recalling queries.

Overall, Google’s suite of breakthrough search technologies, including RankBrain, neural matching, ScaNN, and its family of TPUs, represents some of the most valuable technology assets built over the last decade. Inherited by Vertex AI Search, these same technologies enable the delivery of Google-quality semantic search capabilities with millisecond-level latency and at a reasonable cost, all while letting developers allocate more TPU resources to access the power of AI and large language models as a commercial service.

Vertex AI Search uses Tensor Processing Unit (TPU) to power its large-scale semantic search

From text to multimodal, search to recommendation

In Vertex AI Search, documents are automatically converted into embeddings. Advanced developers can also expand the query-and-answer relationships with custom embeddings tailored to their business requirements.

Custom embeddings can extend the query-and-answer relationships

Custom embeddings can be used to improve search quality by allowing search engines to use additional signals to understand the relationship between a query and documents. For example, the Vertex AI Multimodal Embeddings can generate custom video embeddings, which can be used to conduct semantic search of video content with a text-based query.

This same technique can also be used to build a custom recommendation system. Developers can train their own Two-Tower model (dual encoder model) that captures business-specific relationships between queries and answers. Some common examples include finding music with similar style to a specific song, upselling and cross-selling products, or suggesting fashion items to complete a coordinated look.

You can then generate custom embeddings from these relationships to improve the quality and personalization of recommendations. These techniques can greatly improve Vertex AI Search’s ability to map queries to relevant answers based on bespoke requirements.

Keyword search with 25+ years of experience

Many RAG implementations depend only on semantic search, which entails finding documents based on the meanings ascribed to them by an embedding model. However, semantic search is not a cure-all. In some cases, the embedding model may not grasp the meaning of an item, rendering the search useless. The models may not be familiar with newly added product names or trained to memorize millions of product model numbers in its embedding space.

Additionally, the majority of users continue to expect a conventional keyword search experience, which typically includes using exact or partial keyword matches and keyword-based filtering mechanisms. This functionality is especially crucial in sectors like healthcare and finance, where keyword searches are fundamental for accessing precise and critical patient information, financial data, or compliance-related documents.

Beyond our expertise in semantic search, we have also spent over 25 years advancing keyword-based (or token-based) search technology. Vertex AI Search builds on this by providing a hybrid search engine that simultaneously performs both keyword and semantic searches for each query. The results are then merged and re-ranked based on their respective scores, combining the best aspects of both search approaches to fill in the gaps left by each.

Hybrid search in Vertex AI Search

Query understanding and rewriting

Often, users type queries incorrectly or misremember the exact names of items. In such cases, query rewriting and query expansion play crucial roles in interpreting and rectifying inputs. By default, Vertex AI Search offers automatic context-aware query rewriting and expansion in its supported languages, including:

Word stemming and spell correctionAdding related words and synonymsRemoving unimportant wordsAnnotating important entities with Knowledge Graph (we’ll cover this in our second post)

For example, if you search for “dinosaur keyholder” in Vertex AI Search with the Google Merchandise Store dataset, you’ll get the following results:

Generative AI Summary: You can find a Chrome Dino Keychain for $8.00 [1]. It’s a cute keychain that you can take on adventures with you [1].

Vertex AI Search results (keychain, lanyard and pin) for “dinosaur keyholder”

With the query words “dinosaur” and “keyholder”, the keyword search automatically expands the query with synonyms “dino” and “keychain.” Similarly, semantic search considers the word “keyholder” as an intent-based query (i.e., you are looking for a dino accessory) closer to “lanyard” and “pin”. The results above show merged keyword and semantic search results — without integrating any additional search engines or performing manual maintenance of a synonym dictionary. When needed, you can even define your own synonyms and query terms on Vertex AI Search through its serving controls.

In the second part, we will continue our discussion of the critical search technologies for RAG systems, such as re-ranking, document extraction and processing, knowledge graph, and data collection.

Read More for the details.

Google Cloud

Architecture:

A solution for a more efficient procurement process

A key enabler for sovereignty solutions

Maintenance: The difference is in the design

Security: Everyone gets a key (well maybe not)

Support: New responsibilities, better solutions

Final thoughts

Latest Threat Horizons report: Emerging threats, actionable recommendations

In case you missed it

News from Mandiant

Now hear this: Google Cloud Security and Mandiant podcasts

How customers are innovating with Gemini models

Build production-ready applications with the Gemini API in Vertex AI

The Gemini era is just beginning — stay on the cutting edge

Best Hack winners

DashNotes: Persistent dashboard annotations

Document repository sync automation

Nearly Best Hack winner

Querying Python services from a Looker dashboard

Finalists

What do I watch?

Template analytics

LOV filter app

Looker accelerator

The SuperViz Earth Explorer

dbt exposure generator

Hacking Looker for fun and community

Why migrate?

Create a backup Plan

Bring down the Application

Start a Backup Manually

Restore from the Backup

Create a restore plan

Restore backup using the (previously created) restore plan

Bring up the application

Use it today

Subsquid explained

BigQuery and blockchain

Data superpowers for Web3 devs and analysts

Subsquid Developer relations engineer Daria A. demonstrates how to store data indexing using Subsquid to BigQuery and other tools

Get started with Subsquid on BigQuery today

Use cases

BigQuery-based RAG deep dive

Publication search and RAG examples

Case 1: Patent search using pre-generated embedding

Case 2: Patent search with BigQuery embedding generation

Case 3: RAG via integration with generative models

Getting started

Workload-optimized infrastructure for AI

Market-leading threat protection with robust security ecosystem

Getting started

Deep re-ranking with Google signals

Filtering and boosting

Extraction and generation

Google-quality document processing

Collect scattered enterprise data

Everything is fully managed

Building RAG systems with Vertex AI Search

Vertex AI Search adds new generative AI capabilities and enterprise-ready features

Sparkflows

Self-service with Sparkflows and Google Cloud

Business scenarios using Sparkflows and Google Cloud

Installation

Configuration

Self-service solution design & execution

Datasets

Data preparation

ML model training

ML model prediction

Visualization – descriptive and predictive analytics

Business analytical apps

Gen AI apps

ML solutions

Generative AI solutions

Better together

Try out Sparkflows

What is RAG?

How does RAG work?

Using RAG with Codey APIs