AWS – Amazon Managed Streaming for Apache Kafka (Amazon MSK) is Now Available in Asia Pacific (Osaka)
You can now connect your Apache Kafka applications to Amazon MSK in the Asia Pacific (Osaka) AWS Region.
Read More for the details.
You can now connect your Apache Kafka applications to Amazon MSK in the Africa (Cape Town) AWS Region.
Read More for the details.
2022 is officially upon us, and while 2021 had its fair share of challenges, we continue to be impressed by the technology investments made by our customers and partners and their ongoing migration of legacy systems to the cloud. Before diving into 2022, we thought it would be fun to do a quick roundup of 2021’s ‘greatest migration hits’ first.
The most common theme we saw in 2021 was our customers migrating to Google Cloud faster and easier than they thought possible. Migrations are rarely a “one size fits all” project, but with the right solutions and people, it’s more than possible to be successful while moving fast. We’ll list some of the best migration stories and resources below.
To get you started, take a look at a bunch of great overview blogs including 3 Approaches To A Better Cloud Migration, 5 cheat sheets to help you get started on your Google Cloud journey, and 30 ways to leave your data center: key migration guides, in one place.
We also focused this year on best practices for using specific services to migrate what’s important to you. For data and storage, we released Best practices for migrating from on-prem to Google Cloud using Storage Transfer Service. While large-scale cloud migrations can be intimidating, there are a number of actions that you can take to ensure that even a multi-petabyte data transfer goes as smoothly as possible, using Google Cloud’s Storage Transfer Service.
Similarly, for databases, the Database Migration service went GA, which helps simplify the process of migrating MySQL and PostgreSQL databases from on-premises and other clouds to Cloud SQL. Freedom Financial Network quickly migrated their large MySQL databases to Cloud SQL and the actual downtime for each application from the database side was a maximum of ten minutes.
Datastream, a new Google Cloud service, gives you a way to Unlock the power of change data capture and replication with new, serverless Datastream, now GA. Datastream allows you to synchronize data across disparate databases, storage systems, and applications reliably and with minimal latency, to support real-time analytics, database replication, and event-driven architectures.
Getting to Google Cloud quickly and easily is one thing, but making sure you do that in a cost-effective manner is equally important. Properly budgeting for cloud costs and then staying within those budgets are crucial to cloud success, and we had some important stories to tell in 2021.
We commissioned Forrester Consulting to conduct a Total Economic Impact™ study on what kind of savings customers might be able to expect when migrating to Google Cloud. Through customer interviews and financial analysis, the study found that over three years a composite organization saved $13.1 million by migrating. There’s a ton more detail in the study itself, so be sure to download it today.
In addition to this study, a benchmark showed that the New Tau VMs deliver leading price-performance for scale-out workloads, a crucial avenue of savings for compute workloads, while Spot VMs give you an improved TCO with a maximum discount of 91% over on-demand VMs. And we also confirmed that you can Accelerate Google Cloud database migration assessments with EPAM’s migVisor to better understand and estimate your post-migration costs.
Once you’ve migrated to the cloud, having a smart, efficient way to manage it all is key to optimizing your costs. That’s why we were especially happy to Introduce VM Manager, which helps you simplify and automate the maintenance of large VM fleets, a key way to keep compute and administration costs down.
Customers come to Google Cloud to solve complicated business challenges, and nothing makes us happier than exceeding their expectations and seeing them succeed.
In 2021, a few customer successes really stood out:
J.B. Hunt executes speedy yet seamless cloud-to-cloud migration
AppLovin built on Google Cloud so they could transform mobile marketing
TELUS International migrates a key customer experience app to Google Cloud
Carrefour driving a digital transformation to bring healthy, sustainable food to everyone
Freedom Financial Network accelerated their migration to a fully managed database in the cloud
Auto Trader charted their road from Oracle to PostgreSQL
SenseData completed their migrations with Google Cloud’s managed database services
These are just seven notable customer stories that we remember from 2021, but there are plenty more at the Google Cloud customer success site. As you can see, customers had a laser focus on making migrations fast and easy, keeping them under budget, and delivering success to their users. If you’re exploring cloud migrations, check out our cloud migration guide and checklist. Or, if you’re eager to get going, let’s kick off a free IT discovery and assessment so you can start plotting your course to Google Cloud.
Read More for the details.
Effective software delivery — usually achieved via continuous integration (CI) and continuous delivery (CD) — is a top priority for many product development teams. It’s easy to understand why: the 2021 State of DevOps report found that elite performers of software delivery deployed code much more frequently than low performers, with three times fewer change-related failures. Teams who excel at modern software delivery operational practices were also 1.8 times more likely to report better business outcomes.
You need great tools to do software delivery effectively. Without capable tooling, teams have to design, maintain, and scale their software delivery solutions on their own, which can be difficult given the breadth of continuous delivery’s flow control, security and audit, and integration requirements. Deploying container image artifacts adds further complexity, particularly in Kubernetes environments.
Building on your feedback and Google’s own best practices, we’ve been working on software delivery tooling that helps you meet your continuous delivery goals — especially with respect to Google Kubernetes Engine (GKE) environments. Today, we are pleased to announce the general availability of Google Cloud Deploy, a managed, opinionated continuous delivery service that makes continuous delivery to GKE easier, faster, and more reliable.
While designing Google Cloud Deploy, we talked to a number of customers to better understand the challenges they face doing continuous delivery to GKE. While a handful of themes emerged, three stood out: cost of ownership, security and audit, and measurement.
As shared in our Preview launch post this past September, the operational cost of Kubernetes continuous delivery can be very high. Identifying best and repeatable practices, scaling delivery tooling and pipelines, collecting key metrics, and staying current — to say nothing of maintenance — are resource-intensive tasks that take time away from the core business.
As a managed service, Google Cloud Deploy eliminates the scaling and maintenance responsibilities that typically come with self-managed continuous delivery solutions. Now you can reclaim the time spent maintaining your continuous delivery tooling and spend it delivering value to your customers.
Google Cloud Deploy also provides structure. Delivery pipelines and targets are defined declaratively and retained with each release. That means if your delivery pipeline changes, the release’s path to production remains durable. No more time lost troubleshooting issues on in-flight releases caused by changes made to the delivery pipeline.
Whether or not you already have a continuous delivery capability, you likely already have continuous integration, approval and/or operation workflows, and other systems that intersect with your software delivery practices.
Google Cloud Deploy embraces the GKE delivery tooling ecosystems in three ways: connectivity to CI systems, support for leading configuration (rendering) tooling, and Pub/Sub notifications to enable related software delivery tooling.
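As a rough sketch of that third integration point, here is one way you might consume Cloud Deploy Pub/Sub notifications with the Python Pub/Sub client. The project ID and subscription name are placeholders, and the notification topics your project uses (for example, clouddeploy-operations) should be confirmed in the Cloud Deploy documentation.

```python
# Minimal sketch: react to Cloud Deploy Pub/Sub notifications.
# Assumes a subscription (here "clouddeploy-operations-sub") has been created
# on the topic Cloud Deploy publishes operation events to.
from concurrent.futures import TimeoutError
from google.cloud import pubsub_v1

PROJECT_ID = "my-project"                        # placeholder project ID
SUBSCRIPTION_ID = "clouddeploy-operations-sub"   # placeholder subscription

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(PROJECT_ID, SUBSCRIPTION_ID)

def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    # Notification details arrive as message attributes; forward them to your
    # CI system, chat channel, or ticketing tool as needed.
    print("Cloud Deploy event:", dict(message.attributes))
    message.ack()

streaming_pull = subscriber.subscribe(subscription_path, callback=callback)
try:
    streaming_pull.result(timeout=60)  # listen for one minute in this sketch
except TimeoutError:
    streaming_pull.cancel()
```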
“While looking for a Continuous Delivery solution we considered ArgoCD and Spinnaker, however we chose Google Cloud Deploy because it is a managed service, provided proper CD primitives and integrated seamlessly with our GKE clusters. It has empowered every team member to safely and reliably promote their code from commit all the way through to production.”—Jonathan Sokolowski, DevOps Engineer, Search.io
A variety of GKE roles and personas interact with continuous delivery processes. DevOps engineers are focused on release promotion and rollback decisions, while a business decision maker thinks about delivery pipeline health and velocity. Google Cloud Deploy’s user experience keeps these multiple perspectives in mind, making it easier for various personas to perform contextualized reviews and make decisions, improving efficiency and reducing cost of ownership.
Google Cloud Deploy’s security foundations strengthen secure software supply chain practices through delivery flow control and auditability.
Lots of different users interact with a software delivery system, making a variety of decisions. Not all users and decisions carry the same authority, however. Being able to define a delivery pipeline and make updates doesn’t always mean you can create release candidates, for example, nor does being able to promote a release to staging mean you can approve it to production. Modern continuous delivery is full of security and audit considerations. Restricting who can access what, where, and how is necessary to maintain release integrity and safety.
Throughout, Google Cloud Deploy enables fine-grained restriction through IAM, with discrete access control and execution-level security. Google Cloud Deploy also supports deploying to private GKE clusters and Virtual Private Cloud (VPC) Service Controls (currently in Beta) to respect security perimeters. For safeguards against unwanted approvals, you can take advantage of flow management features such as release promotion, rollback, and approvals.
Auditing with Google Cloud Deploy works just like it does for other Google Cloud services. Cloud Audit Logs audits user-invoked Google Cloud Deploy activities, providing centralized awareness into who promoted a specific release or made an update to a delivery pipeline. You can also create Google Cloud Deploy pipelines in supported locations to better conform with your business needs.
Great tooling is only part of an effective software delivery strategy — you also need to know what metrics you need to measure, how, and why. By making it easier to measure software delivery performance, Google Cloud Deploy helps teams focus on software delivery optimization and achieve their desired business outcomes.
Google Cloud Deploy collects and makes available built-in metrics about delivery pipelines. These include deployment history and success, as well as the DORA metric ‘deployment frequency.’
Monitoring your deployed resources is another way to measure the effectiveness of your software delivery processes. To aid monitoring, Google Cloud Deploy automatically labels deployed Kubernetes resources, making it easier to associate your delivery pipelines with application performance. You can integrate application monitoring further using the Google Cloud Deploy API, so you can automatically promote code if it is stable and roll it back if an anomaly is detected.
Comprehensive, easy-to-use, and cost-effective DevOps tools are key to building an efficient software development team, and it’s our hope that Google Cloud Deploy will help you implement complete CI/CD pipelines. And we’re just getting started! Stay tuned as we introduce exciting new capabilities and features to Google Cloud Deploy in the months to come.
In the meantime, check out the product page, documentation, quickstart, and tutorials. Finally, if you have feedback on Google Cloud Deploy, you can join the conversation. We look forward to hearing from you!
Read More for the details.
We’re inviting Canadian cloud-native technology startups to apply for the second Google Cloud Accelerator Canada cohort. The intensive 10-week virtual bootcamp helps startups prepare for the next phase of growth and development in their journey.
Throughout the pandemic, we saw just how important the role of cloud technology is in almost every sector. From healthcare to education, retail to manufacturing, cloud technology keeps us connected, and helps us maintain continuity in our schools, work and businesses.
We know the need for cloud-based solutions will continue to grow, which is why we launched the first Google Cloud Accelerator Canada last year, entirely dedicated to supporting cloud-native technology startups. The accelerator was the first of its kind at Google, designed to bring the best of our programs, products, people and technology to startups doing interesting work in cloud. “We are thrilled to be at the start of our second journey in meeting and building meaningful relationships with this new Google Canada Accelerator Cloud cohort,” said James Lambe, Managing Director of Google Cloud Canada.
Twelve startups were selected for our inaugural cohort, representing a range of industries including healthcare, hospitality and real estate. We worked with founders and their teams to help solve some of the top challenges facing their startup and provided 1:1 mentorship from an array of Google Cloud experts. We also offered support in AI/ML, design/UX, Android, web, product strategy, sales, marketing, and more.
Today, we’re excited to announce that applications are now open for the second Google Cloud Accelerator Canada. The 10-week virtual accelerator will offer cloud mentorship and technical project support, as well as deep dives and workshops on product design, customer acquisition and leadership development for cloud startup founders and leaders.
The Accelerator is best suited for funded, revenue-generating startups that have a minimum of five employees and are well-positioned for their next phase of growth. To ensure Google can provide meaningful value, startups should aim to leverage either Cloud or AI/ML technologies in their product, service, or operations, or show an interest in leveraging these technologies in the future.
Applications are now open until March 1, 2022 and the accelerator will kick off this April. Interested startups leveraging cloud to drive growth and innovation are encouraged to apply here.
Read More for the details.
Start using the new agent and data collection rules in environments with special network requirements and maintain network isolation by leveraging private links.
Read More for the details.
As a developer, you want to ensure that as you develop new features for your Dialogflow CX agent, it continues to pass the set of Test Cases that you have marked as Golden Test Cases. Dialogflow CX not only provides a feature to manage a list of Test Cases that you can run against your agent, but also lets you understand test coverage.
This blog post takes a sample Dialogflow agent and highlights how you can get started today: testing the agent, capturing those test runs as golden test cases, building up a test suite, and understanding test coverage.
We will use an existing Agent that has been made available for download here. Go ahead and restore the sample Agent (ABC Holiday Resorts) into a sample Agent project following the steps given here. Once you have restored the agent, you should be able to see the sample flow shown below:
The Sample agent has a single flow and the agent currently supports users in three different areas:
Get Information on current offers/deals
Get Information on current cities in which resorts are present
Reserve a vacation in one of the resorts
Given that the objective here is to highlight the Test Cases feature of Dialogflow, the Agent has been kept simple and does not have any external integrations and complex validations.
You can try out the Agent by using the Test Agent feature in Dialogflow CX console with the following phrases:
What are the current offers?
Where are the resorts?
Book a vacation
The first step is to visit the Test Cases feature in the Dialogflow CX console. Go to the Manage tab and then the Test Cases section as shown below:
Since no Test Cases were provided as part of the initial restore of the Agent, and we have not captured any so far, the list is empty.
Click on Coverage and you will see the coverage for the various Flows, Intents, and Route Groups in the Agent. This is also currently empty since we do not have any Test Cases.
We can capture a Test Case via the Test Agent feature in the console. The steps to capture a Test Case are given below:
Assuming that your Agent is open in the Dialogflow CX console, click on the Test Agent button.
Provide the utterances as you go through the flow.
A sample test case is shown below. This exercises the option of asking the Agent about current offers/deals.
Once the flow is complete, click on the Save button.
This will ask you for the Test Case details, which we can provide as shown below; finally, click on the Save button to save the Test Case.
You can view the current list of Test Cases, via the Test Cases link in the Manage Tab, as shown below:
There are a couple of Test Cases that we added and you can also check the Coverage by clicking on the Coverage option in the screen above.
The Transition coverage is shown above and you can see the value for Yes for the two test cases that we captured. Additionally, you can also see the Intents coverage via the Intents option above. The screen is shown below:
You should go ahead and try out other flows and scenarios via the Test Agent feature. Capture those Test Cases and take a look at the overall coverage. This will ensure that you are testing the transitions and intents, and if there are newer versions of the Agent, you can verify that there is no regression by running these Test Cases at any time via the Run option.
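If you would rather script this than click through the console, the sketch below shows one possible way to list and run Test Cases with the Dialogflow CX Python client (google-cloud-dialogflow-cx). The project, location, agent and test case IDs are placeholders, and the request shapes should be checked against the current client library.

```python
# Hedged sketch: listing and running Dialogflow CX Test Cases programmatically.
# The IDs below are placeholders and must be replaced with real values.
from google.cloud import dialogflowcx_v3

PROJECT_ID = "my-gcp-project"   # placeholder
LOCATION = "global"             # placeholder; use your agent's region
AGENT_ID = "my-agent-id"        # placeholder

client = dialogflowcx_v3.TestCasesClient()
agent_path = f"projects/{PROJECT_ID}/locations/{LOCATION}/agents/{AGENT_ID}"

# List the Test Cases captured for the agent.
for test_case in client.list_test_cases(request={"parent": agent_path}):
    print(test_case.display_name, test_case.name)

# Run a single Test Case (a long-running operation) and print the outcome.
test_case_name = f"{agent_path}/testCases/my-test-case-id"  # placeholder
operation = client.run_test_case(
    request=dialogflowcx_v3.RunTestCaseRequest(name=test_case_name)
)
response = operation.result()
print("Run completed:", response)
```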
Additionally, if you would like to look at Test Cases in detail, we suggest looking at the prebuilt Agents that are available as part of Dialogflow CX. Import one of the pre-built Agents like Travel : Car Rental Agent or Travel : Baggage Claim Agent and visit the Test Cases.
This concludes our overview of Test Cases in Dialogflow CX. Feel free to reach out to me on Twitter @iRomin for any questions or feedback. Do check out the references in the next section for learning more about Dialogflow CX and Test Cases.
Read More for the details.
Amazon DocumentDB (with MongoDB compatibility) is a scalable, highly durable, and fully managed database service for operating mission-critical MongoDB workloads.
Read More for the details.
Amazon Elastic Container Service (Amazon ECS) now supports Amazon ECS Exec and Amazon Linux 2 for workloads running on-premises with Amazon ECS Anywhere. ECS Exec makes it easier for customers to troubleshoot errors, collect diagnostic information, interact with processes in containers during development, and get “break-glass” access to containers to debug critical issues encountered in production. With Amazon Linux 2, customers can now easily use the same AWS-secured operating system on-premises and in the cloud for development and testing.
Read More for the details.
AWS Elemental MediaConvert now supports creation of the AV1 (AOMedia Video 1) format in 4K frame sizes and with 10-bit color depth. These enhancements expand the AV1 encoding capabilities of AWS Elemental MediaConvert to enable creation of High Dynamic Range (HDR) 4K content for display on modern, color rich 4K playback devices. With AV1, you can deliver high-quality SD, HD, and now 4K high dynamic range video to mobile and other devices at low bitrates unachievable with traditional formats such as AVC (H.264) and HEVC (H.265). For details on AV1 pricing, please visit the MediaConvert pricing page.
Read More for the details.
Amazon DocumentDB (with MongoDB compatibility) is a scalable, highly durable, and fully managed database service for operating mission-critical MongoDB workloads.
Read More for the details.
Amazon DocumentDB is a scalable, highly durable, and fully managed database service for operating mission-critical MongoDB workloads.
Read More for the details.
The new enhancements released for the security of hybrid backups enable you to integrate with MUA, protect against accidental deletion and avoid data loss in soft delete state.
Read More for the details.
Following the announcement of updates to the PostgreSQL database, we have added support in Amazon Relational Database Service (Amazon RDS) for PostgreSQL minor versions 13.5, 12.9, 11.14, 10.19, and 9.6.24. We have also added support in Amazon RDS on Outposts for PostgreSQL minor versions 13.5 and 12.9. This release closes security vulnerabilities in PostgreSQL and contains bug fixes and improvements done by the PostgreSQL community. This also includes the final release of PostgreSQL 9.6.
Read More for the details.
With ASR, you can protect your VMs that use ZRS managed disks by replicating them to a secondary region of your choice.
Read More for the details.
Elasticsearch and Kibana users can now easily migrate to Azure Data Explorer (ADX) while keeping Kibana as their visualization tool, alongside the other Azure Data Explorer experiences. A new version…
Read More for the details.
Communication services providers (CSPs) are experiencing a period of disruption. Overall revenue growth is decelerating and is projected to remain below 1 percent per year, following a trend that started even before the pandemic.1 At the same time, driven by the pandemic, data consumption in 2020 increased by 30 percent relative to 2019, with some operators even reporting increases of 60 percent.2
The combination of pressure on revenues with rising data traffic costs is forcing operators to innovate in three fundamental ways. First, operators are looking to establish new sources of revenue. Second, increased network utilization must be met with a reduction in network cost. And third, there is an opportunity to gain new customers by improving the customer experience.
Fortunately, 5G offers a path forward across each of these three areas. Concepts such as network slicing and private networks allow CSPs to offer differentiated network services to public sector and enterprise customers. The disaggregation of hardware and software allows new vendors with unique strengths to enter the market and to enable CSPs to build, deploy, and operate networks in fundamentally new ways. And the ability to place workloads at the edge permits CSPs to offer compelling experiences to consumers and businesses alike. In this blog, we will discuss how CSPs can create a solid foundation for their cloud networks.
First, it is useful to consider the way telecommunications networks were traditionally built. Initially, networks were built using physical network functions (PNFs) — appliances that used a tight combination of hardware and software to perform a specific function. PNFs offered the benefit of being purpose-built for a specific application, but they were inflexible and difficult to upgrade. As an example, deploying new features frequently required replacing the entire PNF, i.e., deploying a new hardware appliance.
The first step in improving deployment agility came with the concept of virtualized network functions (VNFs), software workloads designed to operate on commercial off-the-shelf (COTS) hardware. Rather than utilizing an integrated hardware and software appliance, VNFs disaggregated the hardware from the software. As such, it became possible to procure the hardware from one vendor and the software from another. It also became possible to separate the hardware and software upgrade cycles.
However, while VNFs offered advantages over PNFs, VNFs were still an intermediate step. First, they typically needed to be run within a virtual machine (VM), and as such required a hypervisor to interface between the host operating system (OS) and the guest OS inside the VM. The hypervisor consumed CPU cycles and added inefficiency. Second, the VNF itself was frequently designed as a monolithic function. This meant that while it was possible to upgrade the VNF separately from the hardware, such an upgrade, even for a feature that affected only a portion of the VNF, required deployment of the entire large VNF. This created risk and operational complexity, which in turn meant that upgrades were delayed just as they were with PNFs.
The key to establishing your cloud-based network lies in moving from VNFs to containerized network functions (CNFs) — network functions organized as a collection of small containerized programs, each of which can be independently operated.
The concept of containers is not new. In fact, Google has been using containerized workloads for over 15 years. Kubernetes, which Google developed and open-sourced, is the world’s most popular container orchestration system, and is based on Borg, Google’s internal container management system.3 There are lots of benefits to using containers, but fundamentally, containerization frees developers from worrying about resource scheduling, interprocess communication, security, self-healing, load balancing, and many other tedious (but important!) tasks.
Consider just a couple examples of benefits that containerization brings to network functions. First, when upgrading the network function to implement new features, you no longer need to re-deploy the entire network function. Instead, you only need to re-deploy the containers that are affected by the upgrade. This improves developer velocity and reduces the risk of the upgrade because, rather than infrequent upgrades that each introduce substantial changes, you can now have frequent upgrades that each deploy small changes. Small changes are less risky because they are easier to understand and to roll back in case of anomaly. Incidentally, this also improves your security posture because it reduces the time between when a security vulnerability is discovered and when a patch is deployed.
Speaking of security, another example of the benefits that containerization brings to network functions is an automatic zero-trust security posture. In Kubernetes, the communication among microservices can be handled by a service mesh, which manages mundane aspects of inter-services communication such as retries in case of failure and providing observability into communication. It can also manage other essential aspects such as security. For example, Anthos Service Mesh, which is a fully-managed implementation of the open-source Istio service mesh (also co-developed by Google), includes the ability to authenticate and encrypt all communications using mutual TLS (mTLS) and to deploy fine-grained access control for each individual microservice.
CNFs bring tremendous benefits, but they also bring challenges. In place of a relatively small number of network appliances, we now have a large number of containers, each of which requires configuration, management, and maintenance. In the past, many of these processes were accomplished using manual techniques, but this is impossible to accomplish economically and reliably at the scale required by CNFs.
Fortunately, there are cloud-native approaches to solving these challenges. First, consider the problem of autonomously deploying and maintaining CNFs. The ideal way is to use the concept of Configuration as Data. Unlike imperative techniques such as Infrastructure as Code, which provide a detailed description of a sequence of steps that need to be executed to achieve an objective, Configuration as Data is a declarative method whereby the user specifies the desired end state (i.e., the actual desired configuration) and relies on automated controllers to continuously drive the infrastructure to achieve that state. Kubernetes includes such automated controllers, and the great news is that this method can be used not just for infrastructure but also for the applications residing on top of it, including CNFs. This cloud-native technique frees you from the toil and associated risk of writing detailed configuration procedures, so you can focus on the business logic of your applications.
As another example, consider the problem of understanding your network performance, including anomaly detection, root cause analysis, and resolution. The cloud-native approach starts with creating a data platform where both infrastructure and CNF monitoring data can be ingested, regularized, processed, and stored. You can then correlate data sets against each other to detect anomalies, and with AI/ML techniques, you can even anticipate anomalies before they happen. AI/ML is likewise indispensable in gaining an understanding of why the anomaly is happening, i.e. performing root cause analysis, and automated closed-loop controllers can be developed to correct the problem, ideally before it even happens.
The transition from VNFs to CNFs is a critical piece in addressing the challenge that CSPs face today, but it alone is not enough. CNFs need infrastructure to run on, and not all infrastructure is created equal.
Consider a typical 5G network. There are some functions, such as those associated with an access network, that need to be deployed at the edge. These functions require low latency, high throughput, or even a combination of the two. In 5G networks, examples of such functions include the radio unit (RU), distributed unit (DU), centralized unit (CU), and the user plane function (UPF). The first three are components of the radio access network (RAN), while the last is a component of the 5G core. At the same time, there are some other control plane functions such as the session management function (SMF) or the authentication and mobility management function (AMF) that do not have such tight latency and high throughput requirements and can thus be placed in a more centralized data center. Furthermore, consider an AI/ML use case where a particular model (perhaps for radio traffic steering) needs to run at the network edge because of its latency requirements. While the model itself needs to run at the edge, model training (i.e., generating the model coefficients) is frequently a compute-intensive exercise that is latency-insensitive and is thus more optimal to run in a public cloud region.
All of these use cases have one thing in common: they call for a hybrid deployment environment. Some applications must be deployed at the edge as close to the user as possible. Others can be deployed in a more centralized environment. Still others can be deployed in a public cloud region to take advantage of the large amount of compute and economies of scale available therein. Wouldn’t it be convenient — if not transformational — if you could use a single environment for deploying at the edge, in a private datacenter, and in public cloud, with a consistent set of security, lifecycle management, policy, and orchestration resources across all such locations? This is indeed what Google Distributed Cloud, enabled by Anthos, brings to the table.
With Google Distributed Cloud, you can architect a 5G network deployment such as the one shown below.
Beyond the technical benefits, consider the business benefits of such an architecture. First, by following the best practices of hardware and software disaggregation, it permits the CSP to procure the infrastructure and the network functions from different vendors, spurring competition among vendors. Second, each workload is placed in precisely the right location, enabling efficient utilization of hardware resources and offering compelling low-latency, high-throughput services to users. Third, because the architecture utilizes a common hybrid platform (Anthos), it makes it easy to move workloads across infrastructure locations. Fourth, the separation of workloads into microservices accelerates time-to-market when developing new features or applications, such as those enabling enterprise use cases. And finally, the container management platform supports the simultaneous deployment of both network functions and edge applications on the same infrastructure, allowing the operator to deploy new experiences such as AR/VR directly on bare metal as close to the user as possible.
There is a lot more we could say, but perhaps the most important takeaway is that this architecture is not a future dream. It exists today, and Google is working with leading CSPs and network vendor partners to deploy it, helping them realize the promise of 5G to deliver new revenues, reduce operating costs, and enable new customer experiences.
To learn more, watch the video series on the cloudification of CSP networks.
Discover what’s happening at the edge: How CSPs Can Innovate at the Edge.
1. Statista, Forecast growth worldwide telecom services spending from 2019 to 2024
2. PricewaterhouseCoopers, Global entertainment and media outlook 2021-2025
3. Borg: The Predecessor to Kubernetes
Read More for the details.
Organizations increasingly turn to AI to transform work processes, but this rapid adoption of models has amplified the need for explainable AI. Explaining AI helps us understand how and why models make predictions. For example, a financial institution might wish to use an AI model to automatically flag credit card transactions for fraudulent activity. While an accurate fraud model would be a first step, accuracy alone isn’t sufficient. Banks and regulators are often required to explain why an AI model is making a specific prediction. Was a fraud decision based on the transaction amount? The cardholder’s gender? Their spend history? Explainable AI helps answer these types of questions, promotes fair business practices, assists with regulatory requirements, and protects against bias and discrimination.
Implementing explainable AI in Google Cloud is an increasingly easy and common practice. Data scientists can use Google Cloud’s Vertex AI to understand what factors contribute to predictions for even the most complex deep learning models. But what about citizen data scientists?
In this post we’ll look at how data analysts can also take advantage of explainable AI by creating models in SQL using BigQuery ML and then explain those model predictions to stakeholders and domain experts using “What-If Scenario Dashboards” in Looker.
BigQuery Machine Learning (BQML) allows analysts to create a variety of machine learning models entirely in SQL. In addition to democratizing data science capabilities, BQML benefits organizations by allowing models to be trained and predictions to be made without moving any data, eliminating many data governance and MLOps challenges.
In this example, a retail bank has a dataset of credit card transactions, card holder details, and merchant information. A query creates the model training data, including transaction amount, the distance between a merchant and the customer’s home, and the transaction time of day. These features are generated entirely in SQL, taking advantage of BigQuery’s support for geospatial functions.
The sample dataset is publicly available to query. Make sure to create a dataset named retail_banking in your project to store the resulting ML datasets and models.
Example BigQuery SQL to Prepare a Model Training Dataset
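The original post embeds the SQL as a screenshot; the sketch below shows roughly what such a feature query could look like when run through the BigQuery Python client. The table and column names (transactions, cardholders, merchants, is_fraud, and so on) are illustrative stand-ins and should be adapted to the actual public dataset described above.

```python
# Hedged sketch: building a fraud-model training table in BigQuery.
# Table and column names are illustrative; adapt them to the actual
# retail_banking dataset referenced in the post.
from google.cloud import bigquery

client = bigquery.Client()

training_data_sql = """
CREATE OR REPLACE TABLE `retail_banking.fraud_training_data` AS
SELECT
  t.amount AS transaction_amount,
  -- Distance (in km) between the merchant and the cardholder's home,
  -- computed with BigQuery's geospatial functions.
  ST_DISTANCE(
    ST_GEOGPOINT(m.longitude, m.latitude),
    ST_GEOGPOINT(c.home_longitude, c.home_latitude)
  ) / 1000 AS merchant_distance_km,
  EXTRACT(HOUR FROM t.transaction_ts) AS transaction_hour,
  t.is_fraud
FROM `retail_banking.transactions` AS t
JOIN `retail_banking.cardholders` AS c USING (card_id)
JOIN `retail_banking.merchants` AS m USING (merchant_id)
"""

client.query(training_data_sql).result()  # wait for the job to finish
```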
After creating the training data, a short query fits a regression model to predict whether a transaction is fraudulent. BQML includes robust defaults along with many options for specifying model behavior. BigQuery provides model fit metrics, training logs, and other model details.
Build Model
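A corresponding model-creation step might look like the following sketch, again submitted through the BigQuery Python client. The logistic regression model type, dataset, and column names carry over from the previous sketch and are assumptions, not the post’s exact code.

```python
# Hedged sketch: training a logistic regression fraud model with BigQuery ML.
from google.cloud import bigquery

client = bigquery.Client()

create_model_sql = """
CREATE OR REPLACE MODEL `retail_banking.fraud_model`
OPTIONS (
  model_type = 'LOGISTIC_REG',       -- binary classifier for the fraud label
  input_label_cols = ['is_fraud'],   -- column the model learns to predict
  enable_global_explain = TRUE       -- enables global feature attributions
) AS
SELECT
  transaction_amount,
  merchant_distance_km,
  transaction_hour,
  is_fraud
FROM `retail_banking.fraud_training_data`
"""

client.query(create_model_sql).result()
```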
You can use other models in BigQuery ML such as xgboost and deep neural networks as well. Explainability works with those techniques also.
Beyond training and inspecting a model, BQML makes it easy to access explainable AI capabilities. Users can provide new hypothetical transactions and view the model’s prediction and explanation.
For example, the following query creates three hypothetical transactions with varying amounts, distances, and times of day. The model predicts the first transaction is fraudulent because of the large monetary value and early hour.
BQML Explainable AI Query and Result
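Along the same lines, an ML.EXPLAIN_PREDICT query over hypothetical transactions might look like the sketch below. The three example rows mirror the scenario described above (a large amount at an early hour, plus two more ordinary transactions), but the specific values and column names are made up.

```python
# Hedged sketch: explaining predictions for hypothetical transactions.
from google.cloud import bigquery

client = bigquery.Client()

explain_sql = """
SELECT *
FROM ML.EXPLAIN_PREDICT(
  MODEL `retail_banking.fraud_model`,
  (
    SELECT 5000.0 AS transaction_amount, 120.0 AS merchant_distance_km, 3 AS transaction_hour
    UNION ALL
    SELECT 42.0, 1.5, 14
    UNION ALL
    SELECT 130.0, 8.0, 20
  ),
  STRUCT(3 AS top_k_features)  -- return the top 3 feature attributions per row
)
"""

for row in client.query(explain_sql).result():
    print(dict(row))
```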
While BQML unlocks a rich set of capabilities, it can be more valuable to bring explainable AI to non-technical stakeholders such as business domain experts or executives. These individuals are often better qualified to interpret and validate a model’s explanation. Providing an interface to explainable AI improves the trust, adoption, and overall success of an AI initiative.
Looker helps bring BQML’s explainable AI to stakeholders. Looker is a modern BI tool and data platform that is deeply integrated with BigQuery. With Looker, analysts create governed dashboards and data experiences using a unique semantic model. In this example we use the semantic model to parameterize the BQML SQL statement and create a dashboard. Once built, end users can enter their own transaction details using dashboard filters and view the prediction and model explanation – all without writing any code!
Looker Explainable AI Dashboard
BQML Predictions in Looker’s Semantic Model
The LookML pattern below creates the “What-If Scenario” dashboard. Define a parameter for each hypothetical user input. Build a derived table using ML.EXPLAIN_PREDICT on a subquery with the user’s input parameters. This pattern should be modified based on your dataset, trained model, and desired user inputs. Alternatively, you can reference the existing BQML Looker blocks on the Looker marketplace for an end-to-end guide to using BigQuery Machine Learning with Looker.
Historically model interpretation has been limited to data science teams. Collaborating with business stakeholders has required significant back-and-forth or the development of custom data science applications. Looker and BigQuery ML provide an alternative approach that empowers SQL analysts and enables business collaboration.
You can start with BigQuery Machine Learning and Explainable AI by writing a short query. Or you can learn more about how teams are doing data science with Looker, jumpstart your own use case with the Looker Marketplace BQML blocks, or explore how AI-powered data experiences are possible using Vertex AI and the Looker platform.
Read More for the details.
Many companies, both in Digital Native and traditional regulated industries, such as Finance, Healthcare and Telecom, use their data and cloud technologies to solve complex problems, enable rapid ML experimentation, and bring new products to market.
However, before using Cloud for data workloads, many in regulated industries weigh risk vs reward. The risks fall mainly into three categories: “external cybersecurity”, “data exfiltration or insider threats”, and “cloud provider access to data”. Additionally, data residency requirements as specified by regulations play a crucial role in choosing Cloud over on-premises solutions. While these risks apply to many customers, the security bar and, therefore, the scrutiny from customers within regulated industries are far higher than in other industries.
This blog post describes a set of controls to leverage when creating data products in compliance with security and regulatory requirements using Google Cloud services. We have worked with a number of customers and have observed that their viewpoints differ with respect to applying controls. We suggest you consult your CDO, CISO and legal counsel to ensure that the controls on your GCP projects are in line with the regulatory requirements of your company and industry.
You may be required to store and process your customers’ data within a specified location due to data residency regulations such as GDPR, CCPA, etc. In Google Cloud, you can control where data is stored and processed by restricting resource locations at the Organization, Project or individual service level. Of these, policy-based restrictions at the Organization level are the most convenient to set up.
You can use one of the curated value groups to choose a geographic location or set of locations. Typically, the value groups map to a region, e.g. London, or a set of regions such as the European Union. Resource location restrictions set at the Organization level apply to all services which support resource locations. Additionally, you can choose individual services, e.g. BigQuery, Pub/Sub, Dataflow, etc., and apply resource location restrictions there. The latter allows application-specific customisation and could be a preferred approach for some use cases.
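As one possible way to apply this restriction programmatically at the project level, the sketch below uses the Org Policy Python client (google-cloud-org-policy) to allow only EU locations. The project ID is a placeholder, ‘in:eu-locations’ is one of the predefined value groups, and the exact request shape should be verified against the current client library.

```python
# Hedged sketch: restricting resource locations on a project with the
# Org Policy API. The project ID is a placeholder, and "in:eu-locations"
# is one of the predefined location value groups.
from google.cloud import orgpolicy_v2

PROJECT_ID = "my-gcp-project"  # placeholder

client = orgpolicy_v2.OrgPolicyClient()

policy = {
    "name": f"projects/{PROJECT_ID}/policies/gcp.resourceLocations",
    "spec": {
        "rules": [
            {"values": {"allowed_values": ["in:eu-locations"]}}
        ]
    },
}

client.create_policy(
    request={"parent": f"projects/{PROJECT_ID}", "policy": policy}
)
```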
Further to data residency requirements, several regulations require that data be protected from unauthorized access and highlight the importance of encryption as a mechanism to safeguard data in the event of unauthorized access or data theft. Regulations such as PCI-DSS are very prescriptive. The Bank of England’s Prudential Regulation Authority (PRA) has published a set of supervisory statements detailing the PRA’s expectations of PRA-regulated firms in relation to outsourcing and adoption of cloud.
Google Cloud offers multiple options for encrypting data at rest in services such as Cloud Pub/Sub, Cloud Storage, BigQuery and Persistent Disks. Before delving into the details, we should note that all data on Google Cloud is encrypted as described in this encryption at rest paper. The default encryption uses the AES-256 encryption standard and provides strict key access controls and auditing.
While the default encryption method using a Google-managed key may be sufficient for your use case, Google Cloud offers other encryption options such as Customer-Managed Encryption Keys (CMEK) and External Key Management (EKM), which have proven to be very effective for customers in regulated industries. Both of these options give you fine-grained control over key management. You can disable or revoke a key and periodically rotate it to reduce the risk of data breaches. Keys can be generated externally and imported into Cloud KMS, further enhancing controls on access to data. Additionally, to comply with FIPS 140-2 requirements, you can store encryption keys in Cloud HSM.
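For example, here is a minimal sketch of attaching a customer-managed Cloud KMS key as the default encryption key for a new Cloud Storage bucket. The project, bucket, key ring and key names are placeholders, and the Cloud Storage service account must be granted access to the key.

```python
# Hedged sketch: creating a Cloud Storage bucket whose objects are
# encrypted by default with a customer-managed Cloud KMS key (CMEK).
from google.cloud import storage

PROJECT_ID = "my-gcp-project"      # placeholder
BUCKET_NAME = "my-regulated-data"  # placeholder
KMS_KEY = (
    f"projects/{PROJECT_ID}/locations/europe-west2/"
    "keyRings/my-keyring/cryptoKeys/my-cmek-key"  # placeholder key resource
)

client = storage.Client(project=PROJECT_ID)
bucket = client.bucket(BUCKET_NAME)
bucket.default_kms_key_name = KMS_KEY  # new objects are encrypted with this key
bucket = client.create_bucket(bucket, location="europe-west2")

print(f"Created {bucket.name} with default CMEK {bucket.default_kms_key_name}")
```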
In some cases, just encrypting data at the storage layer may not be sufficient; for example, Personally Identifiable Information (PII) in nearly every industry, and Price Sensitive Information (PSI) in Financial Services, will require additional protection. Moreover, many use cases will require data obfuscation so that the data can be used without revealing the actual information. On Google Cloud, you can use the Data Loss Prevention service to discover, classify, automatically mask, tokenise and transform sensitive elements in structured and unstructured data (e.g. fields containing notes or images).
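As a small illustration of the latter, the sketch below masks sensitive elements in free text with the Cloud DLP Python client. The project ID and sample text are placeholders.

```python
# Hedged sketch: masking PII in free text with Cloud Data Loss Prevention.
from google.cloud import dlp_v2

PROJECT_ID = "my-gcp-project"  # placeholder

client = dlp_v2.DlpServiceClient()
parent = f"projects/{PROJECT_ID}/locations/global"

item = {"value": "Customer email jane.doe@example.com paid with card 4111-1111-1111-1111"}

# Which sensitive elements to look for.
inspect_config = {
    "info_types": [{"name": "EMAIL_ADDRESS"}, {"name": "CREDIT_CARD_NUMBER"}]
}

# How to transform them: replace every matched character with '#'.
deidentify_config = {
    "info_type_transformations": {
        "transformations": [
            {"primitive_transformation": {"character_mask_config": {"masking_character": "#"}}}
        ]
    }
}

response = client.deidentify_content(
    request={
        "parent": parent,
        "deidentify_config": deidentify_config,
        "inspect_config": inspect_config,
        "item": item,
    }
)
print(response.item.value)  # masked version of the input text
```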
Once data is in the Cloud, it undergoes several transformations to create curated datasets that power descriptive and prescriptive analytics. Google Cloud services used to build pipelines, such as Dataflow, Dataproc and Data Fusion, offer controls such as encryption and the Confidential VM offering. Additionally, you can manage, monitor and govern data across data lakes and data warehouses using Dataplex (in preview). Dataplex helps to centralize the discovery of metadata and data governance.
Data Encryption and masking techniques help to mitigate external cybersecurity risks.
Risks from insider threats are more considerable and more challenging to mitigate. The other difficulty is the inevitably multi-tenant Cloud environment. Typically, you will have multiple GCP projects for different applications. VPC Service Controls helps you mitigate the risk of data exfiltration by isolating multi-tenant services and ensuring that only authorized networks can access sensitive data. Service perimeters set up using VPC Service Controls restrict resource access to allowed IP addresses, identities, and trusted client devices. They also allow you to securely exchange data with resources outside of the perimeter using ingress and egress rules.
Google Cloud’s commitment to access management relies on a set of products such as Access Transparency, Access Approval and Key Access Justifications. Access Transparency logs give you information on actions taken by Google personnel. It is part of Google’s long-term commitment to transparency and user trust. You can use Access Transparency logs to verify that Google personnel are accessing content only for valid business reasons, such as fixing an outage or attending to support requests. A new feature called Access Approval, currently in preview, takes this further by requiring your explicit approval whenever Google personnel need to access your content.
To realize the potential of AI in regulated industries, AI must be used and developed responsibly. To that end, we have established innovation principles that guide Google’s AI applications, best practices to share our work with communities outside of Google and programs to operationalise our efforts. Our AI Principles serve as our ethical charter for research and product development. To further put responsible AI into practice, we have made available to the world a Responsible AI Toolkit to help developers implement the free responsible AI tools and our innovative techniques during the ML workflow.
We’ve put together an example functional architecture to highlight the components of a secure data pipeline and storage as discussed in this article. In this architecture, we are highlighting the various stages of data storage, processing and consumption. At each stage, the process interacts with services providing adequate controls e.g. encryption, identification and tokenization of sensitive data, and transparency reporting via logging. All of the components are deployed within a service perimeter to prevent any data exfiltration.
In this article we have identified a set of controls to ensure that AI/ML pipeline and Analytics on Google Cloud can comply with many regulatory requirements. We have seen that it is possible to restrict access to data and processes to a region and we have highlighted a few of these controls with an example pipeline. Due to the expanding nature of regulatory requirements, creating a compliant solution will require technical as well as legal advice. We suggest you work with your CDO, CISO and Legal counsels to ensure that the solution has all controls required by your organization and industry.
To get started with the Google Cloud data ecosystem, please feel free to contact us or start a free trial.
Read More for the details.
Over the last five years, the November-December holiday shopping season has accounted for an average of 19 percent of annual retail sales, driven in large part by the surge in shopping over the five-day Thanksgiving holiday weekend. In 2021, nearly 180 million Americans shopped between Thanksgiving Day and Cyber Monday, generating online sales of $8.9B on Black Friday and $10.7B on Cyber Monday. While the 2021 holiday shopping season started earlier than in previous years, Cyber Monday remained the biggest online shopping day of the year: in the peak hour (11pm-12am ET/8pm-9pm PT), consumers spent $12M every minute.
At Google Cloud, we’re committed to helping all our partners, from retailers and financial services firms to shippers and grocers, unwrap a successful holiday shopping season with our Black Friday/Cyber Monday (BFCM) program. Launched in 2016 as part of our Event Management Service, this program served more than 100 unique customers in 2021. Our Event Management Service includes activities such as early resource planning, architectural reviews and disaster recovery exercises, and additional expert support to help Premium customers handle Peak Events—not only BFCM, but the Super Bowl, Tax Season, and other occasions where support is critical.
Just like any good proactive holiday shopper, we started planning early for additional needs—outside typical workloads—for the upcoming holiday season across 90+ zones. We then validated those projections with our customers, giving them the opportunity to make changes based on their business needs. In 2021, those projections were spot on, with only one retailer requiring additional resources during the peak events.
In response to customer requests, we also launched a new capacity reservation option in 2021 that allowed customers to signal and secure capacity assurance for their peak shopping times.
To further ensure a successful holiday season, Google Cloud mirrors many retailers by initiating code freezes for key Google Cloud services during the two-week period around BFCM. As consumer online and in-store shopping patterns evolve, we will continue to adjust the change freeze timing as our retail partners require.
If you’ve ever worked retail during the holiday shopping season, you know it’s all hands on deck and long work days and nights. Google Cloud is right there with our retail partners, staffing additional BFCM support across the globe to watch for, and ensure speedy resolution of, any issues that arise. Knowing the importance of this time period, we also provide regular status updates to key leadership.
Carrefour Retail Brazil, one of Brazil’s leading ecommerce retailers, is one of the dozens of Google Cloud customers that have been able to enjoy the season with the help of our BFCM program. “I appreciate Google Cloud’s support related to Black Friday,” said Paulo Farroco, Chief Information Officer. “I believe that the Black Friday period has been completely stable.”
Etsy, a global marketplace for unique and creative goods serving more than 5.2 million sellers and offering nearly 100 million items, has a mission to “keep commerce human.” This is more important than ever during peak seasons. The Etsy engineering team reflected on the 2021 holiday season on @codeascraft: “We are proud to support Etsy’s sellers and buyers during the 2021 Cyber 5. Thanks to our teams of engineers and our partners at Google Cloud, we saw no incidents or degradation of service during this massive scale event.”
With the holiday shopping season trending to span weeks instead of days, we’ve already begun planning for 2022. We’ll watch shopping patterns as they evolve and adjust our timelines to better meet our partners’ needs, including extended freeze timing. We’re also encouraging customers to build on what they learned throughout the 2021 BFCM period to support other peak traffic events leading up to BFCM 2022. We’ve discovered that customers who participate year after year move along the “operational maturity” scale so that BFCM becomes easier over time, despite increases in traffic.
Ready to ring in a successful 2022 shopping season? There’s no time like the present to partner with internal and external teams to discuss key lessons learned in 2021 and what will make 2022 even better. Contact your Google Cloud representative to make a fa-la-la-lasting impact on your upcoming holiday sales.
Read More for the details.