Azure – Generally available: A new and improved alert rule creation experience
The ‘Create alert rule’ experience has been simplified and converted into a step wizard, which is now generally available.
Read More for the details.
With 1-minute frequency log alerts, the query can be evaluated every minute for a faster time to fire.
Read More for the details.
When exporting data to an event hub, you now have the flexibility to create multiple export rules and event hub names within the same event hub namespace. This feature is in preview.
Read More for the details.
Secure your custom domains with a free certificate provisioned, managed, and automatically renewed by Azure API Management.
Read More for the details.
New features now generally available include adding multiple private endpoints for workspaces and support for the storage account IP firewall.
Read More for the details.
General Availability enhancements and updates released for Azure SQL
Read More for the details.
Connect Azure Cache for Redis to storage accounts via managed identity.
Read More for the details.
AWS Launch Wizard now supports SAP S/4HANA 2021, SAP BW/4HANA 2021, SAP HANA SPS06, and SUSE Linux Enterprise Server (SLES) 15 SP3.
Read More for the details.
Amazon Location Service now offers request-based pricing for all customer use cases, including those with asset-based applications. For instance, if a developer building a delivery application needs to find the positions (latitude/longitude) associated with delivery addresses, they can pay per address searched. With Amazon Location, there are no upfront fees, no required minimum commitments, and no long-term contracts. Now it’s even easier for developers to add location awareness to their asset-based applications.
Read More for the details.
Today, Amazon Location Service added matrix routing, making it easier for customers to quickly calculate driving time and driving distance between multiple origins and destinations. With matrix routing, developers can use a single API request to reduce the latency associated with multiple routing calculations, allowing them to simplify their code and improve the experience of their customers. For example, an application that plans delivery routes can now use Amazon Location’s matrix routing capabilities to request the driving time and distance for all deliveries for a given period. It can reduce the number of requests and latency by making one request of up to 350 origins by 350 destinations to retrieve 122,500 drive times and distances.
Read More for the details.
For years, retailers have been told that they must embrace a litany of new technologies, trends, and imperatives like online shopping, mobile apps, omnichannel, and digital transformation. In search of growth and stability, retailers adopted many of these, only to realize that for every box they ticked, there was another one waiting.
And then the pandemic hit, along with rising social movements and increasingly harsh weather. Some retailers were more prepared to take on these disruptions than others, which crystallized a new universal truth across the industry: the ability to adapt on the fly became the most important trait to survive and thrive.
Today’s retail landscape has surfaced both existing and new challenges for specialty and department store retailers. Approximately 88% of purchases previously occurred within a store environment. Now, it’s closer to 59%, with the remainder done online or through other omni methods.
With such constant change and upheaval, it can feel like the mantra now is ABP: always be pivoting.
The big question isn’t just how to maintain constant momentum and agility—it’s how to do it without sapping your workforce, your inventory, or your profits in the process. The pivot is now a given. What matters is how you do it.
Adapting requires a flexible base of technology that allows retailers to shift and scale seamlessly with the needs of the moment.
They need to be able to leverage real-time insights and rapidly enhance customer experiences, online and in the real world (not to mention the growing hybridization of AR and VR). They need to modernize their stores to power engaging consumer and associate experiences. They need to enhance operations for rapid scaling between full operations and digital-only offerings.
To help retailers achieve these goals and more, Google Cloud is honing a trio of essential innovations: demand forecasting that harnesses the power of data analytics and artificial intelligence; enhanced product discovery to improve conversion across channels; and the tools to help create the modern store experience.
In other words, here are some of the biggest ways we’re ready to help you pivot.
One of the greatest challenges for retailers when building organizational flexibility is managing inventory and the supply chain.
We are in the midst of one of the worst global supply chain crises, stemming from soaring demand and logistics issues brought on by the pandemic. This crisis has only heightened the challenge retailers face when assessing demand and product availability. Even in normal times, mismanaged inventory adds up to a trillion-dollar problem, according to IHL Group: out-of-stocks cost $634 billion in lost sales worldwide each year, while overstocks result in $472 billion in lost revenue due to markdowns.
On the flipside, optimizing your supply chain can lead to greater profits. For instance, McKinsey predicts that a 10% to 20% improvement in retail supply chain forecasting accuracy is likely to produce a 5% reduction in inventory costs and a 2% to 3% increase in revenues.
Some of the challenges related to demand forecasting include:
Low accuracy leads to excess inventory, missed sales, and pressure on fragile supply chains.
Real drivers of product demand are not included, because large datasets are hard to model using traditional methods.
Poor accuracy for new product launches and products that have sparse or intermittent demand.
Complex models are hard to understand, leading to poor product allocation and low return on investment on promotions.
Different departments use different methods, leading to miscommunication and costly reconciliation errors.
AI-based demand forecasting techniques can help. Vertex AI Forecast supports retailers in maintaining greater inventory flexibility by infusing machine learning into their existing systems. Machine learning and AI-based forecasting models like Vertex AI are able to digest large sets of disparate data, drive analytics and automatically adjust when provided with new information.
With these machine learning models, retailers can not only incorporate historical sales data, but also use close to real-time data such as marketing campaigns, web actions like a customer clicking the “add to cart” button on a website, local weather forecasts, and much more.
If customers can’t easily find what they are looking for, whether online or at the store, they will turn to someone else. That’s a simple statement, but one with profound impacts.
In research conducted by The Harris Poll and Google Cloud, we found that over a six month period, 95% of consumers received search results that were not relevant to what they were searching for on a retail website. And roughly 85% of consumers view a brand differently after an unsuccessful search, while 74% say they avoid websites where they’ve experienced search difficulties in the past.
Each year, retailers lose more than $300 billion to search abandonment, which occurs when a consumer searches for a product on a retailer’s website but does not find what they are looking for. Our product discovery solutions help you surface the right products, to the right customers, at the right time. These solutions include:
Vision Product Search, which is like bringing the augmented reality of Google Lens to a retailer’s own branded mobile app experience. Both shoppers and retail store associates can search for products using an image they’ve photographed or found online and receive a ranked list of similar items.
Recommendations AI, which enables retailers to deliver highly personalized recommendations at scale across channels.
Retail Search, which provides Google-quality search results on a retailer’s own website and mobile applications.
All three are powered by Google Cloud, leveraging Google’s advanced understanding of user context and intent to deliver a seamless experience to every shopper. With these combined capabilities, retailers are able to reduce search abandonment and improve conversions across their digital properties.
Stores are no longer places for just browsing and buying. They must be flexible operation centers, ready to pivot to address changing circumstances. The modern store must be multiple things at once: a mini-fulfillment and return center, a recommendation engine, a shopping destination, a fun place to work, and more.
Just as retail companies had to embrace omnichannel, stores are now becoming omnichannel centers on their own, mixing the digital and physical into a single location. Retailers can use physical stores as a vehicle to deliver superior customer experiences. This will demand heightened levels of collaboration and cooperation between stores, digital, and tech infrastructure teams, building on the agile ways they have worked together.
In many ways, it’s about allowing our physical spaces to function more like digital ones. Google Cloud can help by bringing the scalability, security, and reliability of the cloud to the store, allowing physical locations to upgrade infrastructure and modernize their internal and customer-facing applications.
Think of it as when a new OS gets released for your phone. It’s the same small, hard box, but the experience can feel radically different. Now, extend that same idea to a digitally enabled store. With the right displays, interfaces, and tools at a given retail location, the team only needs to send an over-the-air update to create radically fresh experiences, ranging from sales displays to fulfillment or employee engagement.
Such an approach can enable streamlined experiences for both customers and store associates. For instance, when it comes to the modern store’s evolving role as a fulfillment or return center, cloud solutions can help drive efficiency in stores through automation of ordering, replenishment, and fulfillment of omnichannel order selection.
Similar tools for personalized product discovery online can be applied to customers in the store, helping them to browse and explore, or even create a tailored shopping experience.
Store associates can have a greater impact when they are equipped with technology that provides the expertise behind value-added customer service, and when streamlined operations increase in-store productivity and lower overhead costs. At the register, customers should be able to enjoy frictionless checkout backed by reliable, accurate, and secure transactions.
The ability to adapt and pivot to meet today’s changing consumer needs requires that retailers rely on modern tools to obtain operational flexibility. We believe that every company can be a tech company. That every decision is data driven. That every store is physical and digital all at once. That every worker is a tech worker.
Google Cloud works with retailers to help them solve their most challenging problems. We have the unique ability to handle massive amounts of unstructured data, in addition to advanced capabilities in AI and ML. Our products and solutions help retailers focus on what’s most important—from improving operations to capturing digital and omnichannel revenue.
Read More for the details.
We’ve made several updates to Google Cloud VMware Engine in the past few months — today’s post provides a recap of our latest milestones.
Google Cloud VMware Engine delivers an enterprise-grade VMware stack running natively in Google Cloud. This fully managed cloud service is one of the fastest paths to the cloud for VMware workloads without making changes to existing applications or operating models across a variety of use-cases. These include rapid data center exit, application lift and shift, disaster recovery, virtual desktop infrastructure, or modernization at your own pace.
The service helps our customers save money and time while accelerating their digital transformation journey. In fact, in a study conducted by VMware’s Cloud Economics team, Google Cloud VMware Engine delivers an average of 45% lower TCO compared to on-premises.1 Further, LIQ, a CRM software company, was able to achieve a 60% total infrastructure cost reduction compared with two years ago, and a 92% savings rate for storing historical data.
In June 2021, we announced Autoscale, Mumbai expansion, and more.
Key updates this time around include:
Single node private cloud: a time-bound, 60-day, single node non-production environment for VMware Engine that allows you to do proofs-of-concept.
New private clouds will now deploy on vSphere version 7.0 Update 2 and NSX-T version 3.1.2.
Preview of NetApp Cloud Volumes Service enabling independent scaling of datastore storage from compute without adding additional hosts
Service availability in Toronto and expansion into a second zone in Frankfurt and Sydney
Compliance certifications updates: achievement of ISO 27001/27017/27018, SOC 2 Type 2, SOC 3 and PCI-DSS compliance certifications
Ability to purchase Prepay option via the Google Cloud console for 1 year and 3 year commitment terms
Let us look into each of these updates in more depth.
Single node private cloud: We understand that your cloud transformation decisions do not happen overnight. Often you want to understand the value and benefits of your options by using products through trials and technical validations. To support such scenarios, you can now get started with Google Cloud VMware Engine using a time-bound, 60-day, single node private cloud. Designed for non-production usage such as pilots and proof-of-concept evaluations, this configuration allows you to explore the capabilities of the service. After 60 days, the single node private cloud is automatically deleted along with the workloads and data in it. At any point during those 60 days, you can expand to a production 3-node private cloud with a single click.
Note: A private cloud must contain at least 3 nodes to be eligible for coverage based on the SLA.
Upgrades to the core VMware stack: All new VMware Engine private clouds now deploy with VMware vSphere version 7.0 Update 2 and NSX-T version 3.1.2. For existing customers, Google Cloud VMware Engine automatically handles the upgrades of the VMware stack from version 7.0 Update 1 to 7.0 Update 2 and the NSX-T stack from version 3.0 to 3.1.2 with customers receiving proactive notifications and having the ability to select their upgrade window. Read more in our November 2021 service announcement.
ESXi: enhanced administrative capabilities, reduced compute and I/O latency and jitter for latency-sensitive workloads, and more.
vCenter: scaled VMware vSphere vMotion operations, security fixes, and more.
NSX-T: new events and alarms, support for parallel cluster upgrades, migration from N-VDS to VDS, and more.
Preview of NetApp Cloud Volumes Service as datastores: This capability will enable you to independently scale your datastore storage without adding additional hosts, thereby saving costs. In October 2021, NetApp announced the integration of NetApp Cloud Volumes Service (CVS) as datastores for Google Cloud VMware Engine. It will enable you to migrate your vSphere workloads that require large amounts of vmdk storage to the cloud and address the needs of storage-bound workloads and use-cases such as DR. This complements the ability for you to use NetApp CVS as external storage that is mounted from within the guest OS of your Google Cloud VMware Engine VMs.
Service availability in Toronto: Google Cloud VMware Engine is now available in the Toronto region. This brings the availability of the service to 13 regions globally, enabling our multi-national and regional customers to leverage a VMware-compatible infrastructure-as-a-service platform on Google Cloud.
Expansion into a second zone in Frankfurt and Sydney: While we provide a 99.99% (four nines) SLA in a single zone in each of the 13 regions where the service is available, some customers want even more availability. We are happy to announce that Google Cloud VMware Engine is now available in a second zone in Frankfurt and Sydney. In addition, we are working on making Google Cloud VMware Engine available in additional zones.
Compliance certifications updates:
We enable customers to meet their security and compliance needs for their VMware workloads – with a single operator model. Google manages the Google Cloud VMware Engine infrastructure and the administrative tasks that go with managing the systems, platforms, and VMware stack that supports it. These components run on Google Cloud, which leverages the same secure-by-design infrastructure, built-in protection, and global network that Google uses to protect your information, identities, applications, and devices.
One of the areas that we have been working on is adding more compliance certifications to Google Cloud VMware Engine. As you may remember, Google Cloud VMware Engine is covered under the Google Cloud Business Associate Agreement (BAA). Let us take a look at new certifications we have achieved in the last few months. The below certifications are available for Google Cloud VMware Engine running in Ashburn, Los Angeles, Frankfurt, London, Tokyo, Sydney, Netherlands, Singapore, São Paulo, Montreal, Council Bluffs, Mumbai. The supported locations are listed in the corresponding audit reports. Your Google contact should be able to provide you with those reports.
ISO Compliance: As of November 4, 2021, Google Cloud VMware Engine is certified as ISO/IEC 27001/27017/27018 compliant. The International Organization for Standardization (ISO) is an independent, non-governmental international organization with a membership of 163 national standards bodies. The ISO/IEC 27000 family of standards helps organizations keep their information assets more secure.
SOC 2 Type 2 and SOC 3 Compliance: Google Cloud VMware Engine has received the SOC 2 Type 2 as well as the SOC 3 report based on third-party audit.
The SOC 2 is a report based on the Auditing Standards Board of the American Institute of Certified Public Accountants’ (AICPA) existing Trust Services Criteria (TSC). The purpose of this report is to evaluate an organization’s information systems relevant to security, availability, processing integrity, confidentiality, and privacy.
Like SOC 2, the SOC 3 report has been developed based on the Auditing Standards Board of the American Institute of Certified Public Accountants’ (AICPA) Trust Service Criteria (TSC). The SOC 3 is a public report of internal controls over security, availability, processing integrity, and confidentiality.
Please contact your Google account team if you would like a copy of the report.
PCI DSS Compliance: Google Cloud VMware Engine has been reviewed by an independent Qualified Security Assessor and determined to be PCI DSS 3.2.1 compliant. This means that the service provides an infrastructure upon which customers may build their own services or applications which store, process, or transmit cardholder data. It is important to note that customers are still responsible for ensuring that their applications are PCI DSS compliant. PCI DSS is a set of network security and business best practices guidelines adopted by the PCI Security Standards Council to establish a “minimum security standard” to protect customers’ payment card information. Google Cloud undergoes at least an annual third-party audit to certify individual products against the PCI DSS.
Please contact your Google account team if you would like a copy of the reports.
Prepay via Google Cloud Console: As you may be aware, there are monthly as well as prepay options for 1-year and 3-year commitment contracts when purchasing Google Cloud VMware Engine. Monthly payment options are executable via the Google Cloud console, but prepay options require offline order processing. Prepay options are attractive due to the high discount levels they offer (up to 50%). We are working on enabling the prepay purchasing option directly via the Google Cloud console. If you are interested in this capability, please contact your Google Sales representative.
This brings us to the end of our updates this time around. For the latest updates to the service, please bookmark our release notes.
The authors would like to thank Krishna Chengavalli and Manish Lohani for their contributions to this article.
1. https://blogs.vmware.com/cloud/2021/07/28/google-cloud-vmware-engine-saves-over-45-on-tco-in-first-study/
Read More for the details.
If you manage sensitive data in your MySQL database, you might be obligated to record and monitor user database activity, especially if you work in a regulated industry. Although you could set up MySQL’s slow query log or general log to create an audit trail of user activity, these logs significantly impact database performance and aren’t formatted optimally for auditing. Purpose-built, open source audit plugins are better, but they lack some of the advanced security features that enterprise users need, such as rule-based auditing and results masking.
Cloud SQL for MySQL has developed a new audit plugin, the Cloud SQL for MySQL Audit Plugin, which offers enterprise-grade database auditing to help you maintain a strong, compliant security posture. You can now define audit rules that govern which database activity is recorded, and that activity is recorded in the form of database audit logs. The plugin masks sensitive data, such as user passwords, out of the audit logs, and the processed database audit logs are then sent to Cloud Logging, where you can view them to understand who performed what operations on which data, and when. You can also route these logs using a user-defined log sink to a Google Cloud Storage bucket or BigQuery for long-term storage and compliance, or to Splunk or another log management tool to detect unusual activity in real time.
Say you’re a security engineer at Money Buckets Bank and you’ve been asked by the compliance department to audit activity on the “bank-prod” Cloud SQL instance in the “money-buckets” project. You’re asked to audit two types of activity:
Any write activity by any user on the sensitive “transactions” table in the “finance” database.
Any activity by the “dba1” and “dba2” superuser accounts.
As a security engineer, you want to narrowly define rules that only audit the sensitive activity, ensuring minimal impact to database performance. After enabling MySQL database auditing, you would call MySQL stored procedures to configure the two audit rules:
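The exact stored-procedure interface is documented with the plugin; the calls below are only an illustrative sketch, and the procedure arguments and values shown are assumptions rather than the documented signature.

-- Illustrative sketch only: the argument order and values are assumptions;
-- consult the Cloud SQL for MySQL database auditing documentation for the exact signature.
-- Rule 1: audit write operations by any user on the finance.transactions table.
CALL mysql.cloudsql_create_audit_rule('*', 'finance', 'transactions',
                                      'INSERT,UPDATE,DELETE', 'B', 1, @outval, @outmsg);
-- Rule 2: audit all operations by the dba1 and dba2 superuser accounts.
CALL mysql.cloudsql_create_audit_rule('dba1,dba2', '*', '*', '*', 'B', 1, @outval, @outmsg);
SELECT @outval, @outmsg;  -- check that the rules were created successfully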
The plugin stores these audit rules in the “mysql” system database. The plugin monitors database activity from MySQL’s Audit API and, when activity matches the audit rules, records a log to send to Cloud Logging.
Later that month, you decide to review these audit logs in the Logs Explorer. To isolate all the MySQL database auditing log entries from the “money-buckets” project, you’d enter in the following query filter:
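The filter below is a sketch; the log name used for MySQL database audit entries is an assumption here (the Logs Explorer filter syntax has no comments, so the caveat lives in this paragraph), and you should adjust it to match the log entries your instance actually emits.

resource.type="cloudsql_database"
resource.labels.database_id="money-buckets:bank-prod"
logName="projects/money-buckets/logs/cloudaudit.googleapis.com%2Fdata_access"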
You can now use these audit log entries in your audit trail in order to comply with the key finance regulations that govern Money Buckets Bank.
With MySQL database auditing, you can collect audit records of user database activity for security and compliance purposes. To learn more about database auditing for Cloud SQL for MySQL, see the documentation.
Read More for the details.
The uniqueness of each organization in the enterprise IT space creates interesting challenges in how alerts need to be handled. With many commercial tools on the IT Service Management (ITSM) market, and lots of custom internal tools, we equip teams with tools that are both flexible and powerful.
This post is for Google Cloud customers who want to deliver Cloud Monitoring alert notifications to third-party services that don’t have supported notification channels.
It provides a working implementation of integrating Cloud Pub/Sub notification channels with the Google Chat service to forward the alert notifications to Google Chat rooms and demonstrates how this is deployed on Google Cloud. Moreover, it outlines steps for continuous integration using Cloud Build, Terraform, and GitHub. All the source code for this project can be found in this GitHub repository.
It is worth noting that the tutorial provides a generic framework that can be adapted by Google Cloud customers to deliver alert notifications to any 3rd-party services that provide Webhook/Http API interfaces.
Instructions for how to modify the sample code to integrate with other 3rd-party services are explained in the section “Extending to other 3rd-party services”.
Write a service to forward Google Cloud Monitoring alert notifications from Cloud Monitoring Pub/Sub notification channels to a third-party service.
Build and deploy the service to Cloud Run using Cloud Build, Terraform, and GitHub.
This tutorial uses billable components of Google Cloud:
Cloud Build
Cloud Compute Engine (GCE)
Cloud Container Registry
Cloud Pub/Sub
Cloud Run
Cloud Storage
Use the pricing calculator to generate a cost estimate based on your projected usage.
For this tutorial, you need a GCP project. You can create a new project or select a project that you’ve already created:
Select or create a Google Cloud project.
Go to the project selector page
Enable billing for your project.
Enable billing
When you finish this tutorial, you can avoid continued billing by deleting the resources you created. For details, see the “Cleaning up” section at the end of this tutorial.
This tutorial provides a sample integration to enable Google Cloud customers to forward alert notifications to their Google Chat rooms. The system architecture is as follows:
In the example, two monitoring alerting policies are created using Terraform: one is based on the GCE instance CPU usage_time metric and the other is based on the GCE instance disk read_bytes_count metric. Both alert policies use Cloud Monitoring Pub/Sub notification channels to send alert notifications. A Cloud Pub/Sub push subscription is configured for each Cloud Pub/Sub notification channel. The push endpoints of the Cloud Pub/Sub push subscriptions are pointed to the Cloud Run service we implement so that all the alert notifications sent to the Cloud Pub/Sub notification channels are forwarded to the Cloud Run service. The Cloud Run service is a simple Http server that transforms the incoming Cloud Pub/Sub messages into Google Chat messages and sends them to the configured Google Chat rooms via their incoming Webhook URLs.
All the infrastructure components are automatically created and configured using Terraform, which include:
Cloud Pub/Sub topics, push subscriptions, and service account setup.
Cloud Pub/Sub notification channels
Cloud Monitoring Alerting policies
Cloud Run service and service account setup.
The Terraform code can be found at ./tf-modules and ./environments.
The Cloud Run service is responsible for delivering the Cloud Pub/Sub alert notifications to the configured Google Chat rooms. The integration code is located in the ./notification_integration folder.
In this example, a basic Flask HTTP server is set up in main.py to handle incoming Cloud Monitoring alert notifications from Cloud Monitoring Pub/Sub channels. We use Cloud Pub/Sub push subscriptions to forward the Pub/Sub notification messages to the Flask server in real time. More information on Cloud Pub/Sub subscription can be found in the Subscriber overview.
A handler in main.py processes each incoming Pub/Sub message.
The handler calls the ExtractNotificationFromPubSubMsg() function in utilities/pubsub.py to parse the relevant notification data from the Pub/Sub message, and then loads the notification data into a dictionary. The output is a json object with the schema defined here.
This notification dictionary is then passed to SendNotification(), which sends the notification along with config_params to _SendHttpRequest() in utilities/service_handler.py, which in turn notifies the third-party service about the alert through an API client. There is a URL parameter, config_id, which is the configuration ID used by the Cloud Run service to retrieve the configuration data config_params. config_params includes all the parameters the Cloud Run service needs (e.g., the HTTP URL and user credentials) to forward the incoming notification to the third-party service. In this example, config_id corresponds to the Pub/Sub topics defined here.
You can modify this dispatch function to forward alerts to any third-party service.
Remember to acknowledge the Pub/Sub message on success by returning a success HTTP status code (200 or 204). See Receiving push messages.
All the logs written by the Cloud Run service can be easily accessed from either the Cloud Logging Logs Explorer or the Cloud Run UI, and they are very useful for debugging the Cloud Run service. Moreover, users can create an extra pull subscription on the Pub/Sub topic used by the Cloud Pub/Sub notification channel to simplify the triage of notification delivery issues. For example, if some alert notifications were not delivered to users’ Google Chat rooms, users could first check whether the pull subscription received the Cloud Pub/Sub messages for the missing alert notifications. If it did, the alert notifications were lost in the Cloud Run service; otherwise, the issue was with the Cloud Pub/Sub notification channel.
Finally, a Dockerfile contains the instructions to build the image that hosts the Flask server when deployed to Cloud Run.
This section describes how to deploy and set up continuous integration using Cloud Build, Terraform, and GitHub, following the GitOps methodology. The instructions are based on Managing infrastructure as code with Terraform, Cloud Build, and GitOps, which also explains the GitOps methodology and architecture. Sections from the guide are also referenced in the steps below. An important difference is that this document assumes that separate Google Cloud projects are used for the dev and prod environments, whereas the referenced guide configures the environments as virtual private clouds (VPCs). As a result, the following deployment steps (with the exception of “Setting up your GitHub repository”) need to be executed for each of the dev and prod projects.
To get all the code and understand the repository structure needed to deploy your app, follow the steps in Setting up your GitHub repository.
We provide a config_map variable within main.py to store your webhook URLs. You’ll first need to locate your Google Chat webhook URL and replace the value for the key ‘webhook_url’ within the config_map dictionary.
Alternatively, if you’d like a more secure option for storing your webhook URLs, you can create a GCS bucket to store them.
Locate your Google Chat webhook URLs for your Google Chat rooms and store them in a JSON file named config_params.json in the following format:
{"<topic-name-1>": "<webhook-url-1>", "<topic-name-2>": "<webhook-url-2>"}
Create a Cloud Storage bucket to store the json file with the name gcs_config_bucket_{PROJECT_ID}.
You can also run this command in the cloud console: gsutil mb gs://gcs_config_bucket_{PROJECT_ID}
Grant the read permissions (Storage Legacy Bucket Reader and Storage Legacy Object Reader) to the default Cloud Run service account <PROJECT_NUMBER>-compute@developer.gserviceaccount.com
To deploy the notification channel integration sample for the first time automatically, we’ve provided a script, deploy.py, that handles a majority of the required actions for deployment. After completing the webhook URL step above, run the following command:
python3 deploy.py -p <PROJECT_ID>
To deploy the notification channel integration manually, you’ll have to complete the following steps:
1. Set the Cloud Platform Project in Cloud Shell. Replace <PROJECT_ID> with your Cloud Platform project id:
gcloud config set project <PROJECT_ID>
2. Enable the Cloud Build Service:
gcloud services enable cloudbuild.googleapis.com
3. Enable the Cloud Resource Manager Service:
gcloud services enable cloudresourcemanager.googleapis.com
4. Enable the Cloud Service Usage Service:
gcloud services enable serviceusage.googleapis.com
5. Grant the required permissions to your Cloud Build service account:
CLOUDBUILD_SA="$(gcloud projects describe $PROJECT_ID --format 'value(projectNumber)')@cloudbuild.gserviceaccount.com"
gcloud projects add-iam-policy-binding $PROJECT_ID --member serviceAccount:$CLOUDBUILD_SA --role roles/iam.securityAdmin
gcloud projects add-iam-policy-binding $PROJECT_ID --member serviceAccount:$CLOUDBUILD_SA --role roles/run.admin
gcloud projects add-iam-policy-binding $PROJECT_ID --member serviceAccount:$CLOUDBUILD_SA --role roles/editor
6. Create Cloud Storage bucket to store Terraform states remotely:
PROJECT_ID=$(gcloud config get-value project)
gsutil mb gs://${PROJECT_ID}-tfstate
7. (Optional) You may enable Object Versioning to keep the history of your deployments:
gsutil versioning set on gs://${PROJECT_ID}-tfstate
8. Trigger a build and deploy to Cloud Run:
If you used the in-memory config server, run the following (replace <BRANCH> with the current environment branch):
gcloud builds submit . --config cloudbuild.yaml --substitutions BRANCH_NAME=<BRANCH>,_CONFIG_SERVER_TYPE=in-memory
If you use the GCS-based config server, run:
gcloud builds submit . --config cloudbuild.yaml --substitutions BRANCH_NAME=<BRANCH>,_CONFIG_SERVER_TYPE=gcs
This is an optional flow. This section describes how to set up continuous deployment using Cloud Build triggers, as demonstrated in the following diagram: every time users push a new version to their Git repository, the Cloud Build trigger fires; Cloud Build then runs the YAML file to rebuild the Cloud Run Docker image, update the infrastructure setup, and redeploy the Cloud Run service.
The instructions are based on Automating builds with Cloud Build.
Set up a code repository; this could be GitHub, Google Cloud Source Repositories, or any private repository.
Clone the repository from our GitHub.
Switch to the new project and push the cloned repository to the remote repository.
Next we create a new trigger in Cloud Build.
Step 1: Go to Cloud Build and click “Triggers”
Step 2: Click “Create Trigger”
If you created a new project for this tutorial, delete the project. If you used an existing project and wish to keep it without the changes added in this tutorial, delete resources created for the tutorial.
The easiest way to eliminate billing is to delete the project you created for the tutorial.
Deleting a project has the following effects:
Everything in the project is deleted. If you used an existing project for this tutorial, when you delete it, you also delete any other work you’ve done in the project.
Custom project IDs are lost. When you created this project, you might have created a custom project ID that you want to use in the future. To preserve the URLs that use the project ID, such as an appspot.com URL, delete selected resources inside the project instead of deleting the whole project.
If you plan to explore multiple tutorials and quickstarts, reusing projects can help you avoid exceeding project quota limits.
To delete a project, do the following:
In the Cloud Console, go to the Manage resources page.
Go to the Manage resources page
In the project list, select the project that you want to delete and then click Delete.
In the dialog, type the project ID and then click Shut down to delete the project.
Delete the Cloud resources provisioned by Terraform:
terraform destroy
Delete the Cloud Storage bucket called {PROJECT_ID}-tfstate.
Delete permissions that were granted to the Cloud Build service account:
gcloud projects remove-iam-policy-binding $PROJECT_ID --member serviceAccount:$CLOUDBUILD_SA --role roles/iam.securityAdmin
gcloud projects remove-iam-policy-binding $PROJECT_ID --member serviceAccount:$CLOUDBUILD_SA --role roles/run.admin
gcloud projects remove-iam-policy-binding $PROJECT_ID --member serviceAccount:$CLOUDBUILD_SA --role roles/storage.admin
Delete permission for the service account to publish to tf-topic:
gcloud pubsub topics remove-iam-policy-binding projects/[PROJECT_NUMBER]/topics/tf-topic --role=roles/pubsub.publisher --member=serviceAccount:service-[PROJECT_NUMBER]@gcp-sa-monitoring-notification.iam.gserviceaccount.com
Delete the notification channel that uses tf-topic.
Delete your forked GitHub repository notification_integration.
Disconnect the GitHub repository from Cloud Build by deleting the Cloud Build triggers.
The sample code in the tutorial provides a generic framework and can be easily customized for Google Cloud customers to deliver alert notifications to any 3rd-party services that provide Webhook/Http API interfaces.
To integrate with a new 3rd-party service, we can create a new derived class of the abstract class HttpRequestBasedHandler defined in ./notification_channel/service_handler.py and update the following member functions:
CheckConfigParams(): A function that checks if a given integration configuration is valid, e.g. a required API key is given.
_GetHttpUrl(): A function that gets the HTTP URL (where to send HTTP requests) from the configuration data.
_BuildHttpRequestHeaders(): A function that constructs the HTTP request headers.
_BuildHttpRequestBody(): A function that constructs the HTTP request message body based on the incoming Cloud Pub/Sub message.
SendNotification(): You can reuse the one defined in the GchatHandler class.
There is no need to update the Terraform code unless you want to customize your alert policies. If you have additional suggestions, community feedback is always welcome. Please submit pull requests to help build the GitHub repository together.
Read More for the details.
When an alert fires from your applications, your team needs to know as soon as possible to mitigate any user-facing issues. Customers with complex operating environments rely on incident management or related services to organize and coordinate their responses to issues. They need the flexibility to route alert notifications to platforms or services in the formats that they can accept.
We’re excited to share that Google Cloud Monitoring’s Webhooks, Pub/Sub, and Slack notification channels for alerting are now Generally Available (GA). Along with our existing notification channels of email, SMS, mobile, and PagerDuty (currently in Beta), Google Cloud alerts can now be routed to many widely used services. These new notification channels can be used to integrate alerts with the most popular Collaboration, ITSM, Incident Management, and virtually any other service or software that support Webhooks or Pub/Sub integration.
You can configure your Google Cloud alerts to be sent to any vendor or custom-built tool used by your team. For example, your GKE cluster uptime checks can send the alert data to a 3rd party communication tool via the pub/sub notification channel. Or if you’re tracking security concerns such as unexpected IP addresses, you can send a log-based alert to your incident management provider.
For custom integrations, Pub/Sub is the recommended approach for sending notifications to a private network. Webhooks are supported for public endpoints and are available with basic and token authentication. Both of these notification channels can be enabled programmatically through an automation tool like Terraform.
If you’re using Slack, you can enable Cloud Monitoring access to your Slack channel/workspace and then create the notification channel. If you’d like to automate Slack channel notification deployments, you’ll need to create and install your own Slack app and reuse the OAuth token instead of using the Google Cloud Monitoring app.
If you’d like to learn more, check out our example tutorial blog on how to send pub/sub notifications to external vendors using Cloud Run and Cloud Build. Please feel free to share your comments and feedback with us in the Google Cloud Community.
Read More for the details.
On January 18th, Amazon announced quarterly security and critical updates for Amazon Corretto Long-Term Supported (LTS) versions. Corretto 11.0.14 and 8.322 are now available for download. Amazon Corretto 17 updates will be available shortly after the release is tagged in the OpenJDK 17 repository. Amazon Corretto is a no-cost, multi-platform, production-ready distribution of OpenJDK.
Read More for the details.
EMR Studio is an integrated development environment (IDE) that makes it easy for data scientists and data engineers to develop, visualize, and debug big data and analytics applications written in R, Python, Scala, and PySpark. Today, we are excited to announce that EMR Studio is now available in the US West (N. California) Region.
Read More for the details.
AWS Storage Gateway now makes it simpler and faster for you to get started with setting up and managing your hybrid cloud storage workflow. Using the Storage Gateway management console, you can now quickly create a new gateway in four easy steps: First, complete your local gateway setup. Second, connect your gateway to AWS. Third, activate your gateway. Fourth, finalize your gateway configuration.
Read More for the details.
Amazon MQ now provides support for RabbitMQ version 3.8.26. This patch update to RabbitMQ contains several fixes and enhancements compared to the previously supported version, RabbitMQ 3.8.23.
Read More for the details.
Explainable AI (XAI) helps you understand and interpret how your machine learning models make decisions. We’re excited to announce that BigQuery Explainable AI is now generally available (GA). BigQuery supports explainable AI more comprehensively than any other data warehouse, with respect to both XAI methodology and model types, and it does this at BigQuery scale, enabling millions of explanations within seconds with a single SQL query.
Why is Explainable AI so important? As a way to demystify the inner workings of machine learning models, Explainable AI is quickly becoming an essential and growing need for businesses as they continue to invest in AI and ML. With 76% of enterprises now prioritizing artificial intelligence (AI) and machine learning (ML) over other initiatives in their 2021 IT budgets, the majority of CEOs (82%) believe that AI-based decisions must be explainable to be trusted, according to a PwC survey.
While the focus of this blogpost is on BigQuery Explainable AI, Google Cloud provides a variety of tools and frameworks to help you interpret models outside of BigQuery, such as with Vertex Explainable AI, which includes AutoML Tables, AutoML Vision, and custom-trained models.
So how does Explainable AI in BigQuery work exactly? And how might you use it in practice?
When it comes to Explainable AI, the first thing to note is that there are two main types of explainability as they relate to the features used to train the ML model: global explainability and local explainability.
Imagine that you have a ML model that predicts housing price (as a dollar amount), based on three features: (1) number of bedrooms, (2) distance to the nearest city center, and (3) construction date.
Global explainability (a.k.a. global feature importance) describes the features’ overall influence on the model and helps you understand if a feature had a greater influence than other features over the model’s predictions. For example, global explainability can reveal that the number of bedrooms and distance to city center typically have a much stronger influence than the construction date on predicting housing prices. Global explainability is especially useful if you have hundreds or thousands of features and you want to determine which features are the most important contributors to your model. You may also consider using global explainability as a way to identify and prune less important features to improve the generalizability of your models.
Local explainability (a.k.a. feature attributions) describes the breakdown of how each feature contributes towards a specific prediction. For example, if the model predicts that house ID#1001 has a predicted price of $230,000, local explainability would describe a baseline amount (e.g. $50,000) and how each of the features contributes on top of the baseline towards the predicted price. For example, the model may say that on top of the baseline of $50,000, having 3 bedrooms contributed an additional $50,000, close proximity to the city center added $100,000, and construction date of 2010 added $30,000, for a total predicted price of $230,000. In essence, understanding the exact contribution of each feature used by the model to make each prediction is the main purpose of local explainability.
BigQuery Explainable AI applies to a variety of models, including supervised learning models for IID data and time series models. The documentation for BigQuery Explainable AI provides an overview of the different ways of applying explainability per model. Note that each explainability method has its own way of calculation (e.g. Shapley values), which are covered more in-depth in the documentation.
In this next section, we will show three examples of how to use BigQuery Explainable AI in different ML applications:
regression
classification
time-series forecasting
Let’s use a boosted tree regression model to predict how much a taxi cab driver will receive in tips for a taxi ride, based on features such as number of passengers, payment type, total payment and trip distance. Then let’s use BigQuery Explainable AI to help us understand how the model made the predictions in terms of global explainability (which features were most important?) and local explainability (how did the model arrive at each prediction?).
The taxi trips dataset comes from the BigQuery public datasets and is publicly available in the table: bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2018.
First, you can train a boosted tree regression model.
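As a rough sketch, the training statement could look like the following. The dataset and model names (taxi.taxi_tip_model), the feature list, the row sampling, and the enable_global_explain option (used for the global explanations shown later) are illustrative assumptions, not the exact statement from the original example.

CREATE OR REPLACE MODEL taxi.taxi_tip_model
OPTIONS (
  model_type = 'BOOSTED_TREE_REGRESSOR',
  input_label_cols = ['tip_amount'],
  enable_global_explain = TRUE  -- assumed here so ML.GLOBAL_EXPLAIN can be used later
) AS
SELECT
  vendor_id,
  rate_code,
  payment_type,
  passenger_count,
  trip_distance,
  total_amount,
  tip_amount
FROM
  `bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2018`
WHERE
  tip_amount IS NOT NULL
  AND total_amount > 0
LIMIT 100000;  -- small sample to keep the example quick to train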
Now let’s do a prediction using ML.PREDICT, which is the standard way in BigQuery ML to make predictions without explainability.
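A minimal ML.PREDICT call against that illustrative model might look like the one below; scoring a single row with LIMIT 1 is just for demonstration, and the ~11.077 value discussed next is the prediction reported in the original example rather than something this exact query is guaranteed to reproduce.

SELECT
  predicted_tip_amount
FROM
  ML.PREDICT(
    MODEL taxi.taxi_tip_model,
    (SELECT
       vendor_id, rate_code, payment_type,
       passenger_count, trip_distance, total_amount
     FROM `bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2018`
     LIMIT 1));  -- score a single example row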
But you might wonder—how did the model generate this prediction of ~11.077?
BigQuery Explainable AI can help us answer this question. Instead of using ML.PREDICT, you use ML.EXPLAIN_PREDICT with an additional optional parameter top_k_features. ML.EXPLAIN_PREDICT extends the capabilities of ML.PREDICT by outputting several additional columns that explain how each feature contributes to the predicted value. In fact, since ML.EXPLAIN_PREDICT includes all the output from ML.PREDICT anyway, you may want to consider using ML.EXPLAIN_PREDICT every time instead.
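Here is a sketch of that call, reusing the illustrative model and input row from above; top_k_features simply limits how many per-feature attributions are returned.

SELECT
  *
FROM
  ML.EXPLAIN_PREDICT(
    MODEL taxi.taxi_tip_model,
    (SELECT
       vendor_id, rate_code, payment_type,
       passenger_count, trip_distance, total_amount
     FROM `bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2018`
     LIMIT 1),
    STRUCT(3 AS top_k_features));  -- return the top 3 feature attributions per row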
The way to interpret these columns is:
Σfeature_attributions + baseline_prediction_value = prediction_value
Let’s break this down. The prediction_value is ~11.077, which is simply the predicted_tip_amount. The baseline_prediction_value is ~6.184, which is the tip amount for an average instance. top_feature_attributions indicates how much each of the features contributes towards the prediction value. For example, total_amount contributes ~2.540 to the predicted_tip_amount.
ML.EXPLAIN_PREDICT provides local feature explainability for regression models. For global feature importance, see the documentation for ML.GLOBAL_EXPLAIN.
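For the illustrative model above, that global view is a one-line query; note that this assumes the model was trained with enable_global_explain = TRUE, as in the earlier training sketch.

-- Global feature importance for the illustrative regression model.
SELECT * FROM ML.GLOBAL_EXPLAIN(MODEL taxi.taxi_tip_model);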
Let’s use a logistic regression model to show you an example of BigQuery Explainable AI with classification models. We can use the same public dataset as before: bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2018.
Train a logistic regression model to predict the bracket of the percentage of the tip amount out of the taxi bill.
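The original example does not spell out the bracket definition here, so the label construction below (three brackets based on the tip as a share of the total bill), along with the dataset and model names and the sampling, is an assumption for illustration only.

CREATE OR REPLACE MODEL taxi.tip_bracket_model
OPTIONS (
  model_type = 'LOGISTIC_REG',
  input_label_cols = ['tip_bracket'],
  enable_global_explain = TRUE  -- assumed so ML.GLOBAL_EXPLAIN can also be used
) AS
SELECT
  vendor_id,
  rate_code,
  payment_type,
  passenger_count,
  trip_distance,
  total_amount,
  CASE  -- assumed bracket definition; adjust to your own business rules
    WHEN SAFE_DIVIDE(tip_amount, total_amount) < 0.10 THEN 'tip_below_10_pct'
    WHEN SAFE_DIVIDE(tip_amount, total_amount) < 0.20 THEN 'tip_10_to_20_pct'
    ELSE 'tip_above_20_pct'
  END AS tip_bracket
FROM
  `bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2018`
WHERE
  tip_amount IS NOT NULL
  AND total_amount > 0
LIMIT 100000;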
Next, you can run ML.EXPLAIN_PREDICT to get both the classification results and the additional information for local feature explainability. For global explainability, you can use ML.GLOBAL_EXPLAIN. Again, since ML.EXPLAIN_PREDICT includes all the output from ML.PREDICT anyway, you may want to consider using ML.EXPLAIN_PREDICT every time instead.
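Again as a sketch against the illustrative model above; asking for six top features is an assumption so that every feature mentioned in the walkthrough below receives an attribution.

SELECT
  *
FROM
  ML.EXPLAIN_PREDICT(
    MODEL taxi.tip_bracket_model,
    (SELECT
       vendor_id, rate_code, payment_type,
       passenger_count, trip_distance, total_amount
     FROM `bigquery-public-data.new_york_taxi_trips.tlc_yellow_trips_2018`
     LIMIT 1),
    STRUCT(6 AS top_k_features));  -- attributions for all six features in this sketch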
Similar to the regression example earlier, the formula is used to derive the prediction_value:
Σfeature_attributions + baseline_prediction_value = prediction_value
In the ML.EXPLAIN_PREDICT output for this example, the baseline_prediction_value is ~0.296. total_amount is the most important feature for this specific prediction, contributing ~0.067 to the prediction_value, followed by trip_distance. The feature passenger_count contributes negatively to the prediction_value, by -0.0015. The features vendor_id, rate_code, and payment_type did not contribute much to the prediction_value.
You may wonder why the prediction_value of ~0.389 doesn’t equal the probability value of ~0.359. The reason is that unlike for regression models, for classification models, prediction_value is not a probability score. Instead, prediction_value is the logit value (i.e., log-odds) for the predicted class, which you could separately convert to probabilities by applying the softmax transformation to the logit values. For example, a three-class classification has a log-odds output of [2.446, -2.021, -2.190]. After applying the softmax transformation, the probability of these class predictions is [0.9905, 0.0056, 0.0038].
Explainable AI for forecasting provides more interpretability into how the forecasting model came to its predictions. Let’s go through an example of forecasting the number of bike trips in NYC using the new_york.citibike_trips public data in BigQuery.
You can train a time-series model ARIMA_PLUS:
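Below is a sketch of the training statement, followed by the plain ML.FORECAST call it is contrasted with. The dataset and model names, the daily aggregation on the starttime column, the US holiday region, and the 30-day horizon are illustrative assumptions.

CREATE OR REPLACE MODEL bikes.nyc_citibike_forecast
OPTIONS (
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'trip_date',
  time_series_data_col = 'num_trips',
  holiday_region = 'US'  -- assumed, so the holiday effect discussed later is modeled
) AS
SELECT
  DATE(starttime) AS trip_date,
  COUNT(*) AS num_trips
FROM
  `bigquery-public-data.new_york.citibike_trips`
WHERE
  starttime IS NOT NULL
GROUP BY
  trip_date;

-- Plain forecast: returns a forecast value plus a prediction interval per future timestamp.
SELECT
  *
FROM
  ML.FORECAST(
    MODEL bikes.nyc_citibike_forecast,
    STRUCT(30 AS horizon, 0.9 AS confidence_level));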
This function outputs the forecasted values and the prediction interval, which can be plotted alongside the input time series.
But how does the forecasting model arrive at its predictions? Explainability is especially important if the model ever generates unexpected results.
With ML.EXPLAIN_FORECAST, BigQuery Explainable AI provides extra transparency into the seasonality, trend, holiday effects, level (step) changes, and spikes and dips outlier removal. In fact, since ML.EXPLAIN_FORECAST includes all the output from ML.FORECAST anyway, you may want to consider using ML.EXPLAIN_FORECAST every time instead.
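Here is a sketch of the corresponding call, using the same illustrative model and horizon as above.

SELECT
  *
FROM
  ML.EXPLAIN_FORECAST(
    MODEL bikes.nyc_citibike_forecast,
    STRUCT(30 AS horizon, 0.9 AS confidence_level));
-- In addition to the forecast, the output includes the decomposed components:
-- trend, seasonal periods, holiday effect, step (level) changes, and spikes and dips.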
Compared with the plain forecasting output, ML.EXPLAIN_FORECAST returns much richer information that explains how the forecast is made.
First, it shows how the input time series is adjusted by removing the spikes and dips anomalies and by compensating for the level changes. That is:
time_series_adjusted_data = time_series_data – spikes_and_dips – step_changes
Second, it shows how the adjusted input time series is decomposed into different components, such as the weekly and yearly seasonal components, the holiday effect component, and the trend component. That is:
time_series_adjusted_data = trend + seasonal_period_yearly + seasonal_period_weekly + holiday_effect + residual
Finally, it shows how these components are forecasted separately to compose the final forecasting results. That is:
time_series_data = trend + seasonal_period_yearly + seasonal_period_weekly + holiday_effect
For more information on these time series components, please see the documentation here.
With the GA of BigQuery Explainable AI, we hope you will now be able to interpret your machine learning models with ease.
Thanks to the BigQuery ML team, especially Lisa Yin, Jiashang Liu, Amir Hormati, Mingge Deng, Jerry Ye and Abhinav Khushraj. Also thanks to the Vertex Explainable AI team, especially David Pitman and Besim Avci.
Read More for the details.