Azure – Review your VNET configurations in Switzerland North that may be affected by IP address changes
On 30 September 2023 we’re making the resource providers for Azure API Management zone redundant in each region.
Read More for the details.
On 30 September 2023, all API versions prior to 2021-08-01 will be retired and API calls using those API versions will fail.
Read More for the details.
Keep up with all the latest news, insights, and product announcements in the world of productivity and collaboration on the new Google Workspace blog. The blog features an easier-to-read layout, enhanced visual elements, and deeper content sections for each Workspace product area — from Docs and Drive to Meet and AppSheet.
The Google Cloud blog will continue to run stories about Workspace that matter to our broader Cloud audience.
Thanks for reading!
Looking for other resources?
Google Workspace on The Keyword blog: Workspace stories that matter most to consumers and users.
Google Workspace Updates blog: Essential information about new features and improvements for Google Workspace customers.
Google Workspace Learning Center: One place to find tips and tutorials for getting the most out of Workspace.
Google Workspace for Developers: The latest for apps and integrations, including for no-code and low-code developers.
Read More for the details.
Starting today, you can use certificate-based authentication with Amazon AppStream 2.0 fleets that are joined to Active Directory to remove the logon prompt for the domain password.
Read More for the details.
Today, we are announcing Elastic IP transfer, a new Amazon VPC feature that allows you to transfer your Elastic IP addresses from one AWS Account to another, making it easier to move Elastic IP addresses during AWS Account restructuring.
Read More for the details.
AWS IoT Core announces Location Action – a capability to route latitude and longitude data from IoT devices to Amazon Location Service, making it easier for software developers to add geospatial data and location functionality to IoT applications. With this launch, you can route live location data of an IoT device to Amazon Location Service for tracking and geo-fencing use cases, such as tracking the live location of a device or receiving alerts when a device crosses a geo-fence.
Read More for the details.
Amazon Kinesis Data Streams is a serverless streaming data service that makes it easy to capture, process, and store streaming data at massive scale. Data Viewer is a new capability for Amazon Kinesis Data Streams that allows viewing data records directly from AWS Management Console. As a result, you can easily inspect the data records without programming a dedicated consumer app just to view the data, quickly check the data structure of an unfamiliar stream, or query specific records for QA and troubleshooting.
Read More for the details.
We are excited to announce that Amazon EMR release 6.8 includes Apache Flink 1.15.1. This feature is available on EMR on EC2.
Read More for the details.
Amazon EMR release 6.8 now supports Apache Hudi 0.11.1 and Apache Iceberg 0.14.0. You can use these frameworks on Amazon EMR on EC2, and Amazon EMR on EKS as well as on Amazon EMR Serverless.
Read More for the details.
Azure Automation now supports runbooks in the latest runtime versions, PowerShell 7.2 and Python 3.10, in public preview.
Read More for the details.
AWS Launch Wizard now supports placing Microsoft SQL Server tempdb on instance store volumes during SQL Server deployment on Amazon EC2. With this launch, you can save time and effort by configuring tempdb with a single click during deployment instead of configuring it manually afterward.
Read More for the details.
All Azure Windows VMs provisioned in Azure US Government Cloud after February 1, 2023, will be activated via azkms.core.usgovcloudapi.net, which points to one new KMS IP address, 52.126.105.2.
Read More for the details.
Azure Logic Apps (Standard) support for Azure Functions runtime version 4.x is now generally available.
Read More for the details.
Welcome to October’s Cloud CISO Perspectives. This month, we’re focusing on our just-completed Google Cloud Next conference and Mandiant’s inaugural mWise Conference, and what our slate of cybersecurity announcements can reveal about how we are approaching the thorniest cybersecurity challenges facing the industry today.
As I wrote in last month’s newsletter, a big part of our strategy involves integrating Mandiant’s threat intelligence with our own to help improve our ability to stop threats and to modernize the overall state of security operations faster than ever before. We focused on the democratization of SecOps to help provide better security outcomes for organizations of all sizes and levels of expertise. Therefore, it’s vital that our cybersecurity intelligence be an integral part of customer security strategies.
This is all part of our vision of engineering advanced capabilities into our platforms and simplifying operations, so that stronger security outcomes can be achieved.
As with all Cloud CISO Perspectives, the contents of this newsletter are posted to the Google Cloud blog. If you’re reading this on the website and you’d like to receive the email version, you can subscribe here.
I recently wrote on my personal blog about the grind of routine security work, and the challenges security professionals face in moving forward through our daily tasks and toil to achieve a better security state. We focus on two fundamentals: We strive to achieve grand challenges and create exponential growth in security outcomes, and we remain equally focused on tactical improvements to reduce the wear and tear of the daily grind.
Many of Google Cloud’s announcements at this year’s Next are the result of envisioning a new, improved security state, and working hard to achieve it.
At this year’s Next, we took a deep dive into our security philosophy, helped customers achieve their security goals with hands-on training, and made five major security announcements:
We introduced Chronicle Security Operations, which can help detect, investigate, and respond to cyberthreats with the speed, scale, and intelligence of Google.
We introduced Confidential Space, which can help unlock the value of secure data collaboration.
We introduced Software Delivery Shield, which can help improve software supply chain security.
We detailed our latest advancements in digital sovereignty, to address the growing demand for cloud solutions with high levels of control, transparency, and sovereignty.
And we introduced new and expanded Google Cloud partnerships with leaders across the security ecosystem.
We also revealed new capabilities across our existing slate of security products. These include:
Our Assured Open Source Software service, which we announced earlier this year, is now available in Preview.
The integration of groundbreaking technology from Foreseeti, which can help teams understand their exposure and prioritize contextualized vulnerability findings, will be coming soon to Security Command Center in Preview.
reCAPTCHA Enterprise will partner with Signifyd’s anti-fraud technology to bring to market a joint anti-fraud and abuse solution that can help enterprises reduce abuse, account takeovers, and payment fraud.
Palo Alto Networks customers can now pair Prisma Access with BeyondCorp Enterprise Essentials to help secure private and SaaS app access while mitigating threats with a secure enterprise browsing experience.
Google Workspace has received several security updates and advances. They bring data loss prevention (DLP) to Google Chat to help prevent sensitive information leaks, new Trust rules for Google Drive for more granular control of internal and external sharing, and client-side encryption in Gmail and Google Calendar to help address a broad range of data sovereignty and compliance requirements.
Google Cloud Armor, which was instrumental in stopping the largest Layer 7 DDoS attack to date, was named a Strong Performer in The Forrester Wave™: Web Application Firewalls, Q3 2022. This is our debut in the WAF Wave, and it’s encouraging to see the recognition for the product in this market segment.
New Private Service Connect capabilities available now in Preview include consumer-controlled security, routing, and telemetry to help enable more flexible and consistent policy for all services; support for on-prem traffic through Cloud Interconnects to PSC endpoints; support for hybrid environments; and five new partner managed services.
We are expanding our Cloud Firewall product line and introducing two new tiers: Cloud Firewall Essentials and Cloud Firewall Standard.
We want to help transform how organizations can secure themselves not just in the cloud but across all their environments. This also includes changing how security teams can engage and retain the support of their Boards and executive teams. At the mWise Conference, held in Washington, D.C. the week following Next ‘22, Kevin Mandia and I talked about the need for higher expectations of the board and CISO (and CIO) relationship to drive this transformation.
We’ve written about the importance of this change here in this newsletter, and we at Google Cloud have suggested 10 questions that can help facilitate better conversations between CISOs and their boards.
As you’ve seen, it’s been a bumper set of announcements and content this month. That momentum will continue as we further build the Most Trusted Cloud, now in partnership with our new colleagues from Mandiant.
Here are the latest updates, products, services and resources from our security teams this month:
How Cloud EKM can help resolve the cloud trust paradox: In the second of our “Best Kept Security Secrets” blog series, learn about Cloud External Key Manager, which can help organizations achieve even more control over their data in the cloud. Read more.
Announcing new GKE functionality for streamlined security management: To help make security easier to use and manage, our new built-in Google Kubernetes Engine (GKE) security posture dashboard provides security guidance for GKE clusters and containerized workloads, insights into vulnerabilities and workload configuration checks, and offers integrated event logging so you can subscribe to alerts and stream insight data elsewhere. Read more.
Introducing Sensitive Actions to help keep accounts secure: We operate in a shared fate model at Google Cloud, working in concert with our customers to help achieve stronger security outcomes. One of the ways we do this is to identify potentially risky behavior to help customers determine if action is appropriate. To this end, we now provide insights on what we are calling Sensitive Actions. Learn more.
How to secure APIs against fraud and abuse with reCAPTCHA Enterprise and Apigee X: A comprehensive API security strategy requires protection from fraud and abuse. Developers can prevent attacks, reduce their API security surface area, and minimize disruption to users by implementing Google Cloud’s reCAPTCHA Enterprise and Apigee X solutions. Read more.
Secure streaming data with Private Service Connect for Confluent Cloud: Organizations in highly regulated industries such as financial services and healthcare can now create fully segregated private data pipelines through a new partnership between Confluent Cloud and Google Cloud Private Service Connect. Read more.
3 ways Artifact Registry and Container Analysis can help optimize and protect container workloads: Our artifact management platform can help uncover vulnerabilities present in open source software, and here are three ways to get started. Read more.
Secure Cloud Run deployments with Binary Authorization: With Binary Authorization and Artifact Registry, organizations can easily define the right level of control for different production environments. Read more.
Backup and Disaster Recovery strategies for BigQuery: Cloud customers need to create a robust backup and recovery strategy for analytics workloads. We walk you through different failure modes and their impact on data in BigQuery, and examine several strategies. Learn more.
Cloud makes it better: What’s new and next for data security: In a recent webinar, Heidi Shey, principal analyst at Forrester, and Anton Chuvakin, senior staff, Office of the CISO at Google Cloud, had a spirited discussion about the future of data security. Here are some trends that they are seeing today. Read more.
How Chrome supports today’s workforce with secure enterprise browsing: Google Chrome’s commitment to security includes its ongoing partnership with our BeyondCorp Enterprise Zero Trust access solution. Here are three ways that Chrome protects your organization. Read more.
CUF boosted security, reduced costs, and drove energy savings with ChromeOS: José Manuel Vera, CIO of CUF, Portugal’s largest private healthcare provider, explains how ChromeOS securely enabled agile medical and patient care. Read more.
Ensuring fair and open competition in the cloud: Cloud-based computing is one of the most important developments in the digital economy in the last decade, and Google Cloud supports openness and interoperability. We have been a leader in promoting fair and open licensing for our customers since the start of the cloud revolution. Here’s why.
Assured Workloads expands to new regions, gets new capabilities: Assured Workloads can help customers create and maintain controlled environments that accelerate running more secure and compliant workloads, including enforcement of data residency, administrative and personnel controls, and managing encryption keys. We’re expanding the service to Canada and Australia, and introducing new capabilities to automate onboarding and deploying regulated workloads. Read more.
We launched a new weekly podcast focusing on Cloud Security in February 2021. Hosts Anton Chuvakin and Timothy Peacock chat with cybersecurity experts about the most important and challenging topics facing the industry today. This month, they published a record nine must-listen podcasts:
Cloud security’s murky alphabet soup: Cloud security comes with its own dictionary of acronyms, and it may surprise you that not everybody’s happy with it. To help organizations with their cultural shift to the cloud, we discuss some of the most popular and contentious cloud security acronyms with Dr. Anna Belak, a director of thought leadership at our partner Sysdig. Listen here.
A CISO walks into the cloud: Frustrations, successes, and lessons from the top of the cloud: Along with data, security leaders also need to migrate to the cloud. We hear from Alicja Cade, director for financial services at our Office of the CISO, on her personal cloud transformation. Listen here.
Sharing The Mic In Cyber — Representation, Psychological Safety, and Security: A must-listen episode, this discussion led by guest hosts Lauren Zabierek, acting executive director of the Belfer Center at the Harvard Kennedy School, and Christina Morillo, principal security consultant at Trimark Security, digs into how DEIB intersects with psychological safety and cybersecurity. Listen here.
“Hacking Google,” Operation Aurora, and insider threats at Google: A wide-ranging conversation on insider threats at Google, the role that detection and response play in protecting our users’ trust, and the Google tool we call BrainAuth, with our own Mike Sinno, security engineering director, Google Detection and Response. Listen here.
How virtualization transitions can make cloud transformations better: What lessons for cloud transformation can we glean from the history of virtualization, now two decades old? Thiébaut Meyer, director at Google Cloud’s Office of the CISO, talks about how the past is ever-present in the future of cloud tech. Listen here.
As part of Next ‘22, Anton and Tim recorded four bonus podcasts centered on key cybersecurity themes:
Celebrate the first birthday of the Google Cybersecurity Action Team: Google Cloud CISO Phil Venables sits down to chat about the first year of GCAT and its focus on helping customers. Listen here.
Can we escape ransomware by migrating to the cloud: Google Cloud’s Nelly Kassem, security and compliance specialist, dives deep into whether public clouds can play a role in stopping ransomware. Listen here.
Improving browser security in the hybrid work era: One of the unexpected consequences of the COVID-19 pandemic was the accelerated adoption of hybrid work. How modern browsers work with an existing enterprise stack is only one of the questions tackled by Fletcher Oliver, Chrome browser customer engineer. Listen here.
Looking back at Log4j, looking forward at software dependencies and open source security: Is another log4j inevitable? What can organizations do to minimize their own risks? Are all open-source dependencies dependable? Hear the answers to these questions and more from Nicky Ringland, product manager for Google’s Open Source Insights. Listen here.
To have our Cloud CISO Perspectives post delivered every month to your inbox, sign up for our newsletter. We’ll be back next month with more security-related updates.
Read More for the details.
One of the biggest challenges for any retailer is to track an individual customer’s journey across multiple channels (Online and In-Store), devices, purchases, and interactions.
This lack of a single view of the customer leads to a disjointed and inconsistent customer experience. Most retailers report obstacles to effective cross-channel marketing caused by inaccurate or incomplete customer data. Marketing efforts are also fragmented, since the user profile data does not provide a 360˚ view of the customer’s experience. Insufficient information leads to a lack of visibility into customer sentiment, which further hinders customer engagement and loyalty.
Helps with customer engagement and loyalty by improving customer satisfaction and retention through personalization and targeted marketing communications.
Helps retailers achieve higher marketing ROI by aggregating customer interactions across all channels and identifying and winning valuable new customers, resulting in increased revenues.
Customer 360˚ refers to the relationship cycle made up of the many touch points where a customer meets the brand. The customer 360˚ solution provides an aggregated view of a customer. It collects all your customer data in one place, from the customer’s primary contact information to their purchasing history, interactions with customer service, and their social media behavior.
Single view of customer data records and processes:
Behavior Data: Customer behavior data, including the customer’s browsing and search behavior online through click-stream data and the customer’s location if the app is location-based.
Transactional Data: The transactional data includes online purchases, coupon utilization, in-store purchases, returns and refunds.
Personal Information: Personal information from online registration, in-store loyalty cards and warranties will be collated into a single view.
User Profile Data: Data profiling will be used as part of the matching and deduplication process to establish a Golden Record. Profile segments can be utilized to enable marketing automation.
An enhanced customer 360˚ solution with machine learning models can provide retailers with key capabilities for user-based personalization, such as generating insights and orchestrating experiences for each customer.
On October 1st 2022, we announced Dataflow templates that simplify the moving and processing of data between MongoDB Atlas and BigQuery.
Dataflow is a truly unified stream and batch data processing system that’s serverless, fast, and cost-effective. Dataflow templates allow you to package a Dataflow pipeline for deployment. Templates have several advantages over directly deploying a pipeline to Dataflow. The Dataflow templates and the Dataflow page make it easier to define the source, target, transformations, and other logic to apply to the data. You can key in all the connection parameters through the Dataflow page, and with a click, the Dataflow job is triggered to move the data to BigQuery.
BigQuery is a fully managed data warehouse that is designed for running analytical processing (OLAP) at any scale. BigQuery has built-in features like machine learning, geospatial analysis, data sharing, log analytics, and business intelligence.
This integration enables customers to move and transform data from MongoDB to BigQuery for aggregation and complex analytics. They can further take advantage of BigQuery’s built-in ML and AI integrations for predictive analytics, fraud detection, real-time personalization, and other advanced analytics use cases.
This blog describes how retailers can use fully managed MongoDB Atlas and Google Cloud services to build customer 360 profiles, the underlying architecture, and the reusable repository that customers can use to implement the reference architecture in their own environments.
As part of this reference architecture, we have considered four key data sources – user’s browsing behavior, orders, user demographic information, and product catalog. The diagram below illustrates the data sources that are used for building a single view of the customer, and some key business outputs that can be driven from this data.
The technical architecture diagram below shows how MongoDB and Google Cloud can be leveraged to provide a comprehensive view of the customer journey.
The Reference Architecture consists of the following processes:
1. Data Ingestion
Disparate data sources are brought together in the data ingestion phase. Typically we integrate a wide array of data sources, such as Online Behavior, Purchases (Online and In-Store), Refunds, Returns and other enterprise data sources such as CRM and Loyalty platforms.
In this example, we have considered four representative data sources:
User profile data through User Profiles
Product Catalog
Transactional data through Orders
Behavioral data through Clickstream Events
User profile data, product catalog, and orders data are ingested from MongoDB, and click-stream events from web server log files are ingested from csv files stored on Cloud Storage.
The data ingestion process should support an initial batch load of historical data and dynamic change processing in near real-time. Near real-time changes can be ingested using a combination of MongoDB Change Streams functionality and Google PubSub to ensure high throughput and low latency design.
2. Data Processing
The data is converted from the document format in MongoDB to the row-and-column format of BigQuery and loaded from MongoDB Atlas into BigQuery using the Google Cloud Dataflow MongoDB to BigQuery template, while the Cloud Storage Text to BigQuery Dataflow template moves the CSV files into BigQuery.
Google Cloud Dataflow templates orchestrate the data processing and the aggregated data can be used to train ML models and generate business insights. Key analytical insights like product recommendations are brought back to MongoDB to enrich the user data.
3. AI & ML
The reference architecture leverages the advanced capabilities of Google Cloud BigQuery ML and Vertex AI. Once the data is in BigQuery, BigQuery ML lets you create and execute multiple machine learning models, but for this reference architecture we focused on the two models below (a SQL sketch follows the list).
K-means clustering to group data into clusters. In this case it is used to perform user segmentation.
Matrix Factorization to generate recommendations. In this case, it is used to create product affinity scores using historical customer behavior, transactions, and product ratings.
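As a rough sketch of what those two models look like in BigQuery ML (the dataset, table, and column names here are illustrative placeholders, not the exact ones from the GitHub repo):

CREATE OR REPLACE MODEL `mydataset.user_segments`
OPTIONS (model_type = 'kmeans', num_clusters = 5) AS
SELECT total_orders, total_spend, days_since_last_purchase   -- example engineered features per user
FROM `mydataset.customer_features`;

CREATE OR REPLACE MODEL `mydataset.product_affinity`
OPTIONS (
  model_type = 'matrix_factorization',
  feedback_type = 'implicit',            -- affinity derived from behavior rather than explicit ratings
  user_col = 'user_id',
  item_col = 'product_id',
  rating_col = 'interaction_score'
) AS
SELECT user_id, product_id, interaction_score
FROM `mydataset.user_product_interactions`;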
The models are registered to the Vertex AI Model Registry and deployed to an endpoint for real-time prediction.
4. Business Insights
Using the content provided in the GitHub repo, we showcase the analytics capabilities of Looker, which is seamlessly integrated with the aggregated data in BigQuery and MongoDB, providing advanced data visualizations that enable business users to slice and dice the data and look for emerging trends. The included dashboards contain insights from MongoDB, from BigQuery, and from combining the data from both sources.
The detailed implementation steps, sample datasets and the Github repository for this reference architecture are available here.
There are many reasons to run MongoDB Atlas on Google Cloud, and one of the easiest is our self-service, pay-as-you-go listing on Google Cloud Marketplace. Please give it a try and let us know what you think. Also, check this blog to learn how Luckycart is able to handle large volumes of data and carry out complex computations it requires to deliver ultra-personalized activations for its customers using MongoDB and Google Cloud.
We thank the many Google Cloud and MongoDB team members who contributed to this collaboration. Thanks to the team at PeerIslands for their help with developing the reference architecture.
Read More for the details.
Logging is a critical part of the software development lifecycle enabling developers to debug their apps, DevOps/SRE teams to troubleshoot issues, and security admins to analyze access patterns. Log Analytics is a new set of features in Cloud Logging available in Preview to help you perform powerful analysis on log data. In this post, we’ll cover 10 reasons why you should get started with Log Analytics today.
Check our introductory blog or join us for a live webinar on Nov 15, 2022 where we will walk attendees through Log Analytics use cases including a demo. Register here today.
If you already use Cloud Logging, Log Analytics is included in the Cloud Logging pricing. There are no additional costs associated with upgrading the log bucket or running queries on the Log Analytics UI.
Our standard pricing is based on ingestion which includes storing logs in the log bucket for 30 days, our default period, or you can set a custom log retention period. Check out the pricing blog to learn how to maximize value with Cloud Logging.
If you don’t already use Cloud Logging, you can leverage the free tier of 50GiB/project/month to explore Cloud Logging including Log Analytics.
Log Analytics manages the log pipeline for you, eliminating the need to build and manage your own complex data pipelines, which can add cost and operational overhead. A simple one-click setup allows you to upgrade an existing log bucket or create a new log bucket with Log Analytics. Data is available in real time, allowing users to immediately access their data via either the Log Explorer or the Log Analytics page.
Upgrading a log bucket to Log Analytics means that your logs can be accessed via the Log Analytics page in Cloud Logging. If you also want to access log data from BigQuery, you can enable the checkbox to expose a linked dataset in BigQuery that is linked to your Log Analytics bucket.
Once the log bucket is upgraded, log data can be accessed both from Log Analytics in Cloud Logging or BigQuery which eliminates the need to manage or build data pipelines to store log data in BigQuery. Cloud Logging will still manage the log data including access, immutability, and retention. Additionally, Cloud Logging uses BigQuery’s new native support for semi-structured data so you don’t need to manage the schema in your logs.
This can be useful when:
You already have other application or business data in BigQuery and want to join it with log data from Cloud Logging
You want to use Looker Studio or other tools in the BigQuery ecosystem.
There is no cost to create a linked dataset in BigQuery, but the standard BigQuery query cost applies to querying logs via the BigQuery APIs.
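For example, once a linked dataset exists, the same log data can be queried with standard BigQuery SQL. A minimal sketch, assuming a project named my-project and a linked dataset named my_linked_dataset that exposes the _AllLogs view:

SELECT severity, COUNT(*) AS entries
FROM `my-project.my_linked_dataset._AllLogs`
WHERE timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY severity
ORDER BY entries DESC;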
Application, infrastructure and networking logs can often have high cardinality data with unique IP addresses, session ids and instance ids. High cardinality data can be difficult to convert, store, and analyze as metrics.
For example, two common use cases are:
Application and infrastructure troubleshooting
Network troubleshooting
Application and infrastructure troubleshooting
Suppose that you are troubleshooting a problem with your application running on Google Kubernetes Engine and you need to break down the requests by sessions. Using Log Analytics, you can easily group and aggregate your request logs by session, gaining insights into the request latency and how it changes over time. This insight can help you reduce time spent troubleshooting by executing just one SQL query.
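A sketch of what that single query could look like on the Log Analytics page, assuming the _Default bucket has been upgraded and the application emits session_id and latency_ms fields in its JSON payload (both field names are assumptions for illustration):

SELECT
  JSON_VALUE(json_payload.session_id) AS session_id,
  COUNT(*) AS request_count,
  AVG(CAST(JSON_VALUE(json_payload.latency_ms) AS FLOAT64)) AS avg_latency_ms
FROM `my-project.global._Default._AllLogs`
WHERE resource.type = 'k8s_container'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 HOUR)
GROUP BY session_id
ORDER BY avg_latency_ms DESC
LIMIT 20;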
Network troubleshooting
Network telemetry logs on Google Cloud are packed with detailed networking data that is often high in volume and cardinality. With Log Analytics, we can easily run a SQL query on the VPC Flow Logs to find the top 10 destination IP addresses by packet count and total bytes. With this information, you can determine whether any of these destination IP addresses represent unusual traffic levels that warrant deeper analysis. This analysis makes it easier to identify unusual values, either as part of network troubleshooting or routine network analysis.
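A sketch of that query, again against an upgraded _Default bucket; the payload field names follow the published VPC Flow Logs record format but should be treated as assumptions to verify against your own logs:

SELECT
  JSON_VALUE(json_payload.connection.dest_ip) AS dest_ip,
  SUM(CAST(JSON_VALUE(json_payload.packets_sent) AS INT64)) AS total_packets,
  SUM(CAST(JSON_VALUE(json_payload.bytes_sent) AS INT64)) AS total_bytes
FROM `my-project.global._Default._AllLogs`
WHERE log_name LIKE '%vpc_flows'
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY dest_ip
ORDER BY total_packets DESC
LIMIT 10;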
Log Analytics reduces the need for multiple tools by reducing data silos. The same log data can be used to gain business insights which can be useful for Business Operations teams.
Here are a few examples of how you can use Log Analytics:
Determine the top 5 regions from where content is being downloaded
Determine the top 10 referrers to a URL path
Convert IP addresses to a city/state/country mapping
Identify unique IP addresses from a given country accessing a URL
For security analyses, one common pattern is to review all the GCP audit logs for a given user, IP address, or application. This type of analysis requires very broad and scalable search capabilities, since different services may log the IP address in different fields.
In Log Analytics, you can easily find values in logs using the SEARCH function to comb through all the fields in the log entry across terabytes of logs without worrying about the speed and performance of the database.
With the SEARCH function, you can now search across log data in SQL even when you’re not exactly sure in which field your specific search term will appear in the log entry.
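For instance, a hedged sketch of searching every field of every log entry for an IP address of interest (the table path and IP value are placeholders):

SELECT timestamp, log_name, proto_payload
FROM `my-project.global._Default._AllLogs` AS t
WHERE SEARCH(t, '203.0.113.7')      -- matches the term wherever it appears in the entry
  AND timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
LIMIT 100;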
We have many great enhancements on the roadmap that will make it even easier to generate insights. Charting is one of the features that can easily help users make sense of their logs. Charting in Log Analytics is available now as a Private Preview (sign-up form).
During the Private Preview for charting capabilities, we’re working hard to make it easier to use with support for additional chart types and a simple charting selector.
While Log Analytics is currently in Preview, the Cloud Logging platform is already GA and provides an enterprise-grade logging solution complete with alerting, logs-based metrics and advanced log management capabilities. With Cloud Logging, you can help reduce operational expenditure while supporting your security and compliance needs.
We put together common queries in our Github repository to make it easy to get started.
Use this SQL query to determine the minimum, maximum, and average number of requests, grouped by service (a sketch of this one follows below).
Use this query to determine if your Load Balancer latency was more than 2 seconds.
When actively troubleshooting, you can use this query to list the top 50 requests that returned HTTP errors.
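As one hedged example of the first of those queries (per-minute request counts rolled up into min/max/average per service), the table path, resource type, and label name below are assumptions rather than the exact query from the repo:

SELECT
  service_name,
  MIN(requests) AS min_requests,
  MAX(requests) AS max_requests,
  AVG(requests) AS avg_requests
FROM (
  SELECT
    JSON_VALUE(resource.labels.service_name) AS service_name,   -- e.g. Cloud Run request logs
    TIMESTAMP_TRUNC(timestamp, MINUTE) AS minute,
    COUNT(*) AS requests
  FROM `my-project.global._Default._AllLogs`
  WHERE http_request IS NOT NULL
  GROUP BY service_name, minute
)
GROUP BY service_name;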
Check out Github for additional sample queries.
Using the Log Analytics on Google Cloud lab, you can work through deploying a sample application, managing log buckets and analyzing log data. This can be a great way to get started, especially if you’re not already using Cloud Logging.
We’re building Log Analytics for Developers, SRE, DevOps and Operations teams to gain insights faster while keeping costs under control. To learn more about how you can use Log Analytics, please join our live webinar on Nov 15th (registration) which will include a live demo. To get started with Log Analytics today, you can use the lab to gain hands-on experience, visit the documentation or try out the Log Analytics page in the Cloud Console.
Read More for the details.
Editor’s note: Wunderkind, a leading performance marketing software, specializes in delivering tailored experiences to individuals at scale. Today, we learn how BigQuery’s high performance drives real-time, actionable decision-making that lets Wunderkind bring large brands closer to their customers.
At Wunderkind, we believe in the power of one. Behind every website visit is a living, breathing person, with unique wants and needs that can be (and should be) met by the brands they trust. When our customers and our customers’ customers get the experience they deserve, it has the potential to transform what’s possible — and deliver impactful revenue results.
Our solutions integrate hyper-personalized content into the customer experiences on retailer websites to help them understand and respond accordingly to each individual shopper. In addition, we provide these shoppers with personalized emails and text messages based on their interactions onsite. For example, we’ll alert a shopper with a ‘price drop’ message for an item they browsed, an item they left in their shopping cart, or about new products that we think they’ll love. Ultimately, our best-in-class tech and insight help deliver experiences that fit individual customers, and conversions at off-the-chart rates.
With the billions of one-to-one messages we send monthly, it effectively means we track a lot of data – in the trillions of events. Because of this, we want a deep understanding of this data so we can tailor our content specifically to each unique user to ensure it’s as enjoyable and engaging as possible.
Back in its start-up days, all of Wunderkind’s analytics relied on a MySQL database. This worked well for our reporting platform, but any sort of ad-hoc inquiry or aggregate insight was a challenge. As an analyst, I had to beg engineers to create new database indexes and tables just to support new types of reporting. As one can imagine, this consumed a lot of time and energy – figuring out how to get complicated queries to run, using SQL tricks to fake indexes, creating temporary tables, and whatever else was necessary to improve performance and execute specific queries. After all, this is a company built on data and insights – so it had to be done right.
To get the most value out of our data, we invested early in the BI platform Looker. Our prior business intelligence efforts for the broader business were also hooked up to a single relational database. This approach was very troubling for many reasons, including but not limited to:
We could only put so much data in a relational database.
We couldn’t index every query pattern that we wanted.
Certain queries would never finish.
We were querying off a replicated database and had no means to create any additional aggregate or derived tables.
Along with our new Business Intelligence approach, we decided to move to BigQuery. BigQuery is not just a data warehouse. It’s an analytics system that seems to scale infinitely. It gave us a data playground where we could create our own aggregate tables, mine for new insights and KPIs, and successfully run any type of data inquiry we could think up. It simply was a dream. As we were testing, we loaded one single day of event logs into BigQuery, and for a month, it fueled dozens of eye-opening insights about how our products actually work and the precise influence they have on user behavior.
BigQuery’s serverless architecture provides an incredibly consistent performance profile regardless of the complexity of the queries we threw at it. With relational databases, you can run one query and get a sub-second, exceptionally low-latency response, while another will never finish. I sometimes joke that every single query run against BigQuery takes 30 seconds — no matter how big or small. It’s a beautiful thing knowing that virtually any question you think up can be answered in a very reasonable amount of time.
BigQuery allows our Analytics team to think more about the value of the data for the business and less about the mechanics of how particular queries should run. By combining BigQuery and Looker, I can give teams across our company the flexibility to work with their data in a way that previously only analysts could.
I’ve also found that BigQuery is one of the easiest and best places to learn SQL. It’s well suited to learn for so many reasons, including:
It’s very accessible and in-browser, so there’s no complicated setup or install process.
It’s free up to a terabyte per month.
Its public datasets are vast and relatable, making your first queries more interesting.
Real-time query validation lets us know quickly if something is wrong with our query.
It’s a no-ops environment. No indexes are required. You just query.
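For example, a first query against one of those public datasets can be as simple as this (any of the other public datasets work just as well):

SELECT name, SUM(number) AS total
FROM `bigquery-public-data.usa_names.usa_1910_2013`   -- public dataset of US baby names
GROUP BY name
ORDER BY total DESC
LIMIT 10;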
Our Analytics team has three key groups of stakeholders: our customers and the teams that serve them, our research and development (R&D) team, and our business/operations team.
We recognize that every customer is a bit different and take pride in being able to answer their unique questions in the dimensions that make the most sense for their business. Customers may want more detail on the performance of our service for different cohorts of users or for certain types of web pages in ways that require more raw data than we provide in our standard product. BigQuery’s performance lets us respond to customers and offer them greater confidence around our approach. Thanks to Looker, we can roll out new internal insights very quickly that help inform and drive new strategies. Plus, with dashboards and alerts we can uncover cohorts and segments where our product performs exceptionally, and areas where our strategies need work.
Our R&D team is another important stakeholder group. As we plan new products and features, we work with BigQuery to forecast and simulate the expected performance and incrementality. As our product develops, we use BigQuery and Looker to prototype new KPIs and reporting. It’s helpful to easily stage live data and KPIs to ensure they’re valuable to the customer ahead of productizing in our reporting platform. BigQuery’s speed means that we can aggregate billions of rows of raw data on the fly as we perfect our stats. Additionally, we’re able to save significant engineering time by using Looker as a product development sandbox for reporting and insights.
Our final key stakeholder is our internal business operations team. Business operations typically ask more thought-provoking and challenging ‘what-if’ questions geared toward driving true incremental revenue for our customers and serving them optimally. For example, they may challenge the accuracy of the industry’s standard “attribution” methods and whether we can leverage our data to better understand return on spend and “cannibalization” for our customers. Because these tougher questions tend to involve data spanning product lines and more complicated data relationships, BigQuery’s high performance is essential to making rapid iteration with this team possible.
Across these stakeholders, we truly empower Wunderkind with actionable data. BigQuery’s performance is key to enabling real-time, iterative decision-making within our organization and in tandem with our customers. Looker is a powerful front-end to securely share data in a way that’s meaningful, actionable, and accurate. As much as I love writing SQL, I believe it’s best reserved for new ad-hoc insights and not standardized reporting. Looker is how we can enforce consistency and accuracy across our internal reporting. We’ve found the most powerful insights come out of conversations with our stakeholders. From there, we can use our data expertise and product knowledge to build flexible dashboards that scale across the organization. While it can seem a bit restrictive for some stakeholders, this approach ensures the data they’re getting is always intuitive, consistent, clean, and actionable. We’re not in the business of vanity metrics, we’re in the business of driving impact.
BigQuery is the foundational element that drives our goal of identifying not just our customers’ needs, but those that drive their customers to purchase. As a result, we can deliver better outcomes for customers, more rapid evolution of our products, and continuous validation and improvement of our business operations. We aim to maximize performance, experience, and returns for our customers – BigQuery is instrumental in helping to derive these insights. Even as Wunderkind has grown, we’ve been able to operate with a proportionally leaner team because BigQuery allows our Analytics team to perform most data tasks without needing engineering resources.
Read More for the details.
You can now monitor the quality of machine learning predictions from Batch Transform jobs in SageMaker using Amazon SageMaker Model Monitor. Amazon SageMaker Model Monitor provides a fully managed experience to monitor models in production, detect deviations, and take timely actions such as auditing or retraining.
Read More for the details.
AWS App Runner now supports PHP 8.1, Go 1.18, .Net 6, and Ruby 3.1 managed runtimes for building and running web applications and APIs. These runtimes enable you to leverage the App Runner “build from source” capability to build and deploy directly from your source code repository without needing to learn the internals of building and managing your own container images.
Read More for the details.
Amazon QuickSight launches a new capability for Row Level Security (RLS) on Dataset-as-a-Source. This is a follow-up feature to enhance the security level of the current QuickSight Dataset-as-a-Source feature. Previously, if any of the QuickSight parent datasets was RLS-enabled, the creation of a child dataset from that parent dataset was blocked. Now, with the new capability, customers can create child datasets with the RLS rules inherited from the parent dataset. When querying the child dataset, QuickSight applies the parent RLS filter on top of the parent data at runtime. For further details, visit here.
Read More for the details.