Azure – Public preview: Expanded feature regional availability for Azure NetApp Files backup
Regional coverage has expanded for Azure NetApp Files backup.
Read More for the details.
We are excited to launch the AWS Smart City Competency to provide best-in-class partner recommendations to our customers and broader market. With cities at the front line of economic, social, and global challenges such as energy and water use, traffic management, sanitation, and sustainability, the quest to build cities that “work” is increasingly reliant on technology solutions. Many AWS Partners leverage the power of AWS Cloud to innovate and deploy meaningful smart city solutions for these challenges, including urban data platforms to help cities gain insights for data-driven decision making; innovative solutions to improve the efficiency and delivery of urban infrastructure such as water, waste, transport, energy, and lighting; and most importantly, solutions to deliver better government services to all urban citizens.
Read More for the details.
Amazon Polly is a service that turns text into lifelike speech, allowing you to create applications that talk, and build entirely new categories of speech-enabled products. Today, we are excited to announce the general availability of all Neural Text-to-Speech (NTTS) voices in the Europe (Paris) Region.
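As a rough illustration, here is a minimal boto3 sketch of synthesizing speech with a neural voice in the Europe (Paris) Region; the voice ID and output file name are illustrative choices, not part of the announcement.

```python
import boto3

# Minimal sketch: request the Neural TTS engine in the Europe (Paris) Region.
# The voice ("Lea") and output file name are illustrative choices.
polly = boto3.client("polly", region_name="eu-west-3")

response = polly.synthesize_speech(
    Text="Bonjour et bienvenue !",
    VoiceId="Lea",
    Engine="neural",          # use a Neural Text-to-Speech voice
    OutputFormat="mp3",
)

with open("welcome.mp3", "wb") as f:
    f.write(response["AudioStream"].read())
```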
Read More for the details.
Amazon SageMaker Data Wrangler reduces the time that it takes to aggregate and prepare data for machine learning (ML) from weeks to minutes in Amazon SageMaker Studio, the first fully integrated development environment (IDE) for ML. With SageMaker Data Wrangler, you can simplify the process of data preparation and feature engineering, and complete each step of the data preparation workflow, including data selection, cleansing, exploration, and visualization, from a single visual interface. You can import data from multiple data sources such as Amazon Simple Storage Service (Amazon S3), Amazon Redshift, Snowflake, and 26 Federated Query data sources supported by Amazon Athena. Starting today, customers importing data from Athena data sources can configure S3 query output location and data retention period to control where and how long Athena stores the intermediary data.
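Data Wrangler exposes these settings through its import flow; as a rough sketch of the underlying Athena and S3 behavior they control, the boto3 calls below run an Athena query with an explicit S3 output location and expire the intermediary results with a lifecycle rule. The bucket, database, and query names are hypothetical.

```python
import boto3

# Sketch of the underlying settings: an explicit S3 location where Athena writes
# its intermediary query results. Bucket, database, and query are hypothetical.
athena = boto3.client("athena", region_name="us-east-1")

athena.start_query_execution(
    QueryString="SELECT * FROM sales LIMIT 100",
    QueryExecutionContext={"Database": "analytics"},
    ResultConfiguration={"OutputLocation": "s3://my-athena-staging-bucket/data-wrangler/"},
)

# Retention of the intermediary data can be handled with an S3 lifecycle rule on that prefix.
s3 = boto3.client("s3")
s3.put_bucket_lifecycle_configuration(
    Bucket="my-athena-staging-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-athena-staging",
                "Filter": {"Prefix": "data-wrangler/"},
                "Status": "Enabled",
                "Expiration": {"Days": 5},
            }
        ]
    },
)
```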
Read More for the details.
We recently announced the launch of Google Public Sector, a new Google subsidiary focused on helping U.S. federal, state, and local governments, and educational institutions accelerate their digital transformations. Google Public Sector brings Google technologies to government and education customers at scale, including open and scalable infrastructure; advanced data and analytics, artificial intelligence, and machine learning; modern collaboration tools like Google Workspace; advanced cybersecurity products; and more—so that agencies and institutions can better serve citizens and achieve their missions.
In just the few months since the introduction of Google Public Sector, we’ve seen continued momentum. We announced that Google Workspace has achieved the U.S. Department of Defense’s (DOD) Impact Level 4 (IL4) authorization. And building on the success with government customers like the U.S. Navy, Defense Innovation Unit, and the U.S. Department of Veterans Affairs, we’ve also shared how we’re helping educational institutions like ASU Digital Prep—an accredited online K–12 school offered through Arizona State University—make remote immersive learning technology more accessible to students across the United States and around the world.
Today, it is my pleasure to introduce Karen Dahut as the new CEO of Google Public Sector. With more than 25 years of experience in technology, cybersecurity, and analytics, Karen is a highly accomplished executive who has built businesses, developed and executed large-scale growth strategies, and created differentiated solutions across both commercial and federal industries. Karen joins us on Oct. 31. At that time, Will Grannis, who designed and launched Google Public Sector as founding CEO, will return to his role as the CTO of Google Cloud.
Karen was previously sector president at Booz Allen Hamilton, where she led the company’s $4 billion global defense business—representing half of the firm’s annual revenue—and global commercial business sector, which delivered next-generation cybersecurity solutions to Fortune 500 companies. Under her leadership, Booz Allen became the premier digital integrator helping federal agencies use technology in support of their missions.
Karen also has deep experience in building innovative solutions that help organizations tackle their toughest challenges. For example, at Booz Allen, she served as chief innovation officer and built the firm’s Strategic Innovation Group, which delivered new capabilities in cybersecurity, data science, and digital technologies. Prior to Booz Allen, Karen was an officer in the U.S. Navy and served as the controller for the Navy’s premier biomedical research institute.
We believe Google Public Sector will continue to play a critical role in applying cloud technology to solve complex problems for our nation—across U.S. federal, state, and local governments, and educational institutions. We’re excited today to have Karen leading this new subsidiary, providing more choice in the public sector and helping scale our services to more government agencies nationwide.
Read More for the details.
Your application’s users expect a certain amount of reliability and predictability, so we want to make it easier for you to identify the commitments behind the Google Maps Platform services that power your applications. Consistent launch stages communicate the capabilities, support, and production-readiness of the products and features you’re using.
To make it easier for you to understand what you’re getting when you adopt a Google Maps Platform product, we’ve streamlined our launch stages down to: Experimental, Preview, and General Availability. These launch stage names will help you decide when you’re ready to try new products and features, depending on your needs.
All new Google Maps Platform offerings will launch in Experimental, Preview or General Availability. Experimental products are focused on validating a prototype and are not guaranteed to progress to General Availability. At Preview, products or features are ready for you to test and evaluate. They also give you time to update or test your code before adopting it for production use. At General Availability, services are stable, ready for production use, and come with full Service-level Objective/Service-level Agreement support.
We have already begun using these launch stage names. You can find more details about them in our launch stages documentation. We hope this update helps you build with confidence on Google Maps Platform.
For more information on Google Maps Platform, visit our website.
Read More for the details.
Editor’s note: Earlier this year, we heard from Gojek, the on-demand services platform, about the open-source data ingestion tool it developed for use with data warehouses like BigQuery. Today, Gojek VP of Engineering Ravi Suhag is back to discuss the open-source data transformation tool it is building.
In a recent post, we introduced Firehose, an open source solution by Gojek for ingesting data to cloud data warehouses like Cloud Storage and BigQuery. Today, we take a look at another project within the data transformation and data processing flow.
As Indonesia’s largest hyperlocal on-demand services platform, Gojek has diverse data needs across transportation, logistics, food delivery, and payments processing. We also run hundreds of microservices across billions of application events. While Firehose solved our need for smarter data ingestion across different use cases, our data transformation tool, Optimus, ensures the data is ready to be accessed with precision wherever it is needed.
The challenges in implementing simplicity
At Gojek, we run our data warehousing across a large number of data layers within BigQuery to standardize and model data that’s on its way to being ready for use across our apps and services.
Gojek’s data warehouse has thousands of BigQuery tables. More than 100 analytics engineers run nearly 4,000 jobs on a daily basis to transform data across these tables. These transformation jobs process more than 1 petabyte of data every day.
Apart from the transformation of data within BigQuery tables, teams also regularly export the cleaned data to other storage locations to unlock features across various apps and services.
This process addresses a number of challenges:
Complex workflows: The large number of BigQuery tables and hundreds of analytics engineers writing transformation jobs simultaneously create a dependency on very complex directed acyclic graphs (DAGs) that must be scheduled and processed reliably.
Support for different programming languages: Data transformation tools must ensure standardization of inputs and job configurations, but they must also comfortably support the needs of all data users. They cannot, for instance, limit users to only a single programming language.
Difficult to use transformation tools: Some transformation tools are hard to use for anyone that’s not a data warehouse engineer. Having easy-to-use tools helps remove bottlenecks and ensure that every data user can produce their own analytical tables.
Integrating changes to data governance rules: Decentralizing access to transformation tools requires strict adherence to data governance rules. The transformation tool needs to ensure columns and tables have personally identifiable information (PII) and non-PII data classifications correctly inserted, across a high volume of tables.
Time-consuming manual feature updates: New requirements for data extraction and transformation for use in new applications and storage locations are part of Gojek’s operational routine. We need to design a data transformation tool that could be updated and extended with minimal development time and disruption to existing use cases.
With Optimus, Gojek created an easy-to-use and reliable performance workflow orchestrator for data transformation, data modeling, data pipelines, and data quality management. If you’re using BigQuery as your data warehouse, Optimus makes data transformation more accessible for your analysts and engineers. This is made possible through simple SQL queries and YAML configurations, with Optimus handling many key demands including dependency management, and scheduling data transformation jobs to run at scale.
Key features include:
Command line interface (CLI): The Optimus command line tool offers effective access to services and job specifications. Users can create, run, and replay jobs, dump a compiled specification for a scheduler, create resource specifications for data stores, add hooks to existing jobs, and more.
Optimized scheduling: Optimus offers an easy way to schedule SQL transformation through YAML-based configuration. While it recommends Airflow by default, it is extensible enough to support other schedulers that can execute Docker containers.
Dependency resolution and dry runs: Optimus parses data transformation queries and builds dependency graphs automatically. Deployment queries are given a dry-run to ensure they pass basic sanity checks.
Powerful templating: Users can write complex transformation logic with compile time template options for variables, loops, IF statements, macros, and more.
Cross-tenant dependency: With more than two tenants registered, Optimus can resolve cross-tenant dependencies automatically.
Built-in hooks: If you need to sink a BigQuery table to Kafka, Optimus can make it happen thanks to hooks for post-transformation logic that extend the functionality of your transformations.
Extensibility with plugins: By focusing on the building blocks, Optimus leaves governance for how to execute a transformation to its plugin system. Each plugin features an adapter and a Docker image, and Optimus supports Python transformation for easy custom plugin development.
Like Google Cloud, Gojek is all about flexibility and agility, so we love to see open source software like Optimus helping users take full advantage of multi-tenancy solutions to meet their specific needs.
Through a variety of configuration options and a robust CLI, Optimus ensures that data transformation remains fast and focused by preparing SQL correctly. Optimus handles all scheduling, dependencies, and table creation. With the capability to build custom features quickly based on new needs through Optimus plugins, you can explore more possibilities. Errors are also minimized with a configurable alert system that flags job failures immediately. Whether to email or Slack, you can trigger alerts based on specific requirements – from point of failure to warnings based on SLA requirements.
With Firehose and Optimus working in tandem with Google Cloud, Gojek is helping pave the way in building tools that enable data users and engineers to achieve fast results in complex data environments.
Optimus is developed and maintained on GitHub and uses Requests for Comments (RFCs) to communicate ideas for its ongoing development. The team is always keen to receive bug reports, feature requests, assistance with documentation, and general discussion as part of its Slack community.
Read More for the details.
You can now initiate storage account conversions from non-zonal redundancy to zonal redundancy via the Azure portal.
Read More for the details.
The network security that firewalls provide is one of the basic building blocks for a secure cloud infrastructure. We are excited to announce that three new Google Cloud Firewall features are now generally available: Global Network Firewall Policies, Regional Network Firewall Policies, and IAM-governed Tags.
With these enhancements, Cloud Firewall can help you more easily achieve a Zero Trust network posture with a fully distributed, cloud-native stateful inspection firewall service. In the new Global and Regional Network Firewall Policy structures, we expanded the policy structure down to the Virtual Private Cloud (VPC) level, and made it easier and more scalable to deploy and operate. Granular controls enforced at the virtual machine (VM) level using the new Identity and Access Management-governed Tags deliver intra-subnet micro-segmentation with pervasive policy coverage that automatically applies to workloads wherever they are deployed, independently of the network architecture.
The combination of the new policy structures and IAM-governed Tags delivers a consistent firewall experience across the Google Cloud resource hierarchy. It simplifies operations while providing more granular control, enabling a least-privilege environment and more self-service for the DevOps teams of each group or app.
Additionally, we are also introducing some enhancements to our Firewall Insights module within the Network Intelligence Center.
Previously, we introduced Hierarchical Firewall Policies at organization and folder levels. Now we are introducing a new policy structure, Network Firewall Policies, with two types: Global and Regional Network Firewall Policies. This new structure improves upon the previous VPC Firewall Rules structure.
Similar to Hierarchical Firewall Policies, the new Network Firewall Policy structure acts as a container for firewall rules. Rules defined in a Network Firewall Policy are enforced once the policy is associated with a VPC network, enabling simultaneous batch updates to multiple rules in the same policy.
The same Network Firewall Policy can be associated with more than one VPC network, and each VPC network can only have one Global Network Firewall Policy, and one Regional Firewall Policy per region associated with it. Both Global Network Firewall Policies and Regional Network Firewall Policies support IAM-governed Tags, and all Cloud Firewall enhancements moving forward will be delivered on the new Network Firewall Policy constructs.
We will continue to support the previous structure, VPC Firewall Rules, and even support the use of the new Network Firewall Policies and VPC Firewall Rules on a VPC network at the same time. However, we encourage customers to migrate their existing VPC Firewall Rules to Global Network Firewall Policies and Regional Network Firewall Policies.
Check our documentation to read more about Global Network Firewall Policies and Regional Network Firewall Policies.
A Global Network Firewall Policy provides a global firewall configuration structure to match the global nature of Google Cloud VPC Networks. It applies to workloads deployed in all Google Cloud regions in the VPC network. A Regional Network Firewall Policy provides a regional firewall configuration structure for Google Cloud Firewalls that can only be used in a single target region. When using Regional Network Firewall Policies, users can designate a target region for a firewall policy. The firewall configuration data will be applied to workloads only in that specific region and will not be propagated to any other Google Cloud regions.
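As a rough sketch of how this looks programmatically (not an official walkthrough), the google-cloud-compute snippet below creates a global network firewall policy, adds an ingress rule, and associates the policy with a VPC network. The project, policy, and network names are placeholders, and the class and method names should be checked against the current client library documentation.

```python
from google.cloud import compute_v1

project = "my-project"

# Create a global network firewall policy (names are illustrative).
policy_client = compute_v1.NetworkFirewallPoliciesClient()
policy = compute_v1.FirewallPolicy(name="web-tier-policy", description="Rules for the web tier")
policy_client.insert(project=project, firewall_policy_resource=policy).result()

# Add an ingress allow rule for TCP 443 from internal ranges.
rule = compute_v1.FirewallPolicyRule(
    priority=1000,
    direction="INGRESS",
    action="allow",
    match=compute_v1.FirewallPolicyRuleMatcher(
        src_ip_ranges=["10.0.0.0/8"],
        layer4_configs=[
            compute_v1.FirewallPolicyRuleMatcherLayer4Config(ip_protocol="tcp", ports=["443"])
        ],
    ),
)
policy_client.add_rule(
    project=project, firewall_policy="web-tier-policy", firewall_policy_rule_resource=rule
).result()

# Associate the policy with a VPC network so its rules are enforced there.
association = compute_v1.FirewallPolicyAssociation(
    name="web-tier-assoc",
    attachment_target=f"projects/{project}/global/networks/my-vpc",
)
policy_client.add_association(
    project=project,
    firewall_policy="web-tier-policy",
    firewall_policy_association_resource=association,
).result()
```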
Firewall configuration is considered a critical security configuration for an organization, and Regional Network Firewall Policy can give customers an effective way to define their firewall configuration on a regional basis and can help satisfy data residency compliance requirements, such as those defined in FedRAMP.
As mentioned above, to enable a smooth migration from existing VPC Firewall Rules to the new Network Firewall Policies, the Cloud Firewall engine supports a hybrid mode that allows both Network Firewall Policies and VPC Firewall Rules to be configured for the same VPC network. To avoid disruptions, the default rule processing order evaluates VPC Firewall Rules before Network Firewall Policies.
Alternatively, users can swap the evaluation order between VPC Firewall Rules and the new Global Network Firewall Policy, which is our recommended approach once the firewall rule set migration has been completed or for net new customers. Please note that Regional Network Firewall Policies will always be evaluated last, after Global Network Firewall Policies and VPC Firewall Rules.
We developed an automated tool to support migration to Network Firewall Policies. This new migration tool migrates VPC Firewall Rules to Network Firewall Policies in an automated fashion with user input as required. Please reach out to your account team for the latest updates.
Also known as Resource Manager Tags, or just Tags, an IAM-governed Tag is a new type of tag resource with enhanced security properties that can be applied to various Google Cloud resources, including VM instances.
The new Network Firewall Policy structures are built to easily integrate with Tags as a mechanism for enabling micro-segmentation. Unlike Network Tags, Tags are strictly controlled by IAM permissions, allowing enterprises to set up firewall controls without the risk of violation by unauthorized personnel. Using IAM permissions, Tags allow users to define their Network Firewall Policies in terms of logical groupings and delegate the management of those groups within their organization with fine-grained authorization controls.
For example, the assignment of VMs to logical names could be delegated using fine-grained IAM controls to other teams in the organization such as app developers, database administrators, or operations teams. Unlike the Network Tags that are mapped only to the primary IP address of a VM, the new Tags in Network Firewall Policies support both primary and alias IP range mapping. Tags replace Network Tags as the recommended, best-practice tagging mechanism in defining firewall rules for VM instances.
Tags provide the following benefits to our customers:
Fine-grained IAM Controls: IAM authorization policies can be used to restrict which users are allowed to modify tags and which users are allowed to bind individual tags.
Scalability: Users can create a maximum of 1,000 tag keys in an organization, with a maximum of 1,000 tag values per tag key. Each tag value can be attached to an unlimited number of VM instances.
Anti-spoofing protection: When VM instances communicate, tag-based firewall policies are enforced using the identity of the sender, even if the source VM is configured to allow spoofing of the source IP of the traffic it originates.
Reference across peer networks: tags associated with workloads in one VPC network can be referenced in Firewall Rules for its peer VPC network to provide name-based rule control for inter-VPC network traffic.
More control over tagging: Users can attach up to 10 tags per VM per VPC.
Our previous recommendation for network security and micro-segmentation used VPC Firewall Rules, Network Tags, and Service Accounts to protect workloads. Network Tags do not include built-in IAM governance, so these tags can be added to any VM instance, which leaves the possibility of tag misuse from insider threat actors.
While Service Accounts come with better IAM control, only a single service account can be associated with each VM, limiting the flexibility of categorizing the workload in multiple dimensions for more flexible access controls to be applied. If changes to the Service Account are required, the VM would have to be shut down and recreated, which is not ideal for production workloads.
Service Accounts and Network Tags also cannot be referenced from a different VPC, so the only way to control inter-VPC network traffic with the prior implementation was by IP address. In addition, VPC Firewall Rules must be updated individually at each rule level, with no batch updates supported, leading to race conditions and operational challenges.
With the introduction of Network Firewall Policies, users can enjoy the following benefits:
Batch editing of multiple rules within a single policy, which saves time, simplifies rule management, and eliminates race conditions created by single rule update patterns.
A single resource to contain all the firewall rule sets applied to a VPC network, making it possible to manipulate the firewall configurations with unified APIs on the single resource.
Support for separate IAM permissions governing policy creation, update and association to allow enterprises to implement more granular IAM controls based on their needs and requirements.
Sharing and attaching of firewall configurations across VPC networks in the same project, which significantly simplifies the configuration and management.
Support for IAM-governed Tags
Moving forward, we would recommend using the Network Firewall Policies combined with tags to enable micro-segmentation.
Firewall Insights in Network Intelligence Center offers automated analysis to identify firewall rule configuration issues. We are very excited to announce the general availability of two enhancements – Custom Refresh Cycle for Shadowed Rule Insight generation, and IPv6 Support.
Previously, the Firewall Insights service ran on a fixed schedule (every 24 hours), detecting customers’ firewall rule changes and updating insights when a change was found. With Custom Refresh Cycle support, customers can now define their own insight refresh frequency based on their operational cycle and capacity, helping optimize their return on investment.
Additionally, the expansion of IPv6 rule support allows users to analyze IPv6 rules in the same way as IPv4 rules during their IPv6 adoption process.
The new policy constructs, Global and Regional Network Firewall Policies and IAM-governed Tags provide a new, foundational set of capabilities for our fully distributed, cloud-native firewall service. They allow simplified, more granular control of firewall policies with better scale. When combined with Hierarchical Firewall Policies, they can help create an environment that can be least-privileged in security posture, while also being self-service for teams throughout the organization. You can learn more about all these new capabilities in our product documentation.
Read More for the details.
As a Korean technology leader specializing in digital transformation (DX), LG CNS partners with customers to help them achieve digital growth across fields including cloud, artificial intelligence (AI), big data, smart factory, and smart logistics. Towards the end of 2020, LG CNS was investigating how to drive digital transformation and growth within their own ranks to demonstrate and extend their core competitive advantage as a consulting service and system integration service provider. By enhancing their internal teams’ AI capabilities, they would ensure they could provide even more innovation and technical expertise for their customers’ own transformation journeys to maintain their position in the market, and continue to provide creative insights and thought leadership both within Korea and overseas.
After investigating several external training programs, LG CNS decided to partner with Google Cloud Learning Services to guarantee they received the right level of training and support to further elevate their world-class team. They recognized and respected Google’s global leadership in AI and machine learning (ML), and believed Google was the only partner that could elevate their company to a global level, in line with their already prominent position within the Korean market. LG CNS’s employees participated in 5 weeks of machine learning and 1 week of machine learning operations (MLOps) training through the Advanced Solutions Lab (ASL) via ASL Virtual. This immersive learning program enabled participants to collaborate and learn directly from Google Cloud engineers and ML experts, without having to attend a Google campus.
LG CNS’ participants for this transformative training program were selected based on a strict criteria. They were all considered high performers within LG CNS, and held 1-2 years practical experience in the AI/ML field. With continuous competency development and care, many will progress to become AI development leaders within the company, ensuring LG CNS can remain at the forefront of their field and collaborate with other businesses to fuel their own DX solutions.
“I hope that all trainees can grow through the Google Cloud ASL program, which has the world’s best AI technology, and I look forward to taking responsibility for leading AI in each division.” — SVP of LG CNS
Google Cloud Learning Services’ early involvement in the organizational stages of this training process, and agile response to LG CNS’s requirements, ensured LG CNS could add the extra week of MLOps training to their program as soon as they began the initial ASL ML course. This productive, collaborative experience demonstrated the strength and flexibility of Google Cloud Learning Services, and their capacity to tailor virtual learning content to meet the needs of the specific client and their business objectives.
Following the success of this first round in 2021, LG CNS ran ASL ML/MLOps training for another cohort in 2022 to cultivate even more AI and ML expert groups. This empowers LG CNS to build even more capacity within their workforce and continue developing breakthrough technologies that support their customers’ own digital transformation and innovation.
To learn more about how you can engage business innovation in your own organization through cloud education services, visit Google Cloud Training & Certification and get started on your own learning journey.
Read More for the details.
Google Cloud’s Container Scanning API now automatically scans Maven and Go packages for vulnerabilities.
With the Container Scanning API enabled, any container images containing Java (Maven) or Go packages that are uploaded to an Artifact Registry repository are scanned for vulnerabilities. This capability builds on the existing Linux OS vulnerability detection and gives customers deeper insight into their applications. The feature is in Public Preview and available to all Google Cloud customers.
Get started with Artifact Registry via the instructions for Go or the instructions for Java.
Once the API is enabled, upload a container image which contains Go and/or Maven packages.
Vulnerability totals for each image digest are displayed in the Vulnerabilities column. Customers can then drill down on the vulnerability to get CVE numbers, and if available, a suggested fix.
Vulnerabilities can also be displayed via the gcloud CLI and the API.
Vulnerability occurrences can be listed with the gcloud CLI or retrieved programmatically through the Container Analysis API, as in the sketch below.
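Here is a minimal Python sketch using the Container Analysis client library to list vulnerability occurrences for a pushed image. The project ID and image URL are placeholders, and the filter and field names should be verified against the Container Analysis documentation.

```python
from google.cloud.devtools import containeranalysis_v1
from grafeas.grafeas_v1 import types as grafeas_types

# Placeholders: substitute your project and the full digest URL of the pushed image.
project_id = "my-project"
resource_url = "https://us-docker.pkg.dev/my-project/my-repo/my-image@sha256:<digest>"

# The Container Analysis client exposes a Grafeas client for reading occurrences.
client = containeranalysis_v1.ContainerAnalysisClient()
grafeas_client = client.get_grafeas_client()

request = grafeas_types.ListOccurrencesRequest(
    parent=f"projects/{project_id}",
    filter=f'kind="VULNERABILITY" AND resourceUrl="{resource_url}"',
)

for occurrence in grafeas_client.list_occurrences(request=request):
    vuln = occurrence.vulnerability
    print(occurrence.note_name, vuln.severity, vuln.short_description)
```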
This feature now makes it possible to scan Java (in Maven repositories) and Go language packages both via the existing On-Demand scan capability, and with an automatic scan on push to Artifact Registry. Language scanning is in addition to the Linux OS scanning which is already available.
This capability can be combined with Pub/Sub notifications to trigger additional actions for the vulnerabilities and other metadata. An example of this is sending an e-mail notification to those who need the information.
Organizations are increasingly concerned about the supply chain risks associated with building their applications using open source software. Being able to scan applications for vulnerabilities is an important step for customers to enhance their security posture. Language package vulnerabilities are available in the same formats that customers are already familiar with. They appear alongside OS vulnerabilities within the Artifact Registry UI, and are available through the existing CLI and APIs. These capabilities help customers identify potential vulnerabilities introduced in software packages and make appropriate decisions with that information.
Learn more about types of vulnerability scanning.
Read More for the details.
We’re fortunate to work with some of the world’s most innovative customers on a daily basis, many of whom come to Google Cloud for our well-established expertise in data analytics and AI. As we’ve worked and partnered with these data leaders, we have encountered similar priorities among many of them: to remove the barriers of data complexity, unlock new use cases, and reach more people with more impact.
These innovators and industry disruptors power their data innovation with a data cloud that lets their people work with data of any type, any source, any size, and at any speed, without capacity limits. A data cloud that lets them easily and securely move across workloads: from SQL to Spark, from business intelligence to machine learning, with little infrastructure setup required. A data cloud that acts as the open data ecosystem foundation needed to create data products that employees, customers, and partners use to drive meaningful decisions at scale.
On October 11, we will be unveiling a series of new capabilities at Google Cloud Next ‘22 that continue to support this vision. If you haven’t registered yet for the Data Cloud track at Google Next, grab your spot today!
But I know you data devotees probably can’t wait until then. So, we wanted to take some time before Next to share some recent innovations for data cloud that are generally available today. Consider these the data hors d’oeuvres to your October 11 data buffet.
The data you need is rarely stored in one place. More often than not data is scattered across multiple sources and in various formats. While data exchanges were introduced decades ago, their results have been mixed. Traditional data exchanges often require painful data movement and can be mired with security and regulatory issues.
This unique use case led us to design Analytics Hub, now generally available, as the data sharing platform for teams and organizations who want to curate internal and external exchanges securely and reliably.
This innovation not only allows for the curation and sharing of a large selection of analytics-ready datasets globally, but also enables teams to tap into the unique datasets only Google provides, such as Google Search Trends or the Data Commons knowledge graph.
Analytics Hub is a first-class experience within BigQuery. This means you can try it now for free using BigQuery, without having to enter any credit card information.
Analytics Hub is not the only way to bring data into your analytical environment rapidly. We recently launched a new way to extract, load, and transform data in real time into BigQuery: the Pub/Sub “BigQuery subscription.” This new ELT capability simplifies streaming ingestion workloads and is more economical, since you don’t need to spin up new compute to move data and you no longer pay for streaming ingestion into BigQuery.
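For a sense of how this looks in practice, here is a minimal sketch using the Python Pub/Sub client to create a subscription that writes directly to a BigQuery table; the project, topic, subscription, and table names are placeholders.

```python
from google.cloud import pubsub_v1

# Placeholders: project, topic, subscription, and table names are illustrative.
project_id = "my-project"
subscriber = pubsub_v1.SubscriberClient()
topic_path = subscriber.topic_path(project_id, "events-topic")
subscription_path = subscriber.subscription_path(project_id, "events-to-bq")

# Point the subscription at a BigQuery table; Pub/Sub writes messages there directly,
# with no intermediate pipeline or custom consumer to run.
bigquery_config = pubsub_v1.types.BigQueryConfig(
    table=f"{project_id}.analytics.events",
    write_metadata=True,  # include message metadata columns alongside the payload
)

with subscriber:
    subscriber.create_subscription(
        request={
            "name": subscription_path,
            "topic": topic_path,
            "bigquery_config": bigquery_config,
        }
    )
```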
But what if your data is distributed across lakes, warehouses, multiple clouds, and file formats? As more users demand more use cases, the traditional approach to build data movement infrastructure can prove difficult to scale, can be costly, and introduces risk.
That’s why we introduced BigLake, a new storage engine that extends BigQuery storage innovation to open file formats running on public cloud object stores. BigLake lets customers build secure data lakes over open file formats. And, because it provides consistent, fine-grained security controls for Google Cloud and open-source query engines, security only needs to be configured in one place to be enforced everywhere.
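As a hedged sketch of what defining a BigLake table can look like, the DDL below (run here through the BigQuery Python client) creates a table over Parquet files in Cloud Storage via a Cloud resource connection. The project, dataset, connection, and bucket names are placeholders.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Placeholders throughout: dataset, connection ID, and bucket are illustrative.
# The connection named in WITH CONNECTION must be a Cloud resource connection
# whose service account has been granted access to the bucket.
ddl = """
CREATE EXTERNAL TABLE `my-project.lake.orders`
WITH CONNECTION `my-project.us.biglake-conn`
OPTIONS (
  format = 'PARQUET',
  uris = ['gs://my-data-lake/orders/*.parquet']
)
"""

client.query(ddl).result()
```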
Customers like Deutsche Bank, Synapse LLC, and Wizard have been taking advantage of BigLake in preview. Now that BigLake is generally available, I invite you to learn how it can help to build your own data ecosystem.
When data ecosystems expand to data of all shape, size, type, and format, organizations struggle to innovate quickly because their people have to move from one interface to the next, based on their workloads.
This problem is often encountered in the field of machine learning, where the interface for ML is often different than that of business analysis. Our experience with BigQuery ML has been quite different: customers have been able to accelerate their path to innovation drastically because machine learning capabilities are built-in as part of BigQuery (as opposed to “bolted-on” in the case of alternative solutions).
We’re now applying the same philosophy to log data by offering a Log Analytics service in Cloud Logging. This new capability, currently in preview, gives users the ability to gain deeper insights into their logging data with BigQuery. Log Analytics comes at no additional charge beyond existing Cloud Logging fees and takes advantage of soon-to-be generally available BigQuery features designed for analytics on logs: search indexes, a JSON data type, and the Storage Write API.
Customers that store, explore, and analyze their own machine generated data from servers, sensors, and other devices can tap into these same BigQuery features to make querying their logs a breeze. Users simply use standard BigQuery SQL to analyze operational log data alongside the rest of their business data!
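To give a flavor of the kind of analysis this enables, here is a sketch using the BigQuery client to run standard SQL over a table of log entries with a JSON payload column. The table and field names are hypothetical, not the Log Analytics schema.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Hypothetical table and fields: log entries with a TIMESTAMP column and a JSON payload.
query = """
SELECT
  TIMESTAMP_TRUNC(timestamp, HOUR) AS hour,
  JSON_VALUE(json_payload, '$.service') AS service,
  COUNT(*) AS error_count
FROM `my-project.ops.application_logs`
WHERE JSON_VALUE(json_payload, '$.severity') = 'ERROR'
GROUP BY hour, service
ORDER BY hour
"""

for row in client.query(query).result():
    print(row.hour, row.service, row.error_count)
```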
And there’s still more to come. We can’t wait to engage with you on Oct 11, during Next’22, to share more of the next generation of data cloud solutions. To tune into sessions tailored to your particular interests or roles, you can find top Next sessions for Data Engineers, Data Scientists, and Data Analysts — or create and share your own.
Join us at Next’22 to hear how leaders like Boeing, Twitter, CNA Insurance, Telus, L’Oreal, and Wayfair, are transforming data-driven insights with Google’s data cloud.
Read More for the details.
TL;DR – The best way to analyze your billing data is to set up the billing data export to BigQuery. We’ve created an interactive tutorial to help you through the process!
In the last post, we went over a few different options to analyze your Google Cloud billing data. The most powerful and comprehensive way to work with your billing data is to enable exporting to BigQuery. Let’s take a closer look at what this is and why it’s a good choice!
First things first, exporting your billing data to BigQuery is exactly what it sounds like: it exports data from your billing account (which handles the costs for all your Google Cloud resources) into BigQuery. BigQuery is a great choice for analyzing data, where you can run queries against your data. Having this data in BigQuery also makes it much easier to integrate with other tools, like Looker or Data Studio for visualization.
The actual data being exported to BigQuery are hourly records of every service your billing account is responsible for, along with detailed information like the SKU, what type of unit is being measured, and how many of that unit you’re paying for.
A small snapshot of the BigQuery dataset
There are actually three different options for exporting data to BigQuery:
The standard export is what we’re often referring to when we recommend everyone to set up the export to BigQuery. Here’s the full schema for the table, which contains quite a bit of detail for every row. So if you needed to know how much money you spent on Cloud Run, on October 17th, in your production project, between 2 AM and 4 AM, this is the export to use!
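For example, answering that exact question is a short SQL query over the export table. The sketch below runs it through the BigQuery Python client; the project, dataset, and table name (which ends with your billing account ID) are placeholders.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Placeholders: the export table name ends with your billing account ID.
query = """
SELECT
  SUM(cost)
    + SUM(IFNULL((SELECT SUM(c.amount) FROM UNNEST(credits) c), 0)) AS net_cost
FROM `my-project.billing.gcp_billing_export_v1_XXXXXX_XXXXXX_XXXXXX`
WHERE service.description = 'Cloud Run'
  AND project.id = 'my-production-project'
  AND usage_start_time >= TIMESTAMP('2022-10-17 02:00:00 UTC')
  AND usage_start_time <  TIMESTAMP('2022-10-17 04:00:00 UTC')
"""

for row in client.query(query).result():
    print(f"Cloud Run spend in that window (net of credits): {row.net_cost}")
```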
In addition to the standard export, there is now a detailed export. This includes everything from the standard export, as well as even more granular data for certain services. As of this writing, Compute Engine and GKE are both supported by the detailed export, and provide quite a bit of extra information. For example, you could see how much you spent on Compute Engine broken down by each individual instance. For GKE, you can also break down your costs by cluster, namespace, and even see details at the pod level!
The final export is the pricing export, which gives you a detailed breakdown of how much all Google Cloud services will cost for your billing account. This export can be incredibly useful when programmatically creating estimates for future workloads.
In this blog post, we’ll be focusing on the standard and detailed exports.
Understanding cloud costs can be a tricky subject, and it becomes increasingly complicated as your cloud environment grows. There’s a big difference between tracking how much you’re spending on a project that hosts a single web server, versus trying to break down a large organization with hundreds of projects and thousands of people doing different levels of work in the cloud. Tracking and understanding what resources are contributing to your costs can be vital for an organization to ensure it’s spending money on the right things and to build predictions for future costs.
One of the most important reasons to enable the billing export as soon as you can is because the data isn’t retroactive. So, if you want to analyze last month’s billing data but you didn’t have the export enabled, you’re out of luck! That’s why we typically recommend enabling the export as soon as you create a billing account, so you’ll have the data when you need it.
One of the questions I’ve heard a few times is: “Why should I use the BigQuery export instead of the built-in reporting tools?”.
The built-in reporting tools are fantastic for getting quick answers and understanding your costs. The big difference here is scale and granularity. As teams and organizations become larger, it can be increasingly difficult to figure out which groups are responsible for different costs. If your organization is using a chargeback model (where one group pays for all resources and then breaks down cost details for individual teams/groups to be responsible for their usage), then you’ll very likely need the increased granularity to accurately calculate costs.
If you’re interested in learning more about how organizations can deal with cost challenges like this, I’d recommend checking out the FinOps foundation and learning resources!
Here’s a few more questions that I’ve seen folks ask:
Why wouldn’t I always want to use the detailed export?
Like most of this data, it’s only useful if you need it. BigQuery can incur a storage cost for this billing data (the detailed export generates a whole lot of extra data to store), as well as costs for querying. It’s likely that large enterprise organizations will need this level of detail, but you might not! I’d recommend starting with the standard export every time you have a new billing account, and thinking ahead to the next few years to see if you should go with the detailed export.
I should also mention that BigQuery has a free tier with a data limit, which might be enough depending on how much billing data you generate.
Why should I use BigQuery for this data?
All of the billing exports generate quite a large amount of data, and BigQuery is well suited to analyzing massive amounts of data in a quick and efficient manner. I think it’s a pretty amazing tool, and it’s definitely the right one to learn if you want to get your hands around something as complex as cloud billing! BigQuery also integrates with the rest of Google Cloud, so once you’re comfortable with the exports, you can start to build additional features like machine learning, or near-real-time reporting when large and unexpected charges happen.
I don’t know SQL. Is the export still useful?
The truth is, you’ll likely need to know SQL or work with a data analyst role to make full use of the exports. However, you can still enable the export, and then set up visualization tools with minimal effort to at least get a handle on your billing data. More often than not, folks seem to think they don’t need this level of detail now, but regret not having it down the road as their cloud usage grows!
I know some SQL, but this is a lot of data! How can I work with it?
The best resources here are the schema resources and the example queries. These are detailed resources for getting a handle on the complexity of the data and to get you up and running. And remember: it’s better to start small and improve over time. Trying to build a full cost reporting solution for an entire organization from scratch is a recipe for frustration and challenge after challenge. Start with a crawl (like being able to break down your costs by project and per month) before you walk and eventually run (like being able to create anomaly detection and notify teams when the resources they’re responsible for are creating unwanted cost surprises).
With all of that context out of the way, I recommend trying to get familiar with the export and enabling it as soon as possible. Try using our built-in walkthrough for enabling the billing export! Not only will it help get you up and running with the export, it will also help you query some sample billing data. The level of granularity given by these exports can enable you, your team, or your entire organization to better understand what your cloud spend is doing for you, and help find opportunities for improvement!
Read More for the details.
Amazon Comprehend customers can now extract the sentiments associated with entities from text documents in real time using the newly released synchronous API. The Targeted Sentiment synchronous API enables customers to derive granular sentiments associated with specific entities of interest, such as brands or products, without waiting for batch processing.
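A minimal boto3 sketch of the synchronous call might look like the following; the region and the response-field handling reflect the Targeted Sentiment output shape and should be checked against the Amazon Comprehend documentation.

```python
import boto3

comprehend = boto3.client("comprehend", region_name="us-east-1")

response = comprehend.detect_targeted_sentiment(
    Text="The camera on this phone is fantastic, but the battery drains quickly.",
    LanguageCode="en",
)

# Each detected entity groups one or more mentions, each carrying its own sentiment.
for entity in response["Entities"]:
    for mention in entity["Mentions"]:
        print(mention["Text"], mention["Type"], mention["MentionSentiment"]["Sentiment"])
```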
Read More for the details.
Amazon CloudWatch Evidently adds support for Client Side Evaluations (CSE) for AWS Lambda, powered by AWS AppConfig. Evidently is an application performance monitoring (APM) capability that makes it easier for customers to introduce experiments and launches in their application code. Powered by AppConfig Extensions, Evidently CSE allows application developers to generate feature-evaluations in single-digit milliseconds from within their own Lambda functions. AWS AppConfig, a capability of AWS Systems Manager, allows customers to configure, validate, and deploy feature flags to more safely and quickly update application behavior.
Read More for the details.
Customers now have the option to view Service Health events on the AWS Health Dashboard in their local time and date. The AWS Health Dashboard is the single place for customers to get relevant and timely information to help manage events in progress, prepare for planned activities, and provide information on accounts and services. This feature is intended for customers who want to localize the timestamps displayed in the messaging and metadata of Service Health events on the AWS Health Dashboard by selecting a preferred time zone.
Read More for the details.
Today, AWS introduced DNS resource record set permissions, enabling customers to define AWS Identity and Access Management (IAM) create, edit, and delete policies for individual or groups of DNS record sets within a Route 53 public or private hosted zone.
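As a hedged sketch of the idea, the snippet below creates an IAM policy that scopes ChangeResourceRecordSets to a specific record name and record types within a hosted zone. The hosted zone ID and domain are placeholders, and the condition key names should be verified against the Route 53 documentation.

```python
import json
import boto3

iam = boto3.client("iam")

# Placeholders: hosted zone ID and record name are illustrative.
# Condition key names are assumptions to verify against the Route 53 documentation.
policy_document = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "route53:ChangeResourceRecordSets",
            "Resource": "arn:aws:route53:::hostedzone/Z0000000EXAMPLE",
            "Condition": {
                "ForAllValues:StringEquals": {
                    "route53:ChangeResourceRecordSetsNormalizedRecordNames": ["api.example.com"],
                    "route53:ChangeResourceRecordSetsRecordTypes": ["A", "AAAA"],
                }
            },
        }
    ],
}

iam.create_policy(
    PolicyName="route53-api-record-only",
    PolicyDocument=json.dumps(policy_document),
)
```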
Read More for the details.
AWS CloudFormation has expanded the availability of event notifications in Amazon EventBridge to the AWS Asia Pacific (Jakarta) Region. You can use these event notifications to build and scale loosely coupled event-driven applications. You can trigger actions in real time when CloudFormation stacks or the resources within stacks are created, updated, or deleted, without writing single-use custom code or developing new software.
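For example, an EventBridge rule that reacts to stack status changes in the Jakarta Region might be sketched as below; the detail-type string and the SNS target ARN are assumptions to confirm against the CloudFormation event documentation.

```python
import json
import boto3

# Jakarta Region; rule and target names are illustrative.
events = boto3.client("events", region_name="ap-southeast-3")

events.put_rule(
    Name="cfn-stack-status-changes",
    EventPattern=json.dumps({
        "source": ["aws.cloudformation"],
        # Detail-type is an assumption; confirm against the CloudFormation event reference.
        "detail-type": ["CloudFormation Stack Status Change"],
    }),
)

events.put_targets(
    Rule="cfn-stack-status-changes",
    Targets=[{"Id": "notify", "Arn": "arn:aws:sns:ap-southeast-3:123456789012:cfn-events"}],
)
```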
Read More for the details.
AWS Outposts rack can now be shipped and installed at your data center and on-premises locations in Kazakhstan and Serbia.
Read More for the details.
Today, we are happy to announce the public preview of AWS DataSync Discovery. You can use DataSync Discovery to gain visibility into your on-premises storage performance and utilization, and receive automated recommendations to help simplify and accelerate your data migration to AWS. This new feature of AWS DataSync enables you to better understand your on-premises storage usage through automated data collection and analysis, quickly identify data to migrate, and evaluate recommended AWS Storage services for your data, such as Amazon FSx for NetApp ONTAP, Amazon FSx for Windows File Server, and Amazon Elastic File System (EFS).
Read More for the details.