Azure – Public Preview: Azure Digital Twins
Public preview: User-assigned managed identities for routing events for Azure Digital Twins
Read More for the details.
Today, AWS CloudFormation StackSets adds a new parameter to the DescribeStackSet API to improve visibility into the list of Regions where a given stack set is deployed. You can now access the list of Regions directly through your management or delegated administrator AWS account.
Read More for the details.
Amazon Relational Database Service (Amazon RDS) for PostgreSQL now supports PostgreSQL minor versions 14.6, 13.9, 12.13, 11.18, and 10.23. We recommend you upgrade to the latest minor version to fix known security vulnerabilities in prior versions of PostgreSQL, and to benefit from the bug fixes, performance improvements, and new functionality added by the PostgreSQL community. Please refer to the PostgreSQL community announcement for more details about the release.
Read More for the details.
If you’ve worked in the operations space for the last 5+ years, you’ve likely heard of or have started using Prometheus. The proliferation of Prometheus for time series metrics formatting, querying and storage across the open source world and enterprise IT has been shockingly fast, especially with teams using Kubernetes platforms like Google Kubernetes Engine (GKE). We introduced Google Cloud Managed Service for Prometheus last year, which has helped organizations solve their scaling issues when it comes to managing Prometheus storage and queries.
There’s a lot to love about the extensive ecosystem of Prometheus exporters and integrations for monitoring your application workloads, and about visualization tools like Grafana, but you can hit challenges when trying to use these tools beyond Kubernetes-based environments.
What if you’re looking to unify your metrics across Kubernetes clusters and services running in VMs? Kubernetes makes it easy for Prometheus to auto-discover services and immediately start ingesting metrics, but today there is no common pattern for discovering VM instances.
We’ve seen a few customers try to solve this and hit some issues like:
Building in-house dynamic discovery systems is hard
We’ve seen customers build their own API discovery systems against the Google Compute APIs, their Configuration Management Databases, or other systems they prefer as sources of truth. This can work, but it requires you to maintain that system in perpetuity, and it usually requires building an event-driven architecture to deliver updates on a realistic timeline.
Managing their own daemonized Prometheus binaries
Maybe you love systemd on Linux. Maybe not so much. Either way, it’s certainly possible to build a Prometheus binary, daemonize it, and update its configuration to match your expected behavior and scrape your local service for Prometheus metrics. This can work for many, but if your organization is trying to avoid adding technical debt, as most are, you still have to track and maintain that Prometheus deployment yourself. Maybe that even means rolling your own RPM and managing the SLAs for this daemonized version.
There can be a lot of pitfalls and challenges with extending Prometheus over to the VM world even though the benefits of a unified metric format and query syntax like PromQL are clear.
To make standardizing on Prometheus easier for you, we’re pleased to introduce support for Prometheus metrics in the Ops Agent, our agent for collecting logs and metrics from Google Compute Engine instances.
The Ops Agent was released in 2021 and is based on the OpenTelemetry project for metric collection, providing a great deal of flexibility from the community. That flexibility includes the ability to ingest Prometheus metrics, retain their shape, and upload them to Google Cloud Monitoring while maintaining the Prometheus metric structure.
This means that starting today you can deploy the Ops Agent and configure it to scrape Prometheus metrics.
Here’s a quick walkthrough of what that looks like:
As you can see, being able to query Prometheus metrics becomes a fairly trivial process for users of the Ops Agent with GCE instances.
The first step to bringing this unified experience to your VMs is installing the Ops Agent on your VMs and then following the steps to configure the Prometheus receiver.
When you’re ready, you can make this more programmatic by using our Ops Agent integration with automation tools like Terraform, Ansible, Puppet, and Chef to deploy the agent and dynamically customize the static config file as needed.
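Here is a rough sketch of what that configuration can look like. The job name, target port, and scrape interval below are hypothetical placeholders, so refer to the Ops Agent documentation for the authoritative format for your agent version.

```yaml
# /etc/google-cloud-ops-agent/config.yaml (illustrative sketch only)
metrics:
  receivers:
    prometheus:
      type: prometheus
      config:
        scrape_configs:
          # Hypothetical local service exposing /metrics on port 8000
          - job_name: my_app
            scrape_interval: 10s
            static_configs:
              - targets:
                  - localhost:8000
  service:
    pipelines:
      prometheus_pipeline:
        receivers:
          - prometheus
```

After editing the file, restarting the agent (for example, with sudo systemctl restart google-cloud-ops-agent) picks up the new scrape configuration.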
Read More for the details.
Meesho is an Indian online marketplace that serves millions of customers every day. Recently, the company decided to adopt a multi-cloud strategy, leveraging Google Cloud’s scalable and reliable infrastructure to drive operational efficiency, modernize and scale for growth. To do so, they needed to migrate billions of static files and images to Google Cloud, to render the static content that serves their web and mobile applications. But with over a petabyte of data in their object storage system, and 10 billion requests per day, Meesho needed to perform this gigantic migration gradually, with zero downtime — a huge challenge.
In this blog post, we look at how Meesho did this using Storage Transfer Service, Cloud Storage and Cloud CDN. We also look at how it saved on storage capacity by resizing static images as needed on the fly, using Cloud Run.
Migrating from one cloud to another isn’t easy. To pull it off, Meesho identified the following requirements:
Petabyte-scale data transfer: Meesho needed to migrate billions of image files from their existing object storage server to Cloud Storage.
Dynamic image resizing: To save on storage costs, Meesho wanted the ability to dynamically resize the images based on the end user platform and store the smaller images in the Cloud CDN cache.
High-throughput data transfer: To support consumer demand, Meesho needed images to be served at a throughput of thousands of requests per second.
Zero downtime: Since any downtime involves potential loss of revenue, Meesho needed to perform the migration without taking any systems offline.
The above figure depicts the CDN migration architecture implemented at Meesho. The existing DNS server points to both the source load balancer and the Google External HTTP Load Balancer with a weighted distribution. The source load balancer points to the source object storage. Images were transferred from the source object storage to Google Cloud Storage.
The Google External HTTP Load Balancer was deployed with Cloud CDN to serve users static images stored in the CDN cache. The Google Load Balancer public IP is configured as an endpoint on their existing DNS server. The Load Balancer is connected to Cloud Run, which talks to the Cloud Storage bucket. When a request reaches the Load Balancer at the edge, it first checks whether the content is available in Cloud CDN and, if so, returns the object from the closest edge network. If the image is not available in the Cloud CDN cache, the request is sent to Cloud Run, which obtains the image from the Cloud Storage bucket and performs dynamic resizing of the image if necessary.
Meesho used Google Cloud’s Storage Transfer Service to transfer data from their current object storage to a Cloud Storage bucket over the internet. Since the number of files and the total size of the data to be transferred were huge, Meesho executed multiple parallel transfers by specifying folders and subfolders as prefixes in a Storage Transfer Service job.
Meesho delivers static images to multiple end-user platforms — mobile, laptop — at multiple resolutions. Rather than store each image at multiple resolutions, Meesho opted to store a single high-resolution mezzanine image. It then attached Cloud Run as a serverless network endpoint group to a Cloud Load Balancer. Application requests for images specify the name of the object, the format of the image, and its resolution (for example, abc.jpeg at 750*450 resolution). If the specific image exists for the requested resolution, it is returned from the Cloud Storage bucket to the end user and stored in the Cloud CDN cache. If an image for the specified resolution and/or format is not found, the mezzanine image (in our example, abc.jpeg) is resized to the specified resolution and format, then stored in the Cloud Storage bucket and returned to the end user. The dynamic resizing and formatting is only performed the first time for a specific resolution.
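The resize-on-miss behavior described above can be sketched as a small Cloud Run service. The following is a minimal illustration, assuming Flask, Pillow, and the Cloud Storage client library; the bucket name, URL scheme, and object layout are hypothetical, not Meesho's actual implementation.

```python
# A hedged sketch of a resize-on-miss image service for Cloud Run.
# Bucket name, URL scheme, and object layout are hypothetical placeholders.
import io

from flask import Flask, abort, request, send_file
from google.cloud import storage
from PIL import Image

app = Flask(__name__)
storage_client = storage.Client()
BUCKET_NAME = "static-images-bucket"  # hypothetical


@app.route("/<path:name>")
def serve_image(name):
    width = int(request.args.get("w", 0))
    height = int(request.args.get("h", 0))
    bucket = storage_client.bucket(BUCKET_NAME)

    # Serve a previously resized variant if one already exists.
    variant_name = f"{width}x{height}/{name}" if width and height else name
    variant = bucket.blob(variant_name)
    if variant.exists():
        return send_file(io.BytesIO(variant.download_as_bytes()),
                         mimetype="image/jpeg")

    # Cache miss: fetch the high-resolution mezzanine image.
    mezzanine = bucket.blob(name)
    if not mezzanine.exists():
        abort(404)
    image = Image.open(io.BytesIO(mezzanine.download_as_bytes())).convert("RGB")

    # Resize only when a resolution was requested, store the result so the
    # work is done once, and return the bytes for Cloud CDN to cache.
    if width and height:
        image = image.resize((width, height))
    buffer = io.BytesIO()
    image.save(buffer, format="JPEG")
    variant.upload_from_string(buffer.getvalue(), content_type="image/jpeg")
    buffer.seek(0)
    return send_file(buffer, mimetype="image/jpeg")
```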
In this architecture, it is important to configure Cloud Run to scale appropriately, as it handles the bulk of “CDN cache-miss” requests. Meesho performed the following configuration steps (an illustrative gcloud example follows the list):
Configured the number of concurrent requests that a single instance of Cloud Run can handle
Ensured a sufficient minimum of Cloud Run instances were available to serve user traffic to avoid cold-start latency
Reviewed limits of Cloud Run maximum instance size for the region and increased the limits if necessary to handle peak load
Set up smaller start-up times for Cloud Run containers, so that the application could quickly autoscale to handle a surge in traffic
Optimized the memory and CPU configuration to handle processing requirements
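For illustration only, these knobs map to gcloud run deploy flags roughly as shown below; the service name, image, region, and values are hypothetical placeholders, not Meesho's actual settings.

```sh
# Hedged example: service name, image, region, and values are placeholders.
# --concurrency: concurrent requests a single instance handles
# --min-instances: warm instances kept around to avoid cold-start latency
# --max-instances: ceiling for peak load (check regional quotas first)
gcloud run deploy image-resizer \
  --image=gcr.io/my-project/image-resizer:latest \
  --region=asia-south1 \
  --concurrency=80 \
  --min-instances=10 \
  --max-instances=500 \
  --cpu=2 \
  --memory=2Gi \
  --ingress=internal-and-cloud-load-balancing
```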
Cloud CDN was configured to ensure a high cache hit ratio (greater than 99%). This not only sped up the rendering of the images, but also reduced the load on Cloud Run, saving cost and improving performance.
Meesho followed well-established DevOps principles to achieve a zero-downtime migration:
Metrics and alerts were configured in Cloud Monitoring to oversee the load balancer.
The DNS server was configured to point to the Cloud Load Balancer IP addresses in addition to their current load balancer, which served static assets.
Weight-based DNS load balancing was employed to gradually shift the traffic to Google Cloud, while monitoring application performance and HTTP response codes.
The initial migration process distributed 0.1% of traffic during non-peak hours. The metrics, end-user performance, and response codes were continuously monitored.
Traffic was gradually incremented over a two-week period by increasing the weight of the Google Cloud Load balancer in DNS. By gradually shifting traffic, Meesho ensured a healthy cache-hit ratio, allowing Cloud Run to learn traffic patterns gradually and scale seamlessly.
Meesho learned a lot through this experience, and has the following advice for anyone undertaking a similar migration:
While transferring data using Google Cloud’s Storage Transfer Service, split the transfer process into multiple parallel transfers.
Ensure that applications do not pin certificates, which could create problems while migrating to the newer certificates in Google Cloud.
Plan a gradual migration process that increases traffic to Google Cloud in increments.
When all is said and done, Meesho considers its migration to Google Cloud a big success. After migrating the static images to Cloud CDN, Meesho held two major sales that each had three times the normal peak traffic, all with no issues. The CDN migration helped Meesho reduce its costs, improve performance and reduce load balancer errors when fetching static images. To learn more about Cloud CDN and how you can use it in your environment, check out the documentation.
Read More for the details.
When we closed the Mandiant acquisition in September 2022, we set the expectation that we’d be investing heavily in cybersecurity offerings that can help customers mitigate risk. In the short time since our two companies came together, we’ve aimed to do just that.
Today, we are announcing Mandiant Attack Surface Management for Google Cloud, which can enable customers to centralize visibility into cloud-hosted external assets. Attack Surface Management can deliver an adversary’s view of your organization’s attack surface, and can help discover external assets, identify business relationships, and actively check for exploitable weaknesses.
Customers can automatically pull external-facing assets hosted in Google Cloud into Attack Surface Management, and customers with multi-cloud and hybrid cloud environments can use the integrations with Azure and AWS to see their ecosystem in one view. You can read more about this announcement in Mandiant’s blog.
In addition to this new integration, there have been several new exciting Google Cloud and Mandiant offerings.
Threat Ready with Mandiant can help customers identify architecture and configuration issues that could lead to exploitation, battle test their environments to pinpoint vulnerabilities, and bring access to world-class incident response and intelligence experts to their security programs.
Mandiant Breach Analytics for Chronicle creates an early warning system that can help customers reduce an attacker’s dwell time. Powered by machine learning, Mandiant Breach Analytics matches indicators of compromise (IOCs) derived from incident response engagements, managed defense hunting missions, and advanced threat research to security events analyzed by Chronicle Security Operations.
Mandiant Threat Intelligence Browser Plug-in overlays threat intelligence in your browser so you can quickly learn more about an indicator, malware, or threat actor. This enhanced plug-in is compatible with Chrome and Firefox browsers.
Future cybersecurity investments from Google Cloud will continue to scale frontline expertise to help our customers rise up to today’s security challenges. Visit Mandiant.com to learn more about the Mandiant Advantage Platform and services.
Read More for the details.
Arvind Ltd has been in the apparel industry for more than 90 years, with its retail powerhouse Arvind Fashions Ltd being the backbone of well-known names in the retail fashion industry in India.
Arvind Fashions Ltd (Arvind) has seen significant growth in its portfolio, with new franchises being added every year. Its six high-conviction brands are Tommy Hilfiger, Calvin Klein, Sephora, Arrow, U.S. Polo Assn., and Flying Machine.
To secure a foundation for future growth, the company has embarked on a digital transformation (DX) journey, focusing on profitability and improving the customer experience. The key objectives of Arvind’s DX are to unlock the value of existing applications, gain new insights, and build a solid workflow with resilient systems.
Turning to Google Cloud to address the challenges around insights and analytics was a natural step forward, since Arvind had already formed a relationship with Google Cloud, starting with its productivity and collaboration tools during the pandemic.
Arvind’s enterprise applications estate is a mix of SAP, Oracle POS, logistics management systems and other applications. Having so many different applications made it a challenge for the company to bring all of this data together to drive retail insights and at the same time maintain the freshness of its products.
As a case in point, the existing sales reporting and inventory reconciliation process had been enabled by a mix of automated and semi-automated desktop applications. There were challenges to scale the infrastructure in order to process large amounts of data at a low latency.
The synchronization of master data across functions was critical to build the data platform that provides consistent insights to multiple stakeholders across the organization.
There are several ways to solve the challenges above and do more by building a modern data analytics platform: for example, a data lake-based approach built out use case by use case, hybrid data estates, and so on. Regardless of the approach, it is important to define the solution based on certain principles.
In Arvind’s scenario, the key business principles considered are that the data platform should support Variety, Variability, Velocity, and Volume. Each of these four V’s is a critical business pivot for successful fashion retailing: Variety in SKUs to deal with myriad fashion trends every season, Variability in shopping footfalls due to different festivities, weekends, and special occasions, Velocity to be agile and responsive to customer needs, and Volumes of data that bring richer insights.
This is where a BigQuery-enabled data platform comes in, as it is able to meet the needs above.
BigQuery is the mothership of the data and analytics platform on Google Cloud. Its serverless construct ensures that data engineering teams focus only on insights and analytics. Storage and compute are decoupled and can be scaled independently. BigQuery has been leveraged to serve both the raw and the curated data zones.
With BigQuery procedures, it is possible to process the data natively within the data warehouse itself. Procedures have been leveraged to process the data in a low-latency manner using familiar SQL.
But then what happens to advanced analytics and insights? With simplicity being our key guiding principle, BigQuery ML ensures that data analysts can create, train, and deploy analytics models even for complex requirements. The data can also be consumed through Looker Studio, which is seamlessly integrated with BigQuery.
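For illustration, here is a minimal sketch of how an analyst might create, train, and query a BigQuery ML model from Python using plain SQL. The dataset, table, column, and model names are hypothetical, not Arvind's actual schema.

```python
# A hedged sketch: training a simple regression model with BigQuery ML.
# Dataset, table, column, and model names are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client()

create_model_sql = """
CREATE OR REPLACE MODEL `retail_analytics.sales_forecast_model`
OPTIONS (model_type = 'linear_reg', input_label_cols = ['units_sold']) AS
SELECT
  store_id,
  sku_id,
  day_of_week,
  is_festive_season,
  units_sold
FROM `retail_analytics.daily_sales`
"""

# Training runs entirely inside BigQuery; result() blocks until it completes.
client.query(create_model_sql).result()

# Score new rows with ML.PREDICT and pull the results into a dataframe.
predict_sql = """
SELECT *
FROM ML.PREDICT(MODEL `retail_analytics.sales_forecast_model`,
                (SELECT store_id, sku_id, day_of_week, is_festive_season
                 FROM `retail_analytics.daily_sales`
                 WHERE sale_date = CURRENT_DATE()))
"""
predictions = client.query(predict_sql).to_dataframe()
print(predictions.head())
```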
Here are the key principles and highlights of the data platform that have been achieved:
Simple, yet exhaustive – We needed a solution with vast technical capabilities such as data lake and data warehouse, data processing, data consumption, and analytics, among others. At the same time, it needed to be simple to implement and to run ongoing operations.
Agility – High-quality analytics use cases typically require a significant amount of time, effort, and skill. While building a simple solution, we made sure that the selection of technology services provided agility in the long term.
Security – An organization can be truly successful if the insights and analytics operations are democratized. But while data is made available to a wider community, we need to ensure data governance and security.
Ease of operations – Data engineering teams spend a lot of time on infrastructure setup and management operations. With BigQuery, teams can put more effort into building the data pipelines and models that feed analytics instead of worrying about infrastructure operations.
Costs – Decoupling storage and compute allows for flexible pricing. A pay-as-you-go model is the ideal solution to managing costs.
The ingestion frequency of the store-level inventory (~800 stores) has now been changed to daily. Even with the additional data volumes and processing, scaling on BigQuery has been seamless. There are new processes and dashboards to address reconciliation and root cause analysis. Operational efficiencies have improved, leading to better productivity and turnaround time for critical processes.
The discrepancies in various reconciliation activities have been reduced drastically, by a factor of roughly 300x, thanks to the capabilities offered by the data platform. Not only is it possible to identify discrepancies, but the data platform has also helped identify their root causes.
Arvind Fashions Ltd has also been able to enhance some of its existing business processes and systems with insights from the data platform.
It’s going to be an exciting journey for Arvind Fashions Ltd and Google Cloud. Several initiatives are ready for kickoff, such as bringing more apps to edge devices, warehouse analytics, advanced customer data platforms, and predicting the lifecycle of designs and style codes, among other exciting initiatives.
Read More for the details.
Data is essential to any application and is central to the design of an efficient pipeline for delivering and managing information throughout an organization. Generally, you define a data pipeline when you need to process data during its life cycle. The pipeline can start where data is generated and stored in any format. The pipeline can end with data being analyzed, used as business information, stored in a data warehouse, or processed in a machine learning model.
Data is extracted, processed, and transformed in multiple steps depending on the downstream system requirements. Any processing and transformational steps are defined in a data pipeline. Depending on the requirements, the pipelines can be as simple as one step or as complex as multiple transformational and processing steps.
When selecting a data pipeline design pattern, there are different design elements that must be considered. These design elements include the following:
Select data source formats.
Select which stacks to use.
Select data transformation tools.
Choose between Extract Transform Load (ETL), Extract Load Transform (ELT), or Extract Transform Load Transform (ETLT).
Determine how changed data is managed.
Determine how changes are captured.
Data sources can have a variety of data types. Knowing the technology stack and tool sets that we use is also a key element of the pipeline build process. Enterprise environments come with challenges that require using multiple, complicated techniques to capture the changed data and merge it with the target data.
I mentioned that most of the time the downstream systems define the requirements for a pipeline and how these processes can be interconnected. The processing steps and sequences of the data flow are the major factors affecting pipeline design. Each step might include one or more data inputs, and the outputs might include one or more stages. The processing between input and output might include simple or complex transformational steps. I highly recommend keeping the design simple and modular to ensure that you clearly understand the steps and transformation taking place. Also, keeping your pipeline design simple and modular makes it easier for a team of developers to implement development and deployment cycles. It also makes debugging and troubleshooting the pipeline easier when issues occur.
The major components of a pipeline include:
Source data
Processing
Target storage
Source data can come from a transactional application, files collected from users, or data extracted from an external API. Processing of the source data can be as simple as a one-step copy or as complex as multiple transformations and joins with other data sources. The target data warehousing system might require the processed data that results from the transformation (such as a data type change or data extraction), along with lookups and updates from other systems. A simple data pipeline might be created by copying data from source to target without any changes. A complex data pipeline might include multiple transformation steps, lookups, updates, KPI calculations, and data storage into several targets for different reasons.
Source data can be presented in multiple formats, and each one needs the proper architecture and tools to process and transform it. A typical data pipeline might require multiple data types in any of the following formats:
Batch data: Files with tabular information (CSV, JSON, Avro, Parquet, and so on) where the data is collected according to a defined threshold or frequency with conventional batch processing or micro-batch processing. Modern applications tend to generate continuous data; for this reason, micro-batch processing is a preferred design for collecting data from sources.
Transactional data: Application data such as RDBMS (relational data), NoSQL, and Big Data.
Stream data: Real-time applications that use Kafka, Google Pub/Sub, Azure Stream Analytics, or Amazon Kinesis Data Streams. Streaming applications can communicate in real time and exchange messages to meet the requirements. In enterprise architecture design, real-time and stream processing is a very important component.
Flat files: PDFs or other non-tabular formats that contain data for processing. For example, medical or legal documents from which information can be extracted.
Target data is defined based on the requirements and the downstream processing needs. It’s common to build target data to satisfy the needs of multiple systems. In the data lake concept, the data is processed and stored in a way that analytics systems can derive insights from it while AI/ML processes can use it to build predictive models.
Multiple architecture designs are covered below that show how the source data is extracted and transformed into the target. The goal is to cover the general approaches; it’s important to remember that each use case can be very different and unique to the customer and may need special consideration.
The data pipeline architecture can be broken down into logical and platform levels. The logical design describes how the data is processed and transformed from the source into the target. The platform design focuses on the implementation and tooling that each environment needs, which depends on the provider and the tooling available on the platform. GCP, Azure, and Amazon have different toolsets for the transformation, while the goal of the logical design (transforming the data) remains the same no matter which provider is used.
Here is a logical design of a Data Warehousing pipeline:
Here is the logical design for a Data Lake pipeline:
Depending on the downstream requirements, the generic architecture designs can be implemented with more details to address several use cases.
The Platform implementations can vary depending on the toolset selection and development skills. What follows are a few examples of GCP implementations for the common data pipeline architectures.
A Batch ETL Pipeline in GCP – The source might be files that need to be ingested into the analytics Business Intelligence (BI) engine. Cloud Storage is the data transfer medium inside GCP, and Dataflow is then used to load the data into the target BigQuery storage. The simplicity of this approach makes the pattern reusable and effective for simple transformational processes. On the other hand, if we need to build a complex pipeline, this approach isn’t going to be efficient and effective.
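As a minimal sketch of that batch pattern using the Apache Beam Python SDK on Dataflow, the pipeline below reads files staged in Cloud Storage and loads them into BigQuery; the project, bucket, schema, and table names are hypothetical placeholders.

```python
# A hedged sketch of the Cloud Storage -> Dataflow -> BigQuery batch pattern.
# Project, bucket, dataset, table, and schema are hypothetical placeholders.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def parse_csv_line(line):
    """Turn one CSV line into a dict matching the BigQuery schema."""
    order_id, amount = line.split(",")
    return {"order_id": order_id, "amount": float(amount)}


options = PipelineOptions(
    runner="DataflowRunner",       # use "DirectRunner" to test locally
    project="my-project",          # hypothetical project ID
    region="us-central1",
    temp_location="gs://my-bucket/tmp",
)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromGCS" >> beam.io.ReadFromText("gs://my-bucket/input/*.csv")
        | "ParseCSV" >> beam.Map(parse_csv_line)
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:analytics.orders",
            schema="order_id:STRING,amount:FLOAT",
            create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED,
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```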
A Data Analytics Pipeline is a complex process that has both batch and stream data ingestion pipelines. The processing is complex, and multiple tools and services are used to transform the data into warehousing and an AI/ML access point for further processing. Enterprise solutions for data analytics are complex and require multiple steps to process the data. The complexity of the design can add to the project timeline and cost, but in order to achieve the business objectives, carefully review and build each component.
A machine learning data pipeline in GCP is a comprehensive design that allows customers to utilize GCP-native services to build and run a machine learning workflow. For more information, see Creating a machine learning pipeline.
GCP platform diagrams are created by Google Cloud Developer Architecture.
There are multiple approaches to designing and implementing data pipelines. The key is to choose the design that meets your requirements. New technologies are emerging that provide more robust and faster implementations for data pipelines. Google BigLake is a new service that introduces a new approach to data ingestion. BigLake is a storage engine that unifies data warehouses and lakes by enabling BigQuery and open source frameworks such as Spark to access data with fine-grained access control. BigLake provides accelerated query performance across multi-cloud storage and open formats such as Apache Iceberg.
The other major factor in deciding on the proper data pipeline architecture is cost. Building a cost-effective solution is a major part of the design decision. Usually, streaming and real-time data processing pipelines are more expensive to build and run than batch models. Sometimes the budget drives the decision on which design to choose and how to build the platform. Knowing the details of each component and being able to do a cost analysis of the solution ahead of time is important in choosing the right architecture design for your solution. GCP provides a cost calculator that can be used in these cases.
Do you really need real-time analytics or will a near real-time system be sufficient? This can resolve the design decision for the streaming pipeline. Are you building cloud native solutions or migrating an existing one from on-premises? All of these questions are important in designing a proper architecture for our data pipeline.
Don’t ignore the data volume when designing a data pipeline. The scalability of the design and of the services used in the platform is another very important factor to consider when designing and implementing a solution. Big Data keeps growing, and building capacity for processing and storing the data is a key element of data pipeline architecture. In reality, there are many variables that factor into proper platform design; data volume and velocity, or data flow rates, can be very important factors.
If you are planning to build a data pipeline for a data science project, consider all the data sources that the ML model requires for feature engineering. The data cleansing process is largely the responsibility of the data engineering team, which must have adequate and sufficient transformational toolsets. Data science projects deal with large data sets, which requires planning for storage. Depending on how the ML model is utilized, either real-time or batch processing must serve the users.
Big Data and the growth of data in general are posing new challenges for data architects and constantly changing the requirements for data architecture. The constant increase in data variety, data formats, and data sources is a challenge as well. Businesses are realizing the value of their data, automating more processes, and demanding real-time access to analytics and decision-making information. It is becoming a challenge to take all of these variables into consideration for a scalable, performant system. The data pipeline must be robust, flexible, and reliable, and the data quality must be trusted by all users. Data privacy is one of the most important factors in any design consideration. I’ll cover these concepts in my next article.
I highly recommend following the Google Cloud quickstarts and tutorials as next steps to learn more about GCP and get hands-on practice.
Interactive Tutorial: BigQuery tour for data analysts
Interactive Tutorial: Train an AutoML tabular model
Interactive Tutorial: Analyze Billing data with BigQuery
Stay tuned. Thank you for reading! Have a question or want to chat? Find me on Twitter or LinkedIn.
Read More for the details.
Amazon SageMaker Automatic Model Tuning now supports three new completion criteria to help you customize your tuning jobs based on your desired trade-off between accuracy, cost, and runtime.
Read More for the details.
Amazon OpenSearch Service adds a new connection mode for cross-cluster connection, simplifying the setup required to remote reindex between a local domain and remote VPC domains. Remote reindex enables you to migrate data from a source domain to a target domain. Remote reindex is also useful when you have to upgrade your clusters across multiple major versions.
Read More for the details.
You can now hibernate Amazon Elastic Block Store (EBS)-backed Amazon EC2 I3en, M6i, M6id, C6i, and C6id instances. Hibernation provides you with the convenience of pausing your instances and resuming them later from a saved state. Hibernation is just like closing and opening your laptop lid — your application will start right from where it left off. By using hibernation, you can maintain a fleet of pre-warmed instances that can get to a productive state faster without modifying your existing applications.
Read More for the details.
Google Cloud Next is back in person August 29–31, 2023, at the Moscone Center in San Francisco.
Make the most of this must-attend event, where you can:
Learn about the latest Google advancements in AI, data, security, productivity, and more
Explore keynotes, breakout sessions, and demos
Take part in hands-on labs, training, and certification opportunities on site
Hear updates on product roadmaps
Connect with Google thought leaders in the executive meeting center
We really hope you can make it live and in person in San Francisco. Space is limited, so mark your calendar for August 29–31. If you aren’t able to attend live, select Next ’23 experiences will be made available digitally and on demand.
Sign up here for the latest information about Next ’23, and be among the first to know when registration opens. See you in San Francisco!
Read More for the details.
Amazon CloudWatch now supports metric extraction from structured logs using Embedded Metric Format (EMF) without requiring customers to provide a special header declaration while publishing logs.
Read More for the details.
Starting today, you can use Amazon Athena to query data in Google Cloud Storage. With Athena’s data source connectors, you can run SQL queries on data stored in relational, non-relational, object, and custom data sources without the need to move data to S3 or learn a new query dialect. Google Cloud Storage is a managed service designed to store data in buckets, similar to Amazon S3.
Read More for the details.
AWS CloudTrail Lake now supports ingesting activity events from non-AWS sources, making CloudTrail Lake a single location to immutably store user and API activity events for auditing and security investigations across AWS and hybrid environments. You can consolidate activity events from AWS and non-AWS sources – such as in-house applications and SaaS applications running in the cloud or on-premises – without having to maintain multiple log aggregators and analysis tools. CloudTrail Lake records all events in a prescribed CloudTrail schema, immutably stores them for up to seven years, and provides an integrated SQL experience to query your activity events. This makes it easier for you to manage and diagnose security, audit, and operational incidents in AWS and hybrid environments.
Read More for the details.
Amazon MemoryDB for Redis now offers an availability Service Level Agreement (SLA) of 99.99% when using a Multi-Availability Zone (Multi-AZ) configuration. Previously, MemoryDB offered an SLA of 99.9% for Multi-AZ configurations. With this launch, MemoryDB has updated its Multi-AZ SLA to provide 10x higher levels of availability.
Read More for the details.
Amazon Polly is a service that turns text into lifelike speech, allowing you to create applications that talk, and build entirely new categories of speech-enabled products. Today, we are excited to announce the general availability of Ruth and Stephen, two new US English neural Text-to-speech (NTTS) voices.
Read More for the details.
Amazon ElastiCache for Redis now offers an availability Service Level Agreement (SLA) of 99.99% when using a Multi-Availability Zone (Multi-AZ) configuration. Previously, ElastiCache for Redis offered an SLA of 99.9% for Multi-AZ configurations. With this launch, ElastiCache for Redis has updated its Multi-AZ SLA to provide 10x higher levels of availability.
Read More for the details.
Today, we are excited to announce the integration of Systems Manager Automation with Systems Manager Change Calendar. Customers can now reduce the risks associated with changes to their production environment by allowing Automation runbooks to run only during an allowed time window. With this feature, users in your account can only run automations during the time periods allowed by your Change Calendar. For example, you can avoid the risk of application downtime due to a system update during a high-traffic period by blocking that time period in Change Calendar and enforcing Automation to check Change Calendar before updating your Amazon EC2 instance types.
Read More for the details.
Amazon AppFlow announces the release of four new connectors: Braintree, Microsoft Dynamics 365, Oracle HCM, and Zoho CRM. Amazon AppFlow is continually expanding its catalog of connectors to popular SaaS applications, and these four new data connectors make it easier for customers to access their data for use cases across marketing, eCommerce, customer service, and more.
Read More for the details.