What’s New in Modern Data Warehousing for 2020?

For several decades we have used data warehouses to help us better understand our business. We started by capturing the essential data, then centralizing and managing it with governance, security, integrity and more. From there we could create meaningful reports and dashboards that helped us keep score on our key performance indicators (KPIs), essentially measuring our performance against pre-defined goals and metrics. Fast forward to today, and we are still doing all of that – still curating data, still defining dashboards and reports, still measuring our business. We have added a lot more data to the equation and can measure ourselves in much shorter time increments than when we started. However, there is a new push in business analytics: the need for true business insight. This means we need to do more than count things and more than measure and compare; we need to glean real meaning from our data. We need to find patterns, correlations and predictions in our data to help us not just course-correct based on scores, but shape strategic vision with greater innovation.

This is the role of the modern data warehouse – to help us both keep score with transactional precision and to shape the future with statistically motivated predictive analytics.  This means a lot of things have to change:

1. We need to ingest more data and more diverse data types than ever before.

For example, a major telco needs to reduce customer churn. In some Asian telco markets, this could mean predicting the best offer to make to subscribers every 3-6 days, as there are no long-term contracts in place. To do so, the telco will need to analyze social media data, phone usage data and network traffic, and look for the patterns that emerge across them all, possibly with the help of machine learning. An international bank needs to shift from fraud detection to fraud prevention, which requires ad-hoc search within unstructured sources like documents, emails, and more, to find patterns of attempted fraud and stop it before it occurs. A major energy company needs to reduce the cost of preventative maintenance by getting smarter at predicting component replacement needs before failure, without wasting precious utility, by analyzing huge volumes of time-series data.

2. We need to be nimble, and stay optimized, as we make data more accessible to more users and use cases than ever before.

Expanding the role data plays in everyone’s lives at our organization means that we are adding users, many of whom are not technical, and adding use cases of increasing sophistication. This, coupled with the ever-expanding types of data we need to take into account, puts extreme pressure on traditional data warehouses and drives the migration to modern technologies. We also need to keep pace with these changes to keep the data warehouse performant and healthy. Organizations worldwide are using new tools and services to migrate to, analyze, optimize and manage their modern data warehouses.

3. We need to embrace the hybrid cloud and multi-cloud deployment models.

There are many advantages to cloud computing. You can quickly deliver resources to demanding workloads without choking existing ones. You can have multiple isolated computing environments working together on the same data, so people do not step on each other’s toes and the integrity of a single source of truth is preserved. You can reduce costs by paying as you go instead of investing upfront for maximum capacity. However, there are some drawbacks. You may not be able to move all your data into a single cloud resource, due to its size or the security risk in doing so. If you’re not managing your cloud consumption properly, you can easily end up with runaway costs. To succeed on your cloud journey, you must use on-premises computing together with multiple clouds, both public and private, with modern technology to provide consistent security, governance, metadata and schema across it all.

In the new installment of our Modern Data Warehouse Fundamentals series, we explore all of these and more, as we cover new use cases including a deep dive into Time Series Data Warehousing. We explore the tools and services available to migrate existing workloads from traditional data warehouses to our modern data warehouse. We also cover tools and services for optimizing your workloads and keeping them healthy. Finally, we show how we help keep your existing investments in Cloudera (CDH and HDP) up to date and make the journey to the cloud with the Cloudera Data Platform.

Register to join us on December 12th: Modern Data Warehouse Fundamentals

 

David Dichmann
Senior Director Product Management