
Optimizing Hive on Tez Performance

Cloudera

Tuning Hive on Tez queries can never be done with a one-size-fits-all approach. Query performance depends on the size of the data, file types, query design, and query patterns. During performance testing, evaluate and validate configuration parameters and any SQL modifications. Understanding parallelization in Tez.


Upgrade Journey: The Path from CDH to CDP Private Cloud

Cloudera

In addition, the customer wanted to use the new Hive capabilities shipped with CDP Private Cloud Base 7.1.2. Hive-on-Tez for better ETL performance. ACID transactions, ANSI 2016 SQL support. Major performance improvements. Navigator to Atlas migration, improved performance and scalability.


Automating Data Pipelines in CDP with CDE Managed Airflow Service

Cloudera

That’s why we are excited to expand our Apache Airflow-based pipeline orchestration for Cloudera Data Platform (CDP) with the flexibility to define scalable transformations with a combination of Spark and Hive. Figure 1: Pipeline composed of Spark and Hive jobs deployed to run within CDE’s managed Apache Airflow service.
