Fundamentals of Data Engineering

Xebia

The following is a review of the book Fundamentals of Data Engineering by Joe Reis and Matt Housley, published by O’Reilly in June of 2022, and some takeaway lessons. This book is as good for a project manager or any other non-technical role as it is for a computer science student or a data engineer.

Cloudera Data Engineering 2021 Year End Review

Cloudera

Since the release of Cloudera Data Engineering (CDE) more than a year ago, our number one goal was operationalizing Spark pipelines at scale with first-class tooling designed to streamline automation and observability. Data pipelines are composed of multiple steps with dependencies and triggers. New in 2021.

2018: A Year in Review for Storage Systems.

Hu's Place - HitachiVantara

For lack of similar capabilities, some of our competitors began implying that we would no longer be focused on the innovative data infrastructure, storage and compute solutions that were the hallmark of Hitachi Data Systems. A REST API is built directly into our VSP storage controllers.

Simplifying machine learning lifecycle management

O'Reilly Media - Data

As companies move from machine learning prototypes to products and services, tools and best practices for productionizing and managing models are just starting to emerge. Today’s data science and data engineering teams work with a variety of machine learning libraries, data ingestion, and data storage technologies.

Metadata Management: Process, Tools, Use Cases, and Best Practices

Altexsoft

We’ll briefly recap the basics first and then discuss metadata management and tools that can come in handy. Metadata is basically information that describes other data. It helps us understand the origin, structure, nature, and context of data. It aims at making data assets understandable and discoverable for users.

Navigating the Data Lake: Insights from Building and Utilizing Data Lakes

InnovationM

In this article, I will share practical insights and technologies utilized in building and harnessing the potential of data lakes. Demystifying Data Lakes: Data lakes serve as flexible storage repositories, enabling organizations to store raw and diverse data types, breaking away from the constraints of traditional data warehouses.

Data Architect: Role Description, Skills, Certifications and When to Hire

Altexsoft

It serves as a foundation for the entire data management strategy and consists of multiple components, including data pipelines; on-premises and cloud storage facilities – data lakes, data warehouses, data hubs; data streaming and Big Data analytics solutions (Hadoop, Spark, Kafka, etc.);
