Putting Events in Their Place with Dynamic Routing

Confluent

In the most basic scenario, microservices that need to take action on a common stream of events all listen to that stream. In the Apache Kafka® world, this means that each of those microservice client applications subscribes to a common Kafka topic. IoT: a stream of sensor data in which each sensor reading is an event.

Digital Transformation is a Data Journey From Edge to Insight

Cloudera

This is the first in a six-part blog series that outlines the data journey from edge to AI and the business value data produces along the journey. Data Enrichment – data pipeline processing, aggregation & management to ready the data for further refinement. Fig 1: The Enterprise Data Lifecycle.

Trending Sources

Monitoring Cloudera DataFlow Deployments With Prometheus and Grafana

Cloudera

Cloudera DataFlow for the Public Cloud (CDF-PC) is a complete self-service streaming data capture and movement platform based on Apache NiFi. It allows developers to interactively design data flows in a drag and drop designer, which can be deployed as continuously running, auto-scaling flow deployments or event-driven serverless functions.


Machine Learning with Python, Jupyter, KSQL and TensorFlow

Confluent

The blog posts "How to Build and Deploy Scalable Machine Learning in Production with Apache Kafka" and "Using Apache Kafka to Drive Cutting-Edge Machine Learning" describe the benefits of leveraging the Apache Kafka® ecosystem as a central, scalable and mission-critical nervous system. Data scientists love Python, period.

Optimizing Kafka Streams Applications

Confluent

Kafka Streams introduced the processor topology optimization framework at the Kafka Streams DSL layer. This framework opens the door for various optimization techniques from the existing data stream management system (DSMS) and data stream processing literature. Kafka Streams topology generation 101.

Using Cloudera Data Engineering to Analyze the Paycheck Protection Program Data

Cloudera

Second, the data set is likely to evolve, which will consume additional development time and resources. Finally, in a multi-stage process like this, there’s a chance things will break. The primary objective for this data engineer is to provide the LBB with two end reports: Report 1: Breakdown of all cities in Texas that retained jobs.

Test drive the Citus 11.0 beta for Postgres

The Citus Data

The easiest way to use Citus is to connect to the coordinator node and use it for both schema changes and distributed queries. For very demanding applications, you now have the option to load balance distributed queries across the worker nodes in (parts of) your application by using a different connection string and factoring in a few limitations.
