Prometheus is here to stay—make it part of your IT Ops monitoring

Michael Fisher, Group Product Manager, OpsRamp

Prometheus has emerged as one of the most popular open-source infrastructure monitoring tools ever. It's become a de facto industry standard for monitoring cloud applications and infrastructure, particularly those built with cloud-native technologies such as Kubernetes. You should use it.

Prometheus has caught on in ways that past open-source monitoring tools such as StatsD, CollectD, and Graphite never did, thanks to its superior ease of use, richer data model, and PromQL query language. Nagios was an open-source monitoring warhorse in its day, but the cloud-native world that Prometheus was built for has left it behind.

From its introduction in 2012 through its acceptance into the Cloud Native Computing Foundation in 2016 and its graduation there in 2018, Prometheus has spread like a benevolent virus that organizations never realized they had.

It starts with one developer instrumenting one application on his localhost or using an application already instrumented with Prometheus, then another developer putting it on her localhost. Soon the production Kubernetes cluster is running Prometheus and sending metrics to a managed Grafana host sitting on an EC2 instance that a dev team set up because it didn't want to deal with excessive local storage of time-series metrics.

Meanwhile, IT Ops is unaware of the instrumentation and usage of Prometheus, and it has few insights into the health and performance of the application, barring those that can be inferred from performance metrics coming from the underlying and supporting infrastructure.

That situation is far from ideal. Here's what you should do about it.

It's time to go all in

There are several reasons why monitoring should be centralized so that anyone, from Tier 1 support to top-level executives, can see the end-to-end availability and performance of applications and services. Centralizing monitoring data, from custom application metrics to the underlying databases, is critical for IT Ops.

Beyond closing the visibility gap described above, the most salient reasons to integrate Prometheus monitoring with the rest of the monitoring stack include:

  • Long-term data retention: Prometheus stores time-series data on local disk and, by default, keeps it for only 15 days. Moving that data into centralized long-term storage lets you view performance over time, often up to two years.
  • Federating data across disparate instrumentation: This allows you to view Prometheus metrics in context with other monitoring data and pinpoint where a performance issue is occurring.
  • Ease of correlation of app health to supporting infrastructure: This gives you a fuller picture of application health by connecting health metrics with underlying infrastructure metrics.
  • Automated actions to remediate app issues: By connecting Prometheus to the rest of your monitoring tool kit, you can integrate it into your event management system and set actions to run automatically in response to application issues that Prometheus detects.
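
To make the last point concrete, here is a minimal Python sketch of how an event management system might turn Prometheus alerts into remediation steps. It assumes the alerts arrive as a JSON payload from Alertmanager's webhook receiver, which carries an `alerts` list whose entries have a `status` and a `labels` map. The runbook, the alert names, and the action names are hypothetical and exist only for illustration; a real integration would call into your own automation tooling.

```python
from typing import Dict, List, NamedTuple


class PlannedAction(NamedTuple):
    alertname: str
    instance: str
    action: str


# Hypothetical runbook mapping alert names to remediation steps.
# These names are illustrative and not taken from any real rule set.
RUNBOOK: Dict[str, str] = {
    "HighMemoryUsage": "restart_pod",
    "DiskAlmostFull": "expand_volume",
}


def plan_actions(payload: dict, runbook: Dict[str, str] = RUNBOOK) -> List[PlannedAction]:
    """Map firing alerts in an Alertmanager webhook payload to runbook actions."""
    planned = []
    for alert in payload.get("alerts", []):
        if alert.get("status") != "firing":
            continue  # resolved alerts need no remediation
        labels = alert.get("labels", {})
        name = labels.get("alertname", "")
        # Alerts without a runbook entry go to a human instead of being automated.
        action = runbook.get(name, "open_ticket")
        planned.append(PlannedAction(name, labels.get("instance", "unknown"), action))
    return planned


if __name__ == "__main__":
    sample = {
        "status": "firing",
        "alerts": [
            {"status": "firing",
             "labels": {"alertname": "HighMemoryUsage", "instance": "web-1:9100"}},
            {"status": "resolved",
             "labels": {"alertname": "DiskAlmostFull", "instance": "db-1:9100"}},
        ],
    }
    for step in plan_actions(sample):
        print(step)
```

Keeping the planning step as a pure function, separate from whatever executes the action, makes the mapping easy to test and audit before anything touches production.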

Take monitoring to the next level

Prometheus is a valuable tool for monitoring cloud and cloud-native application performance, and its PromQL query language makes it straightforward to build dashboards that visualize that performance. It doesn't necessarily replace your existing application and infrastructure monitoring tools, but it does enhance them.

So how do you use Prometheus integration to take your application and infrastructure monitoring to the next level? Here are three things you can do:

  1. Centralize: Import and store Prometheus metrics data in a centralized cloud repository to avoid the burden of managing dozens of local databases across Kubernetes clusters.
  2. Integrate: Bring PromQL into your visualization tool to add Prometheus metrics to your dashboards.
  3. Analyze: Ingest Prometheus alerts and use machine learning to cross-correlate them with alerts from your other systems, which improves alerting accuracy.
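
As an illustration of the "Integrate" step, the sketch below shows one way a dashboard or central tool could pull Prometheus metrics using PromQL. It relies on the Prometheus HTTP API's instant-query endpoint, `/api/v1/query`, and on the `status`, `resultType`, and `result` fields of its JSON response. The server address and the example query are assumptions for illustration, and the helper function names are hypothetical.

```python
import json
from typing import Dict, List, Tuple
from urllib.parse import urlencode
from urllib.request import urlopen

Sample = Tuple[Dict[str, str], float]


def build_query_url(base_url: str, promql: str) -> str:
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url.rstrip('/')}/api/v1/query?{urlencode({'query': promql})}"


def parse_instant_vector(body: dict) -> List[Sample]:
    """Turn an instant-vector API response into (labels, value) pairs."""
    if body.get("status") != "success":
        raise ValueError(f"query failed: {body.get('error', 'unknown error')}")
    data = body["data"]
    if data["resultType"] != "vector":
        raise ValueError(f"expected a vector result, got {data['resultType']}")
    # Each sample's "value" is [unix_timestamp, "value as a string"].
    return [(series["metric"], float(series["value"][1])) for series in data["result"]]


def query(base_url: str, promql: str, timeout: float = 10.0) -> List[Sample]:
    """Run a PromQL instant query against a Prometheus server."""
    with urlopen(build_query_url(base_url, promql), timeout=timeout) as resp:
        return parse_instant_vector(json.load(resp))


if __name__ == "__main__":
    # Assumed address: a Prometheus server listening on localhost:9090.
    for labels, value in query("http://localhost:9090", "up"):
        print(labels.get("instance", "?"), value)
```

Separating URL construction and response parsing from the network call means the same parsing logic can be reused whether the data comes from a single Prometheus server or from a centralized store that exposes a compatible API.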

De-silo your monitoring infrastructure

Prometheus has become the top choice for monitoring cloud and cloud-native applications. Chances are your DevOps and SRE teams are using it somewhere in your organization today. Prometheus delivers a powerful data model and a flexible query language that make it easy to use and to build metrics dashboards with, but it won't live up to its potential if it remains just another monitoring silo in your environment.

Integrating Prometheus with the rest of your monitoring stack will improve your data visualizations, help you get to the root cause of a performance issue faster, lower your database management overhead, expand your event management system to cloud and cloud-native applications, and enhance your alerting system.

You'll end up with more stable, more reliable business applications and happier customers.
