
Scratching an Itch with Prometheus

July 5, 2018 · 2 min read

Not too long ago I became obsessed with Prometheus. I'd heard about it for a while and knew it was powerful, but couldn't quite understand how everything fit together. The documentation is extremely verbose, for good reason, but it took playing with it for a while for everything to click. This post is a concise yet thorough overview that goes a long way toward explaining the basic concepts to my developer brain. In their simplest form, exporters expose an HTTP endpoint at /metrics whose output is statistics in Prometheus' text format. The real power of Prometheus comes when you expose your own /metrics endpoint and have Prometheus consume the statistics you generate. This post is also a very good introduction, and its section Building your own exporter is extremely valuable in describing just some of the possibilities.
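To make the exporter idea concrete, here is a minimal sketch of a hand-rolled /metrics endpoint using only Python's standard library. The metric name demo_load1, the port 9999, and the helper render_metrics are all made up for illustration, and os.getloadavg() is only available on Unix-like systems. A real exporter would more likely use an official client library, so treat this as a toy rather than a recommended implementation.

```python
import os
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(load1: float) -> str:
    """Render a single gauge in the Prometheus text exposition format."""
    return (
        "# HELP demo_load1 1-minute load average (hypothetical metric name).\n"
        "# TYPE demo_load1 gauge\n"
        f"demo_load1 {load1}\n"
    )


class MetricsHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Only /metrics is served; everything else is a 404.
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(os.getloadavg()[0]).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


if __name__ == "__main__":
    # Port 9999 is an arbitrary choice for this sketch.
    HTTPServer(("127.0.0.1", 9999), MetricsHandler).serve_forever()
```

With this running, requesting http://127.0.0.1:9999/metrics returns the gauge as plain text, which is the shape of output Prometheus scrapes from any exporter.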

After getting my bearings I started on a prototype with a simple premise: "Why look at the usage graphs in Digital Ocean for each server independently? Why not have them in one location?" How To Install Prometheus on Ubuntu 16.04 is a very good primer for getting everything up and running quickly.

I've made a few modifications since working through the article:

  • Prometheus version 2.3.1

    • There have been massive perf improvements in v2.3.x.
  • node_exporter version 0.16.0

    • There are significant changes to the metrics naming conventions.
    • This exporter typically has the most coupling with Grafana dashboards and often requires altering them to work correctly.
  • Use prometheus:prometheus for ownership of core prometheus processes like prometheus or alertmanager.

    • sudo useradd --no-create-home --shell /bin/false prometheus
  • Use prometheus-exporter:prometheus-exporter for ownership of exporters. Exporters should possibly be more isolated but I feel it may be a case of YAGNI.

    • sudo useradd --no-create-home --shell /bin/false prometheus-exporter
  • Set scrape_interval to 1 minute: scrape_interval: 1m.

    • 15 seconds is still doable but I'm currently not concerned with very granular detail.
    • This drops the scrape rate from 4 calls per minute to 1, trimming some of the overhead for Prometheus and every exporter.
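As a rough illustration of the scrape interval change, a prometheus.yml fragment might look like the following. The job name and target are assumptions for the sake of the example (9100 is node_exporter's default port), so adjust them to your own hosts.

```yaml
global:
  # Scrape every target once per minute instead of every 15 seconds.
  scrape_interval: 1m

scrape_configs:
  # Hypothetical job; point the target at wherever node_exporter listens.
  - job_name: node
    static_configs:
      - targets: ['localhost:9100']
```

An individual job can still set its own scrape_interval if one exporter ever needs finer detail than the rest.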

At $dayJob we've moved to provisioning servers using Laravel Forge, which opens up the possibility of using exporters for mysqld, mariadb, postgres, memcached, redis, beanstalkd, nginx, php-fpm, and sendmail. I've opted to use node_exporter, mysqld, nginx-vts-exporter, php-fpm, and redis. To put the original premise into perspective, replicating the newer monitoring agent graphs in Digital Ocean only requires node_exporter. A few of the exporters require very little setup, only setting a few configuration variables in their systemd service definitions. Other exporters, like nginx-vts-exporter, require building nginx from source.

I plan to write a series of posts that should help in getting a very rudimentary implementation running. Kubernetes is used so heavily in the Prometheus ecosystem that it almost seems required, but fortunately Prometheus also just works(tm) on a traditional virtual machine without any real fuss.