Monitoring Docker

Telegraf + InfluxDB + Grafana

2017-09-12

Created by Thierry Sallé / @tsalle

Telegraf

  • Collector written in Go
  • Lightweight
  • Input / Output Plugins

Input Plugins

  • aerospike
  • amqp_consumer (rabbitmq)
  • apache
  • aws cloudwatch
  • bcache
  • cassandra
  • ceph
  • cgroup
  • chrony
  • consul
  • conntrack
  • couchbase
  • couchdb
  • disque
  • dmcache
  • dns query time
  • docker
  • dovecot
  • elasticsearch
  • exec (generic executable plugin, supports JSON, influx, graphite and nagios)
  • fail2ban
  • filestat
  • fluentd
  • graylog
  • haproxy
  • hddtemp
  • http_response
  • httpjson (generic JSON-emitting http service plugin)
  • internal
  • influxdb
  • interrupts
  • ipmi_sensor
  • iptables
  • jolokia
  • kapacitor
  • kubernetes
  • leofs
  • lustre2
  • mailchimp
  • memcached
  • mesos
  • minecraft
  • mongodb
  • mysql
  • net_response
  • nginx
  • nsq
  • nstat
  • ntpq
  • openldap
  • phpfpm
  • phusion passenger
  • ping
  • postgresql
  • postgresql_extensible
  • powerdns
  • procstat
  • prometheus (can be used for Caddy server)
  • puppetagent
  • rabbitmq
  • raindrops
  • redis
  • rethinkdb
  • riak
  • salesforce
  • sensors
  • snmp
  • snmp_legacy
  • sql server (microsoft)
  • tomcat
  • twemproxy
  • varnish
  • zfs
  • zookeeper
  • win_perf_counters (windows performance counters)
  • win_services
  • sysstat
  • system
    • cpu
    • mem
    • net
    • netstat
    • disk
    • diskio
    • swap
    • processes
    • kernel (/proc/stat)
    • kernel (/proc/vmstat)
    • linux_sysctl_fs (/proc/sys/fs)

Telegraf can also collect metrics via the following service plugins:

  • http_listener
  • kafka_consumer
  • mqtt_consumer
  • nats_consumer
  • nsq_consumer
  • logparser
  • statsd
  • socket_listener
  • tail
  • tcp_listener
  • udp_listener
  • webhooks
    • filestack
    • github
    • mandrill
    • rollbar
    • papertrail
  • zipkin

Output Plugins

  • influxdb
  • amon
  • amqp (rabbitmq)
  • aws kinesis
  • aws cloudwatch
  • datadog
  • discard
  • elasticsearch
  • file
  • graphite
  • graylog
  • instrumental
  • kafka
  • librato
  • mqtt
  • nats
  • nsq
  • opentsdb
  • prometheus
  • riemann
  • riemann_legacy
  • socket_writer
  • tcp
  • udp

Installation

Configuration

/etc/telegraf/telegraf.conf

telegraf config > /etc/telegraf/telegraf.conf

Docker Input Plugin Configuration


# Read metrics about docker containers
[[inputs.docker]]
  ## Docker Endpoint 
  ## To use TCP, set endpoint = "tcp://[ip]:[port]" 
  ## To use environment variables (ie, docker-machine), set endpoint = "ENV"
  endpoint = "unix:///var/run/docker.sock" 
  ## Only collect metrics for these containers, collect all if empty 
  container_names = [] 
  ## Timeout for docker list, info, and stats commands 
  timeout = "5s" 
  ## Whether to report for each container per-device blkio (8:0, 8:1...) and
  ## network (eth0, eth1, ...) stats or not
  perdevice = true
  ## Whether to report for each container total blkio and network stats or not 
  total = false
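
To limit collection to specific containers, container_names takes a list. A minimal sketch, assuming the container names grafana and influxdb as they appear in the docker-compose ps output at the end of this deck (the names are illustrative):

```
[[inputs.docker]]
  endpoint = "unix:///var/run/docker.sock"
  ## Assumed names, for illustration; an empty list collects all containers
  container_names = ["grafana", "influxdb"]
  timeout = "5s"
```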

InfluxDB

  • Time series database
  • Written in Go
  • Open source (except for cluster mode)

Installation

Telegraf Configuration


# Configuration for influxdb server to send metrics to
[[outputs.influxdb]]
 ## The full HTTP or UDP endpoint URL for your InfluxDB instance.
 ## Multiple urls can be specified as part of the same cluster,
 ## this means that only ONE of the urls will be written to each interval.
 # urls = ["udp://localhost:8089"] # UDP endpoint example
 urls = ["http://influxdb:8086"] # required
 ## The target database for metrics (telegraf will create it if not exists).
 database = "telegraf" # required
 ## Retention policy to write to. Empty string writes to the default rp.
 retention_policy = ""
 ## Write consistency (clusters only), can be: "any", "one", "quorum", "all"
 write_consistency = "any"
 ## Write timeout (for the InfluxDB client), formatted as a string.
 ## If not provided, will default to 5s. 0s means no timeout (not recommended).
 timeout = "5s"
 # username = "telegraf"
 # password = "metricsmetricsmetricsmetrics"
 ## Set the user agent for HTTP POSTs (can be useful for log differentiation)
 # user_agent = "telegraf"
 ## Set UDP payload size, defaults to InfluxDB UDP Client default (512 bytes)
 # udp_payload = 512
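
The "interval" mentioned in the comments above is set in the [agent] table of the same telegraf.conf. A sketch, assuming values that match what the generated sample config usually contains (adjust to your needs):

```
[agent]
  ## How often inputs are collected (assumed value)
  interval = "10s"
  ## How often buffered metrics are flushed to outputs (assumed value)
  flush_interval = "10s"
```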

Docker Compose


version: '2'
services:

  traefik:
    image: traefik
    hostname: traefik
    container_name: traefik
    command: --web --docker --docker.domain=aperogeek.fr --logLevel=DEBUG --acme --acme.email=seuf@gmail.com --acme.storage=/etc/traefik/acme.json --acme.entrypoint=https --acme.ondemand=true --entryPoints='Name:https Address::443 TLS:/etc/traefik/ssl/aperogeek.fr.pem,/etc/traefik/ssl/aperogeek.fr.key' --entryPoints='Name:http Address::80 Redirect.EntryPoint:https' --defaultentrypoints=http,https
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - "./ssl:/etc/traefik/ssl"
      - "/var/run/docker.sock:/var/run/docker.sock"
      - "/dev/null:/traefik.toml"
      - "/data/traefik/acme.json:/etc/traefik/acme.json"

  telegraf:
    image: telegraf
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /data/telegraf/telegraf.conf:/etc/telegraf/telegraf.conf
    labels:
      - "traefik.enable=false"

  influxdb:
    image: influxdb:alpine
    volumes:
      - /data/influxdb:/var/lib/influxdb
    labels:
      - "traefik.enable=false"
 
  grafana:
    image: grafana/grafana
    volumes:
      - /data/grafana:/var/lib/grafana
    labels:
      - "traefik.frontend.rule=Host:grafana.aperogeek.fr"

docker-compose ps


       Name                      Command               State                                Ports                               
-------------------------------------------------------------------------------------------------------------------------------
grafana               /run.sh                          Up      3000/tcp                                                         
influxdb              /entrypoint.sh influxd           Up      8086/tcp                                                         
telegraf              /entrypoint.sh telegraf          Up      8092/udp, 8094/tcp, 8125/udp                                     
traefik               /traefik --web --docker -- ...   Up      0.0.0.0:443->443/tcp, 0.0.0.0:80->80/tcp, 0.0.0.0:8080->8080/tcp 

Demo