Network Infrastructure Weathermap

The main goal of collecting metrics is to store them for long-term use and to create graphs to debug problems or identify trends. However, storing metrics about your system isn’t enough to identify the root cause of problems and anomalies. You also need a high-level overview of your network backbone. A weathermap is a good fit for a Network Operations Center (NOC). In this post, I will show you how to build one using open source tools only.

Icinga 2 will collect metrics about your backbone and write check results and performance data to InfluxDB (supported since Icinga 2.5). Grafana will then visualize these metrics in map form.

To get started, add your desired host configuration inside the hosts.conf file:

Note: the city & country attributes will be used to create the weathermap.

To enable the InfluxDBWriter on your Icinga 2 installation, type the following command:

Configure your InfluxDB host and database in /etc/icinga2/features-enabled/influxdb.conf (Learn more about the InfluxDB configuration)

Icinga 2 will forward all your metrics to an icinga2_metrics database. The included host and service templates define how metrics are stored: the measurement acts as a table that groups metrics, and tags identify which host or service a given measurement belongs to (notice the use of the city & country tags).
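To make the measurement/tag layout more concrete, here is a minimal Python sketch of how one point could be rendered in InfluxDB line protocol. It is not what Icinga 2 runs internally; the measurement name hostalive, the host name paris-core and the state field are assumptions chosen for illustration, and only the city & country tags come from the setup described above.

```python
def escape_tag(value: str) -> str:
    """Escape the separators that line protocol reserves inside tag keys and values."""
    return value.replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def format_field(value) -> str:
    """Render a field value: integers get an 'i' suffix, strings are double-quoted."""
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    return '"' + str(value).replace('"', '\\"') + '"'


def format_point(measurement: str, tags: dict, fields: dict) -> str:
    """Build 'measurement,tag=value,... field=value,...' (timestamp omitted)."""
    tag_part = ",".join(
        f"{escape_tag(key)}={escape_tag(val)}" for key, val in sorted(tags.items())
    )
    field_part = ",".join(
        f"{key}={format_field(val)}" for key, val in sorted(fields.items())
    )
    return f"{measurement},{tag_part} {field_part}"


# Hypothetical host: the names below are illustrative, not taken from a real setup.
print(format_point(
    "hostalive",
    {"hostname": "paris-core", "city": "Paris", "country": "FR"},
    {"state": 0},
))
```

Tags are sorted by key here only to keep the output stable; the tag values are what Grafana can later group by.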

Don’t forget to restart Icinga 2 after saving your changes:

Once Icinga 2 is up and running, it’ll start collecting data and writing it to InfluxDB:

Once the data has arrived, it’s time for visualization. Grafana is widely used to generate graphs and dashboards. To create a weathermap we can use a Grafana plugin called Worldmap Panel. Install it using the grafana-cli tool:

The plugin will be installed into your grafana plugins directory (/var/lib/grafana/plugins):

Restart Grafana, navigate to the Grafana web interface and create a new data source:

Create a new Dashboard:

The Group By clause should be the country code and an alias is needed too. The alias should be in the form $tag_field_name. See the image below for an example of a query:
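As a rough sketch of what such a query and alias could look like in text form, the small Python helper below assembles them. The measurement hostalive and the field state are assumptions, and $timeFilter is the time-range macro that Grafana fills in for InfluxDB queries.

```python
def worldmap_query(measurement: str, field: str, tag: str) -> dict:
    """Return an InfluxQL query grouped by `tag`, plus the matching Grafana alias."""
    query = (
        f'SELECT last("{field}") FROM "{measurement}" '
        f'WHERE $timeFilter GROUP BY "{tag}"'
    )
    # The alias follows the $tag_<tag name> form so each series is named after its tag value.
    return {"query": query, "alias": f"$tag_{tag}"}


# Hypothetical measurement and field names, grouped by the country tag.
panel = worldmap_query("hostalive", "state", "country")
print(panel["query"])
print(panel["alias"])
```

With the country tag, each series is named after a country code, which is what the panel can match against its list of locations.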

Under the Worldmap tab, choose the countries option:

Finally, you should see a tile map of the world with circles representing the state of each host.

The state field takes the following values: 0 – OK, 1 – Warning, 2 – Critical, 3 – Unknown/Unreachable.
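If you post-process these values outside Grafana, a small lookup table is enough to turn the numeric state into a readable label. This is only an illustrative Python sketch; the helper name state_label is hypothetical.

```python
# Numeric state codes and their meaning, as listed above.
STATE_LABELS = {
    0: "OK",
    1: "Warning",
    2: "Critical",
    3: "Unknown/Unreachable",
}


def state_label(code: int) -> str:
    """Translate a numeric state into its label, flagging unexpected codes."""
    return STATE_LABELS.get(code, f"Undefined ({code})")


for code in (0, 2, 7):
    print(code, state_label(code))
```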

Note: For lazy people, I created a ready-to-use dashboard you can import from GitHub.

MySQL Monitoring with Telegraf, InfluxDB & Grafana

This post will walk you through each step of creating an interactive, real-time & dynamic dashboard to monitor your MySQL instances using Telegraf, InfluxDB & Grafana.

Start by enabling the MySQL input plugin in /etc/telegraf/telegraf.conf :

Once Telegraf is up and running, it’ll start collecting data and writing it to the InfluxDB database:

Finally, point your browser to your Grafana URL, then log in as the admin user. Choose ‘Data Sources’ from the menu. Then, click ‘Add new’ in the top bar.

Fill in the configuration details for the InfluxDB data source:

You can now import the dashboard.json file by opening the dashboard dropdown menu and clicking ‘Import’:

Note: Check my GitHub for more interactive & beautiful Grafana dashboards.

GitLab Performance Monitoring with Grafana

Since GitLab v8.4, you can monitor your own instance with the InfluxDB & Grafana stack by using the GitLab application performance measuring system called “GitLab Performance Monitoring”.

GitLab writes metrics to InfluxDB via UDP. Therefore, this must be enabled in /etc/influxdb/influxdb.conf:
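To get a feel for what a UDP write looks like, here is a small Python sketch that sends one line-protocol point as a single datagram. It is not GitLab's code: the measurement rails_requests and its values are made up, and the demo sends to a throwaway local socket instead of a real InfluxDB. Against a real server you would use the host and port from your UDP listener's bind-address (8089 is a common choice, but treat it as an assumption).

```python
import socket


def send_udp_point(line: str, host: str, port: int) -> int:
    """Send one line-protocol point as a UDP datagram; returns the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(line.encode("utf-8"), (host, port))


def demo_roundtrip(line: str) -> str:
    """Send `line` to a local stand-in listener and return what it received."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
        receiver.settimeout(2.0)          # do not hang forever if the datagram is lost
        port = receiver.getsockname()[1]
        send_udp_point(line, "127.0.0.1", port)
        data, _ = receiver.recvfrom(65535)
    return data.decode("utf-8")


if __name__ == "__main__":
    # Hypothetical point; the listener echoes back exactly what was sent.
    print(demo_roundtrip("rails_requests,action=show duration=12.5"))
```

UDP is fire-and-forget: the sender gets no acknowledgement, so a misconfigured port fails silently. That is why it is worth checking that data actually shows up in the database afterwards.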

Restart your InfluxDB instance. Then, create a database to store GitLab metrics:

Next, go to the GitLab Settings dashboard and enable InfluxDB Metrics as shown below:

Then, you need to restart GitLab:

Now your GitLab instance should send data to InfluxDB:

Finally, point your browser to your Grafana URL, then log in as the admin user. Choose ‘Data Sources’ from the menu. Then, click ‘Add new’ in the top bar.

Fill in the configuration details for the InfluxDB data source:

You can now import the dashboard.json file by opening the dashboard dropdown menu and clicking ‘Import’:

Note: Check my GitHub for more interactive & beautiful Grafana dashboards.

Exploring Swarm & Container Overview Dashboard in Grafana

In my previous post, you learnt how to monitor your Swarm cluster with the TICK Stack. In this part, I will show you how to use the same stack, but with Grafana instead of Chronograf as our visualization and exploration tool.

Connect to your manager node via SSH, and clone the following GitHub repository:

Use the docker-compose.yml below to set up the monitoring stack:

Then, issue the following command to deploy the stack:

Once deployed, you should see the list of services running on the cluster:

Point your browser to http://IP:3000 and you should be able to reach the Grafana dashboard:

The default username & password are both admin. Go ahead and log in.

Go to “Data Sources” and create 2 InfluxDB data sources:

  • Vms: pointing to your Cluster Nodes metrics database.
  • Docker: pointing to your Docker Services metrics database.

Finally, import the dashboard by hitting the “import” button:

From here, you can upload the dashboard.json, then pick the data sources you created earlier:

You will end up with an interactive and dynamic dashboard:

Monitor your Infrastructure with TIG Stack

In this tutorial, I will show you how to set up a monitoring stack for your infrastructure, so you can collect data from your servers, Docker containers 🐋, and other kinds of network devices 📶 and analyze it for trends or problems.

Note: All code is available on my GitHub. 😎

1 – How it works?

1.1 – Telegraf

Data collector written in Go for collecting, processing, aggregating, and writing metrics. It’s a plugin-driven tool; we will use a few plugins while implementing our use case.

1.2 – InfluxDB

Scalable time series database for metrics, events and real-time analytics.

1.3 – Grafana

Data visualization and exploration tool. It lets you create graphs and dashboards based on data from various data sources (InfluxDB, Prometheus, Elasticsearch, Cloudwatch …)

2 – Setup

Clone the repository:

To start all of these containers I’m using docker-compose:

The docker-compose file brings up 3 containers:

1 – Influxdb:

Due to the ephemeral nature of containers, we exposed the InfluxDB data folder to our host system, so our data won’t disappear if the container restarts or is stopped.

The port mapping exposes the following ports:

  • 8083: this is the administration web server’s port; you can open the admin page at http://localhost:8083
  • 8086: this is the HTTP API endpoint port; Telegraf uses it to write data to InfluxDB, and clients use it to send queries
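As an illustration of the HTTP API on port 8086, the Python sketch below builds the URL for a query request without sending it. It assumes an InfluxDB 1.x style /query endpoint taking db and q parameters, and uses vm_metrics as an example database name.

```python
from urllib.parse import urlencode


def query_url(host: str, database: str, influxql: str, port: int = 8086) -> str:
    """Build the URL for an InfluxQL query against the HTTP API (nothing is sent)."""
    params = urlencode({"db": database, "q": influxql})
    return f"http://{host}:{port}/query?{params}"


# Example only: host and database name are assumptions for this sketch.
print(query_url("localhost", "vm_metrics", "SHOW MEASUREMENTS"))
```

Opening the resulting URL in a browser or with an HTTP client is a quick way to check that the port mapping works before wiring up Grafana.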

2 – Grafana

The port 3000 is the default web server port.

We used Docker’s link feature to link the Grafana container with our InfluxDB container, so Grafana can connect to InfluxDB and query data from it.

3 – Telegraf

Telegraf collects metrics from “input” plugins, parses them into the correct format, then sends them to “output” plugins. There are a lot of input and output plugins; you just have to activate them in the Telegraf configuration file:

Here I’m using the Docker input plugin to fetch all the stats from the Docker daemon (resource usage per container) and the System input plugin to pull server metrics (Disk, CPU, RAM …)
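The input/output plugin idea can be pictured with a toy Python pipeline. This is not Telegraf's implementation (Telegraf is written in Go); the function names and the metric values below are invented purely to show the flow from inputs to outputs.

```python
from typing import Callable, Dict, List

Metric = Dict[str, object]


def cpu_input() -> List[Metric]:
    """Stand-in for a system input plugin (values are hard-coded for the sketch)."""
    return [{"name": "cpu", "tags": {"host": "server-1"}, "fields": {"usage_idle": 97.5}}]


def docker_input() -> List[Metric]:
    """Stand-in for a Docker input plugin."""
    return [{"name": "docker_container_mem", "tags": {"container_name": "grafana"},
             "fields": {"usage_percent": 3.2}}]


def run_once(inputs: List[Callable[[], List[Metric]]],
             outputs: List[Callable[[List[Metric]], None]]) -> int:
    """Gather metrics from every input, then hand the whole batch to every output."""
    batch: List[Metric] = []
    for gather in inputs:
        batch.extend(gather())
    for write in outputs:
        write(batch)
    return len(batch)


stored: List[Metric] = []          # plays the role of the database behind an output plugin
count = run_once([cpu_input, docker_input], [stored.extend])
print(count, [metric["name"] for metric in stored])
```

Enabling a plugin in the configuration file amounts to adding another entry to the inputs or outputs list in this picture.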

To start all of these services, we will use docker-compose:

If you type “docker ps”, you should see the TIG containers:

3 – Configure

Point your browser to http://SERVER_IP:3000 and you should see the Grafana dashboard:

The default credential is admin with password admin. You will want to change this as soon as you can.

Now we need to create an Influxdb datasource pointing to the InfluxDB container.

3.1 – VM Data Source

We configure Grafana to pull data from vm_metrics database:

3.2 – Docker Data Source

Then we create another data source to fetch data from docker_metrics database.
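The same two data sources can also be described as data rather than clicked together. The Python sketch below builds JSON payloads in the shape accepted by Grafana's data source HTTP API (POST /api/datasources); it only prints them and sends nothing. The data source names and the URL http://influxdb:8086 are assumptions based on the linked container setup.

```python
import json


def influxdb_datasource(name: str, database: str,
                        url: str = "http://influxdb:8086") -> dict:
    """Describe one InfluxDB data source as a Grafana API payload."""
    return {
        "name": name,
        "type": "influxdb",
        "access": "proxy",      # Grafana's backend talks to InfluxDB on the browser's behalf
        "url": url,
        "database": database,
    }


# Hypothetical names; the database names match the ones used in this section.
payloads = [
    influxdb_datasource("vm", "vm_metrics"),
    influxdb_datasource("docker", "docker_metrics"),
]
print(json.dumps(payloads, indent=2))
```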

Once that is completed, you are ready to start creating dashboards.

4 – Dashboards

On the top left menu, click on “Add a new Dashboard” then click on “Add a panel”:

4.1 – VM

4.1.1 – Memory

4.1.2 – Disk

4.1.3 – CPU

4.1.4 – Network

All graphs combined:

4.2 – Docker

4.2.1 – Create Container Filter

In order to filter our data by container name, we will use a concept in Grafana called Templating which makes our Dashboard more interactive and dynamic. Therefore we won’t hard-code the name of the container in the metric query but instead we will use a variable.
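Conceptually, templating is string substitution applied to the query before it is sent. The Python sketch below mimics that idea; it is not how Grafana implements it. The variable name container, the measurement docker_container_mem, the field usage_percent and the tag container_name are assumptions for illustration.

```python
import re


def render_query(template: str, variables: dict) -> str:
    """Replace $name placeholders with their values; unknown placeholders are left as-is."""
    def replace(match: "re.Match") -> str:
        name = match.group(1)
        return variables.get(name, match.group(0))

    return re.sub(r"\$(\w+)", replace, template)


QUERY = (
    'SELECT mean("usage_percent") FROM "docker_container_mem" '
    'WHERE "container_name" =~ /^$container$/ AND $timeFilter'
)

# Picking another container only changes the variable, not the query text.
print(render_query(QUERY, {"container": "influxdb"}))
print(render_query(QUERY, {"container": "grafana"}))
```

Placeholders that are not in the variables dict, such as $timeFilter here, pass through untouched, so other macros keep working.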

To create a variable, click on the settings icon ⚒ then “Templating”:

Click on “New” and fill the fields as described below:

Once created, the variable is shown as a dropdown select box at the top of the dashboard. This dropdown makes it easy to change the data being displayed in your dashboard.

Now that our filter is created, we can create our first graph:

4.2.2 – Memory

Here is a screenshot of the result: