Network Infrastructure Weathermap

The main goal of collecting metrics is to store them for long term usage and to create graphs to debug problems or identify trends. However, storing metrics about your system isn’t enough to identity the problem’s & anomalies root cause. It’s necessary to have a high-level overview of your network backbone. Weathermap is perfect for a Network Operations Center (NOC). In this post, I will show you how to build one using Open Source tools only.

Icinga 2 will collect metrics about your backbone, write checks results metrics and performance data to InfluxDB (supported since Icinga 2.5). Visualize these metrics in Grafana in map form.

To get started, add your desired host configuration inside the hosts.conf file:

Note: the city & country attributes will be used to create the weathermap.

To enable the InfluxDBWriter on your Icinga 2 installation, type the following command:

Configure your InfluxDB host and database in /etc/icinga2/features-enabled/influxdb.conf (Learn more about the InfluxDB configuration)

Icinga 2 will forward all your metrics to a icinga2_metrics database. The included host and service templates define a storage, the measurement represents a table by which metrics are grouped with tags certain measurements of certain hosts or services are defined (notice the city & country tags usage).

Don’t forget to restart Icinga 2 after saving your changes:

Once Icinga 2 is up and running it’ll start collecting data and writing them to InfluxDB:

Once our data arrived, it’s time for visualization. Grafana is widely used to generate graphs and dashboards. To create a Weathermap we can use a Grafana plugin called Worldmap Panel. Make sure to install it using grafana-cli tool:

The plugin will be installed into your grafana plugins directory (/var/lib/grafana/plugins):

Restart Grafana, navigate to Grafana web interface and create a new datasource:

Create a new Dashboard:

The Group By clause should be the country code and an alias is needed too. The alias should be in the form $tag_field_name. See the image below for an example of a query:

Under the Worldmap tab, choose the countries option:

Finally, you should see a tile map of the world with circles representing the state of each host.

The field state possible values (0 – OK, 1 – Warning, 2 – Critical, 3 – Unknown/Unreachable)

Note: For lazy people I created a ready to use Dashboard you can import from GitHub.

MySQL Monitoring with Telegraf, InfluxDB & Grafana

This post will walk you through each step of creating interactive, real-time & dynamic dashboard to monitor your MySQL instances using Telegraf, InfluxDB & Grafana.

Start by enabling the MySQL input plugin in /etc/telegraf/telegraf.conf :

Once Telegraf is up and running it’ll start collecting data and writing them to the InfluxDB database:

Finally, point your browser to your Grafana URL, then login as the admin user. Choose ‘Data Sources‘ from the menu. Then, click ‘Add new‘ in the top bar.

Fill in the configuration details for the InfluxDB data source:

You can now import the dashboard.json file by opening the dashboard dropdown menu and click ‘Import‘ :

Note: Check my Github for more interactive & beautiful Grafana dashboards.

Continuous Monitoring with TICK stack

Monitoring your system is required. It helps you detect any issues before they cause any major downtime that effect your customers and damage your business reputation. It helps you also to plan growth based on the real usage of your system. But collecting metrics from different data sources isn’t enough, you need to personalize your monitoring to meet your own business needs and define the right alerts so that any abnormal changes in the system will reported.

In this post, I will show you how to setup a resilient continuous monitoring platform with only open source projects & how to define an event alert to report changes in the system.

Clone the following Github repository:

1 – Terraform & AWS

In the tick-stack/terraform directory, update the variables.tfvars file with your own AWS credentials (make sure you have the right IAM policies) :

Issue the following command to download the AWS provider plugin:

Issue the following command to provision the infrastructure:

2 – Ansible & Docker

Update the inventory file with your instance DNS name:

Then, install the Ansible custom role:

Execute the Ansible Playbook:

Point your browser to http://DNS_NAME:8083, you should see InfluxDB Admin Dashboard:

Now, create an InfluxDB Data Source in Chronograf (http://DNS_NAME:8888):

Create a new Dashboard as follow:

You can create multiple graphs to visualize different types of metrics:

Note: For in depth details on how to create interactive & dynamic dashboards in Chronograf check my previous tutorial.

You need to elaborate on the data collected to do something like alerting. So make sure to enable Kapacitor:

Define a new alert to send a Slack notification if the CPU utilization is higher than 70%.

To test it out, we need to generate some workload. For this case, I used stress:

Stressing the CPU:

After few seconds, you should receive a Slack notification.

GitLab Performance Monitoring with Grafana

Since GitLab v8.4 you can monitor your own instance with InfluxDB & Grafana stack by using the GitLab application performance measuring system called “Gitlab Performance Monitoring“.

GitLab writes metrics to InfluxDB via UDP. Therefore, this must be enabled in /etc/influxdb/influxdb.conf:

Restart your InfluxDB instance. Then, create a database to store GitLab metrics:

Next, go to Gitlab Setings Dashboard and enable InfluxDB Metrics as shown below:

Then, you need to restart GitLab:

Now your GitLab instance should send data to InfluxDB:

Finally, Point your browser to your Grafana URL, then login as the admin user. Choose ‘Data Sources‘ from the menu. Then, click ‘Add new‘ in the top bar.

Fill in the configuration details for the InfluxDB data source:

You can now import the dashboard.json file by opening the dashboard dropdown menu and click ‘Import‘ :

Note: Check my Github for more interactive & beautiful Grafana dashboards.