TSR – The Server Room Show – Shownotes – Episode 42 – Analytics and Interactive Visualization Solutions

Intro

While preparing this article/episode for today, I came across a dilemma which I could summarize as:

Most monitoring software is not so great at visually presenting the metrics/data it acquires, but some analytics and visualization solutions make a near-perfect monitoring solution.

Viktor Madarasz – while preparing this article for this episode

What I am trying to say is that monitoring software like the tools we discussed in previous episodes (Nagios, Zabbix and OpenNMS) does not excel at visualizing the acquired metrics and data in the most beautiful form possible. That is why we end up coupling a monitoring tool like OpenNMS with Grafana (an analytics and visualization tool I will talk about today) to achieve what we want. Surprisingly enough, some of these analytics and visualization layers/tools are getting better and better at including functions from monitoring software, such as alarms.

Therefore I had a hard time drawing the line, for some of these tools and for many others which nearly made it onto the list, between where data visualization and analytics software ends and where monitoring software begins. This line seems fuzzier each time I look at it.

For the moment, monitoring software sits more on the monitoring and alarm-handling end of the spectrum and less on the presentation and visualization of the acquired metrics/data, while analytics and visualization tools are increasingly becoming hybrids that try to exist in both worlds.

Grafana
Out of the Box experience ….

Grafana is a multi-platform open source analytics and interactive visualization web application. It provides charts, graphs, and alerts for the web when connected to supported data sources. It is expandable through a plug-in system. End users can create complex monitoring dashboards using interactive query builders.

As a visualization tool, Grafana is a popular component in monitoring stacks often used in combination with time series databases such as InfluxDB, Prometheus and Graphite; monitoring platforms such as Sensu, Icinga, Zabbix, Netdata, and PRTG; SIEMs (security information and event management) such as Elasticsearch and Splunk; and other data sources.

What is a time series database?

A time series database (TSDB) is a software system that is optimized for storing and serving time series through associated pairs of time(s) and value(s). In some fields, time series may be called profiles, curves, traces or trends. Several early time series databases are associated with industrial applications which could efficiently store measured values from sensory equipment (also referred to as data historians), but now are used in support of a much wider range of applications.

In many cases, the repositories of time-series data will utilize compression algorithms to manage the data efficiently. Although it is possible to store time-series data in many different database types, the design of these systems with time as a key index is distinctly different from relational databases which reduce discrete relationships through referential models.

A time series database typically separates the set of fixed, discrete characteristics from its dynamic, continuous values into sets of points or ‘tags.’ An example is the storage of CPU utilization for performance monitoring: the fixed characteristics would include the name ‘CPU Utilization’, the units of measure ‘%’ and a range ‘0 to 1’; and the dynamic values would store the utilization percentage and a timestamp. The separation is intended to efficiently store and index data for application purposes which can search through the set of points differently than the time-indexed values.
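
The split between fixed characteristics and time-indexed values can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular TSDB stores its data; the class name Series, the method names and the host tag are made up for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Series:
    """One time series: fixed descriptive tags plus time-indexed values."""
    name: str                      # fixed: what is being measured
    tags: Dict[str, str]           # fixed, discrete characteristics
    samples: List[Tuple[int, float]] = field(default_factory=list)  # (unix_ts, value)

    def append(self, timestamp: int, value: float) -> None:
        """Store one dynamic value together with its timestamp."""
        self.samples.append((timestamp, value))

    def between(self, start: int, end: int) -> List[Tuple[int, float]]:
        """Return the samples with start <= timestamp < end."""
        return [(t, v) for t, v in self.samples if start <= t < end]


# The CPU example from the text; the "host" tag is an assumption for illustration.
cpu = Series(
    name="CPU Utilization",
    tags={"unit": "%", "range": "0 to 1", "host": "localhost"},
)
cpu.append(1_600_000_000, 0.12)
cpu.append(1_600_000_010, 0.48)
cpu.append(1_600_000_020, 0.31)

recent = cpu.between(1_600_000_010, 1_600_000_030)
print(recent)
```

The tags are written once per series, while every new measurement only adds a small (timestamp, value) pair.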

The databases vary significantly in their features, but most will enable features to create, read, update and delete the time-value pairs as well as the points to which they are associated. Additional features for calculations, interpolation, filtering, and analysis are commonly found, but are not commonly equivalent.
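
As a rough idea of what an interpolation feature does, here is a minimal Python sketch of linear interpolation between stored time-value pairs. The function name interpolate and the sample data are made up for this example; real databases each implement this in their own way and with their own query syntax.

```python
from typing import List, Optional, Tuple


def interpolate(samples: List[Tuple[int, float]], ts: int) -> Optional[float]:
    """Linearly interpolate a value at `ts`.

    `samples` must be (timestamp, value) pairs sorted by timestamp.
    Returns None when `ts` lies outside the stored time range.
    """
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= ts <= t1:
            if t1 == t0:
                return v0  # duplicate timestamp, nothing to interpolate
            return v0 + (v1 - v0) * (ts - t0) / (t1 - t0)
    return None


samples = [(0, 10.0), (10, 20.0), (20, 40.0)]
print(interpolate(samples, 5))
print(interpolate(samples, 15))
print(interpolate(samples, 25))
```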

In the example below I used Grafana + InfluxDB + Telegraf, also known as the TIG stack (Telegraf, InfluxDB and Grafana), to monitor the localhost for basic metrics, as seen in the screenshot.

Grafana is an open source data visualization and monitoring suite. It offers support for Graphite, Elasticsearch, Prometheus, InfluxDB, and many more databases. The tool provides a beautiful dashboard and metric analytics, with the ability to create and manage your own dashboards for monitoring the performance of your apps or infrastructure.

Telegraf is an agent for collecting, processing, aggregating, and writing metrics. It supports various output plugins such as InfluxDB, Graphite, Kafka, OpenTSDB, etc.

InfluxDB is an open-source time series database written in Go. It is optimized for fast, high-availability storage and is used as a data store for any use case involving large amounts of time-stamped data, including DevOps monitoring, log data, application metrics, IoT sensor data, and real-time analytics.
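
To give a feel for what travels between Telegraf and InfluxDB, the sketch below builds a single data point in InfluxDB's text-based line protocol (measurement, optional tags, fields, timestamp). It is a simplified illustration: the helper names are made up, the escaping rules are only partly covered, and the metric values are invented.

```python
def _escape(text: str, chars: str) -> str:
    """Backslash-escape every character of `chars` that occurs in `text`."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def to_line_protocol(measurement: str, tags: dict, fields: dict, timestamp_ns: int) -> str:
    """Render one point as: measurement,tag=val field=val timestamp."""
    measurement_part = _escape(measurement, ", ")
    # Tags are sorted by key so the output is stable.
    tag_part = "".join(
        f",{_escape(k, ',= ')}={_escape(v, ',= ')}" for k, v in sorted(tags.items())
    )
    field_items = []
    for key, value in fields.items():
        if isinstance(value, bool):          # check bool before int (bool is an int subclass)
            rendered = "true" if value else "false"
        elif isinstance(value, int):
            rendered = f"{value}i"           # integers carry an "i" suffix
        elif isinstance(value, float):
            rendered = repr(value)
        else:                                # everything else becomes a quoted string
            rendered = '"' + str(value).replace("\\", "\\\\").replace('"', '\\"') + '"'
        field_items.append(f"{_escape(key, ',= ')}={rendered}")
    return f"{measurement_part}{tag_part} {','.join(field_items)} {timestamp_ns}"


# Illustrative values only, not real measurements.
line = to_line_protocol(
    "cpu",
    {"host": "localhost", "cpu": "cpu-total"},
    {"usage_idle": 97.5, "procs": 4},
    1600000000000000000,
)
print(line)
```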

TIG Stack Monitoring the Localhost’s Basic Metrics
Kibana
Kibana + Elasticsearch showing Sample Data Out of the box…

Kibana is similar in many ways to Grafana, but there is one key difference when it comes to data sources: it can only work with Elasticsearch. This can be a deal breaker for many who wish to work with data sources other than Elasticsearch.

Grafana is designed for analyzing and visualizing metrics such as system CPU, memory, disk and I/O utilization. Grafana does not allow full-text data querying. Kibana, on the other hand, runs on top of Elasticsearch and is used primarily for analyzing log messages.

Kibana is an open source data visualization dashboard for Elasticsearch. It provides visualization capabilities on top of the content indexed on an Elasticsearch cluster. Users can create bar, line and scatter plots, or pie charts and maps on top of large volumes of data.

Kibana also provides a presentation tool, referred to as Canvas, that allows users to create slide decks that pull live data directly from Elasticsearch.

What is Elasticsearch?

Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java. Following an open-core business model, parts of the software are licensed under various open-source licenses (mostly the Apache License) while other parts fall under the proprietary (source-available) Elastic License.
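
To make "JSON over HTTP" a bit more concrete, the following Python sketch only builds the pieces of a full-text search request (method, path and JSON body) and sends nothing over the network. The index name app-logs and the field name message are hypothetical; the _search endpoint and the match query are part of Elasticsearch's standard query interface.

```python
import json
from typing import Tuple


def build_search_request(index: str, field: str, text: str, size: int = 10) -> Tuple[str, str, str]:
    """Return (HTTP method, URL path, JSON body) for a simple full-text match query."""
    path = f"/{index}/_search"
    body = {
        "query": {"match": {field: text}},  # full-text match on one field
        "size": size,                        # maximum number of hits to return
    }
    return "POST", path, json.dumps(body)


# Hypothetical index and field names, for illustration only.
method, path, payload = build_search_request("app-logs", "message", "timeout")
print(method, path)
print(payload)
```

Any HTTP client in any language can send such a request, which is exactly the point of the JSON-over-HTTP interface.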

Shay Banon created the precursor to Elasticsearch, called Compass, in 2004. While thinking about the third version of Compass he realized that it would be necessary to rewrite big parts of Compass to “create a scalable search solution”. So he created “a solution built from the ground up to be distributed” and used a common interface, JSON over HTTP, suitable for programming languages other than Java as well. Shay Banon released the first version of Elasticsearch in February 2010.

Features of Elasticsearch

Elasticsearch can be used to search all kinds of documents. It provides scalable search, has near real-time search, and supports multitenancy. “Elasticsearch is distributed, which means that indices can be divided into shards and each shard can have zero or more replicas. Each node hosts one or more shards, and acts as a coordinator to delegate operations to the correct shard(s). Rebalancing and routing are done automatically”. Related data is often stored in the same index, which consists of one or more primary shards, and zero or more replica shards. Once an index has been created, the number of primary shards cannot be changed.
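
Why can the number of primary shards not be changed afterwards? Roughly speaking, a document is assigned to a shard by hashing its routing value (by default its ID) and taking the result modulo the number of primary shards. The Python sketch below illustrates the idea with zlib.crc32 as a stand-in hash, which is an assumption for this example and not the hash function Elasticsearch actually uses; pick_shard is a made-up name.

```python
import zlib


def pick_shard(doc_id: str, num_primary_shards: int) -> int:
    """Map a document ID to a shard number (stand-in hash, for illustration only)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


ids = [f"doc-{i}" for i in range(1000)]

# Where each document lives with 5 primary shards versus 6.
with_five = {d: pick_shard(d, 5) for d in ids}
with_six = {d: pick_shard(d, 6) for d in ids}

# Documents that would be looked up on the wrong shard after the change.
moved = sum(1 for d in ids if with_five[d] != with_six[d])
print(f"{moved} of {len(ids)} documents would map to a different shard")
```

Since changing the shard count changes where most documents are expected to live, the existing data would have to be reindexed.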

Elasticsearch is developed alongside a data collection and log-parsing engine called Logstash, an analytics and visualisation platform called Kibana, and Beats, a collection of lightweight data shippers. The four products are designed for use as an integrated solution, referred to as the “Elastic Stack” (formerly the “ELK stack”).

Elasticsearch uses Lucene (a free and open source search engine from Apache Software Foundation) and tries to make all its features available through the JSON and Java API. It supports faceting and percolating, which can be useful for notifying when new documents match registered queries. Another feature is called “gateway” and handles the long-term persistence of the index; for example, an index can be recovered from the gateway in the event of a server crash. Elasticsearch supports real-time GET requests, which makes it suitable as a NoSQL datastore, but it lacks distributed transactions.

On 20 May 2019, Elastic made the core security features of the Elastic Stack available free of charge, including TLS for encrypted communications, file and native realm for creating and managing users, and role-based access control for controlling user access to cluster APIs and indexes. The corresponding source code is available under the “Elastic License”, a source-available license. In addition, Elasticsearch now offers SIEM (Security Information and Event Management) and Machine Learning as part of its offered services.

————————————————————————————————————————————————————————————————————————————————————–

The combination of Elasticsearch, Logstash, and Kibana, referred to as the “Elastic Stack” (formerly the “ELK stack”), is available as a product or service. Logstash provides an input stream to Elasticsearch for storage and search, and Kibana accesses the data for visualizations such as dashboards. Elastic also provides “Beats” packages which can be configured to provide pre-made Kibana visualizations and dashboards about various database and application technologies.

Grafana Loki

Loki is a horizontally-scalable, highly-available, multi-tenant log aggregation system inspired by Prometheus. It is designed to be very cost effective and easy to operate. It does not index the contents of the logs, but rather a set of labels for each log stream.
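
The idea of indexing only labels and not the log contents can be illustrated with a toy Python model. This is not Loki's implementation or API; the class and method names are made up. Log lines are grouped into streams by their label set, a query first selects streams by label, and only then scans the lines of those streams for a substring.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

Labels = FrozenSet[Tuple[str, str]]


class LabelIndexedLogs:
    """Toy log store: only the label sets are indexed, the log text is not."""

    def __init__(self) -> None:
        self.streams: Dict[Labels, List[str]] = defaultdict(list)

    def push(self, labels: Dict[str, str], line: str) -> None:
        """Append a log line to the stream identified by its label set."""
        self.streams[frozenset(labels.items())].append(line)

    def query(self, selector: Dict[str, str], contains: str = "") -> List[str]:
        """Select streams by label, then scan their lines for a substring."""
        wanted = set(selector.items())
        hits: List[str] = []
        for labels, lines in self.streams.items():
            if wanted <= labels:  # cheap lookup on labels only
                hits.extend(line for line in lines if contains in line)
        return hits


store = LabelIndexedLogs()
store.push({"app": "nginx", "env": "prod"}, "GET /index.html 200")
store.push({"app": "nginx", "env": "prod"}, "GET /missing 500")
store.push({"app": "api", "env": "prod"}, "POST /login 500")

errors = store.query({"app": "nginx"}, contains="500")
print(errors)
```

The index stays small because it grows with the number of distinct label sets, not with the volume of log text.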

Loki is one of the available data sources in Grafana.

Loki as a Data Source Option under Grafana

In certain scenarios, Grafana’s Loki can be an alternative to Elasticsearch that slots into existing workflows.

Graphite
Graphite running in a Docker instance, exposed on port 80

Graphite is a free open-source software (FOSS) tool that monitors and graphs numeric time-series data such as the performance of computer systems. Graphite was developed by Orbitz Worldwide, Inc and released as open-source software in 2008.

Graphite collects, stores, and displays time-series data in real time.

The tool has three main components:

Carbon - a Twisted daemon that listens for time-series data
Whisper - a simple database library for storing time-series data (similar in design to RRD)
Graphite webapp - a Django webapp that renders graphs on demand using the Cairo library.
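
Feeding data to Carbon is simple: its plaintext protocol expects one line per data point in the form "metric.path value timestamp". Below is a minimal Python sketch; the metric path and value are made up, and the host and port (2003 is Carbon's default plaintext port) are assumptions about your setup. The send function is defined but not called, so the snippet runs without a Graphite instance.

```python
import socket
import time
from typing import Optional


def carbon_line(path: str, value: float, timestamp: Optional[int] = None) -> str:
    """Format one data point for Carbon's plaintext protocol."""
    if timestamp is None:
        timestamp = int(time.time())  # default to "now" in Unix seconds
    return f"{path} {value} {timestamp}\n"


def send_metric(line: str, host: str = "localhost", port: int = 2003) -> None:
    """Send one formatted line to a Carbon daemon (host and port are assumptions)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(line.encode("ascii"))


# Hypothetical metric name and value, for illustration only.
line = carbon_line("servers.localhost.cpu.load", 0.42, 1600000000)
print(line, end="")
# send_metric(line)  # uncomment when a Carbon daemon is listening
```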

Graphite is used in production by companies such as Ford Motor Company, Booking.com, GitHub, Etsy, The Washington Post and Electronic Arts.

Links

Grafana Step by Step for beginners:
https://www.youtube.com/watch?v=4qpI4T6_bUw&t=64s

Grafana
https://grafana.com/

Elasticsearch
https://www.elastic.co

Elasticsearch concepts
https://logz.io/blog/10-elasticsearch-concepts/

Kibana
https://www.elastic.co/kibana

Graphite
https://graphiteapp.org/

Grafana Loki
https://www.youtube.com/watch?v=1obKa6UhlkY

How to deploy TIG Stack
https://www.howtoforge.com/tutorial/how-to-install-tig-stack-telegraf-influxdb-and-grafana-on-ubuntu-1804/

Comparing Grafana Kibana Graphite
https://stackshare.io/stackups/grafana-vs-graphite-vs-kibana