Channel: Ganesh Sharma | virtualgyaan

Part 1 : Introduction to Elasticsearch, Logstash and Kibana or ELK Stack


Overview:
In this series of posts on ELK we will cover a basic introduction to ELK, the concepts around it, and the types of deployment it supports. We will also explore strategies for deploying ELK. In later parts we will walk through a VM-based ELK deployment and a cloud deployment on Microsoft Azure AKS, which uses a Kubernetes operator at the backend to deploy ELK. This post covers the introduction and the basic deployment strategies.

Elasticsearch, Logstash, Kibana and Beats make up the ELK stack

ELK is one of the most popular monitoring solutions, used to store, monitor and analyze both metrics and logs from a variety of cloud and on-premises resources. Broadly, the ELK Stack comprises Elasticsearch, Logstash, Beats and Kibana. The basic functionality of each of these components is as follows:
Elasticsearch is the heart of the solution: a distributed, JSON-based search and analytics engine. It acts as the central repository for storing data and lets us query any details from it.
This data can be logs, metrics, application performance data and much more.
Logstash is an open source data processing pipeline that ingests, transforms, and ships your data, regardless of format or complexity, to Elasticsearch, where it is stored. Its main function is to parse and transform data before it is stashed in Elasticsearch. Logstash can also be extended with plugins.
Beats are lightweight agents that ship data from various sources to Elasticsearch, either directly or through Logstash. The types of Beats are as follows:
Filebeat for collecting logs and related data
Metricbeat for collecting metric data like CPU, Memory and other metrics
Packetbeat for collecting network data
Winlogbeat for collecting Windows event logs
Auditbeat for collecting audit data
Heartbeat for uptime monitoring of different components
Functionbeat for collecting data from cloud services
Kibana is the UI, or dashboard component, of the ELK Stack. It helps us configure the Elasticsearch cluster, its indices and shards, and perform many other operations on Elasticsearch. Kibana has hundreds of features and not all of them are discussed here. Some of the most important Kibana functions are as follows:
Create Visualizations by dragging and dropping the appropriate fields
Create Dashboards which can be a combination of number of Visualizations
Query the cluster using CRUD style APIs
Create Alerts to trigger specific actions
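To make the "CRUD style APIs" point concrete, here is a minimal sketch of the requests for one create, read, update and delete cycle on a single document. It only builds the method, path and JSON body and sends nothing to a cluster. The index name `app-logs`, the document fields and the helper name `crud_requests` are assumptions made up for this illustration.

```python
import json


def crud_requests(index, doc_id, document, changes):
    """Return (method, path, body) tuples for a create/read/update/delete cycle.

    Nothing is sent over the network; this only shows the shape of the calls.
    """
    doc_path = f"/{index}/_doc/{doc_id}"
    return [
        ("PUT", doc_path, json.dumps(document)),                 # create (or replace) the document
        ("GET", doc_path, None),                                 # read it back
        ("POST", f"/{index}/_update/{doc_id}",
         json.dumps({"doc": changes})),                          # partial update
        ("DELETE", doc_path, None),                              # delete it
    ]


# Hypothetical log document, for illustration only.
sample = {"level": "INFO", "message": "service started"}
for method, path, body in crud_requests("app-logs", "1", sample, {"level": "WARN"}):
    print(method, path, body or "")
```

The same method and path pairs can be typed into the Kibana Dev Tools console or sent with any HTTP client.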

Concepts:
Let us take a look at some of the most important concepts that we need to grasp before deploying ELK.
Two of the most important node roles in Elasticsearch are:
> Master-eligible nodes: These nodes can be elected as master. The elected master manages the cluster state: it keeps track of which nodes are in the cluster, where each shard is allocated, index metadata etc. The cluster state is replicated to the other master-eligible nodes.
> Data nodes: These nodes store the data shipped to Elasticsearch, for example by the different types of Beats. They are responsible for indexing and searching that data, which involves heavy read/write I/O operations.

How is Data stored in Elasticsearch?
> Data inside Elasticsearch is stored in Elasticsearch indices
> An index may contain a single shard or multiple shards
> Each shard is a single instance of a Lucene index
> Data is written to shards as immutable Lucene segments on disk, after which it is available for querying
> The replicas of a shard always reside on different data nodes from the primary and from each other
In case of failure of one of the nodes, the master node promotes an existing replica shard to primary, and a new replica shard is then started on a different data node.
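The promotion behaviour described above can be sketched as a toy model. This is not how Elasticsearch implements shard allocation. It only mirrors the two rules from the text: copies of a shard never share a node, and a surviving replica takes over when the primary is lost. The node names and both function names are hypothetical.

```python
def pick_replica_node(shard, live_nodes):
    """Return a live node that holds no copy of this shard, or None."""
    used = {shard["primary"], *shard["replicas"]}
    for node in live_nodes:
        if node not in used:
            return node
    return None


def handle_node_failure(shard, failed_node, live_nodes):
    """Promote a replica if the primary was lost, then restore the replica count."""
    wanted = len(shard["replicas"])
    replicas = [n for n in shard["replicas"] if n != failed_node]
    primary = shard["primary"]
    if primary == failed_node:
        if not replicas:
            # No surviving copy: the shard stays unassigned.
            return {"primary": None, "replicas": []}
        primary = replicas.pop(0)  # promote an existing replica
    new_shard = {"primary": primary, "replicas": replicas}
    while len(new_shard["replicas"]) < wanted:
        node = pick_replica_node(new_shard, live_nodes)
        if node is None:
            break  # not enough nodes to host another copy
        new_shard["replicas"].append(node)
    return new_shard


shard = {"primary": "node-1", "replicas": ["node-2"]}
print(handle_node_failure(shard, "node-1", ["node-2", "node-3"]))
# {'primary': 'node-2', 'replicas': ['node-3']}
```

With only one node left, the shard keeps its promoted primary but has nowhere to place a replica.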

A bit about Lucene and its segments:
Lucene is the open source search library from the Apache Software Foundation that powers Elasticsearch. A small part of a Lucene index is referred to as a segment.
We can think of segments as the basic building blocks of a Lucene index. Lucene searches these segments in sequence, so it is better to have a small number of segments per Lucene index to improve performance.
Multiple such segments make up a single Lucene index, which is referred to as an Elasticsearch shard. These shards in turn make up an Elasticsearch index.
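The index, shard and segment hierarchy can be pictured with a small toy model. It is a sketch for intuition only: the class names are made up, documents are plain strings, and a substring match stands in for Lucene's real inverted index. It shows segments being immutable once written, a search visiting every segment, and a merge cutting the number of segments to visit.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Segment:
    docs: tuple  # immutable once written


@dataclass
class Shard:
    segments: list = field(default_factory=list)

    def flush(self, docs):
        """Write a batch of documents as a new immutable segment."""
        self.segments.append(Segment(tuple(docs)))

    def search(self, term):
        """Visit every segment in sequence and collect matching documents."""
        return [d for seg in self.segments for d in seg.docs if term in d]

    def merge(self):
        """Combine all segments into one, so searches visit fewer segments."""
        merged = tuple(d for seg in self.segments for d in seg.docs)
        self.segments = [Segment(merged)]


@dataclass
class Index:
    shards: list

    def search(self, term):
        """An index-level search fans out to every shard."""
        return [hit for shard in self.shards for hit in shard.search(term)]


shard_a, shard_b = Shard(), Shard()
shard_a.flush(["error disk full", "info started"])
shard_a.flush(["error timeout"])
shard_b.flush(["info stopped"])
index = Index([shard_a, shard_b])
print(index.search("error"), len(shard_a.segments))
shard_a.merge()
print(index.search("error"), len(shard_a.segments))
```

The search results are the same before and after the merge; only the number of segments visited changes.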

Designing using best practices:
The ELK stack can be deployed in a number of ways, and the right choice really depends on the use case at hand. Here we list some popular best practices that need to be considered for a production-ready environment:
> Limit shard size to about 50GB as a general guideline
> Time based indices can be helpful in managing and retaining indices
> In Elasticsearch 7.x and later, each index is created by default with one primary shard and one replica shard
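As a rough sketch of how these practices fit together, the snippet below derives a daily index name and estimates how many primary shards an index needs to stay within the 50GB guideline. The `logs` prefix, the 120GB daily volume and the helper names are assumptions for illustration. `number_of_shards` and `number_of_replicas` are the standard index settings.

```python
import math
from datetime import date

MAX_SHARD_GB = 50  # upper bound from the shard size guideline


def daily_index_name(prefix, day):
    """Build a time-based index name such as logs-2024.01.15."""
    return f"{prefix}-{day:%Y.%m.%d}"


def primary_shards_needed(expected_index_gb, max_shard_gb=MAX_SHARD_GB):
    """Smallest primary shard count that keeps each shard within the limit."""
    return max(1, math.ceil(expected_index_gb / max_shard_gb))


def index_settings(expected_index_gb, replicas=1):
    """Settings body that could be supplied when creating the index."""
    return {
        "settings": {
            "number_of_shards": primary_shards_needed(expected_index_gb),
            "number_of_replicas": replicas,
        }
    }


# Hypothetical workload: roughly 120GB of logs per day.
print(daily_index_name("logs", date(2024, 1, 15)))  # logs-2024.01.15
print(index_settings(120))
```

With daily indices, retention becomes a matter of deleting whole indices older than the retention window instead of deleting individual documents.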

The state of the indices determines the state of the cluster:
Green – All primary and replica shards are allocated
Yellow – All primary shards are allocated, but not all replica shards are
Red – At least one primary shard is missing and there is no replica shard to promote; the affected data is unavailable and may be lost
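The three states can be expressed as a small decision function. This is a simplified sketch, not the cluster health API itself. The dictionary layout and the index names are hypothetical, and the cluster is assumed to take the worst state found among its indices.

```python
def index_health(shards):
    """Derive an index state from its shards.

    Each shard is a dict with the keys 'primary_assigned' (bool),
    'replicas_wanted' (int) and 'replicas_assigned' (int).
    """
    if any(not s["primary_assigned"] for s in shards):
        return "red"
    if any(s["replicas_assigned"] < s["replicas_wanted"] for s in shards):
        return "yellow"
    return "green"


def cluster_health(indices):
    """The cluster is as healthy as its least healthy index."""
    severity = ["green", "yellow", "red"]
    return max((index_health(shards) for shards in indices.values()),
               key=severity.index, default="green")


healthy = {"primary_assigned": True, "replicas_wanted": 1, "replicas_assigned": 1}
no_replica = {"primary_assigned": True, "replicas_wanted": 1, "replicas_assigned": 0}

print(cluster_health({"logs-a": [healthy], "logs-b": [healthy, no_replica]}))  # yellow
```

A single-node cluster running with the default of one replica reports yellow for this reason: the replica cannot be placed on the same node as its primary.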

In the next part we will see how to deploy an enterprise-ready ELK stack on a set of Virtual Machines and collect both logs and metrics by installing Filebeat and Metricbeat on them.

The post Part 1 : Introduction to Elasticsearch, Logstash and Kibana or ELK Stack first appeared on virtualgyaan.
