Sunday 18 October 2020

Elasticsearch basics

 

The ELK stack consists of Elasticsearch, Logstash, and Kibana.

Logstash and Beats are the connectors that bring data into the Elasticsearch cluster. They have inputs for database changes, log changes, etc.

Elasticsearch ingests the data from Logstash/Beats, indexes it, and distributes it across the nodes in its cluster.

It provides an interface for searching the documents and has algorithms to index and score the data.

Kibana provides a nice web UI on top of Elasticsearch for searching the data.


Elasticsearch use cases:

Log analytics

Security analytics and anomaly detection

Marketing based on the collected data

Operational monitoring, such as server health and web app response time


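An Elasticsearch cluster is organized as follows:

1. A cluster consists of nodes.

2. Nodes hold indexes. An index in Elasticsearch is backed by an inverted index, which maps words to the documents that contain them.

3. In older versions, each index was further divided into types for storing documents, such as an order type or a product type. Mapping types have been deprecated since version 6.x and are removed in later versions.

4. Each type (or, in recent versions, the index itself) holds documents.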
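
To make the idea of an inverted index concrete, here is a minimal sketch in Python. It is not how Elasticsearch implements it internally; the sample documents and the build_inverted_index helper are made up for illustration, and real analysis (stemming, stop words, scoring) is left out.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each lowercased word to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


# Hypothetical sample documents, keyed by document id.
docs = {
    1: "blue cotton shirt",
    2: "red cotton dress",
    3: "blue denim jeans",
}

index = build_inverted_index(docs)
print(sorted(index["cotton"]))  # ids of documents containing "cotton"
print(sorted(index["blue"]))    # ids of documents containing "blue"
```

A search for a word then becomes a dictionary lookup instead of a scan over every document.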

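An index is split into shards, and each shard has replicas so that the index stays available when a node fails.

By default, versions before 7.0 create an index with 5 primary shards and one replica for each shard; from 7.0 onward the default is 1 primary shard with one replica.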
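
Conceptually, a document is assigned to a shard by hashing its routing value (the document id by default) and taking the result modulo the number of primary shards. The sketch below illustrates only that idea: pick_shard is a hypothetical helper, CRC32 is a stand-in for the hash Elasticsearch actually uses, and the document ids are made up.

```python
import zlib


def pick_shard(routing_key, num_primary_shards):
    """Return a shard number for a routing key.

    CRC32 is a stand-in hash used only for illustration; it is not
    the hash function Elasticsearch uses internally.
    """
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


num_primary_shards = 5  # the pre-7.0 default mentioned above
doc_ids = ["order-1", "order-2", "order-3", "order-4"]  # hypothetical ids
placement = {doc_id: pick_shard(doc_id, num_primary_shards) for doc_id in doc_ids}

for doc_id, shard in placement.items():
    print(doc_id, "-> shard", shard)
```

The same key always maps to the same shard, which is how a lookup by id can go straight to the right shard.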


Downloading and installing Elasticsearch and Kibana is straightforward and can be done from the Elastic site.

Kibana by default starts on http://localhost:5601/


Sample search queries against the sample data provided by Kibana:

# show me everything

GET kibana_sample_data_ecommerce/_search


# show only documents whose category contains clothing

GET kibana_sample_data_ecommerce/_search
{
  "query": {
    "match": {
      "category": "clothing"
    }
  }
}


# filter on multiple conditions; must_not can be used the same way to exclude matching documents

GET kibana_sample_data_ecommerce/_search
{
  "query": {
    "bool": {
      "must": [
        { "match": { "category": "clothing" } },
        { "match": { "currency": "EUR" } },
        { "range": {
          "taxful_total_price": {
            "gte": 36,
            "lte": 100
          }
        }}
      ]
    }
  }
}
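
If you build such queries from application code, the request body is just nested JSON. Below is a minimal Python sketch that assembles a bool query with both must and must_not clauses. The bool_query helper is hypothetical and not part of any Elasticsearch client; the field names come from the sample data used above.

```python
import json


def bool_query(must=None, must_not=None):
    """Build a bool query body from lists of must and must_not clauses."""
    bool_clause = {}
    if must:
        bool_clause["must"] = list(must)
    if must_not:
        bool_clause["must_not"] = list(must_not)
    return {"query": {"bool": bool_clause}}


# Clothing priced between 36 and 100, excluding orders paid in EUR.
body = bool_query(
    must=[
        {"match": {"category": "clothing"}},
        {"range": {"taxful_total_price": {"gte": 36, "lte": 100}}},
    ],
    must_not=[{"match": {"currency": "EUR"}}],
)

print(json.dumps(body, indent=2))
```

The printed JSON can be pasted under a GET kibana_sample_data_ecommerce/_search line in the Kibana console.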


Kibana also provides a nice UI for running these searches directly.



Completion suggesters:


Elasticsearch provides a completion suggester for typeahead scenarios, where you want to offer suggestions while the user is typing.

A typical data structure for serving such use cases is a trie.

A trie stores data in a tree-like format: each node holds one letter, and each path from the root spells out a word or a prefix.
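
As an illustration of the data structure, here is a small trie with prefix completion in Python. It is a teaching sketch and not the structure Elasticsearch builds internally; the class names and sample words are made up.

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # letter -> child TrieNode
        self.is_word = False  # True if a word ends at this node


class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        """Add a word, creating one node per letter as needed."""
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def complete(self, prefix):
        """Return all stored words that start with prefix, sorted."""
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        results = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)


trie = Trie()
for word in ["shirt", "shoes", "shorts", "skirt"]:
    trie.insert(word)

print(trie.complete("sh"))  # all stored words starting with "sh"
```

Finding the node for a prefix takes one step per typed letter, regardless of how many words are stored, which is what makes the structure a good fit for typeahead.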


Elasticsearch uses an in-memory, trie-like data structure (a finite state transducer) for the completion suggester.

When creating the index, you need to map the suggestion field with type completion.

The completion suggester is a prefix suggester, so by default only entries that start with the typed text are returned.

However, you can supply multiple suggestion inputs for a document when indexing it.

You can also enable the fuzzy option to return results even when the typed text contains typos.

The ranking of suggestions can be tuned, for example based on hits, by maintaining a weight in the documents to be searched.
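
The pieces above can be sketched as three request bodies, written here as Python dictionaries and printed as JSON. The field name suggest, the suggestion name product-suggest, and the sample values are assumptions made for illustration; the typeless mapping layout assumes Elasticsearch 7.x or later.

```python
import json

# Index mapping: the (hypothetical) "suggest" field is of type completion.
mapping = {
    "mappings": {
        "properties": {
            "suggest": {"type": "completion"}
        }
    }
}

# A document with multiple suggestion inputs and a weight used for ranking.
document = {
    "suggest": {
        "input": ["Running Shoes", "Shoes for running"],
        "weight": 10,
    }
}

# A suggest request: prefix lookup with the fuzzy option to tolerate a typo.
suggest_request = {
    "suggest": {
        "product-suggest": {
            "prefix": "runing",
            "completion": {
                "field": "suggest",
                "fuzzy": {"fuzziness": 1},
            },
        }
    }
}

for body in (mapping, document, suggest_request):
    print(json.dumps(body, indent=2))
```

The mapping would be sent when creating the index, the document when indexing, and the suggest request to the index's _search endpoint.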



