Question 1. What Is An Index In Elasticsearch ?
An index is similar to a table in a relational database. The difference is that a relational database always stores the actual values, whereas storing them is optional in Elasticsearch. An index can store the actual values, the analyzed values, or both.
Question 2. What Is A Document In Elasticsearch ?
A document is similar to a row in a relational database. The difference is that each document in an index can have a different structure (set of fields), but fields that documents have in common should use the same data type.
A field can hold multiple values in a document (an array of values of the same type). Fields can contain other documents too.
Question 3. Does Elasticsearch Have A Schema ?
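Yes, Elasticsearch can have mappings, which can be used to enforce a schema on documents. If no mapping is defined for a field, Elasticsearch infers one from the first value it sees (dynamic mapping).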
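As a rough illustration, a mapping is just a JSON body that names each field and its data type. The sketch below builds one as a Python dictionary, assuming the typeless mapping format used by Elasticsearch 7 and later. The field names (title, author, published, views) and the helper field_types are hypothetical and exist only for this example.

```python
import json

# Hypothetical mapping for an index of blog posts; field names are illustrative.
mapping_body = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},       # analyzed full-text field
            "author": {"type": "keyword"},   # exact-value field, not analyzed
            "published": {"type": "date"},
            "views": {"type": "integer"},
        }
    }
}


def field_types(body):
    """Return a {field name: data type} view of a mapping body."""
    properties = body["mappings"]["properties"]
    return {name: spec["type"] for name, spec in properties.items()}


# The JSON text printed here is what would be sent when creating the index.
print(json.dumps(mapping_body, indent=2))
print(field_types(mapping_body))
```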
Question 4. What Is A Document Type In Elasticsearch ?
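A document type can be seen as the document schema / mapping definition, which holds the mapping of all the fields in the document along with their data types. Note that document types were deprecated in Elasticsearch 6.x and removed in later versions, where each index has a single mapping.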
Question 5. What Is Indexing In Elasticsearch ?
The process of storing data in an index is called indexing in Elasticsearch. Data in Elasticsearch is divided into write-once, read-many segments. Because segments are never modified in place, whenever an update is attempted, a new version of the document is written to the index.
Question 6. What Is A Node In Elasticsearch ?
Each instance of ElasticSearch is called a node. Multiple nodes can work in harmony to form an ElasticSearch Cluster.
Question 7. What Is A Shard In Elasticsearch ?
Due to resource limitations such as RAM and vCPU, applications that need to scale out must run multiple instances of Elasticsearch on separate machines. Data in an index can be divided into multiple partitions, each handled by a separate node (instance) of Elasticsearch. Each such partition is called a shard. Before version 7.0 an Elasticsearch index had 5 primary shards by default; since 7.0 the default is 1.
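How a document ends up on a particular shard can be sketched with a simplified form of the routing rule: hash a routing value (by default the document id) and take it modulo the number of primary shards. In the sketch below, zlib.crc32 is only a stand-in for the hash Elasticsearch actually uses, and pick_shard and the document ids are hypothetical names for this example.

```python
import zlib


def pick_shard(routing_value: str, num_primary_shards: int) -> int:
    """Map a routing value (by default the document id) to a shard number."""
    # zlib.crc32 is a stand-in hash, chosen only because it is deterministic.
    return zlib.crc32(routing_value.encode("utf-8")) % num_primary_shards


doc_ids = ["doc-1", "doc-2", "doc-3", "doc-4"]
placement = {doc_id: pick_shard(doc_id, 5) for doc_id in doc_ids}
print(placement)
```

Because the shard number depends on the number of primary shards, that number cannot simply be changed on an existing index without moving documents around.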
Question 8. What Is A Replica In Elasticsearch ?
Each primary shard in Elasticsearch can have one or more copies of itself. These copies are called replicas. The number of replicas is configurable per index and defaults to 1. Replicas serve the purpose of high availability and fault tolerance.
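The effect of replicas on the number of shard copies a cluster has to hold is simple arithmetic, sketched below. number_of_shards and number_of_replicas are real index settings; the values and the helper total_shard_copies are illustrative assumptions.

```python
def total_shard_copies(number_of_shards: int, number_of_replicas: int) -> int:
    """Each primary shard exists once, plus once more per configured replica."""
    return number_of_shards * (1 + number_of_replicas)


# Illustrative index settings: 5 primary shards, each with 1 replica.
settings = {"index": {"number_of_shards": 5, "number_of_replicas": 1}}

copies = total_shard_copies(
    settings["index"]["number_of_shards"],
    settings["index"]["number_of_replicas"],
)
print(copies)  # 10
```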
Question 9. What Is An Analyzer In Elasticsearch ?
While indexing data in Elasticsearch, the data is transformed internally by the analyzer defined for the index and then indexed. An analyzer is built from a tokenizer and filters. The following types of analyzers are available in Elasticsearch 1.x.
- STANDARD ANALYZER
- SIMPLE ANALYZER
- WHITESPACE ANALYZER
- STOP ANALYZER
- KEYWORD ANALYZER
- PATTERN ANALYZER
- LANGUAGE ANALYZERS
- SNOWBALL ANALYZER
- CUSTOM ANALYZER
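To make the idea of an analyzer concrete, here is a minimal pure-Python sketch of a tokenizer followed by two filters. It is not how Elasticsearch implements analysis. The function names and the tiny stop-word list are assumptions made for this example.

```python
import re

STOP_WORDS = {"a", "an", "is", "of", "the"}  # tiny illustrative stop list


def tokenize(text):
    """Split text into word tokens on anything that is not a letter or digit."""
    return re.findall(r"[A-Za-z0-9]+", text)


def lowercase_filter(tokens):
    """Normalize every token to lower case."""
    return [token.lower() for token in tokens]


def stop_filter(tokens):
    """Drop very common words that carry little search value."""
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text, tokenizer=tokenize, filters=(lowercase_filter, stop_filter)):
    """Run the tokenizer, then each filter in order, as an analyzer would."""
    tokens = tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


print(analyze("Javainuse is one of the good websites."))
# ['javainuse', 'one', 'good', 'websites']
```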
Question 10. What Is A Tokenizer In Elasticsearch ?
A tokenizer breaks down the field values of a document into a stream of tokens. Inverted indexes are created and updated using these tokens, so it is the tokens, not the raw text, that searches are matched against.
Question 11. What Is A Filter In Elasticsearch ?
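After data is processed by the tokenizer, the resulting tokens are processed by token filters (for example lowercasing or stop-word removal) before indexing. Separately, the Query DSL of Elasticsearch 1.x provides query filters, which restrict the documents a search matches rather than transform tokens. The following types of query filters are available in Elasticsearch 1.x.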
- AND FILTER
- BOOL FILTER
- EXISTS FILTER
- GEO BOUNDING BOX FILTER
- GEO DISTANCE FILTER
- GEO DISTANCE RANGE FILTER
- GEO POLYGON FILTER
- GEOSHAPE FILTER
- GEOHASH CELL FILTER
- HAS CHILD FILTER
- HAS PARENT FILTER
- IDS FILTER
- INDICES FILTER
- LIMIT FILTER
- MATCH ALL FILTER
- MISSING FILTER
- NESTED FILTER
- NOT FILTER
- OR FILTER
- PREFIX FILTER
- QUERY FILTER
- RANGE FILTER
- REGEXP FILTER
- SCRIPT FILTER
- TERM FILTER
- TERMS FILTER
- TYPE FILTER
Question 12. What Is The Query Language Of Elasticsearch ?
Elasticsearch provides its own JSON-based query language, called Query DSL. Queries written in it are executed on top of Apache Lucene.
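A Query DSL request is a JSON body sent to a search endpoint. The sketch below builds one as a Python dictionary, combining a full-text match clause with exact-value filter clauses inside a bool query. The field names (title, author, published) and values are hypothetical.

```python
import json

# Hypothetical search: posts mentioning "elasticsearch" in the title,
# written by one author, published since the start of 2020.
query_body = {
    "query": {
        "bool": {
            "must": [{"match": {"title": "elasticsearch"}}],
            "filter": [
                {"term": {"author": "jane"}},
                {"range": {"published": {"gte": "2020-01-01"}}},
            ],
        }
    },
    "size": 10,
}

# This JSON text is what would be sent as the request body.
request_json = json.dumps(query_body)
print(request_json)
```

Clauses under must contribute to the relevance score, while clauses under filter only include or exclude documents.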
Question 13. What Is Elasticsearch ?
Elasticsearch is a search engine based on Lucene. It provides a distributed, multitenant-capable full-text search engine with an HTTP web interface and schema-free JSON documents. Elasticsearch is developed in Java. Versions up to 7.10 were released as open source under the terms of the Apache License 2.0; later versions use different licensing terms.
Question 14. What Are The Basic Operations You Can Perform On A Document ?
The following operations can be performed on documents:
- INDEXING A DOCUMENT USING ELASTICSEARCH.
- FETCHING DOCUMENTS USING ELASTICSEARCH.
- UPDATING DOCUMENTS USING ELASTICSEARCH.
- DELETING DOCUMENTS USING ELASTICSEARCH.
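The four operations can be pictured with a small in-memory stand-in for an index. This is a simplified sketch, not the Elasticsearch client API: the class TinyIndex and its methods are hypothetical, and the version counter only mimics the idea that each write to a document produces a new version.

```python
class TinyIndex:
    """In-memory stand-in for an index: document id -> (version, source)."""

    def __init__(self):
        self._docs = {}

    def index(self, doc_id, source):
        """Store a document; re-indexing an existing id bumps its version."""
        version = self._docs.get(doc_id, (0, None))[0] + 1
        self._docs[doc_id] = (version, dict(source))
        return version

    def get(self, doc_id):
        """Fetch a document by id, or None if it does not exist."""
        entry = self._docs.get(doc_id)
        if entry is None:
            return None
        return {"_id": doc_id, "_version": entry[0], "_source": entry[1]}

    def update(self, doc_id, partial):
        """Merge partial fields into the document and write a new version."""
        _, source = self._docs[doc_id]
        return self.index(doc_id, {**source, **partial})

    def delete(self, doc_id):
        """Remove a document; returns True if it existed."""
        return self._docs.pop(doc_id, None) is not None


idx = TinyIndex()
v1 = idx.index("1", {"title": "Hello"})   # indexing
v2 = idx.update("1", {"views": 3})        # updating
fetched = idx.get("1")                    # fetching
deleted = idx.delete("1")                 # deleting
print(v1, v2, fetched, deleted)
```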
Question 15. What Is Inverted Index In Elasticsearch ?
The inverted index is the heart of a search engine. The primary goal of a search engine is to provide speedy searches while finding the documents in which our search terms occur. An inverted index is a hashmap-like data structure that directs users from a word to the documents or web pages that contain it, which makes it possible to search millions of documents quickly.
Books usually have a similar index at the back: given a word, we can find the pages on which it appears.
Consider the following statements:
- javainuse is a good website
- javainuse is one of the good websites.
For indexing purposes, the above text is tokenized into separate terms, and all the unique terms are stored in the index along with information such as which documents each term appears in and its position within each document.
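Treating the first statement as document 1 and the second as document 2, the inverted index lists each term followed by the documents that contain it:
- a: 1
- good: 1, 2
- is: 1, 2
- javainuse: 1, 2
- of: 2
- one: 2
- the: 2
- website: 1
- websites: 2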
When you search for the term website OR websites, the query is executed against the inverted index: the terms are looked up, and the documents in which they appear are quickly identified.
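The indexing and lookup just described can be sketched in a few lines of Python. This is a toy illustration under simple assumptions (lowercasing and splitting on non-alphanumeric characters), not the structure Lucene actually stores. The function names are hypothetical, and the two statements are numbered 1 and 2.

```python
import re
from collections import defaultdict

docs = {
    1: "javainuse is a good website",
    2: "javainuse is one of the good websites.",
}


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(documents):
    """Map each term to {document id: [positions of the term in that document]}."""
    index = defaultdict(dict)
    for doc_id, text in documents.items():
        for position, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(position)
    return index


def search_any(index, terms):
    """Return the ids of documents containing at least one of the terms (OR)."""
    hits = set()
    for term in terms:
        hits.update(index.get(term, {}))
    return sorted(hits)


inverted = build_inverted_index(docs)
print(search_any(inverted, ["website", "websites"]))  # [1, 2]
print(inverted["good"])                               # {1: [3], 2: [5]}
```

The lookup touches only the entries for the searched terms, which is why it stays fast no matter how many other terms the index holds.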
Question 16. What Is A Cluster In Elasticsearch ?
A cluster is a collection of one or more nodes (servers) that together hold your entire data and provide federated indexing and search capabilities across all nodes. A cluster is identified by a unique name, which by default is “elasticsearch”. This name is important because a node can only be part of a cluster if the node is set up to join the cluster by its name.