ElasticSearch in big data, Lecture notes of Computer Security

Islamia University of Bahawalpur (IUB), Computer Security

This document is related to Elastic Search in big data security

Typology: Lecture notes

2023/2024

Uploaded on 11/21/2024

arehman001 🇵🇰


ELASTIC SEARCH


ELASTICSEARCH

Elasticsearch is a search and analytics engine designed to search through and analyze large amounts of data quickly.

Indexing in Elasticsearch
- Index: a storage container for data.
- Indexing: the process of storing and organizing data in an index.
- When you store data in Elasticsearch, the data is indexed: Elasticsearch organizes it into an optimized structure so that it can be searched and retrieved quickly later.
- When data is saved (e.g., log data or user records), it is stored as a document in an index.
- Each document is like a record and is typically in JSON format.

Data Log Entry:
How Indexing Helps Retrieval:
- Indexing helps Elasticsearch organize this data efficiently so that it can retrieve any record quickly when you search.
- It creates an inverted index (a special data structure) that allows Elasticsearch to:
  - Quickly search specific fields (like timestamp, user_id, or event).
  - Find records that match your search criteria (e.g., all logs where event = "login").
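
The field lookup described above can be sketched in plain Python. This is a minimal illustration, not how Elasticsearch is implemented internally: the field names (timestamp, user_id, event) come from the notes, while the log values, the `field_index` table, and the `find` helper are hypothetical.

```python
import json
from collections import defaultdict

# Hypothetical log entries: the field names come from the notes,
# the values are made up for illustration.
logs = [
    {"timestamp": "2024-01-01T10:00:00Z", "user_id": "u1", "event": "login"},
    {"timestamp": "2024-01-01T10:05:00Z", "user_id": "u2", "event": "logout"},
    {"timestamp": "2024-01-01T10:07:00Z", "user_id": "u3", "event": "login"},
]

# Each log entry is stored as a JSON document under a document ID.
documents = {doc_id: json.dumps(entry) for doc_id, entry in enumerate(logs, start=1)}

# Per-field lookup table: (field, value) -> IDs of documents with that value.
field_index = defaultdict(set)
for doc_id, raw in documents.items():
    for field, value in json.loads(raw).items():
        field_index[(field, value)].add(doc_id)


def find(field, value):
    """Return the sorted IDs of documents whose field equals value."""
    return sorted(field_index.get((field, value), set()))


print(find("event", "login"))  # [1, 3]
```

Because the lookup table is built once at indexing time, answering "all logs where event = login" is a single dictionary access rather than a scan over every document.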

INVERTED INDEX IN ELASTICSEARCH

In Elasticsearch, inverted indexing is a technique borrowed from information retrieval systems. It is similar to the index in a book, but optimized for fast full-text searches across large datasets.
1. Data Indexing: When you index (store) a document in Elasticsearch, it is not just saved in its entirety. Instead, Elasticsearch breaks down the document's contents into individual terms (or tokens), usually words. These terms are then stored in a data structure known as an inverted index.
2. Inverted Index Creation: The inverted index maps terms to the documents that contain them, allowing for fast lookups. Each unique term gets an entry in the index, and each entry keeps track of which documents contain that term.

Step 2: Building the Inverted Index
- For each unique term, Elasticsearch records the IDs of the documents where the term appears.
- The resulting inverted index is a table of terms, each paired with the IDs of the documents that contain it.

Step 3: Searching with the Inverted Index
- If you search for "search engine", Elasticsearch can use the inverted index to quickly identify that "search" appears in Documents 1, 2, and 3, and "engine" appears in Document 1.
- Elasticsearch can then return Document 1 as the most relevant result (since it contains both terms) and rank the other documents based on their relevance.
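
The build and search steps can be sketched as a toy inverted index in Python. The three document texts are hypothetical, chosen only so that "search" appears in documents 1, 2, and 3 and "engine" appears in document 1, as in the notes. The whitespace tokenizer and the count-based ranking are simplifying assumptions; Elasticsearch itself uses configurable analyzers and a relevance score (BM25 by default).

```python
from collections import defaultdict

# Hypothetical documents, chosen so that "search" appears in documents
# 1, 2, and 3 and "engine" appears only in document 1, as in the notes.
docs = {
    1: "Elasticsearch is a search engine",
    2: "Search through large amounts of data",
    3: "Search and analytics for logs",
}


def tokenize(text):
    """Lowercase the text and split it on whitespace (a very simple analyzer)."""
    return text.lower().split()


def build_inverted_index(documents):
    """Map each unique term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Rank documents by how many query terms they contain (ties broken by ID)."""
    hits = defaultdict(int)
    for term in tokenize(query):
        for doc_id in index.get(term, set()):
            hits[doc_id] += 1
    return sorted(hits, key=lambda doc_id: (-hits[doc_id], doc_id))


index = build_inverted_index(docs)
print(sorted(index["search"]))         # [1, 2, 3]
print(sorted(index["engine"]))         # [1]
print(search(index, "search engine"))  # [1, 2, 3]
```

Document 1 comes first because it matches both query terms; documents 2 and 3 match only "search".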

TYPES OF DATA ELASTICSEARCH CAN HANDLE

Application Data:
- Use Case: Elasticsearch can store structured application data, like user profiles, transaction records, or event data.
- Example: A social media platform might store user activity logs or interaction data, which can be searched and analyzed.

Product or Service Data:
- Use Case: Elasticsearch is commonly used for storing data about products, services, or inventories in e-commerce platforms.
- Example: Data about product descriptions, pricing, availability, and customer reviews.

Textual and Unstructured Data:
- Use Case: Elasticsearch is built for text-based search and can store unstructured data such as documents, articles, emails, or customer feedback.
- Example: Indexing and searching through a collection of articles or support tickets.

Summary:
- E-commerce: Searching for products and filtering results by price, rating, or category.
- Security: Storing and analyzing security events and alert data.
- Business Analytics: Aggregating and visualizing business data such as sales performance or customer metrics.

GOOGLE VS ELASTICSEARCH

- Ingesting logs: If you have web server logs, they can be ingested into Elasticsearch so that you can search, analyze, and visualize them.
- Ingesting data from sensors or devices: Data coming from IoT devices can be ingested into Elasticsearch to monitor and

KEY COMPONENTS OF ELASTICSEARCH ARCHITECTURE:

There are several types of nodes:
- Master Node: Manages the cluster, handles node additions and removals, and manages the distribution of data across nodes.
- Data Node: Stores and manages the data, performs CRUD (Create, Read, Update, Delete) operations, and handles search and aggregation requests.
- Coordinating Node: Handles requests from clients, forwards them to the appropriate data nodes, and merges the results.
- Ingest Node: Handles data preprocessing and transformation before the data is stored in the index.

3. Index: An index is a collection of documents that share the same data structure.
- It is where data (e.g., logs, metrics, documents) is stored.
- An index can have multiple shards for scalability and replicas for redundancy.
4. Shard: An index is split into smaller units called shards.
- Shards allow Elasticsearch to distribute data across multiple nodes in the cluster.
- Each shard contains a portion of the data, and when you perform a search, Elasticsearch can search through all the shards in parallel.
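
A rough sketch of how documents might be spread over shards and then searched shard by shard. Everything here is a simplifying assumption: the shard count, the document IDs and bodies, and the use of CRC32 as the hash are hypothetical stand-ins, and the shards are queried one after another rather than in parallel on separate nodes.

```python
import zlib

NUM_SHARDS = 3  # hypothetical number of primary shards


def shard_for(doc_id, num_shards=NUM_SHARDS):
    """Pick a shard by hashing the document ID (CRC32 is a stand-in hash)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def index_documents(documents, num_shards=NUM_SHARDS):
    """Distribute documents over shards so each shard holds part of the data."""
    shards = [dict() for _ in range(num_shards)]
    for doc_id, body in documents.items():
        shards[shard_for(doc_id, num_shards)][doc_id] = body
    return shards


def search_all_shards(shards, word):
    """Query every shard and merge the partial results (sequentially here)."""
    matches = []
    for shard in shards:
        matches.extend(doc_id for doc_id, body in shard.items() if word in body)
    return sorted(matches)


# Hypothetical documents, invented for illustration.
documents = {
    "log-1": "user login",
    "log-2": "user logout",
    "log-3": "failed login",
    "log-4": "password reset",
}
shards = index_documents(documents)
print(search_all_shards(shards, "login"))  # ['log-1', 'log-3']
```

Hashing the ID makes the placement deterministic: the same document always lands on the same shard, so it can be found again without consulting every shard when its ID is known.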

ELASTICSEARCH ARCHITECTURE FLOW

- Ingestion: Data (logs, metrics, documents) is ingested into Elasticsearch.
- Indexing: Data is stored as documents in an index, which is split into shards.
- Distribution: Shards are distributed across multiple nodes in the cluster for scalability.
- Search: When a query is made, it is forwarded to the relevant data nodes, which search their shards and return the results.
- Aggregation: Elasticsearch performs aggregation queries for analytics and summarization of the data. For example:
  - Calculate the total revenue by summing the prices of all products sold.
  - Find the average price of products within each category.
  - Group sales by month to analyze monthly trends.
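
The three aggregation examples can be worked through in plain Python to show what each one computes. This is only a sketch of the arithmetic, not Elasticsearch's aggregation API; the sales records and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales records, invented for illustration.
sales = [
    {"product": "laptop", "category": "electronics", "price": 1000.0, "date": "2024-01-15"},
    {"product": "phone", "category": "electronics", "price": 500.0, "date": "2024-01-20"},
    {"product": "desk", "category": "furniture", "price": 200.0, "date": "2024-02-03"},
    {"product": "chair", "category": "furniture", "price": 100.0, "date": "2024-02-10"},
]


def total_revenue(records):
    """Sum the prices of all products sold."""
    return sum(record["price"] for record in records)


def average_price_by_category(records):
    """Group prices by category, then average each group."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record["category"]].append(record["price"])
    return {category: sum(prices) / len(prices) for category, prices in grouped.items()}


def revenue_by_month(records):
    """Group sales by the YYYY-MM prefix of the date and sum each month."""
    totals = defaultdict(float)
    for record in records:
        totals[record["date"][:7]] += record["price"]
    return dict(totals)


print(total_revenue(sales))              # 1800.0
print(average_price_by_category(sales))  # {'electronics': 750.0, 'furniture': 150.0}
print(revenue_by_month(sales))           # {'2024-01': 1500.0, '2024-02': 300.0}
```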

BEST PRACTICES FOR SECURING ELASTICSEARCH

- Always Enable Authentication and Authorization: Don't leave Elasticsearch open without login requirements.
- Implement Encryption (TLS) on All Connections: Encrypt connections both within the cluster and for external clients.
- Keep Elasticsearch Updated: Security patches are released regularly, so it's important to stay up to date.
- Regularly Audit Access and Activity: Review audit logs and set up monitoring for abnormal activity.
- Limit Cluster Exposure: Use network isolation strategies to restrict exposure to trusted networks only.

HOW TO IMPLEMENT NETWORK ISOLATION FOR ELASTICSEARCH

1. Private Network Configuration: Host your Elasticsearch cluster in a Virtual Private Cloud (VPC) or private subnet. Ensure it is accessible only within your organization's internal network.
2. Firewalls and Security Groups: Configure firewalls or security groups to allow connections only from trusted IPs or ranges. Block incoming requests from public IPs unless necessary.
3. VPN Access: Access the cluster only via a Virtual Private Network (VPN).
4. TLS Encryption: Encrypt traffic between Elasticsearch nodes and clients using TLS. This prevents interception of or tampering with data.
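
The allowlist rule from step 2 can be illustrated with Python's standard `ipaddress` module. This is a sketch of the decision a firewall or security group makes, not a replacement for one; the trusted ranges and the `is_allowed` helper are hypothetical.

```python
import ipaddress

# Hypothetical trusted ranges (private address space inside the organization).
TRUSTED_NETWORKS = (
    ipaddress.ip_network("10.0.0.0/16"),
    ipaddress.ip_network("192.168.1.0/24"),
)


def is_allowed(client_ip, trusted=TRUSTED_NETWORKS):
    """Allow a connection only if the client IP falls inside a trusted range."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in trusted)


print(is_allowed("10.0.5.20"))    # True: inside 10.0.0.0/16
print(is_allowed("192.168.2.7"))  # False: outside 192.168.1.0/24
print(is_allowed("8.8.8.8"))      # False: public IP
```

Denying by default and listing only the trusted ranges matches the advice above: a new or unknown address is blocked unless someone explicitly adds it.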

RBAC IN ELASTICSEARCH

DataCorp has three main departments with different access needs:
- Sales Team: Needs read-only access to sales data to monitor trends and customer orders.
- Data Engineering Team: Manages data ingestion and indexing, and requires write access to all data indices but not admin privileges.
- Admin Team: Manages the entire Elasticsearch cluster and requires full access, including user and role management.

Step 1: Define the Roles in Elasticsearch
DataCorp's Elasticsearch administrator creates three roles:
- sales_read_only
- data_engineer
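
The idea behind these roles can be sketched as a small permission check in Python. This models the concept only and is not Elasticsearch's security API. The role names `sales_read_only` and `data_engineer` come from the notes; the index patterns, the user names, and the decision to give `data_engineer` read as well as write access are assumptions. The third role is not modeled because the notes break off before naming it.

```python
from fnmatch import fnmatchcase

# Role -> list of (index pattern, privileges). The role names come from the
# notes; the index patterns and privilege sets are hypothetical.
ROLES = {
    "sales_read_only": [("sales-*", {"read"})],
    "data_engineer": [("*", {"read", "write"})],
}

# Hypothetical users and the roles assigned to them.
USERS = {
    "alice": ["sales_read_only"],
    "bob": ["data_engineer"],
}


def is_authorized(user, index_name, privilege):
    """Return True if any of the user's roles grants the privilege on the index."""
    for role in USERS.get(user, []):
        for pattern, privileges in ROLES.get(role, []):
            if fnmatchcase(index_name, pattern) and privilege in privileges:
                return True
    return False


print(is_authorized("alice", "sales-2024", "read"))   # True
print(is_authorized("alice", "sales-2024", "write"))  # False
print(is_authorized("bob", "logs-app", "write"))      # True
```

Permissions attach to roles rather than to individual users, so moving someone between teams means changing a role assignment instead of editing a list of per-user rules.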
