BIG DATA ANALYTICS (BDA), Summaries of Advanced Data Analysis

Kakatiya University Advanced Data Analysis

Exam-ready study material for big data analytics covering Unit 2 topics, with important questions and answers. A very useful last-minute document.

Typology: Summaries

2024/2025

Available from 01/12/2026

rahul-panuganti 🇮🇳

Essential parts of a video lecture

1. Objective

2. Lecture Learning Outcomes

3. Introduction to Hadoop

4. Data: The Treasure Trove

5. Why Hadoop?

6. Why Not RDBMS?

7. Reflection spot

8. Lecture Outcome Revisited

9. Lecture Level practice Problems (LLPs)

10. Further Reading

This introductory lecture on Hadoop explains the concepts of Hadoop and RDBMS.

Objective

Today, Big Data seems to be the buzzword. Enterprises the world over are beginning to realize that there is a huge volume of untapped information before them in the form of structured, semi-structured, and unstructured data. This wide variety of data is spread across networks.
Let us look at a few statistics to get an idea of the amount of data generated every day, every minute, and every second.
Every day:
(a) The NYSE (New York Stock Exchange) generates 1.5 billion shares and trade data.
(b) Facebook stores 2.7 billion comments and Likes.
(c) Google processes about 24 petabytes of data.

Introduction to Hadoop

Every minute:
(a) Facebook users share nearly 2.5 million pieces of content.
(b) Twitter users tweet nearly 300,000 times.
(c) Instagram users post nearly 220,000 new photos.
(d) YouTube users upload 72 hours of new video content.
(e) Apple users download nearly 50,000 apps.
(f) Email users send over 200 million messages.
(g) Amazon generates over $80,000 in online sales.
(h) Google receives over 4 million search queries.
Every second:
(a) Banking applications process more than 10,000 credit card transactions.
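
To compare figures quoted per day, per minute, and per second, it helps to bring them to a common unit. The short sketch below converts a few of the statistics above to per-second rates. It assumes decimal (SI) units, that is, 1 petabyte = 10^15 bytes, and the helper names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_MINUTE = 60
BYTES_PER_PETABYTE = 10**15  # assumption: decimal (SI) petabyte
BYTES_PER_GIGABYTE = 10**9   # assumption: decimal (SI) gigabyte


def per_second_from_daily(amount_per_day):
    """Convert a quantity quoted per day into a per-second rate."""
    return amount_per_day / SECONDS_PER_DAY


def per_second_from_minute(amount_per_minute):
    """Convert a quantity quoted per minute into a per-second rate."""
    return amount_per_minute / SECONDS_PER_MINUTE


if __name__ == "__main__":
    # Google: about 24 petabytes processed per day.
    google_gb_per_s = per_second_from_daily(24 * BYTES_PER_PETABYTE) / BYTES_PER_GIGABYTE
    print(f"Google: roughly {google_gb_per_s:.0f} GB processed per second")

    # Twitter: nearly 300,000 tweets per minute.
    print(f"Twitter: roughly {per_second_from_minute(300_000):.0f} tweets per second")

    # Google: over 4 million search queries per minute.
    print(f"Google: roughly {per_second_from_minute(4_000_000):.0f} searches per second")
```

Under these assumptions, 24 petabytes a day works out to a sustained rate in the hundreds of gigabytes per second, which gives a sense of why a single machine is not a realistic place to store or process such data.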

Data: The Treasure Trove

Ever wondered why Hadoop has been, and still is, one of the most wanted technologies? The key consideration (the rationale behind its huge popularity) is its capability to handle massive amounts of data, and different categories of data, fairly quickly. Other considerations:
1. Low cost: Hadoop is an open-source framework and uses commodity hardware (relatively inexpensive and easy-to-obtain hardware) to store enormous quantities of data.
2. Computing power: Hadoop is based on a distributed computing model that processes very large volumes of data fairly quickly. The more computing nodes there are, the more processing power is at hand.
3. Scalability: This boils down to simply adding nodes as the system grows, and it requires much less administration.
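
The distributed computing idea in the points above can be illustrated with a toy sketch: the input is split into chunks, each chunk is processed independently, and the partial results are merged. This is not the Hadoop API. It is a single-process Python simulation in which threads stand in for cluster nodes, and the function names and the `num_nodes` parameter are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(lines, num_nodes):
    """Deal the input lines out round-robin, one chunk per simulated node."""
    return [lines[i::num_nodes] for i in range(num_nodes)]


def count_words(chunk):
    """Work done independently on one simulated node: count words in its chunk."""
    counts = Counter()
    for line in chunk:
        counts.update(line.lower().split())
    return counts


def distributed_word_count(lines, num_nodes):
    """Fan the chunks out to workers, then merge their partial counts."""
    chunks = split_into_chunks(lines, num_nodes)
    # Threads only simulate nodes here; a real cluster runs on separate machines.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partial_counts = list(pool.map(count_words, chunks))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


if __name__ == "__main__":
    sample = [
        "big data needs big storage",
        "hadoop stores big data",
        "hadoop scales by adding nodes",
    ]
    result = distributed_word_count(sample, num_nodes=3)
    print(result["big"], result["hadoop"])
```

Changing `num_nodes` changes how the work is divided but not the final counts, which mirrors the scalability point: adding nodes spreads the same job over more workers without changing the answer.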

WHY HADOOP?

Reflection Spot - 1

Having discussed some content, here is a reflection spot.

Question-1: Point out the correct statement.

a) Hadoop is an ideal environment for extracting and transforming small volumes of data

b) Hadoop stores data in HDFS and supports data compression/decompression

c) The Giraph framework is less useful than a MapReduce job to solve graph and machine learning problems

d) None of the mentioned

Answer: b

Reflection Spot - 2

Having discussed some content, here is another reflection spot.

Question-2: Data storage in RDBMS is _____

a) Used for large data sets (terabytes and petabytes)

b) Used for unstructured data

c) Average data size in gigabytes

d) All of the above

Answer: c

Lecture Outcome Revisited

Having completed the discussion on Introduction to Hadoop, students should now be able to:
LO1: Understand what Hadoop is.
LO2: Understand why Hadoop is used rather than an RDBMS.

Lecture Level practice Problems (LLPs)

LLP2 (based on LO2): Explain RDBMS and compare RDBMS with Hadoop.
Answer: Hints:
