BIG DATA
Prepared by Nasrin Irshad Hussain and Pranjal Saikia, M.Sc. (IT), 2nd Sem, Kaziranga University, Assam
Content
- Introduction
- What is Big Data
- Characteristics of Big Data
- Storing, selecting and processing of Big Data
- Why Big Data
- How it is Different
- Big Data sources
- Tools used in Big Data
- Application of Big Data
- Risks of Big Data
- Benefits of Big Data
- How Big Data Impacts IT
- Future of Big Data
- ‘Big Data’ is similar to ‘small data’, but bigger in size
- Because the data is bigger, it requires different approaches: techniques, tools and architecture
- The aim is to solve new problems, or old problems in a better way
- Big Data generates value from the storage and processing of very large quantities of digital information that cannot be analyzed with traditional computing techniques.
What is BIG DATA?
- Walmart handles more than 1 million customer transactions every hour.
- Facebook handles 40 billion photos from its user base.
- Decoding the human genome originally took 10 years to process; now it can be achieved in one week.
1st Characteristic of Big Data
Volume
- A typical PC might have had 10 gigabytes of storage in 2000.
- Today, Facebook ingests 500 terabytes of new data every day.
- A Boeing 737 will generate 240 terabytes of flight data during a single flight across the US.
- Smartphones and the data they create and consume, together with sensors embedded into everyday objects, will soon result in billions of new, constantly updated data feeds containing environmental, location, and other information, including video.
2nd Characteristic of Big Data
Velocity
- Clickstreams and ad impressions capture user behavior at millions of events per second
- High-frequency stock trading algorithms reflect market changes within microseconds
- Machine-to-machine processes exchange data between billions of devices
- Infrastructure and sensors generate massive log data in real time
- Online gaming systems support millions of concurrent users, each producing multiple inputs per second
Storing Big Data
Analyzing your data characteristics
- Selecting data sources for analysis
- Eliminating redundant data
- Establishing the role of NoSQL
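As one way to picture "eliminating redundant data", the sketch below drops repeated records by hashing a normalised form of each one. The function name `deduplicate` and the normalisation rule (trim whitespace, lowercase) are assumptions made for illustration; the slides do not prescribe a particular method.

```python
import hashlib


def deduplicate(records):
    """Yield each record once, keeping the first occurrence.

    Assumption: two records are 'the same' if they match after
    trimming whitespace and lowercasing. Real pipelines usually
    need a domain-specific rule.
    """
    seen = set()
    for record in records:
        normalised = record.strip().lower()
        # Store a fixed-size digest instead of the full record to save memory.
        digest = hashlib.sha256(normalised.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            yield record


if __name__ == "__main__":
    emails = ["a@x.com", "A@x.com ", "b@x.com"]
    print(list(deduplicate(emails)))
```

Storing digests rather than whole records keeps the memory needed per record fixed, which matters once the input no longer fits comfortably on one machine.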
Overview of Big Data stores
- Data models: key-value, graph, document, column-family
- Hadoop Distributed File System
- HBase
- Hive
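To make the four data models listed above more concrete, here is a minimal sketch that represents the same small piece of information in each model using plain Python structures. The record, the identifiers (`u42`, `u7`) and the helper `followers_of` are hypothetical and are not taken from any specific product.

```python
import json

# Hypothetical record used only for illustration.
USER_ID = "u42"

# Key-value: the store sees an opaque value, looked up by a single key.
key_value_store = {USER_ID: json.dumps({"name": "Asha", "city": "Jorhat"})}

# Document: a nested, self-describing structure that can be queried by field.
document_store = {
    USER_ID: {"name": "Asha", "address": {"city": "Jorhat"}, "tags": ["student"]}
}

# Column-family: row key -> column family -> column -> value.
column_family_store = {
    USER_ID: {"profile": {"name": "Asha"}, "location": {"city": "Jorhat"}}
}

# Graph: nodes plus explicit, typed relationships between them.
graph_nodes = {"u42": {"name": "Asha"}, "u7": {"name": "Ravi"}}
graph_edges = [("u42", "FOLLOWS", "u7")]


def followers_of(node_id):
    """Return the ids of nodes that have a FOLLOWS edge pointing at node_id."""
    return [src for src, rel, dst in graph_edges if rel == "FOLLOWS" and dst == node_id]


if __name__ == "__main__":
    print(json.loads(key_value_store[USER_ID])["city"])
    print(document_store[USER_ID]["address"]["city"])
    print(column_family_store[USER_ID]["location"]["city"])
    print(followers_of("u7"))
```

The same fact (the user's city) is reachable in the first three models, but only the graph model makes relationships between records a first-class part of the data.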
Selecting Big Data stores
- Choosing the correct data stores based on your data characteristics
- Moving code to data
- Implementing polyglot data store solutions
- Aligning business goals to the appropriate data store
The Structure of Big Data
Structured
- Most traditional data sources
Semi-structured
Unstructured
- Video data, audio data
Why Big Data
- Several factors drive the growth of Big Data:
- Increase of storage capacities
- Increase of processing power
- Availability of data (different data types)
- Every day we create 2.5 quintillion bytes of data; 90% of the data in the world today has been created in the last two years alone
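To put the 2.5 quintillion bytes per day figure into more familiar units, the short calculation below converts it using decimal (SI) prefixes. The constant names are illustrative, and the per-second rate is simply the daily figure divided evenly over 24 hours.

```python
# 2.5 quintillion = 2.5 * 10**18 (short-scale quintillion).
BYTES_PER_DAY = 2_500_000_000_000_000_000

TERABYTE = 10**12   # decimal (SI) terabyte
EXABYTE = 10**18    # decimal (SI) exabyte
SECONDS_PER_DAY = 24 * 60 * 60

exabytes_per_day = BYTES_PER_DAY / EXABYTE
terabytes_per_day = BYTES_PER_DAY // TERABYTE
terabytes_per_second = BYTES_PER_DAY / TERABYTE / SECONDS_PER_DAY

if __name__ == "__main__":
    print(f"{exabytes_per_day} EB per day")
    print(f"{terabytes_per_day:,} TB per day")
    print(f"{terabytes_per_second:.1f} TB per second on average")
```

In other words, the quoted figure corresponds to 2.5 exabytes, or 2.5 million terabytes, of new data each day.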
How Is Big Data Different?
Automatically generated by a machine (e.g. Sensor embedded in an engine)
Typically an entirely new source of data (e.g. Use of the internet)
Not designed to be friendly (e.g. Text streams)
May not have much value
- Need to focus on the important part
Big Data sources
Users
Application
Systems
Sensors
Large and growing files (Big data files)
Big Data Analytics
- Examining large amount of data
- Appropriate information
- Identification of hidden patterns, unknown correlations
- Competitive advantage
- Better business decisions: strategic and operational
- Effective marketing, customer satisfaction, increased revenue
- Where is processing hosted?
- Distributed servers / cloud (e.g. Amazon EC2)
- Where is data stored?
- Distributed storage (e.g. Amazon S3)
- What is the programming model?
- Distributed processing (e.g. MapReduce)
- How is data stored and indexed?
- High-performance schema-free databases (e.g. MongoDB)
- What operations are performed on data?
- Analytic / semantic processing
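The MapReduce programming model mentioned above can be illustrated with the classic word-count example. This is a single-process sketch in plain Python, not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and a real framework would run the map and reduce steps in parallel across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as the framework
    would do between the map and reduce steps."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data is big", "data about data"]
    print(word_count(docs))
```

Because each call to `map_phase` touches only one document and each call to `reduce_phase` touches only one key, both steps can be distributed without the workers needing to coordinate, which is what makes the model suitable for very large inputs.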
Types of tools used in Big Data