Big Data - Data Analytics - Exam, Exams of Advanced Data Analysis

Main points of this past exam are: Big Data, Technology Sector, Specific Challenges, Parallelized Big Data Technologies, Apache Hadoop Project, Current Market Landscape, Contributing Vendors, Hadoop Ecosystem

Typology: Exams

2012/2013
Uploaded on 03/28/2013

mahmud
CORK INSTITUTE OF TECHNOLOGY
INSTITIÚID TEICNEOLAÍOCHTA CHORCAÍ
Semester 1 Examinations 2012/13
Module Title: Data Analytics
Module Code: COMP9033
School: Science and Informatics
Programme Title: Masters of Science (Honours) in Cloud Computing
Programme Code: KCLDC_9_Y5
External Examiner(s): Dr David White
Internal Examiner(s): Mr. Aengus Daly, Dr Paul Walsh, Ms Aisling O’Driscoll
Instructions: THREE separate answer books should be submitted:
one for Section A (Q1 & Q2),
one for Section B (Q3) and one for Section B (Q4).
Duration: 2 HOURS
Sitting: Winter 2012
Requirements for this examination:
Note to Candidates: Please check the Programme Title and the Module Title to ensure that you are attempting the correct examination. If in doubt, please contact an Invigilator.

Section A

(Both questions are mandatory)

Q1.

(a) Discuss in detail the trend of big data in the technology sector and other diverse sectors (supported with relevant statistics), along with the specific challenges that “big data” presents and why/how cloud computing is addressing these. [18]

(b) State briefly the motivation for distributed and parallelized big data technologies and provide a detailed description of the current market landscape, choosing the Apache Hadoop project as an example, i.e. contributing vendors, distributions, etc. [12]

Total 30 Marks

Q2. Describe in detail the following:

(a) The structure and operation of the MapReduce paradigm, illustrating with a detailed example. [10]

(b) The structure and operation of HDFS, illustrating with a detailed example. [10]

(c) The extended Hadoop ecosystem, its sub-projects and their purpose. [4]

(d) Choosing the Mahout distributed machine learning library, list the algorithms it offers, stating their purpose, and provide a brief example of where these could be used with big data sets. [6]

Total 30 Marks
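
For readers revising the MapReduce question above, the map, shuffle and reduce stages can be sketched on a single machine as a word count. This is only an illustrative sketch and not a model answer: the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the two input splits are hypothetical, and a real Hadoop job would distribute these stages across cluster nodes rather than run them in one process.

```python
from collections import defaultdict


def map_phase(split):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in split.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle: group all emitted values under their key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the list of counts for one key into a total."""
    return key, sum(values)


def word_count(splits):
    """Run map over every split, shuffle the pairs, then reduce per key."""
    mapped = []
    for split in splits:
        mapped.extend(map_phase(split))
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input splits, standing in for blocks of a large file.
splits = ["big data needs big clusters", "data moves to clusters"]
counts = word_count(splits)
print(counts)
```

Each split is mapped independently, which is the property that lets the map stage run in parallel; only the shuffle needs to bring together values that share a key.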

(i) Perform a hierarchical cluster analysis using the distances between cities above. [10]

(ii) Draw the corresponding dendrogram and recommend two cities in which to situate warehouses. [5]

(iii) Briefly list the limitations of such analysis. [5]

Total 20 Marks
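
The distance table this question refers to is not reproduced in this copy, so the sketch below uses a purely hypothetical set of four cities (A to D) and invented distances to show how an agglomerative (bottom-up) hierarchical clustering proceeds. It assumes single linkage (the distance between two clusters is the smallest pairwise distance), which is one common choice and not necessarily the method intended by the examiners; the function name `single_linkage` is likewise an assumption.

```python
def single_linkage(labels, dist):
    """Agglomerative clustering with single linkage.

    labels: list of item names.
    dist:   dict mapping frozenset({x, y}) to the distance between x and y.
    Returns the merges in order as (cluster_a, cluster_b, distance).
    """
    clusters = [frozenset([label]) for label in labels]
    merges = []

    def cluster_distance(a, b):
        # Single linkage: closest pair of members across the two clusters.
        return min(dist[frozenset((x, y))] for x in a for y in b)

    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        best, i, j = min(
            (cluster_distance(a, b), i, j)
            for i, a in enumerate(clusters)
            for j, b in enumerate(clusters)
            if i < j
        )
        merges.append((sorted(clusters[i]), sorted(clusters[j]), best))
        merged = clusters[i] | clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical cities and distances, NOT the values from the exam paper.
labels = ["A", "B", "C", "D"]
dist = {
    frozenset(("A", "B")): 10,
    frozenset(("A", "C")): 50,
    frozenset(("A", "D")): 60,
    frozenset(("B", "C")): 45,
    frozenset(("B", "D")): 55,
    frozenset(("C", "D")): 15,
}

merges = single_linkage(labels, dist)
for a, b, d in merges:
    print(a, "+", b, "at distance", d)
```

The merge distances give the heights at which branches join in the dendrogram. With these invented figures, cutting the tree into two clusters gives {A, B} and {C, D}, which would suggest one warehouse in each group.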