






















































INTRODUCTION TO DATA SCIENCE & DATA SCIENCE LIFECYCLE
Typology: Exams























































Introduction to Data Science
Evolution of Data Science
Data Science Roles
Stages in a Data Science Project
Applications of Data Science in various fields
Data Security Issues
Data Science is about finding patterns in data through analysis and using them to make predictions about the future. By using Data Science, companies are able to make:
Better decisions (should we choose A or B?)
Predictive analyses (what will happen next?)
Pattern discoveries (finding patterns, or hidden information, in the data)
Data is a collection of information. One purpose of Data Science is to structure data, making it interpretable and easy to work with. Data can be categorized into two groups:
Structured data: Structured data is organized and easier to work with.
Unstructured data: Unstructured data is not organized. We must organize the data for analysis purposes.
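As a rough illustration of organizing unstructured data, the sketch below turns free-text notes into structured records using Python's standard `re` module. The notes, the field names, and the `structure` helper are all hypothetical and exist only for this example; real unstructured data is rarely this regular.

```python
import re

# Hypothetical free-text notes (unstructured): the fields are buried in prose.
notes = [
    "Customer Anna, age 34, bought 2 items on 2021-03-05.",
    "On 2021-03-07 customer Ben (age 41) bought 5 items.",
]

def structure(note):
    """Pull name, age, item count, and date out of one free-text note."""
    return {
        "name": re.search(r"[Cc]ustomer (\w+)", note).group(1),
        "age": int(re.search(r"age (\d+)", note).group(1)),
        "items": int(re.search(r"bought (\d+) items", note).group(1)),
        "date": re.search(r"\d{4}-\d{2}-\d{2}", note).group(0),
    }

# Structured data: every record now has the same named fields.
records = [structure(n) for n in notes]

# Once structured, the data is easy to work with, e.g. averaging a column.
average_age = sum(r["age"] for r in records) / len(records)
print(records)
print(average_age)
```

Once every record has the same fields, questions such as "what is the average age?" become a one-line computation, which is the sense in which structured data is easier to work with.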
Fraud and Risk Detection
Healthcare
Internet Search
Targeted Advertising
Website Recommendations
Advanced Image Recognition
Speech Recognition
Airline Route Planning
Gaming
Augmented Reality

Fraud and Risk Detection
The earliest applications of data science were in finance. Companies were fed up with bad debts and losses every year. However, they had a lot of data that was collected during the initial paperwork when sanctioning loans. They decided to bring in data scientists to rescue them from these losses. Over the years, banking companies learned to divide and conquer data via customer profiling, past expenditures, and other essential variables to analyze the probabilities of risk and default. Moreover, it also helped them to push their banking products based on customers' purchasing power.

Healthcare
The healthcare sector, in particular, benefits greatly from data science applications.

Medical Image Analysis
Procedures such as tumor detection, artery stenosis detection, and organ delineation employ various methods and frameworks, such as MapReduce, to find optimal parameters for tasks like lung texture classification. It applies machine learning
Virtual assistance for patients and customer support
Optimization of the clinical process builds upon the idea that in many cases it is not actually necessary for patients to visit doctors in person. A mobile application can give a more effective solution by bringing the doctor to the patient instead. AI-powered mobile apps, usually chatbots, can provide basic healthcare support. You simply describe your symptoms or ask questions, and then receive key information about your medical condition derived from a wide network linking symptoms to causes. Apps can remind you to take your medicine on time and, if necessary, schedule an appointment with a doctor. This approach promotes a healthy lifestyle by encouraging patients to make healthy decisions, saves them time waiting in line for an appointment, and allows doctors to focus on more critical cases.

Internet Search
Now, this is probably the first thing that comes to mind when you think of data science applications. When we speak of search, we think of Google. But there are many other search engines, such as Yahoo, Bing, Ask, and AOL. All these search engines (including Google) use data science algorithms to deliver the best results for a search query in a fraction of a second. Google alone processes more than 20 petabytes of data every day.

Targeted Advertising
If you thought search was the biggest of all data science applications, here is a challenger: the entire digital marketing spectrum. From the display banners on various websites to the digital billboards at airports, almost all of them are decided by data science algorithms.
Ask the right questions - To understand the business problem.
Explore and collect data - From databases, web logs, customer feedback, etc.
Extract the data - Transform the data to a standardized format.
Clean the data - Remove erroneous values from the data.
Find and replace missing values - Check for missing values and replace them with a suitable value (e.g. an average value).
Normalize data - Scale the values to a practical range (e.g. 140 cm is smaller than 1.8 m, yet the number 140 is larger than 1.8, so scaling is important).
Analyze data, find patterns, and make future predictions.
Represent the result - Present the result with useful insights in a way the "company" can understand.
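The data preparation steps above (standardizing the format, replacing missing values, and normalizing) can be sketched in a few lines of plain Python. This is a minimal illustration, not a prescribed method: the sample heights and the helper names `to_cm`, `fill_missing_with_mean`, and `min_max_scale` are assumptions made for the example, and min-max scaling is only one of several possible ways to normalize.

```python
# Hypothetical raw heights: mixed units (cm and m) and one missing value.
raw_heights = [(140, "cm"), (1.8, "m"), (None, "cm"), (165, "cm")]

def to_cm(value, unit):
    """Standardize the format: express every height in centimetres."""
    if value is None:
        return None  # keep missing values so they can be handled later
    return value * 100 if unit == "m" else value

def fill_missing_with_mean(values):
    """Replace each missing value with the average of the known values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]

def min_max_scale(values):
    """Scale values to the range 0..1 (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all values equal: nothing to spread
    return [(v - lo) / (hi - lo) for v in values]

heights_cm = [to_cm(v, u) for v, u in raw_heights]
filled = fill_missing_with_mean(heights_cm)
scaled = min_max_scale(filled)

print(heights_cm)
print(filled)
print(scaled)
```

After the unit conversion, the 140 cm entry correctly ranks below the 1.8 m entry, which is the comparison problem the normalization step describes. In practice, libraries such as pandas or scikit-learn are commonly used for these steps on larger datasets.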