











Machine Learning Algorithm for Data Analysis












If you asked someone on the street whether they had ever heard of or used machine learning, their answer would probably be no. What they don’t know is that they have probably encountered it numerous times in a single day. When you ask Siri for the weather forecast, that’s machine learning. When you run a Google search at work to help you do your job better or more efficiently, you can thank machine learning. Another everyday example is the spam folder: a machine learning algorithm determines which emails are inbox-worthy and which are spam and don’t deserve attention. Similarly, when Netflix suggests a show based on your preferences, the suggestion comes from an algorithm. From TV suggestions to self-driving cars, machine learning sits quietly in the background of almost everything we do. These algorithms, and machine learning as a whole, are intended to improve and radically simplify our lives. According to Srinivas Bangalore, Director of Research and Technology at Interactions, “good machine learning should not be in your face. It should be behind the scenes, tracking, and helping achieve goals much more quickly and efficiently.”
BRIEF HISTORY OF MACHINE LEARNING
From the 1950s to now, machine learning has developed significantly. Below is a brief history of machine learning within the AI field. We show how the algorithms described here were motivated by the need to solve very simple automation tasks, such as the recognition of spoken words or written digits, and how AT&T showed strong leadership in this process.
In 1950, Alan Turing proposed the “Turing Test” to determine whether a computer was capable of real intelligence. To pass the test, the computer had to fool a human into believing it was also human.
In 1952, Arthur Samuel created the first implementation of machine learning, a program that played checkers. The program improved the more it played by determining which moves led to winning strategies and incorporating those moves into its play.
In 1957, Frank Rosenblatt designed the first neural network for computers, the perceptron, which was meant to simulate the thought process of a human brain.
The “nearest neighbor” algorithm was written in 1967, allowing computers to begin recognizing basic patterns. It could also be used to map a route for a traveling salesman.
Explanation-Based Learning (EBL) was introduced in 1981 by Gerald DeJong. With this approach, a computer analyzes training data and creates a general rule it can follow by discarding unimportant data.
Researchers at AT&T created the first research group for machine learning in 1985. They also began a series of machine learning meetings that eventually turned into NIPS, the leading conference on machine learning. This group was representative of the early machine learning community, which was breaking away from a computer science field still mostly interested in expert systems. These theoreticians were confronted with real-world problems in which machines had to replace humans in recognizing noisy written digits, mainly check amounts and zip codes.
In 1992, Jay Wilpon (SVP of Natural Language Research at Interactions) and a team of researchers at AT&T deployed the first nationwide automated speech recognition (ASR) system, using a machine learning approach called Hidden Markov Models (HMMs). This saved billions of dollars in operating costs by recognizing spoken requests such as collect calls.
Researchers at AT&T invented Support Vector Machines (SVMs) in 1992, a technique that revolutionized large-scale classification because of its predictable performance.
HOW IT WORKS
The overall goal of machine learning is to build models that imitate and generalize from data. These models need to learn how to discriminate between inputs to achieve a desired end result. Simply put, machine learning uses a variety of techniques, and algorithms within these techniques, to reach a specific goal.
Machine learning learns from data and uses that data to recognize patterns. Jay Wilpon, Senior Vice President of Natural Language Research at Interactions, describes how machine learning works with an analogy of fruits. Suppose someone hands you an orange and a grapefruit, and you’ve never seen either before. How do you tell them apart? They’re both round, but the grapefruit is slightly bigger, so size is one feature that can separate the two. Now suppose someone hands you an apple. The shape is similar, but this fruit is red, so color is another potential differentiator. Finally, someone gives you a banana, and now you can add shape as another distinguishing feature. Machine learning works in much the same way. Its job is not only to recognize that what it is being handed is fruit, but also to make sure that it does not call a grapefruit a banana, or vice versa.
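The fruit analogy can be sketched in a few lines of Python. This is only an illustration, not a description of any system mentioned in this paper: it assumes each fruit has already been reduced to three numbers (size, redness, elongation), each scaled from 0 to 1, and it uses a simple nearest-neighbor rule to pick a label. The feature values and the names TRAINING_DATA and classify are invented for the example.

```python
import math

# Hypothetical training examples. Each fruit is described by three features,
# all scaled to the range 0..1: (size, redness, elongation).
# The numbers are invented for illustration.
TRAINING_DATA = [
    ((0.5, 0.3, 0.0), "orange"),
    ((0.8, 0.3, 0.0), "grapefruit"),
    ((0.5, 0.9, 0.0), "apple"),
    ((0.4, 0.1, 1.0), "banana"),
]


def classify(features):
    """Return the label of the training example closest to `features`."""
    _, label = min(
        TRAINING_DATA,
        # Euclidean distance between the stored example and the new fruit.
        key=lambda example: math.dist(example[0], features),
    )
    return label


if __name__ == "__main__":
    # A large, round fruit that is not very red.
    print(classify((0.75, 0.35, 0.05)))
```

With only one feature (size), the orange and the apple in this toy data would be indistinguishable. Adding color and shape is what lets the rule separate all four fruits.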
Deep neural networks (DNNs), a class of artificial neural networks (ANNs), are a set of techniques used to build powerful learning systems. Unlike algorithms such as SVMs and AdaBoost, they add a number of “hidden” layers that are used to extract intermediate representations. While invented in the 1980s, DNNs took off after 2010 thanks to powerful parallel hardware and easy-to-use open source software. DNNs cover a huge range of different neural architectures, the best known being:
THE IMPORTANCE OF THE HUMAN ELEMENT
Regardless of how intelligent technology can be, at the end of the day it will never be perfect. Humans can accelerate the process of understanding by teaching the technology in real time. For example, if a machine learning system comes across a piece of data it cannot understand, a human can intervene and tell the technology what it is, making it more accurate the next time it comes across that same piece of data. Technology does not have the same level of understanding as a human, and adding humans to the machine learning process can assist with decision making and allow the technology to become more self-aware. This human touch can humanize machine learning, make it easier to relate to, and in turn make it less intimidating. Beyond making it more personalized, when humans and machines work together the results are more accurate than either would achieve alone.
Humans can become involved in the process in two ways. First, they can label the data that will be fed into the machine learning model. Second, they can help the model predict and correct inaccuracies, which leads to more accurate end results.
Interactions understands the crucial role the human element plays in artificial intelligence, which is why we’ve focused on integrating human intelligence into our technology. Our proprietary Adaptive Understanding™ technology combines speech recognition, natural language processing, and Human Assisted Understanding to provide our customers, and their customers, with conversational and engaging self-service. This enables continuous improvement and learning in live applications.
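The idea of a human stepping in when the technology is unsure can be sketched as follows. This is a minimal illustration under stated assumptions, not the Adaptive Understanding implementation: the class HumanAssistedLabeler, the ask_human callback, and the 0.8 confidence threshold are all hypothetical, and the "model" is a stand-in that is confident only about inputs it has already been taught.

```python
from typing import Callable, Dict, Tuple


class HumanAssistedLabeler:
    """Toy human-in-the-loop labeler (all names here are hypothetical)."""

    def __init__(self, ask_human: Callable[[str], str], threshold: float = 0.8):
        self.ask_human = ask_human          # called when the model is unsure
        self.threshold = threshold          # minimum confidence to trust the model
        self.known: Dict[str, str] = {}     # labels learned so far
        self.human_calls = 0                # how often a human had to step in

    def model_predict(self, text: str) -> Tuple[str, float]:
        """Stand-in for a real model: confident only about inputs it has seen."""
        if text in self.known:
            return self.known[text], 1.0
        return "unknown", 0.0

    def label(self, text: str) -> str:
        guess, confidence = self.model_predict(text)
        if confidence >= self.threshold:
            return guess
        # Low confidence: ask a human, then remember the answer for next time.
        answer = self.ask_human(text)
        self.human_calls += 1
        self.known[text] = answer
        return answer


if __name__ == "__main__":
    labeler = HumanAssistedLabeler(ask_human=lambda text: "billing")
    print(labeler.label("my bill is wrong"))  # the human is asked
    print(labeler.label("my bill is wrong"))  # answered from what was learned
    print(labeler.human_calls)
```

A production system would retrain or update a statistical model with the human-provided labels rather than storing exact matches, but the control flow (predict, check confidence, fall back to a human, learn from the answer) is the same.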
Diagnoses - Machine learning can analyze data and identify trends or red flags in patients, potentially leading to earlier diagnoses and better treatments.
Patient information - Data can be collected from a patient’s device to assess their health in real time.
Drug discovery - Because machine learning can detect patterns within data, scientists can better predict drug side effects and the results of drug experiments without actually performing them.
Personalization - Machine learning allows online brands to suggest and advertise things you may like based on your browsing and search history. Brands use the data they collect to give customers a unique and personalized experience.
Energy sources - By analyzing different minerals in the ground, machine learning provides the potential to discover new energy sources.
Streamlining oil distribution - Algorithms work to make oil distribution more efficient and cost-effective.
Reservoir modeling - Certain machine learning techniques can focus on optimization of hydraulic fracturing, reservoir simulation, and more.
Efficient transportation - Analysis of data can identify certain patterns and trends to make routes more efficient for public transportation, delivery companies, and more.
CHALLENGES AND HESITATIONS
While machine learning has proved to have a profound impact across industries, there are still uncertainties and challenges regarding the technology.
First and foremost is the fear that technology will overcome humans. As we discussed, technology is not perfect, and it often needs the assistance of humans to ensure accuracy. However, there is still a lot of fear and uncertainty regarding the power of technology and its ability to become smarter than we are. At its core, AI is a set of mathematical equations and algorithms that require human training. This means that AI and machine learning are only as smart as we teach them to be. When applied properly, AI is an assistant that helps humans become more productive. Technology is not here to overcome and overpower us, but rather to assist us and improve our quality of life.
A more technical issue with both machine learning and artificial intelligence is the technology’s ability to handle unlabeled data. Because machine learning relies on data to learn, it requires a large amount of labeled data to work most efficiently. However, there are many cases in which data isn’t readily available or is unlabeled, which makes creating algorithms more challenging. With ongoing research and new advancements, we’re training these systems to become smarter and reach human-level accuracy, so that one day unlabeled data will be just as useful as labeled data.
CONTRIBUTORS
With more than 150 published papers and patents in speech and natural language research to his name, Jay Wilpon is one of the world’s pioneers and a chief evangelist for speech and natural language technologies and services. During his career, Jay has been a leading innovator for a number of industry-defining voice-enabled services, including AT&T’s How May I Help You service, the first nationwide deployment of a true human-like spoken language understanding service. Jay and his team are addressing the key challenges in speech, natural language processing, and multimodal dialog systems. Jay has previously been awarded the distinguished honor of IEEE Fellow for his leadership in the development of automatic speech recognition algorithms. For pioneering leadership in the creation and deployment of speech recognition-based services in the telephone network, Jay has also been awarded the honor of AT&T Fellow.
As Vice President of Speech Research, David manages Interactions’ R&D teams to further Interactions’ goal of redefining the speech technology industry. David is at the forefront of Interactions’ objective to create the most accurate, fastest, and highest quality speech solutions. Prior to joining Interactions, David spent five years with AT&T Labs, where he was responsible for the development of technology from speech research. He has held senior executive-level positions at SpinVox, SpeechPhone, and Fonix. He also spent 18 years at Lucent Technologies (now Alcatel-Lucent), where he developed voice-activated systems that have handled over 20 billion calls. David has published 30 research papers and secured 11 patents in natural language research.
Dr. Srinivas Bangalore is currently the Director of Research and Technology at Interactions. After receiving his PhD in Computer Science from the University of Pennsylvania, he became a Principal Research Scientist at AT&T Labs Research. Dr. Bangalore has worked on many areas of Natural Language Processing, including Spoken Language Translation, Multimodal Understanding, Language Generation, and Question-Answering. He has co-edited three books on Supertagging, Natural Language Generation, and Language Translation. He has authored over 100 research publications and holds over 100 patents in these areas. He has been awarded the Morris and Dorothy Rubinoff award for outstanding dissertation; the AT&T Outstanding Mentor Award, in recognition of his support and dedication to the AT&T Labs Mentoring Program; and the AT&T Science & Technology Medal for technical leadership and innovative contributions in Spoken Language Technology and Services. He has served on the editorial boards of the Computational Linguistics Journal and the Computer, Speech and Language Journal, and on program committees for a number of ACL and IEEE Speech Conferences.
Dr. Patrick Haffner has worked on machine learning algorithms since 1988. With Yann LeCun, he was one of the pioneers in applying Neural Networks to speech and image recognition, and he led the deployment of the first NN used for an automation task (check reading). At AT&T Labs Research, he was an expert in the learning algorithms that enable data engineers to efficiently train machines using real-world data, for tasks ranging from language understanding to network monitoring. He was also an expert advisor to the European Union for its funding programs on machine learning and cognitive sciences. Dr. Haffner is a Lead Inventive Scientist at Interactions, with responsibility for managing the ever-increasing variety of machine learning techniques and software that an AI-driven company needs to use.
Dr. Michael Johnston has over 25 years of experience in speech and language technology. His research lies at the intersection of Natural Language Processing, human-computer interaction, and spoken and multimodal dialog. More specifically, his work focuses on the development of language and dialog processing techniques that support spoken and multimodal interaction, and on the application of these techniques to the creation of novel systems and services. Dr. Johnston has over 50 technical papers and 32 patents in speech and language processing. Before joining Interactions, he held positions at AT&T Labs Research, Oregon Graduate Institute, and Brandeis University. He is a member of the board of AVIOS and editor and chair for the W3C EMMA multimodal standard.