Deep Learning and Embeddings
(Remote) Lecture 18
COVID-19 Accommodations
- Classes, assignments, exams, etc. all remote through the rest of the semester
- For this class, this will mean diligence in working remotely with teammates
- PC5 (Cooperative Testing) has been moved back another week (now due 4/6)
- PC6 (Sprint Review 3) will now be delivered as a YouTube video (now also 4/6)
- PC7 (Final Presentations) will be a scheduled telecon with all of your team members, me, and one of the IAs (forthcoming)
- Look at the Piazza post; you can schedule a 30 minute block on my calendar via the link there
- Try to have most/all your team members present for that
- Grades now P/NRC with option to uncover letter grade
Recap: Applying to Conversational AI
- Intent Classification
- Data: tuples of (utterance, intent class)
- Model: clustering, SVM, rules
- Inference: mapping from model output to intent class label
- Slot Extraction
- Data: tuples of (token position, slot label)
- Model: n-grams, RNN
- Inference: RNN output mapped back to a vocabulary
One Slide Summary: Deep Learning and Embeddings
- Machine Learning is driven by applied statistics
- Simple linear models are more interpretable (e.g., best-fit line)
- More complex models yield better accuracy (trading off interpretability)
- Deep Learning is used in the NLP space to accurately represent language and classify intents and slots
- Deep learning allows black-boxing of inputs to eliminate the need to derive costly features or rules
- In particular, Recurrent Neural Networks and derivatives are state-of-the-art for NLU tasks
- Embeddings are numerical representations of NLU elements
- Expressed as fixed-dimensional vectors
- We say that we embed a token, sentence, or utterance into a vector space called the embedding space
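A minimal sketch of what "embedding" can look like in code. The 3-dimensional table below is hypothetical and its values are made up for illustration; real embeddings are learned from data and usually have far more dimensions. Averaging token vectors is only one simple way to embed a whole utterance, not necessarily the method used in this course.

```python
import math

# Hypothetical embedding table: every token maps to a vector of the same size.
# The numbers are invented for illustration, not learned from data.
EMBEDDINGS = {
    "i": [0.1, 0.0, 0.2],
    "want": [0.4, 0.1, 0.0],
    "a": [0.0, 0.0, 0.1],
    "burger": [0.9, 0.2, 0.1],
    "sandwich": [0.8, 0.3, 0.1],
    "nutrition": [0.1, 0.9, 0.3],
}
DIM = 3  # the fixed dimensionality of the embedding space


def embed_token(token):
    """Look up a token; unknown tokens map to the zero vector (an assumption)."""
    return EMBEDDINGS.get(token.lower(), [0.0] * DIM)


def embed_utterance(utterance):
    """Embed an utterance as the element-wise mean of its token vectors."""
    tokens = utterance.split()
    if not tokens:
        return [0.0] * DIM
    vectors = [embed_token(t) for t in tokens]
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(DIM)]


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0
```

With this toy table, `embed_utterance("I want a burger")` returns a 3-element vector, and "burger" sits closer to "sandwich" than to "nutrition" under cosine similarity.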
Machine Learning
- AI is an application of Machine Learning
- ML is an application of statistics to make predictions from existing data
Machine Learning
- Decision Trees can be used to classify inputs (e.g., tall vs. not tall; high risk vs. low risk)
- Example: cardiovascular risk
- Perhaps doctors have access to tons of old medical histories.
- Might notice clusters in data (i.e., domain expertise):
- Minimum systolic <= 90 -> high risk of death
- Old with sinus tachycardia rhythm -> high risk
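The two rules in the cardiovascular example can be sketched as a tiny hand-written decision tree. The function name, the boolean inputs, and the "low risk" fall-through are assumptions for illustration; the slide does not say what counts as "old" or what happens when neither rule fires.

```python
def cardiovascular_risk(min_systolic, is_old, has_sinus_tachycardia):
    """Classify a patient with the two rules from the example.

    Hypothetical helper: inputs and the default label are assumptions.
    """
    # First split: very low minimum systolic blood pressure.
    if min_systolic <= 90:
        return "high risk"
    # Second split: older patient with a sinus tachycardia rhythm.
    if is_old and has_sinus_tachycardia:
        return "high risk"
    # Assumed default when neither rule applies.
    return "low risk"
```

Each `if` is one internal node of the tree and each `return` is a leaf. A learned decision tree would pick these splits from the medical histories instead of from domain expertise.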
Machine Learning
- We use ML to teach software to make predictions
- Software learns from existing data
- Supervised learning: labeled data (e.g., tall vs. short)
- Examples: (20 y.o., 6ft, tall), (20 y.o., 5ft, short)
- Unsupervised learning: unlabeled data (e.g., just points of data)
- Examples: (20 y.o., 6ft), (20 y.o., 5ft)
- [Figure: Training: labeled / unlabeled data -> Machine Learning algorithm -> Learned model. Prediction: Learned model -> Prediction]
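A minimal sketch of the training and prediction phases for the supervised tall/short example. The "algorithm" here (a height threshold halfway between the class means) and the names `train` and `predict` are assumptions chosen to keep the example tiny; they are not the method from the lecture.

```python
def train(labeled_data):
    """Training phase: learn a height threshold from labeled examples.

    labeled_data is a list of (age_years, height_ft, label) tuples, and must
    contain at least one "tall" and one "short" example.
    """
    tall = [h for _, h, label in labeled_data if label == "tall"]
    short = [h for _, h, label in labeled_data if label == "short"]
    # The learned model is a single number: the midpoint of the class means.
    threshold = (sum(tall) / len(tall) + sum(short) / len(short)) / 2
    return {"height_threshold_ft": threshold}


def predict(model, example):
    """Prediction phase: apply the learned model to an unlabeled example."""
    _, height_ft = example
    return "tall" if height_ft >= model["height_threshold_ft"] else "short"


training_data = [(20, 6.0, "tall"), (20, 5.0, "short")]
model = train(training_data)
```

Here `train` plays the role of the machine learning algorithm, the returned dictionary is the learned model, and `predict(model, (20, 5.8))` produces a prediction for a data point the model has not seen.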
Machine Learning in an NLU Context
- A model allows us to quantify utterances. Depending on the specific model, we can visualize data
- [Figure: scatter plot of utterances; X: Feature 1, Y: Feature 2]
- Order_food intent class: “I want a burger”, “I want a chicken sandwich”
- Get_nutrition_info intent class: “What’s in your Caesar salad?”, “Tell me the nutrition in a milkshake.”
How do we pick features? (hint: it’s hard)
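To see why picking features is hard, here is a sketch of two hand-crafted features for the example utterances. The cue-word lists and the `featurize` helper are hypothetical, invented for this illustration; they are brittle on purpose.

```python
import re

# Hypothetical cue-word lists, chosen by hand for the four example utterances.
ORDER_CUES = {"want", "order"}
INFO_CUES = {"what", "nutrition", "tell"}


def featurize(utterance):
    """Map an utterance to a point (feature 1, feature 2).

    Feature 1 counts ordering cue words; feature 2 counts
    information-seeking cue words.
    """
    tokens = re.findall(r"[a-z]+", utterance.lower())
    feature_1 = sum(1 for t in tokens if t in ORDER_CUES)
    feature_2 = sum(1 for t in tokens if t in INFO_CUES)
    return (feature_1, feature_2)
```

The ordering utterances land at (1, 0) and the nutrition utterances land on the other axis, so the classes separate. An utterance like "I'd like a burger" lands at (0, 0) because "like" is not in either list, which is the kind of gap that makes hand-picked features costly to maintain.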
Deep Learning Crash Course
- Deep Learning is a catch-all phrase that refers to Neural Networks that have multiple layers (cf. a deep pipeline in computer architecture)
- “depth” = more layers
Neural Network
- We use Deep Neural Networks (DNNs) to perform classification of intents, slot mapping, and slot-value pairing
- DNNs can learn from (or “notice”) patterns in data that are not immediately obvious to human domain knowledge experts
- DNNs benefit from data
- As long as features are represented, DNNs can learn which ones are important
Deeper in NNs
- Each cell in a NN is a simple combination of floating-point inputs
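A sketch of one cell, assuming the common form: a weighted sum of the inputs plus a bias, passed through an activation function. The sigmoid activation and the name `cell` are assumptions; the lecture may use a different combination.

```python
import math


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def cell(inputs, weights, bias):
    """One cell: weighted sum of floating-point inputs, plus bias, then sigmoid."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return sigmoid(z)
```

For example, `cell([1.0, 2.0], [0.5, -0.25], 0.1)` computes 0.5 - 0.5 + 0.1 = 0.1 and then applies the sigmoid. A layer is many such cells sharing the same inputs, and training adjusts the weights and biases.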