Artificial Intelligence - Computer Science I Programming - Slides | CSCI 1300, Study notes of Computer Science

University of Colorado Boulder (CU Boulder), Computer Science

Material Type: Notes; Class: Computer Science 1: Programming; Subject: Computer Science; University: University of Colorado - Boulder; Term: Unknown 1989;

Typology: Study notes

Pre 2010

Uploaded on 02/10/2009

CSCI 1300

Artificial Intelligence Lecture

Mike Mozer

December 4, 2003

Computer Science

  Operating Systems
  Programming Languages
  Networking
  Security
  Theory
  Artificial Intelligence

Machine Learning

Supervised Learning
  spam filters (hotmail.com)
  ALVINN (autonomous vehicle navigation)

Unsupervised Learning
  collaborative filtering (amazon.com)
  fault monitoring

Reinforcement Learning
  TD-Gammon (champion backgammon playing program)
  elevator controller
  adaptive home lighting/heating control

Reinforcement Learning: A Simple Example

Suppose you are in one of two states:
  hungry
  sleepy

Suppose you can take one of two actions:
  go to Turley’s
  lie on bed

Reward contingencies:
  hungry -> go to Turley’s: reward
  hungry -> lie on bed: no reward
  sleepy -> go to Turley’s: no reward
  sleepy -> lie on bed: reward

Reward depends on what action you take in a given state.
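
The reward contingencies above can be written as a lookup keyed on (state, action). This is a minimal sketch: the numeric values 1.0 and 0.0 are an assumption (the slide only says "reward" and "no reward"), and the helper names `reward` and `best_action` are hypothetical.

```python
# Reward table from the slide; keys are (state, action) pairs.
# Assumption: "reward" is encoded as 1.0 and "no reward" as 0.0.
REWARD = {
    ("hungry", "go to Turley's"): 1.0,
    ("hungry", "lie on bed"): 0.0,
    ("sleepy", "go to Turley's"): 0.0,
    ("sleepy", "lie on bed"): 1.0,
}


def reward(state: str, action: str) -> float:
    """Look up the reward for taking `action` in `state`."""
    return REWARD[(state, action)]


def best_action(state: str) -> str:
    """Return the action with the highest reward in the given state."""
    actions = sorted({a for (_, a) in REWARD})
    return max(actions, key=lambda a: reward(state, a))


for s in ("hungry", "sleepy"):
    print(s, "->", best_action(s))
```

The same action earns a reward in one state and nothing in the other, which is why the table has to be indexed by both.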

Reinforcement Learning in the Real World

Issues
  Delayed reinforcement (e.g., car accident due to worn tires)
  Occasional reinforcement (e.g., chess playing)
  Short term versus long term rewards (e.g., skipping class)
  Exploration versus exploitation (e.g., trying new restaurants)
  Partially observable state (e.g., viral infection)
  Multiple agents (e.g., multiple elevators)
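
One common way to weigh short term against long term rewards is a discounted sum, in which a reward arriving t steps in the future is scaled by the discount factor raised to the power t. The sketch below illustrates the "skipping class" item; the reward sequences are invented for illustration and do not come from the lecture.

```python
def discounted_return(rewards, discount):
    """Sum of rewards, with the reward at step t weighted by discount ** t."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (discount ** t) * r
    return total


# Hypothetical reward sequences (illustrative numbers only):
# skipping class feels good now and hurts later; attending is the reverse.
skip_class = [1.0, 0.0, 0.0, -5.0]
attend_class = [-1.0, 0.0, 0.0, 5.0]

for discount in (0.1, 0.9):
    skip = discounted_return(skip_class, discount)
    attend = discounted_return(attend_class, discount)
    choice = "skip" if skip > attend else "attend"
    print(f"discount={discount}: skip={skip:.3f}, attend={attend:.3f} -> {choice}")
```

With a small discount factor the immediate reward dominates and skipping looks better. With a discount factor near 1 the delayed penalty outweighs it.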

[Figure: timeline of time intervals 1 through 7, showing for each interval t the state s^t, the action a^t, and the instantaneous reinforcement r^t.]
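
The sequence of states, actions, and instantaneous reinforcements over seven time intervals can be produced by a simple interaction loop. The sketch below reuses the hungry/sleepy example. The state dynamics (the next state is drawn at random) and the names `run_episode` and `sensible_policy` are assumptions made for illustration, since the slides do not specify how the state changes.

```python
import random

STATES = ["hungry", "sleepy"]
ACTIONS = ["go to Turley's", "lie on bed"]


def reward(state, action):
    """Reward of 1.0 for the rewarded pairings, else 0.0 (values assumed)."""
    rewarded = {("hungry", "go to Turley's"), ("sleepy", "lie on bed")}
    return 1.0 if (state, action) in rewarded else 0.0


def run_episode(policy, steps=7, seed=0):
    """Return a list of (t, state, action, reward) tuples, one per time interval."""
    rng = random.Random(seed)
    state = rng.choice(STATES)
    trace = []
    for t in range(1, steps + 1):
        action = policy(state)
        r = reward(state, action)
        trace.append((t, state, action, r))
        state = rng.choice(STATES)  # assumed dynamics: next state is random
    return trace


def sensible_policy(state):
    """Eat when hungry, rest when sleepy."""
    return "go to Turley's" if state == "hungry" else "lie on bed"


trace = run_episode(sensible_policy)
for t, s, a, r in trace:
    print(f"t={t}: s={s}, a={a}, r={r}")
```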

Elevator Control

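Q learning

(Watkins, 1989; Watkins & Dayan, 1992)

Q(x,u): If action u is taken in state x, what is the minimum cost we can expect to obtain?

Policy based on Q values:

$$
\pi(x_t) =
\begin{cases}
\arg\min_{u} Q(x_t, u) & \text{with probability } 1 - \theta \\
\text{random} & \text{with probability } \theta
\end{cases}
$$

where $\theta$ is the exploration rate.

Incremental update rule for Q values:

$$
Q(x_t, u_t) \leftarrow (1 - \alpha)\, Q(x_t, u_t) + \alpha \min_{\hat{u}} \left[ c_t + \lambda\, Q(x_{t+1}, \hat{u}) \right]
$$

where $c_t$ is the cost incurred at time $t$, $\lambda$ is the discount factor, and $\alpha$ is the learning rate. Since Q is a cost to be minimized, the backup takes the minimum over next actions $\hat{u}$, consistent with the argmin in the policy.

Given fully observable state, infinite exploration, etc., guaranteed to converge on optimal policy.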
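
A tabular version of this Q-learning procedure might look like the sketch below, applied to the hungry/sleepy example. Several things here are assumptions rather than part of the lecture: the cost values (0 for a rewarded pairing, 1 otherwise), the random state dynamics, the parameter values, the function names, and the reading of the backup as a minimum over next actions (chosen because Q is defined as a cost).

```python
import random
from collections import defaultdict

STATES = ["hungry", "sleepy"]
ACTIONS = ["go to Turley's", "lie on bed"]


def cost(state, action):
    """Assumed cost: 0 for the rewarded pairings, 1 otherwise."""
    good = {("hungry", "go to Turley's"), ("sleepy", "lie on bed")}
    return 0.0 if (state, action) in good else 1.0


def choose_action(Q, x, exploration_rate, rng):
    """Random action with probability exploration_rate, else the lowest-Q action."""
    if rng.random() < exploration_rate:
        return rng.choice(ACTIONS)
    return min(ACTIONS, key=lambda u: Q[(x, u)])


def q_update(Q, x, u, c, x_next, learning_rate, discount):
    """Blend the old estimate with the cost plus the discounted best next value."""
    best_next = min(Q[(x_next, u_hat)] for u_hat in ACTIONS)
    target = c + discount * best_next
    Q[(x, u)] = (1 - learning_rate) * Q[(x, u)] + learning_rate * target


def train(steps=2000, exploration_rate=0.2, learning_rate=0.1, discount=0.9, seed=0):
    rng = random.Random(seed)
    Q = defaultdict(float)  # every Q value starts at 0
    x = rng.choice(STATES)
    for _ in range(steps):
        u = choose_action(Q, x, exploration_rate, rng)
        x_next = rng.choice(STATES)  # assumed dynamics: next state is random
        q_update(Q, x, u, cost(x, u), x_next, learning_rate, discount)
        x = x_next
    return Q


Q = train()
greedy = {x: min(ACTIONS, key=lambda u: Q[(x, u)]) for x in STATES}
print(greedy)
```

Because the cost does not depend on the next action, taking the minimum of the bracketed term is the same as adding the cost to the discounted minimum of the next Q values, which is how `q_update` computes it.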

The Adaptive House

Michael Mozer
Robert Dodier
Debra Miller
Marc Anderson
Josh Anderson
Diane Lukianow
Dan Bertini
Tom Moyer
Matt Bronder
Charles Myers
Michael Colagrosso
Tom Pennell
Robert Cruickshank
James Ries
Brian Daugherty
Erik Skorpen
Mark Fontenot
Joel Sloss
Okechukwu Ikeako
Lucky Vidmar
Paul Kooros
Matthew Weeks

University of Colorado
  Department of Computer Science
  Institute of Cognitive Science
  Department of Civil, Environmental, and Architectural Engineering
  Department of Electrical and Computer Engineering
  Department of Mechanical Engineering
  Department of Aerospace Engineering

The adaptive house

Some of the gang

Bedrooms and bathrooms

Sensors

Water heater

Furnace