





























Some solutions to Mitchell's book Machine Learning






























1 TODO An empty module that gathers the exercises’ dependencies
2 Exercises
  2.1 DONE 1.1
  2.2 DONE 1.2
  2.3 DONE 1.3
  2.4 DONE 1.4
  2.5 TODO 1.5
3 Notes
  3.1 Chapters
    3.1.1 1
  3.2 Exercises
    3.2.1 1.3
    3.2.2 1.4
    3.2.3 1.5
such that running chicken-install -s installs them.
CLOSED: 2011-10-12 Wed 04:
Appropriate
- animal languages: could craft appropriate responses and prompts, perhaps, though ignorant of the semantics.
- fugues: train on Bach data, or Buxtehude. performance measure? perfect authentic cadence, of course. ;) no, not that simple.
- narratives: learn the structure of narratives? performance measure is tricky here.

Not appropriate
- comedy: requires a bizarre ex nihilo and spontaneity (distinguishable from the three above?). in fact, the second and third above are inappropriate, rather? define “inappropriate”: difficult? vague performance measure?
- data representation and search: or have meta-learning problems been solved?
- new science and mathematics: can “creativity” be modelled?
So we can’t, indeed, escape the question of modelling; once the mechanics of learning have been mastered, there lies the ex nihilo.
CLOSED: 2011-10-12 Wed 04:
Learning task: produces melodic answers to query phrases. Given a phrase that ends on, say, a dominant within a key, it gives an appropriate response that ends on the tonic. The response must follow a constrained set of progressions (subdominant to dominant, dominant to tonic, flat-six to Neapolitan, etc.) and be of an appropriate length.
task T: constructing answering phrases to musical prompts (chords)
performance measure P: percent of answers that return to the tonic at the end (given appropriate length and progression constraints)
training experience E: expert (Bach, Chopin, Beethoven) prompts and answers
target function: $V : \text{progression} \to \mathbb{R}$; $V(b = \text{final tonic}) = 100$, $V(b = \text{final non-tonic}) = -100$
target function representation: $\hat{V}(b) = w_0 + w_1 x_1$, where $x_1$ = length of prompt − number of chords in answer
- very difficult to learn given the kind of indirect training experience available
- alternative target function: assigns a numerical score to any given board state
x1: number of black pieces
x2: number of red pieces
x3: number of black kings
x4: number of red kings
x5: number of black pieces threatened by red
x6: number of red pieces threatened by black
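A minimal sketch of how the six features above could feed a linear evaluation function of the form w0 + w1·x1 + ... + w6·x6. The name `v-hat`, the weight values and the example positions are made up for illustration; they are not taken from the book or from the notes.

```scheme
;; Hypothetical sketch: linear evaluation over the features x1..x6.
;; weights  = (w0 w1 ... w6), where w0 is the constant term
;; features = (x1 ... x6)
(define (v-hat weights features)
  (let loop ((ws (cdr weights))
             (xs features)
             (sum (car weights)))
    (if (null? ws)
        sum
        (loop (cdr ws) (cdr xs) (+ sum (* (car ws) (car xs)))))))

;; Made-up weights: own pieces and kings count for black, the
;; opponent's count against; threatened black pieces are penalized.
(define example-weights '(0 1 -1 3 -3 -2 2))

;; Opening position: 12 pieces each, no kings, nothing threatened.
(define opening-features '(12 12 0 0 0 0))

(display (v-hat example-weights opening-features))
(newline)
```

With these weights a symmetric position scores zero, and extra black material pushes the score up, which is all the sketch is meant to show.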
T playing checkers
mean squared error (MSE) is a risk function, corresponding to the expected value of the squared error loss or quadratic loss... the difference occurs because of randomness or because the estimator doesn’t account for information that could produce a more accurate estimate.
http://en.wikipedia.org/wiki/Mean_squared_error
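As a small illustration, the empirical counterpart of that expectation is the average of the squared differences between estimates and true values. The function name and the sample numbers below are assumptions made for this sketch.

```scheme
;; Hypothetical sketch: empirical mean squared error of a list of
;; estimates against a list of target values of the same length.
;; Both lists are assumed to be non-empty.
(define (mean-squared-error estimates targets)
  (let loop ((es estimates) (ts targets) (sum 0) (count 0))
    (if (null? es)
        (/ sum count)
        (let ((err (- (car es) (car ts))))
          (loop (cdr es) (cdr ts) (+ sum (* err err)) (+ count 1))))))

;; Three exact estimates and one that is off by 2:
;; squared errors 0, 0, 4, 0, so the mean is 4/4 = 1.
(display (mean-squared-error '(1 2 3 4) '(1 2 5 4)))
(newline)
```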
the least mean squares (LMS) algorithm is a type of adaptive filter used to mimic a desired filter by finding the filter coefficients that produce the least mean square of the error signal (the difference between the desired and the actual signal). it is a stochastic gradient descent method in that the filter is only adapted based on the error at the current time.

the idea behind LMS filters is to use steepest descent to find filter weights $\hat{h}(n)$ which minimize a cost function:

$$C(n) = E\{|e(n)|^2\}$$

where $e(n)$ is the error at the current sample $n$ and $E\{\cdot\}$ denotes the expected value. this cost function is the mean square error, and it is minimized by the LMS. applying steepest descent means taking the partial derivatives with respect to the individual entries of the filter coefficient (weight) vector, where $\nabla$ is the gradient operator:

$$\hat{h}(n+1) = \hat{h}(n) - \frac{\mu}{2} \nabla C(n) = \hat{h}(n) + \mu E\{x(n) e^*(n)\}$$

where $\mu/2$ is the step size. that means we have found a sequential update algorithm which minimizes the cost function. unfortunately, this algorithm is not realizable until we know $E\{x(n) e^*(n)\}$. for most systems, the expectation must be approximated. this can be done with the following unbiased estimator:

$$\hat{E}\{x(n) e^*(n)\} = \frac{1}{N} \sum_{i=0}^{N-1} x(n-i) e^*(n-i)$$

where $N$ indicates the number of samples we use for that estimate. the simplest case is $N = 1$:

$$\hat{h}(n+1) = \hat{h}(n) + \mu x(n) e^*(n)$$
http://en.wikipedia.org/wiki/Least_mean_squares_filter
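The N = 1 case can be sketched for real-valued signals, where the conjugate e*(n) is just e(n). Everything concrete here is an assumption made for illustration: the two-tap "unknown" filter, the period-4 input pattern, the step size and the helper names.

```scheme
;; Hypothetical sketch of the N = 1 LMS update for real signals:
;;   h(n+1) = h(n) + mu * x(n) * e(n),  e(n) = d(n) - h(n) . x(n)
(define (dot u v)
  (if (null? u)
      0
      (+ (* (car u) (car v)) (dot (cdr u) (cdr v)))))

(define (lms-step h x d mu)
  (let ((e (- d (dot h x))))
    (map (lambda (h-i x-i) (+ h-i (* mu x-i e))) h x)))

;; Made-up system to identify: d(n) = 2 x(n) - x(n-1).
(define true-taps '(2 -1))

;; Made-up input signal: the repeating pattern 1, 1, -1, -1, ...
(define (input n)
  (if (< (modulo n 4) 2) 1 -1))

;; Feed the filter one sample at a time; the regressor at time n is
;; the window (x(n) x(n-1)).
(define (identify steps mu)
  (let loop ((n 1) (h '(0 0)))
    (if (> n steps)
        h
        (let ((x (list (input n) (input (- n 1)))))
          (loop (+ n 1) (lms-step h x (dot true-taps x) mu))))))

(define learned (identify 200 0.1))
(display learned) ; should be close to (2 -1)
(newline)
```

The desired signal is noise-free here, so the estimate settles on the true taps; with noisy data it would only hover around them.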
in probability theory and statistics, the expected value (or expectation value, or mathematical expectation, or mean, or first moment) of a random variable is the integral of the random variable with respect to its probability measure. for discrete random variables this is equivalent to the probability-weighted sum of the possible values. for continuous random variables with a density function it is the probability density-weighted integral of the possible values. it is often helpful to interpret the expected value of a random variable as the long-run average value of the variable over many independent repetitions of an experiment. the expected value, when it exists, is almost surely the limit of the sample mean as sample size grows to infinity.
http://en.wikipedia.org/wiki/Expected_value
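For the discrete case, the probability-weighted sum can be written down directly. This is a sketch; the representation of a distribution as a list of (value . probability) pairs and the fair-die example are assumptions chosen for illustration.

```scheme
;; Hypothetical sketch: expected value of a discrete random variable
;; given as a list of (value . probability) pairs.
(define (expected-value distribution)
  (let loop ((pairs distribution) (sum 0))
    (if (null? pairs)
        sum
        (loop (cdr pairs)
              (+ sum (* (car (car pairs)) (cdr (car pairs))))))))

;; A fair six-sided die: every face has probability 1/6.
(define fair-die
  (map (lambda (face) (cons face (/ 1.0 6)))
       '(1 2 3 4 5 6)))

(display (expected-value fair-die)) ; approximately 3.5
(newline)
```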
- damn, every time we encroach on something interesting; find out why differential equations, linear algebra, probability and statistics are so important. that’s like two years of fucking work, isn’t it? or at least one? maybe it’s worth it, if we can pull it
- program represents the learned eval function using an artificial neural network that considers the complete description of the board state rather than a subset of board features.
From page 11: “The LMS training rule can be viewed as performing a stochastic gradient-descent search through the space of possible hypotheses (weight values) to minimize the squared error E.”
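A sketch of that rule as I read it from the chapter: for each training example, every weight moves by eta · (V_train(b) − V̂(b)) · x_i, with a constant x0 = 1 for w0. The feature values, training value, learning rate and function names below are made up for illustration.

```scheme
;; Linear evaluation: w0 + w1*x1 + ... + wn*xn, with a constant x0 = 1.
(define (v-hat weights features)
  (apply + (map * weights (cons 1 features))))

;; One LMS step: w_i <- w_i + eta * (v-train - v-hat(b)) * x_i.
(define (lms-update weights features v-train eta)
  (let ((err (- v-train (v-hat weights features))))
    (map (lambda (w x) (+ w (* eta err x)))
         weights
         (cons 1 features))))

;; Repeat the update on a single (made-up) training example.
(define (train weights features v-train eta steps)
  (if (zero? steps)
      weights
      (train (lms-update weights features v-train eta)
             features v-train eta (- steps 1))))

(define features '(2 1))   ; hypothetical feature values x1, x2
(define trained (train '(0 0 0) features 10 0.1 50))

(display (v-hat trained features)) ; close to the training value 10
(newline)
```

Each step shrinks the error on this example by a constant factor, which is the gradient-descent behaviour the quote describes; a real learner would cycle over many board states instead of one.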
[Figure: Training --(games against self)--> V --(board → value)--> Representation --(linear function)--> Algorithm --(gradient descent)--> Design]
Figure 1: Summary of design
Experiment generator: takes as input the current hypothesis and outputs a new problem for the performance system to explore. Our experiment generator always proposes the same initial game board. More sophisticated
x9: X empty corner?
x10: O empty corner?
x11: X empty side?
x12: O empty side?
Page 8: “In general, this choice of representation involves a crucial tradeoff. On one hand, we wish to pick a very expressive representation to allow representing as close an approximation as possible to the ideal target function V. On the other hand, the more expressive the representation, the more training data the program will require in order to choose among the alternative hypotheses it can represent.”

Here’s a crazy thought: since the state-space complexity of tic-tac-toe is utterly tractable, let’s have nine features: one corresponding to each of the squares. How do we deal with training the opposite direction, by the way: invert the outcome of the training data? I have no idea how much training data nine variables need: we’ll have to plot it; interesting to compare a strategy containing e.g. forks and wins. Is it interesting that each variable is binary?

Let’s start with the generalizer and a catalog of games; in order to map the number of training-examples... Ah, I see: the second player has a fixed evaluation function. Can we abstract xkcd? Problem is, the space for O is much more complicated. Maybe we can abstract the Wikipedia strategy.
(It looks like the Wikipedia strategy was abstracted from here, by the way; damn: it looks like there are separate X- and O-heuristics.) Represent the board as a vector of nine values; can we set up abstractions for <x, y> as well as {map,reduce,for-each}-{row,column,diagonal,triplet}? Meh; maybe we can implement the X/O-agnostic heuristics.
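One way the nine-values idea might look as a feature vector, purely as a sketch: each square becomes one number. The encoding (X as 1, O as -1, empty as 0) and the symbols 'X, 'O and '_ are assumptions for illustration and are not the marks used in the listing that follows.

```scheme
;; Hypothetical encoding of one square as a numeric feature.
(define (mark->feature mark)
  (cond ((eq? mark 'X) 1)
        ((eq? mark 'O) -1)
        (else 0)))

;; A board is a vector of nine marks, read row by row.
(define (board->features board)
  (map mark->feature (vector->list board)))

;; Viewing the same position from O's side negates every feature,
;; one possible answer to "training the opposite direction".
(define (invert-features features)
  (map - features))

(define sample-board (vector 'X '_ 'O
                             '_ 'X '_
                             '_ '_ 'O))

(display (board->features sample-board))
(newline)
```

Under this encoding the features are three-valued rather than binary; a strictly binary variant would need two features per square.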
;;;; Tic-tac-toe with heuristic player
(use debug
vector-lib srfi- srfi- srfi-26)
;;;; General tic-tac-toe definitions
(define n (make-parameter 3))
(define (n-by-n) (* (n) (n)))
(define (row start) (iota (n) start))
(define (column start) (iota (n) start (n)))
(define (a) 0)
(define (b) (- (n) 1))
(define (c) (- (n-by-n) 1))
(define (d) (- (c) (- (n) 1)))
(define (ac-diagonal) (iota (n) (a) (+ (n) 1)))
(define (bd-diagonal) (iota (n) (b) (- (n) 1)))
(define (rows) (map row (iota (n) (a) (n))))
(define (columns) (map column (iota (n))))
(define (diagonals) (list (ac-diagonal) (bd-diagonal)))
(define (tuplets) (append (rows) (columns) (diagonals)))
(vector-map (lambda (i mark) (cond ((X? mark) "X") ((O? mark) "O") (else " "))) board))))
(define (display-board board) (display (board->string board)))
(define (make-empty-board) (make-board (n-by-n) ))
;;; Functional variant of Knuth shuffle: partitions the cards around a
;;; random pivot, takes the first card of the right-partition, repeat.
(define shuffle
  (case-lambda
   ((deck) (shuffle '() deck))
   ((shuffled-deck deck)
    (if (null? deck)
        shuffled-deck
        (let ((pivot (random (length deck))))
          (let ((left-partition (take deck pivot))
                (right-partition (drop deck pivot)))
            (shuffle (cons (car right-partition) shuffled-deck)
                     (append left-partition (cdr right-partition)))))))))
(define (make-random-board)
  (let ((board (make-empty-board)))
    (let iter ((moves (random (n-by-n)))
               (indices (shuffle (iota (n-by-n)))))
      (if (zero? moves)
          board
          (let ((mark (random (length indices))))
            ;; You may end up with a board where there are more Os
            ;; than Xs.
            (vector-set! board
                         (car indices)
                         (if (even? moves) X O))
            (iter (- moves 1) (cdr indices)))))))
(define (fold-tuplet cons nil board)
  (fold (lambda (tuplet accumulatum)
          (cons tuplet accumulatum))
        nil
        (tuplets)))
;;;; Play mechanics
(define (empty-spaces board)
  (vector-fold (lambda (space empty-spaces mark)
                 (if (empty? mark)
                     (cons space empty-spaces)
                     empty-spaces))
               '()
               board))

;;; Putting tuplet first would allow you to use many boards.
(define (first-empty-space board tuplet)
  (find (cute empty? board <>) tuplet))
(define (winning-tuplet? player? tuplet board) (let ((non-player-marks (filter (cute (complement player?) board <>) tuplet))) (equal? (map (cute board-ref board <>) non-player-marks) ‘(,))))
;;; The solutions here may be non-unique: in which case, we have a
;;; convergent fork.
(define (winning-spaces player? board)
  (fold-tuplet
   (lambda (tuplet winning-spaces)
     (if (winning-tuplet? player? tuplet board)
         (cons (first-empty-space board tuplet) winning-spaces)
         winning-spaces))
   '()
   board))

(define (forking-space? player? player space board)
  (let ((board (board-copy board)))
    (board-set! board space player)
    (> (length (winning-spaces player? board)) 1)))

(define (forking-spaces player? player board)
  (filter (lambda (space)
            (forking-space? player? player space board))
          (empty-spaces board)))
(define (center-space board) (/ (- (n-by-n) 1) 2))
(define (center-empty? board)
(lambda (board) (random-empty-space board)))
;;; http://www.buzzle.com/articles/tic-tac-toe-strategy-guide.html
(define (make-heuristic-player player? player opponent? opponent)
  (lambda (board)
    (let ((my-winning-spaces (winning-spaces player? board)))
      (if (null? my-winning-spaces)
          (let ((losing-spaces (winning-spaces opponent? board)))
            (if (null? losing-spaces)
                (let ((my-forking-spaces
                       (forking-spaces player? player board)))
                  (if (null? my-forking-spaces)
                      (let ((opponent-forking-spaces
                             (forking-spaces opponent? opponent board)))
                        (if (null? opponent-forking-spaces)
                            (if (center-empty? board)
                                (center-space board)
                                (let ((opposite-corners
                                       (opposite-corners player? board)))
                                  (if (null? opposite-corners)
                                      (let ((empty-corners
                                             (empty-corners board)))
                                        (if (null? empty-corners)
                                            (random-empty-space board)
                                            (random-ref empty-corners)))
                                      (random-ref opposite-corners))))
                            (random-ref opponent-forking-spaces)))
                      (random-ref my-forking-spaces)))
                (random-ref losing-spaces)))
          (random-ref my-winning-spaces)))))
(define (make-heuristic-X-player) (make-heuristic-player X? X O? O))
(define (make-heuristic-O-player) (make-heuristic-player O? O X? X))
(define make-default-X-player (make-parameter make-heuristic-X-player))
(define make-default-O-player (make-parameter make-heuristic-O-player))
;;; Can we get rid of =move= if we simply cycle through X and O;
;;; thence recurse?
(define play
  (case-lambda
   (()
    (play (make-empty-board)))
   ((board)
    (play ((make-default-X-player))
          ((make-default-O-player))
          board))
   ((X-player O-player board)
    (play 0 X-player O-player board))
   ((move X-player O-player board)
    (debug move)
    (display-board board)
    (or (outcome board)
        (let-values (((token player)
                      (if (even? move)
                          (values X X-player)
                          (values O O-player))))
          (let ((next-move (player board)))
            (board-set! board next-move token)
            (play (+ move 1) X-player O-player board)))))))
(use test debug)
(include "tic-tac-toe.scm")
(let ((board (board X X O O X O O X))) (test "winning-spaces with X" ’(2 2) (winning-spaces X? board)) (test "winning-spaces with O" ’(2) (winning-spaces O? board)))
(let ((board (board X X O O X O X))) (test "empty-spaces" ’(7 2) (empty-spaces board)))
(let ((board (board X X O