








These notes cover the Fisher linear discriminant method for feature selection in computational functional genomics. The Fisher linear discriminant finds the projection onto a line that preserves the direction useful for data classification. The notes explain how to find the best direction w, how to measure the separation between the projected points, and how to obtain the optimal line direction w. They also cover the concept of scatter and its relation to variance, as well as the Fisher linear discriminant's objective function. An example illustrates the concepts.

Prof. Yang Dai BioE 594 Computational Functional Genomics
Project a set of points $x_1, \dots, x_n$ in $d$-dimensional space onto a line.
$n_1$: number of the points in subset $D_1$, label +;
$n_2$: number of the points in subset $D_2$, label -.
Let $\|w\| = 1$ and consider the inner products of $w$ and $x_i$:
$z_i = w^T x_i$
$z_i$ is the projection of $x_i$ onto a line in the direction $w$.
Divide the $z_i$ into two classes according to the labels of the $x_i$.
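The projection step can be sketched in a few lines of plain Python. This is only an illustration: the toy points, the labels, the direction `[3.0, 4.0]`, and the helper names `unit_vector` and `project` are assumptions made for the example and do not come from the notes.

```python
import math

def unit_vector(w):
    """Scale w so that its Euclidean norm is 1."""
    norm = math.sqrt(sum(c * c for c in w))
    return [c / norm for c in w]

def project(points, w):
    """Return the inner product z_i = w^T x_i for every point x_i."""
    return [sum(wj * xj for wj, xj in zip(w, x)) for x in points]

# Hypothetical toy data: three 2-D points with labels + and -.
points = [(1.0, 2.0), (3.0, 0.0), (-1.0, 1.0)]
labels = ["+", "+", "-"]

w = unit_vector([3.0, 4.0])   # direction of the line, normalized so ||w|| = 1
z = project(points, w)        # one scalar per original point

# Split the projections into two classes by the label of the original point.
z_pos = [zi for zi, lab in zip(z, labels) if lab == "+"]
z_neg = [zi for zi, lab in zip(z, labels) if lab == "-"]
```

Each `z[i]` is a position along the line, so the two lists `z_pos` and `z_neg` are the one-dimensional versions of the two labeled subsets.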
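How to find the best direction $w$?
A measure of the separation between the projected points is the difference of the sample means.
Let $\mu_k = \frac{1}{n_k} \sum_{x \in D_k} x$ for $k = 1, 2$. Then the sample mean for the projected points of each class is
$\tilde{\mu}_k = \frac{1}{n_k} \sum_{x \in D_k} w^T x = w^T \mu_k$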
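As a quick numerical check, the sketch below computes the class means and the means of the projected points, and confirms that projecting the mean gives the same number as averaging the projections. The two toy classes `D1` and `D2`, the direction `w`, and the helper names are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def mean_vector(points):
    """Sample mean of a list of d-dimensional points."""
    n = len(points)
    d = len(points[0])
    return [sum(x[j] for x in points) / n for j in range(d)]

# Hypothetical toy classes in 2-D.
D1 = [(0.0, 0.0), (2.0, 2.0), (1.0, 0.0), (1.0, 2.0)]
D2 = [(4.0, 3.0), (6.0, 3.0), (5.0, 5.0), (5.0, 1.0)]
w = [0.6, 0.8]  # assumed unit-length direction

mu1 = mean_vector(D1)
mu2 = mean_vector(D2)

# Mean of the projected points of each class.
proj_mean1 = sum(dot(w, x) for x in D1) / len(D1)
proj_mean2 = sum(dot(w, x) for x in D2) / len(D2)

# Separation measured as the difference of the projected sample means.
separation = abs(proj_mean1 - proj_mean2)
```

Because the projection is linear, `proj_mean1` agrees with `dot(w, mu1)` up to rounding, so the separation can be computed from the class means alone.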
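The scatter for the projected samples of class $D_k$ is
$\tilde{s}_k^2 = \sum_{x \in D_k} (w^T x - \tilde{\mu}_k)^2$
where $\tilde{\mu}_k$ is the sample mean of the projected points of class $k$.
Scatter measures the same thing as variance, the spread of data around the mean. It is just on a different scale than variance.
$\tilde{s}_1^2 + \tilde{s}_2^2$: the total within-class scatter of the projected samples.
The Fisher linear discriminant employs that linear function $w^T x$ for which
$J(w) = \frac{(\tilde{\mu}_1 - \tilde{\mu}_2)^2}{\tilde{s}_1^2 + \tilde{s}_2^2}$
is maximized.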
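To make the objective concrete, the following sketch evaluates the ratio of squared mean difference to total within-class scatter for two candidate directions. It assumes the criterion has the standard form $J(w) = (\tilde{\mu}_1 - \tilde{\mu}_2)^2 / (\tilde{s}_1^2 + \tilde{s}_2^2)$; the toy classes `D1` and `D2` and the function name `fisher_criterion` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def fisher_criterion(D1, D2, w):
    """Squared difference of projected means over total projected scatter."""
    z1 = [dot(w, x) for x in D1]
    z2 = [dot(w, x) for x in D2]
    m1 = sum(z1) / len(z1)
    m2 = sum(z2) / len(z2)
    # Scatter is the sum of squared deviations (not divided by n).
    s1 = sum((z - m1) ** 2 for z in z1)
    s2 = sum((z - m2) ** 2 for z in z2)
    return (m1 - m2) ** 2 / (s1 + s2)

# Hypothetical toy classes in 2-D.
D1 = [(0.0, 0.0), (2.0, 2.0), (1.0, 0.0), (1.0, 2.0)]
D2 = [(4.0, 3.0), (6.0, 3.0), (5.0, 5.0), (5.0, 1.0)]

J_horizontal = fisher_criterion(D1, D2, [1.0, 0.0])  # project onto the first axis
J_vertical = fisher_criterion(D1, D2, [0.0, 1.0])    # project onto the second axis
J_scaled = fisher_criterion(D1, D2, [2.0, 0.0])      # same direction, different length
```

On this toy data the horizontal direction gives the larger value, so it separates the projected classes better. Rescaling `w` leaves the value unchanged, because the numerator and the denominator both scale with the square of its length.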
$w = S_W^{-1}(\mu_1 - \mu_2)$, where $S_W$ is the within-class scatter matrix and $\mu_1$, $\mu_2$ are the class sample means; the magnitude of $w$ does not matter. This $w$ is the direction determined by Fisher's linear discriminant. Thus the classification has been converted from a $d$-dimensional problem to a more manageable one-dimensional one. All that remains is to find the threshold, i.e., the point along the one-dimensional subspace separating the projected points.
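The notes do not say how the threshold is chosen, so the sketch below uses one simple option as an assumption: the midpoint between the two projected class means. The direction `w` is hard-coded as if it had already been computed, and the toy data and the names `midpoint_threshold` and `classify` are hypothetical.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def midpoint_threshold(D1, D2, w):
    """Return (threshold, m1, m2): the midpoint of the projected class means.

    Assumption: the midpoint rule is just one simple choice of threshold.
    """
    m1 = sum(dot(w, x) for x in D1) / len(D1)
    m2 = sum(dot(w, x) for x in D2) / len(D2)
    return 0.5 * (m1 + m2), m1, m2

def classify(x, w, threshold, label_above, label_below):
    """Label a point by which side of the threshold its projection falls on."""
    return label_above if dot(w, x) > threshold else label_below

# Hypothetical toy classes (label + and label -) and an assumed direction.
D1 = [(0.0, 0.0), (2.0, 2.0), (1.0, 0.0), (1.0, 2.0)]   # label +
D2 = [(4.0, 3.0), (6.0, 3.0), (5.0, 5.0), (5.0, 1.0)]   # label -
w = [-1.0, 0.0]

threshold, m1, m2 = midpoint_threshold(D1, D2, w)
# The class whose projected mean is larger lies above the threshold.
label_above, label_below = ("+", "-") if m1 > m2 else ("-", "+")

pred_a = classify((1.5, 1.0), w, threshold, label_above, label_below)
pred_b = classify((5.5, 2.0), w, threshold, label_above, label_below)
```

Other threshold rules, for example ones that weight the classes by their size or spread, fit the same one-dimensional picture; only `midpoint_threshold` would change.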
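Compute the scatter matrices $S_1$ and $S_2$ for each class.
Within-class scatter matrix: $S_W = S_1 + S_2$.
Compute the inverse of $S_W$, $S_W^{-1}$.
Therefore the optimal line direction is $w = S_W^{-1}(\mu_1 - \mu_2)$.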
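The numbers of the original example did not survive extraction, so the sketch below walks through the same steps on hypothetical 2-D data instead. It assumes the usual definitions $S_k = \sum_{x \in D_k} (x - \mu_k)(x - \mu_k)^T$ and $S_W = S_1 + S_2$; all data values and helper names are invented for illustration, and the inverse is written for the 2x2 case only.

```python
def mean_vector(points):
    """Sample mean of a list of d-dimensional points."""
    n = len(points)
    d = len(points[0])
    return [sum(x[j] for x in points) / n for j in range(d)]

def scatter_matrix(points):
    """Sum over the class of outer products (x - mu)(x - mu)^T."""
    mu = mean_vector(points)
    d = len(mu)
    S = [[0.0] * d for _ in range(d)]
    for x in points:
        dev = [x[j] - mu[j] for j in range(d)]
        for a in range(d):
            for b in range(d):
                S[a][b] += dev[a] * dev[b]
    return S

def inverse_2x2(M):
    """Closed-form inverse of a 2x2 matrix (this sketch handles d = 2 only)."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical toy classes in 2-D.
D1 = [(0.0, 0.0), (2.0, 2.0), (1.0, 0.0), (1.0, 2.0)]
D2 = [(4.0, 3.0), (6.0, 3.0), (5.0, 5.0), (5.0, 1.0)]

S1 = scatter_matrix(D1)
S2 = scatter_matrix(D2)
S_W = [[S1[a][b] + S2[a][b] for b in range(2)] for a in range(2)]
S_W_inv = inverse_2x2(S_W)

mu1, mu2 = mean_vector(D1), mean_vector(D2)
diff = [mu1[j] - mu2[j] for j in range(2)]

# Direction given by S_W^{-1} (mu1 - mu2); only its orientation matters.
w = [sum(S_W_inv[a][b] * diff[b] for b in range(2)) for a in range(2)]
```

For this toy data the resulting direction lies along the first coordinate axis, which matches the intuition that the two classes differ mostly in their first coordinate relative to their spread.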
1D vector $y$ for each class
$S_B$ and $S_W$ are the scatter matrices for the projections, and the criterion is the ratio of determinants $|S_B| / |S_W|$.
(a previously determined value) then stop, otherwise go to 2.