























where c is called “coefficient of coincidence”. • Interference: I = 1 - c. Map Distance. The map distance x between two loci, in.
























Human Genome, Chromosomes
[Figure: graphical representation of the idealized human diploid karyotype, showing the autosomes and sex chromosomes with their centromeres; mitochondrial DNA not shown]
Sequences of Base Pairs Mapping
Genetic maps: relative positions of loci on chromosomes or linkage groups. Distances in genetic maps are measured in centimorgans (cM; in humans, 1 cM corresponds on average to roughly 1 million base pairs)
Physical maps: overlapping collections of DNA fragments (measured in kilobases, kb), which are assembled to build the base-by-base sequence of DNA
Crossing Over and Recombination
Because crossing over takes place at random, the probability of recombination (r) is higher for loci that are farther apart than for loci that are closer to each other
An odd number of crossovers between two loci results in a recombination between them
$0 \le r \le 0.5$, where $r = 0$ for completely linked loci and $r = 0.5$ for unlinked loci
Two Point Linkage Analysis
• Backcross experiment
• Genotypic information for two loci (A and B)
• Estimate the recombination rate $r_{AB}$
• Are these two loci linked?
[Figure: backcross diagram with alleles 1 and 2 at loci A and B, giving four possible genotypes; the data table lists, for each individual 1, ..., n, its genotype at A and at B]
• Suppose n = 80 and y = 16 (recombinants)
• Point estimate of $r_{AB}$: $\hat{r}_{AB} = y/n = 16/80 = 0.20$
• Confidence interval (95%) of $r_{AB}$: $CI(r_{AB}; 95\%) = [0.1189; 0.3044]$
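The point estimate and an interval for the numbers above can be sketched as follows. The slides do not say which interval method was used, so the exact (Clopper-Pearson) binomial interval computed here is an assumption; its output can be compared with the interval quoted above. The helper names `binom_cdf`, `bisect` and `exact_ci` are illustrative only.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def bisect(f, lo, hi, tol=1e-10):
    """Root of a monotone function f on [lo, hi] by bisection."""
    flo = f(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (f(mid) > 0) == (flo > 0):
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_ci(y, n, alpha=0.05):
    """Clopper-Pearson interval for a binomial proportion (assumes 0 < y < n)."""
    # Lower limit solves P(X >= y | p) = alpha / 2
    lower = bisect(lambda p: (1 - binom_cdf(y - 1, n, p)) - alpha / 2, 0.0, 1.0)
    # Upper limit solves P(X <= y | p) = alpha / 2
    upper = bisect(lambda p: binom_cdf(y, n, p) - alpha / 2, 0.0, 1.0)
    return lower, upper

n, y = 80, 16                  # sample size and recombinants from the example
r_hat = y / n                  # point estimate of r_AB
lower, upper = exact_ci(y, n)
print(f"r_hat = {r_hat:.4f}, 95% CI = [{lower:.4f}; {upper:.4f}]")
```

Both limits are found by bisection on the binomial tail probabilities, so the sketch needs only the standard library.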
Recombination Rate and Linkage Map
Estimates of recombination rates between pairs of markers are used to order markers and to infer their genetic distances (centimorgans; cM)
Map Functions
Map functions provide a transformation from map
distance to recombination rate. Two approaches have
been used to derive map functions:
In the first case, a probability model is assumed for the
number of crossovers in an interval of length x. Then,
recombination rate is calculated as the probability of an
odd number of crossovers in the interval
In the second approach, recombination events in two
adjacent intervals are modeled, allowing for interference
Examples of map functions: Haldane, Binomial, Kosambi
Haldane Map Function
Haldane (1919) suggested that the number of crossovers in any chromosomal interval follows a Poisson distribution, with no interference
If $P_k$ is the probability of $k$ crossovers, then the probability of recombination ($r$) is $r = P_1 + P_3 + P_5 + \dots$
This leads to Haldane's map function, with the map distance $x$ in Morgans:
$$r = \tfrac{1}{2}\left(1 - e^{-2x}\right)$$
The inverse of which is:
$$x = \begin{cases} -\tfrac{1}{2}\ln(1 - 2r), & \text{if } 0 \le r < 0.5 \\ \infty, & \text{if } r = 0.5 \end{cases}$$
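A minimal sketch of the map function and its inverse, with a numerical check that the sum of the odd Poisson terms agrees with the closed form. The function names are illustrative, and the Poisson mean is taken to equal the map distance x in Morgans.

```python
from math import exp, log, inf, factorial

def haldane_r(x):
    """Recombination rate for map distance x (in Morgans)."""
    return 0.5 * (1.0 - exp(-2.0 * x))

def haldane_x(r):
    """Map distance (in Morgans) for recombination rate r, 0 <= r <= 0.5."""
    if not 0.0 <= r <= 0.5:
        raise ValueError("r must lie in [0, 0.5]")
    return inf if r == 0.5 else -0.5 * log(1.0 - 2.0 * r)

def odd_crossover_prob(x, max_k=60):
    """P(odd number of crossovers) when the count is Poisson with mean x."""
    return sum(exp(-x) * x**k / factorial(k) for k in range(1, max_k, 2))

# Compare the closed form with the truncated sum P_1 + P_3 + P_5 + ...
for cm in (1, 10, 20, 50, 100):
    x = cm / 100.0             # convert cM to Morgans
    print(cm, round(haldane_r(x), 4), round(odd_crossover_prob(x), 4))
```

For short distances r is close to x, and r approaches 0.5 as the distance grows.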
Multipoint Linkage Analysis
• Instead of two loci, suppose there are M loci
• If the order is unknown: M!/2 alternative orders
Goal: determine the order of the loci and estimate the recombination fractions between neighboring loci, i.e. "Map Construction"
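To illustrate why the number of candidate orders matters, the sketch below counts them and runs an exhaustive search on a tiny example. The ordering criterion (smallest sum of adjacent recombination fractions) is one common choice and is an assumption here, as the slides do not name a criterion; the pairwise estimates are hypothetical.

```python
from itertools import permutations
from math import factorial

def n_orders(m):
    """Number of distinct locus orders when direction does not matter."""
    return factorial(m) // 2 if m > 1 else 1

def best_order(loci, r):
    """Exhaustive search for the order minimising the sum of adjacent
    recombination fractions; r maps frozenset({a, b}) to an estimate."""
    best, best_score = None, float("inf")
    for perm in permutations(loci):
        if perm[0] > perm[-1]:          # skip mirror images of the same order
            continue
        score = sum(r[frozenset(pair)] for pair in zip(perm, perm[1:]))
        if score < best_score:
            best, best_score = perm, score
    return best, best_score

# Hypothetical pairwise estimates for four loci (not taken from the slides)
r = {frozenset(k): v for k, v in {
    ("A", "B"): 0.10, ("B", "C"): 0.15, ("C", "D"): 0.08,
    ("A", "C"): 0.22, ("B", "D"): 0.21, ("A", "D"): 0.27}.items()}

order, score = best_order("ABCD", r)
print(n_orders(4), order, round(score, 2))
print(n_orders(10))   # exhaustive search quickly becomes impractical
```

With 10 loci there are already more than 1.8 million orders, which is why practical map construction relies on heuristic searches rather than full enumeration.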
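QTL Mapping
[Figure: backcross (BC) population derived from purebred lines; individual phenotypes (e.g. 65, 57, 68, 55, 61, 59) are grouped by marker genotype and the group means are compared on the phenotype axis (55 to 70). Two panels show two different markers, each giving a different grouping of the same phenotypes]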
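Single Marker Analysis
• Simple example with candidate gene and BC population
[Figure: cross Q1Q1 × Q2Q2, followed by the backcross Q1Q2 × Q1Q1, giving Q1Q2 and Q1Q1 progeny; phenotype distributions of the two genotypes, with means $\mu_1$ (Q1Q1) and $\mu_2$ (Q1Q2) separated by $\delta$]

Genotype   Obs.   Mean   STD
Q1Q1       n_1    m_1    s_1
Q1Q2       n_2    m_2    s_2

• $H_0: \delta = 0$ vs $H_1: \delta \neq 0$
$$t = \frac{m_2 - m_1}{s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \sim t_{(n_1 + n_2 - 2)}$$
$$s^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$
$$CI[\delta; (1 - \alpha)]: \; (m_2 - m_1) \pm t_{(n_1 + n_2 - 2;\, \alpha/2)}\, s\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$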
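The test statistic and interval above can be sketched with the standard library alone. The phenotype values are hypothetical, and the critical value is passed in by hand because the standard library has no t distribution; 2.306 is the two-sided 5% critical value for 8 degrees of freedom.

```python
from math import sqrt
from statistics import mean, variance

def single_marker_t(y1, y2):
    """Pooled two-sample t statistic for delta = mu2 - mu1.

    Returns (m2 - m1, t, standard error, degrees of freedom)."""
    n1, n2 = len(y1), len(y2)
    m1, m2 = mean(y1), mean(y2)
    # Pooled variance s^2 from the two sample variances
    s2 = ((n1 - 1) * variance(y1) + (n2 - 1) * variance(y2)) / (n1 + n2 - 2)
    se = sqrt(s2 * (1 / n1 + 1 / n2))
    return m2 - m1, (m2 - m1) / se, se, n1 + n2 - 2

def delta_ci(diff, se, t_crit):
    """Confidence interval (m2 - m1) +/- t_crit * se."""
    return diff - t_crit * se, diff + t_crit * se

# Hypothetical phenotypes for the two backcross genotypes
q1q1 = [57.0, 55.0, 59.0, 56.0, 58.0]
q1q2 = [65.0, 68.0, 61.0, 66.0, 65.0]

diff, t, se, df = single_marker_t(q1q1, q1q2)
ci_low, ci_high = delta_ci(diff, se, t_crit=2.306)  # t(8; 0.025)
print(f"diff = {diff:.2f}, t = {t:.3f} on {df} df, CI = [{ci_low:.2f}; {ci_high:.2f}]")
```

$H_0$ is rejected at the 5% level when |t| exceeds the critical value, which is equivalent to the interval excluding zero.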
• QTL and marker (M); recombination frequency = r
[Figure: backcross M1M1 Q1Q1 × M1M2 Q1Q2, giving the progeny genotypes M1M1 Q1Q1, M1M2 Q1Q2, M1M2 Q1Q1 and M1M1 Q1Q2]

Genotype       Freq.      E[y]     Marker group   Freq.   E[y]
M1M1 Q1Q1      (1 - r)/2  mu_1     M1M1           1/2     (1 - r) mu_1 + r mu_2
M1M1 Q1Q2      r/2        mu_2
M1M2 Q1Q1      r/2        mu_1     M1M2           1/2     r mu_1 + (1 - r) mu_2
M1M2 Q1Q2      (1 - r)/2  mu_2

Difference between marker group expected values:
$$r\mu_1 + (1 - r)\mu_2 - (1 - r)\mu_1 - r\mu_2 = (1 - 2r)(\mu_2 - \mu_1) = (1 - 2r)\,\delta$$
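The marker contrast can be checked numerically from the joint genotype frequencies. This is a small sketch with hypothetical values for the two genotype means; the function name is illustrative.

```python
def marker_group_means(mu1, mu2, r):
    """Expected phenotype in each marker class of the backcross."""
    # (marker genotype, QTL genotype): (frequency, expected phenotype)
    joint = {
        ("M1M1", "Q1Q1"): ((1 - r) / 2, mu1),
        ("M1M1", "Q1Q2"): (r / 2, mu2),
        ("M1M2", "Q1Q1"): (r / 2, mu1),
        ("M1M2", "Q1Q2"): ((1 - r) / 2, mu2),
    }
    means = {}
    for marker in ("M1M1", "M1M2"):
        cells = [v for (m, _), v in joint.items() if m == marker]
        freq = sum(f for f, _ in cells)               # marginal frequency, 1/2
        means[marker] = sum(f * e for f, e in cells) / freq
    return means

mu1, mu2 = 57.0, 65.0          # hypothetical genotype means, delta = 8
for r in (0.0, 0.1, 0.3, 0.5):
    g = marker_group_means(mu1, mu2, r)
    contrast = g["M1M2"] - g["M1M1"]
    print(r, round(contrast, 3), round((1 - 2 * r) * (mu2 - mu1), 3))
```

The contrast equals δ when the marker sits on the QTL (r = 0) and vanishes for an unlinked marker (r = 0.5), so a single-marker test confounds QTL effect and QTL position.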
Single Marker Analysis
• Brassica napus; flowering time
• 10 markers (positions: 0, 8.8, 20.6, 27.4, 34.2, 42.9, 53.6, 64.1, 69.2, 83.9 cM)
• 104 individuals; doubled haploid
Data excerpt (one row per individual: phenotype followed by the marker genotypes):
3.0204 -1 -1 -1 -1 -1 -1 -1 -1 -99 -
2.9704 -1 -1 -1 -1 -99 -1 -1 -1 -1 1
2.7408 -1 -1 1 1 1 1 1 1 1 1
...
3.3673 1 1 1 1 -1 -1 -1 -1 -1 1
3.0681 1 1 1 1 -99 1 1 1 -1 -
3.2771 -1 -99 -1 -1 -1 -1 -1 -1 -1 -
(Satagopan et al. Genetics 144: 805-816, 1996)
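Interval Mapping
(Lander & Botstein, 1989)
[Figure: putative QTL located between the flanking markers M and N, with recombination rates $r_1$ (M to QTL), $r_2$ (QTL to N) and $r$ (M to N)]
[Figure: backcross of Mm Qq Nn to mm qq nn; progeny marker classes Mm Nn, mm nn, Mm nn and mm Nn]
Model:
$$y_i = \mu + \delta q_i + \varepsilon_i$$
where $y_i$ is the phenotype, $q_i$ the QTL genotype and $\varepsilon_i$ the residual of individual $i$, with
$$q_i = \begin{cases} 0, & \text{if qq} \\ 1, & \text{if Qq} \end{cases}$$
so that $\delta$ is the difference between the Qq and qq genotype means
• Likelihood estimation: EM algorithm to estimate the parameters, including $\lambda$ (position of the QTL)
• Alternatively: fix $\lambda$ (grid search) and evaluate the LOD score
$$LOD(\lambda) = \log_{10}\left[\frac{L(\hat{\mu}, \hat{\delta}, \hat{\sigma}^2 \mid y, q_\lambda)}{L(\hat{\mu}, \hat{\sigma}^2 \mid y, \delta = 0)}\right]$$
• A QTL is detected whenever the LOD score exceeds a threshold; the estimated position of the QTL maximizes the LOD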
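Interval Mapping
REGRESSION APPROACH
(Haley & Knott, 1992)
$$y = X\beta + \varepsilon$$
$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} p_{11} & p_{12} \\ p_{21} & p_{22} \\ \vdots & \vdots \\ p_{N1} & p_{N2} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_N \end{bmatrix}$$
where $p_{i1}$ and $p_{i2}$ are the probabilities of the two QTL genotypes for individual $i$, given its flanking marker genotypes; alternatively
$$\begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix} = \begin{bmatrix} 1 & p_{12} \\ 1 & p_{22} \\ \vdots & \vdots \\ 1 & p_{N2} \end{bmatrix} \begin{bmatrix} \mu \\ \delta \end{bmatrix} + \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_N \end{bmatrix}$$
$$\hat{\beta} = (X'X)^{-1}X'y$$
Residual Sum of Squares:
$$RSS = y'y - \hat{\beta}'X'y$$
Estimated position of the QTL minimizes RSS.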
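A rough sketch of the regression approach for a single marker interval in a backcross, using the second parameterization (intercept plus $\delta$ times the probability of Qq). The data are simulated with hypothetical parameter values, the conditional probabilities assume no interference (Haldane), and the conversion of RSS to an approximate LOD is an added convenience, not something stated on the slides.

```python
import random
from math import exp, log10

def haldane_r(x_cm):
    """Recombination rate for a distance in cM (Haldane, no interference)."""
    return 0.5 * (1.0 - exp(-2.0 * x_cm / 100.0))

def prob_Qq(m, n, r1, r2):
    """P(Qq | flanking markers); m, n are 1 for Mm / Nn and 0 for mm / nn."""
    pm = (1 - r1) if m else r1          # P(marker M genotype | Qq)
    pn = (1 - r2) if n else r2          # P(marker N genotype | Qq)
    num = pm * pn
    return num / (num + (1 - pm) * (1 - pn))

def fit_rss(p, y):
    """Least squares of y on (1, p); returns (mu_hat, delta_hat, RSS)."""
    k = len(y)
    pbar, ybar = sum(p) / k, sum(y) / k
    sxx = sum((pi - pbar) ** 2 for pi in p)
    sxy = sum((pi - pbar) * (yi - ybar) for pi, yi in zip(p, y))
    delta = sxy / sxx
    mu = ybar - delta * pbar
    rss = sum((yi - mu - delta * pi) ** 2 for pi, yi in zip(p, y))
    return mu, delta, rss

def scan(markers, y, length_cm, step=1.0):
    """Grid search over QTL positions inside the interval; minimises RSS."""
    results, pos = [], step
    while pos < length_cm:
        r1, r2 = haldane_r(pos), haldane_r(length_cm - pos)
        p = [prob_Qq(m, n, r1, r2) for m, n in markers]
        results.append((pos, *fit_rss(p, y)))
        pos += step
    return min(results, key=lambda t: t[3])

def simulate(n_ind, qtl_cm, length_cm, mu, delta, sigma):
    """Hypothetical backcross data: flanking marker genotypes and phenotypes."""
    r1, r2 = haldane_r(qtl_cm), haldane_r(length_cm - qtl_cm)
    markers, y = [], []
    for _ in range(n_ind):
        q = random.random() < 0.5
        m = q != (random.random() < r1)     # recombination flips the genotype
        n = q != (random.random() < r2)
        markers.append((int(m), int(n)))
        y.append(mu + delta * q + random.gauss(0.0, sigma))
    return markers, y

random.seed(1)
markers, y = simulate(300, qtl_cm=8.0, length_cm=20.0, mu=10.0, delta=2.0, sigma=1.0)
pos, mu_hat, delta_hat, rss = scan(markers, y, length_cm=20.0)
ybar = sum(y) / len(y)
rss0 = sum((yi - ybar) ** 2 for yi in y)            # model with delta = 0
lod = (len(y) / 2) * log10(rss0 / rss)              # approximate LOD
print(f"position = {pos:.0f} cM, delta_hat = {delta_hat:.2f}, LOD = {lod:.1f}")
```

Within one interval the position is only weakly determined, so the grid search is mainly useful when repeated across all intervals of a chromosome.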
Interval Mapping
[Figure: test statistic (evidence for QTL) plotted along the chromosome against the marker positions (cM) of markers M1 to M6]
QTL Mapping
• COMMENTS:
Backcross to both parental lines, or use an F2 design, to estimate additive and dominance effects
Threshold; multiple testing; false positives
Confidence intervals
Multiple QTL, ghost QTL
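Composite Interval Mapping
(Zeng, 1993, 1994)
$$y_i = \beta_0 + \beta_j x_{ij} + \sum_{k \neq j,\, j+1} \beta_k w_{ik} + \varepsilon_i$$
• $\beta_0$: intercept
• $\beta_j$: genetic effect of the putative QTL (between markers j and j+1)
• $w_{ik}$: dummy variables for the other markers
$$y = X\beta + \varepsilon, \qquad X = \begin{bmatrix} 1 & x_{1j} & w_{11} & \cdots & w_{1p} \\ 1 & x_{2j} & w_{21} & \cdots & w_{2p} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & x_{Nj} & w_{N1} & \cdots & w_{Np} \end{bmatrix}$$
$$\hat{\beta} = (X'X)^{-1}X'y$$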
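The design matrix and the least-squares estimate above can be sketched without external libraries by solving the normal equations directly. The data below are hypothetical and noise-free, chosen only so that the recovered coefficients are easy to check; `solve`, `ols` and `cim_design` are illustrative names, and treating $x_{ij}$ as a conditional QTL-genotype probability is an assumption.

```python
def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[piv] = aug[piv], aug[c]
        for i in range(n):
            if i != c:
                f = aug[i][c] / aug[c][c]
                aug[i] = [u - f * v for u, v in zip(aug[i], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def ols(X, y):
    """beta_hat = (X'X)^{-1} X'y and the residual sum of squares."""
    k = len(X[0])
    xtx = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    xty = [sum(row[a] * yi for row, yi in zip(X, y)) for a in range(k)]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(bj * xj for bj, xj in zip(beta, row))) ** 2
              for row, yi in zip(X, y))
    return beta, rss

def cim_design(x_j, cofactors):
    """Rows (1, x_ij, w_i1, ..., w_ip): intercept, putative-QTL regressor
    for the tested interval, and dummy variables for the other markers."""
    return [[1.0, xj] + [float(w) for w in ws] for xj, ws in zip(x_j, cofactors)]

# Hypothetical data: six individuals, two cofactor markers, no noise
x_j = [0.9, 0.1, 0.8, 0.2, 0.5, 0.1]
cofactors = [(1, 0), (0, 0), (1, 1), (0, 1), (1, 1), (0, 0)]
y = [10.0 + 2.0 * x + 1.0 * w1 - 0.5 * w2 for x, (w1, w2) in zip(x_j, cofactors)]

X = cim_design(x_j, cofactors)
beta, rss = ols(X, y)
print([round(b, 6) for b in beta], round(rss, 12))
```

In a genome scan this fit would be repeated for each tested position, with the cofactor columns absorbing the effects of QTL elsewhere on the map.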
• Brassica napus; flowering time (Satagopan et al., 1996)
[Figures: LRT plotted against position (cM, 0 to 90), one panel for interval mapping and one for composite interval mapping]
QTL Database (Livestock)