




























































































Covers power control algorithms for wireless networks, state equations for a linear mechanical system, and standard time-series models. Includes a Matlab simulation exercise and the equations of motion for a lumped mechanical system. Relevant to courses in electrical engineering, signal processing, and control systems.
EE263 Autumn 2007-08 Prof. S. Boyd
2.1 A simple power control algorithm for a wireless network. First some background. We consider a network of $n$ transmitter/receiver pairs. Transmitter $i$ transmits at power level $p_i$ (which is positive). The path gain from transmitter $j$ to receiver $i$ is $G_{ij}$ (which are all nonnegative, and $G_{ii}$ are positive). The signal power at receiver $i$ is given by $s_i = G_{ii} p_i$. The noise plus interference power at receiver $i$ is given by
$$q_i = \sigma + \sum_{j \neq i} G_{ij} p_j$$
where $\sigma > 0$ is the self-noise power of the receivers (assumed to be the same for all receivers). The signal to interference plus noise ratio (SINR) at receiver $i$ is defined as $S_i = s_i/q_i$. For signal reception to occur, the SINR must exceed some threshold value $\gamma$ (which is often in the range 3 to 10). Various power control algorithms are used to adjust the powers $p_i$ to ensure that $S_i \geq \gamma$ (so that each receiver can receive the signal transmitted by its associated transmitter).
In this problem, we consider a simple power control update algorithm. The powers are all updated synchronously at a fixed time interval, denoted by $t = 0, 1, 2, \ldots$. Thus the quantities $p$, $q$, and $S$ are discrete-time signals, so for example $p_3(5)$ denotes the transmit power of transmitter 3 at time epoch $t = 5$. What we’d like is $S_i(t) = s_i(t)/q_i(t) = \alpha\gamma$, where $\alpha > 1$ is an SINR safety margin (of, for example, one or two dB). Note that increasing $p_i(t)$ (power of the $i$th transmitter) increases $S_i$ but decreases all other $S_j$. A very simple power update algorithm is given by
$$p_i(t+1) = p_i(t)\,\bigl(\alpha\gamma / S_i(t)\bigr). \qquad (1)$$
This scales the power at the next time step to be the power that would achieve $S_i = \alpha\gamma$, if the interference plus noise term were to stay the same. But unfortunately, changing the transmit powers also changes the interference powers, so it’s not that simple! Finally, we get to the problem.
(a) Show that the power control algorithm (1) can be expressed as a linear dynamical system with constant input, i.e., in the form $p(t+1) = Ap(t) + b$, where $A \in \mathbf{R}^{n \times n}$ and $b \in \mathbf{R}^n$ are constant. Describe $A$ and $b$ explicitly in terms of $\sigma$, $\gamma$, $\alpha$ and the components of $G$.
(b) Matlab simulation. Use Matlab to simulate the power control algorithm (1), starting from various initial (positive) power levels. Use the problem data $G$ (the matrix is not reproduced in this copy), $\gamma = 3$, $\alpha = 1.2$, $\sigma = 0.01$.
Plot Si and p as a function of t, and compare it to the target value αγ. Repeat for γ = 5. Comment briefly on what you observe. Comment: You’ll soon understand what you see.
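The update (1) can be simulated directly from the definitions of $s_i$, $q_i$, and $S_i$. The following is a minimal sketch in Python rather than Matlab; the gain matrix `G_demo`, the function names, and the initial powers are assumptions chosen only for illustration and are not the problem data.

```python
def sinr(G, p, sigma):
    """Return S_i = G_ii p_i / (sigma + sum_{j != i} G_ij p_j) for every receiver."""
    n = len(p)
    out = []
    for i in range(n):
        q_i = sigma + sum(G[i][j] * p[j] for j in range(n) if j != i)
        out.append(G[i][i] * p[i] / q_i)
    return out


def power_update(G, p, sigma, alpha, gamma):
    """One step of update (1): p_i(t+1) = p_i(t) * alpha * gamma / S_i(t)."""
    S = sinr(G, p, sigma)
    return [p_i * alpha * gamma / S_i for p_i, S_i in zip(p, S)]


# Hypothetical gains for illustration only (NOT the problem data).
G_demo = [[1.0, 0.1, 0.1],
          [0.1, 1.0, 0.1],
          [0.1, 0.1, 1.0]]
sigma, alpha, gamma = 0.01, 1.2, 3.0

p = [0.1, 0.1, 0.1]          # assumed initial (positive) power levels
for t in range(200):
    p = power_update(G_demo, p, sigma, alpha, gamma)

S_final = sinr(G_demo, p, sigma)
```

For these assumed gains the SINRs settle at the target $\alpha\gamma$; whether that happens for other gains or thresholds is exactly what the exercise asks you to observe.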
2.2 State equations for a linear mechanical system. The equations of motion of a lumped mechanical system undergoing small motions can be expressed as
$$M\ddot{q} + D\dot{q} + Kq = f$$
where $q(t) \in \mathbf{R}^k$ is the vector of deflections, $M$, $D$, and $K$ are the mass, damping, and stiffness matrices, respectively, and $f(t) \in \mathbf{R}^k$ is the vector of externally applied forces. Assuming $M$ is invertible, write linear system equations for the mechanical system, with state $x = [q^T \ \dot{q}^T]^T$, input $u = f$, and output $y = q$.
2.3 Some standard time-series models. A time series is just a discrete-time signal, i.e., a function from $\mathbf{Z}_+$ into $\mathbf{R}$. We think of $u(k)$ as the value of the signal or quantity $u$ at time (or epoch) $k$. The study of time series predates the extensive study of state-space linear systems, and is used in many fields (e.g., econometrics). Let $u$ and $y$ be two time series (input and output, respectively). The relation (or time series model)
$$y(k) = a_0 u(k) + a_1 u(k-1) + \cdots + a_r u(k-r)$$
is called a moving average (MA) model, since the output at time $k$ is a weighted average of the previous $r$ inputs, and the set of variables over which we average ‘slides along’ with time. Another model is given by
$$y(k) = u(k) + b_1 y(k-1) + \cdots + b_p y(k-p).$$
This model is called an autoregressive (AR) model, since the current output is a linear combination of (i.e., regression on) the current input and some previous values of the output. Another widely used model is the autoregressive moving average (ARMA) model, which combines the MA and AR models:
$$y(k) = b_1 y(k-1) + \cdots + b_p y(k-p) + a_0 u(k) + \cdots + a_r u(k-r).$$
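To make the MA and AR recursions concrete, here is a minimal Python sketch that evaluates them directly from their defining equations. The function names, the coefficient values, and the convention that all signals are zero before time 0 are assumptions made for illustration; the sketch does not build the state-space form the exercise asks for.

```python
def moving_average(a, u):
    """y(k) = a[0]*u(k) + a[1]*u(k-1) + ... + a[r]*u(k-r), taking u(k) = 0 for k < 0."""
    r = len(a) - 1
    return [sum(a[i] * u[k - i] for i in range(r + 1) if k - i >= 0)
            for k in range(len(u))]


def autoregressive(b, u):
    """y(k) = u(k) + b[0]*y(k-1) + ... + b[p-1]*y(k-p), taking y(k) = 0 for k < 0."""
    y = []
    for k in range(len(u)):
        past = sum(b[i] * y[k - 1 - i] for i in range(len(b)) if k - 1 - i >= 0)
        y.append(u[k] + past)
    return y


u = [1.0, 0.0, 0.0, 0.0, 0.0]                 # unit impulse input
y_ma = moving_average([0.5, 0.3, 0.2], u)     # MA response equals the coefficients
y_ar = autoregressive([0.5], u)               # AR response decays geometrically
```

With an impulse input, the MA output simply reads out the coefficients $a_0, \ldots, a_r$, while the AR output keeps feeding on its own past values.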
Finally, the problem: Express each of these models as a linear dynamical system with input $u$ and output $y$. For the MA model, use state
$$x(k) = \begin{bmatrix} u(k-1) \\ \vdots \\ u(k-r) \end{bmatrix},$$
and for the AR model, use state
$$x(k) = \begin{bmatrix} y(k-1) \\ \vdots \\ y(k-p) \end{bmatrix}.$$
You decide on an appropriate state vector for the ARMA model. (There are many possible choices for the state here, even with different dimensions. We recommend you choose a state
2.7 Consider the (discrete-time) linear dynamical system
x(t + 1) = A(t)x(t) + B(t)u(t), y(t) = C(t)x(t) + D(t)u(t).
Find a matrix $G$ such that
$$\begin{bmatrix} y(0) \\ y(1) \\ \vdots \\ y(N) \end{bmatrix} = G \begin{bmatrix} x(0) \\ u(0) \\ \vdots \\ u(N) \end{bmatrix}.$$
The matrix $G$ shows how the output at $t = 0, \ldots, N$ depends on the initial state $x(0)$ and the sequence of inputs $u(0), \ldots, u(N)$.
2.8 Some sparsity patterns.
(a) A matrix $A \in \mathbf{R}^{n \times n}$ is tridiagonal if $A_{ij} = 0$ for $|i - j| > 1$. Draw a block diagram of $y = Ax$ for $A$ tridiagonal.
(b) Consider a certain linear mapping $y = Ax$ with $A \in \mathbf{R}^{m \times n}$. For $i$ odd, $y_i$ depends only on $x_j$ for $j$ even. Similarly, for $i$ even, $y_i$ depends only on $x_j$ for $j$ odd. Describe the sparsity structure of $A$. Give the structure a reasonable, suggestive name.
2.9 Matrices and signal flow graphs.
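(a) Find $A \in \mathbf{R}^{2 \times 2}$ such that $y = Ax$ in the system below:
[Figure: signal flow graph with inputs $x_1$, $x_2$, outputs $y_1$, $y_2$, and a branch gain of 2.]
(b) Find $B \in \mathbf{R}^{2 \times 2}$ such that $z = Bx$ in the system below:
[Figure: signal flow graph with inputs $x_1$, $x_2$, outputs $z_1$, $z_2$, and branch gains of .5 and 2.]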
Do this two ways: first, by expressing the matrix B in terms of A from the previous part (explaining why they are related as you claim); and second, by directly evaluating all possible paths from each xj to each zi.
2.10 Mass/force example. Find the matrix A for the mass/force example in the lecture notes. For n = 4, find a specific input force sequence x that moves the mass to final position 1 and final velocity zero.
2.11 Optimal force design. We consider the mass/force example in the lecture notes, and in exercise 10, with $n = 4$, and the requirement that the final position is 1 and final velocity is 0. Roughly speaking, you have four variables and two equations, and therefore two extra degrees of freedom. In this problem you use the extra degrees of freedom to achieve other objectives, i.e., minimize some cost functions that are described below.
(a) Find $f$ that meets the specifications and minimizes the sum of squares of the forces, i.e., $f_1^2 + f_2^2 + f_3^2 + f_4^2$.
(b) Find $f$ that meets the specifications and minimizes the maximum force applied, i.e., $\max\{|f_1|, |f_2|, |f_3|, |f_4|\}$.
(c) Find $f$ that meets the specifications and minimizes the sum of absolute values of the forces applied, i.e., $|f_1| + |f_2| + |f_3| + |f_4|$. (This would correspond to minimum fuel usage if the force comes from a thruster.)
There might be more than one minimizer; we just want one. If you can solve these optimization problems exactly, that’s fine; but you are free to solve the problems numerically (and even approximately). We don’t need an analysis verifying that your solution is indeed the best one. Make sure your solutions make sense to you. Hints:
2.12 Undirected graph. Consider an undirected graph with $n$ nodes, and no self loops (i.e., all branches connect two different nodes). Let $A \in \mathbf{R}^{n \times n}$ be the node adjacency matrix, defined as
$$A_{ij} = \begin{cases} 1 & \text{if there is a branch from node } i \text{ to node } j \\ 0 & \text{if there is no branch from node } i \text{ to node } j \end{cases}$$
Note that $A = A^T$, and $A_{ii} = 0$ since there are no self loops. We can interpret $A_{ij}$ (which is either zero or one) as the number of branches that connect node $i$ to node $j$. Let $B = A^k$, where $k \in \mathbf{Z}$, $k \geq 1$. Give a simple interpretation of $B_{ij}$ in terms of the original graph. (You might need to use the concept of a path of length $m$ from node $p$ to node $q$.)
2.13 Counting sequences in a language or code. We consider a language or code with an alphabet of $n$ symbols $1, 2, \ldots, n$. A sentence is a finite sequence of symbols, $k_1, \ldots, k_m$ where $k_i \in \{1, \ldots, n\}$. A language or code consists of a set of sequences, which we will call the allowable sequences. A language is called Markov if the allowed sequences can be described by giving the allowable transitions between consecutive symbols. For each symbol we give a set of
active, etc. This cycle repeats indefinitely: when t = mK + k, where m is an integer, and k ∈ { 1 ,... , K}, transmissions can occur only over edges assigned to time-slot k. Although it doesn’t matter for the problem, we mention some reasons why the possible transmissions are assigned to time-slots. Two possible transmissions are assigned to different time-slots if they would interfere with each other, or if they would violate some limit (such as on the total power available at a node) if the transmissions occurred simultaneously. A message or packet can be sent from one node to another by a sequence of transmissions from node to node. At time period t, the message can be sent across any edge that is active at period t. It is also possible to store a message at a node during any time period, presumably for transmission during a later period. If a message is sent from node j to node i in period t, then in period t + 1 the message is at node i, and can be stored there, or transmitted across any edge emanating from node i and active at time period t + 1. To make sure the terminology is clear, we consider the very simple example shown below, with n = 4 nodes, and K = 3 time-slots.
[Figure: example network with $n = 4$ nodes; each edge is labeled with its time-slot, $k = 1$, $k = 2$, or $k = 3$.]
In this example, we can send a message that starts in node 1 to node 3 as follows:
You can check that at each period, the transmission used is active, i.e., assigned to the associated time-slot. The sequence of transmissions (and storing) described above gets the message from node 1 to node 3 in 5 periods. Finally, the problem. We consider a specific network with $n = 20$ nodes, and $K = 3$ time-slots, with edges and time-slot assignments given in ts_data.m. The labeled graph that specifies the possible transmissions and the associated time-slot assignments are given in a matrix $A \in \mathbf{R}^{n \times n}$, as follows:
$$A_{ij} = \begin{cases} k & \text{if transmission from node } j \text{ to node } i \text{ is allowed, and assigned to time-slot } k \\ 0 & \text{if transmission from node } j \text{ to node } i \text{ is never allowed} \\ 0 & i = j. \end{cases}$$
Note that we set Aii = 0 for convenience. This choice has no significance; you can store a message at any node in any period. To illustrate this encoding of the graph, consider the simple example described above. For this example, we have
Aexample =
.
Very important: the problems below concern the network described in the mfile ts_data.m, and not the simple example given above.
(a) Minimum-time point-to-point routing. Find the fastest way to get a message that starts at node 5, to node 18. Give your solution as a prescription ordered in time from $t = 1$ to $t = T$ (the last transmission), as in the example above. At each time period, give the transmission (as in ‘transmit from node 7 to node 9’) or state that the message is to be stored (as in ‘store at node 13’). Be sure that transmissions only occur during the associated time-slots. You only need to give one prescription for getting the message from node 5 to node 18 in minimum time.
(b) Minimum time flooding. In this part of the problem, we assume that once the message reaches a node, a copy is kept there, even when the message is transmitted to another node. Thus, the message is available at the node to be transmitted along any active edge emanating from that node, at any future period. Moreover, we allow multi-cast: if during a time period there are multiple active edges emanating from a node that has (a copy of) the message, then transmission can occur during that time period across all (or any subset) of the active edges. In this part of the problem, we are interested in getting a message that starts at a particular node, to all others, and we attach no cost to storage or transmission, so there is no harm in assuming that at each time period, every node that has the message forwards it to all nodes it is able to transmit to. What is the minimum time it takes before all nodes have a message that starts at node 7?
For both parts of the problem, you must give the specific solution, as well as a description of your approach and method.
2.16 Solving triangular linear equations. Consider the linear equations $y = Rx$, where $R \in \mathbf{R}^{n \times n}$ is upper triangular and invertible. Suggest a simple algorithm to solve for $x$ given $R$ and $y$. Hint: first find $x_n$; then find $x_{n-1}$ (remembering that now you know $x_n$); then find $x_{n-2}$ (remembering that now you know $x_n$ and $x_{n-1}$); etc. Remark: the algorithm you will discover is called back substitution. It requires order $n^2$ floating point operations (flops); most methods for solving $y = Ax$ for general $A \in \mathbf{R}^{n \times n}$ require order $n^3$ flops.
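The hint can be followed almost literally in code. This is a minimal Python sketch of back substitution under the stated assumptions (square, upper triangular, invertible $R$, stored as a list of rows); the function name and the test data are illustrative choices, not part of the exercise.

```python
def back_substitution(R, y):
    """Solve R x = y for x, where R is upper triangular with nonzero diagonal."""
    n = len(y)
    x = [0.0] * n
    # Work upward from the last equation: x_n first, then x_{n-1}, and so on.
    for i in range(n - 1, -1, -1):
        known = sum(R[i][j] * x[j] for j in range(i + 1, n))  # terms already solved
        x[i] = (y[i] - known) / R[i][i]
    return x


R = [[2.0, 1.0, -1.0],
     [0.0, 3.0, 2.0],
     [0.0, 0.0, 4.0]]
y = [1.0, 12.0, 12.0]
x = back_substitution(R, y)
```

Row $i$ touches only the $n - i$ entries to the right of the diagonal, which is where the order $n^2$ flop count comes from.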
2.17 Gradient of some common functions. Recall that the gradient of a differentiable function $f : \mathbf{R}^n \to \mathbf{R}$, at a point $x \in \mathbf{R}^n$, is defined as the vector
$$\nabla f(x) = \begin{bmatrix} \dfrac{\partial f}{\partial x_1} \\ \vdots \\ \dfrac{\partial f}{\partial x_n} \end{bmatrix},$$
where the partial derivatives are evaluated at the point $x$. The first order Taylor approximation of $f$, near $x$, is given by
$$\hat{f}_{\mathrm{tay}}(z) = f(x) + \nabla f(x)^T (z - x).$$
This function is affine, i.e., a linear function plus a constant. For $z$ near $x$, the Taylor approximation $\hat{f}_{\mathrm{tay}}$ is very near $f$. Find the gradient of the following functions. Express the gradients using matrix notation.
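The claim that the Taylor approximation is close to $f$ near $x$ can be checked numerically. The sketch below is an illustration only: the test function $f(x) = x_1^2 x_2$, the central-difference step size, and all function names are assumptions, and the function is not one of those in the exercise.

```python
def f(x):
    """Hypothetical test function f(x) = x_1^2 * x_2 (analytic gradient: (2 x_1 x_2, x_1^2))."""
    return x[0] ** 2 * x[1]


def numerical_gradient(func, x, h=1e-6):
    """Approximate each partial derivative by a central difference."""
    grad = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        grad.append((func(xp) - func(xm)) / (2 * h))
    return grad


def taylor(func, grad, x, z):
    """First order Taylor approximation f(x) + grad^T (z - x)."""
    return func(x) + sum(g * (zi - xi) for g, zi, xi in zip(grad, z, x))


x = [1.0, 2.0]
g = numerical_gradient(f, x)          # close to the analytic gradient (4, 1)
z = [1.01, 2.01]                      # a point near x
err = abs(f(z) - taylor(f, g, x, z))  # small, second order in the size of z - x
```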
where αi are positive constants. Each gate has a delay di, which is given by
$$d_i = \beta_i + \gamma_i C_i^{\mathrm{load}} / x_i,$$
where $\beta_i$ and $\gamma_i$ are positive constants, and $C_i^{\mathrm{load}}$ is the load capacitance of gate $i$. Note that the gate delay $d_i$ is always larger than $\beta_i$, which can be interpreted as the minimum possible delay of gate $i$, achieved only in the limit as the gate scale factor becomes large. The load capacitance of gate $i$ is given by
$$C_i^{\mathrm{load}} = C_i^{\mathrm{ext}} + \sum_{j \in \mathrm{FO}(i)} C_j^{\mathrm{in}},$$
where $C_i^{\mathrm{ext}}$ is a positive constant that accounts for the capacitance of the interconnect wires and external circuitry. We will follow a simple design method, which assigns an equal delay $T$ to all gates in the circuit, i.e., we have $d_i = T$, where $T > 0$ is given. For a given value of $T$, there may or may not exist a feasible design (i.e., a choice of the $x_i$, with $1 \leq x_i \leq x^{\max}$) that yields $d_i = T$ for $i = 1, \ldots, n$. We can assume, of course, that $T > \max_i \beta_i$, i.e., $T$ is larger than the largest minimum delay of the gates. Finally, we get to the problem.
(a) Explain how to find a design $x^\star \in \mathbf{R}^n$ that minimizes $T$, subject to a given area constraint $A \leq A^{\max}$. You can assume the fanout lists, and all constants in the problem description are known; your job is to find the scale factors $x_i$. Be sure to explain how you determine if the design problem is feasible, i.e., whether or not there is an $x$ that gives $d_i = T$, with $1 \leq x_i \leq x^{\max}$, and $A \leq A^{\max}$. Your method can involve any of the methods or concepts we have seen so far in the course. It can also involve a simple search procedure, e.g., trying (many) different values of $T$ over a range. Note: this problem concerns the general case, and not the simple example shown above.
(b) Carry out your method on the particular circuit with data given in the file gate_sizing_data.m. The fan-out lists are given as an $n \times n$ matrix F, with $i, j$ entry one if $j \in \mathrm{FO}(i)$, and zero otherwise. In other words, the $i$th row of F gives the fanout of gate $i$. The $j$th entry in the $i$th row is 1 if gate $j$ is in the fan-out of gate $i$, and 0 otherwise.
Comments and hints.
2.19 Some matrices from signal processing. We consider $x \in \mathbf{R}^n$ as a signal, with $x_i$ the (scalar) value of the signal at (discrete) time period $i$, for $i = 1, \ldots, n$. Below we describe several transformations of the signal $x$, that produce a new signal $y$ (whose dimension varies). For each one, find a matrix $A$ for which $y = Ax$.
(a) 2× up-conversion with linear interpolation. We take $y \in \mathbf{R}^{2n-1}$. For $i$ odd, $y_i = x_{(i+1)/2}$. For $i$ even, $y_i = (x_{i/2} + x_{i/2+1})/2$. Roughly speaking, this operation doubles the sample rate, inserting new samples in between the original ones using linear interpolation.
(b) 2× down-sampling. We assume here that $n$ is even, and take $y \in \mathbf{R}^{n/2}$, with $y_i = x_{2i}$.
(c) 2× down-sampling with averaging. We assume here that $n$ is even, and take $y \in \mathbf{R}^{n/2}$, with $y_i = (x_{2i-1} + x_{2i})/2$.
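Before writing down the matrices, it can help to apply the three transformations entry by entry. This is a minimal Python sketch that follows the index formulas above; the function names are assumptions, and the code converts the text's 1-based indices to Python's 0-based lists. It does not construct the matrices $A$ that the exercise asks for.

```python
def upconvert(x):
    """2x up-conversion with linear interpolation; returns 2n - 1 samples."""
    y = []
    for i in range(1, 2 * len(x)):                      # i is the 1-based index from the text
        if i % 2 == 1:
            y.append(x[(i + 1) // 2 - 1])               # y_i = x_{(i+1)/2}
        else:
            y.append((x[i // 2 - 1] + x[i // 2]) / 2)   # y_i = (x_{i/2} + x_{i/2+1}) / 2
    return y


def downsample(x):
    """2x down-sampling: y_i = x_{2i} (n assumed even)."""
    return [x[2 * i - 1] for i in range(1, len(x) // 2 + 1)]


def downsample_avg(x):
    """2x down-sampling with averaging: y_i = (x_{2i-1} + x_{2i}) / 2 (n assumed even)."""
    return [(x[2 * i - 2] + x[2 * i - 1]) / 2 for i in range(1, len(x) // 2 + 1)]


x = [1.0, 3.0, 5.0, 7.0]
y_up = upconvert(x)
y_down = downsample(x)
y_avg = downsample_avg(x)
```

Each output entry is a fixed linear combination of input entries, which is why every one of these operations can be written as $y = Ax$.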
2.20 Smallest input that drives a system to a desired steady-state output. We start with the discrete-time model of the system used in pages 16-19 of lecture 1:
$$x(t+1) = A_d x(t) + B_d u(t), \qquad y(t) = C_d x(t), \qquad t = 1, 2, \ldots,$$
where $A_d \in \mathbf{R}^{16 \times 16}$, $B_d \in \mathbf{R}^{16 \times 2}$, $C_d \in \mathbf{R}^{2 \times 16}$. The system starts from the zero state, i.e., $x(1) = 0$. (We start from initial time $t = 1$ rather than the more conventional $t = 0$ since Matlab indexes vectors starting from 1, not 0.) The data for this problem can be found in ss_small_input_data.m. The goal is to find an input $u$ that results in $y(t) \to y_{\mathrm{des}} = (1, -2)$ as $t \to \infty$ (i.e., asymptotic convergence to a desired output) or, even better, an input $u$ that results in $y(t) = y_{\mathrm{des}}$ for $t = T + 1, \ldots$ (i.e., exact convergence after $T$ steps).
(a) Steady-state analysis for desired constant output. Suppose that the system is in steady-state, i.e., $x(t) = x_{\mathrm{ss}}$, $u(t) = u_{\mathrm{ss}}$ and $y(t) = y_{\mathrm{des}}$ are constant (do not depend on $t$). Find $u_{\mathrm{ss}}$ and $x_{\mathrm{ss}}$.
(b) Simple simulation. Find $y(t)$, with initial state $x(1) = 0$, with $u(t) = u_{\mathrm{ss}}$, for $t = 1, \ldots, 20000$. Plot $u$ and $y$ versus $t$. If you’ve done everything right, you should observe that $y(t)$ appears to be converging to $y_{\mathrm{des}}$. You can use the following Matlab code to obtain plots that look like the ones in lecture 1.
    figure;
    subplot(411); plot(u(1,:));
    subplot(412); plot(u(2,:));
    subplot(413); plot(y(1,:));
    subplot(414); plot(y(2,:));
Here we assume that u and y are 2 × 20000 matrices. There will be two differences between these plots and those in lecture 1: These plots start from $t = 1$, and the plots in lecture 1 scale $t$ by a factor of 0.1.
(c) Smallest input. Let $u^\star(t)$, for $t = 1, \ldots, T$, be the input with minimum RMS value
$$\left( \frac{1}{T} \sum_{t=1}^{T} \|u(t)\|^2 \right)^{1/2}$$
that yields $x(T + 1) = x_{\mathrm{ss}}$ (the value found in part (a)). Note that if $u(t) = u^\star(t)$ for $t = 1, \ldots, T$, and then $u(t) = u_{\mathrm{ss}}$ for $t = T + 1, T + 2, \ldots$, then $y(t) = y_{\mathrm{des}}$ for $t \geq T + 1$. In other words, we have exact convergence to the desired output in $T$ steps. For the three cases $T = 100$, $T = 200$, and $T = 500$, find $u^\star$ and its associated RMS value. For each of these three cases, plot $u$ and $y$ versus $t$.
3.1 Price elasticity of demand. The demand for $n$ different goods as a function of their prices is described by a function $f(\cdot)$ from $\mathbf{R}^n$ to $\mathbf{R}^n$:
$$q = f(p),$$
where $p$ is the price vector, and $q$ is the demand vector. Linear models of the demand function are often used to analyze the effects of small price changes. Denoting the current price and current demand vectors by $p^*$ and $q^*$, we have that $q^* = f(p^*)$, and the linear approximation is:
$$q^* + \delta q \approx f(p^*) + \left. \frac{df}{dp} \right|_{p^*} \delta p.$$
This is usually rewritten in terms of the elasticity matrix $E$, with entries
$$e_{ij} = \left. \frac{df_i}{dp_j} \right|_{p^*} \frac{1/q_i^*}{1/p_j^*}$$
(i.e., relative change in demand per relative change in price). Define the vector $y$ of relative demand changes, and the vector $x$ of relative price changes,
$$y_i = \frac{\delta q_i}{q_i^*}, \qquad x_j = \frac{\delta p_j}{p_j^*},$$
and, finally, we have the linear model $y = Ex$. Here are the questions:
(a) What is a reasonable assumption about the diagonal elements $e_{ii}$ of the elasticity matrix?
(b) Consider two goods. The off-diagonal elements of $E$ describe how the demand for one good is affected by changes in the price of the other good. Assume $e_{11} = e_{22} = -1$ and $e_{12} = e_{21}$, that is,
$$E = \begin{bmatrix} -1 & e_{12} \\ e_{12} & -1 \end{bmatrix}.$$
Two goods are called substitutes if they provide a similar service or other satisfaction (for example: train tickets and bus tickets, cake and pie, etc.). Two goods are called complements if they tend to be used together (for example: automobiles and gasoline, left and right shoes, etc.). For each of these two generic situations, what can you say about $e_{12}$?
(c) Suppose the price elasticity of demand matrix is
$$E = \begin{bmatrix} -1 & -1 \\ -1 & -1 \end{bmatrix}.$$
Describe the nullspace of $E$, and give an interpretation (in one or two sentences). What kind of goods could have such an elasticity matrix?
3.2 Color perception. Human color perception is based on the responses of three different types of color light receptors, called cones. The three types of cones have different spectral response characteristics and are called L, M, and S because they respond mainly to long, medium, and short wavelengths, respectively. In this problem we will divide the visible spectrum into 20 bands, and model the cones’ response as follows:
$$L_{\mathrm{cone}} = \sum_{i=1}^{20} l_i p_i, \qquad M_{\mathrm{cone}} = \sum_{i=1}^{20} m_i p_i, \qquad S_{\mathrm{cone}} = \sum_{i=1}^{20} s_i p_i,$$
where $p_i$ is the incident power in the $i$th wavelength band, and $l_i$, $m_i$ and $s_i$ are nonnegative constants that describe the spectral response of the different cones. The perceived color is a complex function of the three cone responses, i.e., the vector $(L_{\mathrm{cone}}, M_{\mathrm{cone}}, S_{\mathrm{cone}})$, with different cone response vectors perceived as different colors. (Actual color perception is a bit more complicated than this, but the basic idea is right.)
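Each cone response is just an inner product between a coefficient vector and the spectrum. The sketch below evaluates the three sums for a made-up example; it uses 4 bands instead of 20 to stay short, and every number and name in it is an assumption for illustration, not data from color_perception.m.

```python
def cone_response(l, m, s, p):
    """Return (L_cone, M_cone, S_cone) as sums of coefficient * band power."""
    def dot(c):
        return sum(ci * pi for ci, pi in zip(c, p))
    return (dot(l), dot(m), dot(s))


# Hypothetical nonnegative response coefficients over 4 bands (the problem uses 20).
l = [1.0, 2.0, 0.0, 0.0]
m = [0.0, 1.0, 2.0, 0.0]
s = [0.0, 0.0, 1.0, 3.0]

p = [2.0, 1.0, 0.5, 1.0]              # hypothetical incident power per band
response = cone_response(l, m, s, p)
```

Stacking the three coefficient vectors as rows of a matrix turns this computation into a single matrix-vector product.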
(a) Metamers. When are two light spectra, $p$ and $\tilde{p}$, visually indistinguishable? (Visually identical lights with different spectral power compositions are called metamers.)
(b) Visual color matching. In a color matching problem, an observer is shown a test light and is asked to change the intensities of three primary lights until the sum of the primary lights looks like the test light. In other words, the observer is asked to find a spectrum of the form $p_{\mathrm{match}} = a_1 u + a_2 v + a_3 w$, where $u$, $v$, $w$ are the spectra of the primary lights, and $a_i$ are the (nonnegative) intensities to be found, that is visually indistinguishable from a given test light spectrum $p_{\mathrm{test}}$. Can this always be done? Discuss briefly.
(c) Visual matching with phosphors. A computer monitor has three phosphors, R, G, and B. It is desired to adjust the phosphor intensities to create a color that looks like a reference test light. Find weights that achieve the match or explain why no such weights exist. The data for this problem is in the m-file color_perception.m. Running color_perception will define and plot the vectors wavelength, B_phosphor, G_phosphor, R_phosphor, L_coefficients, M_coefficients, S_coefficients, and test_light.
(d) Effects of illumination. An object’s surface can be characterized by its reflectance (i.e., the fraction of light it reflects) for each band of wavelengths. If the object is illuminated with a light spectrum characterized by $I_i$, and the reflectance of the object is $r_i$ (which is between 0 and 1), then the reflected light spectrum is given by $I_i r_i$, where $i = 1, \ldots, 20$ denotes the wavelength band. Now consider two objects illuminated (at different times) by two different light sources, say an incandescent bulb and sunlight. Sally argues that if the two objects look identical when illuminated by a tungsten bulb, they will look identical when illuminated by sunlight. Beth disagrees: she says that two objects can appear identical when illuminated by a tungsten bulb, but look different when lit by sunlight. Who is right? If Sally is right, explain why. If Beth is right, give an example of two objects that appear identical under one light source and different under another. You can use the vectors sunlight and tungsten defined in color_perception.m as the light sources.
to derive a bound on how large the error can be. You will do that here. In fact you will prove that
$$0 \leq \eta \leq \frac{\alpha^2}{2}$$
where $\alpha = \|\delta x\| / \|x_0 - a\|$ is the relative size of $\delta x$. For example, for a relative displacement of $\alpha = 1\%$, we have $\eta \leq 0.00005$, i.e., the linearized model is accurate to about 0.005%. To prove this bound you can proceed as follows:
$\sqrt{1 + \alpha^2 + 2\beta} - (1 + \beta)$ where $\beta = k^T \delta x / \|x_0 - a\|$.
$\sqrt{1 + \alpha^2 + 2\beta} - (1 + \beta)$ with $|\beta| \leq \alpha$. By maximizing and minimizing $g$ over the interval $-\alpha \leq \beta \leq \alpha$ show that
$$0 \leq \eta \leq \frac{\alpha^2}{2}.$$
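3.7 Orthogonal complement of a subspace. If $\mathcal{V}$ is a subspace of $\mathbf{R}^n$ we define $\mathcal{V}^\perp$ as the set of vectors orthogonal to every element in $\mathcal{V}$, i.e.,
$$\mathcal{V}^\perp = \{\, x \mid \langle x, y \rangle = 0, \ \forall y \in \mathcal{V} \,\}.$$
(a) Verify that $\mathcal{V}^\perp$ is a subspace of $\mathbf{R}^n$.
(b) Suppose $\mathcal{V}$ is described as the span of some vectors $v_1, v_2, \ldots, v_r$. Express $\mathcal{V}$ and $\mathcal{V}^\perp$ in terms of the matrix $V = \begin{bmatrix} v_1 & v_2 & \cdots & v_r \end{bmatrix} \in \mathbf{R}^{n \times r}$ using common terms (range, nullspace, transpose, etc.).
(c) Show that every $x \in \mathbf{R}^n$ can be expressed uniquely as $x = v + v^\perp$ where $v \in \mathcal{V}$, $v^\perp \in \mathcal{V}^\perp$. Hint: let $v$ be the projection of $x$ on $\mathcal{V}$.
(d) Show that $\dim \mathcal{V}^\perp + \dim \mathcal{V} = n$.
(e) Show that $\mathcal{V} \subseteq \mathcal{U}$ implies $\mathcal{U}^\perp \subseteq \mathcal{V}^\perp$.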
3.8 Consider the linearized navigation equations from the lecture notes. Find the conditions under which A has full rank. Describe the conditions geometrically (i.e., in terms of the relative positions of the unknown coordinates and the beacons).
3.9 Suppose that $\angle(Ax, x) = 0$ for all $x \in \mathbf{R}^n$, i.e., $x$ and $Ax$ always point in the same direction. What can you say about the matrix $A$? Be very specific.
3.10 Proof of Cauchy-Schwarz inequality. You will prove the Cauchy-Schwarz inequality.
(a) Suppose $a \geq 0$, $c \geq 0$, and for all $\lambda \in \mathbf{R}$, $a + 2b\lambda + c\lambda^2 \geq 0$. Show that $|b| \leq \sqrt{ac}$.
(b) Given $v, w \in \mathbf{R}^n$ explain why $(v + \lambda w)^T (v + \lambda w) \geq 0$ for all $\lambda \in \mathbf{R}$.
(c) Apply (a) to the quadratic resulting when the expression in (b) is expanded, to get the Cauchy-Schwarz inequality:
$$|v^T w| \leq \sqrt{v^T v} \, \sqrt{w^T w}.$$
(d) When does equality hold?
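A quick numerical check can make the structure of this proof tangible. The Python sketch below uses two arbitrary example vectors (an assumption, as are all names) to evaluate both sides of the inequality and to sample the quadratic from part (b) at several values of $\lambda$; it is a sanity check, not a proof.

```python
import math


def inner(v, w):
    """Standard inner product v^T w."""
    return sum(a * b for a, b in zip(v, w))


v = [1.0, 2.0, 3.0]       # arbitrary example vectors
w = [4.0, -5.0, 6.0]

lhs = abs(inner(v, w))
rhs = math.sqrt(inner(v, v)) * math.sqrt(inner(w, w))

# The quadratic from part (b): (v + lam*w)^T (v + lam*w) = a + 2*b*lam + c*lam^2.
a, b, c = inner(v, v), inner(v, w), inner(w, w)
lambdas = [k / 10.0 for k in range(-50, 51)]
quad_min = min(a + 2 * b * lam + c * lam ** 2 for lam in lambdas)
```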
3.11 Vector spaces over the Boolean field. In this course the scalar field, i.e., the components of vectors, will usually be the real numbers, and sometimes the complex numbers. It is also possible to consider vector spaces over other fields, for example $\mathbf{Z}_2$, which consists of the two numbers 0 and 1, with Boolean addition and multiplication (i.e., $1 + 1 = 0$). Unlike $\mathbf{R}$ or $\mathbf{C}$, the field $\mathbf{Z}_2$ is finite, indeed, has only two elements. A vector in $\mathbf{Z}_2^n$ is called a Boolean vector. Much of the linear algebra for $\mathbf{R}^n$ and $\mathbf{C}^n$ carries over to $\mathbf{Z}_2^n$. For example, we define a function $f : \mathbf{Z}_2^n \to \mathbf{Z}_2^m$ to be linear (over $\mathbf{Z}_2$) if $f(x + y) = f(x) + f(y)$ and $f(\alpha x) = \alpha f(x)$ for every $x, y \in \mathbf{Z}_2^n$ and $\alpha \in \mathbf{Z}_2$. It is easy to show that every linear function can be expressed as matrix multiplication, i.e., $f(x) = Ax$, where $A \in \mathbf{Z}_2^{m \times n}$ is a Boolean matrix, and all the operations in the matrix multiplication are Boolean, i.e., in $\mathbf{Z}_2$. Concepts like nullspace, range, independence and rank are all defined in the obvious way for vector spaces over $\mathbf{Z}_2$. Although we won’t consider them in this course, there are many important applications of vector spaces and linear dynamical systems over $\mathbf{Z}_2$. In this problem you will explore one simple example: block codes.
Linear block codes. Suppose $x \in \mathbf{Z}_2^n$ is a Boolean vector we wish to transmit over an unreliable channel. In a linear block code, the vector $y = Gx$ is formed, where $G \in \mathbf{Z}_2^{m \times n}$ is the coding matrix, and $m > n$. Note that the vector $y$ is ‘redundant’; roughly speaking we have coded an $n$-bit vector as a (larger) $m$-bit vector. This is called an $(n, m)$ code. The coded vector $y$ is transmitted over the channel; the received signal $\hat{y}$ is given by $\hat{y} = y + v$, where $v$ is a noise vector (which usually is zero). This means that when $v_i = 0$, the $i$th bit is transmitted correctly; when $v_i = 1$, the $i$th bit is changed during transmission. In a linear decoder, the received signal is multiplied by another matrix: $\hat{x} = H\hat{y}$, where $H \in \mathbf{Z}_2^{n \times m}$. One reasonable requirement is that if the transmission is perfect, i.e., $v = 0$, then the decoding is perfect, i.e., $\hat{x} = x$. This holds if and only if $H$ is a left inverse of $G$, i.e., $HG = I_n$, which we assume to be the case.
(a) What is the practical significance of $\mathcal{R}(G)$?
(b) What is the practical significance of $\mathcal{N}(H)$?
(c) A one-bit error correcting code has the property that for any noise $v$ with one component equal to one, we still have $\hat{x} = x$. Consider $n = 3$. Either design a one-bit error correcting linear block code with the smallest possible $m$, or explain why it cannot be done. (By design we mean, give $G$ and $H$ explicitly and verify that they have the required properties.)
Remark: linear decoders are never used in practice; there are far better nonlinear ones.
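The encode, transmit, decode chain can be tried out with arithmetic modulo 2. The following Python sketch uses a deliberately tiny, hypothetical code with $n = 1$ and $m = 3$ (a repetition code with a decoder that reads only the first bit); the matrices and function names are assumptions for illustration and are not a solution to part (c).

```python
def matvec_z2(M, x):
    """Matrix-vector product with all arithmetic carried out in Z_2."""
    return [sum(a * b for a, b in zip(row, x)) % 2 for row in M]


def add_z2(a, b):
    """Componentwise Boolean addition (1 + 1 = 0)."""
    return [(ai + bi) % 2 for ai, bi in zip(a, b)]


G = [[1], [1], [1]]     # hypothetical 3 x 1 coding matrix (n = 1, m = 3)
H = [[1, 0, 0]]         # a left inverse of G over Z_2, so H G = I_1

x = [1]
y = matvec_z2(G, x)                          # coded vector
x_hat_clean = matvec_z2(H, y)                # decoding with v = 0

v = [1, 0, 0]                                # noise that flips the first bit
x_hat_noisy = matvec_z2(H, add_z2(y, v))     # decoding the corrupted signal
```

In this toy case the decoder recovers $x$ when $v = 0$ but not when the first bit is flipped, so being a left inverse of $G$ is not enough by itself to correct errors.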
3.12 Quadratic extrapolation of a time series. We are given a series $z$ up to time $n$. Using a quadratic model, we want to extrapolate, or predict, $z(n+1)$ based on the three previous elements of the series: $z(n)$, $z(n-1)$, and $z(n-2)$. We’ll denote the predicted value of $z(n+1)$ by $y$. Another way to describe this problem is: find a quadratic function $f(t) = a_2 t^2 + a_1 t + a_0$ which satisfies $f(n) = z(n)$, $f(n-1) = z(n-1)$, and $f(n-2) = z(n-2)$. The extrapolated value is then given by $y = f(n+1)$. Let the vector $x$ denote the three previous elements of
(e) B is upper triangular, i.e., Bij = 0 for i > j. (f) B is lower triangular, i.e., Bij = 0 for i < j.
3.14 Nonlinear unbiased estimators. We consider the standard measurement setup:
y = Ax + v,
where $A \in \mathbf{R}^{m \times n}$, $x \in \mathbf{R}^n$ is the vector of parameters we wish to estimate, $y \in \mathbf{R}^m$ is the vector of measurements we take, and $v \in \mathbf{R}^m$ is the vector of measurement errors and noise. You may not assume anything about the dimensions of $A$, its rank, nullspace, etc. If the function $f : \mathbf{R}^m \to \mathbf{R}^n$ satisfies $f(Ax) = x$ for all $x \in \mathbf{R}^n$, then we say that $f$ is an unbiased estimator (of $x$, given $y$). What this means is that if $f$ is applied to our measurement vector, and $v = 0$, then $f$ returns the true parameter value $x$. In EE263 we have studied linear unbiased estimators, which are unbiased estimators that are also linear functions. Here, though, we allow the possibility that $f$ is nonlinear (which we take to mean, $f$ is not linear). One of the following statements is true. Pick the statement that is true, and justify it completely. You can quote any result given in the lecture notes.
A. There is no such thing as a nonlinear unbiased estimator. In other words, if $f$ is any unbiased estimator, then $f$ must be a linear function. (This statement is taken to be true if there are no unbiased estimators for a particular $A$.) If you believe this statement, explain why.
B. Nonlinear unbiased estimators do exist, but you don’t need them. In other words: it’s possible to have a nonlinear unbiased estimator. But whenever there is a nonlinear unbiased estimator, there is also a linear unbiased estimator. If you believe this statement, then give a specific example of a matrix $A$, and an unbiased nonlinear estimator. Explain in the general case why a linear unbiased estimator exists whenever there is a nonlinear one.
C. There are cases for which nonlinear unbiased estimators exist, but no linear unbiased estimator exists. If you believe this statement, give a specific example of a matrix $A$, and a nonlinear unbiased estimator, and also explain why no linear unbiased estimator exists.
3.15 Channel equalizer with disturbance rejection. A communication channel is described by $y = Ax + v$ where $x \in \mathbf{R}^n$ is the (unknown) transmitted signal, $y \in \mathbf{R}^m$ is the (known) received signal, $v \in \mathbf{R}^m$ is the (unknown) disturbance signal, and $A \in \mathbf{R}^{m \times n}$ describes the (known) channel. The disturbance $v$ is known to be a linear combination of some (known) disturbance patterns, $d_1, \ldots, d_k \in \mathbf{R}^m$. We consider linear equalizers for the channel, which have the form $\hat{x} = By$, where $B \in \mathbf{R}^{n \times m}$. (We’ll refer to the matrix $B$ as the equalizer; more precisely, you might say that $B_{ij}$ are the equalizer coefficients.) We say the equalizer $B$ rejects the disturbance pattern $d_i$ if $\hat{x} = x$, no matter what $x$ is, when $v = d_i$. If the equalizer rejects a set of disturbance patterns, for example, disturbances $d_1$, $d_3$, and $d_7$ (say), then it can reconstruct the transmitted signal exactly, when the disturbance $v$ is any linear combination of $d_1$, $d_3$, and $d_7$.
Here is the problem. For the problem data given in cedr_data.m, find an equalizer $B$ that rejects as many disturbance patterns as possible. (The disturbance patterns are given as an $m \times k$ matrix D, whose columns are the individual disturbance patterns.) Give the specific set of disturbance patterns that your equalizer rejects, as in ‘My equalizer rejects three disturbance patterns: $d_2$, $d_3$, and $d_6$.’ (We only need one set of disturbances of the maximum size.) Explain how you know that there is no equalizer that rejects more disturbance patterns than yours does. Show the Matlab verification that your $B$ does indeed reconstruct $x$, and rejects the disturbance patterns you claim it does. Show any other calculations needed to verify that your equalizer rejects the maximum number of patterns possible.
3.16 Identifying a point on the unit sphere from spherical distances. In this problem we consider the unit sphere in $\mathbf{R}^n$, which is defined as the set of vectors with norm one: $S^n = \{x \in \mathbf{R}^n \mid \|x\| = 1\}$. We define the spherical distance between two vectors on the unit sphere as the distance between them, measured along the sphere, i.e., as the angle between the vectors, measured in radians: If $x, y \in S^n$, the spherical distance between them is
$$\mathrm{sphdist}(x, y) = \angle(x, y),$$
where we take the angle as lying between 0 and $\pi$. (Thus, the maximum distance between two points in $S^n$ is $\pi$, which occurs only when the two points $x$, $y$ are antipodal, which means $x = -y$.) Now suppose $p_1, \ldots, p_k \in S^n$ are the (known) positions of some beacons on the unit sphere, and let $x \in S^n$ be an unknown point on the unit sphere. We have exact measurements of the (spherical) distances between each beacon and the unknown point $x$, i.e., we are given the numbers $\rho_i = \mathrm{sphdist}(x, p_i)$, $i = 1, \ldots, k$. We would like to determine, without any ambiguity, the exact position of $x$, based on this information. Find the conditions on $p_1, \ldots, p_k$ under which we can unambiguously determine $x$, for any $x \in S^n$, given the distances $\rho_i$. You can give your solution algebraically, using any of the concepts used in class (e.g., nullspace, range, rank), or you can give a geometric condition (involving the vectors $p_i$). You must justify your answer.
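For unit vectors the angle is the arccosine of the inner product, so the spherical distance is easy to evaluate. This is a minimal Python sketch; the function name mirrors the notation above, while the clamping of the cosine to $[-1, 1]$ (to guard against floating-point rounding) and the example points are assumptions added here.

```python
import math


def sphdist(x, y):
    """Angle in radians, between 0 and pi, between unit vectors x and y."""
    c = sum(a * b for a, b in zip(x, y))   # cosine of the angle, since both norms are 1
    c = max(-1.0, min(1.0, c))             # clamp to guard against rounding error
    return math.acos(c)


e1 = [1.0, 0.0, 0.0]
e2 = [0.0, 1.0, 0.0]
d_orthogonal = sphdist(e1, e2)                   # quarter of a great circle
d_antipodal = sphdist(e1, [-1.0, 0.0, 0.0])      # maximum possible distance
```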
3.17 Some true/false questions. Determine if the following statements are true or false. No justification or discussion is needed for your answers. What we mean by “true” is that the statement is true for all values of the matrices and vectors given. You can’t assume anything about the dimensions of the matrices (unless it’s explicitly stated), but you can assume that the dimensions are such that all expressions make sense. For example, the statement “$A + B = B + A$” is true, because no matter what the dimensions of $A$ and $B$ (which must, however, be the same), and no matter what values $A$ and $B$ have, the statement holds. As another example, the statement $A^2 = A$ is false, because there are (square) matrices for which this doesn’t hold. (There are also matrices for which it does hold, e.g., an identity matrix. But that doesn’t make the statement true.)
a. If all coefficients (i.e., entries) of the matrix $A$ are positive, then $A$ is full rank.
b. If $A$ and $B$ are onto, then $A + B$ must be onto.
c. If $A$ and $B$ are onto, then so is the matrix $\begin{bmatrix} A & C \\ 0 & B \end{bmatrix}$.