



This note presents inequalities in statistics and probability, specifically comparing random replacement schemes to sampling with replacement. The authors explore multivariate majorization and Schur functions, and give theorems and proofs related to negative association and birthday problems. The document cites various authors who have obtained inequalities for sampling schemes, but the question of characterizing the class of functions for which the inequalities hold remains unresolved. AMS 1980 subject classifications and key words and phrases related to the topic are included.




Inequalities in Statistics and Probability IMS Lecture Notes-Monograph Series Vol. 5 (1984), 35-
RANDOM REPLACEMENT SCHEMES AND MULTIVARIATE MAJORIZATION^1
BY SAMUEL KARLIN and YOSEF RINOTT Stanford University and the Hebrew University of Jerusalem
In this note we obtain certain inequalities comparing random replacement schemes to sampling with replacement. Some of the results are related to multivariate majorization and Schur functions.
1. Various Stochastic Comparisons and Random Replacement Schemes. Let $A = \{a_1, \dots, a_N\}$, $a_i \in \mathbb{R}$ = the real line. We shall consider a sample of size $n$ ($n \le N$) drawn from $A$, and denote the observations by $X_1, \dots, X_n$. In a symmetric random replacement scheme the observation $X_1$ is drawn with equal probabilities from $A$, i.e., $P(X_1 = a_i) = 1/N$, $i = 1, \dots, N$. The element drawn for $X_1$ is replaced in $A$ with probability $\pi_1$, and removed from $A$ with probability $1 - \pi_1$. Then $X_2$ is sampled, and the element which is drawn is replaced with probability $\pi_2$. Continuing to $X_{n-1}$, the vector $\pi = (\pi_1, \dots, \pi_{n-1})$ defines the random replacement scheme $R(\pi)$. Note that for $\pi = 0 = (0, \dots, 0)$, $R(\pi)$ is equivalent to sampling without replacement, while for $\pi = 1 = (1, \dots, 1)$, $R(\pi)$ corresponds to sampling with replacement and $X_1, \dots, X_n$ are i.i.d. It follows from Joag-Dev and Proschan (1983) that under $R(0)$, $X_1, \dots, X_n$ are negatively associated, i.e.,
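To make the scheme concrete, here is a minimal Python sketch that enumerates the exact joint law of the drawn indices under $R(\pi)$ for a small population. The function name `scheme_distribution` and the state representation are illustrative choices, not part of the original note, and the sketch assumes $n = \text{len}(\pi) + 1 \le N$.

```python
from fractions import Fraction

def scheme_distribution(N, pi):
    """Exact law of the index vector (j_1, ..., j_n) under R(pi), n = len(pi) + 1.

    Hypothetical helper: returns {tuple of drawn indices: probability}.
    """
    n = len(pi) + 1
    # state: (indices drawn so far, indices still in the urn) -> probability
    states = {((), tuple(range(N))): Fraction(1)}
    for step in range(n):
        nxt = {}
        for (drawn, avail), p in states.items():
            for idx in avail:
                p_draw = p / len(avail)          # uniform draw from the urn
                if step == n - 1:
                    # after the last draw, replacement no longer matters
                    outcomes = [(avail, Fraction(1))]
                else:
                    removed = tuple(j for j in avail if j != idx)
                    keep_prob = Fraction(pi[step])
                    outcomes = [(avail, keep_prob), (removed, 1 - keep_prob)]
                for new_avail, q in outcomes:
                    if q == 0:
                        continue
                    key = (drawn + (idx,), new_avail)
                    nxt[key] = nxt.get(key, 0) + p_draw * q
        states = nxt
    law = {}
    for (drawn, _), p in states.items():
        law[drawn] = law.get(drawn, 0) + p
    return law
```

With `pi = (0, 0)` and `N = 3` the law is uniform over the 6 orderings of distinct indices (sampling without replacement), and with `pi = (1, 1)` it is uniform over all 27 index triples (sampling with replacement).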
(1.1) $E\{\phi(X_i, i \in A)\,\psi(X_j, j \in B)\} \le E\{\phi(X_i, i \in A)\}\,E\{\psi(X_j, j \in B)\}$
for any partition $A, B$ of $\{1, \dots, n\}$, where $\phi$ and $\psi$ are increasing functions. In particular, (1.1) implies
(1.2a) $E\{\prod_{i=1}^n \phi_i(X_i)\} \le \prod_{i=1}^n E\phi_i(X_i)$
for any functions $\phi_i$, all increasing (or all decreasing) and nonnegative. Note that (1.2a) can be written as
(1.2b) $E_{R(0)}\{\prod_{i=1}^n \phi_i(X_i)\} \le E_{R(1)}\{\prod_{i=1}^n \phi_i(X_i)\}$.
Inequalities for sampling schemes were obtained by various authors including Sen (1970), Rosen (1972), Serfling (1973), Kemperman (1973), Karlin (1974), Van Zwet (1983), and Krafft and Schaefer (preprint). The question of characterizing the class of functions for which
(1.3) $E_{R(\pi)}\{\phi(X_1, \dots, X_n)\} \le E_{R(1)}\{\phi(X_1, \dots, X_n)\}$
remains unresolved. The next result provides a class of functions for which (1.3) holds.
THEOREM 1. $E_{R(\pi)}\{\prod_{i=1}^n \phi(X_i)\} \le E_{R(1)}\{\prod_{i=1}^n \phi(X_i)\}$ for any $\phi \ge 0$.
Proof.
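The inequality of Theorem 1 can be checked on small cases by exact computation. The following Python sketch evaluates $E_{R(\pi)}\{\prod_i \phi(X_i)\}$ by recursion on the contents of the urn; the function name `expected_product` and the convention `weights[k]` $= \phi(a_k)$ are assumptions made for this illustration, and the recursion is only meant for small $N$ and $n \le N$.

```python
from fractions import Fraction

def expected_product(weights, pi):
    """E_{R(pi)} of prod_i phi(X_i), with weights[k] = phi(a_k) >= 0.

    Hypothetical helper; n = len(pi) + 1 must not exceed len(weights).
    """
    pi = [Fraction(p) for p in pi]

    def rec(avail, step):
        # avail: tuple of weights currently in the urn
        if step == len(pi):
            return sum(avail) / len(avail)       # last draw
        keep = rec(avail, step + 1)              # drawn element was replaced
        total = Fraction(0)
        for k, w in enumerate(avail):
            drop = rec(avail[:k] + avail[k + 1:], step + 1)  # element removed
            total += w * (pi[step] * keep + (1 - pi[step]) * drop)
        return total / len(avail)

    return rec(tuple(Fraction(w) for w in weights), 0)
```

For `weights = [1, 2, 3]` and $n = 2$, sampling with replacement gives $(6/3)^2 = 4$ while sampling without replacement gives $(36 - 14)/6 = 11/3$, consistent with the bound.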
^1 Supported in part by NIH Grant 5R01GM10452-20 and NSF Grant MSC79-24310. AMS 1980 subject classifications. 62D05, 62H05. Key words and phrases: Schur functions, negative association, birthday problem.
Now $\sum_k \phi^2(a_k) \ge (\sum_k \phi(a_k))^2 / N$, and therefore with $\sum_k \phi(a_k) = m$ the last expression in (1.4) is bounded above by
$\pi_1 m^2/N^2 + \{(1 - \pi_1)/(N(N-1))\}\{m^2 - m^2/N\} = m^2/N^2 = N^{-2}(\sum_{k=1}^N \phi(a_k))^2$
and the case $n = 2$ is established. We now proceed by induction. Let $\psi(X_1, \dots, X_n) = \prod_{i=1}^n \phi(X_i)$. Then (see Karlin (1974), Lemma 3.1)
where $E^{(k)}$ computes the expectation when $a_k$ is removed from the sample space of $X_2, \dots, X_n$. Invoking the induction hypothesis, i.e., Theorem 1 holding for $n - 1$ variables, this leads to
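Hence in order to complete the induction argument it suffices to prove
(1.5) $E_{R(0,1,\dots,1)}\psi(X_1, \dots, X_n) \le E_{R(1,\dots,1)}\psi(X_1, \dots, X_n)$.
Since $\phi(a_i)$ is only a relabeling of $a_i$ we assume $\phi(a_i) = a_i \ge 0$ and also, without loss of generality, $a_1 \le a_2 \le \dots \le a_N$. With this (1.5) becomes
(1.6) $\{N(N-1)^{n-1}\}^{-1} \sum_{k=1}^N a_k (\sum_{j=1}^N a_j - a_k)^{n-1} \le N^{-n} (\sum_{j=1}^N a_j)^n$
and with $b_k = a_k / (\sum_{j=1}^N a_j)$, so that $\sum_{k=1}^N b_k = 1$, we obtain that (1.6) is equivalent to
(1.7) $\sum_{k=1}^N b_k (1 - b_k)^{n-1} \le ((N-1)/N)^{n-1}$.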
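As a quick sanity check of the inequality $\sum_k b_k(1-b_k)^{n-1} \le ((N-1)/N)^{n-1}$ on the simplex, the sketch below maximizes the left side exactly over a rational grid. The helper names `lhs_17`, `simplex_grid` and `max_over_grid`, and the grid resolution `M`, are illustrative assumptions; a grid search is of course not a proof.

```python
from fractions import Fraction
from itertools import product

def lhs_17(b, n):
    """Left side sum_k b_k (1 - b_k)^(n-1) for a point b of the simplex."""
    return sum(bk * (1 - bk) ** (n - 1) for bk in b)

def simplex_grid(N, M):
    """All points of the N-simplex whose coordinates are multiples of 1/M."""
    for parts in product(range(M + 1), repeat=N - 1):
        if sum(parts) <= M:
            last = M - sum(parts)
            yield tuple(Fraction(p, M) for p in parts + (last,))

def max_over_grid(N, n, M):
    """Exact maximum of the left side over the grid (n <= N assumed)."""
    return max(lhs_17(b, n) for b in simplex_grid(N, M))
```

When `M` is a multiple of `N` the grid contains the uniform point $b_k = 1/N$, where the left side equals $((N-1)/N)^{n-1}$ exactly, so the grid maximum can be compared with the bound without rounding error.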
We now prove (1.7) by induction on $N$. For $N = 1$, (1.7) is trivial. If the maximum of $\sum_{k=1}^N b_k (1 - b_k)^{n-1}$ is obtained at a boundary point of the simplex $\{b_i \ge 0, \sum_{i=1}^N b_i = 1\}$, then some $b_i = 0$ and by the induction hypothesis at the maximum point
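If the maximum is at an interior point, then by differentiating $\sum_k b_k (1 - b_k)^{n-1} - \lambda(\sum_k b_k - 1)$ we obtain the equation $(1 - b_k)^{n-1} - (n-1) b_k (1 - b_k)^{n-2} - \lambda = 0$, and equivalently
(1.8) $n(1 - b_k)^{n-1} - (n-1)(1 - b_k)^{n-2} - \lambda = 0$.
Summing in (1.8) over $k$ we have
(1.9) $n \sum_{k=1}^N (1 - b_k)^{n-1} - (n-1) \sum_{k=1}^N (1 - b_k)^{n-2} - N\lambda = 0$.
Now, using the Tchebycheff rearrangement inequality,
$\sum_{k=1}^N (1 - b_k)^{n-1} \ge N^{-1} \sum_{k=1}^N (1 - b_k) \sum_{k=1}^N (1 - b_k)^{n-2} = ((N-1)/N) \sum_{k=1}^N (1 - b_k)^{n-2}$
and therefore from (1.9)
$(N/n)\lambda \ge ((N-1)/N) \sum_{k=1}^N (1 - b_k)^{n-2} - ((n-1)/n) \sum_{k=1}^N (1 - b_k)^{n-2} \ge 0$,
the last inequality holding because $n \le N$.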
Returning to the expression in (1.8), $\lambda \ge 0$ implies that the polynomial $n x^{n-1} - (n-1) x^{n-2} - \lambda$ has only one positive root by Descartes' rule of signs. Therefore, an interior maximum of
Thus, again (1.16) has a unique critical point with $b_k = 1/N$. Since for $b_k = 1/N$ (1.13) holds with equality, the result follows. □
Examples. Theorem 2 implies inequality (1.12) for $\psi(x_1, \dots, x_n) = (x_1 + \dots + x_n)^{\alpha}$ for any integer $\alpha > 0$. For $\sum_{i=1}^n x_i < 1$ we obtain by expansion that (1.12) also holds with $\psi(x_1, \dots, x_n) = [1 - (x_1 + \dots + x_n)^{\alpha}]^{-1}$ or any positive combination $\sum_k c_k (x_1 + \dots + x_n)^{\alpha_k}$, $\alpha_k \ge 0$ integers. The preceding inequalities are related to multivariate majorization and Schur functions, as explained next.
2. Multivariate Majorization and Negative Association. A function $\phi(\cdot)$ defined on $\mathbb{R}^N$ is said to be Schur concave if $\phi(x) \ge \phi(y)$ for any $x, y \in \mathbb{R}^N$ such that $x = yM$ for some matrix $M \in \mathcal{D}$ = the class of $N \times N$ doubly stochastic matrices. See Marshall and Olkin (1979) for details, references and historical remarks. Let $X$ and $Y$ be $n \times N$ matrices whose columns are $x_1, \dots, x_N$ and $y_1, \dots, y_N$, respectively. The inequality $\sum_{i=1}^N g(x_i) \ge \sum_{i=1}^N g(y_i)$ holds for every concave function $g$ defined on $\mathbb{R}^n$ if and only if there exists a matrix $M \in \mathcal{D}$ such that $X = YM$. (This result is due to Hardy, Littlewood and Pólya (1934) for $n = 1$, and to Sherman (1951), Stein and Blackwell (1953).) In particular, the matrix function $\psi(X) = \sum_{i=1}^N g(x_i)$ satisfies $\psi(X) \ge \psi(Y)$ whenever $X = YM$, provided $g$ is concave. Related notions of multivariate Schur concavity and probabilistic applications were studied by Rinott (1973), Marshall and Olkin (1979), Karlin and Rinott (1981), Tong (1982) and Karlin and Rinott (1983). In some of the applications one obtains the inequality $\psi(X) \ge \psi(Y)$ whenever $X = YM$ where $M$ belongs to a subclass of $\mathcal{D}$. Of particular interest is the class $\mathcal{T}$ of matrices which can be represented as products of matrices of the form $tI + (1-t)P$ where $I$ is the $N \times N$ identity matrix, $P$ is a permutation matrix which interchanges only two coordinates, and $0 \le t \le 1$. Our next theorem describes an example of Schur concavity with respect to the class $\mathcal{T}$. A probabilistic interpretation of the result in terms of a birthday problem of coincidence probabilities will be given. We first need a lemma which extends Ostrowski's (1952) well-known criterion for Schur concavity. The proof can be found in Rinott (1973), Marshall and Olkin (1979).
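The "if" direction of the column-sum characterization is easy to illustrate numerically. The following Python sketch builds one matrix of the form $tI + (1-t)P$, forms $X = YM$, and compares $\sum_i g(x_i)$ with $\sum_i g(y_i)$ for two concave choices of $g$. The helper names `t_transform`, `matmul` and `column_sum`, and the sample matrix `Y`, are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction

def t_transform(N, j, k, t):
    """The N x N matrix t*I + (1-t)*P, where P swaps coordinates j and k."""
    t = Fraction(t)
    M = [[Fraction(int(r == c)) for c in range(N)] for r in range(N)]
    M[j][j] = M[k][k] = t
    M[j][k] = M[k][j] = 1 - t
    return M

def matmul(A, B):
    """Plain matrix product of two lists-of-rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def column_sum(g, X):
    """psi(X) = sum of g over the columns of X."""
    return sum(g(col) for col in zip(*X))

# Illustrative data: n = 2 rows, N = 3 columns.
Y = [[Fraction(1), Fraction(4), Fraction(2)],
     [Fraction(0), Fraction(3), Fraction(5)]]
M = t_transform(3, 0, 2, Fraction(1, 3))   # doubly stochastic
X = matmul(Y, M)

neg_sq = lambda col: -sum(v * v for v in col)   # concave
print(column_sum(neg_sq, X) >= column_sum(neg_sq, Y))
print(column_sum(min, X) >= column_sum(min, Y))  # min is concave too
```

Exact rational arithmetic is used so that the comparison is not affected by rounding.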
LEMMA 1. A differentiable function $\psi: \mathbb{R}^{n \times N} \to \mathbb{R}$ is multivariate Schur concave with respect to $\mathcal{T}$, i.e., $\psi(X) \le \psi(XT)$ for every $T \in \mathcal{T}$ and $n \times N$ matrix $X = \|x^i_j\|$, if and only if (i) $\psi(X) = \psi(XP)$ for every $N \times N$ permutation matrix $P$; and (ii) $\sum_{i=1}^n (x^i_j - x^i_k)[\partial\psi(X)/\partial x^i_j - \partial\psi(X)/\partial x^i_k] \le 0$ for all $1 \le j \ne k \le N$.
Let $\alpha^1, \dots, \alpha^n \in \mathbb{R}^N$ denote the $n$ rows of the $n \times N$ matrix $A$, $\alpha^i = (\alpha^i_1, \dots, \alpha^i_N)$, $i = 1, \dots, n$. We assume that the rows are similarly ordered, that is, $(\alpha^i_j - \alpha^i_k)(\alpha^{i'}_j - \alpha^{i'}_k) \ge 0$ for all $1 \le j, k \le N$, $1 \le i, i' \le n$. Note that if $T = tI + (1-t)P$ where $P$ is a permutation matrix that interchanges only two coordinates, then applied to these two coordinates, $T$ operates like the matrix $\begin{pmatrix} t & 1-t \\ 1-t & t \end{pmatrix}$, which preserves the order if $t > 1/2$ and reverses the order if $t < 1/2$. If the rows of $A$ are similarly ordered, then so are the rows of $AT$ for any $T \in \mathcal{T}$.
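The statement about the two affected coordinates can be verified by a one-line computation. In the worked equation below, $u$ and $v$ are names introduced here for the two entries of a row that the matrix acts on; they do not appear in the original text.

```latex
\[
(u,\ v)\begin{pmatrix} t & 1-t \\ 1-t & t \end{pmatrix}
  = \bigl(tu + (1-t)v,\ (1-t)u + tv\bigr),
\qquad
\bigl(tu + (1-t)v\bigr) - \bigl((1-t)u + tv\bigr) = (2t-1)(u - v).
\]
```

The new difference therefore has the same sign as $u - v$ when $t > 1/2$, the opposite sign when $t < 1/2$, and vanishes at $t = 1/2$.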
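THEOREM 3. Let $\psi(A)$ be defined by
(2.1) $\psi(A) = \sum \alpha^1_{j_1} \alpha^2_{j_2} \cdots \alpha^n_{j_n}$
where the sum extends over all $N(N-1)\cdots(N-n+1)$ vectors $(j_1, \dots, j_n)$ of $n$ different indices between 1 and $N$, $A = \|\alpha^i_j\|$ is $n \times N$ satisfying $\alpha^i_j \ge 0$, $i = 1, \dots, n$, $j = 1, \dots, N$, $n \le N$, and the rows of $A$, $\alpha^1, \dots, \alpha^n$, are similarly ordered. Then $\psi(A) \le \psi(AT)$ for all $T \in \mathcal{T}$.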
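For small matrices the function $\psi$, read here as the sum of products $\alpha^1_{j_1}\cdots\alpha^n_{j_n}$ over vectors of $n$ different indices, can be evaluated by brute force. The sketch below does this and compares $\psi(A)$ with $\psi(AT)$ for a single matrix $T = tI + (1-t)P$; the names `psi` and `apply_T` and the sample matrix are illustrative assumptions, and only single factors of this form (not general products) are tried.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def psi(A):
    """Sum of A[0][j1] * ... * A[n-1][jn] over ordered tuples of distinct indices."""
    n, N = len(A), len(A[0])
    return sum(prod(A[i][j[i]] for i in range(n))
               for j in permutations(range(N), n))

def apply_T(A, j, k, t):
    """Return A T for T = t*I + (1-t)*P, with P swapping columns j and k."""
    t = Fraction(t)
    out = []
    for row in A:
        new = [Fraction(v) for v in row]
        new[j] = t * row[j] + (1 - t) * row[k]
        new[k] = (1 - t) * row[j] + t * row[k]
        out.append(new)
    return out

# Nonnegative 2 x 3 example whose rows are similarly ordered.
A = [[1, 2, 4],
     [0, 5, 5]]
print(psi(A), psi(apply_T(A, 0, 2, Fraction(1, 4))))
```

Brute-force enumeration costs on the order of $N^n$ products, so this is only intended for checking small examples.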
Proof. In view of Lemma 1, we compute
Let
Then
(2.2) $\partial\psi/\partial\alpha^1_1 - \partial\psi/\partial\alpha^1_2 = (\sum_{k=2}^n \alpha^k_2 u_k + u_1) - (\sum_{k=2}^n \alpha^k_1 u_k + u_1)$.
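Therefore $\sum_{i=1}^n (\alpha^i_1 - \alpha^i_2)(\partial\psi/\partial\alpha^i_1 - \partial\psi/\partial\alpha^i_2) = \sum_i \sum_{k \ne i} (\alpha^i_1 - \alpha^i_2)(\alpha^k_2 - \alpha^k_1) u_k \le 0$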
since similar ordering implies $(\alpha^i_1 - \alpha^i_2)(\alpha^k_1 - \alpha^k_2) \ge 0$, and replacing 1, 2 by any pair of indices the required result follows from Lemma 1. □
Note that Theorem 3 involves Schur concavity with respect to $\mathcal{T}$ on the set of nonnegative $n \times N$ matrices having similarly ordered rows. In the proof of Theorem 3 consider the subclass $\mathcal{S}$ of $\mathcal{T}$ consisting of finite products of matrices of the form $T = tI + (1-t)P$, $P$ a permutation matrix that interchanges only two adjacent coordinates and $1/2 \le t \le 1$. Such a $T$ preserves the ordering of the components when applied to a vector. The calculation in (2.2) implies $(\alpha^1_1 - \alpha^1_2)(\partial\psi/\partial\alpha^1_1 - \partial\psi/\partial\alpha^1_2) = (\alpha^1_1 - \alpha^1_2) \sum_{k=2}^n (\alpha^k_2 - \alpha^k_1) u_k \le 0$ and the same holds if we replace the pair of indices 1, 2 by any pair. By the well-known criterion of Ostrowski (1952) it follows that $\psi(A) = \psi(\alpha^1, \dots, \alpha^n)$ is Schur concave in $\alpha^1$ when $\alpha^2, \dots, \alpha^n$ are fixed and $\alpha^1, \dots, \alpha^n$ are all similarly ordered. This implies
THEOREM 4. Under the conditions of Theorem 3, $\psi(\alpha^1, \dots, \alpha^n) \le \psi(\alpha^1 T_1, \dots, \alpha^n T_n)$ for all $T_1, \dots, T_n \in \mathcal{S}$.
As a special case of Theorem 3 we obtain the inequalities (1.2a)-(1.2b). This is given by
PROPOSITION 1. Let $0 \le \phi_i$ be increasing functions, $i = 1, \dots, n$; then
(2.3) $E_{R(0)}\{\prod_{i=1}^n \phi_i(X_i)\} \le E_{R(1)}\{\prod_{i=1}^n \phi_i(X_i)\}$.
Proof. Set $\alpha^i_j = \phi_i(a_j)$, $i = 1, \dots, n$, $j = 1, \dots, N$. Then $E_{R(0)}\{\prod_{i=1}^n \phi_i(X_i)\} = \psi(A)/(N(N-1)\cdots(N-n+1))$ where $\psi(A)$ is defined by (2.1), while $E_{R(1)}\{\prod_{i=1}^n \phi_i(X_i)\} = N^{-n} \prod_{i=1}^n (\sum_{j=1}^N \alpha^i_j)$. It is easy to see that inequality (2.3) is homogeneous and we can assume $\sum_{j=1}^N \alpha^i_j = 1$, $i = 1, \dots, n$, without loss of generality. Then (2.3) becomes
(2.4) $\psi(A) \le N(N-1)\cdots(N-n+1)/N^n$.
For $J \in \mathcal{T}$ having all entries equal to $N^{-1}$ we now have $AJ = J$, and a simple calculation shows that $\psi(J) = N(N-1)\cdots(N-n+1)/N^n$. Schur concavity of $\psi$ implies $\psi(A) \le \psi(AJ)$
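The two expectations compared in Proposition 1 can be computed exactly for a small population. The sketch below follows the set-up of the proof, with entries $\phi_i(a_j)$ collected in a matrix; the function name `both_expectations`, the sample population and the sample functions are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def both_expectations(population, phis):
    """Return (E under sampling without replacement, E under sampling with
    replacement) of prod_i phi_i(X_i), computed exactly.

    Hypothetical helper; requires len(phis) <= len(population).
    """
    N, n = len(population), len(phis)
    A = [[Fraction(phi(a)) for a in population] for phi in phis]
    # Sum over ordered tuples of n distinct indices.
    psi = sum(prod(A[i][j[i]] for i in range(n))
              for j in permutations(range(N), n))
    falling = prod(range(N - n + 1, N + 1))      # N (N-1) ... (N-n+1)
    without = psi / falling
    with_repl = prod(sum(row) for row in A) / Fraction(N) ** n
    return without, with_repl

# Increasing, nonnegative functions on an increasing population.
population = [0, 1, 2, 3]
phis = [lambda x: x, lambda x: x * x, lambda x: x + 1]
without, with_repl = both_expectations(population, phis)
print(without, with_repl, without <= with_repl)
```

In this example the expectation without replacement is 7 and the expectation with replacement is 105/8, in line with the inequality.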