


















The analysis of conditional distribution estimation using order quantiles and estimators. It covers topics such as the production function or frontier, the order one quantile g(x), and the estimation of f(x, y) using the kernel density estimator. The document also explores the consistency of the estimator and the relationship between fx(x) and f(x, y).



















Accepted at Journal of Econometrics.
Carlos Martins-Filho, Department of Economics, Oregon State University, Ballard Hall 303, Corvallis, OR 97331-3612 USA. email: [email protected]. Voice: +1 541 737 1476
and
Feng Yao, Department of Economics, Chinese University of Hong Kong, Esther Lee Building, Shatin, Hong Kong. email: [email protected]. Voice: +852 2609 8007
August 2007
Abstract. Traditional estimators for nonparametric frontier models (DEA, FDH) are very sensitive to extreme values/outliers. Recently, Aragon, Daouia, and Thomas-Agnan (2005) proposed a nonparametric α-frontier model and estimator based on a suitably defined conditional quantile which is more robust to extreme values/outliers. Their estimator is based on a nonsmooth empirical conditional distribution. In this paper, we propose a new smooth nonparametric conditional quantile estimator for the α-frontier model. Our estimator is a kernel based conditional quantile estimator that builds on early work of Azzalini (1981). It is computationally simple, resistant to outliers and extreme values, and smooth. In addition, the estimator is shown to be consistent and √n asymptotically normal under mild regularity conditions. We also show that our estimator’s variance is smaller than that of the estimator proposed by Aragon et al. A simulation study confirms the asymptotic theory predictions and contrasts our estimator with that of Aragon et al.
Keywords and Phrases. conditional quantile estimation; nonparametric frontier; production function. JEL Classifications. C14, C
(^1) We thank participants of the Econometrics in Rio Conference, Jeff Racine, Aman Ullah, an associate editor and two anonymous referees for excellent comments. Special thanks go to Professor Takeshi Amemiya for his editorship. The authors retain responsibility for any remaining errors.
The specification and estimation of production frontiers, and the measurement of the associated efficiency level of production units, have been the subject of a vast and expanding literature since the seminal work of Farrell (1957). The main objective of this literature can be stated simply. Consider $(X, Y) \in \mathbb{R}^d_+ \times \mathbb{R}_+$, where $Y$ describes the output of a production unit and $X$ describes the $d$ inputs used in production. The output set is given by $\Psi = \{(x, y) \in \mathbb{R}^d_+ \times \mathbb{R}_+ : x \text{ can produce } y\}$ and the production function or frontier associated with $\Psi$ is $g(x) = \sup\{y \in \mathbb{R}_+ : (x, y) \in \Psi\}$ for all $x \in \mathbb{R}^d_+$. Let $(x_0, y_0) \in \Psi$ characterize the performance of a production unit and define $0 \le R_0 \equiv \frac{y_0}{g(x_0)} \le 1$ to be this unit's (inverse) Farrell output efficiency measure. The main objective in production and efficiency analysis is, given a random sample of production units $\chi_n \equiv \{(X_i, Y_i)\}_{i=1}^n$ that share the set $\Psi$, to obtain estimates of $g(\cdot)$ and by extension $R_i = \frac{Y_i}{g(X_i)}$ for $i = 1, \cdots, n$. Deterministic frontier models and estimators, largely represented by data envelopment analysis (DEA) and free disposal hull (FDH), have gained popularity among applied researchers because their construction relies on very mild assumptions on $\Psi$.^1 These models are based on the assumption that $\chi_n$ lies in $\Psi$, i.e., $P((X, Y) \in \Psi) = 1$, where $P$ is the probability measure associated with the random vector $(X, Y)$. The most appealing characteristic of such models is that there is no need to assume any restrictive parametric structure on $g(\cdot)$ or the probability measure $P$ to perform estimation. In addition to accommodating a flexible nonparametric structure, the appeal of DEA and FDH estimators has increased since Gijbels et al. (1999) and Park et al. (2000) obtained their asymptotic distributions under some fairly reasonable assumptions. DEA and FDH type estimators have two serious deficiencies.
First, since they are based on the idea of enveloping the observed data, these estimators are very sensitive to outliers or extreme observations and are inherently biased. Second, even in cases where the production technology induces a smooth production frontier, estimated frontiers based on FDH and DEA are discontinuous or piecewise linear functions, respectively. Efforts to remedy such deficiencies have appeared in different nonparametric frontier modeling contexts (Girard and Jacob, 2004; Hall et al., 1998; Knight, 2001; Martins-Filho and Yao, 2007). Prominent among
(^1) See Simar and Wilson (2006) for a review of deterministic frontiers and illustrations of their widespread use.
estimator and an associated conditional quantile estimator; however, their conditioning set is X = x rather than X ≤ x. In this paper, we propose a smooth nonparametric kernel estimator for the α-frontier (qα(x)). Our estimator is an extension of the seminal idea of Nadaraya (1964) and is based on a smooth estimator of the conditional distribution F(y/x). Besides having the properties of consistency and √n-asymptotic normality, the variance of our estimator is smaller than that of the estimator proposed by Aragon et al., confirming that the gains first identified by Azzalini in unconditional quantile estimation extend to conditional quantile estimation. Our simulations also confirm the superior performance of our proposed estimator. Besides this introduction, this paper has five additional sections. Section 2 describes the stochastic model in detail, contrasts its assumptions with those in the past literature and describes the estimation procedure. Section 3 provides the main theorems establishing the asymptotic behavior of our estimator and discusses bandwidth selection. Section 4 contains a Monte Carlo study that implements the estimator, sheds some light on its finite sample properties and compares its performance with that of the estimator proposed by Aragon et al. Section 5 provides an empirical illustration of our estimation procedure using data on electric utilities from the United States. Lastly, Section 6 provides a summary and some directions for future work.
Consider $\chi_n = \{(X_i, Y_i)\}_{i=1}^n$, a sequence of independent random vectors taking values in $\Psi^*$ and having the same distribution $F$ as the vector $(X, Y)$. Throughout the paper, $X$ will represent a $d$-vector of inputs used in the production process and $Y$ will represent a scalar measure of output. $F$ is taken to be absolutely continuous with associated density function given by $f$. The marginal distribution and density functions of $X$ are denoted by $F_X$ and $f_X$ respectively. Given that our interest is in the estimation of the α-frontier, which coincides with the conditional quantile $q_\alpha(x)$ for $\alpha \in [0, 1]$, we define an estimator $\hat{F}(y/x)$ for $F(y/x)$ as
$$\hat{F}(y/x) = \begin{cases} 0 & \text{if } y = 0, \\ \dfrac{\hat{F}(x, y)}{\hat{F}(x)} & \text{if } y > 0, \end{cases} \qquad (3)$$

where $\hat{F}(x, y) = (n h_n)^{-1} \sum_{i=1}^n \int_0^y K\left(\frac{Y_i - \gamma}{h_n}\right) d\gamma \, I(X_i \le x)$ and $\hat{F}(x) = n^{-1} \sum_{i=1}^n I(X_i \le x)$, $I(A)$ is the indicator function for the set $A$, $K(\cdot)$ is a suitably defined kernel function and $h_n$ is a nonstochastic sequence of bandwidths such that $0 < h_n \to 0$ as $n \to \infty$. The estimator is different from that proposed by Aragon et al. in that their estimator for $F(x, y)$ is given by $F_n(x, y) = n^{-1} \sum_{i=1}^n I(X_i \le x, Y_i \le y)$. In essence, rather than estimating $F(y/x)$ by the empirical distribution of the data such that $X_i \le x$ for $i = 1, \cdots, n$, we estimate $F(y/x)$ by integrating a smooth Rosenblatt density estimator constructed using the observations $\{(X_i, Y_i)\}_{i \in \{i : X_i \le x\}}$. It is easy to demonstrate that $\hat{F}(y/x)$ is asymptotically a distribution function, i.e., for suitably defined kernels: (a) $\hat{F}(y/x)$ is nondecreasing in $y$; (b) $\hat{F}(y/x)$ is right continuous in $\mathbb{R}_+$; (c) $\lim_{y \to 0} \hat{F}(y/x) = 0$; and (d) there exists $N(x)$ such that for all $n > N(x)$ we have $\lim_{y \to \infty} \hat{F}(y/x) = 1$. Assuming that $q_\alpha(x)$ is the unique α order quantile for the conditional distribution $F(y/x)$, we define the estimator $q_{\alpha,n}(x)$ as the root of
$$\hat{F}(q_{\alpha,n}(x)/x) = \alpha \quad \text{for } \alpha \in (0, 1] \text{ and } x \in \mathbb{R}^d_+. \qquad (4)$$
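The construction in (3) and (4) can be sketched numerically as follows. This is a minimal illustration, not the authors' code: it assumes the Epanechnikov kernel on [-1, 1], solves (4) by bisection (which relies only on the estimated distribution being nondecreasing in y), and uses hypothetical function names (`kappa`, `smooth_cond_cdf`, `smooth_quantile`) and a user-supplied bandwidth `h`.

```python
import numpy as np

def kappa(u):
    """Integrated Epanechnikov kernel on [-1, 1]: kappa(u) = int_{-1}^{u} K(t) dt."""
    u = np.clip(u, -1.0, 1.0)
    return 0.5 + 0.75 * u - 0.25 * u**3

def _selected_outputs(x, X, Y):
    """Outputs Y_i of the units with X_i <= x (componentwise for d > 1)."""
    Y = np.asarray(Y, dtype=float)
    X = np.asarray(X, dtype=float).reshape(len(Y), -1)
    keep = np.all(X <= np.atleast_1d(x), axis=1)
    if not keep.any():
        raise ValueError("no observations with X_i <= x")
    return Y[keep]

def smooth_cond_cdf(y, x, X, Y, h):
    """F_hat(y/x) as in (3): kernel-smoothed in y, empirical in the event X <= x."""
    if y <= 0:
        return 0.0
    Yk = _selected_outputs(x, X, Y)
    # (1/h) * int_0^y K((Y_i - g)/h) dg = kappa((y - Y_i)/h) - kappa(-Y_i/h)
    return float(np.mean(kappa((y - Yk) / h) - kappa(-Yk / h)))

def smooth_quantile(alpha, x, X, Y, h, tol=1e-10):
    """Approximate root of F_hat(q/x) = alpha, i.e. equation (4), by bisection."""
    Yk = _selected_outputs(x, X, Y)
    lo, hi = 0.0, float(Yk.max()) + h  # F_hat is at its maximum at hi
    if smooth_cond_cdf(hi, x, X, Y, h) < alpha:
        return hi  # no root: some kernel mass fell below zero (min Y_i < h)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if smooth_cond_cdf(mid, x, X, Y, h) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Bisection is chosen here only for robustness; any one-dimensional root finder applied to the same function would serve.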
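Using the mean value theorem, absolute continuity of $F$ and smoothness of the kernel function we can write

$$q_{\alpha,n}(x) - q_\alpha(x) = \frac{F(q_\alpha(x)/x) - \hat{F}(q_\alpha(x)/x)}{\hat{f}(\bar{q}_{\alpha,n}(x)/x)},$$

where $\hat{f}(y/x) = \frac{\partial \hat{F}(y/x)}{\partial y} = \frac{(n h_n)^{-1} \sum_{i=1}^n K\left(\frac{Y_i - y}{h_n}\right) I(X_i \le x)}{\hat{F}(x)}$ for $y \ge 0$ ($\hat{f}(y/x) = 0$ for $y < 0$) and $\bar{q}_{\alpha,n}(x) = \lambda q_{\alpha,n}(x) + (1 - \lambda) q_\alpha(x)$ for $\lambda \in (0, 1)$.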
The stochastic properties of the estimator defined in (4) are obtained under the following regularity conditions:

Assumption A1. a. $\chi_n = \{(X_i, Y_i)\}_{i=1}^n$ is a sequence of independent random vectors taking values in $\Psi^*$ and having the same distribution $F$ as the vector $(X, Y)$, with support in $\Psi^*$; b. $\Psi^*$ is compact and $0 < f(x, y) < B_f$ for all $(x, y) \in \Psi^*$.

The assumption that $\chi_n$ is an independent and identically distributed sequence, and the existence of the density $f$ as a bounded function in $\Psi$, are standard in the deterministic frontier literature (Aragon et al., 2005; Cazals et al., 2002; Gijbels et al., 1999; Martins-Filho and Yao, 2007; Park et al., 2000).

Assumption A2. a. $K(\gamma) : S_K \to \mathbb{R}$ is a symmetric bounded function with compact support $S_K = [-B_K, B_K]$ such that: b. $\int_{-B_K}^{B_K} K(\gamma)\,d\gamma = 1$; c. $\int_{-B_K}^{B_K} \gamma K(\gamma)\,d\gamma = 0$, $\int_{-B_K}^{B_K} \gamma^2 K(\gamma)\,d\gamma = \sigma_K^2$; d. for all $\gamma, \gamma' \in S_K$ we have $|K(\gamma) - K(\gamma')| \le m_K |\gamma - \gamma'|$ for some $0 < m_K < \infty$; e. for all $\gamma, \gamma' \in \mathbb{R}$ we have
processes. The lemmas, Theorems 1 and 2 and proofs can be found in Appendix 1.^3 We now state,

Theorem 1 Let $0 < h_n \to 0$ be a nonstochastic sequence of bandwidths with $n h_n \to \infty$ as $n \to \infty$. Assume A1, A2, A3, A4 and that for given $x \in \mathbb{R}^d_+$ and some $N(x)$ we have that for all $n > N(x)$, $\min_{\{i : X_i \le x\}} Y_i \ge h_n B_K$. Then,

$$q_{\alpha,n}(x) - q_\alpha(x) = o_p(1). \qquad (5)$$

Asymptotic normality of $q_{\alpha,n}(x)$ under suitable normalization is obtained in the following Theorem 2.

Theorem 2 Let $0 < h_n \to 0$ be a nonstochastic sequence of bandwidths with $n h_n^2 \to \infty$ and $n h_n^4 = O(1)$ as $n \to \infty$. Assume A1, A2, A3, A4 and that for given $x \in \mathbb{R}^d_+$ and some $N(x)$ we have that for all $n > N(x)$, $\min_{\{i : X_i \le x\}} Y_i \ge h_n B_K$. Then, for all $\alpha \in (0, 1)$ we have

$$v_n(x)^{-1} \sqrt{n}\,\big(q_{\alpha,n}(x) - q_\alpha(x) - B_n(x)\big) \to_d N(0, 1) \qquad (6)$$
where

$$B_n(x) = -\frac{1}{2} h_n^2 \sigma_K^2 \, \frac{\int_{g^{-1}([q_\alpha(x), g(x)])} f^{(1)}(\gamma, q_\alpha(x))\,d\gamma}{F_X(x)\, f(q_\alpha(x)/x)} + o(h_n^2)$$

and

$$v_n^2(x) = \frac{1}{\big(F_X(x) f(q_\alpha(x)/x)\big)^2} \left( F(x, q_\alpha(x)) - \frac{F^2(x, q_\alpha(x))}{F_X(x)} - 2 h_n \sigma_\kappa \int_{g^{-1}([q_\alpha(x), g(x)])} f(\gamma, q_\alpha(x))\,d\gamma \right) + o(h_n)$$

with $\kappa(x) = \int_{-B_K}^{x} K(\gamma)\,d\gamma$, $0 < \sigma_\kappa = \int_{-B_K}^{B_K} \gamma \kappa(\gamma) K(\gamma)\,d\gamma$, and $f^{(1)}(X, y)$ denotes the first derivative of $f$ with respect to $Y$.

The conditional quantile estimator proposed by Aragon et al. (2005) is also consistent and √n asymptotically normal under similar assumptions; however, there are some important differences between the estimators. First, we observe that although our estimator depends on kernel smoothing, and therefore a bandwidth $h_n$ is necessary in constructing the estimator, there is no asymptotic cost, as convergence to normality occurs at the parametric rate √n. Hence, the number of inputs $d$ has no impact on the convergence rate of the estimator. Most importantly, even though there is smoothing in $\hat{F}(y/x)$, it produces no slowing of the convergence in distribution, a result obtained by Falk (1985) and Hansen (2004a) in the context of unconditional distribution functions. Second, although the extra smoothing we propose might impose modest computational costs compared to the estimator proposed by Aragon et al., Theorem 2 reveals that the extra smoothness produces a smaller variance due to the higher order terms. Note that the variance of the asymptotic distribution of their estimator is given by

$$\frac{\alpha(1 - \alpha)}{f^2(q_\alpha(x)/x)\, F_X(x)} \equiv \frac{1}{\big(F_X(x) f(q_\alpha(x)/x)\big)^2} \left( F(x, q_\alpha(x)) - \frac{F^2(x, q_\alpha(x))}{F_X(x)} \right)$$

(^3) Additional proofs and technical details can be found in Martins-Filho and Yao (2007a).
and given that the extra term that appears in $v_n^2$ is nonnegative, the variance of our estimator is smaller for all finite $n$. Third, the extra smoothing we propose does introduce a bias term $B_n(x) = O(h_n^2)$, but provided that $n h_n^4 = o(1)$ the bias vanishes asymptotically. We note that this condition is consistent with the conditions on $h_n$ necessary to obtain Theorem 2. Finally, we observe that given that $B_n(x) = O(h_n^2)$ and that the variance is of order $O(n^{-1} + h_n n^{-1})$, the optimal bandwidth rate for minimization of the asymptotic mean integrated squared error is $h_n \propto n^{-1/3}$. The next theorem provides the joint asymptotic distribution of $q_{\alpha,n}(x^1), q_{\alpha,n}(x^2), \cdots, q_{\alpha,n}(x^r)$, which can be used to construct joint asymptotic confidence sets for the α-frontier for various levels of input usage. The result is similar to that in Theorem 4.2 in Aragon et al. (2005).

Theorem 3 Let $x^1, x^2, \cdots, x^r$ be $r$ levels of input $X$ and let all assumptions in Theorem 2 hold. Then, for $\alpha \in (0, 1)$ we have

$$\sqrt{n}\,\big(q_{\alpha,n}(x^1) - q_\alpha(x^1) - B(x^1),\; q_{\alpha,n}(x^2) - q_\alpha(x^2) - B(x^2),\; \cdots,\; q_{\alpha,n}(x^r) - q_\alpha(x^r) - B(x^r)\big)' \to_d N(0, Q)$$

where

$$B(x^l) = -\frac{1}{f(q_\alpha(x^l)/x^l)\, F_X(x^l)}\, \sigma_K^2\, \frac{h_n^2}{2} \int_{g^{-1}([q_\alpha(x^l), g(x^l)])} f^{(1)}(X, q_\alpha(x^l))\,dX + o(h_n^2), \quad l \in \{1, 2, \cdots, r\},$$

and $Q$ is an $r \times r$ matrix with $(l, m)$th element $Q_{l,m}$ given by

(1) $Q_{l,l} = \dfrac{\alpha(1 - \alpha)}{f^2(q_\alpha(x^l)/x^l)\, F_X(x^l)}$ if $l = m$,

(2) $Q_{l,m} = \dfrac{1}{f(q_\alpha(x^l)/x^l)\, F_X(x^l)\, f(q_\alpha(x^m)/x^m)\, F_X(x^m)} \big[ F(x^{lm}, q_\alpha(x^l))(1 - \alpha) - \alpha F(x^{lm}, q_\alpha(x^m)) + \alpha^2 F_X(x^{lm}) \big]$ if $l \ne m$ and $q_\alpha(x^l) \le q_\alpha(x^m)$,

(3) $Q_{l,m} = \dfrac{1}{f(q_\alpha(x^l)/x^l)\, F_X(x^l)\, f(q_\alpha(x^m)/x^m)\, F_X(x^m)} \big[ F(x^{lm}, q_\alpha(x^m))(1 - \alpha) - \alpha F(x^{lm}, q_\alpha(x^l)) + \alpha^2 F_X(x^{lm}) \big]$ if $l \ne m$ and $q_\alpha(x^l) \ge q_\alpha(x^m)$,

where $x^{lm} = \big(\min(x^l_1, x^m_1), \min(x^l_2, x^m_2), \cdots, \min(x^l_d, x^m_d)\big)'$.

As is typical in applied work, for inference purposes, the unknown higher order components of the variance terms in Theorems 2 and 3 must be estimated via consistent nonparametric estimators. $f(q_\alpha(x)/x)$ can be estimated by $\hat{f}(q_{\alpha,n}(x)/x)$, the conditional Rosenblatt density estimator, using the rule-of-thumb bandwidth of Silverman (1986). Note that the consistency of $\hat{f}(q_{\alpha,n}(x)/x)$ has been established in the proof of Theorem 2.
Since our expression for AMISE accounts for all possible values of α, $h_n^*$ can be interpreted as a global optimal bandwidth with respect to α for a given input level $x$. Since our α-frontier estimator is constructed as a quantile estimator which smooths only the output for the underlying conditional distribution, it is not surprising that the optimal bandwidth is of order $O(n^{-1/3})$. This is the same order obtained in Azzalini (1981), Bowman et al. (1998) and Hansen (2004a), where a kernel estimator is used to estimate an unconditional distribution. However, our constant $C$ is different from theirs. Compared with other conditional quantile estimators, both the order and the constant $C$ in the expression for $h_n^*$ are different from those in Hansen (2004b) and Li and Racine (2005), since the conditioning set we consider, $\{X \le x\}$, is different. The practical use of $h_n^*$ requires the estimation of the unknowns appearing in its expression, as in the traditional plug-in bandwidth selection methods. In the next section, we provide an easily implementable estimation procedure for these unknowns and shed light on the finite sample performance of our estimator via a small Monte Carlo study.
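The $n^{-1/3}$ rate can be seen from a short heuristic calculation. The sketch below is only illustrative: $b$, $v_0$ and $v_1$ are generic positive placeholders (not symbols from the paper) standing for the constants in a squared bias of order $h_n^4$ and a variance of order $n^{-1} + h_n n^{-1}$, whose $h_n$-dependent part enters with a negative sign.

```latex
\begin{align*}
\mathrm{AMSE}(h_n) &\approx b^2 h_n^4 + \frac{v_0 - v_1 h_n}{n}, \\
\frac{\partial\, \mathrm{AMSE}(h_n)}{\partial h_n} &= 4 b^2 h_n^3 - \frac{v_1}{n} = 0
\quad \Longrightarrow \quad
h_n = \left( \frac{v_1}{4 b^2} \right)^{1/3} n^{-1/3}, \\
\frac{\partial^2\, \mathrm{AMSE}(h_n)}{\partial h_n^2} &= 12 b^2 h_n^2 > 0 .
\end{align*}
```

The positive second derivative indicates that the stationary point is a minimum, and integrating the same expression over α changes the constant but not the rate.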
In this section, we perform a Monte Carlo study which implements our smooth α-frontier estimator (S) and provides evidence on its finite sample performance. For comparison purposes we also include in the study two alternative estimators, the empirical α-frontier estimator of Aragon et al. (2005) (E) and a conditional α-quantile estimator based on a linearly interpolated empirical conditional distribution (Kincaid and Cheney, 1996) (I). The interpolated estimator is interesting in that extra smoothness is obtained without the need for bandwidth estimation. The data are simulated according to the model $Y_i = g(X_i) R_i$, $i = 1, 2, \cdots, n$, where $Y_i$ represents output and the univariate inputs $X_i$ are pseudorandom variables generated from a uniform distribution with support given by $[b_l, b_u]$. $R_i = \exp(-Z_i)$ and the $Z_i$ are independently generated pseudorandom variables from an exponential distribution with parameter $\beta = \frac{1}{3}$; therefore the efficiency $R_i$ has support $(0, 1]$ with global average level of efficiency $E(R_i) = 0.75$. We consider two specifications for $g(\cdot)$, $g_1(x) = \sqrt{x}$ with $[b_l, b_u] = [4, 25]$ and $g_2(x) = x^3$ with $[b_l, b_u] = [1, 2]$, which are associated with convex and nonconvex production technologies respectively. This data generating process (DGP) has been considered in Aragon et al. (2005), Gijbels et al. (1999), Martins-Filho and Yao (2007), Park et al. (2000) and is regarded as reasonable with respect to many applications found in the econometric literature (Gijbels et al., 1999, p. 224). For each specification of $g(x)$ we consider three sample sizes $n = 100$, 200 and 400 and perform 1000 repetitions for each experimental design. We estimate the α-frontiers for $\alpha = 0.25$, 0.5, 0.75 and 0.99. Since the estimators for $q_\alpha(x)$ are constructed using data points with input levels which are less than or equal to $x$, we avoid estimation with extremely small samples by evaluating the performance of the estimators over the input interval starting from the 33rd percentile to the upper bound of the support. Using thirty equally spaced points in the support interval, we obtain the averaged bias, standard deviation and root mean squared error of each estimator. We also construct 95% asymptotic confidence intervals for the α-frontiers at different α levels using the asymptotic distributions available for our estimator and the estimator proposed by Aragon et al.
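The data generating process just described can be sketched as below. This is an illustrative draw, not the authors' simulation code: the function name `simulate_dgp` and the labels `"g1"`/`"g2"` are hypothetical, and the exponential parameter β = 1/3 is interpreted as the mean of Z, which is the reading consistent with E(R) = 1/(1 + β) = 0.75.

```python
import numpy as np

def simulate_dgp(n, spec, rng):
    """Draw (X, Y, R) with Y = g(X) * R, R = exp(-Z), Z exponential with mean 1/3."""
    if spec == "g1":    # g1(x) = sqrt(x) on [4, 25]
        lo, hi, g = 4.0, 25.0, np.sqrt
    elif spec == "g2":  # g2(x) = x^3 on [1, 2]
        lo, hi, g = 1.0, 2.0, lambda x: x**3
    else:
        raise ValueError("spec must be 'g1' or 'g2'")
    X = rng.uniform(lo, hi, size=n)
    Z = rng.exponential(scale=1.0 / 3.0, size=n)  # scale is the mean of Z
    R = np.exp(-Z)                                # efficiency in (0, 1]
    return X, g(X) * R, R

# Example: one sample of size 400 from the first specification.
X, Y, R = simulate_dgp(400, "g1", np.random.default_rng(2007))
```

Every simulated output lies on or below the frontier g(X), so the sample satisfies the deterministic frontier assumption by construction.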
The empirical α-frontier estimator is implemented as described in Aragon et al. We implement the interpolated α-frontier estimator as

$$\hat{q}^I_{\alpha,n}(x) = \begin{cases} Y_{(i_1)} & \text{if } 0 \le \alpha < \frac{1}{N_x}, \\ Y_{(i_k)} + \left(\alpha - \frac{k}{N_x}\right) N_x \left(Y_{(i_{k+1})} - Y_{(i_k)}\right) & \text{if } \frac{k}{N_x} \le \alpha < \frac{k+1}{N_x},\; 1 \le k \le N_x - 1, \\ 1 & \text{if } y \ge Y_{(i_{N_x})}, \end{cases}$$

where $N_x = \sum_{i=1}^n I(X_i \le x)$ and $Y_{(i_j)}$ is the $j$th order statistic for the observations $Y_i$ such that $X_i \le x$. We note that $\hat{q}^I_{\alpha,n}(x)$ produces estimates that are identical to those given by the empirical α-frontier estimator when α coincides with the nodes $\frac{k}{N_x}$. Our estimator is implemented using the Epanechnikov kernel and the following plug-in bandwidth
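$$\hat{h}_{PI} = \left( \frac{2 \sigma_\kappa \int_0^1 \frac{\hat{I}_1(x, \alpha)}{\hat{f}^2(\hat{q}_\alpha(x)/x)}\,d\alpha}{(\sigma_K^2)^2 \int_0^1 \frac{\hat{I}_2^2(x, \alpha)}{\hat{f}^2(\hat{q}_\alpha(x)/x)}\,d\alpha} \right)^{1/3} n^{-1/3}$$

where $\hat{I}_1(x, \alpha)$, $\hat{I}_2(x, \alpha)$, $\hat{f}(q_\alpha(x)/x)$ are estimators for $I_1(x, \alpha)$, $I_2(x, \alpha)$ and $f(q_\alpha(x)/x)$ appearing in $h_n^*$. Specifically, $\hat{f}(q_{\alpha,n}(x)/x) = \frac{(n g_n)^{-1} \sum_{i=1}^n K\left(\frac{Y_i - q_{\alpha,n}(x)}{g_n}\right) I(X_i \le x)}{\hat{F}(x)}$ where $\hat{F}(x)$ is the empirical distribution function. Since $\hat{f}(q_{\alpha,n}(x)/x)$ is a suitably defined Rosenblatt density estimator, we utilize the rule-of-thumb bandwidth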
from the DGP. As will be discussed later, simulation results reveal that the performance of the smooth estimator with both bandwidths is similar for large sample sizes, suggesting that $\hat{h}_{PI}$ is "close" to $h_n^*$ in probability. Theorem 2 supports asymptotic confidence intervals for the smooth α-frontier estimator. Given that the asymptotic bias is $O(h_n^2)$ and $h_n^* \propto n^{-1/3}$ we have that $O(\sqrt{n}\, h_n^2) = O(n^{-1/6}) = o(1)$. Hence, the normalized bias vanishes asymptotically and for the 97.5% quantile $Z_{0.975}$ of a standard normal distribution, we obtain

$$\lim_{n \to \infty} P\left( q_{\alpha,n}(x) - n^{-1/2} (\hat{S}_2^2)^{1/2} Z_{0.975} \le q_\alpha(x) \le q_{\alpha,n}(x) + n^{-1/2} (\hat{S}_2^2)^{1/2} Z_{0.975} \right) = 0.95$$

where $\hat{S}_2^2 = \dfrac{\alpha(1 - \alpha)}{\hat{F}(x) \big(\hat{f}(q_{\alpha,n}(x)/x)\big)^2}$. $\hat{F}(x)$ and $\hat{f}(q_{\alpha,n}(x)/x)$ are estimated as described in the bandwidth selection procedure. The asymptotic confidence interval for the empirical α-frontier estimator is constructed in a similar manner.
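The confidence interval above can be computed along the following lines. This is a sketch under stated assumptions rather than the authors' implementation: the input is univariate, the Epanechnikov kernel is used for the conditional density, and the frontier estimate `q_hat` and the density bandwidth `g` are supplied by the caller; the names `cond_density` and `asymptotic_ci` are hypothetical.

```python
import math
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel K(u) = 0.75 (1 - u^2) on [-1, 1], zero elsewhere."""
    return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)

def cond_density(q, x, X, Y, g):
    """f_hat(q/x) = (n g)^{-1} sum_i K((Y_i - q)/g) I(X_i <= x) / F_hat(x)."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    Yk = Y[X <= x]
    if Yk.size == 0:
        raise ValueError("no observations with X_i <= x")
    # Dividing by F_hat(x) = N_x / n turns the sum into an average over N_x units.
    return float(np.mean(epanechnikov((Yk - q) / g)) / g)

def asymptotic_ci(q_hat, alpha, x, X, Y, g, z=1.959963984540054):
    """95% interval q_hat -/+ z * sqrt(S2 / n), S2 = alpha(1-alpha) / (F_hat f_hat^2)."""
    X = np.asarray(X, dtype=float)
    n = X.size
    F_x = float(np.mean(X <= x))
    f_q = cond_density(q_hat, x, X, Y, g)
    if f_q <= 0.0:
        raise ValueError("estimated conditional density is zero at q_hat")
    s2 = alpha * (1.0 - alpha) / (F_x * f_q**2)
    half = z * math.sqrt(s2 / n)
    return q_hat - half, q_hat + half
```

The interval is symmetric around `q_hat` and widens where either the share of observations with X_i ≤ x or the estimated density at the quantile is small.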
Figure 1 depicts the true α−frontier with estimated smooth and empirical frontiers for α ranging over
slightly larger bias, confirming the asymptotic result in Theorem 2. A comparison between our smooth estimator and the interpolated estimator resembles that between our estimator and the empirical estimator. We also observe that the interpolated estimator has larger bias, slightly smaller standard deviation, but slightly larger root mean squared error than the empirical estimator. We find that as $n$ increases the root mean squared error of all estimators decreases, confirming the asymptotic results in the previous section.^5 This indicates that $\hat{q}^I_{\alpha,n}(x)$ may be a consistent estimator of the α-frontier and that $\hat{h}_{PI}$ is likely a consistent estimator for $h_n^*$. We observe that for both $g_1(x)$ and $g_2(x)$ the root mean squared error for all estimators is generally larger when evaluating the α-frontier with α closer to 1 than when evaluating frontiers with $\alpha = 0.25, 0.5, 0.75$. The fact that it is more difficult to estimate α-frontiers in this case is intuitively understood, as there are relatively fewer representative data available when α is closer to 1. The empirical coverage probability (the frequency with which the estimated confidence interval contains the true α-frontier in 1000 repetitions) is given in Table 2 for the 33rd ($x_{10}$), 66th ($x_{20}$), and 100th ($x_{30}$) percentiles of the input evaluation interval for the empirical and smooth α-frontier estimators based on $\hat{h}_{PI}$. For most experiments we observe that the smooth estimator is superior to the empirical estimator, i.e., the empirical coverage probability with the smooth estimator is closer to the target value of 95% than that with the empirical estimator, where exceptions occur mostly for $\alpha = 0.25$. As $n$ increases the empirical coverage probabilities from both estimators tend to get closer to 95%, with some exceptions. There is also weak evidence that for the empirical estimator the coverage gets closer to 95% as α decreases.
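For reference, the two bandwidth-free competitors in this comparison can be sketched as follows. Several points are assumptions made for illustration: the input is univariate, the empirical estimator is taken to be the generalized inverse of the empirical conditional distribution (the smallest order statistic whose empirical conditional distribution value is at least α), and the interpolated estimator is set to the largest order statistic when α = 1. The function names are hypothetical.

```python
import math
import numpy as np

def _sorted_outputs(x, X, Y):
    """Order statistics of the outputs Y_i with X_i <= x (univariate input)."""
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    Ys = np.sort(Y[X <= x])
    if Ys.size == 0:
        raise ValueError("no observations with X_i <= x")
    return Ys

def empirical_quantile(alpha, x, X, Y):
    """Smallest order statistic Y_(k) with k / N_x >= alpha (assumed definition)."""
    Ys = _sorted_outputs(x, X, Y)
    N = Ys.size
    k = math.ceil(alpha * N - 1e-12)  # small tolerance guards float error at nodes
    return float(Ys[min(max(k, 1), N) - 1])

def interpolated_quantile(alpha, x, X, Y):
    """Linear interpolation between adjacent order statistics at the nodes k / N_x."""
    Ys = _sorted_outputs(x, X, Y)
    N = Ys.size
    if alpha < 1.0 / N:
        return float(Ys[0])
    if alpha >= 1.0:
        return float(Ys[-1])          # assumed treatment of alpha = 1
    k = max(1, min(math.floor(alpha * N), N - 1))  # k / N <= alpha < (k + 1) / N
    return float(Ys[k - 1] + (alpha - k / N) * N * (Ys[k] - Ys[k - 1]))
```

At the nodes α = k/N_x both functions return the k-th order statistic, which matches the agreement between the two estimators noted in the text.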
Figure 2 provides the 95% empirical coverage probability for the estimators for the α = 0.99 frontier and a sample size n = 100 for 30 points across the input support. As indicated in the graph, for both $g_1(x)$ and $g_2(x)$, the smooth α-frontier estimator's empirical coverage probability slightly overestimates the 95% target. Coverage for the empirical estimator is largely below the 95% target, with large deviations close to the boundary of the input support. To provide further evidence on the finite sample distribution of the two estimators, we provide kernel density estimates for the smooth and empirical α-frontier estimators centered around the true value (for
(^5) This is also true for bias (with exceptions) and variance.
simulation section. We note that the confidence bands are wider in regions of the input space where there are a smaller number of observations. This follows from our definition for asymptotic confidence intervals and Theorem 4.1 of Aragon et al. Indeed the width of the confidence interval depends on the density f (qα(x)/x) and marginal probability FX (x). In regions of the input space where there are more data, both the density and marginal probability will be larger, and hence it is natural to observe narrower confidence intervals. Given the comments in Aragon et al. (2005) regarding the robustness of the empirical frontier to extreme observations, we conjecture that for α ∈ (0, 1) our smooth estimator should also be reasonably robust to extreme values and outliers.
In this paper we proposed a nonparametric α-frontier estimator based on a smooth kernel estimator of a conditional quantile of order α. Our estimator is an alternative to the conditional quantile estimator proposed by Aragon et al. (2005), which is based on empirical distribution functions. The estimator is easily implementable and we show that it is consistent and √n asymptotically normal. In addition, the extra smoothness pays off in that our estimator’s variance is smaller due to higher order terms than that of the estimator proposed by Aragon et al. (2005). Our simulation study confirms the asymptotic theory predictions and contrasts our estimator with that of Aragon et al. In most of the experiment designs in the simulations, our smooth estimator outperforms the empirical distribution based estimator of Aragon et al. (2005). Future work is needed in the context of α-frontiers, specifically estimators that can produce smooth boundaries over the input set are desirable in the applied economics literature.
Lemma 1 For all $x \in \mathbb{R}^d_+$ and $y \in \mathbb{R}_+$ and under assumptions A1, A2.a, A2.b, A2.c, and A3, we have:

(a)
$$E(\hat{F}(x, y)) = \begin{cases} F(x, y) + \frac{1}{2} h_n^2 \sigma_K^2 \int_{g^{-1}([y, g(x)])} f^{(1)}(X, y)\,dX + o(h_n^2) & \text{if } 0 < y < g(x), \\ F(x, y) + o(h_n^2) & \text{if } y > g(x), \\ F(x, y) + o(h_n) & \text{if } y = g(x). \end{cases}$$

(b)
$$V(\hat{F}(x, y)) = \begin{cases} n^{-1} F(x, y)(1 - F(x, y)) - 2 n^{-1} h_n \sigma_\kappa \int_{g^{-1}([y, g(x)])} f(X, y)\,dX + o(h_n/n) & \text{if } 0 < y < g(x), \\ n^{-1} F(x, y)(1 - F(x, y)) + o(h_n/n) & \text{if } y \ge g(x), \end{cases}$$

where $\kappa(x) = \int_{-B_K}^{x} K(\gamma)\,d\gamma$, $\sigma_\kappa = \int_{-B_K}^{B_K} \gamma \kappa(\gamma) K(\gamma)\,d\gamma$, $f^{(1)}(X, y)$ denotes the first derivative of $f$ with respect to $Y$, and $0 < h_n \to 0$ is a nonstochastic sequence of bandwidths.

Proof (a): Let $C_x = \times_{i=1}^d [0, x_i]$ where $x_i$ is the $i$th component of $x$. Since $h_n \to 0$ as $n \to \infty$, there exists $N(x) \in \mathbb{R}_+$ such that for all $n > N(x)$, $E(\hat{F}(x, y)) = \int_{C_x} \int_{[0, g(X)]} \kappa\left(\frac{y - Y}{h_n}\right) \frac{\partial F_f(X, Y)}{\partial Y}\,dY\,dX$ where $F_f(x, y) = \int_{[0, y]} f(x, \gamma)\,d\gamma$. Using integration by parts,

$$\int_{[0, g(X)]} \kappa\left(\frac{y - Y}{h_n}\right) dF_f(X, Y) = \kappa\left(\frac{y - g(X)}{h_n}\right) F_f(X, g(X)) + \int_{(y - g(X))/h_n}^{y/h_n} F_f(X, y - h_n \gamma) K(\gamma)\,d\gamma.$$

By A3.d and Taylor's theorem, $F_f(X, y - h_n \gamma) = F_f(X, y) - h_n \gamma f(X, y) + \frac{1}{2} h_n^2 \gamma^2 f^{(1)}(X, y) + o(h_n^2)$. Hence, $E(\hat{F}(x, y)) = E_{1n} + E_{2n} - E_{3n} + E_{4n} + o(h_n^2)$, where

$$E_{1n} = \int_{C_x} \kappa\left(\frac{y - g(X)}{h_n}\right) F_f(X, g(X))\,dX, \qquad E_{2n} = \int_{C_x} F_f(X, y) \int_{(y - g(X))/h_n}^{y/h_n} K(\gamma)\,d\gamma\,dX,$$

$$E_{3n} = h_n \int_{C_x} f(X, y) \int_{(y - g(X))/h_n}^{y/h_n} \gamma K(\gamma)\,d\gamma\,dX, \qquad E_{4n} = \frac{h_n^2}{2} \int_{C_x} f^{(1)}(X, y) \int_{(y - g(X))/h_n}^{y/h_n} \gamma^2 K(\gamma)\,d\gamma\,dX.$$

For $(x, y) \in \Psi^*$, if $y \le 0$ then $\hat{F}(x, y) = 0$. We now consider the limiting behavior of each term when: (1) $0 < y < g(x)$; (2) $y > g(x)$; (3) $y = g(x)$.

(1): For any $A \subseteq C_x$ and $B \subseteq [0, g(x)]$, let $g(A) = \{g(x) : x \in A\}$ and $g^{-1}(B) = \{x : x \in C_x, g(x) \in B\}$. Then,

$$E_{1n} = \int_{g^{-1}([0, y])} \kappa\left(\frac{y - g(X)}{h_n}\right) F_f(X, g(X))\,dX + \int_{g^{-1}([y, g(x)])} \kappa\left(\frac{y - g(X)}{h_n}\right) F_f(X, g(X))\,dX = E_{11,n} + E_{12,n}.$$

By A1, $\left|\kappa\left(\frac{y - g(X)}{h_n}\right)\right| |F_f(X, g(X))| < \infty$ and by Lebesgue's dominated convergence (LDC) theorem $E_{11,n} \to \int_{g^{-1}([0, y])} \int_{[0, g(X)]} f(X, Y)\,dY\,dX$ since $X \in g^{-1}([0, y])$ and $\kappa\left(\frac{y - g(X)}{h_n}\right) \to 1$. Similarly, $E_{12,n} \to 0$ since $X \in g^{-1}([y, g(x)])$ and $\kappa\left(\frac{y - g(X)}{h_n}\right) \to 0$. $E_{2n} \to \int_{g^{-1}([y, g(x)])} \int_{[0, y]} f(X, Y)\,dY\,dX$ since for $X \in g^{-1}([0, y])$ we have $\int_{(y - g(X))/h_n}^{y/h_n} K(\gamma)\,d\gamma \to 0$, and for $X \in g^{-1}([y, g(x)])$ we have $\int_{(y - g(X))/h_n}^{y/h_n} K(\gamma)\,d\gamma \to 1$. $h_n^{-1} E_{3n} \to 0$ since for $X \in g^{-1}([0, y])$ we have $\int_{(y - g(X))/h_n}^{y/h_n} \gamma K(\gamma)\,d\gamma \to 0$ and by A2.c, for $X \in g^{-1}([y, g(x)])$ we have $\int_{(y - g(X))/h_n}^{y/h_n} \gamma K(\gamma)\,d\gamma \to 0$. Now, $h_n^{-2} E_{4n} \to \frac{1}{2} \sigma_K^2 \int_{g^{-1}([y, g(x)])} f^{(1)}(X, y)\,dX$ since for $X \in g^{-1}([0, y])$ we have $\int_{(y - g(X))/h_n}^{y/h_n} \gamma^2 K(\gamma)\,d\gamma \to 0$ and by A2.c, for $X \in g^{-1}([y, g(x)])$ we have $\int_{(y - g(X))/h_n}^{y/h_n} \gamma^2 K(\gamma)\,d\gamma \to \sigma_K^2$. Hence, for $0 < y < g(x)$ we have $E(\hat{F}(x, y)) = F(x, y) + \frac{h_n^2}{2} \sigma_K^2 \int_{g^{-1}([y, g(x)])} f^{(1)}(X, y)\,dX + o(h_n^2)$. For cases (2) and (3) results are obtained in a similar manner.

(b) Note that $V(\hat{F}(x, y)) = \frac{1}{n}(V_{1n} - V_{2n})$ where
$$+ \sup_{y \in [0, g(x)]} \left| E_{2n}(x, y) - \int_{g^{-1}([y, g(x)])} \int_{[0, y]} f(X, Y)\,dY\,dX \right| + \sup_{y \in [0, g(x)]} |E_{3n}(x, y)|$$

where $E_{1n}(y) = \int_{g^{-1}([0, y])} \int_{[0, g(X)]} \kappa\left(\frac{y - Y}{h_n}\right) f(X, Y)\,dY\,dX$, $E_{2n}(x, y) = \int_{g^{-1}([y, g(x)])} \int_{[0, y]} \kappa\left(\frac{y - Y}{h_n}\right) f(X, Y)\,dY\,dX$ and $E_{3n}(x, y) = \int_{g^{-1}([y, g(x)])} \int_{[y, g(X)]} \kappa\left(\frac{y - Y}{h_n}\right) f(X, Y)\,dY\,dX$. To complete the proof we show that each supremum on the r.h.s. is $o(1)$. For the first term we have: a) $X \in g^{-1}([0, y])$, which implies that $g(X) \le y$ and as $n \to \infty$, $\frac{y - Y}{h_n} > B_K$ and $\kappa\left(\frac{y - Y}{h_n}\right) \to 1$, hence by the LDC theorem $E_{1n}(y) \to \int_{g^{-1}([0, y])} \int_{[0, g(X)]} f(X, Y)\,dY\,dX$ for every $y \in [0, g(x)]$; b) for all $y \in [0, g(x)]$, $E_{1n}(y) \le E_{1,n+1}(y)$, hence given A1.b, $\kappa$ satisfies a Lipschitz condition, which together with A4 shows that $E_{1n}(y)$ is continuous. Since $\int_{g^{-1}([0, y])} \int_{[0, g(X)]} f(X, Y)\,dY\,dX$ is continuous in $y$, $\sup_{y \in [0, g(x)]} \left| E_{1n}(y) - \int_{g^{-1}([0, y])} \int_{[0, g(X)]} f(X, Y)\,dY\,dX \right| = o(1)$. Following a similar argument, $\sup_{y \in [0, g(x)]} \left| E_{2n}(x, y) - \int_{g^{-1}([y, g(x)])} \int_{[0, y]} f(X, Y)\,dY\,dX \right| = o(1)$.

Theorem 1. Proof From Nadaraya (1964), for all $\epsilon > 0$, we have $F(q_\alpha(x) + \epsilon / x) > F(q_\alpha(x)/x) > F(q_\alpha(x) - \epsilon / x)$. If $\omega \in A \equiv \{\omega : |q_{\alpha,n}(x) - q_\alpha(x)| > \epsilon\}$ we have that either $F(q_{\alpha,n}(x)/x) \ge F(q_\alpha(x) + \epsilon / x)$ or $F(q_{\alpha,n}(x)/x) \le F(q_\alpha(x) - \epsilon / x)$. Hence, there exists $0 < \delta(\epsilon, x)$ where $\delta(\epsilon, x) = \min\{F(q_\alpha(x) + \epsilon / x) - F(q_\alpha(x)/x),\; F(q_\alpha(x)/x) - F(q_\alpha(x) - \epsilon / x)\}$ such that $\omega \in B = \{\omega : |F(q_{\alpha,n}(x)/x) - F(q_\alpha(x)/x)| > \delta(\epsilon, x)\}$, so $A \subseteq B$ and $P(|q_{\alpha,n}(x) - q_\alpha(x)| > \epsilon) \le P(|F(q_{\alpha,n}(x)/x) - F(q_\alpha(x)/x)| > \delta(\epsilon, x))$. Since $|F(q_{\alpha,n}(x)/x) - F(q_\alpha(x)/x)| = |F(q_{\alpha,n}(x)/x) - \hat{F}(q_{\alpha,n}(x)/x)| \le \sup_{y \in \mathbb{R}_+} |\hat{F}(y/x) - F(y/x)|$, we write

$$\sup_{y \in \mathbb{R}_+} |\hat{F}(y/x) - F(y/x)| \le \frac{1}{\hat{F}(x)} \sup_{y \in \mathbb{R}_+} |\hat{F}(x, y) - F(x, y)| + \left| \frac{1}{F_X(x)} - \frac{1}{\hat{F}(x)} \right| F_X(x)$$

since $F(x, y) \le F_X(x)$. Now, we have $\sup_{y \in \mathbb{R}_+} |\hat{F}(x, y) - F(x, y)| \le \sup_{y \in [0, g(x)]} |\hat{F}(x, y) - F(x, y)| + \sup_{y \in (g(x), \infty)} |\hat{F}(x, y) - F(x, y)|$. From Lemma 2, $\sup_{y \in [0, g(x)]} |\hat{F}(x, y) - F(x, y)| = o_p(1)$.
For all $y \in (g(x), \infty)$ we have that $F(x, y) = F(x, g(x)) = \int_{C_x} \int_{[0, g(X)]} f(X, Y)\,dY\,dX = F_X(x)$. In addition, given $\min_{\{i : X_i \le x\}} Y_i \ge h_n B_K$ and $0 < Y \le g(x)$, we have that for all $y \in (g(x), \infty)$, $y - Y > 0$. Hence, there exists $N(x)$ such that for all $n > N(x)$ we have that $\hat{F}(x, y) = n^{-1} \sum_{i=1}^n \int_{-B_K}^{B_K} K(\gamma)\,d\gamma\, I(X_i \le x) = n^{-1} \sum_{i=1}^n I(X_i \le x) = \hat{F}(x)$. Therefore, $\sup_{y \in (g(x), \infty)} |\hat{F}(x, y) - F(x, y)| = \sup_{y \in (g(x), \infty)} |\hat{F}(x) - F_X(x)| = o_p(1)$ given Chebyshev's inequality and $\hat{F}(x) - F_X(x) = o_p(1)$. To complete the proof, note that $\hat{F}(x) = O_p(1)$ and $F_X(x) > 0$, hence by Slutsky's theorem $\hat{F}(x)^{-1} - F_X(x)^{-1} = o_p(1)$.

Theorem 2. Proof $q_{\alpha,n}(x) - q_\alpha(x) = (A_n + C_n) \left( \frac{1}{f(q_\alpha(x)/x)} + \beta_n \right)$, where $A_n = F(q_\alpha(x)/x) - \frac{E(\hat{F}(x, q_\alpha(x)))}{E(\hat{F}(x))}$, $\beta_n = \hat{f}^{-1}(\bar{q}_{\alpha,n}(x)/x) - f^{-1}(q_\alpha(x)/x)$ and $C_n = \frac{E(\hat{F}(x, q_\alpha(x)))}{E(\hat{F}(x))} - \hat{F}(q_\alpha(x)/x)$. The theorem follows if: a) $\beta_n = o_p(1)$; b) $A_n = -\frac{1}{2} h_n^2 \sigma_K^2 \frac{\int_{g^{-1}([q_\alpha(x), g(x)])} f^{(1)}(\gamma, q_\alpha(x))\,d\gamma}{F_X(x)} + o(h_n^2)$; c) $\left( \frac{s_n(x)}{\hat{F}(x)} \right)^{-1} \sqrt{n}\, C_n \to_d N(0, 1)$ where

$$s_n^2(x) = F(x, q_\alpha(x)) - \frac{F(x, q_\alpha(x))^2}{F_X(x)} - 2 h_n \sigma_\kappa \int_{g^{-1}([q_\alpha(x), g(x)])} f(X, q_\alpha(x))\,dX + o(h_n).$$

a) It suffices to show that $\hat{f}(\bar{q}_{\alpha,n}(x)/x) - f(q_\alpha(x)/x) = o_p(1)$ for all $\alpha \in (0, 1)$. Since $q_{\alpha,n}(x) - q_\alpha(x) = o_p(1)$ it suffices to show that $\sup_{y \in G} |\hat{f}(y/x) - f(y/x)| = o_p(1)$, where $G \subset (0, g(x))$, $G$ compact. Note that
$$\sup_{y \in G} |\hat{f}(y/x) - f(y/x)| \le \frac{1}{\hat{F}(x)} \sup_{y \in G} \left| \frac{1}{n h_n} \sum_{i=1}^n K\left(\frac{Y_i - y}{h_n}\right) I(X_i \le x) - \int_{g^{-1}([y, g(x)])} f(X, y)\,dX \right| + \left| \frac{1}{F_X(x)} - \frac{1}{\hat{F}(x)} \right| \sup_{y \in G} \int_{g^{-1}([y, g(x)])} f(X, y)\,dX.$$

By A1, $\sup_{y \in G} \int_{g^{-1}([y, g(x)])} f(X, y)\,dX \le B_f \int_{g^{-1}([y, g(x)])} dX = O(1)$ for all finite $x$, and since $\hat{F}(x)^{-1} - F_X(x)^{-1} = o_p(1)$ the second term on the r.h.s. is $o_p(1)$. We now establish that the first term on the r.h.s. is $o_p(1)$. From Lemma 1 in Martins-Filho and Yao (2007), if $n h_n^2 \to \infty$, $\sup_{y \in G} |s_{0,x}(y) - E(s_{0,x}(y))| = O_p\left( \left( \frac{\ln(n)}{n h_n} \right)^{1/2} \right)$ where $s_{0,x}(y) = \frac{1}{n h_n} \sum_{i=1}^n K\left(\frac{Y_i - y}{h_n}\right) I(X_i \le x)$. Now, $E(s_{0,x}(y)) = \int_{C_x} \int_{[-y/h_n, (g(X) - y)/h_n]} K(\gamma) f(X, y + h_n \gamma)\,d\gamma\,dX$ and by A3.c

$$\left| E(s_{0,x}(y)) - \int_{C_x} \int_{[-y/h_n, (g(X) - y)/h_n]} K(\gamma) f(X, y)\,d\gamma\,dX \right| \le m_f h_n \int_{C_x} \int_{[-B_K, B_K]} |\gamma| K(\gamma)\,d\gamma\,dX = O(h_n).$$

Given that $y \in G \subset (0, g(x))$, there exists $N(x)$ such that for all $n > N(x)$ we have $\int_{C_x} \int_{[-y/h_n, (g(X) - y)/h_n]} K(\gamma) f(X, y)\,d\gamma\,dX = H_{1n}(x, y) + H_{2n}(x, y)$ where $H_{1n}(x, y) = \int_{g^{-1}([0, y])} \kappa\left(\frac{g(X) - y}{h_n}\right) f(X, y)\,dX$ and $H_{2n}(x, y) = \int_{g^{-1}([y, g(x)])} \kappa\left(\frac{g(X) - y}{h_n}\right) f(X, y)\,dX$. Following the proof for Lemma 2, we obtain $\sup_{y \in G} |H_{1n}(x, y)| = o(1)$ and $\sup_{y \in G} \left| H_{2n}(x, y) - \int_{g^{-1}([y, g(x)])} f(X, y)\,dX \right| = o(1)$. Consequently, we have $\sup_{y \in G} \left| \int_{C_x} \kappa\left(\frac{g(X) - y}{h_n}\right) f(X, y)\,dX - \int_{g^{-1}([y, g(x)])} f(X, y)\,dX \right| = o(1)$ and also $\sup_{y \in G} \left| \frac{1}{n h_n} \sum_{i=1}^n K\left(\frac{Y_i - y}{h_n}\right) I(X_i \le x) - \int_{g^{-1}([y, g(x)])} f(X, y)\,dX \right| = o_p(1)$.

b) $A_n = (E(\hat{F}(x)))^{-1} (A_{1n}(x) + A_{2n}(x))$ where $A_{1n}(x) = F(q_\alpha(x)/x) E(\hat{F}(x)) - F(x, q_\alpha(x))$ and $A_{2n}(x) = F(x, q_\alpha(x)) - E(\hat{F}(x, q_\alpha(x)))$. Since $E(\hat{F}(x)) = F_X(x)$, $A_{1n}(x) = 0$. Given that $0 < \alpha < 1$, we have $0 < q_\alpha(x) < g(x)$ and from Lemma 1, $A_{2n}(x) = -\frac{1}{2} h_n^2 \sigma_K^2 \int_{g^{-1}([q_\alpha(x), g(x)])} f^{(1)}(X, q_\alpha(x))\,dX + o(h_n^2)$. Thus, $A_n = -\frac{1}{F_X(x)} \frac{h_n^2}{2} \sigma_K^2 \int_{g^{-1}([q_\alpha(x), g(x)])} f^{(1)}(X, q_\alpha(x))\,dX + o(h_n^2)$.

c) $\sqrt{n}\, C_n = -\frac{1}{\hat{F}(x)} \sum_{i=1}^n Z_{in}$ where

$$Z_{in} = \frac{1}{\sqrt{n}} \left( \frac{1}{h_n} \int_{[0, q_\alpha(x)]} K\left(\frac{Y_i - \gamma}{h_n}\right) I(X_i \le x)\,d\gamma - I(X_i \le x) \frac{E(\hat{F}(x, q_\alpha(x)))}{F_X(x)} \right)$$

with $E(Z_{in}) = 0$, $s_n^2 = \sum_{i=1}^n E(Z_{in}^2)$. By A1, $s_n^2 = s_{1n} + s_{2n} + s_{3n}$ where $s_{1n} = E\left( \frac{1}{h_n} \int_{[0, q_\alpha(x)]} K\left(\frac{Y_i - \gamma}{h_n}\right) I(X_i \le x)\,d\gamma \right)^2$, $s_{2n} = E(I(X_i \le x)) \frac{(E(\hat{F}(x, q_\alpha(x))))^2}{F_X(x)^2}$ and $s_{3n} = -2 \frac{E(\hat{F}(x, q_\alpha(x)))}{F_X(x)} E\left( \frac{1}{h_n} \int_{[0, q_\alpha(x)]} K\left(\frac{Y_i - \gamma}{h_n}\right) I(X_i \le x)\,d\gamma \right)$. From Lemma 1, $s_{1n} = F(x, q_\alpha(x)) - 2 h_n \sigma_\kappa \int_{g^{-1}([q_\alpha(x), g(x)])}$