Abstract
Suppose $X_1, \cdots, X_m, Y_1, \cdots, Y_n$ are $m + n = N$ independent random variables, the $X$'s identically distributed and the $Y$'s identically distributed, each with a continuous cdf. Let $$z = (z_1, \cdots, z_m, z_{m + 1}, \cdots, z_N) = (x_1, \cdots, x_m, y_1, \cdots, y_n)$$ represent an observation on the $N$ random variables and let $$u(z) = (1/m) \sum^m_{i = 1} z_i - (1/n) \sum^N_{i = m + 1} z_i = \bar x - \bar y$$. Consider the $r = N!$ $N$-tuples obtained from $(z_1, \cdots, z_N)$ by making all permutations of the indices $(1, \cdots, N)$. Since the cdf's are assumed continuous, these $r$ $N$-tuples will be distinct with probability one. Denote them by $z^{(1)}, \cdots, z^{(r)}$, and suppose that they have been ordered so that $$u(z^{(1)}) \geqq \cdots \geqq u(z^{(r)})$$. Notice that since $$\bar x - \bar y = (1/m) \sum^N_{i = 1} z_i - (N/m)\bar y = (N/n)\bar x - (1/n) \sum^N_{i = 1} z_i,$$ the same ordering can be induced by choosing $u(z) = c\bar x$ or $u(z) = - c\bar y$ for any $c > 0$. Assuming that the cdf's of $X_1, Y_1$ are of the form $F(x), F(x - \Delta)$ respectively, Pitman [2] suggested essentially the following test of the hypothesis $H'$ that $\Delta = 0$. Select a set of $k$ $(k > 0)$ integers $i_1, \cdots, i_k$ $(1 \leqq i_1 < \cdots < i_k \leqq r)$. If the observed $z$ is one of the points $z^{(i_1)}, \cdots, z^{(i_k)}$, reject $H'$; otherwise accept. When $H'$ is true, the probability of a type one error does not depend on the specific form of the distribution of the $X$'s and the $Y$'s and is in fact equal to $k/r$. The choice of the rejection set $i_1, \cdots, i_k$ should depend on the alternative hypothesis. For instance, if the experimenter wants protection against the alternative that the "$X$'s tend to be larger than the $Y$'s," then the labels $1, \cdots, k$ might be reasonable. 
For the alternative that the "$X$'s tend to be smaller than the $Y$'s" the analogous procedure is to use the other tail, $r - k + 1, \cdots, r$. Against both alternatives, a two-tail procedure could be used. Lehmann and Stein have shown in [1] that in the class of all tests (of size $\alpha = k/r$) of the hypothesis $$H: \text{the distribution of } X_1, \cdots, X_m, Y_1, \cdots, Y_n \text{ is invariant under all permutations},$$ the single-tail test based on $1, \cdots, k$ is uniformly most powerful against the alternatives that $F_1$ is an $N(\theta, \sigma)$ cdf, $F_2$ is an $N(\theta + \Delta, \sigma)$ cdf, $\Delta < 0$; the test based on $r - k + 1, \cdots, r$ is uniformly most powerful for $\Delta > 0$. A practical shortcoming of this procedure is the great difficulty of enumerating the points $z^{(i)}$ and evaluating $u(z^{(i)})$ for each of them. For instance, even after eliminating those permutations which always give the same value of $u$, for sample sizes $m = n = 5$ there are $\binom{10}{5} = 252$ permutations to examine, and for sample sizes $m = n = 10$ there are $\binom{20}{10} = 184{,}756$ permutations to examine. In the following section, we propose the almost obvious procedure of examining a "random sample" of permutations and making the decision to accept or reject $H$ on the basis of those permutations only. Bounds are determined for the ratio of the power of the original procedure to that of the modified one. Some numerical values of these bounds are given in Table 1. The bounds listed there correspond to tests which in both original and modified form have size $\alpha$, and for which the modified test is based on a random sample of $s$ permutations drawn with replacement. These have been computed for a certain class of alternatives which is described below. For simplicity, we have restricted the main exposition to the two-sample problem. 
In Section 5, we point out extensions to the more general hypotheses of invariance studied in [1].
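As an illustration only, the two procedures described in the abstract can be sketched in Python. The function names (`mean_diff`, `exact_p_value`, `sampled_p_value`), the sample data, and the floating-point tolerance are hypothetical choices made for this sketch and do not come from the paper. The sampled version counts the observed arrangement together with the `s` random permutations, which is one common convention and is assumed here rather than taken from the text.

```python
import itertools
import math
import random

TOL = 1e-9  # tolerance for floating-point ties (an assumption of this sketch)


def mean_diff(xs, ys):
    """u(z) = x-bar minus y-bar."""
    return sum(xs) / len(xs) - sum(ys) / len(ys)


def exact_p_value(xs, ys):
    """Upper-tail p-value from every split of the pooled sample.

    Permutations that only reorder values within a group give the same u,
    so it is enough to visit the C(N, m) choices of which values play "x".
    """
    pooled = list(xs) + list(ys)
    m, big_n = len(xs), len(pooled)
    observed = mean_diff(xs, ys)
    total = sum(pooled)
    at_least = 0
    for idx in itertools.combinations(range(big_n), m):
        sx = sum(pooled[i] for i in idx)
        u = sx / m - (total - sx) / (big_n - m)
        if u >= observed - TOL:
            at_least += 1
    return at_least / math.comb(big_n, m)


def sampled_p_value(xs, ys, s, rng):
    """Upper-tail p-value from s random permutations drawn with replacement.

    The observed arrangement is counted once (assumed convention), so the
    result is never smaller than 1 / (s + 1).
    """
    pooled = list(xs) + list(ys)
    m = len(xs)
    observed = mean_diff(xs, ys)
    at_least = 1
    for _ in range(s):
        perm = rng.sample(pooled, len(pooled))  # one uniform random permutation
        if mean_diff(perm[:m], perm[m:]) >= observed - TOL:
            at_least += 1
    return at_least / (s + 1)


if __name__ == "__main__":
    x = [5.1, 6.3, 7.2, 8.4, 9.0]  # made-up data, every x exceeds every y
    y = [1.2, 2.5, 3.1, 4.4, 4.9]
    print(math.comb(10, 5), math.comb(20, 10))
    print(exact_p_value(x, y))
    print(sampled_p_value(x, y, 999, random.Random(0)))
```

With the made-up data above, the observed split is the unique arrangement with the largest $u$, so the exact upper-tail value is $1/252$. The sampled version examines only `s` permutations and so trades a little power for far less enumeration.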
Publication Info
- Year: 1957
- Type: article
- Volume: 28
- Issue: 1
- Pages: 181-187
- Citations: 731
- Access: Closed
Identifiers
- DOI: 10.1214/aoms/1177707045