Mid-range

In statistics, the mid-range or mid-extreme of a set of statistical data values is the arithmetic mean of the maximum and minimum values in a data set, defined as:[1]

M=\frac{\max x + \min x}{2}.

The mid-range is the midpoint of the range; as such, it is a measure of central tendency.
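As a concrete illustration, here is a minimal Python sketch of this definition; the function name and sample values are assumptions for illustration only.

```python
# Minimal sketch of the definition above; the data are made up.
def midrange(values):
    """Arithmetic mean of the maximum and minimum of a data set."""
    return (max(values) + min(values)) / 2

data = [2.0, 3.5, 4.1, 7.8, 9.3]
print(midrange(data))  # (9.3 + 2.0) / 2 = 5.65
```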

The mid-range is rarely used in practical statistical analysis, as it lacks efficiency as an estimator for most distributions of interest, because it ignores all intermediate points, and lacks robustness, as outliers change it significantly. Indeed, it is one of the least efficient and least robust statistics. However, it finds some use in special cases: it is the maximally efficient estimator for the center of a uniform distribution, trimmed mid-ranges address robustness, and as an L-estimator, it is simple to understand and compute.


Contents

  • 1 Robustness
  • 2 Efficiency
    • 2.1 Small samples
  • 3 Sampling properties
  • 4 Deviation
  • 5 See also
  • 6 References

Robustness

The midrange is highly sensitive to outliers and ignores all but two data points. It is therefore a very non-robust statistic, having a breakdown point of 0, meaning that a single observation can change it arbitrarily. Further, it is strongly influenced by outliers: increasing the sample maximum or decreasing the sample minimum by x changes the mid-range by x/2, while it changes the sample mean, which also has a breakdown point of 0, by only x/n. It is thus of little use in practical statistics, unless outliers are already handled.

A trimmed midrange is known as a midsummary – the n% trimmed midrange is the average of the n% and (100−n)% percentiles, and is more robust, having a breakdown point of n%. In the middle of these is the midhinge, which is the 25% midsummary. The median can be interpreted as the fully trimmed (50%) mid-range; this accords with the convention that the median of an even number of points is the mean of the two middle points.

These trimmed midranges are also of interest as descriptive statistics or as L-estimators of central location or skewness: differences of midsummaries, such as midhinge minus the median, give measures of skewness at different points in the tail.[2]
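As a rough illustration, the following Python sketch computes these midsummaries, assuming NumPy's percentile function as the quantile estimator (different quantile conventions give slightly different values on small samples); the data are made up for illustration.

```python
import numpy as np

def midsummary(values, trim_pct):
    """n% trimmed midrange: average of the n% and (100 - n)% percentiles."""
    lo = np.percentile(values, trim_pct)
    hi = np.percentile(values, 100 - trim_pct)
    return (lo + hi) / 2

data = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 100.0])  # one large outlier
print(midsummary(data, 0))    # plain midrange: pulled far toward the outlier
print(midsummary(data, 25))   # midhinge (25% midsummary): much less affected
print(midsummary(data, 50))   # fully trimmed midrange: the median
```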

Efficiency

Despite its drawbacks, in some cases it is useful: the midrange is a highly efficient estimator of μ, given a small sample of a sufficiently platykurtic distribution, but it is inefficient for mesokurtic distributions, such as the normal.

For example, for a continuous uniform distribution with unknown maximum and minimum, the mid-range is the UMVU estimator for the mean. The sample maximum and sample minimum, together with the sample size, are a sufficient statistic for the population maximum and minimum – the distribution of the other sample points, conditional on a given maximum and minimum, is just the uniform distribution between the maximum and minimum and thus adds no information. See the German tank problem for further discussion. Thus the mid-range, which is an unbiased and sufficient estimator of the population mean, is in fact the UMVU estimator: using the sample mean just adds noise based on the uninformative distribution of points within this range.

Conversely, for the normal distribution, the sample mean is the UMVU estimator of the mean. Thus for platykurtic distributions, which can often be thought of as lying between a uniform distribution and a normal distribution, the informativeness of the middle sample points versus the extreme values varies from "equal" for the normal to "uninformative" for the uniform, and for different distributions, one or the other (or some combination thereof) may be most efficient. A robust analogue is the trimean, which averages the midhinge (25% trimmed mid-range) and the median.
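A small Monte Carlo sketch of this comparison (sample size, trial count, and seed are chosen purely for illustration): for uniform data the mid-range should show a markedly smaller variance than the sample mean.

```python
import numpy as np

rng = np.random.default_rng(0)
n, trials = 10, 100_000

# Repeated samples from Uniform(0, 1), whose true mean is 0.5.
samples = rng.uniform(0.0, 1.0, size=(trials, n))

midranges = (samples.max(axis=1) + samples.min(axis=1)) / 2
means = samples.mean(axis=1)

# For the uniform distribution the mid-range is UMVU, so its variance
# around the true mean should be clearly smaller than the sample mean's.
print("var(mid-range):  ", midranges.var())
print("var(sample mean):", means.var())
```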

Small samples

For small sample sizes (n from 4 to 20) drawn from a sufficiently platykurtic distribution (negative excess kurtosis, defined as γ2 = (μ4/(μ2)²) − 3), the mid-range is an efficient estimator of the mean μ. The following table summarizes empirical data comparing three estimators of the mean for distributions of varied kurtosis; the modified mean is the truncated mean, where the maximum and minimum are eliminated.[3][4]

Excess kurtosis (γ2)    Most efficient estimator of μ
−1.2 to −0.8            Midrange
−0.8 to 2.0             Mean
2.0 to 6.0              Modified mean

For n = 1 or 2, the midrange and the mean are equal (and coincide with the median), and are most efficient for all distributions. For n = 3, the modified mean is simply the median, and the mean is instead the most efficient measure of central tendency for values of γ2 from −0.8 to 2.0 as well as from 2.0 to 6.0.
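The following Monte Carlo sketch illustrates the pattern in the table for one platykurtic and one mesokurtic case; the sample size, trial count, and distributions are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, trials = 10, 100_000

def estimator_variances(samples):
    srt = np.sort(samples, axis=1)
    midrange = (srt[:, 0] + srt[:, -1]) / 2
    mean = samples.mean(axis=1)
    modified_mean = srt[:, 1:-1].mean(axis=1)  # drop the minimum and maximum
    return {"midrange": midrange.var(),
            "mean": mean.var(),
            "modified mean": modified_mean.var()}

# Uniform(-1, 1) has excess kurtosis -1.2, so the table predicts the
# midrange should have the smallest variance of the three estimators.
print(estimator_variances(rng.uniform(-1.0, 1.0, size=(trials, n))))

# The standard normal has excess kurtosis 0, so the mean should win.
print(estimator_variances(rng.standard_normal(size=(trials, n))))
```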

Sampling properties

For a sample of size n from the standard normal distribution, the mid-range M is unbiased, and has a variance given by:[5]

\operatorname{var}(M)=\frac{\pi^2}{24 \ln(n)}.

For a sample of size n from the standard Laplace distribution, the mid-range M is unbiased, and has a variance given by:[6]

\operatorname{var}(M)=\frac{\pi^2}{12}

and, in particular, the variance does not decrease to zero as the sample size grows.

For a sample of size n from a zero-centred uniform distribution, the mid-range M is unbiased, and nM has an asymptotic distribution which is a Laplace distribution.[7]
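A quick Monte Carlo check of the normal-sample formula above (trial count and sample sizes are illustrative; the formula is asymptotic, so only approximate agreement is expected at moderate n):

```python
import numpy as np

rng = np.random.default_rng(2)
trials = 20_000

for n in (10, 100, 1000):
    x = rng.standard_normal(size=(trials, n))
    m = (x.max(axis=1) + x.min(axis=1)) / 2
    # Compare the empirical variance of the mid-range with pi^2 / (24 ln n).
    print(n, m.var(), np.pi**2 / (24 * np.log(n)))
```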

Deviation

While the mean of a set of values minimizes the sum of squares of deviations and the median minimizes the average absolute deviation, the midrange minimizes the maximum deviation (defined as \max\left|x_i-m\right|): it is a solution to a variational problem.
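A brute-force sketch of this minimax property (the data and grid resolution are arbitrary choices): scanning candidate centres m shows that max|x_i − m| is smallest at the mid-range.

```python
import numpy as np

data = np.array([1.0, 2.0, 4.0, 10.0])
grid = np.linspace(data.min(), data.max(), 10_001)

# Maximum absolute deviation max|x_i - m| for every candidate centre m.
max_dev = np.abs(data[None, :] - grid[:, None]).max(axis=1)

print(grid[max_dev.argmin()])           # ~5.5, the minimax centre
print((data.max() + data.min()) / 2)    # mid-range = 5.5
```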

See also

References

  1. ^ Dodge 2003.
  2. ^ Velleman & Hoaglin 1981.
  3. ^ Vinson, William Daniel (1951). An Investigation of Measures of Central Tendency Used in Quality Control (Master's). University of North Carolina at Chapel Hill. Table (4.1), pp. 32–34. 
  4. ^ Cowden, Dudley Johnstone (1957). Statistical methods in quality control. Prentice-Hall. pp. 67–68. 
  5. ^ Kendall & Stuart 1969, Example 14.4.
  6. ^ Kendall & Stuart 1969, Example 14.5.
  7. ^ Kendall & Stuart 1969, Example 14.12.
  • Dodge, Y. (2003). The Oxford Dictionary of Statistical Terms. Oxford University Press.
  • Kendall, M.G.; Stuart, A. (1969). The Advanced Theory of Statistics, Volume 1. Griffin.  
  • Velleman, P. F.; Hoaglin, D. C. (1981). Applications, Basics and Computing of Exploratory Data Analysis.  