
Other Frequency-Distributions


25. Genesis of Normal Distributions. The normal figure of frequency has been obtained as the limit of the frequency polygon for errors of random sampling from two categories. But it has other important relations.

(i.) If a quantity Y is distributed according to the normal law about a mean value a with mean square of deviation c², and n values of Y are taken at random, the mean of these values is distributed according to the normal law about the mean value a with mean square of deviation c²/n.

(ii.) If a quantity Y is distributed according to (almost) any law whatever about a mean value a with mean square of deviation c², and n values of Y are taken at random, the mean of these values is distributed about the mean value a with mean square of deviation c²/n, and the distribution is approximately normal; the approximation being less or more close according as n is smaller or greater and according as the original distribution of Y is less or more similar to a normal distribution.

(iii.) In the frequency-distribution of a quantity Y, suppose that the deviation of any individual value of Y from the mean of Y is the total effect of a very large number of very small deviations, due to causes which operate independently of one another. Then (in general) the distribution of Y is approximately normal.
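Statement (ii.) may be illustrated by a small simulation. The sketch below is an illustration only, under assumptions that are not in the text: the parent law is taken to be exponential with unit rate (so that a = 1 and c² = 1), and the names `skewness`, `means_of_samples` and `TRIALS` are introduced here for the purpose. It tabulates the mean square of deviation of the sample mean against c²/n, together with a simple measure of skewness, which should approach zero as the distribution of the mean approaches the normal form.

```python
import random
import statistics

def skewness(values):
    """Third moment about the mean divided by the 3/2 power of the second."""
    m = statistics.fmean(values)
    second = statistics.fmean((v - m) ** 2 for v in values)
    third = statistics.fmean((v - m) ** 3 for v in values)
    return third / second ** 1.5

def means_of_samples(n, trials, rng):
    """Return `trials` means, each taken over n random values of Y.

    Assumption: Y follows an exponential law with rate 1 (mean 1, c^2 = 1),
    chosen only because it is markedly skew, i.e. far from normal.
    """
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(trials)]

rng = random.Random(0)   # fixed seed so that the run is repeatable
TRIALS = 5000
results = {}             # n -> (mean, mean square of deviation, skewness)

for n in (1, 4, 25, 100):
    means = means_of_samples(n, TRIALS, rng)
    results[n] = (statistics.fmean(means),
                  statistics.pvariance(means),
                  skewness(means))
    mean, msd, skew = results[n]
    print(f"n={n:3d}  mean={mean:.3f}  mean-square deviation={msd:.4f}  "
          f"c^2/n={1.0 / n:.4f}  skewness={skew:.3f}")
```

The mean square of deviation should track c²/n closely for every n, while the skewness diminishes as n increases, in agreement with the remark that the approximation is closer for larger n.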

26. Observation of Distributions. The pioneer in the systematic study of actual frequency-distributions was L. A. J. Quetelet, the Belgian astronomer. In works published in and after 1835 he showed that variations in many kinds of phenomena (temperature, price of grain, astronomical observations, heights and chest-measurements of men) were distributed about a mean value in a manner similar to the binomial distribution (sec. 16). His standard distribution, with which distributions of the above kind were to be compared, was that due to drawing 999 balls from an urn containing white and black balls in equal proportions; it was therefore a symmetrical distribution, very close to the Gaussian distribution, but discontinuous. As an example of the frequency-distributions to which Quetelet called attention, we may take the following table (Table IV.) of chest-measurements of 5,732 soldiers in Scottish regiments.
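The closeness of Quetelet's standard distribution to the Gaussian may be checked numerically. The following is a minimal sketch, assuming that the 999 drawings are independent (as when each ball is replaced, or the urn is very large); that assumption, and the names `binomial_prob` and `normal_density`, are introduced here and are not taken from Quetelet.

```python
import math

N_BALLS = 999    # number of balls drawn in Quetelet's standard distribution
P_WHITE = 0.5    # white and black balls in equal proportions

# Mean and standard deviation of the number of white balls drawn,
# on the assumption of independent drawings.
mean = N_BALLS * P_WHITE
sd = math.sqrt(N_BALLS * P_WHITE * (1.0 - P_WHITE))

def binomial_prob(k):
    """Exact probability of k white balls when p = 1/2."""
    return math.comb(N_BALLS, k) / 2 ** N_BALLS

def normal_density(x):
    """Ordinate of the normal curve having the same mean and deviation."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

# Largest difference between the discontinuous (binomial) frequencies
# and the ordinates of the continuous (normal) curve.
max_diff = max(abs(binomial_prob(k) - normal_density(k))
               for k in range(N_BALLS + 1))

print(f"mean = {mean}, standard deviation = {sd:.4f}")
print(f"largest |binomial - normal| = {max_diff:.3e}")
```

The binomial frequencies exist only at whole numbers of balls, which is the discontinuity referred to above; the normal ordinates are compared at those same points.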

27. Laws of Distribution. For comparisons similar to those made by Quetelet, the normal law was usually taken as a basis. But it had two obvious defects (cf. sec. 21 [iv.]). One was that it assumed the possibility of an infinite range of variation. The more serious defect was that it implied symmetry, whereas in many cases there was a definite asymmetry or skewness. The generalized law, mentioned in sec. 23, allows for skewness, but does not provide for restricted range of variation. Provision was therefore required for other types of frequency-distribution. This was necessary for two reasons. When we are studying a particular distribution, it is desirable to obtain a formula which fits it as closely as possible: for, when we have obtained a closely fitting formula, the actual distribution can conveniently be represented by the constants of this formula.

But there is a further question. We are studying, let us say, the distribution of men's heights in a fairly homogeneous population; the variations of height being produced by a number of small variations which are distributed by intermarriage and are being supplemented by new variations of the same kind. We find a formula, containing a few parameters or "arbitrary constants," which can be fitted very closely to the observations by obtaining suitable values for these parameters. But we are at any time studying only a limited number of persons, though this number may be a large one; and the result of this limitation of number is some irregularity as between the numbers in the various categories of classification, e.g., the numbers of men of different heights, measured in inches. As the number of persons measured increases, these irregularities become relatively smaller. The question which thus arises may be stated as follows. We adopt the hypothesis that, as the number of observations is increased, not merely actually, but also potentially by supposing the causes of variation to be extended to a very large number of individuals, there is a tendency to a definite statistical law of variation. The question then is: Can the irregularities which appear in our data for n individuals be regarded as due to errors of random sampling of the n individuals from a hypothetical universe of this kind?

Thus the study of a frequency-distribution falls mainly under the following heads: (1) the choice of a suitable formula, (2) the determination of the constants of this formula from the data, (3) enquiry whether the data may reasonably be regarded as the result of random sampling from a universe in which the variations are distributed according to the formula chosen. There are also certain minor matters of consideration: in particular, the question (4) whether the data can really be regarded as homogeneous, or whether individual excessive variations (giants or dwarfs) ought to be excluded, either as not really belonging to the universe we are postulating, or as being sports, or, possibly, as being the result of incorrect observation.
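Heads (1) to (3) may be sketched on artificial data. Everything in the example is an assumption made for illustration: the heights are synthetic (drawn from a normal law and recorded to the nearest inch), the formula chosen is the normal one, the constants are found from the first two moments with the usual grouping correction of one-twelfth of the squared class-interval, and the enquiry under head (3) is made with Pearson's chi-square measure of discrepancy, a technique which is not described in this section.

```python
import math
import random

def normal_cdf(x, mu, sigma):
    """Proportion of a normal universe lying below x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# Synthetic data only: 2000 hypothetical heights, recorded to the nearest inch.
rng = random.Random(1)
heights = [round(rng.gauss(68.0, 2.5)) for _ in range(2000)]
n = len(heights)

counts = {}                       # class (inches) -> observed number
for h in heights:
    counts[h] = counts.get(h, 0) + 1

# Head (2): constants from the moments of the grouped data.
mu = sum(h * c for h, c in counts.items()) / n
raw_var = sum(c * (h - mu) ** 2 for h, c in counts.items()) / n
sigma = math.sqrt(raw_var - 1.0 / 12.0)   # grouping correction, class width 1

# Head (3): compare observed and expected class numbers.
# Extreme classes are pooled so that no expected number is very small.
lo = math.floor(mu - 2.5 * sigma)
hi = math.ceil(mu + 2.5 * sigma)

observed = {}
for h, c in counts.items():
    k = min(max(h, lo), hi)       # pool the tails into the end classes
    observed[k] = observed.get(k, 0) + c

expected = {}
for k in range(lo, hi + 1):
    left = -math.inf if k == lo else k - 0.5
    right = math.inf if k == hi else k + 0.5
    expected[k] = n * (normal_cdf(right, mu, sigma) - normal_cdf(left, mu, sigma))

chi2 = sum((observed.get(k, 0) - e) ** 2 / e for k, e in expected.items())
dof = len(expected) - 1 - 2       # two constants were fitted from the data

print(f"mean = {mu:.3f}, standard deviation = {sigma:.3f}")
print(f"chi-square = {chi2:.2f} on {dof} degrees of freedom")
```

A chi-square value of about the same size as the number of degrees of freedom is what random sampling alone would be expected to produce; a value many times greater would suggest that the formula chosen does not describe the universe.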

28. Some Types of Formula.

The most familiar types of formula which have been fitted to observations are those given by Karl Pearson's extension of the Gaussian formula. This latter formula is essentially based on the differential equation [equation not reproduced]

29. Determination of Constants. (i.) There are various ways of fitting a formula to a particular distribution by finding suitable constants. The most usual method is that of equating moments. The number of individuals being n, and the equation of the figure to be fitted being nu = nf(X, a, b, c, ...), so that f(X, a, b, c, ...) dX is the proportion of individuals for which X lies within the limits X ± ½dX, and a, b, c, ... are the constants to be determined, we equate the 1st, 2nd, 3rd, ... moments of nu to those of the actual distribution, and thus obtain equations for determining a, b, c, .... In doing this, it must be remembered that the values of X are (usually) not given exactly for the observed individuals, since these latter are grouped in classes, e.g., in the case of heights, they may be given to the nearest inch. The individuals in each class are therefore treated as having all the same X, and certain corrections are applied (see