Dispersion.— Dispersion is a statistical term used to denote the variation in magnitude of the items constituting a series. It is measured by the difference in size of the extremes or by the general deviation of the items from the type. Such measurement may be absolute or relative. For purposes of comparison, coefficients of dispersion are found by dividing the absolute measure of dispersion by the quantity representing the typical item. The most common measures of dispersion are known as the average deviation and the standard deviation; the former is found by dividing the sum of the deviations of the several items from the arithmetical mean, without regard to signs, by the number of items; the standard deviation is computed by dividing the sum of the squares of the several deviations from the arithmetical mean by the number of items and extracting the square root of the quotient. In series where the items represent different frequencies the standard deviation is found by multiplying the squares of the several deviations by the corresponding frequencies, dividing the sum of the products by the total of the frequencies and extracting the square root of the quotient.
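The computations described above may be illustrated by a short sketch in Python. It is an illustration only, under stated assumptions: the function names and the sample series are hypothetical, and the coefficient of dispersion is here taken as the standard deviation divided by the arithmetical mean, which is one of several possible choices of absolute measure and typical item.

```python
import math


def mean(items):
    """Arithmetical mean of a series."""
    return sum(items) / len(items)


def average_deviation(items):
    """Sum of the deviations from the mean, without regard to sign,
    divided by the number of items."""
    m = mean(items)
    return sum(abs(x - m) for x in items) / len(items)


def standard_deviation(items):
    """Square root of the mean of the squared deviations from the mean."""
    m = mean(items)
    return math.sqrt(sum((x - m) ** 2 for x in items) / len(items))


def standard_deviation_grouped(values, frequencies):
    """Standard deviation where each value occurs with a given frequency:
    squared deviations are weighted by frequency and divided by the
    total of the frequencies before the root is extracted."""
    total = sum(frequencies)
    m = sum(v * f for v, f in zip(values, frequencies)) / total
    weighted = sum(f * (v - m) ** 2 for v, f in zip(values, frequencies))
    return math.sqrt(weighted / total)


def coefficient_of_dispersion(items):
    """Assumed form: standard deviation divided by the arithmetical mean."""
    return standard_deviation(items) / mean(items)


# Hypothetical series chosen so that the arithmetic is easy to follow.
series = [2, 4, 4, 4, 5, 5, 7, 9]

print(average_deviation(series))          # mean is 5; deviations sum to 12
print(standard_deviation(series))         # squared deviations sum to 32
print(standard_deviation_grouped([2, 4, 5, 7, 9], [1, 3, 2, 1, 1]))
print(coefficient_of_dispersion(series))
```

The grouped form is applied to the same series written as values with frequencies, so that it may be checked against the ungrouped result.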
Skewness in statistics denotes asymmetrical dispersion; that is, at points of like deviation above and below the mode the frequencies are unequal. Skewness is measured in various ways, the simplest measure being the difference between the arithmetical mean and the mode. The simplest coefficient of skewness is found by dividing the difference thus found by the average deviation from the mode. Other coefficients are found by more complicated mathematical processes.
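The simplest measure and coefficient of skewness described above may be sketched in Python as follows. The function names and the sample series are hypothetical, and the sketch assumes a series with a single mode; where several values are equally frequent, the first one met is taken.

```python
from collections import Counter


def mean(items):
    """Arithmetical mean of a series."""
    return sum(items) / len(items)


def mode(items):
    """Most frequent item (assumes a single mode; ties go to the first seen)."""
    return Counter(items).most_common(1)[0][0]


def skewness(items):
    """Simplest measure: arithmetical mean minus the mode."""
    return mean(items) - mode(items)


def average_deviation_from_mode(items):
    """Mean of the deviations from the mode, without regard to sign."""
    mo = mode(items)
    return sum(abs(x - mo) for x in items) / len(items)


def coefficient_of_skewness(items):
    """Simplest coefficient: (mean - mode) / average deviation from the mode."""
    return skewness(items) / average_deviation_from_mode(items)


# Hypothetical series: mode 4, mean 5, so the series is skewed upward.
series = [2, 4, 4, 4, 5, 5, 7, 9]

print(skewness(series))
print(coefficient_of_skewness(series))
```

A positive result indicates that the mean lies above the mode; a perfectly symmetrical series would give zero.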
Correlation.— The determination of the causal relationship between two or more groups of phenomena, known as correlation, constitutes one of the principal developments of statistics in recent years. While statistics alone are not able to prove causal relationships absolutely, they are able with other factors to furnish satisfactory demonstrations of the fact and extent of correlation in many realms of phenomena. Correlation between two statistical series may be shown by diagrams or by computation of the coefficient of correlation. The former method is used in statistical work designed for the general reader, as the correspondence of the two frequency curves on the diagram is recognized at a glance. The second method is used in cases where a definite measure of correlation is desired. The formula most commonly used in computing coefficients of correlation is that devised by the biologist Karl Pearson. Without attempting a full mathematical demonstration, the method of computation may be briefly described as follows: Two series of items of equal length representing two groups of phenomena are taken. The one group may be designated as group A and the other as group B. The deviations from the arithmetical mean of the several items of each group are determined. The deviations in group A may be designated by $x_1, x_2, x_3$, etc., and those of group B by $y_1, y_2, y_3$, etc. The several deviations of the items of group A are multiplied by the deviations of the corresponding items of group B and the products are added. This summation of products is designated as $\Sigma(xy)$. The standard deviation of group A is designated as $\sigma_1$ and that of group B as $\sigma_2$. The number of items is denoted by $n$. Then $r$, the coefficient of correlation of groups A and B, is determined by the following formula: $r = \frac{\Sigma(xy)}{n\sigma_1\sigma_2}$; or, stated in words, the sum of the products of the deviations as above obtained is divided by the product of the number of items and the standard deviations of the two groups.
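The computation just described, in which the sum of the products of the paired deviations is divided by the product of the number of items and the two standard deviations, may be sketched in Python as follows. The function names and the sample groups are hypothetical and serve only to show the arithmetic.

```python
import math


def deviations(items):
    """Deviations of the several items from their arithmetical mean."""
    m = sum(items) / len(items)
    return [x - m for x in items]


def standard_deviation(devs):
    """Standard deviation computed from a list of deviations."""
    return math.sqrt(sum(d * d for d in devs) / len(devs))


def coefficient_of_correlation(group_a, group_b):
    """Sum of products of paired deviations, divided by the product of
    the number of items and the two standard deviations."""
    if len(group_a) != len(group_b):
        raise ValueError("the two series must be of equal length")
    n = len(group_a)
    x = deviations(group_a)
    y = deviations(group_b)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    return sum_xy / (n * standard_deviation(x) * standard_deviation(y))


# Hypothetical groups: both have mean 3, and the paired products sum to 8.
group_a = [1, 2, 3, 4, 5]
group_b = [2, 1, 4, 3, 5]

print(coefficient_of_correlation(group_a, group_b))
```

The coefficient lies between -1 and +1; two series that rise and fall together in exact proportion give +1, and two that move in exactly opposite directions give -1. The sketch does not guard against a series whose items are all equal, for which the standard deviation is zero and the coefficient is undefined.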
Modifications of the method are used in dealing with various kinds of phenomena, and elaborate formulae have been devised to determine the coefficient of correlation of three or more groups of phenomena. Through the work of Pearson, Bowley, Yule, Persons and others, rapid development has taken place in mathematical statistics during recent years. Much of the work in this field, however, is of a theoretical nature and has limited application in general statistics.
History.— The beginnings of statistics are found in very early times. The ancient Egyptians compiled data concerning population and wealth preparatory to the building of the Pyramids, about 3050 B.C. There are accounts of statistical works in China as early as 2300 B.C. Two censuses of the Israelites are recorded in the book of Numbers. A census was taken in Greece for the purpose of levying taxes in 594 B.C. Athens took a census of population in 309 B.C. The Romans excelled every other nation of antiquity in making definite measurements. Besides making quinquennial enumerations of the population, they prepared surveys of the entire country including parts of the provinces. They also kept records of births and deaths. It was many centuries after the fall of the Roman Empire before another comprehensive system of governmental statistics was established. Very few statistical records of value were made during the Middle Ages, but occasional investigations were ordered by sovereigns. Pepin the Short in 758 and Charlemagne in 762 ordered the detailed description of church lands. The registries of lands in France taken at these and succeeding periods, known as "Polyptiques," included an enumeration of the rural population. The best known census of this period is the "Domesday Book," prepared by order of William the Conqueror in 1086 A.D. The purpose of this was to acquaint the sovereign with the extent of his new dominion and the tenure of the various parcels of land. Sebastian Muenster, a Heidelberg professor, anticipated the field of comparative statistics by preparing a systematic treatise on the organization, wealth, armies, commerce, laws, etc., of ancient countries. Francesco Sansovino published a similar work in 1562, and Giovanni Botero another in 1589. Pierre d'Avity, Seigneur de Montmarin, compiled a much more comprehensive and accurate treatise in the same field in 1614.
The registration of deaths was begun in London in 1532 and was followed by that of baptisms by parish clergymen. In 1629 the distinction of sex was made in the record. In 1661 a noteworthy statistical advance was made by the publication by Capt. John Graunt of his 'Observations' on the London Bills of Mortality. He was the first to point out the regularity of social phenomena and the excess of male births over female. Caspar Neumann of Breslau in 1691 compiled data relative to 5,869 deaths from the parish records of that city and pointed out that the supposed fateful significance of the years seven and nine was unfounded. His work attracted the attention of Edmund Halley, the astronomer, who laid the foundations of scientific life insurance by calculating a mortality table from the data collected. Another noteworthy contribution of this period was the mathematical demonstration of the theory of probabilities by Jacques Bernoulli, a professor of Basel. The first to use the word statistik was Gottfried Achenwall (1719-72), a professor of philosophy at the University of Göttingen. The word comes from the Italian statista, which denoted statesman. Achenwall in his book discussed matters that he believed of interest to the state, such as population, resources, constitutions, etc. The name statistics was introduced into England by Dr. Zimmerman about 1787 and was popularized by Sir John Sinclair in his 'Statistical Account of Scotland' (1791-99). Writings more of the nature of modern statistics are found in Sir William Petty's 'Political Arithmetick' (1691) and Süssmilch's 'Betrachtungen über die goettliche Ordnung in den Veränderungen des menschlichen Geschlechts aus der Geburt, dem Tode und der Fortpflanzung desselben erwiesen' (1741). The close of the 18th century and the beginning of the 19th marked the revival of census taking. The United States took its first decennial census of population in 1790. England and France began periodic census taking in 1801. The former country has continued the practice at 10-year intervals, but the latter, after taking a second census in 1806, took no further enumerations of population until 1836. From that year to 1901 four general censuses were taken; since 1901 a census of population has been taken in France every five years.
Belgium took its first census in 1829. The analysis of this census by Quetelet contains noteworthy observations on the influence of age, sex, occupation, economic condition and season upon mortality. Upon the organization of the Belgian Statistical Central Commission in 1841, Quetelet became its president and continued to hold the office until his death in 1874. His work not only placed the statistics of his native country on a high plane but did much to raise the standards of statistical work throughout the world. The progress of statistics was accelerated by the founding of the Société de Statistique de Paris in 1803, the Royal Statistical Society of London in 1834, and the American Statistical Association in 1839. Each of these societies holds meetings periodically and publishes the papers presented by its members. The first meeting of the International Statistical Congress was held in Brussels in 1853. After holding eight additional meetings it was succeeded by the International Statistical Institute in 1885. These societies have done much to promote uniformity in statistical classifications, schedules and tables, and otherwise to advance the science. Since 1890 there has been wonderful expansion in the field of statistics. Practically every civilized country now takes periodic enumerations of its population, keeps continuous records of births, deaths, marriages and divorces, and prepares elaborate statistics relative to production, foreign and domestic commerce, finance, public utilities, labor, agriculture, education, immigration, etc. The United States made a great advance in 1902 by making the Census Bureau a permanent office. Previous to that time the bureau was organized anew every 10 years for the purpose of taking the general census and was discontinued when the work was completed. While national statistics has been expanding there has been an equally great extension of statistical work in States and cities.
Massachusetts took the lead in State statistics by establishing a permanent bureau of statistics and labor in 1869. A similar bureau was established by Pennsylvania in 1872, by Connecticut in 1873, and by Ohio in 1877. Subsequently several other States took like action. With the enlargement of State activities during the past 30 years the need of adequate statistics has been met, in part at least, by creating bureaus of statistics within State departments. These bureaus publish annual statistics relating to finances, taxation, banking, insurance, intrastate commerce, labor conditions, public utilities, education, dependents, delinquents, defectives, deaths, births, disease, accidents, etc. Scarcely less important are the valuable statistics prepared by the principal cities of the country. These statistics deal with the functions of the city and the problems that the city is required to solve, such as education, housing, transportation, docks, pavements, water supply, electricity, gas, crime, fires, accidents and diseases. Noteworthy contributions to statistics are being made by various organizations such as insurance companies, banking houses, national sociological and philanthropic societies, etc. The insurance companies especially have done remarkable work in studying health conditions in various parts of the world.
Bailey, W. B.,