
Theory of Equations



EQUATIONS, THEORY OF. An equation is the statement of an equality between one or more unknown numbers and known, or given, numbers, which is true, not for all values of the unknowns, but only for certain of them (Lat. aequatio, an equalizing). An equality which holds for all values of the unknowns is an identity. Thus 3x+2 = 5, true only for x = 1, is an equation; x² - y² = (x+y)(x-y), holding for all values of x, y, is an identity. To distinguish identities from equations the symbol ≡ is used, as in x² - y² ≡ (x+y)(x-y). This symbol will also be used, where no confusion can arise, to signify a definition; thus x ≡ a means x is a.

The earliest known equivalents of algebraic equations occur in the Rhind papyrus, evidently compiled from earlier works, by the Egyptian Ahmes, about 1650 or 1700 B.C. For example, he proposes this problem:—"A quantity and its seventh added together become 19. What is the quantity?" His word for the unknown is "aha," formerly written hau and translated "heap" or "mass." The problem, therefore, is to solve the equation x + x/7 = 19, as we would now express it. Lacking a convenient algebraic notation, he proceeded by a cumbersome method later known as that of "false position." Indeed, neither the Egyptians nor their Greek successors made any progress that is significant from a modern point of view, and neither people rose to the abstract conception of a theory of equations as a fruitful field of mathematical science. The Indians, with their peculiar addiction to arithmetic, achieved more.
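The essence of "false position" for a linear, homogeneous problem is to evaluate at a convenient trial value and then rescale. The following Python sketch is a modern reconstruction, not from the text; the trial value 7 is Ahmes' natural choice, since it clears the fraction:

```python
from fractions import Fraction

def false_position(f, target, trial):
    # "False position" for a linear, homogeneous problem f(x) = target:
    # evaluate f at a convenient trial value, then rescale the trial
    # in the ratio target / f(trial).
    return Fraction(target) * trial / f(Fraction(trial))

# Ahmes' problem: a quantity and its seventh added together become 19.
quantity = false_position(lambda x: x + x / 7, 19, 7)   # f(7) = 8, so scale 7 by 19/8
```

The exact answer is 133/8 = 16 5/8, which indeed satisfies x + x/7 = 19.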

The theory of equations is concerned chiefly with the properties of a single algebraic equation of the type c0x^n + c1x^(n-1) + ... + cn = 0, in which n is a positive whole number, the coefficients c0, c1, ..., cn are any given numbers, or numbers that are not specified but are assumed known, and c0 ≠ 0. The nature of the coefficients will be made more precise later, as on them the whole theory depends. The degree of this equation is n. Roughly speaking, the theory of equations discusses this problem:—The coefficients being specified, find all values of x which make the equation true. This again will be amplified and made more definite as we proceed. The finding of x is called solving the equation.

The Greeks are sometimes credited with solving equations of the second degree. Thus Euclid's Elements, ii. 11, is equivalent to solving x² + ax = a². There are two values of x; Euclid was content with one. In the 9th century A.D. the Arab Mohammed ibn Musa al-Khowarizmi (whose name is variously transliterated) gave both values 3, 7 of x in x² + 21 = 10x; he also discussed many more equations of the second degree. Like other Arabic writers, he used the equivalent of the term "root" for a value of the unknown. Great advances were made in the 15th and 16th centuries by the Italian mathematicians, who solved the general equations of the third and fourth degrees. These will be considered later. In spite of its brilliance, their work had but little direct influence on the evolution of a theory of algebraic equations. Curiosity as to the underlying reason for success or failure seems not to have perturbed the practical mind of the 16th-century algebraist. That rare type of speculation was reserved for the golden age of the late 18th and early 19th centuries; and although significant progress was made in the 18th century, notably by Joseph Louis Lagrange in a classic memoir of 1770-71, it was only with the researches of Évariste Galois (1811-1832) that the theory, at one stride, reached its maturity. Galois was killed in a duel at the age of 20.

Formerly the theory included much that is now relegated to other departments of algebra, e.g., the solution of simultaneous equations of the first degree in several unknowns, which is now an application of determinants (q.v.) and matrices. As commonly understood to-day, the theory of algebraic equations is concerned chiefly with two problems and their numerous ramifications, all of which sprang directly from the necessity for solving the equations presented by problems in pure and applied science. To describe these, a few definitions must first be recalled. For an understanding of certain parts of the sequel an elementary acquaintance with the plotting of simple curves is presupposed; for others, a knowledge of derivatives and Taylor's theorem (q.v.); and finally, for the modern theory, the reader is assumed to have read parts of the article GROUPS.

A number a+bi, in which a, b are real numbers and i ≡ √-1, is called complex (see COMPLEX NUMBERS); if b = 0, the number is real, otherwise it is imaginary. Let n be a positive integer other than zero, and let c0, c1, ..., cn be complex numbers not involving x. If c0 is not zero, the polynomial f(x) ≡ c0x^n + c1x^(n-1) + ... + cn is of degree n. A complex number k, which is such that f(k) = 0, is called a root of the algebraic equation f(x) = 0 of degree n, and the value k of x is said to satisfy the equation; f(x) is also said to vanish when x = k. According as none, or some, of the coefficients c0, c1, ..., cn are imaginary, the equation f(x) = 0 is called real or imaginary. It is necessary to consider both real and imaginary equations. If f(x) = 0 is imaginary, its solution is reduced to that of real equations thus:—Write f(x) in the form g(x)+ih(x), where the coefficients of g(x), h(x) are all real. Then g(x)+ih(x) = 0. Multiply the last throughout by the conjugate imaginary g(x)-ih(x). The result is a real equation, among whose roots occur all those of f(x) = 0. Real equations are thus the fundamental ones, but we shall not assume f(x) = 0 to be real unless so stated.
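The reduction of an imaginary equation to a real one can be illustrated concretely. In this Python sketch (the polynomial is a made-up example; coefficient lists, highest degree first, stand in for f, g, h) multiplying by the conjugate polynomial yields purely real coefficients:

```python
def poly_mul(p, q):
    # multiply two polynomials given as coefficient lists, highest degree first
    r = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r

def realify(coeffs):
    # f(x) = g(x) + i*h(x); multiply by the conjugate g(x) - i*h(x)
    conj = [c.conjugate() for c in coeffs]
    return poly_mul(coeffs, conj)

# illustrative imaginary equation: x^2 + (1+2i)x + 3i = 0
real_eq = realify([1, 1 + 2j, 3j])   # a real equation of degree 4
```

Every root of the original imaginary equation is among the roots of the resulting real equation, as the text states.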

The central problems of the theory are these: (a) To find a root of f(x) = 0, i.e., to solve the equation, when the degree n and the coefficients c0, c1, ..., cn are given. (b) To determine the precise conditions under which the roots of f(x) = 0 can be expressed in terms of the coefficients by means of a finite number of algebraic operations (additions, multiplications, subtractions, divisions, extractions of roots). This is called the algebraic solution, or the solution by radicals, of f(x) = 0. The exact sense in which the coefficients are "given" in this problem is the crux of the modern theory; for the present it suffices to state that they may be considered as independent variables. If in (a), when the coefficients have given numerical values, a root cannot be found exactly, a practicable process must be devised whereby a root may be exhibited to any prescribed degree of approximation. If in (b) the roots are not expressible in the form demanded, it is required to construct the simplest functions of the coefficients that do satisfy the equation. For example, it was almost proved in 1824 by Niels Henrik Abel, then only 22 years of age, that the solution by radicals of the general equation of degree >4 is impossible. His attempt contains two oversights, now easily rectified by the Galois theory. The current assertion that Abel proved the impossibility of solving the general equation of degree >4 by radicals is definitely incorrect. The objections raised by William Rowan Hamilton in 1839 to Abel's alleged proof are by themselves valid.

In 1858 Charles Hermite first solved the general equation of degree 5 by means of elliptic functions (q.v.). Modern work in this direction, originating with Henri Poincaré about 1880, solves the general equation of degree n in terms of Fuchsian functions. Current developments of (b) are inextricably interwoven with the theories of substitution groups, algebraic numbers (modern higher arithmetic) and special functions of a complex variable; (a) is practically exhausted.

The Fundamental Theorem.—A basic result for both (a) and (b) is the so-called fundamental theorem of algebra, which states that every algebraic equation has a root. More fully, it is proved either from this, or almost in one step from an application of an integral formula of Augustin Louis Cauchy in the theory of functions of a complex variable, that an equation of degree n has precisely n roots. The remarkable feature of this theorem is that, in order to solve completely any algebraic equation, it is unnecessary to go beyond the domain of complex numbers. It is by no means obvious that this should be so. The novice in algebra regards the theorem as a truism; the seasoned mathematician sees in it a species of fortunate miracle, while the sophisticated critic views it with suspicion. The first alleged satisfactory proof was given in 1799 by Karl Friedrich Gauss, who subsequently added three more.

Actually the theorem is not in the purview of algebra, as all proofs, depending ultimately upon continuity, are analytic and belong to the calculus. A proof adequate to the demands of modern rigour would implicitly traverse the entire theory of the continuum. Certain ultra-rigorists of the school founded by Leopold Kronecker, and invigorated to-day by L. E. J. Brouwer and Hermann Weyl, might even assert that, not only has the fundamental theorem not yet been proved, but also that it is without meaning. From the standpoint of modern mathematical foundations a fatal epistemological imperfection of the classical proofs is their failure to exhibit a process for constructing, in a well-defined number of well-defined operations, the roots whose existence the purported proofs undertake to establish. The difficulty here, of course, is irrelevant for the pragmatic problem of solving a numerical equation to a prescribed degree of accuracy. However disturbing such scepticism may be to the 20th-century critical logician, it need not deter the engineer who is capable of plotting a graph sufficiently accurate for most practical purposes. It is interesting, however, on account of its indication that mathematical reasoning may be as fallible as any other; and it should not be forgotten by the professional mathematician that to-day's heterodoxy is to-morrow's rigorous orthodoxy.

Symmetric Functions.—Before developing the elementary consequences of the fundamental theorem (provided it be proved, or accepted as a hypothesis), it is necessary to define symmetric functions and to state a few of their properties. Let x1, ..., xn be independent variables. A rational function of x1, ..., xn (obtained from these by a finite number of additions, multiplications, subtractions, divisions, no divisor being zero), is said to be symmetric in x1, ..., xn if it is unchanged when any two of x1, ..., xn are interchanged. Thus x1x2 + x2x3 + x3x1 is symmetric in x1, x2, x3, since it is unchanged when x1 and x2, or x2 and x3, or x3 and x1 are interchanged. Let (xa xb ... xk) denote a substitution which replaces xa by xb, xb by xc, ..., xk by xa, and let I indicate the substitution ("the identity") which leaves every letter unchanged. The set of all substitutions on x1, ..., xn is a group, called the symmetric group on these letters; any symmetric function of the n letters is said to belong to this group, since it is unchanged under all substitutions of the group. Thus, for n = 2, x1 + x2 belongs to I, (x1x2). These ideas and their generalization to groups other than the symmetric will be of use later.
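The definition can be checked mechanically. A minimal Python sketch follows (the function names are illustrative; note that testing at a single point is only a heuristic, since a function that happens to be unchanged at one sample point need not be symmetric everywhere):

```python
from itertools import permutations

def is_symmetric(f, n):
    # test whether f(x1, ..., xn) is unchanged under every permutation
    # of its arguments, checked at one point with distinct values
    point = (2, 3, 5, 7, 11)[:n]
    base = f(*point)
    return all(f(*(point[i] for i in p)) == base
               for p in permutations(range(n)))

e2 = lambda a, b, c: a*b + b*c + c*a   # unchanged under every interchange
g  = lambda a, b, c: a - b + c         # changed by interchanging a and b
```

Here `e2` belongs to the symmetric group on three letters, while `g` does not.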

The jth elementary symmetric function of x1, ..., xn, for j = 1, 2, ..., n, is, by definition, the sum of all possible products of j different variables chosen from the set x1, ..., xn. Thus, for n = 3, the 1st, 2nd, 3rd elementary symmetric functions are x1+x2+x3, x1x2+x2x3+x3x1, x1x2x3, and these are all the elementary symmetric functions of x1, x2, x3. There are obviously an unlimited number of symmetric functions other than the elementary; it suffices to apply to any rational function of x1, ..., xn the n! substitutions of the symmetric group on these letters and add the results; any numerical factor common to all the terms may be suppressed. For example, when n = 3, x1² + x2² + x3² is symmetric in x1, x2, x3, and is equal to (x1+x2+x3)² - 2(x1x2+x2x3+x3x1). The last illustrates the important theorem that any polynomial P which is symmetric in x1, ..., xn is equal to a polynomial Q in the elementary symmetric functions and the coefficients of P; the coefficients of Q are whole numbers. If all the coefficients of P are also whole numbers, Q is a polynomial in the elementary symmetric functions alone with whole number coefficients. These properties constitute the fundamental theorem of symmetric functions. The reduction to elementary symmetric functions is unique.
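The n = 3 statements above can be verified numerically; in this sketch the sample values assigned to x1, x2, x3 are arbitrary:

```python
from itertools import combinations
from math import prod

def elementary_symmetric(xs, j):
    # the jth elementary symmetric function: the sum of all products
    # of j different variables chosen from xs
    return sum(prod(c) for c in combinations(xs, j))

# illustrative values x1, x2, x3 = 2, 3, 5
xs = [2, 3, 5]
e1 = elementary_symmetric(xs, 1)       # x1 + x2 + x3
e2 = elementary_symmetric(xs, 2)       # x1x2 + x2x3 + x3x1
e3 = elementary_symmetric(xs, 3)       # x1x2x3
power_sum = sum(x * x for x in xs)     # x1^2 + x2^2 + x3^2
# the identity in the text: x1^2 + x2^2 + x3^2 = e1^2 - 2*e2
```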

Relations Between Roots and Coefficients.—The first useful consequence of the fundamental theorem of algebra is this: If x1, ..., xn are the n roots of f(x) = 0, then f(x) ≡ c0(x-x1)(x-x2)···(x-xn). Conversely, if f(k) = 0, then x-k is a factor of f(x). More generally, if t is any complex number, f(t) is equal to the remainder obtained on dividing f(x) by x-t, and hence f(t) can be calculated by division, a result of importance in the numerical solution of equations. This is called the remainder theorem.
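The remainder theorem lends itself to a short sketch. The synthetic-division routine below is an assumed implementation, not taken from the text; the example polynomial x³ - 2x - 5 recurs later in the article:

```python
def divide_by_linear(coeffs, t):
    # synthetic division of f(x) by x - t; coeffs highest degree first.
    # Returns (quotient coefficients, remainder); by the remainder
    # theorem, the remainder equals f(t).
    q = [coeffs[0]]
    for c in coeffs[1:]:
        q.append(c + t * q[-1])
    return q[:-1], q[-1]

# f(x) = x^3 - 2x - 5 divided by x - 3
quotient, remainder = divide_by_linear([1, 0, -2, -5], 3)
```

The remainder is 16, which is exactly f(3) = 27 - 6 - 5, computed without any direct substitution into powers of x.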

Since c1, ..., cn are any complex numbers independent of x, and c0 ≠ 0, the equation f(x) = 0 may be divided throughout by c0. It then becomes x^n + a1x^(n-1) + ... + an = 0, where a1, ..., an are complex numbers, and this form is precisely as general as the original. When convenient we shall use it. If the roots are α1, ..., αn, the linear (= first degree in x) factors of f(x) are x-α1, ..., x-αn, and hence, on comparing coefficients of like powers of x, we see that aj = (-1)^j times the jth elementary symmetric function of the roots, for j = 1, 2, ..., n. In particular, an = (-1)^n times the product of all the roots. This frequently is useful in testing for rational roots of an equation whose coefficients are rational numbers; by this means all the rational roots may be found. By the fundamental theorem on symmetric functions it follows that any symmetric polynomial P in the roots α1, ..., αn is equal to a polynomial Q in the coefficients a1, ..., an and the coefficients of P; the coefficients of Q are whole numbers. If both the coefficients of the equation and those of P are rational numbers, then Q is a rational number.

Simple Properties of Real Equations.—Among the more useful elementary properties of real equations are the following: Imaginary roots, if any, occur in conjugate pairs, a+bi, a-bi (a, b are real numbers and i ≡ √-1). Hence, if the degree n is odd, the equation has at least one real root. Again, f(x) being a real, continuous function of x (see FUNCTION), it follows that if f(r), f(s) have opposite signs, where r, s are real numbers, the curve whose equation is y = f(x) cuts the axis of x an odd number of times between r and s. If among the n roots of f(x) = 0 there are precisely h each equal to a, and h > 1, the root a is said to be of multiplicity h. Let f′(x) be the first derivative of f(x). Then the theorem of Michel Rolle states that between two consecutive real roots of f(x) = 0 there is an odd number of real roots of f′(x) = 0, provided a root of multiplicity h be counted as h roots.

As several subsequent theorems are considerably simpler for equations having no multiple roots, it is important to refer all cases back to this, as follows: If the highest common factor of f(x) and f′(x) involves x, let it be g(x). Then a root of g(x) = 0, of multiplicity m, is a root of f(x) = 0, of multiplicity m+1; conversely, any root of f(x) = 0, of multiplicity m+1, is a root of g(x) = 0, of multiplicity m. By successive applications of the process for finding the highest common factor, any multiple roots that may be present can be found. If the root a is of multiplicity h, (x-a)^h is a factor of f(x), and similarly for all multiple roots. Dividing f(x) by the product of all such factors (x-a)^h, ..., we obtain a polynomial which vanishes only for the simple roots of f(x) = 0. This argument for multiple roots is perfectly general and is not restricted to real equations.
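The highest-common-factor process can be sketched with exact rational arithmetic; the polynomial chosen, (x-1)²(x-2) = x³ - 4x² + 5x - 2, is an illustrative example with a double root at 1:

```python
from fractions import Fraction

def poly_mod(a, b):
    # remainder of a(x) divided by b(x); coefficient lists, highest degree first
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while len(a) >= len(b):
        f = a[0] / b[0]
        a = [ai - f * bi
             for ai, bi in zip(a, b + [Fraction(0)] * (len(a) - len(b)))][1:]
        while a and a[0] == 0:
            a.pop(0)
    return a

def poly_gcd(a, b):
    # Euclid's process; the result is normalized to leading coefficient 1
    while b:
        a, b = b, poly_mod(a, b)
    return [c / a[0] for c in a]

# f = (x-1)^2 (x-2) and f' share the factor x - 1, exposing the double root
g = poly_gcd([1, -4, 5, -2], [3, -8, 5])
```

The highest common factor comes out as x - 1; a square-free equation, by contrast, yields the constant 1.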

When f(x) = 0 is real, it is possible, by measuring where the graph of y = f(x) crosses the x-axis, to ascertain how many real roots lie within given limits; by sufficiently enlarging the scale a particular real root may be located with any desired degree of accuracy. The graphical method is a valuable adjunct to the arithmetical processes, as in all numerical solutions the initial difficulty is in approximately locating the root to be calculated. If the real equation f(x) = 0 has imaginary roots u+iv, v ≠ 0, the values of u, v are found by solving the simultaneous equations f(u) - f^(2)(u)v²/2! + f^(4)(u)v⁴/4! - ... = 0, f^(1)(u) - f^(3)(u)v²/3! + f^(5)(u)v⁴/5! - ... = 0, where f^(j)(u) denotes the jth derivative of f(u). These are obtained by equating to zero the real and imaginary parts of the Taylor expansion (see TAYLOR'S THEOREM) of f(u+iv). Their solution is reduced to that of an equation in u or v alone by elimination. The determination of the roots, real and imaginary, of real equations is thus thrown back to the problem of finding the real roots of real equations and, as we have seen, the solution of an imaginary equation is reducible to the same. A further reduction, which in the case of equations with numerical coefficients is of great utility, is possible. The negative real roots of the real equation f(x) = 0 are evidently the positive real roots of f(-x) = 0. To find the negative real roots of f(x) = 0, it suffices therefore to find the positive real roots of the equation obtained from f(x) = 0 by changing the signs of all the terms of odd degree in f(x). Thus problem (a) may be limited to the discovery of the positive real roots of real equations.

Elimination.—This process being of frequent occurrence in the theory and its applications, we shall describe the so-called dialytic method, invented by James Joseph Sylvester, for performing it. Usually this is as simple as any other, and often is simpler. Let f(x), g(x) be polynomials in x of degrees m, n respectively. We seek first a necessary and sufficient condition that f(x) = 0 and g(x) = 0 shall have a root in common. Let x1, ..., xm be the roots of f(x) = 0. Then f(x) = 0 and g(x) = 0 will have a common root when and only when g(x1)g(x2)···g(xm) = 0, since g(x) vanishes only when x is a root of g(x) = 0. To avoid fractions in the final expression, multiply by c0^n (the nth power of the coefficient c0 of x^m in f(x)), and obtain c0^n g(x1)g(x2)···g(xm), whose vanishing expresses the required condition. This function of x1, ..., xm is called the resultant of the two equations. The vanishing of their resultant is therefore a necessary and sufficient condition that two equations shall have at least one root in common. Since the resultant as written is a symmetric function of the roots x1, ..., xm, it is a rational (also integral, by the factor c0^n) function of the coefficients of both equations. Denote this function by [f, g]. Then [f, g] = 0 is called the result of eliminating x from the two equations.

The actual elimination is easily performed by Sylvester's method. First, it is shown in the theory of determinants that a necessary and sufficient condition that n linear and homogeneous equations in n unknowns shall have a common set of solutions, other than the trivial one in which each unknown is zero, is that the determinant of the coefficients vanish. Apply this to f(x) = 0, of degree m, and g(x) = 0, of degree n, as follows:—Multiply the first equation throughout by 1, x, ..., x^(n-1) in turn, and the second by 1, x, ..., x^(m-1). This gives n+m equations linear and homogeneous in 1, x, ..., x^(n+m-1). The determinant of this set equated to zero is the required elimination.
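Sylvester's dialytic method thus amounts to forming a determinant from shifted copies of the two coefficient rows. A sketch follows (the cofactor expansion used for the determinant is adequate only for the small matrices arising here; the example polynomials are illustrative):

```python
def sylvester_matrix(f, g):
    # f, g: coefficient lists, highest degree first, of degrees m and n;
    # n shifted copies of f and m shifted copies of g give an (m+n)-square matrix
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = [[0] * i + f + [0] * (size - m - 1 - i) for i in range(n)]
    rows += [[0] * i + g + [0] * (size - n - 1 - i) for i in range(m)]
    return rows

def det(mat):
    # cofactor expansion along the first row
    if len(mat) == 1:
        return mat[0][0]
    return sum((-1) ** j * mat[0][j]
               * det([row[:j] + row[j + 1:] for row in mat[1:]])
               for j in range(len(mat)))

# f = x^2 - 3x + 2 (roots 1, 2) and g = x^2 - 1 (roots 1, -1) share the root 1,
# so their resultant vanishes
resultant = det(sylvester_matrix([1, -3, 2], [1, 0, -1]))
```

For g = x² - 9, which shares no root with f, the same determinant is nonzero (here 40 = g(1)·g(2)), in accord with the criterion of the text.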

Discriminants.—Occasionally in physical problems it is sufficient to know whether a given equation has real roots, and, if so, how many. A graph will usually give the desired information most readily. Algebraically, the question is answered for degrees 2, 3, 4 by examining the simplest rational integral function of the coefficients whose vanishing is the condition that the given equation shall have a pair of equal roots. This function is called the discriminant of the given equation. From the foregoing statements concerning multiple roots and resultants, a sufficient condition that f(x) = 0 shall have a pair of equal roots is that the resultant of f(x) = 0 and f′(x) = 0 vanish. Thus, for c0x² + c1x + c2 = 0, the condition is c1² - 4c0c2 = 0, the left of which is the discriminant of the given equation.

An alternative definition is the following. Let x1, ..., xn be the n roots of f(x) = 0, and write δab ≡ xa - xb for b > a. There are thus precisely (n-1) + (n-2) + ... + 1 differences δab, as a runs through 1, 2, ..., n-1. The product of the squares of these differences, multiplied by c0^(2n-2) (where c0 is the coefficient of x^n in the stated equation), is a rational integral function of the coefficients of f(x) = 0, since it is symmetric in the roots and the factor c0^(2n-2) cancels all denominators. This function is taken as the discriminant of f(x) = 0, as it obviously satisfies all the requirements of the definition. It is easily proved that the two methods of finding the discriminant lead to the same result.

As specimens of the information furnished by discriminants, examples for real equations of the second and third degrees will suffice. The discriminant of c0x² + c1x + c2 = 0 is c1² - 4c0c2. If this equation is real, its roots are real and distinct, real and equal, or imaginary and distinct, according as the discriminant is greater than, equal to, or less than zero. Likewise, a real equation of the third degree has three distinct real roots, or one real root and two conjugate imaginary roots, or at least two equal roots, according as the discriminant of the equation is greater than, less than, or equal to zero. Discriminants are of capital importance in solutions by radicals.

Location of Roots.—We have seen that the problem of solving an equation with given numerical coefficients is reducible to that of finding the positive real roots of real equations. Accordingly, it is assumed in the following discussion that f(x) = 0 is a real equation. The first step in finding the real roots, positive, zero, or negative, is to isolate them. A real root r of f(x) = 0 is said to be isolated when two real numbers, a, b, between which r lies, and between which lies no other root of f(x) = 0, are known. The graphical method, as already indicated, is useful here as a reconnaissance, but usually more powerful weapons must be applied. One is René Descartes' rule of signs, which gives, in most cases, some information regarding the total number of real roots. Let f(x) be c0x^n + c1x^(n-1) + ... + cn, the coefficients being real numbers, positive, zero or negative. If c0 < 0, change the signs throughout f(x) = 0. Thus, without loss of generality, we assume c0 > 0. Write down now all the signs of the non-zero coefficients in the order in which they occur in f(x). A change from + to -, or from - to +, is called a variation. Count the variations. Thus, in + + - - - + - there are three. Descartes' rule states that f(x) = 0 has as many positive real roots, or fewer by an even number, as there are variations. The roots of f(-x) = 0 being the negatives of those of f(x) = 0, the rule, applied to f(-x) = 0, gives similar information regarding the number of negative real roots. This rule, however, may fail to tell us anything of value. Its proof is quite simple.
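Counting variations is easily mechanized; a minimal sketch (the example equation x³ - 2x - 5 = 0 is the one treated later under Newton's method):

```python
def sign_variations(coeffs):
    # number of sign changes in the sequence of non-zero coefficients
    signs = [1 if c > 0 else -1 for c in coeffs if c != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# x^3 - 2x - 5 has signs + - -: one variation, so by Descartes' rule
# there is exactly one positive real root (one, or fewer by an even
# number, leaves only one possibility)
v = sign_variations([1, 0, -2, -5])
```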

A conclusive method for isolating the roots was discovered in 1829 by J. C. F. Sturm. Let f′(x) be the first derivative of f(x). Write f(x) ≡ f, f′(x) ≡ f′, and similarly, in what follows, for all polynomials in x. Proceed as in finding the highest common factor of f, f′. Let q1 be the quotient and r1 the remainder at the first step. Before using r1 as the next divisor, change its sign, and write f2 ≡ -r1, so that f = q1f′ - f2. Divide f′ by f2; denote the remainder, with its sign changed, by f3. Continue thus with all remainders. For simplicity suppose first that f(x) = 0 has no pair of equal roots. The last changed remainder, fn, will then be a constant ≠ 0. The sequence of changed remainders, with f, f′ prefixed, viz., f(x), f′(x), f2(x), ..., fn-1(x), fn, is called the set of Sturm functions for f(x).

Now let a, b be real numbers, neither a root of f(x) = 0, and let a < b. To find the number of real roots of f(x) = 0 lying between a and b, put x = a in the Sturm functions, and delete any terms that then vanish. Count the variations of sign (as in Descartes' rule) in the resulting sequence of real numbers. Let there be Va variations. Proceed similarly with b, and obtain Vb. Then the number of real roots between a and b is Va - Vb. In particular, taking a = -∞, b = +∞, Sturm's theorem gives the total number of real roots, since we need then attend only to the signs attached to the highest powers of x in his functions. (The computations are usually laborious, but awkward fractions can be avoided by multiplying each dividend by a properly chosen positive constant before dividing.) Next, if f(x) = 0 has multiple roots, Va - Vb is still the number of real roots between a and b, provided each multiple root be counted once only. In practice, however, it is simplest to get rid of the multiple roots first, by the method already indicated.
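The whole procedure can be sketched with exact rational arithmetic. This assumes, as in the simple case above, that f(x) = 0 has no pair of equal roots; the example x³ - 2x - 5 = 0 has a single real root, near 2.09:

```python
from fractions import Fraction

def poly_eval(p, x):
    # Horner evaluation; p lists coefficients, highest degree first
    v = Fraction(0)
    for c in p:
        v = v * x + c
    return v

def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]

def poly_rem(a, b):
    # remainder in the division of a(x) by b(x)
    a = [Fraction(c) for c in a]
    b = [Fraction(c) for c in b]
    while len(a) >= len(b):
        f = a[0] / b[0]
        a = [ai - f * bi
             for ai, bi in zip(a, b + [Fraction(0)] * (len(a) - len(b)))][1:]
        while a and a[0] == 0:
            a.pop(0)
    return a

def sturm_chain(p):
    # f and f', followed by each successive remainder with its sign changed
    chain = [[Fraction(c) for c in p], [Fraction(c) for c in derivative(p)]]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])

def variations(chain, x):
    vals = [poly_eval(p, Fraction(x)) for p in chain]
    signs = [1 if v > 0 else -1 for v in vals if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

def real_roots_between(p, a, b):
    # Sturm's theorem: Va - Vb real roots lie in (a, b)
    chain = sturm_chain(p)
    return variations(chain, a) - variations(chain, b)
```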

A less powerful theorem by the French physician F. D. Budan (1807), proved by J. B. J. Fourier about 1829, involves less computation than Sturm's, and is often usable. Let f^(j)(x) be the jth derivative of f(x); replace Sturm's sequence by f(x), f^(1)(x), f^(2)(x), ..., f^(n)(x), and proceed with this sequence, with the same a, b as before, precisely as in calculating Va, Vb for Sturm's. Then, if a root of multiplicity m be now counted as m roots, Va - Vb for this sequence is either the number of real roots of f(x) = 0 between a and b, or exceeds it by a positive even whole number.

It is sometimes convenient to know an upper limit L to the values of the real roots of f(x) = 0. Let G be the greatest of the numerical values of the coefficients c1, ..., cn. If the first negative coefficient is preceded by precisely s coefficients that are greater than, or equal to, zero, then L = 1 + (G/c0)^(1/s). Another upper limit is as follows: If c0 is negative, change all signs in f(x) = 0. Divide then the numerical value of each negative coefficient by the sum of all those positive coefficients that precede it. Let Q be the greatest of these quotients. Then 1 + Q is the upper limit in question.

Computation of Roots.—When a real root of a real equation f(x) = 0 has been isolated by any of the methods suggested, it may be calculated, digit by digit, by any one of several arithmetical processes, the commonest of which are that named after W. G. Horner (1819), which was known to the Chinese mathematicians of the 13th century, and Isaac Newton's of about 1675. Newton's method has the great advantage of being applicable to equations other than algebraic. It can be explained by his own example, x³ - 2x - 5 = 0. This has only one root between 2 and 3. To find this root, replace x by 2+h, where h is necessarily between 0 and 1; the new equation is h³ + 6h² + 10h - 1 = 0. The terms in h³ and h² being neglected, in comparison with h, we get h = 0.1 as a first approximation to h, and hence the root is roughly 2.1. For the next approximation, replace h by 0.1+k in the h-equation, and obtain k³ + 6.3k² + 11.23k + 0.061 = 0. Omit the k³ and k² terms, which will be small in comparison with k; then k is approximately -0.0054. From h = 0.1+k we get h = 0.0946, giving 2.0946 as an approximate value of the required root. By repeating the process a sufficient number of times any desired degree of accuracy is attainable. The method amounts to retaining only the first two terms in the Taylor series (see TAYLOR'S THEOREM) for f(r+h), on the supposition that h is small in comparison with the approximate value r of a real root of f(x) = 0; so that 0 = f(r+h) = f(r) + hf′(r) approximately, and h = -f(r)/f′(r); whence r1 = r+h is the first approximation. The process is repeated with h1 = -f(r1)/f′(r1); and r2 = r1+h1 is the second approximation. In Newton's example, r = 2 and h = 0.1. In Horner's method (the one usually given in elementary texts on algebra, except those most popular on the continent of Europe) a set of equations, one for each digit of the required root, is obtained successively from the given equation. The process is an arithmetical restatement and refinement of the crude graphical method, and amounts to successive shifts of the origin and magnifications of the scale.
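Newton's rule, h = -f(r)/f′(r) applied repeatedly, can be sketched in a few lines; the starting value and step count here are illustrative choices:

```python
def newton(f, df, x, steps=6):
    # Newton's rule: repeatedly replace x by x - f(x)/f'(x)
    for _ in range(steps):
        x -= f(x) / df(x)
    return x

# Newton's own example: x^3 - 2x - 5 = 0, starting from the isolated root's
# lower bound x = 2
f  = lambda x: x**3 - 2*x - 5
df = lambda x: 3*x**2 - 2
root = newton(f, df, 2.0)
```

A few iterations already reproduce 2.0946 and far more, illustrating the rapid improvement of the successive approximations.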
As already pointed out, it is sufficient to discuss the method for positive real roots only. The essence of the method is the successive diminution of the roots of the given equation by the smaller member in successive pairs of positive real numbers; i.e., we gradually creep up on the root from behind. Suppose, for example, that a real root of f(x) = 0 has been isolated between 200 and 300. If we can construct an equation f1(x) = 0, whose roots are those of f(x) = 0 each diminished by 200, then f1(x) = 0 will have one and only one root between 0 and 100. By any of the proposed methods locate this root; say it is between 60 and 70. Then f(x) = 0 has a root between 260 and 270. Construct f2(x) = 0, whose roots are those of f1(x) = 0 diminished by 60, and repeat the argument. Then f2(x) = 0 has a root between 0 and 10, which may be located as before; e.g., between 5 and 6. Then 5 is the third digit of the required root of f(x) = 0, and we next form f3(x) = 0, whose roots are those of f2(x) = 0 diminished by 5, and f3(x) = 0 has a root between 0 and 1, say 0.2. The required root so far is 265.2, and the process can be continued to any prescribed number of digits. It will automatically terminate, or yield a repeating decimal, if the root is a rational number. The rational roots, if any, are however best obtained first by trial, by examining the constant term of f(x) as already indicated.

The essential detail of constructing the equation f1(x) = 0, whose roots are those of f(x) = 0 each diminished by the positive number h, is performed by synthetic division according to the following theorem, which is an immediate consequence of the expansion of f(x+h) by Taylor's theorem. If f(x) ≡ c0x^n + ... + cn be divided by x-h, let the quotient be q1(x) and the remainder r1. Divide q1(x) by x-h; let the quotient be q2(x) and the remainder r2. Continue thus to n divisions. The last quotient is c0; the last remainder is rn. Then f1(x) ≡ c0x^n + rnx^(n-1) + ... + r2x + r1. Many devices for shortening the labour of Horner's method are explained in treatises on equations; in particular the last digit (at least) that is required can, in general, be obtained by simple division. Horner's method is, beyond any question, the most practical yet devised for the numerical solution of equations with numerical coefficients. Other methods of solving numerical equations are an impracticable one by continued fractions, due to Lagrange, and others by expansions in infinite series. The latter has recently been reconsidered by E. T. Whittaker, who expresses the coefficients in the series for a root in terms of determinants that can be easily computed.
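The repeated synthetic division can be sketched compactly. Applied to x³ - 2x - 5 with h = 2, it produces the shifted equation used in Newton's example, where x is replaced by 2+h:

```python
def shift_roots(coeffs, h):
    # coefficients of f1(x) = f(x + h), obtained by repeated synthetic
    # division by x - h; the roots of f1(x) = 0 are those of f(x) = 0,
    # each diminished by h. After the loops, c holds c0, rn, ..., r2, r1.
    c = list(coeffs)
    n = len(c) - 1
    for k in range(n):
        for i in range(1, len(c) - k):
            c[i] += h * c[i - 1]
    return c

# x^3 - 2x - 5 = 0 with its roots diminished by 2
shifted = shift_roots([1, 0, -2, -5], 2)   # h^3 + 6h^2 + 10h - 1
```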

Algebraic Solutions of Equations of the Second, Third and Fourth Degrees.—These equations are called respectively the quadratic, the cubic, and the biquadratic. The initial step in seeking a solution by radicals is usually the reduction of the given equation to another containing fewer powers of the unknown. In particular, the second highest power of the unknown, if present, may be removed as follows: In c0x^n + c1x^(n-1) + ... + cn = 0 put x = y+k. The coefficient of y^(n-1) is then nc0k + c1. Hence by choosing k = -c1/(nc0) we obtain an equation in y lacking the second term. If the y-equation can be solved, the roots of the x-equation are obtained from x = y+k. Thus, for the general quadratic, c0x² + c1x + c2 = 0, we have k = -c1/(2c0); the y-equation is y² = K, where K ≡ (c1² - 4c0c2)/(4c0²), and hence, from y = ±√K, x = y+k, we have the roots of the x-equation in the usual form. For the general cubic c0x³ + c1x² + c2x + c3 = 0, the value of k is -c1/(3c0); the substitution x = y - c1/(3c0) gives y³ + py + q = 0, where p, q are rational functions of the coefficients. If y1, y2, y3 are the roots of the y-equation, and x1, x2, x3 those of the x-equation, xj = yj - c1/(3c0) (j = 1, 2, 3). Before Lagrange the solution of y³ + py + q = 0 was achieved by ingenious substitutions based on nothing, apparently, but skill in guessing. Thus, by choosing y = z - p/3z, François Viète (Vieta) in 1591 obtained z⁶ + qz³ - p³/27 = 0, a quadratic in z³, whence he obtained three permissible values of y, and thence x. The final formulae, reproduced in texts on algebra, were first published in the Ars Magna (1545) by Hieronimo Cardano, who had obtained them from Tartaglia ("the stutterer," true name Nicolo Fontana) by questionable means. In numerical work, Horner's method is greatly preferable to direct substitution into Tartaglia's formulae. When all three roots are real, the so-called irreducible case, the formulae involve cube roots of imaginaries and are then necessarily worthless.
In this case a usable trigonometric solution is found by putting y = tz in y^3 + py + q = 0, and identifying the resulting equation with cos 3θ = 4cos^3 θ − 3cos θ, by means of z = cos θ, t = 2√(−p/3), cos 3θ = 3q/(pt). The last gives the 3 values of z, namely cos θ, cos(θ + 120°), cos(θ + 240°), whose numerical values can be obtained from trigonometric tables. Finally the values of y are found by multiplying those of z by t.  The irreducible case has a plethoric literature of its own.
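A machine computation replaces the trigonometric tables; the following sketch (function name mine) assumes the irreducible case, p < 0 with all three roots real:

```python
import math

def cubic_roots_trig(p, q):
    """All three real roots of y^3 + p*y + q = 0 in the irreducible
    case (p < 0 and 4*p^3 + 27*q^2 < 0), via y = t*z with z = cos(theta),
    t = 2*sqrt(-p/3) and cos(3*theta) = 3q/(p*t)."""
    t = 2 * math.sqrt(-p / 3)
    theta = math.acos(3*q / (p*t)) / 3
    # The three values of z are cos(theta), cos(theta + 120 deg),
    # cos(theta + 240 deg); multiplying by t gives the roots y.
    return [t * math.cos(theta + k * 2*math.pi/3) for k in range(3)]

# y^3 - 3y + 1 = 0 has three real roots.
print(sorted(round(y, 4) for y in cubic_roots_trig(-3, 1)))
# ~ [-1.8794, 0.3473, 1.5321]
```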

The general biquadratic may be taken in the form x^4 + a1x^3 + a2x^2 + a3x + a4 = 0.

Its solution, due to Ludovico Ferrari, but first published in the Ars Magna, was found by adding (mx + b)^2 to both sides and imposing the condition that the new left-hand member be identically the square of x^2 + a1x/2 + h, where h is to be found. By comparison of coefficients in the assumed identity, and subsequent elimination of m and b, a cubic for h is obtained. This cubic is called the resolvent, or reducing, cubic; any one of its 3 roots enables us to find m and b. The solution of the biquadratic is thus finally reduced to that of two quadratics, x^2 + a1x/2 + h = ±(mx + b), whose 4 roots are those required. The ultimate formulae, with the value of h inserted, are too complicated to be usable. Solutions by means of elliptic functions are known, but they also are mere algebraic curiosities.
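Given a root h of the resolvent cubic, the splitting into two quadratics is mechanical. The following sketch (names mine) takes h as supplied, here found by inspection for a biquadratic with known factorization:

```python
import cmath

def ferrari_roots(a1, a2, a3, a4, h):
    """Roots of x^4 + a1*x^3 + a2*x^2 + a3*x + a4 = 0, given a root h of
    the resolvent cubic, by splitting the quartic into the two quadratics
    x^2 + a1*x/2 + h = +/-(m*x + b)."""
    m = cmath.sqrt(a1*a1/4 + 2*h - a2)
    # b is fixed by the cross term of the identity: 2*m*b = a1*h - a3.
    b = (a1*h - a3) / (2*m) if m != 0 else cmath.sqrt(h*h - a4)
    roots = []
    for s in (1, -1):
        # x^2 + (a1/2 - s*m)*x + (h - s*b) = 0
        B, C = a1/2 - s*m, h - s*b
        d = cmath.sqrt(B*B - 4*C)
        roots += [(-B + d)/2, (-B - d)/2]
    return roots

# x^4 - 10x^2 + 9 = (x^2 - 1)(x^2 - 9); the resolvent condition
# (a1*h - a3)^2 = 4*(a1^2/4 + 2h - a2)*(h^2 - a4) is satisfied by h = 3.
r = ferrari_roots(0, -10, 0, 9, 3)
print(sorted(z.real for z in r))   # [-3.0, -1.0, 1.0, 3.0]
```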

There is a vast literature on cubic and biquadratic equations. Little of it is to-day of any vital mathematical consequence, and most of it is of no practical value. Nevertheless, the incessant activity of nearly three centuries reflected in this accumulation of algebraic lore was not wholly futile, for without at least some of it, the sure clue to the maze could probably not have been discovered. To appreciate the true magnitude of this early work, the modern algebraist should attempt to restore it for himself, with but such tools and notations as its creators had. He will rise from his efforts with a new respect for his forbears.

A new era began with Lagrange in 1770. In an illuminating critique of his predecessors' solutions of the general cubic and biquadratic, he observed that the solution by radicals of any algebraic equation can be made to depend upon that of another, now called the resolvent, which may or may not be easier to solve. Thus, for equations of degree 5, the resolvent is of degree 6. The roots of a resolvent equation are rational functions of those of the original. By such considerations, and others arising naturally from them, Lagrange transposed the problem of solution by radicals to a profound study of rational functions of the roots of equations, and in particular to an investigation of the number of distinct values which such functions of the roots, considered as independent variables, assume under permutations of the roots. This work contains a germ of the modern theory founded by Galois, in which the theory of substitution groups plays a central part.

An idea of Lagrange's attack can be gained from a brief reconsideration of y^3 + py + q = 0, which will also prepare the way for the Galois theory. Let the roots be x1, x2, x3. The discriminant of this equation is the square of D ≡ (x1 − x2)(x1 − x3)(x2 − x3).

Let ω be an imaginary root of x^3 − 1 = 0, and write F ≡ x1 + ωx2 + ω^2x3, G ≡ x1 + ω^2x2 + ωx3. The three functions F^3, G^3, D have the significant property that any substitution on x1, x2, x3 which leaves one of these functions unchanged, leaves also the other two unchanged; for example, the cyclic substitution which replaces x1, x2, x3 by x2, x3, x1 changes F into ω^2F, and hence leaves F^3, and likewise G^3 and D, unchanged. The totality of substitutions on n independent variables which leave a given function of those variables unchanged form a group; the function is said to belong to the group. Lagrange proved that, if several functions belong to the same group, any one of them is a rational function of each of the others. Hence each of F^3, G^3 is a rational function of D, and therefore each is rational in the square root of a polynomial in p, q, since D^2, being the discriminant, is rational and integral in the coefficients of the equation. The coefficient of y^2 being zero in y^3 + py + q = 0, we have x1 + x2 + x3 = 0, which, with the values of F, G just indicated, gives three equations of degree 1 to solve for x1, x2, x3 in terms of algebraic functions of p, q. Since the determinant of the set does not vanish, a solution exists, and it is thus known a priori that the general cubic is solvable by radicals. When elaborated, this method yields the roots explicitly. The biquadratic is treated in a similar manner; the general equation of degree 5 cannot be so solved, for a reason that will appear in the concluding sections.
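That F^3 and G^3 are rational in the coefficients can be verified numerically. For a depressed cubic with roots summing to zero, the symmetric-function identities give F^3 + G^3 = −27q and F^3·G^3 = −27p^3 (so F^3 and G^3 are the two roots of the quadratic T^2 + 27qT − 27p^3 = 0); the example below checks this for y^3 − 7y + 6 = 0, whose roots 1, 2, −3 are chosen for convenience:

```python
import cmath

# Roots of y^3 - 7y + 6 = 0 (p = -7, q = 6): 1, 2, -3, summing to zero.
x1, x2, x3 = 1, 2, -3
p, q = -7, 6
w = cmath.exp(2j * cmath.pi / 3)       # imaginary cube root of unity

F = x1 + w*x2 + w**2*x3
G = x1 + w**2*x2 + w*x3

# Both symmetric combinations come out rational in p and q:
print(F**3 + G**3)    # ~ -27*q   = -162
print(F**3 * G**3)    # ~ -27*p^3 = 9261
```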

As Lagrange's great work was but a preliminary to that of the modern school, further discussion of it may be omitted. Contemporaneously with Lagrange, an Italian physician, Paolo Ruffini, began issuing in 1799 a series of memoirs containing incidentally theorems which would now be restated in terms of substitution groups on five letters, in an attempt to prove the general equation of the fifth degree algebraically unsolvable. He almost succeeded, but his projected proof, like Abel's, is incomplete, and the whole matter is to-day surveyed from the higher point of view of Galois. Before passing to this we examine a general process of which several applications have occurred in what preceded.

Transformations.—The importance of transforming a given equation into another, whose roots are particular functions of those of the original, was seen in the preceding sketches of the solutions, numerical or algebraic, of equations. Thus, an important detail of Horner's method was equivalent to the linear transformation x = y + k; the roots of the y-equation were those of the x-equation each diminished by k. This is one of the simplest examples of a more general transformation introduced in 1683 by Ehrenfried Walther Tschirnhausen (or Tschirnhaus), who attempted to solve all equations by reducing them to the form y^n = A, an impossible project. Let x1, ..., xn be the roots of f(x) ≡ c0x^n + ... + cn = 0. On dividing any polynomial P in a given root, say x1, by f(x1), the polynomial is reduced to another of degree n − 1 at most in x1. Tschirnhaus transformations are of the type y = P(x)/Q(x), where P, Q are polynomials of degree < n, and Q(x) vanishes for no root of f(x) = 0; Q(x) may reduce to a numerical constant, in particular to 1. This transforms f(x) = 0 into an equation in y whose roots are yj ≡ P(xj)/Q(xj) (j = 1, 2, ..., n). The y-equation can be obtained by elimination; the details usually are tedious.
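When the roots of f(x) = 0 are known numerically, the y-equation can be built directly by expanding the product of the factors y − P(xj), sidestepping the tedious elimination. A small sketch (helper name mine), using f(x) = x^3 − 2 and P(x) = x^2, for which the y-equation must be y^3 − 4 = 0 since y^3 = x^6 = (x^3)^2 = 4:

```python
import cmath

def poly_from_roots(roots):
    """Monic coefficients, highest power first, of prod (y - r)."""
    coeffs = [1]
    for r in roots:
        # Multiply the current polynomial by (y - r).
        coeffs = [a - r*b for a, b in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

# The three complex cube roots of 2 are the roots of f(x) = x^3 - 2.
xs = [2**(1/3) * cmath.exp(2j*cmath.pi*k/3) for k in range(3)]

# Tschirnhaus transformation y = P(x) with P(x) = x^2:
ys = [x**2 for x in xs]
coeffs = poly_from_roots(ys)
print([round(c.real, 6) for c in coeffs])   # the y-equation: y^3 - 4 = 0
```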

The use of such transformations is evident from the following specimens. By a transformation y = x^2 + px + q, where p, q do not involve x, the general cubic is reducible to y^3 = A, where A depends only on p, q and the coefficients of the cubic. More generally, by a linear transformation, or by a Tschirnhaus transformation whose coefficients involve only 1 square root, the general equation of degree n can be reduced to an equation of degree n in y lacking the terms in y^(n-1), y^(n-2). Or again, by a Tschirnhaus transformation whose coefficients involve only 1 cube root and 3 square roots, the general equation of degree n is reducible to an equation in y of degree n lacking the terms in y^(n-1), y^(n-2), y^(n-3). The last is of capital importance for the general equation of degree 5, which is thus reducible to y^5 + py + q = 0, a result obtained by E. S. Bring about 1786, and independently by G. B. Jerrard in 1827. This is one point of departure for the solution in terms of elliptic functions, as a similar equation appears naturally in the construction of elliptic functions whose periods are fifths of those of given functions.

Fields and Reducibility.

These concepts are fundamental in the modern theory. A field is a set of elements closed under addition, subtraction, multiplication, and division, no divisor being the zero of the set. The elements may be numerical, or mere marks, or they may be partly one and partly the other. Marks can be considered as independent complex variables. Elements of a given field are said to be "rationally known." In the general equation c0x^n + c1x^(n-1) + ... + cn = 0 of degree n, the coefficients are independent complex variables. Rational functions of the roots x1, ..., xn are equal only when they are equal for all sets of values of the roots; i.e., the roots are considered as indeterminates. If, however, the coefficients of an equation are given numerical constants, equality of rational functions of the roots means equality of numerical values of the functions. It must be noticed that, although a substitution on the roots may change the form of a rational function, the numerical value of the function may remain unchanged. The substitutions leaving unchanged a rational function of the roots, considered as marks, form a group; in general the like is false for roots of numerical equations. If the coefficients of a rational function are rationally known, it is called rational, and similarly for rational relations between the roots. Functions not equal as just defined are called distinct; two functions are said to be unchanged by a substitution on the roots if the new function is equal, as defined, to the original. If all the coefficients of a polynomial P are in a given field F, P is said to be in F. If P is neither the product of two polynomials in F, nor a constant, P is called irreducible (in F).
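Irreducibility depends on the field. Over the rationals it can be tested mechanically for low degrees: a cubic (or quadratic) with integer coefficients is reducible over the rational field if and only if it has a rational root, and any such root r/s must have r dividing the constant term and s dividing the leading coefficient. A sketch of this rational-root test (function names mine; valid for degree 3 only, since higher degrees can factor without a linear factor):

```python
from fractions import Fraction

def divisors(n):
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]

def is_irreducible_cubic(c0, c1, c2, c3):
    """True iff c0*x^3 + c1*x^2 + c2*x + c3 (integer coefficients) has
    no rational root, i.e. is irreducible over the rational field."""
    if c3 == 0:
        return False                      # x = 0 is a root
    for r in divisors(c3):
        for s in divisors(c0):
            for sign in (1, -1):
                x = Fraction(sign * r, s)
                if c0*x**3 + c1*x**2 + c2*x + c3 == 0:
                    return False
    return True

print(is_irreducible_cubic(1, 0, -3, 1))    # True:  x^3 - 3x + 1
print(is_irreducible_cubic(1, -3, -3, -4))  # False: root x = 4
```

Adjoining an irrational quantity enlarges the field and may destroy irreducibility, as the sequel on the group of an equation exploits.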

Group of an Equation.

Let the roots x1, ..., xn of f(x) = 0 be distinct. For a proper choice of the constants m1, ..., mn, the function V1 ≡ m1x1 + m2x2 + ... + mnxn takes n! distinct values under the symmetric group on the roots. It can be shown that any rational function of the roots x1, ..., xn is a rational function of V1, say R(V1); and, with a certain restriction which we may ignore here, if S1 ≡ I, S2, ..., Sh is any substitution group on the roots, and if Rj(x1, ..., xn) is the result of applying Sj to R(x1, ..., xn), then Rj(x1, ..., xn) = R(Vj), where Vj comes from applying Sj to V1. In particular then, each of x1, ..., xn is a rational function of V1. Write F(V) ≡ (V − V1)(V − V2) ... (V − Vn!), and, whether F(V) is or is not reducible, let the irreducible factor of F(V) which has V1 as a root, equated to zero, be formed; this equation is called a Galois resolvent of f(x) = 0. Each of its roots is a rational function of one of them. To solve f(x) = 0 is equivalent then to finding one root, say V1, of its Galois resolvent.
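The requirement on the constants m1, ..., mn is easily illustrated: the multipliers must be chosen so that no two permutations of the roots give V1 the same value. A small check (the roots and multipliers are illustrative choices, not from the text):

```python
from itertools import permutations

# For a cubic with the distinct roots 1, 2, -3, the function
# V = m1*x1 + m2*x2 + m3*x3 with (m1, m2, m3) = (1, 2, 4) takes
# 3! = 6 distinct values as the roots are permuted.
roots = (1, 2, -3)
ms = (1, 2, 4)
values = {sum(m*x for m, x in zip(ms, p)) for p in permutations(roots)}
print(len(values))   # 6

# A careless choice, e.g. m1 = m2 = m3 = 1, collapses them all:
bad = {sum(p) for p in permutations(roots)}
print(len(bad))      # 1
```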

The degree g of the Galois resolvent does not exceed n!; its g roots can be derived from V1 by the substitutions of a group G, the so-called group of f(x) = 0 for the field of its coefficients. Every rational function of the roots which is unchanged by all the substitutions of G is rationally known; every rationally known rational function of the roots of f(x) = 0 is unchanged by all the substitutions of G; moreover, G is the smallest group having the first property, and the largest having the second. The group of the general equation of degree n is the symmetric group on the roots. If to the field of the coefficients of f(x) = 0 there be adjoined a rational function of the roots, giving an enlarged field which contains the original, and if this function belongs to a subgroup of G, the group of the equation is reduced to the subgroup.

Solvability by Radicals.—A group is simple if its only invariant subgroups are itself and the identity; non-simple groups are called composite. A subgroup of G other than G is called proper; an invariant proper subgroup not a subgroup of a larger invariant proper subgroup is called maximal. Let H be a maximal invariant proper subgroup of any group G; let K be a maximal invariant proper subgroup of H, and so on, till the identity group I is reached. Then G, H, K, ..., L, I is called a series of composition of G. Let the respective orders of these groups be g, h, k, ..., l, 1. Then g/h, h/k, ..., l are integers. They are the same, except for order, for all series of composition for G, and are called the factors of composition of G, which is called solvable if and only if its factors of composition are all primes.
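A series of composition can be checked by machine for the symmetric group on 4 letters, whose chain S4 ⊃ A4 ⊃ V4 ⊃ C2 ⊃ I has the prime factors of composition 2, 3, 2, 2, so that S4 is solvable and the biquadratic is solvable by radicals. A sketch with permutations as tuples (all names mine):

```python
from itertools import permutations

def compose(a, b):
    """Permutations as tuples on 0..n-1: (a*b)(i) = a(b(i))."""
    return tuple(a[b[i]] for i in range(len(b)))

def inverse(a):
    inv = [0] * len(a)
    for i, ai in enumerate(a):
        inv[ai] = i
    return tuple(inv)

def is_even(p):
    """Even permutation: an even number of inversions."""
    return sum(p[i] > p[j] for i in range(len(p))
               for j in range(i + 1, len(p))) % 2 == 0

def is_invariant(G, H):
    """H is an invariant (normal) subgroup of G."""
    return all(compose(compose(g, h), inverse(g)) in H
               for g in G for h in H)

S4 = set(permutations(range(4)))
A4 = {p for p in S4 if is_even(p)}
V4 = {(0,1,2,3), (1,0,3,2), (2,3,0,1), (3,2,1,0)}   # Klein four-group
C2 = {(0,1,2,3), (1,0,3,2)}
I  = {(0,1,2,3)}

series = [S4, A4, V4, C2, I]
assert all(is_invariant(series[i], series[i+1]) for i in range(4))
factors = [len(series[i]) // len(series[i+1]) for i in range(4)]
print(factors)   # [2, 3, 2, 2] -- all prime, so S4 is solvable
```

For n > 4 the chain breaks down: the alternating group on n letters is simple, and its order is not prime, which is the group-theoretic fact behind the concluding theorem.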

The crown of Galois's theory is the beautiful theorem that an algebraic equation is solvable by radicals if and only if its group for the field of its coefficients is solvable. By the known properties of the symmetric groups on n letters, it follows at once that when n> 4, the general equation of degree n is not solvable by radicals.

BIBLIOGRAPHY.—J. A. Serret, Algèbre Supérieure (1854); C. Jordan, Traité des Substitutions (1870); W. S. Burnside and A. Panton, Theory of Equations (1881); O. Bolza, "Theory of Substitution Groups," Amer. Journal of Math., vol. xiii. (1891); E. Netto and F. N. Cole, Theory of Substitutions (1895); H. Weber, Lehrbuch der Algebra (1895); J. Petersen, Théorie des équations algébriques (trans. H. Laurent, 1897); Encyklopädie der Mathematischen Wissenschaften, vol. i. (1898, etc.); F. Cajori, Introduction to the Modern Theory of Equations (1904); L. E. Dickson, Elementary Theory of Equations (1914); G. A. Miller, H. F. Blichfeldt and L. E. Dickson, Theory of Groups (1916); L. E. Dickson, A First Course in the Theory of Equations (1922), Modern Algebraic Theories (1926). (E. T. B.)
