
Methods of Examination



Examinations on traditional lines are usually carried out in the case of the most advanced work by a thesis only; in the other grades by means of (1) written papers; (2) oral tests; (3) practical tests; (4) a combination of two or more of these.

Thesis.

A thesis or dissertation is usually required only for the higher degrees of universities; it normally embodies the results of research into some branch of knowledge and is usually looked over by two or more specialists, whilst it is generally necessary to undergo a public examination in addition to the thesis submitted.

Oral Tests.

An oral examination provides the opportunity to test the individual as regards the range of his knowledge, and makes it possible to test many important qualities, such as readiness of wit, presence of mind, common sense, which are not so easily tested by written papers. One useful extension of the oral examination is practised by the headmaster of secondary schools when interviewing candidates who have been qualified by a written examination for admission into a secondary school, to decide whether they are suitable for admission into his school. The interview is also used on a still larger scale by some local authorities who examine orally candidates for senior or technical scholarships. Similarly, candidates who are successful at competitive examinations for entrance into some of the public services, notably the navy, are subjected to an oral examination to determine whether their personality justifies their entry.

Practical Tests.

Oral tests sometimes shade off into practical tests, as in testing foreign languages orally. Such tests are, in fact, designed to test predominantly the manipulative skill of the candidate, whether it be in science, craft-work, medicine or some skilled trade or profession. A doctor or surgeon, for instance, must show that he is familiar with the practical side of his calling, an engineer or plumber must show that he has already attained sufficient skill to be allowed to be certified as capable of following his trade.

Written Tests.

In written examinations the candidates attend for several sessions of one and a half to three hours' duration, and answer printed papers of questions under a prescribed rubric. The number of questions on a paper may vary from a very large number to one or two according to the type of examination. The answers required may consist of a symbol or single word, or a single essay of considerable length, as in the case of a university honours candidate.

Simple Written Examinations.

The simplest form of examination question involves merely a single mental operation and an answer which is right or wrong, e.g., What is twice two? Answer: Four. The children ultimately to be examined are taught an algebraic operation such as: if a=3, b=2, c=0, then ab=6, 3c=0, etc. They work through sets of examples on the operation until they are familiar with it under its various guises. They reach a stage when they can be said to know the operation and are ready for examination. The examiner prepares the examination paper by selecting a large number, say 40, of the simplest form of questions each involving this operation. He tries his questions out on a group of children similar in character to those to be examined, and grades his 40 questions in order of difficulty until he is justified in the expectation that only one or two candidates will fail to answer the easiest question and that only one or two candidates will succeed in answering the hardest question. He then sets this examination paper to the group of candidates and is able to produce an order of merit showing the relative ability of these candidates to perform the algebraic operation tested.
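
The grading procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 1/0 encoding of right and wrong answers, and the use of "proportion of trial children answering correctly" as the measure of difficulty are all introduced here and are not specified in the text.

```python
def grade_by_difficulty(pilot):
    """Return question indices ordered from easiest to hardest.

    pilot[c][q] is 1 if trial child c answered question q correctly, else 0.
    (Hypothetical encoding, assumed for this sketch.)
    """
    n_children = len(pilot)
    n_questions = len(pilot[0])
    # Proportion of the trial group answering each question correctly.
    facility = [
        sum(row[q] for row in pilot) / n_children for q in range(n_questions)
    ]
    # Highest proportion correct comes first; sorted() is stable, so
    # questions of equal difficulty keep their original relative order.
    return sorted(range(n_questions), key=lambda q: -facility[q])


def order_of_merit(results):
    """Rank candidates by number of correct answers, best first.

    results maps a candidate's name to a list of 1/0 marks.
    """
    return sorted(results, key=lambda name: -sum(results[name]))


# Trial run: four children, four questions.
pilot = [
    [1, 0, 1, 1],
    [1, 0, 0, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 1],
]
graded = grade_by_difficulty(pilot)

# The real examination: three candidates.
merit = order_of_merit({"A": [1, 1, 0, 0], "B": [1, 1, 1, 0], "C": [1, 0, 0, 0]})
```

Here the fourth question, answered correctly by every trial child, is placed first, and the second question, answered by only one, is placed last.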

A simple examination paper of this character has the merit of objectivity. The questions are simple, can only be correctly answered in one way, are graded in difficulty by an objective preliminary test. The only variable characteristics which affect the result of the examination, apart from the difference in ability to do the algebraic operation, are the temperament and nerves of the candidates. Some pupils are not good examination subjects. Another element of uncertainty is introduced when examiner and teacher are not the same person, for even if the ground is common the teacher may have stressed some points more than others. Again, the moment more difficult questions are introduced into an examination paper a further element of chance is introduced; this is the chance that the form of the question will affect some candidates differently from others; for two questions of equal difficulty but of different forms will be answered with different success by some, at least, of the examinees.

When the subject of examination is English, the element of chance assumes greater importance. Children's knowledge of words and phrases, their ability to understand a short passage of connected prose is conditioned by many other influences than school. Hence children are very liable to suffer even in the simplest form of examination in English by the fact that they do not "know" a word or phrase, not having met beforehand with it, in the question paper. The examiner, even by the most exhaustive preliminary trials, cannot provide against such a possibility and the element of chance affects the scores made by some candidates.

Complex Examination Papers.

Examination papers increase in complexity whenever two or more mental operations are tested. In arithmetic a paper of 20 questions might test 20 processes; or ten questions in an English paper might test facility in the use and comprehension of language in ten different ways. In history, geography, science, etc., where the examiner is required to set a paper covering a prescribed syllabus, the test samples the candidates' acquaintance with the facts specified in the syllabus, and the first test of the quality of the examination paper refers to the goodness of his sampling. In such papers the element of chance tends to be concentrated upon the average candidate; the very best and the very worst candidates are definitely determined, for they are familiar or unfamiliar with the material put before them in the paper of questions, but the candidates of a little more than average ability may score below the average by a chance unfamiliarity with one question. In much the same fashion the candidates may be concentrated towards the average by the fact that they will have covered the different portions of a syllabus with different degrees of industry, and the chance that the examiner has emphasized different aspects of the subject in a different way from some of the candidates, tends to mass the candidates together in a large group where the scores are about half the maximum. The concentration of the incidence of luck upon the average candidates is one of the reasons why it is inexpedient to pay too much attention to the candidates who score; in the neighbourhood of "half marks" there is a very high probability that a difference of five or six marks between the scores of candidates represents the luck of the examination and nothing else.

The Use of Intelligence Tests in Examinations.

It would be impossible for methods of examination to remain unaffected by the recent development of mental tests. It was not long before mental tests themselves extended their boundaries beyond their native realm of intelligence, and invaded the province of school studies. The technique which had served to test general intelligence was carried over to the testing of academic attainments. This procedure gave rise to standardized scholastic tests. Just as intelligence tests were made to form a scale of mental ages, so were scholastic tests made to form a scale of educational ages. When the scholastic tests were not standardized, when they were not intended to be used over and over again like an electric torch, but were intended to be used once only, like a lucifer match, then they became known (in the United States) as "new type tests." Ignoring, therefore, the distinction in purpose and in content between these three classes of new tests, and regarding them purely from the point of view of method, we find that a new style of examination has come into being, a style which stands in marked contrast with the old traditional type. And in connection therewith a large body of doctrine has been developed, and a large variety of statistical methods devised, all bearing upon the validity of tests. Prof. F. Y. Edgeworth, writing in 1890, long before the new methods were heard of, gives the following as a fundamental postulate: "The true or standard mark of any piece of work is the average of the marks given by a large number of competent examiners equally proficient in the subject and instructed as to the character and purpose of the examination." This postulate, of vital importance to the old examiner, is of no use at all to the new. For it assumes that judgments vary, and the new examiner regards judgments that vary as invalid.
The marking, in fact, should be objective, in the sense that different examiners would inevitably give the same mark to the same examination product. This is possible only when the contents examined on have been so analysed that each question contains only one element, the answer to which must be either definitely right or definitely wrong. It thus carries one mark or no mark at all. Thus the essential difference between the old examination and the new lies in the analysis of the subject-matter into markable elements. The new examiner analyses before the examination, the old examiner after the examination. The new examiner's analysis is complete and definite: it admits of no variation or extension on the part of the marker. The old examiner has a complex product, such as an essay, to mark, and either does not analyse it at all and judges by general impression, or analyses it into such vague factors as style, ideas, logical arrangement and so forth. It thus happens that the main superficial difference between the old examination and the new is that the former requires a small number of long answers, and the latter a large number of short answers.
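
The contrast between the two kinds of marking may be made concrete with a short sketch in Python. It is offered only as an illustration: the function names and sample figures are invented here, and the exact-match comparison against an answer key is an assumed reading of "definitely right or definitely wrong".

```python
from statistics import mean


def standard_mark(examiner_marks):
    """Old-style mark in Edgeworth's sense: the average of the marks
    given to one piece of work by several competent examiners."""
    return mean(examiner_marks)


def objective_mark(answers, key):
    """New-type mark: one mark for each answer that matches the key
    exactly, and no mark otherwise (assumed scoring rule)."""
    if len(answers) != len(key):
        raise ValueError("answers and key must have the same length")
    return sum(1 for given, correct in zip(answers, key) if given == correct)


# Three examiners judge the same essay differently; the average is taken.
essay_mark = standard_mark([62, 58, 66])

# Any marker applying the same key arrives at the same total.
paper_mark = objective_mark(["4", "6", "0"], ["4", "6", "1"])
```

In the first function the variation between examiners is part of the input; in the second there is nothing for the marker to judge, which is the sense in which the new marking is called objective.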

The new examiner lays great stress on "reliability", a technical term which means the degree to which the order of merit secured by one application of a series of tests agrees with the order secured by a second application of the same series. The agreement is measured by the mathematical method of correlation. Tests which give a low coefficient of reliability are discredited and rejected. The new examiner, therefore, tests his tests before applying them. He "tries them out" on children who are not genuine examinees.
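
A coefficient of this kind might be computed as follows. The text names only "the mathematical method of correlation"; the choice of Spearman's rank correlation, which compares the two orders of merit directly, is an assumption made for this sketch, as are the function names and the requirement that no two candidates tie.

```python
def ranks(scores):
    """Position of each candidate in the order of merit (1 = highest).

    Assumes no two candidates have the same score.
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    rank = [0] * len(scores)
    for position, i in enumerate(order, start=1):
        rank[i] = position
    return rank


def rank_reliability(first, second):
    """Spearman rank correlation between two applications of a test.

    first[i] and second[i] are the scores of the same candidate on the
    first and second application. Returns a value from -1 to 1.
    """
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two equal-length lists of at least 2 scores")
    n = len(first)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(first), ranks(second)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Five trial children sit the same series of tests twice.
first_sitting = [30, 25, 20, 15, 10]
second_sitting = [28, 26, 14, 18, 9]
coefficient = rank_reliability(first_sitting, second_sitting)
```

In this example only the third and fourth children change places between the two sittings, and the coefficient comes to about 0.9; identical orders of merit would give 1, and exactly reversed orders would give -1.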

A subsidiary characteristic of the new testing is that it occasionally probes a lower stratum of knowledge than is reached by the ordinary examination question. It does this through the "limited option" test. Here is an example: "The author of The Scarlet Letter is [Hawthorne, Poe, Stevenson, Kipling]." The testee has to underline the right author. He need not recall the author's name, he need only recognize it; and recognition is much easier than definite recall.

The new school tries to keep steadily in view the purpose for which an examination is designed, and modifies the questions according as the purpose is diagnostic, prognostic, selective or evaluative. It is maintained that though an examination may serve more than one purpose, it can only serve one purpose well.

How far has the new examination influenced the old? In England, slightly; in America, profoundly. In the United States the new type of tests has in certain instances superseded the traditional examination. European countries have shown a much more critical attitude towards them. On one side of the Atlantic the new testing is almost universal; on the other side it is sporadic and experimental. The rivalry between the two systems lies not so much in practice as in principles. What the ultimate issue of the contest will be no man can at present tell: much experiment and research must yet take place before a final adjustment is reached. In the meantime it seems fairly clear that each system has its points of weakness and its points of strength. The main merit of the new system is that it succeeds as far as seems possible in eliminating the element of chance; its main defect is that it fails to test, directly at any rate, constructive and creative ability. This is a grave defect, a defect from which the traditional examination is singularly free. But the freedom is secured at the expense of objectivity: the examination result is more liable to suffer from the vagaries of personal judgment.

Since the merits of one system are the defects of the other, it seems reasonable to regard the two as complementary. It is not improbable that both will, in the course of time, be brought within one scheme, yielding a final estimate of ability which would be of higher value than either of the constituent estimates. However that may be, the scientific precision of the new system cannot be ignored. It is a significant fact that no thesis embodying the results of psychological or educational research would be accepted by any university unless those results had been measured by the new methods. (See also below, under United States.) Mental tests for examination purposes are actually used by 21 local authorities. An enquiry by the Psychological Society shows that three-quarters of those who answered the questionnaire were making use of them mainly as providing an additional criterion to the orthodox tests, or in adjudicating on borderline cases. They were also found of special value in reducing the disabilities of pupils from small rural schools.
