
On the other hand, the child with an I Q of 120 or above is almost never found below the grade for his chronological age, and occasionally he is one or two grades above. Wherever located, his work is always “superior” or “very superior,” and the evidence suggests strongly that it would probably remain so even if extra promotions were granted.

CORRELATION BETWEEN I Q AND THE TEACHERS' ESTIMATES OF THE CHILDREN'S INTELLIGENCE. By the Pearson formula the correlation found between the I Q's and the teachers' rankings on a scale of five was .48. This is about what others have found, and is both high enough and low enough to be significant. That it is moderately high in so far corroborates the tests. That it is not higher means that either the teachers or the tests have made a good many mistakes.
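As a rough illustration of how a coefficient of this kind is computed, the following is a minimal sketch of the Pearson product-moment formula in Python. The function name `pearson_r` and the eight pairs of values are hypothetical, invented only for the example; they are not the data of the study and will not reproduce the figure of .48.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Deviations of each value from its own mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical values, invented for illustration only (not from the study).
iqs = [72, 85, 93, 100, 104, 110, 118, 127]
teacher_ranks = [2, 1, 3, 3, 2, 4, 3, 5]  # scale of five: 1 = lowest, 5 = highest

r = pearson_r(iqs, teacher_ranks)
print(round(r, 2))
```

A coefficient of 1 would mean that the teachers' rankings followed the I Q's perfectly, and 0 that they bore no linear relation to them at all.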

When the data were searched for evidence on this point, it was found, as we have shown in Chapter II, that the fault was plainly on the part of the teachers. The serious mistakes were nearly all made with children who were either over age or under age for their grade, mostly the former. In estimating children's intelligence, just as in grading their school success, the teachers often failed to take account of the age factor. For example, the child whose mental age was, say, two years below normal, and who was enrolled in a class with children about two years younger than himself, was often graded “average” in intelligence.

The tendency of teachers is to estimate a child's intelligence according to the quality of his school work _in the grade where he happens to be located_. This results in overestimating the intelligence of older, retarded children, and underestimating the intelligence of the younger, advanced children. The disagreements between the tests and the teachers' estimates are thus found, when analyzed, to confirm the validity of the test method rather than to bring it under suspicion.

THE VALIDITY OF THE INDIVIDUAL TESTS. The validity of each test was checked up by measuring it against the scale as a whole in the manner described on p. 55. For example, if 10-year-old children having 11-year intelligence succeed with a given test decidedly better than 10-year-old children who have 9-year intelligence, then either this test must be accepted as valid or the scale as a whole must be rejected. Since we know, however, that the scale as a whole has at least a reasonably high degree of reliability, this method becomes a sure and ready means of judging the worth of a test.
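The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pass_rates_by_mental_level`, the record layout, and the sample records are all hypothetical, and the actual procedure is the one described on p. 55, which may differ in detail.

```python
def pass_rates_by_mental_level(records, chronological_age):
    """Compare success on one test among children of the same chronological
    age, split by whether their mental age lies above or below that age.

    records: iterable of (chronological_age, mental_age, passed) tuples.
    Returns (rate_above, rate_below); a rate is None if its group is empty.
    """
    above = [passed for ca, ma, passed in records
             if ca == chronological_age and ma > ca]
    below = [passed for ca, ma, passed in records
             if ca == chronological_age and ma < ca]

    def rate(group):
        # True counts as 1 and False as 0, so the sum is the number passing.
        return sum(group) / len(group) if group else None

    return rate(above), rate(below)


# Hypothetical records, invented for illustration only.
records = [
    (10, 11, True), (10, 11, True), (10, 11, False), (10, 11, True),
    (10, 9, False), (10, 9, True), (10, 9, False), (10, 9, False),
    (9, 10, True),  # a different chronological age, ignored below
]

rate_above, rate_below = pass_rates_by_mental_level(records, 10)
print(rate_above, rate_below)  # 0.75 0.25
```

A test for which the first rate is decidedly higher than the second agrees with the scale as a whole; one for which the two rates are nearly equal does not.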

When the tests were tried out in this way it was found that some of those which have been most criticized have in reality a high correlation with intelligence. Among these are naming the days of the week, giving the value of stamps, counting thirteen pennies, giving differences between president and king, finding rhymes, giving age, distinguishing right and left, and interpretation of pictures. Others having a high reliability are the vocabulary tests, arithmetical reasoning, giving differences, copying a diamond, giving date, repeating digits in reverse order, interpretation of fables, the dissected sentence test, naming sixty words, finding omissions in pictures, and recognizing absurdities.

Among the somewhat less satisfactory tests are the following: repeating digits (direct order), naming coins, distinguishing forenoon and afternoon, defining in terms of use, drawing designs from memory, and aesthetic comparison. Binet's “line suggestion” test correlated so little with intelligence that it had to be thrown out. The same was also true of two of the new tests which we had added to the series for try-out.

Tests showing a medium correlation with the scale as a whole include arranging weights, executing three commissions, naming colors, giving number of fingers, describing pictures, naming the months, making change, giving superior definitions, finding similarities, reading for memories, reversing hands of clock, defining abstract words, problems of fact, bow-knot, induction test, and comprehension questions.

A test which makes a good showing on this criterion of agreement with the scale as a whole becomes immune to theoretical criticisms. Whatever it appears to be from mere inspection, it is a real measure of intelligence. Henceforth it stands or falls with the scale as a whole.

The reader will understand, of course, that no single test used alone will determine accurately the general level of intelligence. A great many tests are required; and for two reasons: (1) because intelligence has many aspects; and (2) in order to overcome the accidental influences of training or environment. If many tests are used no one of them need show more than a moderately high correlation with the scale as a whole.

As stated by Binet, “Let the tests be rough, if there are only enough of them.”

CHAPTER VI

THE SIGNIFICANCE OF VARIOUS INTELLIGENCE QUOTIENTS

FREQUENCY OF DIFFERENT DEGREES OF INTELLIGENCE. Before we can interpret the results of an examination it is necessary to know how frequently an I Q of the size found occurs among unselected children. Our tests of 1000 unselected children enable us to answer this question with some degree of definiteness. A study of these 1000 I Q's shows the following significant facts:--

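The lowest  1 %     go to 70 or below; the highest  1 %     reach 130 or above
The lowest  2 %     go to 73 or below; the highest  2 %     reach 128 or above
The lowest  3 %     go to 76 or below; the highest  3 %     reach 125 or above
The lowest  5 %     go to 78 or below; the highest  5 %     reach 122 or above
The lowest 10 %     go to 85 or below; the highest 10 %     reach 116 or above
The lowest 15 %     go to 88 or below; the highest 15 %     reach 113 or above
The lowest 20 %     go to 91 or below; the highest 20 %     reach 110 or above
The lowest 25 %     go to 92 or below; the highest 25 %     reach 108 or above
The lowest 33 1/3 % go to 95 or below; the highest 33 1/3 % reach 106 or above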
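For readers who wish to consult the table mechanically, it can be written out as a small lookup, sketched here in Python. The cut-off values are transcribed from the table; the names `IQ_TAILS` and `tail_for_iq` are hypothetical and serve only the illustration.

```python
from fractions import Fraction

# Each row: (percent of children, I Q at or below which the lowest percent
# fall, I Q at or above which the highest percent fall), narrowest tail first.
IQ_TAILS = [
    (Fraction(1), 70, 130),
    (Fraction(2), 73, 128),
    (Fraction(3), 76, 125),
    (Fraction(5), 78, 122),
    (Fraction(10), 85, 116),
    (Fraction(15), 88, 113),
    (Fraction(20), 91, 110),
    (Fraction(25), 92, 108),
    (Fraction(100, 3), 95, 106),  # 33 1/3 %
]


def tail_for_iq(iq):
    """Return ("lowest" or "highest", percent) for the narrowest tabulated
    group containing iq, or None if iq lies between the 33 1/3 % cut-offs."""
    for percent, low_cut, high_cut in IQ_TAILS:
        if iq <= low_cut:
            return ("lowest", percent)
        if iq >= high_cut:
            return ("highest", percent)
    return None


print(tail_for_iq(126))  # ('highest', Fraction(3, 1))
print(tail_for_iq(85))   # ('lowest', Fraction(10, 1))
print(tail_for_iq(100))  # None
```

The lookup reports only the tabulated groups; it makes no attempt to interpolate between rows, since the table itself gives no warrant for doing so.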

Or, to put some of the above facts in another form:--

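The child reaching 110 is equaled or excelled by 20 out of 100
The child reaching (about) 115 is equaled or excelled by 10 out of 100
The child reaching (about) 125 is equaled or excelled by 3 out of 100
The child reaching (about) 130 is equaled or excelled by 1 out of 100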

Conversely, we may say regarding the subnormals that:--

The child testing at (about) 90 is equaled or excelled by 80 out of 100
The child testing at (about) 85 is equaled or excelled by 90 out of 100