It’s difficult to design a test such that all those taking it do not all do well nor all do poorly. For instance, suppose the SAT were to consist of the single question, “Fill in the blank: A, B, __, D, E, …” We would expect something like 99% pass rates. (Only an enlightened optimist would say everybody would get this right.)
Again, suppose the test were to consist of the single question, “Derive from first principles twenty-six dimensional Bosonic string theory.” Pass rates would be south of 1%.
Neither version could ascertain the differences in ability between people.
What is the hidden premise behind these conclusions? The pool of examinees. If this pool were to consist of professional, working physicists, pass rates for either version would be high. But if it were to consist of one-year-olds, pass rates for either would be low or zero.
The SAT’s pool is near-graduation high schoolers. This pool is not static; its characteristics have changed dramatically since the College Board gave its first entrance exams in 1901 (the SAT proper, a multiple-choice test, was first given in 1926). Perhaps the most profound shift over this period is in the percentage of children who attend high school and take the test. In 1926, this percentage was low. Only the most highly educated children completed high school and took the SAT. In 2013, the percentage was high. Children of all abilities enrolled in high school and sat the exam.
If the test were static, this increase in the pool, because it contains larger and growing proportions of less able children, would drive scores lower and lower, maybe so low that the majority of takers would now fail (especially considering the SAT used to penalize for wrong answers). The static test would resemble (in a crude way) our string theory version. Discrimination would suffer.
The widening pool, along with various other cultural changes too depressing to contemplate, also caused an easing of the material taught in high school, thus leaving kids even less prepared for the more difficult versions of the SAT. Perhaps the most fundamental alteration is in reading. Kids (and teachers) read much less now than a century ago. But the SAT assumes reading ability. (Similar watering down occurs with increasing frequency in college; consider, for example, “remedial” university reading and math courses using materials historically taught in middle school.)
Some even claim that intelligence on the whole has been decreasing for reasons of biology. Maybe this is so, but it is very difficult to tease out environmental versus biological influences as they change through time.
Anyway, it’s impossible for anyone but a bureaucrat or academic to say that kids are growing smarter or are better educated. Given the road our culture (and politics) are taking, there is no reversal of these downward trends in sight.
This means it was right for ETS to dumb down the SAT.
If they did not, the quondam SAT would have larger and growing clusters of scores at the low end and fewer and more strung-out scores at the high end. Discriminating between students would thus become more and more difficult. (What’s ideal is a test that produces a spread-out distribution of scores over the entire range of possibilities, with the mean score somewhere near the middle.) Considering that the goal of the SAT is discrimination, no other course of action makes sense.
Dumbing down the SAT is thus like a clothing manufacturer retooling his patterns to reflect a population which is growing shorter and squatter. Further, it makes no sense to decry the manufacturer’s sensible and prudent decision.
This isn’t the first time the SAT has been dumbed down, either.
In 2005 the SAT famously removed analogy questions. Example: P-VALUES : STATISTICAL THEORY :: MARXISM : POLITICAL THEORY. Analogies are difficult and require higher levels of thinking than other types of question. For the same reasons noted above, the proportion of kids doing poorly on the analogies was growing too large, so out they went.
Interestingly, besides eliminating penalties for wrong answers and simplifying its math, in this latest purge the SAT will remove “obscure” and “esoteric” vocabulary words (the Seattle Times lists “prevaricator” and “sagacious”). Both analogies and “obscure” words, of course, relate to reading. Reading ability among high schoolers must be shriveling. Well, no surprise there.
The timing is also curious. It was only a short nine years since the last major restructuring. This indicates the pace of decline is accelerating. And this is despite the increasing efforts (and funding) of teachers to “teach the test” to the exclusion of more important matters.
Or maybe “despite” is the wrong word.
For all non US citizens who don’t have a clue what this blog post is about:
http://en.wikipedia.org/wiki/SAT_Reasoning_Test
Thanks, Hans.
“It’s difficult to design a test such that all those taking it do not all do well nor all do poorly.” I hear you brother, although I would suggest that the all do well is more difficult than the all do poorly.
“If this pool were to consist of professional, working physicists pass rates for either version would be high.” I doubt that this is true for the boson string theory question.
“Some even claim that intelligence on the whole has been decreasing for reasons of biology.” What about the Flynn effect? Wouldn’t this mean that the SAT should be tougher and not easier? They may be dumbing down the test because the quality of education has declined but probably not because of changes in intelligence. I suggest that the reason is quite different. It is to justify a greater number of student admissions to university without the obvious embarrassment of accepting low SAT scores. Also, an easier test may reduce the range of scores rather than increase it as you suggest. This would allow for a more arbitrary selection process based on non-academic criteria.
“quondam”. Have you discovered the thesaurus or are we to believe that this is part of your everyday vocabulary?
“Considering that the goal of the SAT is discrimination, no other course of action makes sense.” I believe that there are two separate goals here held by different groups of people. One does indeed favour discrimination but the other is horrified by it. The latter currently is in political favour.
A separate question(s): Why are SAT scores offset by, I believe, 200 points? Why not just grade from zero to 100%? I have my own explanation for this and it is similar to the points that I made above. It should also be noted that the SAT is not the only test that does this.
Obsession with the SAT is peculiar since its power for predicting college grades is quite low (10-15%). Schools use it to classify themselves on “selectivity” scales, but test scores are correlated with the wealth of students so they’re selecting on opportunity as well as aptitude. Some schools now use scores only for merit scholarship qualifications instead of admission standards. Males and females perform differently on the test too — males do better on the math, females on the reading. Some of this may result from guessing strategies (males are more willing to guess, but females are reluctant, according to a recent, limited study).
As for the timing of the changes, it’s mostly a response to its competitor, the ACT test, which has been gaining in popularity. The ACT is a friendlier test that reflects more of what was studied in high school. The SAT has been more of a cleverness and speed test. Most schools take either and students find the ACT “easier.”
I did fair to OK on the SAT’s when I took them many years ago but ironically, when I had to take the Miller Analogy Test for graduate school admission 2 years ago, I scored in the top 2% at age 57 so I am happy to agree with the erudite and insightful Dr. Briggs when he wrote “Analogies are difficult and require higher levels of thinking than other types of question.”
“Obsession with the SAT is peculiar since its power for predicting college grades is quite low (10-15%).” Certainly if a college only accepts the top SAT students the predictive value will be low because the range of test scores is narrow. Remember that the low SAT students were not admitted. This is not a valid criticism of the test. In the same way a test of football ability in high school would not be a good predictor of success in the NFL for those drafted into the league.
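The range-restriction point can be illustrated with a small simulation. This is only a sketch on made-up data: the linear relationship between score and grades, the noise level, the 10% admission cutoff, and the names `pearson` and `simulate` are all assumptions for illustration, not estimates from real SAT records.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n=20000, slope=0.6, top_fraction=0.1, seed=0):
    """Return (correlation among all test takers, correlation among admitted).

    Hypothetical model: standardized test score ~ N(0, 1), and
    college grades = slope * score + independent noise.
    """
    rng = random.Random(seed)
    scores = [rng.gauss(0, 1) for _ in range(n)]
    grades = [slope * s + rng.gauss(0, 0.8) for s in scores]
    r_all = pearson(scores, grades)

    # Admit only the top fraction of scorers, as a selective college would.
    cutoff = sorted(scores)[int(n * (1 - top_fraction))]
    admitted = [(s, g) for s, g in zip(scores, grades) if s >= cutoff]
    r_admitted = pearson([s for s, _ in admitted], [g for _, g in admitted])
    return r_all, r_admitted


r_all, r_admitted = simulate()
print(f"all test takers: r = {r_all:.2f}")
print(f"admitted only:   r = {r_admitted:.2f}")
```

Under these assumptions the correlation computed only among the admitted students comes out well below the correlation in the full pool, even though the same relationship generated every data point.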
“test scores are correlated with the wealth of students”. There is the possibility that the wealth of the parents is highly correlated with the intelligence of the children. Genetic inheritance and all that.
“males are more willing to guess, but females are reluctant”. You reject the possibility of inherent differences between the sexes?
“Most schools take either and students find the ACT ‘easier.’” There may very well be an arms race between the tests, but the underlying driving force may still be political. After all it is the use of the tests by the colleges that counts and not the preference of the students.
Anybody who has been to college knows that men are better at math than women, but as Larry Summers found out you had better not say that in public. The feminists will be demanding your head for saying politically incorrect things. Consider the winners in the Putnam mathematical competition.
http://en.wikipedia.org/wiki/Putnam_exam
@ Gary
“….but test scores are correlated with the wealth of students so they’re selecting on opportunity as well as aptitude.”
Are people with higher incomes smarter than people with lower incomes? And thus, their offspring are smarter?
Surely it depends on whether the role of the SAT is to distinguish between students in a single year, or to distinguish between students across all years?
If you are using them for recruitment, say, and you have a choice of a 30-year old with a medium score and a 20-year old with a high score, how do you weight them?
I think getting rid of the essay question is good. The essay leaves too much up to the subjective judgement of the grader. It’s an easy place for ideologies and political correctness to sneak in.
“quondam”. Have you discovered the thesaurus or are we to believe that this is part of your everyday vocabulary?
Certainly for me the term is quotidian. (And neither “prevaricator” nor “sagacious” strikes me as esoteric. Wait. Is “esoteric” an esoteric word?)
+ + +
Analogies were removed because girls scored uniformly lower than boys. There is a regular step in test calibration that seeks to identify questions ‘always’ or ‘never’ answered correctly, and questions answered incorrectly by specified subgroups of takers.
Consequently, it is almost impossible nowadays to make an analogy and not find it misconstrued by some GenXer or younger. A is to B as C is to D is taken as a claim that A is just like C, or that B is like D. Why can we not react to test differentials by acting on improving the teaching rather than trimming the yardstick?
YOS, dang I wish that I knew about “quotidian” when I wrote that sentence!
“improving the teaching rather than trimming the yardstick?” Another good one.
“but test scores are correlated with the wealth of student”
Boy has this trivially true, yet shallow and politically correct interpretation been making the rounds lately.
Let’s define smart as people who score well on the SAT.
Smart people tend to have higher incomes. Smart people tend to meet and marry other smart people. Smart couples tend to pass along superior genes to their offspring (never, ever, say this out loud in progressive company). Smart couples tend to be responsible parents and instill a value of education in their offspring.
And the cycle continues.
And the thing is, this isn’t some double secret plan all the “wealthy” people have. They are quite transparent about how this process helps one become successful. Feel free to follow this plan, and pay no-one!
“Obsession…SAT….predicting college grades…”
What, exactly, is the correlation between grades and learning? Consider graduate school, where a “C” is akin to failing. Or grade inflation in general.
I think the SAT is a better assessment of knowledge (at least when it had analogies) than a GPA.
An oldie but goodie: West Virginia 8th grade exam from 1931…
http://www.washingtonpost.com/wp-srv/education/v/tests.pdf
Why can we not react to test differentials by acting on improving the teaching rather than trimming the yardstick?
I suspect it’s because the number of teachers who are unable to understand the SAT analogy questions is increasing. Gosh, if the teachers can’t figure ’em out there must be something wrong with the test!
As to SAT prepping: these strike me as learning how to read and understand the questions along with practice in answering them. I don’t see this as a bad thing unless SAT has been attempting to measure how the students answer when surprised by novel questions.
If the standards achieved by pupils are changing, then it does make sense to change the SAT to reflect the distribution. On the other hand, doing so conceals the underlying change.
Perhaps a useful compromise might be to make the necessary changes – but also publicise the fact of, and reason for, the change. Of course, this is NOT going to happen.
P-VALUES : STATISTICAL THEORY :: MARXISM : POLITICAL THEORY
is, of course, the wrong answer.
The correct answer is:
P-VALUES : STATISTICAL THEORY :: PC-VALUES : POLITICAL THEORY
@ Jim S and @ Tom Sharf
You will need to define “smarter” I suppose. Is the kid who makes millions as a pro athlete but who scored a 10 on the Wonderlic less smart than the one who scored a perfect 1600 on the SAT but couldn’t run a successful business? All I’m saying is that higher SAT scores *tend* to be achieved by those who had better educational opportunities, good schools, and good families — all things that are associated with higher socio-economic status. So scores measure more than native aptitude. That doesn’t make them bad. And recognizing it doesn’t bow to political correctness. I actually think the SAT/ACT tests, flawed as they are, are acceptable instruments for judging college readiness because they’re standardized, unlike high school grades, recommendations, and the application essay.
Here’s a secret: for most students the SAT score really doesn’t matter. You can get a good education (or what passes for it today) at just about any school. High scores might help marginally for getting into a prestigious school. If that’s what you’re after, fine. But to learn, really all you need is a good mentor/adviser who will show you how to teach yourself. Information is ubiquitous now, knowing how to evaluate and use it is what learning is about. Another secret: getting into college is far less important than getting everything possible out of your educational investment, wherever it’s located.
@GP
There may be some correlation between grades and learning, but probably not much. That’s why colleges are moving toward different kinds of learning assessments besides grades. These have their drawbacks too. Learning is a difficult thing to measure until you can evaluate performance in real life tasks.
Gary,
We may disagree on cause (high SAT) and effect (wealth) here. They are intertwined so it will probably be hard to conclude that debate.
Many (maybe not you) quote the wealth correlation along with access to high-cost SAT mentoring to infer that the wealthy have an unfair advantage. I think the advantage is mostly due to family culture and better genetics. Tutoring, I believe, has been shown to improve scores by only about 30 points on average.
I agree that the reality is you can get a good education almost anywhere (or self educate). The cat fight over getting into only the most selective schools is a bit misplaced from a purely knowledge viewpoint.
However, if you graduate from MIT, Stanford, or Harvard, you will get better opportunities in today’s world. It’s not so much that these schools give students a markedly better education than a slightly less selective school, it is that the public believes they do. Look at who the media quotes in science articles, the same schools come up over and over.
“males are more willing to guess, but females are reluctant”.
The SAT supposedly corrects for that by penalizing wrong answers more than blank answers, though I’ve heard that will change with the next revision of the SAT. Here’s the theory behind their correction:
If every question has 5 possible answers, only one of which is right, someone guessing at all the answers will, on average, get 20% of the questions correct and 80% wrong, but really knows as much as the person who left all the answers blank. By making a correct answer score 1 point, a blank score 0 points, and a wrong answer score -0.25 points, the person who guessed on, say, a 100-question test will get 20 points for their correct answers but lose 20 points for their wrong answers (80 × 0.25), scoring zero, the same as the person who left all the answers blank.
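The arithmetic can be checked with a few lines of Python. This is a minimal sketch: the function name `expected_score` and the 100-question test length are assumptions for illustration, not anything taken from the actual SAT scoring rules beyond the +1 / 0 / -0.25 scheme described above.

```python
from fractions import Fraction


def expected_score(n_questions, n_choices=5, penalty=Fraction(1, 4)):
    """Expected raw score for someone who guesses blindly on every question.

    Scoring assumed: +1 for a right answer, -penalty for a wrong one,
    0 for a blank. Exact fractions avoid floating-point rounding.
    """
    p_right = Fraction(1, n_choices)
    p_wrong = 1 - p_right
    return n_questions * (p_right - p_wrong * penalty)


# Blind guesser on a hypothetical 100-question test, with the penalty.
print(expected_score(100))
# Same guesser if wrong answers cost nothing.
print(expected_score(100, penalty=Fraction(0)))
```

With the quarter-point penalty the blind guesser's expected score equals that of a blank answer sheet; with no penalty, guessing is worth an expected 20 points on the same test.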
@ Tom
We’re on the same page regarding public perception and the prestige factor.
@ Anthony
Being able to eliminate one or two choices and then guessing has an advantage even with the penalty. The study I read found the females to be more risk averse and also less likely to guess when the odds favored them.
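A rough sketch of why elimination changes the odds, assuming five choices per question, a quarter-point penalty for a wrong answer, and a uniform guess among whatever choices remain (the name `guess_value` is made up for this example):

```python
from fractions import Fraction


def guess_value(n_remaining, penalty=Fraction(1, 4)):
    """Expected points from guessing uniformly among n_remaining choices."""
    p_right = Fraction(1, n_remaining)
    return p_right - (1 - p_right) * penalty


# Expected value of a guess after ruling out 0, 1, 2, or 3 of 5 choices.
for eliminated in range(4):
    remaining = 5 - eliminated
    print(f"eliminated {eliminated}: expected points = {guess_value(remaining)}")
```

With nothing eliminated the expected value of a guess is zero, the same as a blank; once even one choice is ruled out it turns positive (1/16 of a point, then 1/6, then 3/8), so a test taker who declines to guess in those cases gives up points on average.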
Learning assessment is moving beyond grading — see http://www.learningoutcomeassessment.org/knowingwhatstudentsknowandcando.html