Among the most frequent questions we receive here at Veritas Prep headquarters (sadly, “How much am I allowed to tip my instructor?” is not one of them!) is the genre of “On my most recent practice test, I got X right and Y wrong and only Z wrong in a row… Why was my score higher/lower than my other test with A right and B wrong and C wrong in a row?” inquiries from students desperately trying to understand the GMAT scoring algorithm. We’ve talked previously in this space about why simply counting rights and wrongs isn’t all that great a predictor of your score. And perhaps the best advice possible relates to our Sentence Correction advice here a few months ago: Accept that there are some things you can’t change and focus on making a difference where you can.
But we also support everyone’s desire to leave no stone unturned in pursuit of a high GMAT score and everyone’s intellectual curiosity with regard to computer-adaptive testing. So with the full disclosure that these items won’t help you game the system and that your best move is to turn that intellectual curiosity toward mastering GMAT concepts and strategies, here are four major reasons that your response pattern — did you miss more questions early in the test vs. late in the test; did you miss consecutive questions or more sporadic questions, etc. — won’t help you predict your score:
1) The all-important A-parameter.
Item Response Theory incorporates three metrics for each “item” (or “question” or “problem”): the B parameter is the closest measurement to pure “difficulty”. The C parameter is essentially a measure of likelihood that a correct answer can be guessed. And the A parameter tells the scoring system how much to weight that item. Yes, some problems “count” more than others do (and not because of position on the test).
Why is that? Think of your own life; if you were going to, say, buy a condo in your city, you’d probably ask several people for their opinion on things like the real estate market in that area, mortgage rates, the additional costs of home ownership, the potential for renting it if you were to move, etc. And you’d value each opinion differently. Your very risk-loving friend’s opinion may not be the one to value most on “Will I be able to sell this at a profit if I get transferred to a new city?” (his answer is “The market always goes up!”) whereas his opinion on the neighborhood itself might be very valuable (“Don’t underestimate how nice it will be to live within a block of the blue line train”). Well, GMAT questions are similar: Some are extremely predictive (e.g. 90% of those scoring 700 or higher get it right, and only 10% of those scoring below 700 do) and others are only somewhat predictive (60% of those at 700+ get it right, but only 45% of those below 700 do). On the second kind of question, getting it right merely whispers “above 700,” whereas on the first kind it screams it.
So while you may want to look at your practice test and try to determine where it’s better to position your “misses,” you’ll never know the A-values of any of the questions, so you just can’t tell which problems impacted your score the most.
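To make the “screams vs. whispers” idea concrete, here is a minimal sketch of the standard three-parameter logistic (3PL) curve from Item Response Theory. The two items and their parameter values are invented for illustration; they are not real GMAT items, and the actual GMAT scoring model is not public.

```python
import math

def p_correct(theta, a, b, c):
    """3PL curve: chance that a test taker of ability theta answers correctly.

    a = how sharply the item separates abilities, b = difficulty,
    c = chance of a correct guess.
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

# Two hypothetical items with the same difficulty and guessing chance.
# Only the A parameter differs.
screamer = {"a": 2.5, "b": 0.0, "c": 0.2}
whisperer = {"a": 0.4, "b": 0.0, "c": 0.2}

for name, item in (("screamer", screamer), ("whisperer", whisperer)):
    weaker = p_correct(-1.0, **item)    # test taker below the item's difficulty
    stronger = p_correct(1.0, **item)   # test taker above the item's difficulty
    print(f"{name}: weaker={weaker:.2f} stronger={stronger:.2f} gap={stronger - weaker:.2f}")
```

The high-A item shows a much wider gap between the weaker and stronger test taker, which is why a response to it tells the system more.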
2) Content balancing.
OK, you might then say, the test should theoretically always be trying to serve the highest value questions, so shouldn’t the larger A-parameters come out first? Not necessarily. The GMAT values balanced content to a very high degree: It’s not fair if you see a dozen geometry problems and your friend only sees two, or if you see the less time-consuming Data Sufficiency questions early in the test while someone else budgets their early time on problem solving and gets a break when the last ten are all shorter problems. So the test forces certain content to be delivered at certain times, regardless of whether the A-parameter for those problems is high or low. By the end of the test you’ll have seen various content areas and A-parameters… You just won’t know where the highest value questions took place.
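One way to picture content balancing is as a filter applied before the “most valuable question” choice. The sketch below assumes the test picks the item with the highest 3PL Fisher information at the current ability estimate, restricted to whatever content area is due next. The pool, the area labels, and the `next_item` helper are all hypothetical; the real GMAT selection rules are not published.

```python
import math

def p_correct(theta, a, b, c):
    """3PL probability of a correct answer."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def information(theta, a, b, c):
    """Fisher information of a 3PL item at ability theta."""
    p = p_correct(theta, a, b, c)
    return a * a * ((p - c) / (1.0 - c)) ** 2 * (1.0 - p) / p

# Hypothetical item pool (made-up ids, areas, and parameters).
pool = [
    {"id": "G1", "area": "geometry", "a": 0.6, "b": 0.1, "c": 0.2},
    {"id": "G2", "area": "geometry", "a": 0.9, "b": 2.0, "c": 0.2},
    {"id": "A1", "area": "algebra", "a": 2.0, "b": 0.0, "c": 0.2},
]

def next_item(theta, pool, required_area=None):
    """Pick the most informative item, limited to required_area if one is set."""
    candidates = [
        item for item in pool
        if required_area is None or item["area"] == required_area
    ]
    return max(
        candidates,
        key=lambda item: information(theta, item["a"], item["b"], item["c"]),
    )

print(next_item(0.0, pool)["id"])              # unconstrained choice
print(next_item(0.0, pool, "geometry")["id"])  # a geometry slot is due
```

When a geometry slot is due, the high-A algebra item is simply not eligible, so a lower-A item gets served at that position.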
3) Experimental items.
In order to know what those A, B, and C parameters are, the GMAT has to test its questions on a variety of users. So on each section, several problems just won’t count; they’re only there for research. And this can be true of practice tests, too (the Veritas Prep tests, for example, do contain experimental questions). So although your analysis of your response pattern may say that you missed three in a row on this test and got eight right in a row on the other, in reality those streaks could be a lot shorter if one or more of those questions didn’t count. And, again, you just won’t know whether a problem counted or not, so you can’t fully read into your response pattern to determine how the test should have been scored.
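A tiny sketch of the streak point: the response list and the flags marking which questions counted are both made up, since on a real test you never learn which items are experimental.

```python
def longest_miss_streak(correct_flags):
    """Length of the longest run of consecutive wrong answers."""
    longest = current = 0
    for is_correct in correct_flags:
        current = 0 if is_correct else current + 1
        longest = max(longest, current)
    return longest

# What the test taker sees: right, three misses in a row, then two right.
answers = [True, False, False, False, True, True]
# Hypothetical, hidden from the test taker: the third question is experimental.
counted = [True, True, False, True, True, True]

live_answers = [ok for ok, counts in zip(answers, counted) if counts]

print(longest_miss_streak(answers))       # streak as it looks on your review
print(longest_miss_streak(live_answers))  # streak among questions that counted
```

The visible streak of three shrinks to two once the experimental question is dropped.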
4) Item delivery vs. Score calculation.
One common prediction people make about GMAT scoring is that missing multiple problems in a row hurts your score substantially more than missing problems scattered throughout the test. The thinking goes that after one wrong answer the system has to reconsider how smart it thought you were; after two it knows for sure that you’re not as smart as advertised; and by the third it’s just asking “How bad is he?” In reality, however, as you’ve read above, the “get it right –> harder question; get it wrong –> easier question” delivery system is more nuanced than people think, because it also has to accommodate experimental questions and content balancing. So it doesn’t work quite like the conventional wisdom suggests.
What’s more, even when the test delivers you an easier question and then an even easier question, it’s not directly calculating your score question by question. It’s estimating your score question-by-question in order to serve you the most meaningful questions it can, but it calculates your score by running its algorithm across all questions you’ve seen. So while missing three questions in a row might lower the current estimate of your ability and mean that you’ll get served a slightly easier question next, you can also recover over the next handful of questions. And then when the system runs your score factoring in the A, B, and C parameters of all of your responses to “live” (not experimental) questions, it doesn’t factor in the order in which those questions were presented — it only cares about the statistics. So while it’s certainly a good idea to get off to a good start in the first handful of problems and to avoid streaks of several consecutive misses, the rationale for that is more that avoiding early or prolonged droughts just raises your degree of difficulty. If you get 5 in a row wrong, you need to get several in a row right to even that out, and you can’t afford the kinds of mental errors that tend to be common and natural on a high-stakes exam. If you do manage to get the next several right, however, you can certainly overcome that dry spell.
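The order-doesn’t-matter claim can be illustrated with a short sketch. It assumes the final score comes from a maximum-likelihood ability estimate under the 3PL model, which is a common textbook approach and not a confirmed description of the GMAT’s proprietary algorithm. The item parameters and responses are invented.

```python
import math
import random

def p_correct(theta, a, b, c):
    """3PL probability of a correct answer."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def log_likelihood(theta, responses):
    """Log-likelihood of the whole response set at ability theta."""
    terms = []
    for (a, b, c), is_correct in responses:
        p = p_correct(theta, a, b, c)
        terms.append(math.log(p if is_correct else 1.0 - p))
    # fsum returns a correctly rounded total, so term order cannot change it.
    return math.fsum(terms)

def estimate_theta(responses):
    """Grid-search maximum-likelihood ability estimate on [-4, 4]."""
    grid = [i / 100.0 for i in range(-400, 401)]
    return max(grid, key=lambda theta: log_likelihood(theta, responses))

# Hypothetical live items as ((a, b, c), answered_correctly),
# including three misses in a row.
responses = [
    ((1.2, -0.5, 0.2), True),
    ((0.8, 0.0, 0.2), True),
    ((1.5, 0.5, 0.2), False),
    ((1.0, 0.8, 0.2), False),
    ((2.0, 0.3, 0.2), False),
    ((0.9, -0.2, 0.2), True),
    ((1.4, 0.4, 0.2), True),
    ((1.1, 1.0, 0.2), True),
]

shuffled = responses[:]
random.Random(0).shuffle(shuffled)  # same responses, scattered order

print(estimate_theta(responses) == estimate_theta(shuffled))
```

The likelihood is a product over responses, so clustering the three misses or scattering them yields the same estimate for the same set of items. In a real adaptive test, of course, the order of your answers changes which items you are served in the first place.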
In summary, it’s only natural to look at your practice tests and try to determine how the score was calculated and how you can use that system to your advantage. In reality, however, there are several unseen factors that affect your score that you just won’t ever see or know, so the best use of that curiosity and energy is learning from your mistakes so that the computer — however it’s programmed — has no choice but to give you the score that you want.
By Brian Galvin