“2. Judgement calibration test
The Judgement Calibration test is supposed to do two things: first, make sure that students have really read the material and know its content; and second, test whether they can properly calibrate their confidence regarding the truth of their own answers.”
This is really cool, Simon, and awesome that you actually got permission to give real grades by this mechanism. Curious how it works out in practice!
Great that you like this testing idea, Siebe!
I just tried out the judgement calibration test with my Munich students, using 25 relatively difficult-to-assess statements.
Main finding: Roughly half scored better and half scored worse than they would have if they had just uniformly answered “50%”. I guess this indicates that the test was slightly too difficult. Notably, I had included many statements about orders of magnitude (e.g. energy released by the sun in a year, or time scales), and those seem to be challenging.
But the best students had a mean square deviation of the estimate from the truth value of about 16%, compared with the 25% that uniformly answering “50%” would give, which I guess is quite good.
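For anyone who wants to reproduce this kind of comparison, here is a minimal sketch. It assumes the score is the plain mean of the squared difference between the stated probability and the truth value (1 for true, 0 for false), which is usually called the Brier score. The exact grading formula is not spelled out in this thread, so the function name and the sample answers below are hypothetical and only illustrate the idea.

```python
def mean_square_deviation(estimates, truths):
    """Mean of (estimate - truth)^2.

    estimates: stated probabilities in [0, 1] that each statement is true.
    truths: booleans (or 0/1) giving the actual truth value of each statement.
    Assumption: this plain Brier-style score stands in for the real grading rule.
    """
    if len(estimates) != len(truths):
        raise ValueError("estimates and truths must have the same length")
    if not estimates:
        raise ValueError("need at least one statement")
    return sum((p - float(t)) ** 2 for p, t in zip(estimates, truths)) / len(estimates)


# Hypothetical answer key for five statements.
truths = [True, False, True, True, False]

# A student who answers "50%" everywhere always scores 0.25.
always_half = [0.5] * len(truths)

# A reasonably calibrated student (made-up numbers).
calibrated = [0.9, 0.1, 0.8, 0.7, 0.4]

# An overconfident student who is fully certain and wrong on two statements.
overconfident = [1.0, 1.0, 0.0, 1.0, 0.0]

baseline_score = mean_square_deviation(always_half, truths)
calibrated_score = mean_square_deviation(calibrated, truths)
overconfident_score = mean_square_deviation(overconfident, truths)

print(f"always 50%:    {baseline_score:.3f}")
print(f"calibrated:    {calibrated_score:.3f}")
print(f"overconfident: {overconfident_score:.3f}")
```

Lower is better under this score. Scoring worse than the 0.25 baseline means the student's confident answers were wrong often enough that hedging at 50% would have paid off, which is one way to read the "roughly half scored worse" finding.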