Reliability Concerns for Classroom Formative Assessment

This week, I started teaching a course called Assessment: Theory and Practice to graduate students in the Leadership program at Saint Mary’s University. More than 70% of the students in the class are K-12 teachers. In a class like this, reliability and validity are naturally big topics. In fact, next week’s class (5 hours of class time) will be spent on these two topics. So, I guess it’s the right time to return to them on this blog.

My last blog post on this topic, on May 25, covered reliability concerns for classroom summative assessments. Today, let’s discuss reliability concerns for classroom formative assessment. Recall that we defined reliability as the degree to which students’ results remain consistent over time or over replications of an assessment procedure. An important point to remember is that reliability is a necessary, but insufficient, condition for valid score-based inferences. That is, you cannot make valid inferences from a student’s test score unless the test is reliable, but a reliable test does not by itself guarantee that those inferences are valid.
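
For readers who like to see the arithmetic, here is a minimal sketch of one common way to put a number on "consistent over replications": correlate students' scores from two administrations of the same short quiz (a test-retest correlation). This is only an illustration, not a method taken from Nitko and Brookhart. The function name `pearson_r` and the scores for the five students are made up for the example.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around each mean
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical scores (out of 10) for five students on two administrations
first_try = [7, 5, 9, 4, 8]
second_try = [8, 5, 9, 3, 7]

test_retest_r = pearson_r(first_try, second_try)
print(round(test_retest_r, 2))
```

A value close to 1 suggests that students kept roughly the same rank order across the two administrations. With only a handful of students, treat the number as a rough signal rather than a precise estimate.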

So, what are the reliability concerns for classroom formative assessment? Before we answer that question, let’s first define what we mean by “formative assessment.” Let’s use Popham’s definition (he always seems to have nice, clear definitions). Professor Popham (2009) says “Formative assessment is not a test. Rather, it is an ongoing process in which teachers use test-elicited evidence to adjust their instruction or students use it to adjust their learning tactics.” Basically, formative assessment is a process by which teachers use assessment results to improve teaching and to help guide student learning. You can also find more of our thoughts on formative assessment here.

Let’s return to Nitko and Brookhart (2011) to summarize the reliability concerns for classroom formative assessment and what you can do to address them.

1. For oral questioning, you should be concerned about the dependability of the interpretation of answer(s) and the accuracy of teacher judgment. To increase reliability, you should:

  • Use a sufficient number of questions or observations
  • Allow enough time

I really like this last suggestion. All too often, I find that teachers (me included) tend to answer their own questions too quickly when students don’t readily provide an answer. Give students enough time to think about an answer. If they don’t answer in 10-15 seconds, wait it out. Give them several more minutes. Usually, they eventually provide a very good answer.

2. For observations, you should also be concerned about the dependability of interpretation and accuracy of teacher judgment. To increase reliability, you should:

  • Interpret the observed behavior with the most likely and reasonable explanation.
  • Use a systematic procedure to observe students.

3. For self-assessment, you should be concerned about the accuracy of the rater’s (that is, the student’s own) judgment. To increase reliability, you should:

  • Have a systematic procedure for rating. Often, it is good to use a rubric.
  • Instruct students on the use of the rubric.
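
One way to check how dependable rubric-based self-ratings are is to compare them with a second set of ratings of the same work. The sketch below computes Cohen's kappa, a standard chance-corrected agreement statistic. Using the teacher's ratings as the comparison is my assumption for the sake of illustration, and the function name `cohen_kappa`, the 1-4 rubric scale, and the ratings are all hypothetical.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two raters on the same items."""
    n = len(ratings_a)
    if n == 0 or n != len(ratings_b):
        raise ValueError("need two non-empty, equal-length lists of ratings")
    # Proportion of items on which the two raters gave the same rubric level
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from each rater's own level frequencies
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[level] * count_b[level] for level in count_a) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both raters use a single level")
    return (observed - expected) / (1 - expected)


# Hypothetical rubric levels (1-4) for six pieces of student work
self_ratings = [3, 2, 4, 3, 1, 2]
teacher_ratings = [3, 2, 3, 3, 1, 1]

kappa = cohen_kappa(self_ratings, teacher_ratings)
print(round(kappa, 2))
```

A kappa near 1 means the two sets of ratings agree far more often than chance would predict, and a kappa near 0 means agreement is about what chance alone would produce. Low agreement may be a sign that students need more instruction on how to use the rubric.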

So, how reliable are your assessment procedures? How do you increase reliability? How do you measure (compute) the reliability of your assessment procedures? Please leave a comment. I’d love to hear and learn from you.

In the next post in this series, I will consider validity concerns.


Nitko, A. J. & Brookhart, S. M. (2011). Educational Assessment of Students (6th Edition). Boston, MA: Pearson.

Popham, W. J. (2009). A process – not a test. Educational Leadership, 66(7), 85-85.
