
Ruminating on Rubrics

An abstract of two related papers:

Howe, Alice A. 1997. Reliability Study on the Use of a Rubric in Elementary Science. Research Paper Presented in Partial Fulfillment of Requirements for the Degree of Master of Arts. Adams State College. Alamosa, Colorado.
Liu, Katherine. 1995. Rubrics Revisited. The Science Teacher. October 1995: 49-51.

Abstract prepared by: Chuck Downing, PhD.

Previous papers in this section have "abstracted" single articles. This one is different in that it uses two sources for the information presented. "Reliability Study on the Use of a Rubric in Elementary Science" is a research paper done by Alice Howe as part of her Master's program at Adams State College in Alamosa, Colorado. I attended her session at the NSTA Regional Convention in Denver, and obtained a copy of her paper from her advisor. "Rubrics Revisited" is an article by AE's own Kathy Liu. I remembered reading it in The Science Teacher, and remembered that it accelerated my thinking in this area.

If you are a "good" science teacher, you have changed in the recent past. You have changed your curriculum, your delivery, and your assessment. Well, at least you tried to change your assessment. What you probably discovered was that changing from traditional assessment to alternative assessment required more than a new answer sheet. This paper looks at one key element in assessment reform: scoring.

First, let's define some terms. While you don't have to agree with these definitions, you do have to know what is meant by their use in this paper. Understand that the definitions of assessment types are intentionally restrictive: there are times when each type of assessment is the most appropriate.

Traditional Assessments. These "do not measure the broad range of scientific processes of higher order thinking skills." (Howe, p. 5) They involve answers that require recall of facts and/or recognition of known phenomena.

Alternative Assessments. These normally extend beyond the paper/pencil boundary. Two types of alternative assessments are commonly used: authentic assessments and performance-based assessments. Authentic assessment is performed more in the context of an activity, simulating "real life" situations. Performance-based assessment determines how a student performs on a given task.

Rubrics. These are specific sets of criteria that clearly define for both student and teacher what a range of acceptable and unacceptable performance looks like. Criteria define descriptors of ability at each level of performance and assign values to each level. The levels referred to are proficiency levels, which describe a continuum from an excellent to an unacceptable product. If you are familiar with rubrics at all, it is probably through scoring of student written work; rubrics for written work are fairly common. If you are like me, you have been suspicious of rubrics. After all, they are so subjective. Right?

Well, I've been researching this rubric mania for about two years. I've heard several people speak about rubrics and witnessed others demonstrate their use. What I have discovered is that rubrics are like a lot of other educational strategies: very effective if used properly, but loaded with the potential for disaster in the hands of an inept, inexperienced, or insensitive instructor.

Kathy Liu describes two types of rubrics: traditional and additive. A traditional rubric for a PCR process is shown below. This is probably the type of rubric with which you are familiar. Student work is graded holistically, usually by judging the paper from the high score end of the rubric and subtracting points based on interpretation of the criteria.

A score of 1, 2, 3, 4, or Honors is assigned according to the following criteria:


Science Content: Student understands the PCR process.

  1. Student does not understand the PCR process and its applications.
  2. Student understanding of the PCR process and its applications is vague.
  3. Student understands most of the PCR process and some of its applications.
  4. Student shows good understanding of the PCR process and its applications.
  5. Student has mastered the chemistry, dynamics and applications of the PCR process.

Collaborative Worker:

  1. Student does not participate and/or is disruptive.
  2. Student works, but not with other group members and/or does not take appropriate care of materials.
  3. Student does not facilitate learning of others. Handles some group roles well some of the time.
  4. Student sometimes facilitates learning of others. Handles group roles well most of the time.
  5. Student facilitates his/her learning and that of classmates. Takes any role in group and contributes to group process. (After Liu, p. 50)

Liu's second type rubric is the additive rubric. "With an additive rubric, students have to learn more content in greater depth to achieve higher levels." (Liu, p. 49) Take a look at the following example of an additive rubric for the same PCR experience and compare it to the traditional rubric above.

Achievement Level is assigned the values of 1, 2, 3 (Level 2+), 4 (Level 3+), or Honors (Level 4+) according to the following criteria:


Science Content: Student understands the PCR process.

  1. Level 2 tasks attempted but not completed or mastered.
  2. Demonstrate how a primer works in the process of DNA replication. Prepare cheek cells for PCR.
  3. Demonstrate what happens to DNA during each step of the PCR process.
  4. Describe advantages and disadvantages of the PCR technique.
  5. Explain ways you might get more bands than predicted in a PCR.

Collaborative Worker:

  1. Participates but does not successfully complete one or more requirements of Level 2.
  2. Arrives on time with materials. Shows respect for others; cares for equipment and resources.
  3. Stays focused on assigned task and helps others do the same. Shares work equally.
  4. Facilitates the participation of all in group. Tutors and/or supports other students.
  5. Takes all group roles with equal skill. Assists others as they learn to do the same. (After Liu, p. 50)

With the additive rubric, each student knows the minimal level of learning expected (Level 2). To achieve higher levels, more specific content must be mastered. There are fewer areas of ambiguity: students know what is expected, as opposed to interpreting terms such as "some" and "most" in the traditional rubric.
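The cumulative logic described above can be made concrete in a short sketch. This is my own framing, not code from Liu's article: it treats each level's tasks as mastered or not, and awards the highest level for which every level from 2 upward has been mastered, matching the rubric's rule that a student who misses Level 2 stays at Level 1.

```python
# Illustrative sketch of additive-rubric scoring (hypothetical helper,
# not from Liu). "mastered" is the set of level numbers whose tasks the
# student completed successfully.
def achievement_level(mastered, max_level=5):
    """Return the highest level n such that every level 2..n is mastered."""
    level = 1  # Level 1: Level 2 tasks attempted but not completed
    for n in range(2, max_level + 1):
        if n in mastered:
            level = n
        else:
            break  # a gap at any level caps the achievement there
    return level

# Example calls with hypothetical students:
achievement_level({2, 3})        # -> 3 (Levels 2 and 3 mastered)
achievement_level({2, 3, 4, 5})  # -> 5 (Honors-level work)
achievement_level({3, 4})        # -> 1 (Level 2 missed, so higher work
                                 #       does not raise the score)
```

The design point is that higher levels never substitute for lower ones; they only add to them, which is exactly what distinguishes the additive rubric from holistic subtraction.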

Liu lists five reasons for using rubrics:

  1. Rubrics tell students they must do a careful job. Information on the expected quality of the task performed is given to students.
  2. Rubrics set standards. Students know in advance what they have to do to achieve a certain level.
  3. Rubrics clarify expectations. When levels are based on a "minimum expectation" (e.g., Level 2), everyone knows what is required. This is especially important in heterogeneously-grouped classrooms.
  4. Rubrics help students take responsibility for their own learning. Students use rubrics to help study information the teacher values.
  5. Rubrics have value to other stakeholders. Anyone (including parents and community members) seeing an additive rubric and a student score based on that rubric knows what content was mastered by that student.

If rubrics are so cool, why don't more teachers use them? Howe offers some insight into this aspect of rubrics as a strategy. Quality rubrics are not cast in stone; they are revised based on student work. The use of "anchor papers" or exemplars at each level of achievement in constructing or modifying a rubric is advocated. By using student work as a basis, teachers are more likely to be realistic in their expectations of students. The measure of student achievement must be based on several similar tasks; various researchers advocate scoring between six and 20 tasks before a level of mastery can be determined accurately. Well-designed rubrics should be shared with students. Most students are much more critical of their peers than teachers are of their students. Allowing students to use rubrics during organized, teacher-monitored peer review sessions will generate higher quality products over time.

Teachers need training in the use of rubrics. Most people will not embrace a new or unfamiliar idea because of the "fear factor." One common worry about rubrics is that scoring will vary more from one teacher to another with a rubric than with an answer key. Studies have shown that trading student papers with another teacher and scoring them with a common, teacher-generated rubric increases reliability, often to a correlation value beyond 0.80.

Howe's study included 47 very heterogeneous students: 29 "regular education," 8 "at risk," and 11 "special education" (including two with traumatic brain injuries). Three independent evaluators were used in scoring the students. Their inter-evaluator reliability was determined by scoring 15 pieces of work from each student. Inter-evaluator reliability was very high, between 0.85 and 0.93.

The rubric "scale" used in this study is unique and deserves comment. Rather than place numeric values on each item, an unnumbered line 10 cm in length was included after each item. Evaluators marked their "score" on the line. Numeric values were determined by measuring from the left end point in millimeters.
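To make the arithmetic behind this scale and the reliability figures concrete, here is a small sketch. It is not from Howe's paper: the marks are hypothetical, the conversion simply divides the millimeter measurement by the 100 mm line length, and reliability is checked with an ordinary Pearson correlation between two evaluators' scores on the same papers.

```python
# Illustrative sketch (hypothetical data, not from Howe's study):
# convert marks on a 10 cm unnumbered line to scores, then compute a
# Pearson correlation as a measure of inter-evaluator reliability.
from math import sqrt

def mark_to_score(mark_mm, line_length_mm=100):
    """Convert a mark, measured in mm from the left end, to a 0-1 score."""
    return mark_mm / line_length_mm

def pearson_r(xs, ys):
    """Pearson correlation between two evaluators' score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical marks (in mm) from two evaluators on the same 5 papers.
eval_a = [mark_to_score(m) for m in [82, 45, 91, 60, 30]]
eval_b = [mark_to_score(m) for m in [78, 50, 88, 65, 28]]
r = pearson_r(eval_a, eval_b)  # close agreement gives r near 1.0;
                               # 0.80 is the target cited in the studies
```

Note that the correlation is unchanged by the mm-to-score division; the conversion matters only for reporting scores on a common scale.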

Interestingly enough, despite training of both evaluators and students in rubric use for this study, the result with the highest percentage of "undecided" evaluations was on confidence in the ratings of student work. This lack of confidence was common to both students and adult evaluators. Howe suggests that the lack of collaboration and feedback necessitated by the terms of the study contributed to this lack of confidence. She goes on to report that it was significant that even with the feeling of uneasiness about using the instrument, the regular education students, at-risk students, and all three evaluators showed high reliability on their assessments. These results supported the rationale that clear criteria, as evidenced by a rubric, positively impact rating reliability. (p. 24)

One final finding of interest: Howe's study indicated mean reliability ratings of 0.90-0.94 between outside adult evaluators and regular education students evaluating the same pieces of work. Mean reliability ratings between outside adult evaluators and at-risk students evaluating the same pieces of work ranged from 0.87 to 0.94. Although the special education students' mean reliability ratings compared to those of the outside evaluators fell below the target 0.80 rating, comparative values still ranged from 0.75 to 0.92.

Both Kathy Liu and Alice Howe use rubrics in class and use them effectively. If you are unfamiliar with rubrics and would like to know more, here are some references you might use to begin accelerating your learning curve.

Suggested Bibliography

Jensen, K. 1995. Effective Rubric Design. The Science Teacher. 62(5): 34-37.

O'Neil, J. 1994. Making Assessment Meaningful - "Rubrics" Clarify Expectations, Yield Better Feedback. Alexandria, VA: Association for Supervision and Curriculum Development.

Pate, P. 1993. Designing Rubrics for Authentic Assessment. Middle School Journal. 25: 25-27.

Wiggins, G. 1994. Assessment Rubrics and Criteria: Design Tips and Models. Alexandria, VA: Association for Supervision and Curriculum Development.
