Assessing Teaching Practice

April 15, 2012

Tony Milanowski

Done well, evaluation of teacher performance has the potential to improve student achievement. Done indifferently, it can leave student achievement stagnant. Too often, teachers are evaluated by a single administrator with minimal training, who rates the teacher satisfactory or unsatisfactory based on a single classroom observation.

Policymakers are disappointed by the tendency of current performance evaluation practices to rate almost every teacher alike. Many have turned to performance measures based on students' standardized test scores. But even when used correctly, these measures don't provide enough information to improve teacher performance. Outcome measures, such as those based on value-added, and measures of instructional practice need to be used together.

Former WCER researcher Tony Milanowski, now with Westat, urges evaluators to adopt recent advances in measuring teaching practice and to use them along with the developing technology for measuring outcomes. Milanowski recommends that states and districts develop practice measures by translating their visions of effective instruction into explicit models of competency. These would include what teachers need to know and do to carry out state or district priorities. The model can serve as the foundation for a set of practice measures that includes

  • Observational rubrics for performance evaluation and management,
  • Performance assessments that would be part of tenure and pay systems, and
  • Walkthrough tools for day-to-day performance management and for evaluating the implementation of instructional strategies.

Developing a teacher competency model doesn't require reinventing the wheel, Milanowski says. Designers can start with existing teaching standards, for example, state standards or those developed by the National Board for Professional Teaching Standards. It's good to begin with an existing model, he says, because it captures the aspects of teaching that are similar across states and districts. Evaluation designers can then add competencies that reflect the needs of a particular state or district, and those needed to support local instructional initiatives and strategies.

Milanowski says three measurement systems are needed to assess teaching practice: (1) observations of classroom practice, for use in periodic formal teacher evaluation, (2) teaching "work samples," or performance assessments, for decisions including granting tenure or movement up a career ladder, and (3) classroom walkthroughs that provide information for everyday performance management.

Productive classroom observations use a measurement system that is reliable and valid. Evaluations should help teachers learn from the results. Feedback should be specific and should refer to the rating scale. It should help teachers understand why they received the scores they did. A trained person should be available to assist teachers who want to improve. Other kinds of professional development should be available and linked to the competencies.

Work samples, or performance assessments, complement classroom observations. They better portray the teacher's content knowledge, instructional planning skills, use of formative assessment data, and differentiation of instruction. Teachers describe examples of practice related to specific competencies in response to prompts or questions. They share artifacts, including unit or lesson plans, assignments, completed student work, and assessments.

Formal observations, even frequent ones, can't provide a clear picture of typical instructional practice, especially about how key instructional strategies are routinely implemented. Classroom walkthroughs (brief, focused visits) are more efficient for this purpose. Walkthroughs get school leaders, instructional coaches, and mentors into classrooms frequently enough to see whether key instructional strategies are being implemented.

Together, the three measurement systems provide a rounded picture of instructional practice. These tools can then be combined with measures of teaching productivity, such as value-added.

Value-added Measures

As an emerging technology for measuring student performance, and thereby a tool for evaluating teachers, value-added measures have raised some questions, but Milanowski says they are the best productivity measure available. Value-added measures compare average student academic growth in a school to the average growth of similar students across a district, and they can account for such factors as a student's prior performance level or socioeconomic status. Many states are adopting value-added methods, and recent federal education policy has promoted them (e.g., the Race to the Top competition).
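The comparison described above can be illustrated with a small sketch. It is not the model any particular state or district uses: it assumes a single prior-year score as the only control, fits one straight line across the whole district, and treats a school's average gap between actual and expected scores as its value-added. The function names and the data are hypothetical.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

def school_value_added(students):
    """students: list of (school, prior_score, current_score) tuples.

    Returns each school's mean residual: how far its students scored,
    on average, above or below what the district-wide line predicts
    from their prior scores.
    """
    priors = [s[1] for s in students]
    currents = [s[2] for s in students]
    intercept, slope = fit_line(priors, currents)
    residuals = {}
    for school, prior, current in students:
        expected = intercept + slope * prior
        residuals.setdefault(school, []).append(current - expected)
    return {school: mean(r) for school, r in residuals.items()}

# Made-up scores for two schools in one district (illustration only).
district = [
    ("A", 40, 52), ("A", 60, 71), ("A", 80, 90),
    ("B", 40, 44), ("B", 60, 65), ("B", 80, 86),
]
value_added = school_value_added(district)
```

In this toy data, students at school A score above the district expectation for their starting point and students at school B score below it, so A's estimate is positive and B's is negative. Operational models typically add further controls, such as socioeconomic status, and handle measurement error more carefully.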

The best foundation for making management decisions about teachers is using value-added estimates of classroom productivity together with assessments of teaching practice. But Milanowski cautions that averaging them into one overall measure of teacher performance is not as simple as it seems. They represent two different constructs and have different measurement properties. The two scores can be used together, but simply adding them would be like adding a person's weight and height.

Setting cutoff points for the value-added measure will require substantial thought, because there is no natural cutoff point that represents acceptable performance. Schools should use multiple years of value-added data for such decisions as tenure, pay raises, or termination, and the cutoff for tenure or termination decisions should take into account the variability of the value-added measures. Generally, the cutoff should be set well below the value-added average.
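One way to picture a cutoff that respects variability is sketched below. It is an illustrative assumption, not a procedure from the article: it averages a teacher's yearly estimates, computes a rough standard error from their spread, and flags the teacher only if the upper end of the resulting interval still falls below the cutoff. The function name, the cutoff value, and the multiplier z are all hypothetical, and a normal-based interval is only a crude approximation with so few years of data.

```python
from math import sqrt
from statistics import mean, stdev

def clearly_below_cutoff(yearly_estimates, cutoff, z=1.96):
    """Return True only if the multi-year average is below the cutoff
    by more than its estimated uncertainty.

    yearly_estimates: value-added estimates for one teacher, one per year,
    expressed relative to a district average of 0 (an assumption).
    """
    if len(yearly_estimates) < 2:
        raise ValueError("need at least two years of estimates")
    avg = mean(yearly_estimates)
    std_err = stdev(yearly_estimates) / sqrt(len(yearly_estimates))
    return avg + z * std_err < cutoff

# Hypothetical cutoff set well below the average of 0.
CUTOFF = -0.5

consistent = clearly_below_cutoff([-0.9, -1.1, -1.0], CUTOFF)  # stable, low
volatile = clearly_below_cutoff([-0.2, -1.2, -0.4], CUTOFF)    # low but noisy
```

Here the first teacher's estimates are consistently low, so the flag is raised. The second teacher's average is also below the cutoff, but the year-to-year swings are large enough that the flag is not raised.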

Value-added productivity measures should also be used to evaluate the quality of teaching practice measures and the effects of human capital management programs.

[Adapted from the article "Strategic Measures of Teacher Performance," Kappan, v92 n7, April 2011, pp. 19-25.]