If our country hopes to improve teaching quality, we need to know what contributes to effective teaching. In the UTQ study, we are examining several instruments that are leading candidates for improved measurement of teaching quality. This will help us to develop a rich understanding of how these instruments relate to value-added measures of teachers' contributions to student learning. From this thorough investigation, we will develop recommendations for the next generation of quality measures. Unlike most studies, which consider only a very limited set of measures and teaching contexts, the breadth of this study will allow for a truly comprehensive investigation of teaching quality and the generalizability of findings across contexts.
The overall purpose of the UTQ project is to provide a solid foundation for the development of robust teaching evaluation systems that attend to central characteristics of teaching practice, including teachers' knowledge of content and how to teach it, effective instructional practices, and how to engage students in effective classroom assignments.
We see this foundational research as a first step in rethinking and redesigning teaching evaluation practices. We hope to understand which tools can be used in the near future to develop evaluation programs that can help teachers improve their teaching practice. The measurement tools we are studying include direct observations of classroom teaching, the analysis of classroom assignments and student work, paper-and-pencil measures of teachers' pedagogical and content knowledge for teaching in mathematics and language arts, and measures based on standardized student achievement test scores. Our study is comparing all of these measures in order to better understand the potential of each measure to support the improvement of teaching and learning.
Research supports what parents, students, and school administrators have always "known": teachers matter. Multiple value-added studies, for example, have found considerable variation in teacher effects, with some teachers adding a great deal to students' academic learning in a given year and others adding much less. Other studies have shown that these effects persist and predict students' future outcomes. Thus, the identification of effective teachers is crucial to fostering the achievement of all students. Further, careful understanding of the theoretical constructs that characterize effective teachers can also stimulate efforts to support the development of promising teachers over the course of their careers.
A problem, however, is that our current system of identifying effective teachers (at the points of hiring, at tenure, and during post-tenure reviews) is broken (e.g., Wilson, 2008). For example, research demonstrates that widely used surface markers of professional preparation (such as certification status and coursework) only weakly predict actual teaching effectiveness (Goe, 2007; Wayne & Youngs, 2003) and that teacher evaluation practices in too many schools are burdensome, outdated, and unable to provide sound guidance about how to improve teaching practices (Danielson & McGreal, 2000). In particular, once teachers enter practice, they are often evaluated infrequently, and with methods that have troubled the research and practice communities for years. Though potentially powerful for providing support and feedback, these evaluation systems are rarely grounded in strong theories of instructional effectiveness and are rarely validated against subsequent teacher performance and student outcomes. Lacking the tools to look closely at teaching, many administrators rely on their own informal hunches about who is effective and who is not. This practice is not without merits, but it provides insufficient evidence for making high-stakes decisions on teacher tenure and teacher compensation policies.
The problems with current practice have prompted some policy-makers and researchers to consider using value-added measures to evaluate individual teachers. Value-added measures use complex analytic methods applied to longitudinal student achievement data to estimate teacher effects that are separate from other inputs into student learning. Though these models are promising, they have important methodological and political limitations (Braun, 2004; Clotfelter, Ladd, & Vigdor, 2007; Gitomer, 2008; Kupermintz, 2003; Ladd, 2007; Lockwood et al., 2007; Raudenbush, 2004). Beyond methodological concerns, many educators are suspicious of using value-added measures because of the limitations of test scores. They are concerned that tests only cover a limited portion of the learning teachers and schools promote, that the psychometric properties of the tests are not sufficient to support the complex statistical analyses used by value-added modeling, and that evaluating teachers by test scores will create even greater motivation for teachers to emphasize the material covered by the tests at the expense of presenting other learning opportunities for students.
However, even if student assessments and value-added methods were dramatically improved, they still would not provide information about the attributes of high quality teachers or guidance for improving teachers. Value-added measures tell us which teachers make a difference to student learning, but not why they make a difference. Without knowing why teachers make a difference, however, we don't know what (if anything) can be done to improve teaching.
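To make the idea concrete, the logic of a value-added estimate can be sketched in a few lines. This is a deliberately minimal, hypothetical illustration, not any of the models cited above: real value-added analyses use multi-year longitudinal data and mixed-effects or Bayesian methods. Here each teacher's "effect" is simply the coefficient on a teacher indicator in a regression of end-of-year scores on prior-year scores, the most basic covariate-adjustment approach. Note that the output is only a ranking of teachers; nothing in the model says *why* one teacher's estimate is higher than another's.

```python
# Minimal value-added sketch (illustrative assumptions throughout):
# simulate students nested in teachers, then recover each teacher's
# contribution by regressing end-of-year scores on prior scores plus
# one indicator column per teacher.
import numpy as np

rng = np.random.default_rng(0)

n_teachers, students_per_teacher = 10, 40
true_effects = rng.normal(0.0, 5.0, n_teachers)  # hypothetical teacher effects

# Simulated data: end score = prior score + teacher effect + noise
teacher = np.repeat(np.arange(n_teachers), students_per_teacher)
prior = rng.normal(50.0, 10.0, teacher.size)
end = prior + true_effects[teacher] + rng.normal(0.0, 3.0, teacher.size)

# Design matrix: prior-year score plus a dummy column per teacher
# (no intercept, so each dummy coefficient is that teacher's adjusted mean)
X = np.column_stack([prior, np.eye(n_teachers)[teacher]])
coef, *_ = np.linalg.lstsq(X, end, rcond=None)
estimated_effects = coef[1:]

# The estimates track the simulated effects up to sampling noise
print(np.corrcoef(true_effects, estimated_effects)[0, 1])
```

With enough students per teacher the estimated effects correlate strongly with the simulated ones, yet, as the text argues, they remain a black box: the coefficients identify *which* simulated teachers add more, but carry no information about classroom practice.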
Recent developments also suggest that there are many factors associated with teaching quality, and therefore, a range of measures may be needed to fully explain the large teacher-to-teacher differences that have been observed. However, as we move toward measuring multiple dimensions of teaching quality, we must pay careful attention to the demands these evaluation systems place on practitioners. We must develop less burdensome and less costly, but still valid, methods for measuring teaching quality.
- Braun, H. I. (2004). Value-added modeling: What does due diligence require? Princeton, NJ: Educational Testing Service.
- Clotfelter, C. T., Ladd, H. F., & Vigdor, J. L. (2007). How and why do teacher credentials matter for student achievement? (NBER Working Paper No. 12828). Cambridge, MA: National Bureau of Economic Research.
- Danielson, C., & McGreal, T. L. (2000). Teacher evaluation to enhance professional practice. Princeton, NJ: Educational Testing Service.
- Gitomer, D. H. (2008). Crisp measurement and messy context: A clash of assumptions and metaphors. In D. H. Gitomer (Ed.), Measurement issues and assessment for teaching quality. Thousand Oaks, CA: Sage Publications.
- Goe, L. (2007). The link between teacher quality and student outcomes. Washington, DC: National Comprehensive Center for Teacher Quality.
- Kupermintz, H. (2003). Teacher effects and teacher effectiveness: A validity investigation of the Tennessee value added assessment system. Educational Evaluation and Policy Analysis, 25(3), 287-298.
- Ladd, H. (2007, November). Holding schools accountable revisited. Paper presented at the 2007 APPAM Fall Research Conference, Washington, DC.
- Lockwood, J. R., McCaffrey, D. F., Hamilton, L. S., Stecher, B. M., Le, V.-N., & Martinez, J. F. (2007). The sensitivity of value-added teacher effect estimates to different mathematics achievement measures. Journal of Educational Measurement, 44(1), 47-67.
- Raudenbush, S. W. (2004). What are value-added models estimating and what does this imply for statistical practice? Journal of Educational and Behavioral Statistics, 29(1), 121-129.
- Wayne, A. J., & Youngs, P. (2003). Teacher characteristics and student achievement gains: A review. Review of Educational Research, 73(1), 89-122.
- Wilson, S. (2008). Measuring teacher quality for professional entry. In D. H. Gitomer (Ed.), Measurement issues and assessment for teaching quality. Thousand Oaks, CA: Sage Publications.
The core design of this study is to collect a range of information about teaching and learning and compare that information in order to better understand its meaning. Data collection will occur over two years in 225 mathematics and 225 English language arts (ELA) middle school classrooms. Measures were selected on the basis of several factors, including a strong conceptual and theoretical basis, broad acceptance in the field, and/or availability (particularly in the case of student test scores). UTQ is working with the respective instrument developers to adapt and refine the measures for the UTQ study. Specifically, the following measures will be used in this study:
- The Classroom Assessment Scoring System for secondary settings (CLASS-S) developed by researchers at the University of Virginia
- Charlotte Danielson's Framework for Teaching (FFT)
- The Learning Mathematics for Teaching Project's Mathematical Quality of Instruction (MQI)
- Stanford University's Protocol for Language Arts Teaching Observation (PLATO)
- The Intellectual Demand Assignment Protocol (IDAP) created by the Consortium on Chicago School Research
- Tests of teacher knowledge