Students are invited to complete student evaluations of teaching (SETs) for their classes at the end of each term at the University of Oregon. In principle, SETs have a dual purpose. Faculty can use SET results to help them identify areas of their teaching that need attention and improvement; that is, they have a formative purpose. SETs also have a summative purpose: they are used to inform evaluations of a faculty member’s teaching as part of decisions about tenure and promotion, contract renewal, and merit raises.
The latter purpose, especially, relies on the assumption that SETs are a valid measure of teaching effectiveness (assumed to be related to student learning). The research literature on SETs is extensive and stretches back nearly 100 years, but over that time little consensus has emerged about whether there is in fact a correlation between SET ratings and student learning, or even how one should measure student learning.
Many, but not all, studies show a modest positive correlation between SET results and student learning. However, recent work, including a careful meta-analysis of previous results, indicates that there is no correlation between SET ratings and student learning after controlling for sample size and publication bias.
Other problems arise as well. For example, there are indications that students often do not interpret questions and terminology on SETs in the same way faculty do, so care must be taken with the wording of questions and the interpretation of results. Persistent questions also remain regarding students’ ability to assess teaching effectiveness, the use of SETs to compare faculty in the absence of information about the spread of scores within a relevant group of faculty, and whether student response rates on non-mandatory SETs accurately reflect the true distribution of student opinion. In addition, there is evidence that SET scores vary depending on class size, the level of the class, the discipline, and the prior preparation of the students.
Most disturbing, though, are results indicating that SETs show bias with respect to gender, race, and ethnicity, with female, African-American, and Latino faculty receiving lower scores on SETs than their white male colleagues.
While there is debate about the validity, utility, and fairness of SETs, there is agreement in the research literature that if they are used at all, SETs should be only one of several tools used to assess teaching. Peer reviews, self-evaluations, administrator reviews, student interviews, and alumni ratings are alternative strategies that can be combined to create a more representative picture of a faculty member’s teaching.
 P. Spooren, B. Brockx and D. Mortelmans, "On the Validity of Student Evaluation of Teaching: The State of the Art," Review of Educational Research, vol. 83, no. 4, pp. 598-642, 2013.
 S. L. Benton and W. E. Cashin, "IDEA Paper No. 50: Student Ratings of Teaching: A Summary of Research and Literature," The IDEA Center, Manhattan, KS, 2012.
 B. Uttl, C. A. White and D. W. Gonzalez, "Meta-analysis of faculty's teaching effectiveness: Student evaluation of teaching ratings and student learning are not related," Studies in Educational Evaluation, vol. xxx, p. xxx, 2016.
 C. Lauer, "A comparison of faculty and student perspectives on course evaluation terminology," in To Improve the Academy: Resources for faculty, instructional, and organizational development, J. E. Groccia and L. Cruz, Eds., San Francisco, Wiley and Sons, Inc., 2012, pp. 195-211.
 P. B. Stark and R. Freishtat, "An Evaluation of Course Evaluations," ScienceOpen Research, 2014.
 L. MacNell, A. Driscoll and A. N. Hunt, "What's In a Name: Exposing Gender Bias in Student Ratings of Teaching," Innovative Higher Education, vol. 40, pp. 291-303, 2015.
 A. Boring, K. Ottoboni and P. B. Stark, "Student evaluations of teaching (mostly) do not measure teaching effectiveness," ScienceOpen Research, 2016.
 B. P. Smith and B. Hawkins, "Examining Student Evaluations of Black College Faculty: Does Race Matter?," The Journal of Negro Education, vol. 80, no. 2, pp. 149-162, 2011.
 B. P. Smith, "Student Ratings of Teaching Effectiveness: An Analysis of End-of-Course Faculty Evaluations," College Student Journal, vol. 41, no. 4, pp. 788-800, 2007.
 G. Smith and K. J. Anderson, "Students' Ratings of Professors: The Teaching Style Contingency for Latino/a Professors," Journal of Latinos & Education, vol. 4, no. 2, pp. 115-136, 2005.
 R. Berk, "A Survey of 12 Strategies to Measure Teaching Effectiveness," International Journal of Teaching and Learning in Higher Education, vol. 17, pp. 48-62, 2005.