Annotated Readings of Select Student Rating Research

ANNOTATED READINGS - provided by Mike Theall 

Anderson, L. & Krathwohl, D. (2001). A Taxonomy for Learning, Teaching and Assessing: A Revision of Bloom's Taxonomy of Educational Objectives. New York: Longman.

Though not a book about evaluation, this revision of the original taxonomy by Bloom et al. (1956) provides a framework for investigating issues that relate to teaching, learning, assessment, and evaluation.  The revised taxonomy includes dimensions of knowledge as well as dimensions of cognitive process and incorporates “creation” as a new cognitive dimension.


Angelo, T. A., & Cross, K. P.  (1993)  Classroom assessment techniques:  a handbook for college teachers.  (2nd ed.)  San Francisco:  Jossey Bass.

This widely read text is derived from work at the National Center for Research to Improve Postsecondary Teaching and Learning at the University of Michigan.  The book offers extensive description and application of various strategies for gathering and using information to assess student learning.  Since evidence of student learning is an important part of the larger evaluation of faculty performance, knowledge and use of effective assessment techniques are critical to good evaluation practice.


Arreola, R. A.  (2007)  Developing a comprehensive faculty evaluation system.  (3rd ed.)  San Francisco: Jossey-Bass.

This is the best practical guide to developing and implementing a comprehensive evaluation system.  It contains a thorough overview, grounded discussion of operational issues, samples of instruments and process, and proven techniques for generating dialogue and consensus.  If one could have only one book to guide faculty evaluation, this should be it.

Arreola, R. A., Theall, M. & Aleamoni, L. M.  (2003) Beyond scholarship.  Recognizing the multiple roles of the professoriate.  Paper presented at the 84th annual meeting of the American Educational Research Association.  Chicago: April 22.  Available on-line at:

This paper and associated website allow readers to access the full presentation as well as several matrices that describe in some detail the ‘meta-profession’ concept and its application to faculty work.  A separate section demonstrates how “the scholarship of teaching and learning” represents a special case of the meta-professional construct.


Benton, S. L., & Cashin, W. E. (2012)  Student ratings of teaching: A summary of research and literature. IDEA Paper # 50. Manhattan, KS: The IDEA Center.  Available at: 

This is the most recent major review of ratings research, and it reaches conclusions similar to those of other major reviews (e.g., Arreola, 2007; Braskamp & Ory, 1994; Centra, 2003; Feldman’s studies [1976 through 1997]; Marsh, 1989, 2007), namely, that student ratings, when properly conducted, analyzed, reported, and interpreted, can provide information useful for both formative and summative purposes.


Berk, R. A. (2006) Thirteen strategies to measure college teaching. Sterling, VA: Stylus Publishing.

This practical guide provides basic and helpful information for faculty and administrators who use ratings for formative and summative purposes.  Berk espouses a “360 degree” approach that aligns with the recommendations of other major writers.  He encourages the use of multiple sources of data of several kinds and with specific connection to the intents of the evaluation process and the needs of stakeholders.  The book is written in Berk’s unique humorous style, and includes a foreword by Michael Theall.


Birnbaum, R.  (1988)  How colleges work:  the cybernetics of academic organization and leadership.  San Francisco:  Jossey Bass

___________  (1992)  How academic leadership works.  Understanding success and failure in the college presidency.  San Francisco:  Jossey Bass.

___________  (2001)  Management fads in higher education. Where they come from, what they do, and why they fail.  San Francisco:  Jossey Bass.

Birnbaum’s work is clear, insightful, realistic, and very readable.  His books provide many descriptive examples and the work is grounded in a substantial and broad literature base.  Few authors have described higher education as clearly and realistically as Birnbaum.  While this work centers on the ways higher education works, it is important to evaluation because it provides a better understanding of the context in which evaluation takes place.  Evaluation systems must be responsive to the environments in which they exist; thus, this literature is relevant and useful.


Bloom, S.  (1988)  Structure and ideology in medical education:  an analysis of resistance to change. Journal of Health and Social Behavior, 29, 294-306.

This is an interesting view of the change process and a sobering description of “reform without change”, that is, the futility of responding to the need to alter or improve something without implementing any long-lasting or meaningful changes.  In conjunction with Birnbaum’s text on “fads”, it alerts readers to the need for making change relevant and connecting it to stakeholder frames of reference.  By its very nature, evaluation is often threatening, and it is potentially an explosive issue.  Faculty evaluation is often (unfortunately) done in a ‘pro forma’ manner in response to pressure for reform or change, and Bloom’s article points out the dangers and frustrations associated with this error.

Boyer, E. L.  (1990)  Scholarship reconsidered:  the priorities of the professoriate.  San Francisco:  Jossey Bass.

This short work has had tremendous impact in the decade since its publication, generating dialogue and action on campuses of all types.  Its reconceptualization of scholarship, particularly the introduction of “the scholarship of teaching” as an equal partner with discovery, integration, and application, has led to numerous projects and attempts to broaden the roles of faculty.  Faculty evaluation must include clear and public definition of its expectations, process, and intended uses.  By further defining the “priorities of the professoriate”, Boyer opened a dialogue that has brought more attention to teaching and the scholarship that underlies it.

Braskamp, L. A. & Ory, J. C. (1994).  Assessing faculty work.  San Francisco:  Jossey-Bass.


Centra, J. A. (1993). Reflective faculty evaluation: enhancing teaching and determining faculty effectiveness.  San Francisco: Jossey-Bass.

These are two of the most substantial books on the general topic of faculty evaluation, both by widely respected authors. They are comprehensive and discuss a range of issues, more theoretical than the Arreola text, but excellent additions to a basic library on the topic.

Brinko, K. T. (2012) Practically speaking: A sourcebook for instructional consultants in higher education (2nd ed.).  Stillwater, OK: New Forums Press.

This second edition (original edition, 1997, edited by Kate Brinko & Robert Menges) remains focused on instructional consultation, but it includes chapters on incorporating the use of evaluation results as part of the consultation process.  It is especially useful with respect to formative evaluation processes and bridges the sometimes artificial divide between evaluation and faculty development practices.


Chickering, A. W. & Gamson, Z. F.  (1987)  “Seven principles for good practice in undergraduate education.”  Wingspread Journal, 9 (2), special insert.

The seven principles have been used as guidelines for assessment in traditional and even on-line instruction, and there has been widespread use of the faculty, student, and institutional inventories devised to measure implementation of the principles on campuses.  There is debate about the extent to which these general guidelines should be used in evaluating faculty performance, but their impact cannot be overlooked.

Cross, K. P. & Steadman, M. H.  (1996)  Classroom research:  implementing the scholarship of teaching.   San Francisco:  Jossey Bass.

This work expands on Cross’ work in classroom assessment to incorporate Boyer’s definitions of scholarship and to provide teachers with strategies for conducting classroom research that supports better understanding of classroom process and simultaneously has the rigor to be incorporated into more traditional outlets for scholarship.  Classroom research can be legitimate scholarship and its results can lead to improved teaching and learning.  As such, the efforts of faculty who undertake classroom research should be considered valid evidence of professional activity and/or scholarship and should be among the considerations in evaluating performance.

Diamond, R. M.  (2002)  A field guide to academic leadership.  San Francisco:  Jossey Bass

This book considers a broad range of topics related to effective leadership in higher education including a section that deals with assessment, faculty evaluation, program review, and the faculty reward system.


Farmer, D. W.  (1999)  Institutional improvement and motivated faculty:  a case study.  In M. Theall (Ed.)  “Motivation from within:  approaches for encouraging faculty and students to excel”  New Directions for Teaching and Learning # 78.  San Francisco:  Jossey Bass.

___________   (1988)  Enhancing student learning:  emphasizing essential competencies in academic programs.  Wilkes Barre, PA:  Kings College Press.

Farmer’s work as an academic administrator and leader in change processes is summarized in the New Directions chapter and his specific efforts at Kings College are discussed in detail in the book.  Both are useful for those in a position to effect change through leadership. While the thrust of Farmer’s work was in assessment, the principles he discusses also apply to institutional programs for evaluation.


Feldman, K. A.  (1976 through 1998; see extended bibliography)  A series of research syntheses in  Research in Higher Education.

In a series of over a dozen articles, Feldman has provided the most in-depth reviews of specific issues relating to faculty evaluation and student ratings of instruction.  These definitive works have explored all of the major issues raised as potential biases to faculty evaluation.  Feldman’s findings concur with other major work on evaluation, noting that in general, student ratings provide valid, reliable, and useful information.

Feldman, K. A. & Paulsen, M. B.  (1998)  (2nd ed.)  Teaching and learning in the college classroom.    Needham Heights, MA:  Simon & Schuster Custom Publishing. 

This is the most comprehensive collection of readings on this topic.  It is broad and substantial, and includes landmark items like Boyer’s definitions of the four kinds of scholarship and Chickering & Gamson’s “Seven Principles”, and also incorporates some of Feldman’s studies of college teaching and Paulsen & Feldman’s work on creating a campus teaching culture.  This is probably the best single compendium of the research on college teaching.  Its collection of articles makes clear the complexity and multi-dimensionality of college teaching and the associated notion that the evaluation of that teaching is equally complex.


Franklin, J. & Theall, M. (1989) “Who reads ratings: knowledge, attitudes, and practices of users of student ratings of instruction.”  Paper presented at the 70th annual meeting of the American Educational Research Association. San Francisco: March 31.  ERIC # ED 306-241.

This was the first large-scale study of faculty and administrator knowledge of and attitudes about ratings.  It included a survey determining user knowledge of evaluation practice, basic statistics, and the literature of the field.  Faculty and administrators in general scored significantly lower than staff in centers for teaching, learning, assessment, and evaluation, who, in turn, scored lower than a group of 40 experts in the field. Less knowledge of the literature and practice in the field was significantly correlated with negative views about evaluation and about students as providers of data.


Glassick, C. E., Huber, M. T., & Maeroff, G. I.  (1997)  Scholarship assessed:  evaluation of the professoriate.  San Francisco:  Jossey Bass.

This work continues where Boyer’s text left off, providing discussion and guidelines for evaluating and documenting scholarly work, particularly in  the realm of the scholarship of teaching.   Completed after Boyer’s death, it nonetheless includes a prologue by Boyer.  The outline of criteria for assessing scholarship has been widely accepted but its limitation with respect to faculty evaluation is that it considers only scholarship while thorough faculty evaluation must consider all aspects of performance.


Marsh, H. W.  (1987)  Students’ evaluations of university teaching: Research findings, methodological issues, and directions for future research.  International Journal of Educational Research, 11, 253-388.

This review of the faculty evaluation literature remains the most cited work of its kind.  Its depth and breadth are unusual and its conclusions, supported by virtually all major researchers in evaluation, are that student ratings are, “1) multidimensional; 2) reliable and stable; 3) primarily a function of the instructor who teaches a course rather than the course that is taught; 4) relatively valid against a variety of indicators of effective teaching; 5) relatively unaffected by a variety of variables hypothesized as potential biases; and 6) seen to be useful by faculty…by students…and by administrators.”


Marsh, H. W. (2007) Students’ evaluations of university teaching: dimensionality, reliability, validity, potential biases, and usefulness. In R. P. Perry & J. C. Smart (Eds.) The scholarship of teaching and learning in higher education: an evidence-based perspective. Dordrecht, The Netherlands: Springer, 319-383.

This update of Marsh’s (1987) work reviews some additional research but arrives at conclusions similar to the original, namely that student ratings are generally reliable, valid, robust, and useful for a variety of purposes. 


Miller, R. I.  (1987)  Evaluating faculty for promotion and tenure.    San Francisco:  Jossey Bass.

This text lists ten useful characteristics of effective systems that are the basis for many of the recommendations in this chapter, and includes legal issues, administrative roles, and discussion of promotion and tenure procedures.   


New Directions for Teaching and Learning / Institutional Research 

These series from Jossey Bass include several relevant titles.  Those dealing with evaluation and closely related issues are identified below.


  • Aleamoni, L. M. (Ed.)  (1987) Techniques for evaluating and improving instruction.  New Directions for Teaching and Learning #31.  San Francisco: Jossey Bass.

       This is the first in a series of volumes exploring evaluation issues and it includes contributions by many of the leading writers in the field.

  • Anderson, R. S., Bauer, J. F. & Speck, B. W. (Eds.) (2002) “Assessment strategies for the on-line class:  from theory to practice.” New Directions for Teaching and Learning #91.  San Francisco: Jossey Bass. 

       On-line teaching and learning present new questions and issues in terms of the effect of context on instructional effectiveness.   There are important questions about the evaluation of on-line teaching that have not yet been addressed and this volume’s emphasis on assessment can contribute to more accurate and effective evaluation of on-line teaching.

  • Angelo, T. A. (1998) “Classroom assessment and research: an update on uses, approaches, and research findings.”  New Directions for Teaching and Learning #75.  San Francisco: Jossey Bass. 

In its discussions of the use of assessment to improve teaching and learning; to help new instructors; and to support program accountability, this volume is a model partner to effective evaluation.  Since student learning is one measure of effective teaching, assessment can and should be part of good evaluation practice. 

  • Colbeck, C. L. (ed.) (2002). Evaluating Faculty Performance (New Directions for Institutional Research No. 114). San Francisco: Jossey Bass.

Authors of this volume take an appropriately broad view of faculty evaluation, noting that “Forces for change within and outside academe are modifying faculty work and the way that work is—or should be—evaluated.”  This volume is more philosophical in its approach and presents interesting perspectives on issues that underlie evaluation practice.

  • Johnson, T., and Sorenson, L. (2004). Online Student Ratings of Instruction (New Directions for Teaching and Learning No. 96). San Francisco: Jossey Bass.

This is the newest discussion in the New Directions series and it focuses on the use of on-line systems for collecting and reporting ratings data.  It contains the latest thinking about the use of on-line systems and presents the advantages and disadvantages of on-line and paper systems.

  • Knapper, C. & Cranton, P. (2001) “Fresh Approaches to the evaluation of teaching.” New Directions for Teaching and Learning #88.  San Francisco: Jossey Bass.

      This volume considers evaluation from several perspectives, connecting it to the scholarship of  teaching, teaching awards, the use of technology, portfolio development, accreditation, and other issues.

  • Lewis, Karron (Ed.) (2000) “Techniques and strategies for interpreting student evaluations.” New Directions for Teaching and Learning #87.  San Francisco: Jossey Bass. 

       This volume deals with useful techniques for avoiding common errors in reporting and interpreting ratings results.  Since such errors are one of the most common problems with ratings, the guidelines in this volume can be particularly useful.

  • Ryan, K E.  (Ed.)  (2000)  “Evaluating teaching in higher education:  a vision for the future.”  New Directions for Teaching and Learning #83, Fall.  San Francisco:  Jossey Bass.

This volume considers past practice and new approaches, recommending practical strategies for improving current methods and for developing new ones.  It also looks ahead to possible changes in practice affected by issues such as the increasing demand for distance and on-line learning, and the effects of new definitions of validity.

  • Theall, M., Abrami, P. C. and Mets, L. A. (eds.) (2001). The Student Ratings Debate: Are They Valid? How Can We Best Use Them? (New Directions for Institutional Research No. 109).  San Francisco: Jossey Bass.

A  recent review of issues relating to student ratings of instruction, this volume presents current thinking along with a proposal for improving the precision of ratings practice. 

  • Theall, M. & Franklin, J.  (1990a)  “Student ratings of instruction: Issues for improving practice.”  New Directions for Teaching and Learning #43.  San Francisco:  Jossey Bass

       This is one of the first discussions of ratings and evaluation to focus on context and the systematic nature of evaluation and the requirements it imposes.  Several aspects of evaluation from ethical issues to disciplinary differences are presented.

  • Theall, M. & Franklin, J. (Eds.) (1991)  “Effective practices for improving teaching.” New Directions for Teaching and Learning #48.  San Francisco:  Jossey Bass.

This volume deals with teaching improvement but connects such efforts to gathering, analyzing, and interpreting evaluation and assessment data for purposes of improvement.


Perry, R. P. & Smart, J. C. (Eds.) (2007) The scholarship of teaching and learning in higher education: An evidence-based perspective. Dordrecht, The Netherlands: Springer.

This is the most recent and substantial review of college teaching and learning, including issues related to assessment, evaluation, and ratings.  Though the title includes the words “scholarship of teaching and learning”, the book does not deal with individual, small-scale, classroom studies like those advocated by Boyer (1990) or Cross & Steadman (1996).  Its focus is on organized, established, long-term research that has been replicated and has contributed to the body of knowledge in substantial ways.


Pescosolido, B. & Aminzade, R. (1999)  The social worlds of higher education:  handbook for teaching in a new century.  Thousand Oaks, CA:  Pine Forge Press.

This is a very interesting collection of works by sociologists and considers a range of higher education issues.  Its treatment of higher education as a sociological microcosm is unique and worthwhile, and like the work of Birnbaum, it offers insights into the context for evaluation.


Scriven, M. (1991) Evaluation thesaurus. (4th ed.) Newbury Park, California: Sage Publications.

         This is the definitive guide to evaluation and related terminology.

Scriven, M. (1967) “Methodology of evaluation.” In R. Tyler, R. Gagne, & M. Scriven (Eds.) Perspectives of curriculum evaluation. Chicago: Rand McNally & Company.

         This is Scriven’s seminal work, in which he outlines the different purposes of evaluation (formative & summative) and the different kinds of data that can be used (instrumental and consequential).  His terminology and approach are the basis for contemporary evaluation and assessment; the terminology has remained and was adopted by the “assessment movement” to describe similar purposes related to measuring student learning outcomes.