more on grading policy

I’ve written before about my work through EPC on grading policy. After a year’s worth of consideration, we are presenting a resolution tomorrow for UNC to report grade distributions on transcripts for each class, and to report grade patterns to faculty each semester.

Two colleagues wrote me a detailed and thoughtful message about the proposal. While I do not agree with their position, I asked, and they agreed, to let me post it to scatterplot for further discussion. Their message is below the break; my response and further discussion are posted as the first comments on the post.

We are writing in response to Faculty Council Resolution 2010‐3 on “Enhanced Grade Reporting.” We appreciate the work of the committee, but have serious reservations about the proposal. We hope you’ll distribute/forward our email to members of the Faculty Council so that this view can become part of the conversation. Instead of adopting the proposal, we call for the start of a serious conversation about the *purpose* of grades and the factors that may contribute to variations in grade distributions across time and courses.

Like the Achievement Index, reporting course and section grade distributions along with student individual grades will presumably allow viewers of the transcript (employers, award committees) to assess the student’s performance against that of her/his peers. This goal assumes that our primary purpose as educators is to differentiate between students and that grades are the primary markers of this differentiation.

An *alternative view* is that our primary goal is to successfully educate our students in the substance and methods of our respective fields, and grades should be used to mark the performance of the student against predetermined standards of proficiency and learning objectives that instructors painstakingly develop for each course. In this view, the loftiest goals to which we can aspire would be to set appropriately tough standards that are in line with the expectations of our profession, and then work diligently and creatively to help all of our students reach those goals. Achieving these goals would result in a better educated populace but would certainly not eliminate grade compression. In fact, accomplishing this goal would *fairly* result in lots of As. This perspective on grading is called “the mastery method” among education scholars.

“Grade inflation” implies that higher grades are being awarded for work comparable to that done in the past. Supposedly our standards for performance have declined while student performance has remained relatively stable or declined. We have little doubt that slipping grading standards *could* be *one* of the factors explaining variations in grade distributions across time and courses. This is certainly an issue that deserves additional study, starting with serious conversations, both within and across departments, about the establishment of rigorous learning objectives and fair grading standards.

However, educational standards have surely increased over time as well. For example, how many professors in the 1970s required their students to write well, analyze data, lead discussion sections, AND perform community service? Moreover, there are many other factors besides changing standards that likely contribute to variations in grade distributions across time. Indeed, increases in average grades might result from the development of more sophisticated learning and teaching tools, stronger economic incentives among students to earn good grades, or improvements in the proficiency of teachers to convey expectations and imbue their students with necessary skills.

Increases in average grades may also reflect powerful selection processes. With improvements in the distribution of information on the content of courses and the teaching style of specific instructors, students are now better able to select courses that fit their interests and strengths, allowing them to perform better. In light of these changes, it is quite likely that actual student performance and proficiency have increased over time, possibly at a faster pace than increases in learning objectives and grading standards. In this sense, it would be inaccurate to conceive of increasing grade averages as true “grade inflation,” and we would conceive of different remedies for the issue, assuming that increasing grade averages, by themselves, constitute a problem in need of remedy.

Similarly, variations in grades across courses and sections, either in the cross-section or across time, are likely a product of a wide range of factors. For example, the strongest students were choosing very different majors twenty years ago than they are today, and there has been similar shuffling between majors between these extremes. Larger increases in grades in some disciplines than in others might just reflect the fact that selection processes have resulted in stronger increases in the quality of students who choose different majors. Similarly, variations in professional emphasis on pedagogy might result in the adoption of stronger teaching methods in some fields than in others, and we would expect variations in student performance to emerge as a result, even if all disciplines and courses have similarly rigorous standards. Even changes in grade distributions for individual faculty members could reflect the adoption of improved teaching techniques and the strengthening ability of individual instructors to inspire their students. Indeed, these individual-level improvements in teaching performance are exactly what we expect from our faculty development efforts.

As with any policy rooted in incomplete information, there is a strong chance that the proposed actions would do more harm than good. Encouraging the comparison of an individual’s grade relative to the distribution of grades for the course, either through some kind of Achievement Index or through the steps currently on the table, will necessarily punish the students who do well in classes in which their classmates also succeeded. As intended, it will devalue the grades of those students who habitually seek out the instructors who construct overly easy courses and dole out easy As to undeserving students. But it would also punish those students who take courses from excellent instructors who strive to help all of their students achieve lofty goals. It would also increase status competition among students and make them less interested in working together, a hallmark of participatory learning and of the teaching of cooperation and teamwork. At the same time, the steps in the current proposal will embolden faculty members whose courses have what might be deemed sufficiently low grade distributions, giving them no incentive to think about whether the fact that so few students earn high marks in their classes might reflect their inability or unwillingness to assist students in achieving excellence. We are very concerned about these consequences.

All of this suggests that the factors behind any increase in average grades and variations across disciplines, courses, and sections are exceedingly complex and cannot be easily equated with a grade-inflation phenomenon or reduced to a simple slipping-standards argument. All of these factors deserve additional attention if we are to develop effective remedies for any flaws in the current grading system. Rather than going forward with Faculty Council Resolution 2010‐3 on “Enhanced Grade Reporting,” we recommend rigorous study of the factors affecting changes in grade distributions, as well as increased conversation about pedagogical goals and the *purpose* of grades in the attainment of those goals.

Sherryl Kleinman, Professor
Department of Sociology

Kyle Crowder, Professor
Department of Sociology

Author: andrewperrin

Johns Hopkins University - Sociology and SNF Agora Institute

17 thoughts on “more on grading policy”

  1. Dear Sherryl and Kyle,

    Thank you for your thoughtful message on grading. Because of my great respect for both of you as educators and scholars, I am going to respond in some significant detail below. With your permission, I would also like to post your message and my response to scatterplot, the sociology blog, for others to discuss.

    The Educational Policy Committee (EPC), of which I am chair, has been studying grading patterns in various ways since at least 2001. In the spring of 2009 we submitted the most comprehensive report on grading patterns at UNC ever undertaken, and quite possibly the most comprehensive report of its kind ever undertaken at *any* university. We have had several long and substantial discussions with faculty and students on grading matters, including:

    The current proposal has been discussed at EPC numerous times, covered in the Daily Tar Heel, and discussed with the Deans’ Council; the Faculty Executive Committee; and the Faculty Council, as well as in numerous informal conversations with colleagues around campus.

    The reason I detail this history is to demonstrate that this issue is neither understudied nor under-discussed. Your recommendation of “rigorous study of the factors affecting changes in grade distributions” has already been done, and done well.

    If I understand your concerns correctly, they are the following:
    1.) Using grades comparatively assumes that our “primary purpose” is to differentiate between students.
    2.) Teaching and grading for mastery necessarily produces compressed grades at the top of the spectrum.
    3.) A variety of developments in recent decades make students’ performance in class better, which in turn explains the overall rise in grades.
    4.) Variations across instructors and departments are likely the result of better teaching techniques in high-grading departments and by high-grading instructors.
    5.) Reporting grade distributions would “punish” students in classes in which all students perform excellently.

    I’d like to take each of these concerns in order.

    1.) Using grades comparatively assumes that our “primary purpose” is to differentiate between students.

    Actually, I think it only assumes that *one* purpose is to differentiate between students, not necessarily the primary one. And grades are routinely used for this purpose already, as in awarding Dean’s List status, graduation with distinction, entry into the Honors program, and eligibility for countless scholarships and programs. So short of prohibiting the use of grades for these sorts of things, we already do this. The question is how to make these comparisons fair and accurate while respecting genuine differences among faculty on grading practices or, alternatively, how to ensure that grades are not used to compare students’ achievement, which would also be a huge change in policy.

    2.) Teaching and grading for mastery necessarily produces compressed grades at the top of the spectrum.

    Again, I don’t think this is the case. Indeed, existing UNC grade policy *assumes* a mastery model. But different students achieve different levels of mastery. There are any number of reasons for this, but the key point is that grading for mastery does not necessarily eliminate variation in students’ achievement. See the next item for more on this point.

    3.) A variety of developments in recent decades make students’ performance in class better, which in turn explains the overall rise in grades.

    Personally, I am skeptical of most of these explanations, based on my own experiences in the classroom and my conversations with many other instructors as well. Last year’s report did as much as it could given the data we had available to evaluate the extent to which rising grades are related to increased talent or performance, but of course that’s very difficult to evaluate. However, even if all these explanations account for the overall rise in grades, it does not follow that grade compression ought to be accepted. If, in fact, we are getting better students, teaching them better, and using more resources to help them succeed, we ought to recalibrate our expectations! Our responsibility is to challenge and inspire the students we have now, not to keep teaching to standards set for prior generations of students.

    Another way of putting this is: if every student in a class is achieving an “A”, it is very likely that some of the students could be achieving *more* than that. Returning, again, to the grading system memo: “The A grade states clearly that the students have shown such outstanding promise in the aspect of the discipline under study that he/she may be strongly encouraged to continue.” So in general (and I am aware there are important exceptions) a class with many A’s is missing the opportunity to inspire some students to achieve to their full potential.

    The discussion of the “inflation” metaphor is appropriate, and has been addressed before, e.g., in Alfie Kohn’s old and mostly discredited piece. But it’s a red herring here: regardless of whether the phenomenon is truly “inflation,” i.e., higher grades for similar work, the overall increase in grades nevertheless presents a concern.

    4.) Variations across instructors and departments are likely the result of better teaching techniques in high-grading departments and by high-grading instructors.

    This is an interesting hypothesis, but one I think is unlikely to be borne out. It would be interesting to check whether teaching awards over the years are related to grading practices, for example. But many of our highest-ranking *departments* in the College are also ones with relatively low overall grading records, which suggests that your thesis that academic quality leads to higher grades is not supported. Returning, though, to the point in 3.) above, it does not follow that successful teaching will necessarily lead to uniform grades.

    More generally: there is lots of evidence that students actively seek out classes in which they expect to earn “easy A” grades. Sadly, our own discipline is one to which they look for this service. This practice interferes with the substantive intellectual mission of our university, in at least two ways. First, it prevents some students from seeking out challenging experiences in “low-grading” departments because of concerns about the impact on the all-important GPA. Second, it prevents students from working to excel in “high-grading” departments because they are not rewarded for outstanding work. So quite separate from the inflation question, systematic grade inequality threatens the intellectual integrity of the university.

    5.) Reporting grade distributions would “punish” students in classes in which all students perform excellently.

    I don’t think there’s any “punishment” involved here. This is a move toward transparency in grades. Unlike the Achievement Index proposal (with which your message inappropriately conflated this proposal), the Enhanced Grade Reporting proposal takes no stand on the appropriateness of a given distribution of grades. Quite candidly, I do expect (as does EPC) that we will need to do something in the near future about the ways we use grades to compare students across instructors and departments, and that change may take any one of several forms. But gathering and reporting information about grading patterns is a “sunshine” measure which will help, not hinder, an ongoing discussion about the philosophy and mechanics of grading at UNC. I very much hope the Faculty Council approves the measure.

    Again, many thanks for your thoughts, and best wishes-


  2. Hi Andy,

    Thanks very much for your detailed response to our email. I agree that our study of the topic of grade inflation has been really impressive, far exceeding anything I’ve seen anywhere else. However, I firmly believe that there are important limits to what can be learned from the available data, as your response and the EPC appear to acknowledge. For example, while the report produced by the EPC makes a persuasive case that average grades have increased over time, that there is some marked compression of the distribution of grades, and that there is significant variation in these trends across departments, the analysis to date seems to provide little solid evidence about *why* these patterns are occurring. Even the fixed-effects models presented in the EPC report, while demonstrating the committee’s eagerness to use the available data to the fullest possible extent, are unable to account for time-varying factors (e.g., changes in the techniques used by instructors) and student selection processes that may have a significant impact on grade distributions.

    So, my primary concern is that we are embarking on a treatment regimen without understanding the root causes of the problem. I should point out that the email you received yesterday was not intended to make a definitive case about any specific cause of the observed changes in the distribution of grades. For example, I personally doubt that the trends described by EPC can be completely attributed to increasing “quality” of students. But I do suspect that the causes are more complex than the current policy recommendations, departmental conversations, and coverage in the Daily Tar Heel appear to imply. After further study into *the factors leading to* changes in grade distributions, and inter-department differences therein, we might very well find that the observed temporal trends represent true grade inflation and/or that there are significant numbers of instructors passing out easy grades to undeserving students. But I don’t think we can conclude these things with any authority, so embarking down a policy path that seems to assume these things is, at best, premature. Moreover, while reporting grade distributions and subsequent measures along these lines may lead to a recalibration of grade distributions (individual instructors might start handing out more C’s), I simply do not see how reporting grade distributions will lead to the recalibration of course standards, learning objectives, or teaching practices that you and I both believe needs to happen. Thus, I think we would be better served by moving on to the next phase of the investigation to get to the bottom of these grading patterns before proceeding with specific remedies (more in line with the Seton Hall approach, Remedy #4 in the EPC report).

    Again, I really appreciate the time and effort you’ve put in on this topic. I also appreciate your willingness to post our exchange on your blog and hope that you also distribute it to your colleagues on the Faculty Council.

    Thanks again.


    Kyle Crowder, Professor
    Department of Sociology


  3. Kyle, thanks. I don’t think the current proposal makes any assumption that “there are significant numbers of instructors passing out easy grades to undeserving students,” or any other specific cause or mix of causes for the three concerns: rising grades, grade compression, and systematic grade inequality. Therefore I don’t see how the harms you outline can reasonably be expected to result from this policy.

    An eventual next step might well be the “Seton Hall Plan” you suggest. Nothing in the proposal precludes that from happening. In fact, the data generated under this proposal might well provide for a better set of conversations.

    Can you say more about the study you envision into “the factors leading to changes in grade distributions, and inter-department differences therein”? EPC remains charged with reporting on grading at UNC on an annual basis, and we remain open to any new approaches to that report.

    I have posted this all on scatterplot, so feel free to follow and comment there if you prefer.

    All the best,


  4. An unusually thoughtful exchange that airs almost all the issues I am aware of, especially the mastery vs ranking dilemma. It addresses the “rising quality of education” issue which I think is real: at least at my school, the students today ARE much better on average than they were in the 1980s. It seems to omit (unless I missed it) the issue of late drop dates, which lead the students doing poorly to drop a class.

    I agree that including grade distributions with grades is probably the best way to balance various imperatives. I have read such transcripts when I’ve done grad admissions, and they are clearly the most informative ones, especially when they include the class size and full distribution (instead of just the median).

    Um, I realize that I have more opinions about grading issues, but not the time to write them now. Maybe later.


    1. I never thought of that withdrawal timing issue. Seems important. E.g.,

      This year during graduate admissions season I consulted the website a few times, which lists the average GPA for many schools. Some people at very good schools had lower GPAs, and I discovered their schools had low average GPAs. But then I had a student from a very prestigious private school with a GPA of less than 3.5 listing one of those Latin-sounding honors (can’t remember which one). The grade inflation site lists the average GPA at that school as more than 3.5, however. So what to make of that?

      Of course, they are great students, taking great classes with great instructors, so they deserve great grades. On the other hand, they already get credit for that through the prestige effect of the registrar’s letterhead.

      Anyway, I just checked, and that school permits withdrawing from a class after about 80% of the weeks in the teaching period (I’m not saying whether it’s quarters or semesters). That’s something!


  5. Great exchange. If I’m checking during admission reviews, does that mean I support the proposal? Is that wrong of me? The difference in average GPA between Spelman and Yale is apparently almost 4-tenths. Shouldn’t I know that when comparing applications?


    1. What adds to the complication is that grade point averages are lowest at the open-admission junior colleges.

      At Wisconsin the grades are A, AB, B, BC, C, D, F. If you have to go down 5 grades to get to a C, you can guess that there are not going to be a lot of them. Even if you don’t give out many A’s, you are still likely to give a significant fraction of AB’s and B’s. Wisconsin’s average GPA is 3.2, which means that on average people are centering their grades on a B+, which is cognitively what you’d expect from the grading choices.

      But I’ve looked at grade distributions — they are posted on the web if you know where to look for them, as they are hidden behind a link that wouldn’t clue you in to what was there and available only as a 500-page PDF file. My department is wildly variable. Some people give virtually all As. Others have a distribution centered on B. In general, people making D’s and F’s just drop and I think it is unreasonable and immoral to give people D’s and F’s unless they are deserved.


  6. Andy,
    I think you are right that the collection of information on grade distributions of specific courses is an important next step in efforts to understand the factors affecting grade compression, variation, and average increases. But I think this information, if collected, should be used very carefully. Given the way that this issue has been covered in the media, both locally and nationally, it is very likely that viewers of these grades will assume that a relatively high grade distribution for a course reflects a lack of rigor without a full understanding of other factors shaping the grade distribution within a specific course or differences across time and between departments. There is real harm in this if this overly simplistic interpretation leads award committees, employers, promotion committees, etc. to discount the grades earned by students in these classes and/or to cast doubt on the intellectual merit of an instructor. Similarly, many are likely to erroneously attach assumptions of intellectual rigor to courses with a lower or more even grade distribution.

    Given this, any information that is collected should be used to generate the next stage of the data collection processes. Information about course grade distributions might be sent to all instructors so that they might engage in some self-reflection on their expectations for students, their goals in assigning grades, and how their courses fit into the broader grading structure of the university. Conversations about these issues at the department level should also be encouraged. More importantly, the information on grade distributions could be used by EPC to select a purposive sample of courses or whole departments from which to gather more information about grading standards, course structure, student selectivity, and academic expectations. For these courses we might take a close look at the grading standards, the expectations placed on students, the SAT and GPA of students enrolled in the course, and a range of other information related to the factors that might affect differences in grade distributions across courses and across time. I think it would also be quite useful to assemble some focus groups of instructors to get a handle on the level of diversity in approaches to grading, teaching philosophies, and expectations placed on students. Focus groups of students would also allow us to test our assumptions about how courses are selected.

    There is a lot to be gained from this data collection effort. We might find that most courses with high or right-skewed grade distributions are weak courses with low standards, taught by instructors who carelessly dole out easy A’s and populated by students who search for courses in which they can collect high marks without working much. In this case, putting in place sanctions and/or other measures to correct these problems, and rewards for students and instructors in courses with low grade distributions, would be appropriate. However, I suspect that we are likely to find that the factors shaping the grade distributions of courses are considerably more complex and variable and that knowing this would lead to a different set of remedies.

    You might be right that many of the measures in the proposal currently under consideration might do little tangible harm. However, I hope that the Faculty Council will aspire to loftier goals by extending your excellent efforts to date into the next stage of data collection so that we can ultimately develop strategies that will actually redress the problems associated with the current grading regime.

    Thanks again for your efforts on this.


  7. This is a wonderful exchange. I have one question: do you know anything of the tie between grade inflation and course evaluations?

    I once saw research that showed that the strongest relationship between positive course evaluations and anything else was “expected grade” (not retention, etc.; at least that’s what I think I saw. I couldn’t point you to it now, but I’ll look later today). One might imagine that the growth of teacher evaluations (both in frequency and importance) is tied to the growth of grade inflation.

    Okay, so I lied. Another set of questions: is there a class basis to this? I see from you all that this has happened more in private than in public schools. And I know that colleges are getting more and more expensive, and that the wealth of student bodies is increasing. Any sense of how the wealth of university student bodies and the cost of tuition are tied to the increase in grades?


  8. About a thousand years ago, well ok 40 years ago, when I was an undergrad at Stanford, the official grading policy was something like this: “Courses differ and groups of students vary from semester to semester and grades may well all be high in small advanced seminars, but across multiple semesters in large general courses, professors should expect to find that they give about 15% As, 35% Bs, 35% Cs, and 15% Ds and Fs.” This is a B- curve, which was viewed as inflationary in its day.

    I repeat my earlier point, that once you permit people to drop fairly late in the term, you will generally have virtually no Ds and Fs unless you use one of those evil systems where you tally points and wait until the end of the term to curve them. At my school, the drop date is the 9th week of a 15 week term. That’s late enough to cull most of the really low grades.

    One implication of the “drop day” argument is that no study of grade inflation should ignore drops. At most schools, drops show on a transcript and any transcript-reader with sense is spotting a pattern of repeated drops in evaluating an individual. If you are going to report grade distributions, you ought to report the number who began the course and the number who ended it. At my school, the two relevant numbers would be 1) on the roll after the 10th day of classes (after which drops are recorded, so before that the changes are pre-grade course shuffling), and 2) on the roll at the end and received a grade.


  9. I don’t have time to read the reports listed above so forgive me if these ideas have already been discussed, but two sources of data could provide insight into whether students are doing less work for better grades:

    1) syllabi from earlier times. Senior faculty at my institution often mention reading deflation, and if students are reading fewer pages and writing shorter papers for higher average grades, that might provide some clear evidence, no? Of course, quantity of work is not quality, but it’s probably not a bad proxy for expectations for mastery.

    2) The optimistic scenario laid out by Kyle and Sherryl, whereby students are seeking out courses that best fit their interest and abilities rather than an easy A, withers a bit when one peruses this gem of a website. It’s obviously not a representative sample, but it’s clear that many students are seeking an easy A. One could improve the data by including such questions on official student evaluations.


  10. You heard it here first, folks. Not with a bang, but a whimper, the UNC Faculty Council approved Resolution 2010-3 on a voice vote; there were some “no” votes, but sufficiently few that no division was requested and the motion carried. There will probably be 1-2 years of planning and implementation work to go, but the faculty of the nation’s first (and, if I may say so, finest) public university is on the record approving a sunshine plan putting contextual grade information on transcripts. YEE-HAW!


  11. This is a very interesting discussion. I thought I would chime in with some of my experiences as a recent undergraduate and current graduate student.

    While I agree that publishing the grade distributions of courses would punish students for taking classes where everyone does well, in my undergraduate career I personally felt punished for taking courses where few people did well. I went to a large flagship public university. The mathematics department there, for instance, had a 15% ‘A’s cut-off. I had a professor who had come from an Ivy League school tell us that at that school it was 50% ‘A’s. It hardly seemed fair that students from that school would have higher GPAs and an institutional status bump on top of that–talk about cumulative advantage.

    I also agree that grading hard increases competition among students and fosters a somewhat toxic culture among students in competitive classes. However, in my experience that competition also encourages cooperation. In my case, and I think this holds in general, I never studied in groups for easy courses; I would just get an ‘A’ on my own. The harder a class was graded, the more likely I was to cooperate with other students.

    In my experience, students do seek out easy-A classes. Even good students do it, and many make a veritable science out of it. Knowing that their futures depend partially upon their grades, they would be fools not to do it.

    The saddest part of all of this is that certain disciplines, notably our own, are known for being easy-A classes. This, in my experience, is not due in any way to sociologists practicing better pedagogy (it seemed about the same across disciplines to me). I strongly suspect that being known as an easy-A discipline is harming both the discipline’s reputation at large, and those students who are very serious about sociology but are unable to distinguish themselves through grades in their courses. Students today who experience sociology as ‘easy’, and other disciplines as ‘hard’ are likely to have less respect for us in the future.

    At my undergrad institution’s mathematics department, standards were upheld via common finals: all students in the freshman and sophomore classes took the same final exam regardless of which professor taught their courses. I suspect this would be much more difficult in a discipline like sociology, which does not have a strong paradigm like mathematics.

    There’s probably a lot more I could write. I definitely agree that there is a problem here. And, as a student, I would have much appreciated it if the grade distributions were posted for classes along with my grade–both for classes I did well in, and those I did poorly in.

