Measuring Diversity

This week I’m going to use the blogger’s best friend and repost the interesting blogs of others with commentary.

The Chronicle of Higher Education hosts a very interesting Data Points blog that recently ran a three-part series on measuring diversity.

The first post began by looking at the impact of Michigan’s ban on affirmative action on minority enrollment but ended by noting that while enrollment of African-Americans and Hispanics had declined, diversity appeared to be stable.  How can this apparent paradox be resolved?  The answer has to do with measurement and the interaction of measurement with culture.

One part of the answer has to do with the part of the population that does not report race.  The size of this population and its racial composition appear to be related to the existence of race-based affirmative action.  When such affirmative action exists, whites are less likely to report race; when it does not exist, other groups are less likely to report race.  This makes it very difficult to determine how exactly changes like those in Michigan affect enrollments, since the change not only affects enrollments but also whether individuals report their race at all.

It also matters how people report their race and/or how they are allowed/encouraged to report their race.  In particular it matters whether people have the opportunity to report more than one race. The second post discusses how the US Department of Education measures race.  A key point in this discussion is that while the census began allowing respondents to identify with multiple races in 2000, the Department of Education did not do so until 2009.  This makes it hard to evaluate changes in racial composition of students before and after this change.  The apparent decline in African-American and Hispanic students may result from multiracial students now reporting as multiracial rather than selecting a single race category.

As a sociologist, I must point out that race is a social construct and not biologically determinable or verifiable.  If you say you are Asian, I cannot give you any objective psychological, biological or other kind of test to confirm or falsify your statement.  As a sociologist, I would argue that the same is true of the male/female gender binary.  The futility of the biological approach is revealed in the very attempts to find a biologically determined gender binary.  This comes up most often in the context of sports, which often seek to enforce rigid gender segregation.  The difficulty of determining sex for the purposes of classifying competitors demonstrates that sex is no simple biological fact. (In the 48 hours since I drafted this blog another major case of gender testing in sports has been in the news.)

The lack of biological correlates does not mean that race and gender are not “real.”  As the sociologist W.I. Thomas famously said “If men [sic] define situations as real, they are real in their consequences.”  Race and gender–and the assumption that these are biological–matter in our lives because we act as if they are real.

The third post looks at how diversity is measured, comparing a measure called the diversity index to more traditional comparisons to local demographics.  EvCC has typically used comparisons to local demographics although, as the blog post indicates, this can be more complicated than it at first seems.  EvCC also faces the same challenges in categorization described in these articles.  Some students do not report their race.  We have difficulties with data over time because like other institutions we now ask people if they are multiracial when previously we did not.
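For the curious, here is a rough sketch of how one common version of a diversity index works: it is the probability that two students chosen at random belong to different racial/ethnic groups. The Chronicle post describes its own variant; the version and the enrollment counts below are mine, purely for illustration.

```python
# One common diversity index: the probability that two randomly chosen
# students belong to different racial/ethnic groups, i.e. 1 minus the sum
# of squared group proportions. The enrollment counts below are made up.

def diversity_index(counts):
    """Probability that two randomly chosen students differ in group."""
    total = sum(counts.values())
    return 1 - sum((n / total) ** 2 for n in counts.values())

enrollment = {"White": 600, "Black": 100, "Hispanic": 120,
              "Asian": 110, "Two or more races": 70}
print(round(diversity_index(enrollment), 3))  # -> 0.599
```

Note how the categorization issues described above feed directly into this number: moving students from single-race categories into a “two or more races” category changes the index even when no actual student changed.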





How often do students . . .

This week we are looking at some of the results of the Community College Survey of Student Engagement (CCSSE) and the Community College Faculty Survey of Student Engagement (CCFSSE). You can find the CCSSE report and the full results on the IR intranet page.

The first comparison has to do with how often students come to class unprepared.


table 4

The second question concerns how often students skip class.

table 1
The final example concerns class participation.

table 2

The point of surveying both the faculty and the students on similar questions is to give insight into how those two groups view similar issues. I am interested to know how people interpret the differences in responses.

My own first response is methodological and was initially pointed out to me by my colleague Bonnie January.

While the questions address similar issues, faculty and students are not being asked precisely the same question. Faculty are asked about students collectively, while students are asked about themselves individually. The faculty questions are also a bit ambiguous. Does “very often” mean that most students in most classes ask questions, skip, or come unprepared, or that some student does those things in most classes, or something else altogether? It isn’t exactly clear, and some of these results may differ for that reason.

If we wanted them to be the same, we would have to ask faculty a series of questions: What percentage of students ask questions or contribute to class discussions very often? What percentage do so often? What percentage do so sometimes? And so on. The problem here is that a survey that is already longish would become four times as long.

My own view, however, is that there is more to these differences than question wording. How would you interpret these differences? Please share your thoughts by commenting on this post.


College Success is a Success

This past fall, EvCC scaled up its college success course, now known as College 101. The course is designed to give students the skills they need to be effective college students. The long-term goal is to increase student success, particularly student completions. A short-term indicator of success is retention. Table 1 shows the retention rate for course takers and non-takers in the target population, which was all new degree-seeking students except for Running Start students.* The data show that course takers were 17% more likely to return in the winter than non-takers.

table 1

Table 2 looks at the same numbers but for different subgroups within the target population. Transfer-Specific students indicated a specific program of study (e.g., business, biology, history) for their associate’s degree. Transfer-General students indicated that they intend to transfer but did not choose a specific program. Undeclared students are completely undeclared with regard to both program and transfer/prof tech. In all four cases, the College 101 course is associated with higher retention, and the highest relative benefit is for undeclared students.

table 2

Because students self-selected into the class, there is the possibility that the observed differences are the result of selection bias. This concern cannot be entirely eliminated, but the course-takers and non-takers have roughly similar levels of placement into college math, which is a rough indication that they are more or less equally prepared for college. While there can be no definitive evaluations after so short a time, these data are certainly encouraging.
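To make the arithmetic behind “17% more likely” concrete, here is a small sketch. The counts below are invented (the actual numbers are in Table 1); what matters is that the comparison is a ratio of the two retention rates.

```python
# Hypothetical sketch of the Table 1 comparison; these counts are invented.
# "17% more likely" compares the two groups' retention rates as a ratio.

def retention_rate(returned, enrolled):
    return returned / enrolled

takers = retention_rate(234, 260)       # hypothetical course-takers: 90.0%
non_takers = retention_rate(770, 1000)  # hypothetical non-takers: 77.0%
print(f"relative difference: {takers / non_takers - 1:.0%}")  # -> 17%
```

Note that a ratio comparison (“17% more likely”) is different from a difference in percentage points (13 points in this made-up example); both are legitimate, but it is worth being clear about which one is being reported.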

* Some programs have integrated the college success curriculum into their program curriculum. Students in these programs are excluded from this analysis.

Compared to What?

As some of you may know, the Seahawks’ Super Bowl opponents, the New England Patriots, have been accused of cheating by deflating footballs during last week’s AFC championship game against the Indianapolis Colts. Underinflation allegedly gives the quarterback a better grip on the ball. The primary evidence against the Patriots is a report that, when inspected during or after the game, eleven of the twelve balls provided by the Patriots were significantly underinflated.

This report has led to a lot of speculation and accusation based on the “evidence” of eleven out of twelve.

But what does this evidence actually show?



Because, as the great statistician Edward Tufte wrote, “At the heart of quantitative reasoning is a single question: Compared to what?”

Whether eleven of twelve (91%) is a lot depends entirely on what you are comparing it to. A lot more than eleven of every twelve soldiers returned from Vietnam, but that does not mean that our casualties were light. People who drive while drunk survive a lot more than eleven out of twelve times, but that doesn’t make drunk driving safe.

One of the ways people lie with statistics is by presenting them as if they were self-referential. “I mean, I could see if it were three or four balls, but eleven out of twelve? C’mon, somethin’s gotta be goin’ on there.”

Maybe or maybe not.

We know that inflated things (car tires, footballs, air mattresses) tend to deflate over time. We can see the punishment a football takes during a game. Perhaps eleven out of twelve balls are generally deflated by the end of a game. Of course we can’t go back and collect data from games past to find the average deflation of a game ball but in this case that wasn’t necessary.

It turns out that each team provides its own game balls. The key piece of evidence in this case is the inflation of the Colts’ game balls. If eleven of the twelve Colts balls were deflated after the game, then the state of the Patriots’ balls is just the state of game balls after a game. If only one or two Colts balls were deflated, then we have pretty strong evidence that something was done to the Patriots’ balls.

If the NFL was able to measure the Patriots’ game balls, surely they could have measured the Colts’ balls as well, compared the two sets, and had very good evidence from which to draw conclusions. So far it doesn’t look as though they did this and, as a result, we have only the appearance of evidence: heat but no light.
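For what it’s worth, the comparison I’m describing can be sketched as a small simulation. Suppose, hypothetically (the NFL released no such number), that only two of the twelve Colts balls were underinflated. If underinflation had nothing to do with which team supplied the ball, how often would a split as lopsided as eleven-to-two occur by chance?

```python
# A simulation of the "compared to what" question. Suppose 13 of the 24
# game balls were underinflated (11 Patriots + a hypothetical 2 Colts).
# If team had nothing to do with it, how often would one team's dozen
# contain 11 or more (or 2 or fewer) of the deflated balls just by chance?
import random

random.seed(1)
balls = [1] * 13 + [0] * 11  # 1 = underinflated, 0 = fine

trials, extreme = 100_000, 0
for _ in range(trials):
    random.shuffle(balls)
    patriots = sum(balls[:12])           # deal the first dozen to one team
    if patriots >= 11 or patriots <= 2:  # a split at least this lopsided
        extreme += 1
print(f"chance of a split this lopsided: {extreme / trials:.4f}")
```

Under these made-up numbers the lopsided split almost never happens by chance, which is exactly why measuring the Colts balls would have been such valuable evidence.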

The take away is: Every time you see a statistic you should ask yourself “Compared to what?”

Working Tables #1: Running Start Failure Rates

Running Start generates strong feelings and contradictory hypotheses.  Here are some data that bear on the relative academic virtues of Running Start students at EvCC.  These tables show the proportions of students who received a grade lower than a C.  Students in the ORCA program have been excluded.

English annual

math annual

The first table shows that Running Start students have consistently passed English 101 at higher rates than other students and that the difference between Running Start and regular students has narrowed as a result of improved pass rates by regular students.  The second table shows that the failure rate in Math is generally higher for all students but that, as in English, Running Start students consistently outperform other students.  The math time series is shorter because Math 141 was not offered prior to 2008.

The following two tables show the data broken down into quarters and also provide sample sizes.  If you click on the tables you will see larger versions.  One interesting feature of these data is that in the past few years the Running Start advantage in English appears to be restricted to fall quarter.

english quarterly

math quarterly

I hope these data will generate conversation. What do you think these data tell us about Running Start?  How else could we look at the data?  How might similar data look for other courses?




Evaluating Teaching

Evaluating teacher effectiveness is one of the more controversial aspects of contemporary educational reform.  The IDEA form used by EvCC has some nice features but is relatively generic.  At the primary and secondary level, teaching evaluations can be highly quantitative and linked to standardized testing.  One increasingly popular method for these evaluations is the Value Added Model (VAM).  VAMs are complicated statistical models that attempt to separate the effects of teachers and schools from the effects of differences in students’ backgrounds.

The American Statistical Association (ASA) has just published a statement on using value added models for educational assessment. The statement is brief and uses plain language to describe some of the problems with the value added model (VAM) approach.  The statement strongly implies that efforts to improve education by focusing on teaching quality may be misplaced as

Most estimates in the literature attribute between 1% and 14% of the total variability [in outcomes] to teachers. This is not saying that teachers have little effect on students, but that variation among teachers accounts for a small part of the variation in scores. The majority of the variation in test scores is attributable to factors outside of the teacher’s control such as student and family background, poverty, curriculum and unmeasured influences (emphasis in original).

While the statement can be read as discouraging the use of VAMs for high-stakes comparison and evaluation, the ASA does see value in using VAMs for evaluating policies or teacher training programs.

The ASA statement also points out that certain statistical properties of VAM scores (large standard errors) make VAM rankings unstable, and it recommends that VAM estimates always be accompanied by measures of precision.  Measures of precision are a way of accounting for uncertainty and are important in any evaluation.  When we see a difference between two groups (online vs. face-to-face classes, flipped vs. “normal” classes, EvCC vs. Edmonds), there are always at least two possible explanations.  The difference might be the result of the specified difference between the groups, or it might be due to chance.  Measures of precision tell us about the likely impact of chance.  Two common measures of precision are the margin of error that is printed with most reputable polls (although often at the bottom and in small print) and measures of statistical significance.
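As an illustration, here is how the margin of error for a simple percentage is computed. The respondent numbers below are hypothetical; the 1.96 multiplier is the standard one for roughly 95% confidence.

```python
# Margin of error for a proportion at roughly 95% confidence: the familiar
# "plus or minus" printed with polls. The survey numbers below are made up.
import math

def margin_of_error(p, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a proportion."""
    return z * math.sqrt(p * (1 - p) / n)

p, n = 0.55, 400  # hypothetical: 55% of 400 respondents said yes
print(f"{p:.0%} plus or minus {margin_of_error(p, n):.1%}")  # -> 55% plus or minus 4.9%
```

If a second group in this made-up example came in at 52%, the two results would sit well within each other’s margins of error, and chance alone could easily explain the gap.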





Costs of attending Community College in Washington State

Here is an interesting graphic showing annual percentage increases in tuition and fees and annual costs for full time students.


