How not to study bias

Torsten Skov
Independent scholar

The paper “Gender information and perceived quality: An experiment with professional soccer performance”[1] was presented under the heading “Do men really play better soccer than women?”, and indeed the paper claims that it “challenges the idea that the relatively low demand for women’s professional soccer is due to the poor quality of female players’ technical performance” (p. 4). The implication that women play soccer just as well as men seems radical in view of the examples of elite female teams losing to high school boys. There can be little doubt that elite men’s teams would invariably beat elite women’s teams. That is why there are separate tournaments for men and women.

Perhaps the authors only want to make the weaker claim that the technical inferiority of women’s soccer compared to men’s has no bearing on the low demand for women’s professional soccer: women’s soccer is in low demand not because of its low quality, but because of a prejudice among consumers that it is of low quality. This interpretation admits that the technical quality of women’s soccer is lower, while maintaining that the audience’s perception that this is the case is biased. The audience is right for the wrong reasons, and therefore the audience isn’t right.

How do the authors argue this case? Central to the argument is the claim that evaluations of sports are biased by gender stereotypes, and the study purports to demonstrate this bias in the evaluation of video clips of soccer games. The setup involved around 600 participants who evaluated 10 video clips of soccer from the World Cup and the Champions League, 5 from the men’s tournaments and 5 from the women’s. Half of the participants evaluated unmodified clips; the other half evaluated versions blurred so that the players’ sex was not discernible. The finding was that when the players’ sex could not be identified, the men’s and the women’s play were scored as being of equal quality, whereas when the sex was evident, the men’s play was scored as superior. The authors conclude that the evaluation of the blurred videos was unbiased and, consequently, that the evaluation of the unmodified videos was biased by gender stereotypes.

On the face of it, this looks like a valid conclusion. But it isn’t.

When studying bias, the first thing you need is a gold standard by which to judge what is biased and what is not. If we want to know whether people are biased in their evaluation of height, we compare their guesses with the heights measured by a yardstick or some other reliable, validated method, the gold standard. The principle is quite simple.


Although they do not state it explicitly, the authors treat the blurred-video evaluation as the gold standard of an unbiased evaluation. Whether this method measures anything in a reliable and valid way has, however, not been established. The authors admit that the blurring is likely to introduce noise into the evaluation, and that they were not able to come up with a better method. Figure 1 of the paper does indeed indicate that it took heavy blurring to obliterate the players’ sex. Surely, it cannot be as enjoyable for a soccer enthusiast to watch a blurred game as to watch one where the details can be discerned. That may be an acceptable shortcoming of the method, provided it is still able to distinguish between more and less enjoyable play. But what if the method does not even do that? What if the read-out is entirely random? Then the finding of no difference in the evaluation of the blurred videos is explained by the method’s inability to detect any differences even when differences are present.
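This failure mode is easy to sketch in a small simulation. The numbers below are hypothetical, not the paper’s data: suppose an ideal, noise-free viewer would rate the men’s clips 0.5 points higher on a 10-point scale, and model each viewer’s rating as some fraction of that true quality signal plus random viewer noise. Setting the signal fraction to zero models a read-out that is entirely random.

```python
import math
import random
import statistics

random.seed(42)

# Hypothetical true quality: men's clips rated 0.5 points higher
# on a 10-point scale by an ideal, noise-free viewer.
TRUE_MEAN = {"men": 7.0, "women": 6.5}

def ratings(group, n, signal, noise_sd=1.0):
    """One rating per viewer: signal * true quality + random viewer noise.
    signal=1 models a faithful read-out (clear clips);
    signal=0 models an entirely random read-out (heavily blurred clips)."""
    return [signal * TRUE_MEAN[group] + random.gauss(0.0, noise_sd)
            for _ in range(n)]

def z_statistic(signal, n=1000):
    """Two-sample z-statistic for the men-women gap in mean ratings."""
    men = ratings("men", n, signal)
    women = ratings("women", n, signal)
    gap = statistics.mean(men) - statistics.mean(women)
    se = math.sqrt(statistics.variance(men) / n
                   + statistics.variance(women) / n)
    return gap / se

print(f"clear clips:   z = {z_statistic(signal=1):.1f}")  # large: gap easily detected
print(f"blurred clips: z = {z_statistic(signal=0):.1f}")  # near zero: gap invisible
```

With the quality signal intact, the 0.5-point gap shows up as an overwhelming z-statistic; with the signal zeroed out, the two groups are statistically indistinguishable even though the true gap is unchanged. A null result from an uninformative instrument, in other words, tells us nothing about the underlying difference.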

It is not a good idea to establish the gold standard by a method that has never proven its validity and whose validity from a common-sense perspective is doubtful. The main finding of no difference in the evaluation of the blurred videos is likely a spurious effect produced by a methodological flaw. Which means that the study shows nothing at all.

Copyright © Torsten Skov 2023

[1] Gomez-Gonzalez, C., Dietl, H., Berri, D. and Nesseler, C. (2023). Gender information and perceived quality: An experiment with professional soccer performance. Sport Management Review, 11 July 2023, pp. 1–22.



