Disentangling acoustic and social biases in creaky voice perception: The effects of f0 and face gender on creakiness ratings

A study of how creaky voice perception is shaped by speaker f0 and perceived speaker gender.

This project is a direct follow-up to my previous study on creaky voice production in bilingual speakers. While that work (alongside others’) showed that men’s voices are acoustically creakier than women’s, public discourse since the early 2010s has increasingly portrayed vocal fry as a feature of young women’s speech. This study (Brown & Clayards, 2026) investigates whether that apparent contradiction could be explained by biases in perception (an acoustic pitch-contrast bias vs. a social gender bias).

Using a matched-guise paradigm, Canadian English listeners rated the creakiness of identical voice recordings manipulated to vary in f0 and paired with either female or male faces. By independently controlling acoustic cues and social information about the speaker, the design makes it possible to isolate how each factor shapes listeners’ judgments. The results show strong effects of voice quality and moderate effects of pitch: genuinely creaky and lower-pitched voices were consistently judged as more creaky. Face gender alone had little overall impact. However, a subtle interaction between f0 and face gender suggests that listeners draw on gender prototypes when cues are ambiguous: lower-pitched voices paired with female faces were judged slightly creakier, while higher-pitched voices paired with male faces received higher creakiness ratings. These effects are small and cannot account for the widespread belief that women use more vocal fry.

Taken together with the production findings, this study calls into question the dominant narrative that creaky voice is primarily a feature of young women’s speech. Across both acoustic production data and controlled perception experiments, evidence for women-led creak is weak or absent. The discrepancy instead appears to lie between these empirical approaches and findings from impressionistic sociolinguistic studies and popular discourse. Rather than resolving the puzzle, this work clarifies its scope and highlights the need to better understand how strong social perceptions can emerge despite limited support from acoustic and experimental evidence.