Abstract
This paper presents a concrete example of how the translation of swear words can be studied computationally. It describes a method for predicting where swear words can be added or omitted in English-to-Spanish translation. In Juanma Barranquero’s Spanish translation of Neal Stephenson’s “Snow Crash”, for instance, the English sentence “Whatsit (sic) stand for?” is rendered as “¿Qué coño significa?” (or “What the hell does it stand for?”), adding a swear word that is absent from the original. Research by Davoodi (2009) and Ávila-Cabrera (2016) shows that translators at times expand or condense the original text, but when is this allowed for swear words, and can a computational model predict it? The study begins by outlining how a bilingual corpus was built from Stephenson’s cyberpunk novel and Barranquero’s Peninsular Spanish translation. It then demonstrates how Quinlan’s (1993) C4.5 decision tree classifier was used to generate a predictive model from lexical, syntactic and semantic properties, including the swear word itself, its part of speech, and its category in Ávila-Cabrera’s swear word taxonomy.
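The attribute selection at the heart of C4.5 can be illustrated with a minimal sketch. The Python below computes the gain ratio, the measure C4.5 uses to decide which attribute to split on, for two categorical features. The rows, the feature names (`pos`, `category`), the placeholder category values, and the outcome labels are all invented for illustration; they are not drawn from the study's corpus or from Ávila-Cabrera's taxonomy. A full C4.5 implementation also handles continuous attributes, missing values, and pruning, none of which is shown here.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, feature, target):
    """Gain ratio of splitting `rows` on a categorical `feature`."""
    total = len(rows)
    # Group the target labels by the value of the candidate feature.
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    # Expected entropy of the target after the split.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalises features with many distinct values.
    split_info = entropy([row[feature] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


def best_feature(rows, features, target):
    """Feature with the highest gain ratio, i.e. the root split."""
    return max(features, key=lambda f: gain_ratio(rows, f, target))


# Hypothetical toy data: values are placeholders, not corpus observations.
ROWS = [
    {"pos": "interjection", "category": "type_a", "outcome": "kept"},
    {"pos": "interjection", "category": "type_b", "outcome": "kept"},
    {"pos": "adjective", "category": "type_a", "outcome": "omitted"},
    {"pos": "adjective", "category": "type_b", "outcome": "omitted"},
    {"pos": "noun", "category": "type_a", "outcome": "kept"},
    {"pos": "noun", "category": "type_b", "outcome": "omitted"},
]

if __name__ == "__main__":
    for name in ("pos", "category"):
        print(name, round(gain_ratio(ROWS, name, "outcome"), 4))
    print("root split:", best_feature(ROWS, ["pos", "category"], "outcome"))
```

On this toy data the sketch selects `pos` as the root split, because part of speech separates the two outcomes more cleanly than the placeholder category does. A decision tree is grown by repeating the same selection on each resulting subset.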
Details
Presentation Type
Paper Presentation in a Themed Session
Theme
New Media, Technology and the Arts
Keywords
Computational Linguistics, Automatic Translation, Taboo Words, Cyberpunk, Spanish, Decision Tree