Sixteenth International Conference on the Arts in Society

Abstract

This paper presents a concrete example of how swear word translation can be studied computationally. It describes a method for predicting where swear words can be added or omitted in English-to-Spanish translations. In Juanma Barranquero’s Spanish translation of Neal Stephenson’s “Snow Crash”, the English sentence “Whatsit (sic) stand for?” is translated as “¿Qué coño significa?” (roughly, “What the hell does it stand for?”). Note the addition of a swear word. Research by Davoodi (2009) and Ávila-Cabrera (2016) shows that translators at times expand or condense the original text, but when is this acceptable for swear words, and can it be predicted with a computational model? The study begins by outlining how a bilingual corpus was created from Stephenson’s cyberpunk novel and Barranquero’s Peninsular Spanish translation. It then demonstrates how Quinlan’s (1993) C4.5 decision tree classifier was used to generate a predictive model based on lexical, syntactic, and semantic properties, including the swear word itself, its part of speech, and its category as defined by Ávila-Cabrera’s swear word taxonomy.
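As a rough illustration of the kind of model described above, the sketch below computes the gain ratio, the criterion C4.5 uses to choose which feature to split on. It is not the author's implementation. The rows are invented toy data rather than corpus results, and the feature names (`pos`, `category`), the placeholder values (`cat_A`, `cat_B`), and the labels (`added`, `omitted`) are assumptions made only for this example.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(values).values())


def gain_ratio(rows, feature, target):
    """Gain ratio of `feature` with respect to `target` over a list of dicts."""
    labels = [row[target] for row in rows]

    # Group the target labels by the value the feature takes in each row.
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])

    total = len(rows)
    # Expected entropy of the target after splitting on the feature.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - remainder

    # Split information penalises features with many distinct values.
    split_info = entropy([row[feature] for row in rows])
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy rows: NOT taken from the Snow Crash corpus.
rows = [
    {"pos": "interjection", "category": "cat_A", "action": "added"},
    {"pos": "interjection", "category": "cat_B", "action": "added"},
    {"pos": "noun", "category": "cat_A", "action": "omitted"},
    {"pos": "noun", "category": "cat_B", "action": "omitted"},
]

features = ["pos", "category"]
scores = {f: gain_ratio(rows, f, "action") for f in features}
best_feature = max(features, key=scores.get)
```

A full C4.5 learner would apply this selection recursively to each resulting subset, and would also handle continuous attributes and pruning, which this sketch omits.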

Presenters

Mike Field
Director, Mike Field Enterprises, Ontario, Canada

Details

Presentation Type

Paper Presentation in a Themed Session

Theme

New Media, Technology and the Arts

Keywords

Computational Linguistics, Automatic Translation, Taboo Words, Cyberpunk, Spanish, Decision Tree
