Google Translate is the world’s most popular web translation platform, but one Stanford University researcher says it doesn’t really understand sex and gender. Londa Schiebinger, who runs Stanford’s Gendered Innovations project, says Google’s choice of source databases causes a statistical bias toward male nouns and verbs in translation. In a paper on gender and natural language processing, Schiebinger offers convincing evidence that the source texts used with Google’s translation algorithms lead to unintentional sexism. As Fast Company reports:
“In a peer-reviewed case study published in 2013, Schiebinger illustrated that Google Translate has a tendency to turn gender-neutral English words (such as the, or occupational names such as professor and doctor) into the male form in other languages once the word is translated. However, certain gender-neutral English words are translated into the female form . . . but only when they comply with certain gender stereotypes. For instance, the gender-neutral English terms a defendant and a nurse translate into the German as ein Angeklagter and eine Krankenschwester. Defendant translates as male, but nurse auto-translates as female.
“Where Google Translate really trips up, Schiebinger claims, is in the lack of context for gender-neutral words in other languages when translated into English. Schiebinger ran an article about her work in the Spanish-language newspaper El Pais into English through Google Translate and rival platform Systran. Both Google Translate and Systran translated the gender-neutral Spanish words “suyo” and “dice” as “his” and “he said,” despite the fact that Schiebinger is female.
“These sorts of words bring up specific issues in Bing Translate, Google Translate, Systran, and other popular machine translation platforms. Google engineers working on Translate told Co.Labs that translation of all words, including gendered ones, is primarily weighed by statistical patterns in translated document pairs found online. Because “dice” can translate as either “he said” or “she said,” Translate’s algorithms look at combinations of “dice” in conjunction with neighboring words to see what the most frequent translations of those combinations are. If “dice” renders more often in the translations Google obtains as “he says,” then Translate will usually render it male rather than female. In addition, Google Translate’s team added that their platform only uses individual sentences for context. Gendered nouns or verbs in neighboring sentences aren’t weighed in terms of establishing context.”
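The frequency-based choice the engineers describe can be illustrated with a toy sketch. This is not Google's code: the function name, the tables, and every count below are invented for illustration, and a real system works from far larger statistics. The sketch only assumes what the quote states, that a rendering is picked by how often it appears alongside neighboring words, and that only words in the same sentence count as context.

```python
from collections import Counter

# Hypothetical counts of how often "dice" was rendered each way when it
# appeared near a given neighboring word in aligned document pairs.
# All numbers are invented for illustration.
PAIR_COUNTS = {
    ("dice", "que"): Counter({"he says": 70, "she says": 30}),
    ("dice", "ella"): Counter({"she says": 90, "he says": 10}),
}

# Hypothetical fallback counts used when no neighbor in the sentence matches.
DEFAULT_COUNTS = {
    "dice": Counter({"he says": 60, "she says": 40}),
}


def translate_word(word, sentence_tokens):
    """Return the most frequent rendering of `word`, using only the
    tokens of the same sentence as context."""
    totals = Counter()
    for neighbor in sentence_tokens:
        if neighbor == word:
            continue
        # Add up the counts for every (word, neighbor) pair we know about.
        totals.update(PAIR_COUNTS.get((word, neighbor), {}))
    if not totals:
        # No informative neighbor: fall back to the overall counts.
        totals = DEFAULT_COUNTS.get(word, Counter())
    if not totals:
        # Unknown word: leave it untranslated.
        return word
    return totals.most_common(1)[0][0]


# Two consecutive sentences; each one is translated in isolation.
first = "ella es profesora".split()
second = "dice que el algoritmo falla".split()

# "ella" sits in the previous sentence, so it never reaches this call.
print(translate_word("dice", second))

# With "ella" inside the same sentence, the combined counts favor the female form.
print(translate_word("dice", "ella dice que el algoritmo falla".split()))
```

In this sketch the second sentence is rendered with the male form because the only informative neighbor, "que", co-occurs more often with "he says" in the invented counts. The female pronoun one sentence earlier has no effect, which mirrors the single-sentence limitation the Translate team described.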