A Corpus-Based Analysis of Mediation in EU Multi-word Organization Names


Conference


Fernando Sánchez Rodas
Multi-word Units in Machine Translation and Translation Technology (MUMTTT), EUROPHRAS, Málaga (Spain), 2022 Sep 30


APA
Sánchez Rodas, F. (2022). A Corpus-Based Analysis of Mediation in EU Multi-word Organization Names. In Multi-word Units in Machine Translation and Translation Technology (MUMTTT). Málaga (Spain): EUROPHRAS.


Chicago/Turabian
Sánchez Rodas, Fernando. “A Corpus-Based Analysis of Mediation in EU Multi-Word Organization Names.” In Multi-Word Units in Machine Translation and Translation Technology (MUMTTT). Málaga (Spain): EUROPHRAS, 2022.


MLA
Sánchez Rodas, Fernando. “A Corpus-Based Analysis of Mediation in EU Multi-Word Organization Names.” Multi-Word Units in Machine Translation and Translation Technology (MUMTTT), EUROPHRAS, 2022.


BibTeX

@conference{fernando2022a,
  title = {A Corpus-Based Analysis of Mediation in EU Multi-word Organization Names},
  year = {2022},
  month = sep,
  day = {30},
  address = {Málaga (Spain)},
  organization = {EUROPHRAS},
  author = {Sánchez Rodas, Fernando},
  booktitle = {Multi-word Units in Machine Translation and Translation Technology (MUMTTT)},
  month_numeric = {9}
}

Abstract
This study uses Named Entity Recognition (NER) to extract a specific type of multi-word entity, namely multi-word organization names (MWORGs), from an English-Spanish comparable corpus of European Parliament documents. Following a triadic, Peircean model of translation and grammar, we hypothesize that MWORGs are nominal constructions (or signs) that serve a semiotic function of mediation in EU translations (Stecconi 2009; Torres-Martínez 2022). The performance of the VIP-DeepPavlov NER system (Corpas Pastor 2021) on MWORGs is evaluated in terms of precision, recall, and F1 scores. Relevant MWORGs are then annotated and analyzed from a contrastive, semi-constructional approach (Boas 2010) to determine how many of them are mediating, and under which schemata. As expected, non-mediating constructions prevail in non-translated English (66%), and mediating constructions prevail in translated Spanish (81%). However, a surprising 34% of the organization names in non-translated English are mediating; conversely, 19% of the MWORGs in translated Spanish serve a non-mediating function. Seven different mediation schemata (blending, borrowing, translation, and further combinations of the three) were discovered among MWORGs, some of them language-preferent. This reinforces our belief that names are largely disregarded semiotic hubs and a crucial piece in understanding (non-)translations and (non-)interpretations as construction-based grammars with a specific number of similar, different, and mediating rules in each language and textual typology.
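
The evaluation step mentioned in the abstract (precision, recall, and F1 over extracted MWORGs) can be illustrated with a minimal Python sketch. It assumes exact string matching between extracted names and a gold list, which may differ from the matching criterion used in the study. The function name `precision_recall_f1` and the toy name sets are hypothetical; they are not output of the VIP-DeepPavlov system or data from the corpus.

```python
def precision_recall_f1(predicted, gold):
    """Exact-match precision, recall and F1 over two collections of names.

    Assumption: a predicted name counts as correct only if the identical
    string appears in the gold list (no partial or fuzzy matching).
    """
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical toy data, for illustration only.
gold_names = {
    "European Parliament",
    "European Central Bank",
    "Court of Justice of the European Union",
    "World Health Organization",
}
predicted_names = {
    "European Parliament",
    "European Central Bank",
    "European Union",  # not in the toy gold list, so a false positive here
}

p, r, f1 = precision_recall_f1(predicted_names, gold_names)
print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

With these toy sets, two of the three predicted names are in the gold list and two of the four gold names are found, so precision is 2/3, recall is 1/2, and F1 is 4/7.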