Abstract

Terminological Concept Systems (TCS) provide a means of organizing, structuring and representing domain-specific multilingual information. They are important for ensuring terminological consistency in many tasks, such as translation and cross-border communication. While several approaches to (semi-)automatic term extraction exist, learning the relations between terms remains largely unexplored. We propose an automated method to extract terms and relations across natural languages and specialized domains. To this end, we adapt pretrained multilingual neural language models. Evaluated on standard term extraction datasets, they achieve the best-performing results; evaluated on a combination of standard relation extraction datasets, they achieve competitive results. Code and dataset are publicly available.

Keywords

Hyperparameter, Computer science, Replication (statistics), Code (set theory), Key (lock), Language model, Artificial intelligence, Machine learning, Training set, Natural language processing, Programming language, Statistics, Mathematics, Set (abstract data type), Computer security

Publication Info

Year: 2021
Type: preprint
Citations: 16995
Access: Closed

Citation Metrics

OpenAlex: 16995
Influential: 0

Cite This

Yinhan Liu, Myle Ott, Naman Goyal et al. (2021). Towards Learning Terminological Concept Systems from Multilingual Natural Language Text. Leibniz-Zentrum für Informatik (Schloss Dagstuhl). https://doi.org/10.4230/oasics.ldk.2021.22

Identifiers

DOI: 10.4230/oasics.ldk.2021.22
