GAZİOSMANPAŞA BİLİMSEL ARAŞTIRMA DERGİSİ (GBAD), vol. 8, no. 2, pp. 35-48, 2019 (Peer-Reviewed Journal)
In this study, we present a semantic graph network model capable of detecting out-of-vocabulary (OOV) words in Turkish texts. In the field of natural language processing (NLP), morphological
analyzers can encounter unknown words (UW) during word processing. This mostly occurs when such
tools depend on a dictionary to find probable lemmas before parsing can proceed. Sometimes,
an analyzer is unable to find any candidates because the lemma does not exist in the
dictionary. This results in degraded parsing output. The proposed OOV detection model is able to identify
OOV words that are suitable for inclusion in dictionaries. In addition, the co-occurrence relations of lemmas in texts are
modelled as a semantic sub-graph, which is used to discover collocations to propose as new lemma candidates.
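
The two ideas in the abstract, dictionary-based OOV detection and a co-occurrence graph used to surface collocations, can be illustrated with a minimal sketch. It is not the model described in the paper: the function names, the use of adjacent lemma pairs as graph edges, and the frequency threshold `min_count` are all assumptions made for illustration, and the toy lemma lists stand in for the output of a real morphological analyzer.

```python
from collections import Counter


def find_oov(lemmas, dictionary):
    """Return the lemmas that are missing from the dictionary (hypothetical helper)."""
    return [lemma for lemma in lemmas if lemma not in dictionary]


def build_cooccurrence_graph(sentences):
    """Count directed edges between adjacent lemmas.

    Assumption: co-occurrence means immediate adjacency within a sentence.
    The paper may define the relation differently.
    """
    graph = Counter()
    for lemmas in sentences:
        for left, right in zip(lemmas, lemmas[1:]):
            graph[(left, right)] += 1
    return graph


def collocation_candidates(graph, min_count=2):
    """Return edges seen at least min_count times (assumed threshold), sorted."""
    return sorted(pair for pair, count in graph.items() if count >= min_count)


# Toy data: lemmatized sentences and a small dictionary (illustrative only).
sentences = [
    ["kara", "kutu", "bul"],
    ["kara", "kutu", "al"],
    ["kutu", "al"],
]
dictionary = {"kara", "kutu", "bul"}

oov = find_oov(["kara", "kutu", "blog"], dictionary)
graph = build_cooccurrence_graph(sentences)
candidates = collocation_candidates(graph)
```

Here frequently adjacent pairs such as `("kara", "kutu")` would be proposed as collocation candidates, while a token absent from the dictionary is flagged as OOV. A raw frequency threshold is the simplest possible filter; an association measure could replace it without changing the graph structure.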