Morphological Disambiguation of Turkish with Free-order Co-occurrence Statistics

Arslan, Enis; Orhan, Umut; Tahiroğlu, Bekir

doi:10.17714/gumusfenbil.430034

Arslan E., Orhan U., Tahiroğlu B. T.

Gümüşhane Üniversitesi Fen Bilimleri Enstitüsü Dergisi, no. 40662, pp. 46-52, 2018 (Peer-Reviewed Journal)

Publication Type: Article / Full Article
Publication Date: 2018
DOI: 10.17714/gumusfenbil.430034
Journal Name: Gümüşhane Üniversitesi Fen Bilimleri Enstitüsü Dergisi
Journal Indexes: TR DİZİN (ULAKBİM)
Page Numbers: pp. 46-52
Affiliated with Çanakkale Onsekiz Mart University: No

Abstract

This article proposes a solution to the morphological ambiguity problem, which occurs frequently in morphologically complex languages such as Turkish. Statistical methods are generally applied to this task: they maximize the information obtained for a probable word sequence in a sentence. The choice of method for calculating the probabilities and for selecting the sequence depends on the nature of the language. Here, the best word sequence is selected from the alternatives using co-occurrence statistics obtained from a semantic graph network that represents the lemmas of the sentences. Because this network is unambiguous and independent of word order, the statistics can be determined independently. The probability values are obtained with the Naive Bayes (NB) method, and each word sequence is selected by maximization, inspired by the Viterbi algorithm.
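
The idea in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation: the toy corpus, the function names (`build_cooccurrence`, `nb_log_score`, `disambiguate`), the add-one smoothing, and the greedy left-to-right selection are all assumptions made for illustration. The greedy step is a simplified stand-in for the Viterbi-inspired maximization the abstract mentions. The example uses the surface form "kalem", which can be analyzed as the lemma "kalem" (pencil) or as "kale" (castle) plus a possessive suffix.

```python
import math
from collections import Counter
from itertools import combinations


def build_cooccurrence(sentences):
    """Count lemmas and unordered lemma pairs, one count per sentence."""
    lemma_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        lemmas = set(sentence)  # free order: positions inside the sentence are ignored
        lemma_counts.update(lemmas)
        for a, b in combinations(sorted(lemmas), 2):
            pair_counts[(a, b)] += 1
    return lemma_counts, pair_counts


def nb_log_score(candidate, context, lemma_counts, pair_counts, n_sentences):
    """Naive Bayes log-score: log P(candidate) + sum of log P(c | candidate).

    Add-one smoothing is an assumption of this sketch, not taken from the article.
    """
    vocab = len(lemma_counts)
    cand_count = lemma_counts[candidate]
    score = math.log((cand_count + 1) / (n_sentences + vocab))
    for c in context:
        if c == candidate:
            continue
        pair = tuple(sorted((candidate, c)))
        score += math.log((pair_counts[pair] + 1) / (cand_count + vocab))
    return score


def disambiguate(candidates_per_word, lemma_counts, pair_counts, n_sentences):
    """Pick one lemma per word by greedy maximization of the NB score."""
    # Words with a single candidate are unambiguous and seed the context.
    chosen = [c[0] if len(c) == 1 else None for c in candidates_per_word]
    for i, candidates in enumerate(candidates_per_word):
        if chosen[i] is not None:
            continue
        context = [lemma for lemma in chosen if lemma is not None]
        chosen[i] = max(
            candidates,
            key=lambda cand: nb_log_score(
                cand, context, lemma_counts, pair_counts, n_sentences
            ),
        )
    return chosen


# Hypothetical toy corpus of already-disambiguated lemma sentences.
corpus = [
    ["kalem", "kağıt", "yaz"],
    ["kalem", "defter", "yaz"],
    ["kale", "duvar", "yık"],
    ["kale", "asker", "koru"],
]
lemma_counts, pair_counts = build_cooccurrence(corpus)

# "kalem" is ambiguous between the lemmas "kale" and "kalem".
result = disambiguate([["kale", "kalem"], ["kağıt"]], lemma_counts, pair_counts, len(corpus))
print(result)
```

In this toy setting the co-occurrence of "kalem" with "kağıt" (paper) outweighs the alternative, so the pencil reading is chosen; with "duvar" (wall) as context, the castle reading wins instead. A fuller treatment would score whole candidate sequences jointly rather than committing to one word at a time.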