Morphological Disambiguation of Turkish with Free-order Co-occurrence Statistics

Arslan E., Orhan U., Tahiroğlu B. T.

Gümüşhane Üniversitesi Fen Bilimleri Enstitüsü Dergisi, no.40662, pp.46-52, 2018 (Peer-Reviewed Journal)


This article proposes a solution to the morphological ambiguity problem, which occurs frequently in morphologically complex languages such as Turkish. Statistical methods that maximize the information obtained for a probable word sequence in a sentence are generally applicable to this task. The choice of the method for calculating the probabilities, and of the method for selecting the sequence, depends on the nature of the language. Here, the best word sequence is selected from the alternatives by using co-occurrence statistics obtained from a semantic graph network that represents the lemmas of the sentences. Because this network is non-ambiguous and independent of word order, the statistics can be determined independently. The probability values are obtained with the Naive Bayes (NB) method, and each word sequence is selected by maximization, inspired by the Viterbi algorithm.
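The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the add-one smoothing, the toy corpus, and the scoring formula are all assumptions made for illustration. Unordered lemma pairs that share a sentence are counted (a simple stand-in for the edge weights of the lemma graph), each candidate lemma sequence is scored with a Naive Bayes style sum of log-probabilities, and the highest-scoring sequence is returned. For brevity the maximization is done by exhaustive enumeration rather than by Viterbi-style dynamic programming.

```python
import math
from collections import Counter
from itertools import combinations, product

# Toy lemmatized sentences, invented for illustration only.
CORPUS = [
    ["gelin", "düğün", "damat"],
    ["gelin", "düğün"],
    ["gel", "ev", "akşam"],
    ["gel", "ev"],
]


def build_cooccurrence(sentences):
    """Count lemmas and unordered lemma pairs that share a sentence."""
    pair_counts = Counter()
    lemma_counts = Counter()
    for lemmas in sentences:
        unique = sorted(set(lemmas))  # word order is deliberately ignored
        lemma_counts.update(unique)
        for a, b in combinations(unique, 2):
            pair_counts[(a, b)] += 1
    return pair_counts, lemma_counts


def sequence_score(sequence, pair_counts, lemma_counts, alpha=1.0):
    """Naive Bayes style log-score: a prior for each lemma plus the
    smoothed log-probability of every other lemma given that lemma."""
    vocab = len(lemma_counts)
    total = sum(lemma_counts.values())
    score = 0.0
    for i, lemma in enumerate(sequence):
        # Smoothed prior log P(lemma)
        score += math.log((lemma_counts[lemma] + alpha) / (total + alpha * vocab))
        for j, context in enumerate(sequence):
            if i == j:
                continue
            key = tuple(sorted((lemma, context)))
            # Smoothed log P(context | lemma) from unordered pair counts
            score += math.log(
                (pair_counts[key] + alpha) / (lemma_counts[lemma] + alpha * vocab)
            )
    return score


def best_sequence(candidates, pair_counts, lemma_counts):
    """Return the highest-scoring sequence, one candidate lemma per word.
    Exhaustive search: only practical for short sentences."""
    return max(
        product(*candidates),
        key=lambda seq: sequence_score(seq, pair_counts, lemma_counts),
    )


pair_counts, lemma_counts = build_cooccurrence(CORPUS)
# The surface form "gelin" is ambiguous between the lemmas "gelin" and "gel".
choice = best_sequence([["gelin", "gel"], ["düğün"]], pair_counts, lemma_counts)
```

On this toy corpus the ambiguous word resolves to the lemma "gelin" when it appears with "düğün", and to "gel" when it appears with "ev". Exhaustive enumeration grows exponentially with the number of ambiguous words, which is why a Viterbi-inspired maximization is attractive for full sentences.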