Spelling Correction Using Recurrent Neural Networks and Character Level N-gram


KINACI A. C.

International Conference on Artificial Intelligence and Data Processing (IDAP), Malatya, Türkiye, 28-30 September 2018

  • Publication Type: Conference Paper / Full-Text Paper
  • City of Publication: Malatya
  • Country of Publication: Türkiye
  • Affiliated with Çanakkale Onsekiz Mart Üniversitesi: Yes

Abstract

Spelling correction is the process of finding the correct word for a misspelled word in a text. A system that aims to fix such an error cannot know the writer's intent, yet it should still find the word that the writer meant to write. In this study, we trained a recurrent neural network on dictionary words and used it as an oracle. For a misspelled word, this oracle returns a candidate dictionary word. A character-level bigram model is used to generate new query words from the misspelled word. These new query words are also given to the trained network to obtain more candidate dictionary words. To test the method's performance, randomly distorted dictionary words are used. Results showed that the trained network had an acceptable level of accuracy. In addition, finding candidates with the generated query words improved accuracy compared with using only the misspelled word.
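
The abstract does not say how the bigram model produces new query words or how test words are distorted, so the following is only a minimal sketch under stated assumptions. It assumes that query words are single-character substitutions of the misspelled word, ranked by an add-one-smoothed character bigram score learned from the dictionary, and that a test word is distorted by replacing one random character. The function names, the "#" boundary marker, the top_k parameter, and the toy dictionary are all hypothetical and do not come from the paper. The recurrent network oracle is not reproduced here.

```python
import random
from collections import Counter

BOUNDARY = "#"  # hypothetical word-boundary marker, not taken from the paper


def train_bigrams(words):
    """Count character bigrams over boundary-padded dictionary words."""
    counts = Counter()
    for w in words:
        padded = BOUNDARY + w + BOUNDARY
        counts.update(zip(padded, padded[1:]))
    return counts


def bigram_score(word, counts):
    """Product of add-one-smoothed bigram counts; higher means more word-like.

    Only meaningful for comparing words of equal length.
    """
    padded = BOUNDARY + word + BOUNDARY
    score = 1.0
    for pair in zip(padded, padded[1:]):
        score *= counts[pair] + 1
    return score


def generate_queries(word, counts, alphabet, top_k=3):
    """Assumed strategy: try every single-character substitution and keep
    the top_k variants with the highest bigram score."""
    variants = set()
    for i in range(len(word)):
        for ch in alphabet:
            if ch != word[i]:
                variants.add(word[:i] + ch + word[i + 1:])
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(variants, key=lambda v: (-bigram_score(v, counts), v))[:top_k]


def distort(word, alphabet, rng):
    """Assumed distortion: replace one random character with a different one."""
    i = rng.randrange(len(word))
    ch = rng.choice([c for c in alphabet if c != word[i]])
    return word[:i] + ch + word[i + 1:]


# Toy usage with a hypothetical four-word dictionary.
dictionary = ["cat", "car", "cart", "care"]
counts = train_bigrams(dictionary)
queries = generate_queries("cqt", counts, alphabet="acerqt")
noisy = distort("care", "acerqt", random.Random(0))
```

In the setup the abstract describes, each generated query word, together with the original misspelled word, would then be passed to the trained network to collect candidate dictionary words.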