Welcome!

The Nepali NLP Group conducts research and development in natural language processing. Our research combines findings from linguistics with machine learning methods to develop efficient algorithms for processing Nepali text.


Broadly, we work in the following areas:


  • Nepali NLP, morphology, parsing

  • Information extraction, data mining

  • Text analytics, social media analytics

  • Linguistic resource development: corpora, lexicons




    NELRALEC Tagset: A Part-of-speech Tagset for Nepali Language

    by Shreeya Singh Dhakal on Sept. 22, 2017


    Part-of-speech tags are word classes, or syntactic categories, of words. They carry important information about a word, its neighbours, and how they relate to each other. A part-of-speech tag also indicates the possible morphological affixes for a given word ...


    Tags: NLP


    Processing Unicode (Devanagari) in Python

    by Ingroj Shrestha on Sept. 14, 2017

    Source: Devanagari (Unicode block)


    Unicode is a standard that assigns every character, in any language, a unique number called a code point, conventionally written in hexadecimal (four digits are enough for Devanagari). In Python, such code points can be written in string literals as \uXXXX ...
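As a rough illustration of the code-point idea above, the sketch below spells a Devanagari word with \uXXXX escapes and prints the code point and Unicode name of each character. The helper `is_devanagari` and the sample word are assumptions made for this example and are not taken from the post; the block range U+0900 to U+097F is the one defined by the Unicode standard.

```python
import unicodedata

# Devanagari block as defined by the Unicode standard: U+0900 to U+097F.
DEVANAGARI_START = 0x0900
DEVANAGARI_END = 0x097F


def is_devanagari(ch):
    """Return True if the single character ch lies in the Devanagari block.

    Hypothetical helper written for this example.
    """
    return DEVANAGARI_START <= ord(ch) <= DEVANAGARI_END


# The word नेपाल written with \uXXXX escapes, one escape per code point.
word = "\u0928\u0947\u092a\u093e\u0932"

for ch in word:
    # ord() gives the code point as an integer; format it as U+XXXX.
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), is_devanagari(ch))
```

Note that `len(word)` counts code points, not visible syllables: the vowel signs are separate code points that attach to the preceding consonant when rendered.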


    Tags: Python, Regular Expressions


    Applications of Regular Expression in Text Analysis

    by Ingroj Shrestha on Sept. 4, 2017


    Text analysis applications frequently need to match and search for patterns, which is why regular expressions play an important role in text analysis. A regular expression is a special sequence of characters that describes a pattern to search for in text. They can be used to ...
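To make the pattern-matching idea concrete, here is a minimal sketch using Python's standard `re` module to pull runs of Devanagari characters out of mixed-script text. The pattern, the function name `find_devanagari_runs`, and the sample sentence are assumptions chosen for illustration, not examples from the post.

```python
import re

# One or more consecutive characters from the Devanagari block (U+0900-U+097F).
# The block also contains the danda and Devanagari digits, so a real tokenizer
# would probably need a narrower character class.
DEVANAGARI_RUN = re.compile(r"[\u0900-\u097F]+")


def find_devanagari_runs(text):
    """Return every maximal run of Devanagari characters found in text."""
    return DEVANAGARI_RUN.findall(text)


sample = "Kathmandu (काठमाडौं) is the capital of Nepal (नेपाल)."
print(find_devanagari_runs(sample))
```

Compiling the pattern once and reusing it keeps the search cheap when the same expression is applied to many documents.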


    Tags: Text Analysis, Python, Regular Expressions


    Iterative Rule-based Stemming in Nepali

    by Ingroj Shrestha on Aug. 25, 2017


    Nepali is a highly inflectional and derivational language, so a single word can take many grammatical forms and meanings. For example, the verb root लेख् (lekh) can appear in different forms such as: लेख्छु (lekh-chu), लेख्छस् (lekh-chas ...
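The idea of iterative rule-based stemming can be sketched as repeatedly stripping a known suffix until no rule applies. The suffix list below is a tiny hypothetical one: the first two entries come from the verb forms quoted above, while the plural marker and postposition are assumptions added to show more than one pass. The function name `stem` and the `min_stem_len` guard are likewise illustrative and not the post's actual rule set.

```python
# Hypothetical suffix rules for illustration only; a usable stemmer needs a
# much larger, linguistically validated list.
SUFFIXES = ["छु", "छस्", "हरू", "मा"]


def stem(word, suffixes=SUFFIXES, min_stem_len=2):
    """Strip known suffixes repeatedly until none of them matches.

    min_stem_len (in code points) guards against stripping a word down to
    nothing; its value here is an arbitrary assumption.
    """
    # Try longer suffixes first so a short rule cannot shadow a longer one.
    ordered = sorted(suffixes, key=len, reverse=True)
    changed = True
    while changed:
        changed = False
        for suffix in ordered:
            if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
                word = word[: -len(suffix)]
                changed = True
                break  # restart the rule scan on the shortened word
    return word


print(stem("लेख्छु"))    # one pass: strips the verb ending
print(stem("घरहरूमा"))  # two passes: strips the postposition, then the plural marker
```

A purely suffix-driven loop like this will over-stem words that merely happen to end in one of the listed strings, which is why practical stemmers usually add exception lists or part-of-speech checks.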


    Tags: Text Analysis, NLP, Pre Processing


    Stop Words Removal (Nepali)

    by Shreeya Singh Dhakal on Aug. 13, 2017


    Removing stop words is a common and important practice when working with text analysis applications. So, what are stop words and why filter them out during pre-processing?


    Stop words are the words used in defining the structure of sentences. These ...
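A minimal sketch of stop word filtering, assuming whitespace tokenization and a tiny hand-picked stop list. Both the list and the sample sentence are hypothetical and are not the resources described in the post. Because Nepali postpositions are usually written attached to the preceding word, a real pipeline would likely need stemming or morphological splitting before this step.

```python
# Hypothetical mini stop list for illustration; not the list used in the post.
STOP_WORDS = {"र", "पनि", "छ", "हो", "यो", "त्यो"}


def remove_stop_words(text, stop_words=STOP_WORDS):
    """Split text on whitespace and drop every token found in stop_words."""
    return [token for token in text.split() if token not in stop_words]


sentence = "यो किताब र त्यो कलम पनि राम्रो छ"
print(remove_stop_words(sentence))
```

Storing the stop list as a set keeps each membership check constant-time, which matters once the list and the corpus grow.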


    Tags: Text Analysis, Pre Processing