
srmorph: Serbian Morphology in Python

My interest in linguistics and programming continues with an experiment in morphology: the srmorph project. It is a pilot endeavour I use to test ideas about parsing my native language (Serbian), first at the word level and later at the syntactic level. This post describes the work in progress.

Affixes as Basics

At the foundation of srmorph are Serbian affixes. I have always wanted to write a parser that works by first examining words at the level of prefixes and suffixes (infixes are a somewhat tougher problem). For now, therefore, the analysis is based on identifying affixes.
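To make the idea of affix-first analysis concrete, here is a minimal sketch, not srmorph's actual code. The inventories `PREFIXES` and `SUFFIXES` and the function `affix_splits` are hypothetical names chosen for illustration, and the lists are deliberately tiny. The function simply enumerates every way a word can be cut into an optional prefix, a non-empty stem and an optional suffix.

```python
# Hypothetical, deliberately tiny affix inventories (not the project's data).
PREFIXES = ("na", "po", "pre", "iz", "za")
SUFFIXES = ("ti", "la", "mo")


def affix_splits(word, prefixes=PREFIXES, suffixes=SUFFIXES):
    """Return every (prefix, stem, suffix) split; '' marks a missing affix."""
    splits = []
    # The empty string stands for "no prefix" / "no suffix".
    for prefix in ("",) + tuple(prefixes):
        if not word.startswith(prefix):
            continue
        rest = word[len(prefix):]
        for suffix in ("",) + tuple(suffixes):
            if not rest.endswith(suffix):
                continue
            stem = rest[: len(rest) - len(suffix)]
            # Keep only splits with a stem and at least one real affix.
            if stem and (prefix or suffix):
                splits.append((prefix, stem, suffix))
    return splits


for split in affix_splits("napisati"):
    print(split)
```

For "napisati" (to write down) the sketch reports three candidate splits, including the expected prefix "na" with the infinitive ending "ti". A purely string-based approach like this overgenerates, which is why later filtering is needed.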

Environment and Data Format

The environment is the Python 3 programming language, and the grammar data format is built around Python classes themselves. The uninstantiated classes are the actual data containers; once they inherit from the main meta classes, they become useful for parsing. For example, a class containing declension suffixes looks like this:
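What follows is a minimal sketch of such a class, not the project's own listing. It assumes a hypothetical base class `Suffixes` that collects its subclasses and gives them a matching method, and a hypothetical data class `NounFeminineA` that holds the singular endings of feminine nouns such as "žena" (woman). All names and the layout of `forms` are assumptions made for illustration.

```python
class Suffixes:
    """Hypothetical base class: turns a data-only subclass into a matcher."""

    forms = {}      # label -> ending, filled in by subclasses
    registry = []   # every subclass defined so far

    def __init_subclass__(cls, **kwargs):
        # Record each data class as soon as it is defined.
        super().__init_subclass__(**kwargs)
        Suffixes.registry.append(cls)

    @classmethod
    def analyse(cls, word):
        """Return (stem, ending, label) for every ending the word carries."""
        return [
            (word[: len(word) - len(ending)], ending, label)
            for label, ending in cls.forms.items()
            if word.endswith(ending) and len(word) > len(ending)
        ]


class NounFeminineA(Suffixes):
    """Data only: singular endings of feminine a-stem nouns (e.g. 'žena')."""

    forms = {
        "nominative": "a",
        "genitive": "e",
        "dative": "i",
        "accusative": "u",
        "vocative": "o",
        "instrumental": "om",
        "locative": "i",
    }


# The class is never instantiated; it is used directly.
print(NounFeminineA.analyse("ženom"))  # [('žen', 'om', 'instrumental')]
print(NounFeminineA.analyse("ženi"))   # dative and locative both match
```

Keeping the data in class attributes means a new paradigm is added by writing one more small class, while the matching logic lives in a single place.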

Parsing and Website

The inherited Serbian affix classes (60+) are so far parsed functionally. I have set up a dynamic website at http://srmorph.languagebits.com/ which shows some of the things that can be done by parsing. The algorithm is rather straightforward for now, and will stay that way until further filtering is introduced at the word-class level.
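One way to picture a straightforward pass over many affix classes, and why word-class filtering matters, is the sketch below. It is an assumption about the general shape of such a parser, not a description of srmorph itself. The classes `NounFeminineA` and `VerbPresentE`, the `word_class` attribute and the functions `parse` and `parse_as` are all hypothetical.

```python
from itertools import chain


class NounFeminineA:
    """Hypothetical data class: two endings of nouns like 'žena'."""
    word_class = "noun"
    forms = {"genitive singular": "e", "accusative singular": "u"}


class VerbPresentE:
    """Hypothetical data class: two present-tense endings (piše, pišemo)."""
    word_class = "verb"
    forms = {"3rd person singular": "e", "1st person plural": "emo"}


AFFIX_CLASSES = (NounFeminineA, VerbPresentE)


def analyses_for(cls, word):
    """Lazily yield (class name, label, stem, ending) for one affix class."""
    return (
        (cls.__name__, label, word[: -len(ending)], ending)
        for label, ending in cls.forms.items()
        if word.endswith(ending) and len(word) > len(ending)
    )


def parse(word, classes=AFFIX_CLASSES):
    """Collect the candidate analyses from every class, with no filtering."""
    return list(chain.from_iterable(analyses_for(c, word) for c in classes))


def parse_as(word, word_class, classes=AFFIX_CLASSES):
    """Same pass, restricted to classes of one (assumed known) word class."""
    return parse(word, tuple(c for c in classes if c.word_class == word_class))


print(parse("piše"))              # noun and verb readings both appear
print(parse_as("piše", "verb"))   # only the verb reading remains
```

Without any filter, "piše" (he/she writes) is also read as a genitive noun, because the ending "e" is shared. Restricting the pass to one word class removes that spurious reading.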

Once reasonably developed, the project will become open source.

Details about affix “na” in Serbian
