Friday, August 19, 2016

5000 Russian Sentences Sorted from Easiest to Hardest

Here's how this list was made:
1) I grabbed a list of the 5000 most frequently used Russian words, sorted from the most frequently used to the least.
2) I grabbed 60 000 translated Russian sentences from the internet, each sentence no longer than 5 words.
3) I wrote a program that assigns a Frequency Rank Number to each word in every sentence, based on the word's position in the list from item "1)".
4) The program then calculates the average of all the words' Frequency Rank Numbers and assigns this value to the sentence.
The result is that if a sentence contains advanced words, it will have a high Average Frequency Rank Number (AFRN). If a sentence contains only beginner words, its AFRN will be low.
5) Finally, I sorted the sentences: from the ones with the lowest AFRN, to the highest.
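
For the curious, steps 3) to 5) can be sketched roughly like this in Python. This is not the original program, just a minimal illustration of the idea. The function names, the punctuation stripping, the exact-match word lookup, and the rank given to words missing from the frequency list (one past the end of the list) are all assumptions of mine; the toy word list and sentences are made up for the demo.

```python
# Minimal sketch of the ranking idea (not the original program).

PUNCTUATION = '.,!?;:"«»()…—-'  # assumed set of characters to strip from words


def build_rank_table(words_by_frequency):
    """Map each word to its Frequency Rank Number (1 = most frequent)."""
    return {word: rank for rank, word in enumerate(words_by_frequency, start=1)}


def average_rank(sentence, rank_table, unknown_rank):
    """Return the Average Frequency Rank Number (AFRN) of a sentence."""
    words = [w.strip(PUNCTUATION) for w in sentence.lower().split()]
    words = [w for w in words if w]  # drop tokens that were only punctuation
    if not words:
        return float(unknown_rank)
    # Assumption: a word that is not in the list gets the worst possible rank.
    ranks = [rank_table.get(w, unknown_rank) for w in words]
    return sum(ranks) / len(ranks)


def sort_sentences(sentences, words_by_frequency):
    """Sort sentences from the lowest AFRN (easiest) to the highest (hardest)."""
    rank_table = build_rank_table(words_by_frequency)
    unknown_rank = len(words_by_frequency) + 1
    return sorted(
        sentences,
        key=lambda s: average_rank(s, rank_table, unknown_rank),
    )


# Toy data, invented for this example: ranks are я=1, ты=2, дома=3, читаю=4, книгу=5.
toy_frequency_list = ["я", "ты", "дома", "читаю", "книгу"]
toy_sentences = ["Я читаю книгу.", "Ты дома?", "Я дома."]

for sentence in sort_sentences(toy_sentences, toy_frequency_list):
    print(sentence)
```

With the toy data, "Я дома." has ranks 1 and 3 (average 2.0), "Ты дома?" has 2 and 3 (average 2.5), and "Я читаю книгу." has 1, 4 and 5 (average about 3.33), so the sentences come out in that order. Note that this sketch matches words exactly as written, so an inflected form only gets a rank if that exact form is in the frequency list.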

The end result is that this list begins with very, very simple sentences, and new words get slowly introduced as you progress.