You said that the goal was to predict the language of a short text of up to 140 characters, which is typical for an SMS or a tweet. Wikipedia articles usually consist of long, grammatically correct sentences, whereas SMS and Twitter messages are full of shortenings, misspellings, and abbreviations, and may mix two languages, with English often being one of them. This is actually a much harder problem for a neural network to solve, whereas people who know the language usually have no problem identifying it in a short text, even when the text is grammatically complete gibberish. Did you try your network on real short messages?