
NLP Pipeline

Our multilingual Natural Language Processing pipeline is the core of our technology and enables large-scale processing of text in many languages. We are committed to high accuracy and speed in all tasks.

TraDeInterpret

Our solution for the world of trademark denominations, TraDeInterpret performs intra-word morphological analysis of denominations, exploits Comprehendo to understand the semantics hidden in a trademark denomination and leverages WordAtlas to extract concepts across languages together with their definitions for later inspection by trademark experts.

NLP Pipeline: large-scale, parallel, multilingual and modularized

Process text in many languages!

Our multilingual NLP Pipeline is based on a flexible API which enables effective end-to-end processing of text in the following languages:

Arabic

Chinese

Dutch

English

French

German

Italian

Japanese

Korean

Polish

Portuguese

Russian

Spanish

Multilinguality is a key feature of our pipeline, with most modules available in 13 languages. Moreover, we feature:

parallelism of independent modules

modularization, with an effective pipeline customized to each specific need

large scale, making it possible to process millions of texts in seconds

availability both as an online service and as an offline software package
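
To illustrate what "parallelism of independent modules" can mean in practice, here is a minimal Python sketch. It is not the product's API: the two module functions and the `run_independent` helper are hypothetical stand-ins, and a thread pool is just one assumed way to run modules that do not depend on each other's output.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


# Hypothetical stand-ins for two independent modules (not the real pipeline).
def count_tokens(text: str) -> int:
    """Toy module: number of whitespace-separated tokens."""
    return len(text.split())


def count_characters(text: str) -> int:
    """Toy module: number of characters in the text."""
    return len(text)


def run_independent(text: str, modules: Dict[str, Callable[[str], int]]) -> Dict[str, int]:
    """Submit every module to a thread pool and collect results by module name."""
    with ThreadPoolExecutor(max_workers=len(modules)) as pool:
        futures = {name: pool.submit(fn, text) for name, fn in modules.items()}
        return {name: future.result() for name, future in futures.items()}


results = run_independent(
    "Process text in many languages",
    {"tokens": count_tokens, "characters": count_characters},
)
print(results)
```

Because neither toy module reads the other's output, they can be scheduled in any order or at the same time. Modules that do depend on each other would still need to run in sequence.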

Modules

Our multilingual Natural Language Processing pipeline includes modules for the following tasks. Each module can be accessed separately or as part of the integrated pipeline:

Language recognition

Tokenization

Morphological analysis

Part-of-speech tagging

Named Entity Recognition

Term, concept and entity extraction

Domain labeling

Tag classification

Word Sense Disambiguation and Entity Linking

Semantic vector document creation

Semantic similarity of sentences, paragraphs and documents

Sentiment analysis
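
As a rough illustration of how modules that are "accessed separately" can also be "integrated into the pipeline", the sketch below registers toy modules under names and runs only the ones a caller selects. Everything here is an assumption for illustration: the `REGISTRY`, the `run_pipeline` helper and the two toy modules are hypothetical and do not reflect the actual interface.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]
Module = Callable[[Document], None]


def tokenize(doc: Document) -> None:
    """Toy tokenization module: split the text on whitespace."""
    doc["tokens"] = doc["text"].split()


def tag_capitalized(doc: Document) -> None:
    """Toy stand-in for an entity module: keep capitalized tokens.

    Depends on the 'tokens' layer, so tokenization must run first.
    """
    doc["capitalized"] = [t for t in doc["tokens"] if t[:1].isupper()]


# Hypothetical registry mapping module names to implementations.
REGISTRY: Dict[str, Module] = {
    "tokenization": tokenize,
    "capitalized": tag_capitalized,
}


def run_pipeline(text: str, selected: List[str]) -> Document:
    """Run the selected modules in order; each adds a layer to the document."""
    doc: Document = {"text": text}
    for name in selected:
        REGISTRY[name](doc)
    return doc


doc = run_pipeline("Rome is in Italy", ["tokenization", "capitalized"])
print(doc["tokens"], doc["capitalized"])
```

Each module writes its own annotation layer into a shared document, so a caller who only needs tokenization can pass `["tokenization"]` and skip the rest.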

Features

Babelscape’s NLP pipeline comes with several groundbreaking features. It is designed to work on a large scale in dozens of languages, using the same interface for each language. Users can select only the modules they need and run dozens of tasks in parallel on the same CPU. The pipeline also integrates our flagship products, WordAtlas, Comprehendo and Extraggo, as modules, so that a full analysis of text can be performed, ranging from tokenization to semantic analysis and text analytics.

multilinguality

large scale

parallelism

modularity

flexibility

high performance

Product comparison

We compared our multilingual pipeline with two strong competitors, the Stanford CoreNLP and NLTK libraries, on gold-standard data. The results reported in the table show that our pipeline is faster and more accurate than these alternatives.