Instructions and Examples:

The tool allows automatic alignment between parallel texts in the same language. Its purpose is to display various degrees of textual variants based on syntactic alignment.

The tool performs automatic, syntax-based, intra-language alignment of different versions of a text. It is based on a modified version of the Needleman-Wunsch algorithm (for more information, see the Bibliography).
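The general idea behind this kind of alignment can be sketched with the textbook Needleman-Wunsch algorithm applied to word tokens. This is only an illustration and not the tool's modified version: the function name and the scoring values (match, mismatch, gap) are assumptions chosen for the example.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two token lists (textbook version, assumed scores).

    Returns two lists of equal length; None marks a gap.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(None)
            i -= 1
        else:
            out_a.append(None)
            out_b.append(b[j - 1])
            j -= 1
    return out_a[::-1], out_b[::-1]


row_a, row_b = needleman_wunsch("the quick brown fox".split(), "the brown fox".split())
print(row_a)
print(row_b)
```

In this example the token "quick" has no counterpart in the second sentence, so the second row receives a gap at that position.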

It also provides additional refinement criteria, which can be chosen by the user according to the degree of similarity between the texts and to the purpose of the alignment:

Ignore non-alphabetical: ignores anything that is not an alphabetical character, such as punctuation and numbers.

Case sensitive: treats words that differ only in upper or lower case as different.

Ignore diacritics: ignores any type of diacritical character, including punctuation.

Levenshtein distance: allows more tolerance in the alignment of similar words, based on a revised version of the Levenshtein algorithm (for more information, see the Bibliography).
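The effect of these criteria can be illustrated as a normalization step applied to each token before comparison, followed by an optional edit-distance tolerance. The sketch below is hypothetical: the names normalize, tokens_match and max_distance are invented for the example, and the distance function is the classic Levenshtein algorithm, not the revised version used by the tool.

```python
import unicodedata


def strip_diacritics(token):
    """Remove combining marks after canonical decomposition (NFD)."""
    decomposed = unicodedata.normalize("NFD", token)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize(token, ignore_nonalpha=False, case_sensitive=False,
              ignore_diacritics=False):
    """Apply the selected criteria to one token (hypothetical helper)."""
    if ignore_diacritics:
        token = strip_diacritics(token)
    if ignore_nonalpha:
        token = "".join(ch for ch in token if ch.isalpha())
    if not case_sensitive:
        token = token.lower()
    return token


def levenshtein(s, t):
    """Classic edit distance: insertions, deletions, substitutions cost 1."""
    prev = list(range(len(t) + 1))
    for i, cs in enumerate(s, 1):
        cur = [i]
        for j, ct in enumerate(t, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (cs != ct)))  # substitution
        prev = cur
    return prev[-1]


def tokens_match(x, y, max_distance=0, **criteria):
    """True if the normalized tokens are within max_distance edits."""
    return levenshtein(normalize(x, **criteria),
                       normalize(y, **criteria)) <= max_distance


print(tokens_match("Café,", "cafe", ignore_nonalpha=True, ignore_diacritics=True))
print(tokens_match("colour", "color", max_distance=1))
```

With max_distance set to 0 the comparison reduces to exact equality of the normalized tokens; a larger value tolerates small spelling differences.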

Enter your text:

This section allows alignment of single sentences. Copy and paste the sentences to align in plain format. The sentences should be grouped together, separated by one carriage return. One empty line should separate the groups of sentences to be aligned.
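The expected input layout (one sentence per line, one empty line between groups) can be checked before pasting with a few lines of code. This is a minimal sketch of how such input could be split into groups; parse_groups is a hypothetical helper and not part of the tool.

```python
def parse_groups(text):
    """Split pasted text into groups of sentences.

    Each non-empty line is one sentence; one or more empty lines
    separate the groups to be aligned.
    """
    groups, current = [], []
    for line in text.splitlines():  # handles \n, \r\n and \r line endings
        line = line.strip()
        if line:
            current.append(line)
        elif current:
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups


sample = "the quick brown fox\nthe brown fox\n\na red rose\na rose\n"
for group in parse_groups(sample):
    print(group)
```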

If necessary, select additional criteria for the alignment by clicking on the checkboxes below the text: you can select Ignore non-alphabetical characters, Case sensitive, Ignore diacritics, Levenshtein distance metric.

Click on “Align” for each group of sentences that you want to align. If something goes wrong and you want to start again, click on “Reset”.

After clicking on “Align”, the tool will display the aligned sentences. The texts will be aligned automatically, with the selected additional criteria highlighted in green at the top. If no criteria are selected, they will appear in grey.

File upload:

Choose the file to upload. Important: the file can be either txt or csv, but it has to be structured in the following way: the sentences to align should be grouped together, separated by one carriage return. One empty line should separate the various groups of sentences to align.

Before uploading, check the boxes of the desired criteria: Ignore non-alphabetical characters, Case sensitive, Ignore diacritics, Levenshtein distance metric. You will also be able to go back and reset the selection if desired.

Click on the “Upload” button. The texts will be aligned automatically, with the selected alignment criteria highlighted in green at the top.

Color-Key:

The degree of alignment is displayed in different shades according to the type of match (see Examples 1-3) and to the choice of additional refinement criteria (see Example 4). A color-key is displayed under the aligned text. When there is no complete match, the longest common substring will also be displayed.
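For readers unfamiliar with the term, the longest common substring of two strings is the longest run of consecutive characters that occurs in both. The sketch below computes it with standard dynamic programming; it assumes a character-level comparison of two tokens, which may differ from how the tool applies it.

```python
def longest_common_substring(s, t):
    """Return the longest run of consecutive characters shared by s and t."""
    best_len, best_end = 0, 0
    # prev[j] = length of the common suffix of s[:i-1] and t[:j]
    prev = [0] * (len(t) + 1)
    for i in range(1, len(s) + 1):
        cur = [0] * (len(t) + 1)
        for j in range(1, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best_len:
                    best_len, best_end = cur[j], i
        prev = cur
    return s[best_end - best_len:best_end]


print(longest_common_substring("alignment", "alignement"))
```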

Note: when more than two sentences are aligned, the display is currently different.
Tokens that match in all sentences are displayed in green, and tokens that are not aligned are displayed in red. Tokens that match only in some of the given sentences are highlighted in light green.
The field below the aligned sentences shows the degree of matching. Moving the cursor over a square displays the corresponding token.
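The color rule for multiple sentences can be pictured as a classification of each aligned column. The following sketch is an interpretation of the description above, not the tool's code: classify_column is a hypothetical name, and None is assumed to mark a gap in a sentence.

```python
from collections import Counter


def classify_column(column):
    """Assign a color to one aligned column of tokens (None = gap).

    green       : the same token appears in every sentence
    light green : the same token appears in some, but not all, sentences
    red         : no two sentences share a token at this position
    """
    tokens = [tok for tok in column if tok is not None]
    if not tokens:
        return "red"
    most_common_count = Counter(tokens).most_common(1)[0][1]
    if most_common_count == len(column):
        return "green"
    if most_common_count > 1:
        return "light green"
    return "red"


columns = [
    ("the", "the", "the"),
    ("quick", "quick", None),
    ("fox", "dog", "cat"),
]
for col in columns:
    print(col, classify_column(col))
```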

How to prepare your text:

Copy and paste the plain text to align. The groups of sentences representing each text to be compared have to be separated by one empty line.
Each sentence, as delimited by the user, has to be placed on a single line, separated by one carriage return, e.g.:
Choose an option to refine the alignment according to specific criteria (ignore punctuation, ignore diacritics, case sensitive or Levenshtein distance). Documentation on every single option to be added.
Click on “Next” to visualize the tokenized sentences, then click “Align” to visualize the alignment.