-delimiter

If this option is omitted, each line is assumed to be one entry (no multi-line entries), with the definiendum and the definition separated by a dash ('-'). Although it is not required, a space before and after the dash is highly recommended, since it removes any possible ambiguity with the transliteration of reverse vowels in Extended Wylie. A sample entry for the dictionary is:

-string:

Each line is assumed to be one entry (no multi-line entries), with the definiendum and the definition separated by the character or string of characters specified by the user. A sample entry for the dictionary is:
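The splitting rule for delimiter-separated entries (the default dash or a user-specified string) can be sketched in Java as follows. This is only an illustration and not the tool's own code: the class EntrySplitter, its split method and the entry "sangs rgyas - buddha" are hypothetical, and the sketch assumes that the definiendum comes first and that the line is split at the first occurrence of the delimiter.

```java
public class EntrySplitter {
    /**
     * Hypothetical helper: splits one dictionary line into
     * { definiendum, definition } at the first occurrence of the delimiter.
     * Returns null when the delimiter does not occur in the line.
     */
    public static String[] split(String line, String delimiter) {
        int pos = line.indexOf(delimiter);
        if (pos < 0) {
            return null; // the line is not a delimiter-separated entry
        }
        // Trimming makes the optional spaces around the delimiter harmless.
        String definiendum = line.substring(0, pos).trim();
        String definition = line.substring(pos + delimiter.length()).trim();
        return new String[] { definiendum, definition };
    }

    public static void main(String[] args) {
        // Hypothetical dash-separated entry, used only for illustration.
        String[] entry = split("sangs rgyas - buddha", "-");
        System.out.println(entry[0] + " => " + entry[1]);
    }
}
```

Passing "\t" as the delimiter would treat the same kind of line as tab-separated. Splitting at the first occurrence is the simplest choice, but it is also why a dash inside the definiendum would be ambiguous without surrounding spaces.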

-acip:

It is assumed that the electronic file is a transliteration of a Tibetan dictionary. The format is called "acip" because it accepts Acip's comment codes ('@' to mark page numbers, brackets to mark comments, etc.). Nevertheless, the files must still be in Extended Wylie, so if your file is in Acip's transliteration scheme, make sure to run org.thdl.tib.scanner.AcipToWylie first. Definitions may span multiple lines, but with no blank lines in between. The definiendum is assumed to start after a blank line (except at the beginning of a new page, where the text may start with the last part of the previous definition) and to run up to the shad (except when the shad is omitted because of grammar rules, for instance after a "ga" suffix without a secondary suffix). Each time a new letter starts, it should be clearly marked in brackets ('[', ']'), parentheses ('(', ')') or braces ('{', '}'). A sample entry for the dictionary is:

Comments: Notice in the sample text that at the beginning of page 2, "zhig" is not a new definiendum but is still part of the definition of "khyod dngos po dang 'brel pa". Also, the definiendum of the last entry is "kha dog" (the shad was omitted after the "ga" suffix) and not "kha dog mdog du rung ba'am". However, the definiendum of the second term is not "khyod dngos po dang bdag", since there is no omitted shad after that "ga" suffix; the definiendum is "khyod dngos po dang bdag gcig 'brel". As the sample text makes clear, the tool has to make a series of "smart guesses" to figure out where each definiendum ends and its definition starts. This process is not foolproof, so expect some mistakes.

Dictionaries in different formats can be processed together. For instance, the command:

would generate alldicts.def and alldicts.wrd, processing ry-dic99.txt as dash-separated, myglossary_rdzogs-chen.txt as tab-separated, and myglossary_uma.txt in the transliteration format explained above.

org.thdl.tib.scanner.AcipToWylie

Note: Included only in DictionarySearchStandalone.jar

Provides an interface to convert Tibetan text transliterated in the Acip scheme to THDL's Extended Wylie scheme.

If no arguments are given, it reads the Acip text from standard input and writes the Wylie text to standard output. If one argument is given, it is interpreted as the input file name. If two arguments are given, the first is interpreted as the input file name and the second as the output file name. For example, the following command converts lam-rim-chen-mo.act, storing the results in lam-rim-chen-mo.txt:
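The argument convention itself (zero, one or two arguments) can be sketched in Java. This is a minimal illustration and not the source of AcipToWylie: the class ArgumentStreams and its methods are hypothetical names, and the copy loop in main is only a placeholder for the actual Acip-to-Wylie conversion.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

public class ArgumentStreams {
    /** First argument, if present, names the input file; otherwise use standard input. */
    public static InputStream openInput(String[] args) throws IOException {
        return args.length >= 1 ? new FileInputStream(args[0]) : System.in;
    }

    /** Second argument, if present, names the output file; otherwise use standard output. */
    public static OutputStream openOutput(String[] args) throws IOException {
        return args.length >= 2 ? new FileOutputStream(args[1]) : System.out;
    }

    public static void main(String[] args) throws IOException {
        InputStream in = openInput(args);
        OutputStream out = openOutput(args);
        byte[] buffer = new byte[4096];
        int n;
        while ((n = in.read(buffer)) != -1) {
            // Placeholder: a real converter would transform the text here.
            out.write(buffer, 0, n);
        }
        out.flush();
        // Close only the streams this program opened itself.
        if (in != System.in) in.close();
        if (out != System.out) out.close();
    }
}
```

Falling back to the standard streams is what allows a tool of this kind to be used in a shell pipeline as well as with explicit file names.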

org.thdl.tib.scanner.SwingWindowScannerFilter

Note: Included in both DictionarySearchStandalone.jar and DictionarySearchHandheld.jar

This is the tool's main class. It loads a dictionary stored in the binary tree file format (use org.thdl.tib.scanner.BinaryFileGenerator to create it) and provides a graphical interface for entering Tibetan text (in Roman or Tibetan script) and displaying the words (in Roman or Tibetan script) with their definitions. It works without Tibetan script on platforms that don't support Swing. It can access dictionaries stored locally or remotely. For example, to access the public dictionary database, run the command:

If the JRE you installed does not support Swing classes but supports AWT (such as the JRE for handhelds), use org.thdl.tib.scanner.PocketWindowScannerFilter, found in DictionarySearchHandheld.jar. Its syntax is the same.

org.thdl.tib.scanner.ConsoleScannerFilter

Note: Included in both DictionarySearchStandalone.jar and DictionarySearchHandheld.jar

Takes Tibetan text as input and displays the words with their definitions on the console. Use it when no graphical interface is supported or for batch processing. For instance:

It reads from standard input and prints the results to standard output. For example, if you want to parse a text stored in puja.txt and save the results in puja_words.txt, you can run the command: