no longer attempt to hyphenate uppercased words such as ‘LONDON’. This
feature had to be dropped to work around a likely bug in the C extension which,
under Python 3.3, caused
the hyphenator to return words starting with a capital letter as lowercase.

New in Version 2.0

The hyphen.dictools module has been completely rewritten. This was required
by the switch from OpenOffice to LibreOffice, which no longer supports the
old formats for dictionaries and metadata. These changes made it impossible to release a stable v1.0.
The new dictionary management is more
flexible and powerful. There is now a registry for locally installed hyphenation dictionaries. Each dictionary
can have its own file path. It is thus possible to add persistent metadata on pre-existing hyphenation
dictionaries, e.g. from a LibreOffice installation.
Each dictionary, and hence each Hyphenator, can now be
associated with multiple locales such as ‘en_US’ and ‘en_NZ’. These changes entail some backwards-incompatible API changes.
Further changes are:

Hyphenator.info is now a container type holding the ‘url’, ‘locales’ and ‘filepath’ of the dictionary.

the Hyphenator.language attribute deprecated in v1.0 has been removed

download and install dictionaries from LibreOffice’s git repository by default

dictools.install(‘xx_YY’) will install all dictionaries found for the ‘xx’ language and associate them with all relevant locales
as described in the dictionaries.xcu file in LibreOffice’s git repository.

Upgraded the C library libhyphen
to v2.7 which brings significant improvements, most notably correct treatment of
already hyphenated words such as ‘Python-powered’

use a CSV file from the OpenOffice website with meta information
on dictionaries for installation of dictionaries and
instantiation of hyphenators. Apps can access the metadata
on all downloadable dicts through the new module-level attribute hyphen.dict_info, or for each hyphenator
through the ‘info’ attribute.

Hyphenator objects have an ‘info’ attribute which is
a Python dictionary with meta information on
the hyphenation dictionary. The ‘language’ attribute
is deprecated. Note: These new features add
complexity to the installation process as the metadata and dictionary files
are downloaded at install time. These features have to be tested
in various environments before declaring the package stable.

Streamlined the installation process

The en_US hyphenation dictionary
has been removed from the package. Instead, the dictionaries for en_US and the local language are automatically
downloaded at install time.

restructured the package and merged 2.x and 3.x setup files

switch from svn to hg

added win32 binary of the C extension module for Python 3.2; currently no binaries for Python 2.4 and 2.5

New in version 0.10

added win32 binary for Python 2.7

renamed the ‘hyphenator’ class to the more conventional ‘Hyphenator’. ‘hyphenator’ is deprecated.

PyHyphen is a pythonic interface to the hyphenation C library used in software such as LibreOffice and the Mozilla suite.
It comes with tools to download, install and uninstall hyphenation dictionaries from LibreOffice’s Git repository.
PyHyphen consists of the package ‘hyphen’ and the module ‘textwrap2’.
The source distribution supports Python 2.6 or higher, including Python 3.3. If you depend on Python 2.4 or 2.5, use PyHyphen-1.0b1
instead. In this case you may have to download hyphenation dictionaries manually.

the class hyphen.Hyphenator: each instance of it
can hyphenate and wrap words using a dictionary compatible with the hyphenation feature of
LibreOffice and Mozilla.

the module dictools contains useful functions such as for downloading and
installing dictionaries from a configurable repository. After installation of PyHyphen, the
LibreOffice repository is used by default.

hyphen.dict_info: a dict object with metadata on all hyphenation dictionaries installed locally. In previous
versions, dict_info contained metadata on all downloadable dictionaries. This feature
is no longer supported as LibreOffice’s Git repository
does not provide such a list anymore. Instead, use
hyphen.config.languages, which is an incomplete set of
language codes of hyphenation dictionaries available from LibreOffice’s Git repository. These codes
can be passed to hyphen.dictools.install() to download and install
the respective dictionary and update the local registry.
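The idea of a local registry in which one dictionary serves several locales can be sketched as follows. This is only an illustration and not PyHyphen's implementation: the functions `register` and `installed_locales`, the assumption that the registry is keyed by locale, and the file paths are all made up for this example.

```python
# Hypothetical in-memory registry: locale code -> metadata record.
# Assumption: entries are keyed by locale; the real registry may differ.
registry = {}


def register(registry, locales, filepath, url=None):
    """Associate one dictionary file with every locale it is suitable for."""
    info = {"locales": list(locales), "filepath": filepath, "url": url}
    for loc in locales:
        registry[loc] = info  # all locales share the same metadata record
    return info


def installed_locales(registry, language):
    """Return the registered locales for a language code such as 'en'."""
    return sorted(loc for loc in registry if loc.split("_")[0] == language)


# Made-up paths, for illustration only.
register(registry, ["en_US", "en_NZ"], "/tmp/dicts/hyph_en_US.dic")
register(registry, ["de_DE", "de_AT"], "/tmp/dicts/hyph_de_DE.dic")

print(installed_locales(registry, "en"))
print(registry["en_NZ"]["filepath"])
```

Sharing one record between locales means that metadata added for ‘en_US’ is also visible under ‘en_NZ’, which is the behaviour the text describes for dictionaries associated with multiple locales.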

hyphen.config is a configuration file initialized at install time with default values
for paths of dictionaries and the registry file, as well as the default URL of
the repository for
downloadable dictionaries. Initial values for the local paths are set to
the package root, the URL is set to the LibreOffice
repository for dictionaries.

hyphen.DictInfo: dict-like container type for meta data on dictionaries. It has the following attributes:
‘locales’: a list of locales for which the dictionary is suitable;
‘url’: the URL from which the dictionary was downloaded, or None; ‘filepath’: the
local path including the file name of the dictionary.
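To make the shape of such a container concrete, here is a minimal sketch. It is not the real hyphen.DictInfo: the class name `DictInfoSketch`, its constructor signature and the example values are assumptions chosen for illustration. It only mirrors the three documented attributes.

```python
class DictInfoSketch(dict):
    """Hypothetical stand-in for a dict-like metadata container.

    Not the real hyphen.DictInfo; it only mirrors the documented
    attributes 'locales', 'url' and 'filepath'.
    """

    def __init__(self, locales, filepath, url=None):
        super().__init__(locales=list(locales), filepath=filepath, url=url)

    def __getattr__(self, name):
        # Expose the stored keys as attributes, e.g. info.locales.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


# Made-up values, for illustration only.
info = DictInfoSketch(
    locales=["en_US", "en_NZ"],
    filepath="/tmp/dicts/hyph_en_US.dic",
    url=None,  # None when the dictionary was not downloaded
)
print(info.locales)
print(info["filepath"])
```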

‘hyphen.hnj’ is the C extension module that does all the groundwork. It
contains the high quality
C library libhyphen.
It supports hyphenation with replacements as well as compound words.
Note that hyphenation dictionaries are invisible to the
Python programmer. But each hyphenator object has an attribute ‘info’ which is a
DictInfo object containing meta data on the hyphenation dictionary of this Hyphenator instance.
The ‘language’ attribute, which contains a locale for which the dictionary is suitable,
is deprecated as of v1.0. Use my_hyphenator.info.locales instead to access
a list of locales for which the dictionary is suitable.

The textwrap2 module is an enhanced though backwards compatible version of the module
‘textwrap’ from the Python standard library. Unsurprisingly, it adds
hyphenation functionality to ‘textwrap’. To this end, a new keyword parameter
‘use_hyphenator’ has been added to the __init__ method of the TextWrapper class, which
defaults to None. It can be initialized with any hyphenator object. Note that until version 0.7
this keyword parameter was named ‘use_hyphens’, so older code may need to be changed.
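The following sketch shows how a ‘use_hyphenator’ keyword could plug into a TextWrapper. It is a simplified illustration built on the standard library only, not the code of textwrap2. `StubHyphenator` and `HyphenatingWrapper` are hypothetical names, and the contract that `wrap(word, width)` returns a head with the hyphen attached plus the remaining tail, or an empty list, is an assumption about the hyphenator object.

```python
import textwrap


class StubHyphenator:
    """Hypothetical stand-in for a hyphenator object.

    Assumption: wrap(word, width) returns [head_with_hyphen, tail] for the
    longest head that fits into width, or [] if no break point fits.
    """

    def __init__(self, syllables):
        # Maps a word to its syllables, e.g. {"wrapping": ["wrap", "ping"]}.
        self.syllables = syllables

    def wrap(self, word, width):
        parts = self.syllables.get(word, [word])
        # Try the latest break point first so the head is as long as possible.
        for i in range(len(parts) - 1, 0, -1):
            head = "".join(parts[:i]) + "-"
            if len(head) <= width:
                return [head, "".join(parts[i:])]
        return []


class HyphenatingWrapper(textwrap.TextWrapper):
    """Simplified sketch of a TextWrapper with a 'use_hyphenator' keyword."""

    def __init__(self, *args, use_hyphenator=None, **kwargs):
        super().__init__(*args, **kwargs)
        self.use_hyphenator = use_hyphenator

    def wrap(self, text):
        if self.use_hyphenator is None:
            return super().wrap(text)  # plain textwrap behaviour
        lines, current = [], ""
        words = text.split()
        while words:
            word = words.pop(0)
            candidate = word if not current else current + " " + word
            if len(candidate) <= self.width:
                current = candidate
                continue
            # The word does not fit: try to split it at a hyphenation point.
            room = self.width - len(current) - (1 if current else 0)
            pieces = self.use_hyphenator.wrap(word, room) if room > 0 else []
            if pieces:
                current = (current + " " if current else "") + pieces[0]
                words.insert(0, pieces[1])
            elif not current:
                # A single unbreakable word longer than the line: emit as is.
                lines.append(word)
                continue
            else:
                words.insert(0, word)
            lines.append(current)
            current = ""
        if current:
            lines.append(current)
        return lines


text = "automatic hyphenation improves wrapping"
stub = StubHyphenator({"hyphenation": ["hy", "phen", "ation"],
                       "wrapping": ["wrap", "ping"]})
hyphenated = HyphenatingWrapper(width=16, use_hyphenator=stub).wrap(text)
plain = HyphenatingWrapper(width=16).wrap(text)
print("\n".join(hyphenated))
```

With ‘use_hyphenator’ left at None, the sketch falls back to the standard textwrap behaviour, which mirrors the backwards compatibility described above.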

PyHyphen works with Python 2.6 or higher, including Python 3.
There are pre-compiled binaries of the hnj module for win32 and Python 2.6, 2.7, 3.2 and 3.3.
On other platforms you will need a build environment such as gcc and make.

You can compile and install the hyphen package
as well as the module textwrap2 by entering at the command line something like:

$python setup.py install

The setup script will first check the Python version, create a ‘hyphen’ subdir and copy
the required files from the 2.x and src subdirs. If needed, lib2to3 will
be used.

Second, setup.py searches in ./bin for a pre-compiled binary
of hnj for your platform. If there is a binary that looks ok, this version is installed. Otherwise,
hnj is compiled from source. On Windows you will need MSVC, mingw
or whatever fits your Python distribution.
If the distribution comes with a binary of ‘hnj’
that fits your platform and Python version, you can still force a compilation from
source by entering

$python setup.py install --force_build_ext
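The decision between a pre-compiled binary and a build from source can be pictured with the sketch below. It is not taken from setup.py: the function `choose_hnj_source`, its parameters and the file name ‘hnj_example.pyd’ are hypothetical, and the check for a binary that "looks ok" is reduced here to a plain file-existence test.

```python
import os
import tempfile


def choose_hnj_source(bin_dir, binary_name, force_build=False):
    """Decide whether to use a pre-compiled hnj binary or build from source.

    Returns ('binary', path) or ('build', None). Hypothetical helper;
    the real setup script may apply further checks to the binary.
    """
    candidate = os.path.join(bin_dir, binary_name)
    if not force_build and os.path.isfile(candidate):
        return ("binary", candidate)
    return ("build", None)


# Demo in a throwaway directory; 'hnj_example.pyd' is a made-up file name.
with tempfile.TemporaryDirectory() as bin_dir:
    open(os.path.join(bin_dir, "hnj_example.pyd"), "w").close()
    found = choose_hnj_source(bin_dir, "hnj_example.pyd")
    forced = choose_hnj_source(bin_dir, "hnj_example.pyd", force_build=True)
    missing = choose_hnj_source(bin_dir, "hnj_other.pyd")

print(found[0], forced[0], missing[0])
```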

Under Linux you may need root privileges, so you may want to enter something like

$sudo python setup.py install

After compiling and installing the hyphen package, config.py is adjusted as follows:

the local default path for hyphenation dictionaries is set to the package directory

the base URL from which
dictionaries are downloaded is set to LibreOffice’s Git repository

Thereafter the setup script imports the hyphen package to install a default
set of dictionaries, unless the command line contains ‘no_dictionaries’ after the ‘install’ command.
The dictionaries installed by default are those for English and the locale, if different.
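The rule for the default set can be expressed in a few lines. The helper `default_dictionaries` below is hypothetical and only restates the rule from the text; it does not download or install anything, and how the setup script actually determines the local locale is not shown here.

```python
import locale


def default_dictionaries(local_locale=None):
    """Return the locales whose dictionaries would be installed by default.

    Hypothetical helper: 'en_US' first, then the local locale if it is
    known and different.
    """
    if local_locale is None:
        # May be None (e.g. for the 'C' locale) or a platform-specific name.
        local_locale = locale.getlocale()[0]
    result = ["en_US"]
    if local_locale and local_locale != "en_US":
        result.append(local_locale)
    return result


print(default_dictionaries("de_DE"))
print(default_dictionaries("en_US"))
```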