Description

Six different samples of the character 'ka'. Red dots denote the start of each stroke.

This dataset of online handwritten Devanagari characters comprises 1800 samples from 36 character classes, written by 25 native writers. Each writer was asked to provide two samples per class.

No specific directions, constraints, or instructions were given to the writers, with the aim of building a database of completely natural handwriting.

For data collection we used a simple Graphire tablet (Wacom ET0405A-U), which captures the pen-tip position as 2D coordinates.

Metadata and Technical Details

Each character is stored in a separate text file of comma-separated values. A character file is about 4 KB on average; the actual size depends on the number and size of the strokes comprising the character.

The dataset is organised in folders that reflect the 36 classes. Each class folder contains 50 samples: two per writer, named userX_1 and userX_2, where X identifies the writer.
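The folder layout described above can be indexed with a few lines of Python. This is only a minimal sketch under stated assumptions: the root path is supplied by the caller, X in userX_1 and userX_2 is taken to be a numeric writer ID, and the file extension (which the description does not specify) is ignored. The function name `index_samples` is hypothetical.

```python
import re
from collections import defaultdict
from pathlib import Path

# Assumption: file names look like "user<ID>_<1|2>" with a numeric writer ID.
# The extension, if any, is not specified in the description and is ignored.
SAMPLE_NAME = re.compile(r"^user(\d+)_([12])$")


def index_samples(root):
    """Map each class folder name to a list of (writer_id, sample_no, path)."""
    index = defaultdict(list)
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue
        for path in sorted(class_dir.iterdir()):
            match = SAMPLE_NAME.match(path.stem)
            if path.is_file() and match:
                writer_id, sample_no = int(match.group(1)), int(match.group(2))
                index[class_dir.name].append((writer_id, sample_no, path))
    return dict(index)
```

On a complete copy of the dataset, one would expect `index_samples(root)` to return 36 keys with 50 entries each, which gives a quick integrity check after download.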

The digitizer captures a series of strokes during pen movement. A stroke is the sequence of coordinates (pen-tip positions) recorded from pen-down to pen-up.

For simplicity, we have inserted the special value [-1.0, -1.0] to mark the end of each stroke, which makes it easier to count and separate the strokes in a complete character. The following is an example of a two-stroke character. It is important to note that a run of consecutive [-1.0, -1.0] values can be recorded when the writer has a tremor, as well as when the pen tip hovers just above the surface of the pad. Pre-processing is left to the end user.
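One possible way to separate strokes at the [-1.0, -1.0] terminator is sketched below. It assumes each row of a character file holds one "x,y" pair; the description only says the files are comma-separated, so the exact row layout is an assumption. The helper names `read_points` and `split_strokes` are hypothetical, and the coordinate values in the sample are invented for illustration and are not taken from the dataset. Dropping the empty strokes produced by repeated terminators is one pre-processing choice among several.

```python
import csv
import io

PEN_UP = (-1.0, -1.0)  # stroke terminator described in the text


def read_points(text):
    """Parse CSV text, assuming one 'x,y' pair per row (an assumption)."""
    rows = csv.reader(io.StringIO(text))
    return [(float(r[0]), float(r[1])) for r in rows if len(r) >= 2]


def split_strokes(points):
    """Group points into strokes, cutting at every PEN_UP marker."""
    strokes, current = [], []
    for point in points:
        if tuple(point) == PEN_UP:
            if current:  # repeated markers would yield empty strokes; skip them
                strokes.append(current)
                current = []
        else:
            current.append(tuple(point))
    if current:  # keep a final stroke that lacks a terminator
        strokes.append(current)
    return strokes


# Invented sample: two strokes, with a repeated terminator after the first.
sample = (
    "10.0,20.0\n11.0,21.5\n-1.0,-1.0\n-1.0,-1.0\n"
    "30.0,5.0\n31.0,6.0\n32.0,7.5\n-1.0,-1.0\n"
)
strokes = split_strokes(read_points(sample))
print(len(strokes))               # 2
print([len(s) for s in strokes])  # [2, 3]
```

To read a real file, the text could be obtained with `Path(...).read_text()` and passed to `read_points` in the same way.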