Arabic Online Digits Dataset (AOD)
Department of Electronics Engineering, The American University in Cairo
The AOD training set can be downloaded from here.
This webpage introduces a large Arabic Online Digits Dataset (AOD) suitable for research on online Arabic digit recognition. The database contains over 30,000 digits and is provided in three formats: DHW, MAT, and PDF.
The database was gathered from 300 writers spanning different age groups, with more than 75 percent between 20 and 35 years old. Our youngest writer is 11 years old and our oldest is 70. More than 90 percent are right-handed, and around 60 percent of the writers are female. Each writer was asked to write an average of 10 samples per digit, with no constraints on the number of strokes per digit or on the writing style in terms of orientation or size. We collected 30,000 samples, which means about 3,000 samples per digit (300 writers times roughly 10 samples each). The data set was collected using a 5.9 x 8.3 inch DigiMemo tablet. Each digit is labeled, and the ground truth is stored along with the stroke information.
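As a quick sanity check on the collection figures, the short Python sketch below multiplies out the quoted numbers. It assumes ten digit classes (0 through 9) and treats the "average of 10 samples per digit" as exact, so the results are approximate; the function name `expected_counts` is purely illustrative and is not part of the dataset tooling.

```python
# Figures quoted in the dataset description.
NUM_WRITERS = 300
NUM_DIGIT_CLASSES = 10              # assumption: digits 0 through 9
SAMPLES_PER_DIGIT_PER_WRITER = 10   # stated as an average, so totals are approximate


def expected_counts(writers, classes, samples_each):
    """Return (samples per writer, samples per digit class, total samples)."""
    per_writer = classes * samples_each   # every writer covers every digit class
    per_digit = writers * samples_each    # every digit class is written by every writer
    total = writers * per_writer
    return per_writer, per_digit, total


per_writer, per_digit, total = expected_counts(
    NUM_WRITERS, NUM_DIGIT_CLASSES, SAMPLES_PER_DIGIT_PER_WRITER
)
print(f"samples per writer: {per_writer}")
print(f"samples per digit:  {per_digit}")
print(f"total samples:      {total}")
```

Under these assumptions each writer contributes about 100 samples, each digit class has about 3,000 samples, and the total is about 30,000, which matches the overall size reported for the database.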
For each writer there are three files:
- The raw DHW page. This file contains the raw online data in DigiMemo format (DHW). It does not contain any labeling information, only the list of strokes the writer made on the tablet.
- A labeled MAT file. This MAT file groups the writer's strokes into labeled digits.
- A PDF file. This PDF file shows the whole page.
We are thankful to all who assisted in building this database: Hany Ahmed, Hesham and Kholoud El Meseery.
Send your comments, suggestions, or inquiries to Maha Elmeseery at firstname.lastname@example.org.