Arabic On-line Digits Dataset: AOD
Electronics Engineering Department, The American University in Cairo
The AOD training set can be downloaded from here.
This web page introduces AOD, a large Arabic On-line Digits Dataset suitable for online Arabic digit recognition research. The database contains over 30,000 digits and is provided in three formats: DHW, MAT, and PDF.
The database was gathered from 300 writers across different age groups, with more than 75% between 20 and 35 years old. Our youngest writer is 11 years old and our oldest is 70. More than 90% are right-handed, and around 60% of the writers are female. Each writer was asked to write an average of 10 samples per digit, with no constraints on the number of strokes for each digit or on the writing style in terms of orientation or size. We collected 30,000 samples in total, which means about 3,000 samples per digit (300 writers × 10 samples each). The data set was collected using a 5.9 × 8.3 inch DigiMemo tablet. Each digit is labeled, and the ground truth data is stored along with the stroke information.
For each writer there are three files:
- The raw DHW page. This file contains the raw online data in the DigiMemo file format (DHW). It does not contain any labeling information, only the list of strokes the writer wrote on the tablet.
- A labeled MAT file. This file groups the writer's strokes into labeled digits.
- A PDF file. This file represents the whole page.
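After downloading, it can be useful to check that every writer really has all three files. The following is a minimal Python sketch of such a check, not part of the dataset's tooling. It assumes that the three files of one writer share the same base name and differ only in extension (.dhw, .mat, .pdf); this page does not state the actual naming scheme, and the function names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path

# The three per-writer formats named in the list above.
EXPECTED_FORMATS = {".dhw", ".mat", ".pdf"}


def group_writer_files(root):
    """Map each file stem to its files, keyed by lowercase extension.

    Assumption: one stem corresponds to one writer. The real naming
    scheme of the dataset may differ.
    """
    writers = defaultdict(dict)
    for path in sorted(Path(root).rglob("*")):
        ext = path.suffix.lower()
        if path.is_file() and ext in EXPECTED_FORMATS:
            writers[path.stem][ext] = path
    return dict(writers)


def incomplete_writers(writers):
    """Return {stem: sorted list of missing extensions} for incomplete sets."""
    missing = {}
    for stem, files in writers.items():
        absent = EXPECTED_FORMATS - set(files)
        if absent:
            missing[stem] = sorted(absent)
    return missing
```

Calling incomplete_writers(group_writer_files(folder)) on the extracted download folder returns an empty dictionary when every writer has a DHW, MAT, and PDF file, and otherwise lists which formats are missing for which writer.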
We are thankful to all who assisted in building this database: Hany Ahmed, Hesham and Kholoud El Meseery.
Send your comments, suggestions, or inquiries to Maha Elmeseery at email@example.com.