Background knowledge on handwriting recognition software

xiaoxiao, 2021-03-06

1. Handwriting Recognition Software

Handwriting recognition software consists of two parts: the recognition program and the recognition dictionary.

(1) The recognition program refers to the executable code and the source program (also known as source code) from which that executable code is generated. This source code is, in effect, the recognition algorithm expressed as a program.

(2) The recognition dictionary is a special database that the recognition program must use while it runs. It contains the handwriting feature descriptions (i.e., templates) needed to recognize every character in the character set. The recognition dictionary is produced by a dictionary-generation program, which can be complex; the dictionary-generation program is closely related to the recognition program but is not identical to it. The construction and generation of the recognition dictionary are critical to the performance of recognition software. Different companies are unlikely to generate and use the same recognition dictionary, because their training samples differ.
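As a rough illustration of why the dictionary depends on the training samples, the sketch below builds a toy dictionary by averaging feature vectors per character. The function name `build_dictionary`, the two-dimensional feature vectors, and the sample data are all assumptions made for this example; a real dictionary-generation program is far more elaborate.

```python
from collections import defaultdict


def build_dictionary(training_samples):
    """Build a toy recognition dictionary: one mean feature vector (template) per character.

    training_samples: iterable of (character, feature_vector) pairs; the feature
    vectors are assumed to be equal-length lists of floats.
    """
    grouped = defaultdict(list)
    for char, features in training_samples:
        grouped[char].append(features)

    dictionary = {}
    for char, vectors in grouped.items():
        n = len(vectors)
        # Average each feature dimension over all samples of this character.
        dictionary[char] = [sum(column) / n for column in zip(*vectors)]
    return dictionary


# Two hypothetical training sets covering the same two characters.
samples_a = [("一", [1.0, 0.0]), ("一", [0.8, 0.2]), ("丨", [0.0, 1.0])]
samples_b = [("一", [0.6, 0.4]), ("丨", [0.2, 0.8])]

dict_a = build_dictionary(samples_a)
dict_b = build_dictionary(samples_b)

print(dict_a)
print(dict_a == dict_b)  # different training samples give different templates
```

Both dictionaries cover the same character set, yet their templates differ, which is the point made above about different training samples.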

2. Recognition process description

The recognition process for online handwritten text takes the handwriting data captured by the handwriting input device, processes it with the recognition program, and finally converts it into the character codes used by the computer. It is usually divided into four stages: preprocessing, normalization, feature extraction, and feature matching.
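The four stages can be sketched in Python as follows. This is only a minimal sketch under stated assumptions, not the algorithm of any particular product: strokes are assumed to be non-empty lists of (x, y) points, the feature is a simple direction histogram, matching is nearest-template by Euclidean distance, and all function names and the two-entry dictionary are hypothetical.

```python
import math


def preprocess(strokes):
    """Preprocessing: drop consecutive duplicate points and empty strokes."""
    cleaned = []
    for stroke in strokes:
        points = [p for i, p in enumerate(stroke) if i == 0 or p != stroke[i - 1]]
        if points:
            cleaned.append(points)
    return cleaned


def normalize(strokes):
    """Normalization: translate and scale into the unit square, keeping aspect ratio."""
    xs = [x for s in strokes for x, _ in s]
    ys = [y for s in strokes for _, y in s]
    min_x, min_y = min(xs), min(ys)
    size = max(max(xs) - min_x, max(ys) - min_y) or 1.0  # avoid dividing by zero
    return [[((x - min_x) / size, (y - min_y) / size) for x, y in s] for s in strokes]


def extract_features(strokes, bins=4):
    """Feature extraction: length-weighted histogram of pen-movement directions."""
    hist = [0.0] * bins
    for s in strokes:
        for (x0, y0), (x1, y1) in zip(s, s[1:]):
            length = math.hypot(x1 - x0, y1 - y0)
            # Fold opposite directions together so left-to-right equals right-to-left.
            angle = math.atan2(y1 - y0, x1 - x0) % math.pi
            hist[int(angle / math.pi * bins) % bins] += length
    total = sum(hist) or 1.0
    return [h / total for h in hist]


def match(features, dictionary):
    """Feature matching: return candidate characters, nearest template first."""
    return sorted(dictionary, key=lambda ch: math.dist(features, dictionary[ch]))


def recognize(strokes, dictionary):
    """Run the four stages in order and return the ranked candidates."""
    return match(extract_features(normalize(preprocess(strokes))), dictionary)


# Hypothetical two-character dictionary: a horizontal and a vertical stroke.
toy_dictionary = {"一": [1.0, 0.0, 0.0, 0.0], "丨": [0.0, 0.0, 1.0, 0.0]}
horizontal = [[(10, 50), (10, 50), (60, 52)]]  # one nearly horizontal stroke
candidates = recognize(horizontal, toy_dictionary)
print(candidates[0])  # prints 一
```

The ranked candidate list returned by `match` is what makes first-choice and multi-candidate accuracy measurable later on.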

3. Recognition performance indicators

The quality of recognition software is usually judged mainly on the following performance indicators:

(1) Recognition rate: the rate of correct recognition over a large-scale test sample set, usually subdivided into the first-choice (top-1) accuracy and the ten-candidate (top-10) accuracy. Test sample sets are usually divided into: neatly written characters, connected-stroke (cursive) characters, and freely written characters (i.e., characters written out of the standard stroke order). (See the national 863 Program evaluation.)

Therefore, recognition capability (can it recognize connected-stroke characters? can it recognize characters written out of the standard stroke order?) is also an important indicator when evaluating recognition performance.

(2) Recognition speed: the CPU time consumed to recognize each test sample (character).

(3) Scope of the recognizable character set: typically includes standard Chinese characters (the 6763 Chinese characters specified in GB2312), characters outside that standard set (traditional characters and non-standard simplified characters), English letters, Arabic numerals, punctuation marks, symbols, etc.

(4) Size of the recognition dictionary

(5) Memory (RAM) demand: that is, how much memory (RAM) is required during the recognition process.
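Indicators (1) and (2) in the list above could be measured roughly as follows. This is a hedged sketch: the `evaluate` function, the stub recognizer, and the four-sample test set are hypothetical, and the recognizer is assumed to return a ranked list of candidate characters for each sample.

```python
import time


def evaluate(recognizer, test_set, top_n=10):
    """Return (top-1 accuracy, top-N accuracy, mean CPU seconds per sample).

    recognizer: callable mapping a sample to a ranked list of candidate characters.
    test_set: non-empty list of (sample, correct_character) pairs.
    """
    start = time.process_time()  # CPU time of this process, not wall-clock time
    results = [recognizer(sample) for sample, _ in test_set]
    cpu_seconds = time.process_time() - start

    top1_hits = 0
    topn_hits = 0
    for candidates, (_, truth) in zip(results, test_set):
        if candidates[:1] == [truth]:
            top1_hits += 1  # the first choice is correct
        if truth in candidates[:top_n]:
            topn_hits += 1  # the correct character is among the first N candidates

    n = len(test_set)
    return top1_hits / n, topn_hits / n, cpu_seconds / n


def stub_recognizer(sample):
    # Hypothetical stand-in: each sample already is its ranked candidate list.
    return sample


test_set = [
    (["一", "二"], "一"),  # correct at rank 1
    (["二", "一"], "一"),  # correct only at rank 2
    (["三", "二"], "一"),  # correct character not among the candidates
    (["丨"], "丨"),        # correct at rank 1
]
top1, top10, cpu_per_sample = evaluate(stub_recognizer, test_set)
print(top1, top10)  # prints 0.5 0.75
```

Running the same function separately on neatly written, connected-stroke, and freely written test sets would give the per-category rates mentioned in indicator (1).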

If recognition software has a high recognition rate, fast recognition speed, a comprehensive recognizable character set, a small recognition dictionary, and low memory demand, then it is very good recognition software.

Recognition performance depends on the structure of the recognition program and of the recognition dictionary.

When reposting, please credit the original address: https://www.9cbs.com/read-80847.html
