Handwriting recognition

From Wikipedia, the free encyclopedia

Handwriting recognition is the ability of a computer to receive and interpret intelligible handwritten input from sources such as paper documents, photographs, touch-screens and other devices. The image of the written text may be sensed "off line" from a piece of paper by optical scanning (optical character recognition) or intelligent word recognition. Alternatively, the movements of the pen tip may be sensed "on line", for example by a pen-based computer screen surface.

Handwriting recognition principally entails optical character recognition. However, a complete handwriting recognition system also handles formatting, performs correct segmentation into characters and finds the most plausible words.
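The last step, finding the most plausible word, can be illustrated with a minimal sketch. It assumes the character recognizer has already produced a raw string and that a small word list is available. The function names, the lexicon and the use of edit distance as the plausibility measure are illustrative assumptions, not a description of any particular product.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum number of single-character
    insertions, deletions and substitutions turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        prev = cur
    return prev[-1]


def most_plausible_word(raw: str, lexicon: Iterable[str]) -> str:
    """Return the lexicon entry closest to the raw recognizer output."""
    return min(lexicon, key=lambda word: edit_distance(raw, word))


# Hypothetical recognizer output in which "l" was misread as "1".
corrected = most_plausible_word("he1lo", ["hello", "help", "halo"])
```

Real systems typically combine recognizer confidence with a language model rather than relying on edit distance alone.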

On-line recognition

On-line handwriting recognition involves the automatic conversion of text as it is written on a special digitizer or PDA, where a sensor picks up the pen-tip movements as well as pen-up/pen-down switching. That kind of data is known as digital ink and can be regarded as a dynamic representation of handwriting. The obtained signal is converted into letter codes which are usable within computer and text-processing applications.
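As a rough illustration of what digital ink might look like in memory, the sketch below models each sensor reading as a position, a timestamp and a pen-down flag, then groups consecutive pen-down readings into strokes. The `PenSample` type, its field names and the sample values are assumptions made for this example, since real digitizers each expose their own formats.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class PenSample:
    x: float
    y: float
    t: float         # timestamp in seconds
    pen_down: bool   # True while the tip touches the surface


Stroke = List[Tuple[float, float, float]]


def split_into_strokes(samples: List[PenSample]) -> List[Stroke]:
    """Group consecutive pen-down samples into strokes.

    A pen-up sample closes the stroke currently being collected.
    """
    strokes: List[Stroke] = []
    current: Stroke = []
    for s in samples:
        if s.pen_down:
            current.append((s.x, s.y, s.t))
        elif current:
            strokes.append(current)
            current = []
    if current:  # input ended while the pen was still down
        strokes.append(current)
    return strokes


# Two short strokes separated by a single pen-up reading.
samples = [
    PenSample(0, 0, 0.00, True),
    PenSample(1, 1, 0.01, True),
    PenSample(1, 1, 0.02, False),
    PenSample(3, 0, 0.03, True),
    PenSample(3, 2, 0.04, True),
]
strokes = split_into_strokes(samples)
```

Keeping the timestamps is what makes the representation dynamic: stroke order and speed remain available to the recognizer, which a scanned image cannot provide.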

The elements of an on-line handwriting recognition interface typically include:

  • a pen or stylus for the user to write with.
  • a touch sensitive surface, which may be integrated with, or adjacent to, an output display.
  • a software application which interprets the movements of the stylus across the writing surface, translating the resulting curves into digital text.

Commercial products incorporating handwriting recognition as a replacement for keyboard input were introduced in the early 1980s. Examples include handwriting terminals such as the Pencept Penpad [1] and the Inforite point-of-sale terminal [2]. With the advent of the large consumer market for personal computers, several commercial products were introduced to replace the keyboard and mouse on a personal computer with a single pointing/handwriting system, such as those from Pencept [3], CIC [4] and others. The first commercially available tablet-type portable computer was the GRiDPad from GRiD Systems, released in September 1989. Its operating system was based on MS-DOS.

Handwriting recognition is often used as an input method for hand-held PDAs. The first PDA to provide written input was the Apple Newton, which exposed the public to the advantage of a streamlined user interface. However, the device was not a commercial success, owing to the unreliability of the software, which tried to learn a user's writing patterns. By the time Newton OS 2.0 was released, with greatly improved handwriting recognition that included features still not found in current recognition systems, such as modeless error correction, the largely negative first impression had already been made. After the Apple Newton was discontinued, the feature was ported to Mac OS X 10.2 and later as Inkwell. The Newton is often referred to as the only usable handwriting recognition device.

Another effort was Go's tablet computer, which ran Go's PenPoint operating system and was manufactured by various hardware makers such as NCR and IBM. IBM's ThinkPad tablet computer was based on the PenPoint operating system and used IBM's handwriting recognition. This recognition system was later ported to Microsoft Windows for Pen Computing and to IBM's Pen for OS/2. None of these were commercially successful.

Palm later launched a successful series of PDAs based on the Graffiti recognition system. Graffiti improved usability by defining a set of "unistrokes", or one-stroke forms, for each character. This narrowed the possibility for erroneous input, although memorizing the stroke patterns steepened the learning curve for the user. The Graffiti handwriting recognition was found to infringe on a patent held by Xerox, and Palm replaced Graffiti with a licensed version of the CIC handwriting recognition which, while also supporting unistroke forms, pre-dated the Xerox patent. The court finding of infringement was reversed on appeal, and then reversed again on a later appeal. The parties involved subsequently negotiated a settlement concerning this and other patents.
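To see why one-stroke forms narrow the room for error, consider the toy matcher sketched below. It reduces a stroke to a short sequence of compass directions and looks that sequence up in a table. This is not Graffiti's actual algorithm or alphabet: the direction coding, the `TEMPLATES` table and the `min_step` threshold are all assumptions chosen for illustration.

```python
import math
from typing import List, Optional, Tuple

Point = Tuple[float, float]

# Hypothetical templates: direction sequence -> character.
# These are illustrative only, not the real Graffiti stroke set.
TEMPLATES = {
    "DR": "L",   # down, then right
    "D": "I",    # a single downward line
    "RD": "7",   # right, then down
}


def direction_codes(stroke: List[Point], min_step: float = 1.0) -> str:
    """Quantize a stroke into R/L/U/D moves, collapsing repeats.

    Assumes y grows upward. Moves shorter than min_step are treated
    as jitter and accumulated into the next move.
    """
    if len(stroke) < 2:
        return ""
    codes: List[str] = []
    prev = stroke[0]
    for p in stroke[1:]:
        dx, dy = p[0] - prev[0], p[1] - prev[1]
        if math.hypot(dx, dy) < min_step:
            continue  # keep prev so small moves add up
        if abs(dx) >= abs(dy):
            code = "R" if dx > 0 else "L"
        else:
            code = "U" if dy > 0 else "D"
        if not codes or codes[-1] != code:
            codes.append(code)
        prev = p
    return "".join(codes)


def recognize(stroke: List[Point]) -> Optional[str]:
    """Return the matching character, or None if no template fits."""
    return TEMPLATES.get(direction_codes(stroke))


# An "L"-shaped stroke: straight down, then straight right.
l_stroke = [(0, 10), (0, 5), (0, 0), (5, 0), (10, 0)]
result = recognize(l_stroke)
```

Because every character is a single stroke with a distinct shape, the recognizer never has to decide where one character ends and the next begins, which removes the segmentation problem entirely.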

A modern handwriting recognition system can be seen in Microsoft's operating systems for Tablet PCs (notably Windows XP Tablet PC Edition and Windows Vista). It is based on a time delay neural network (TDNN) classifier, nicknamed "Inferno", built at Microsoft. Later, a version of CalliGrapher, the handwriting recognition software used on Newton OS 2.0, which Microsoft licensed from ParaGraph International in 1999, was integrated as a secondary recognizer alongside the TDNN. The new generation of CalliGrapher software is currently shipped for Windows Mobile by PhatWare Corp, which acquired ParaGraph in 2001.

The "third generation" riteScript handwriting recognition technology, built by EverNote Corporation (the successor of the Pen&Internet division of Parascript) in 2000-2004, is included in the ritePen and EverNote software. ritePen also includes fusion technology, which combines riteScript with the handwriting recognition embedded in Windows Vista to improve on the accuracy of either recognition engine alone.

A Tablet PC is a special notebook computer that is outfitted with a digitizer tablet and a stylus, and allows a user to handwrite text on the unit's screen. The operating system recognizes the handwriting and converts it into typewritten text. Windows Vista includes personalization features that learn a user's writing patterns and vocabulary for English, Japanese, Chinese Traditional, Chinese Simplified and Korean. The features include a "personalization wizard" that prompts for samples of a user's handwriting and uses them to retrain the system for higher recognition accuracy. This system is distinct from the less advanced handwriting recognition system employed in Microsoft's Windows Mobile OS for PDAs.

In recent years, several attempts were made to produce ink pens that include digital elements, such that a person could write on paper, and have the resulting text stored digitally. The best known of these use technology developed by Anoto[5], which has had some success in the education market. The general success of these products is yet to be determined.

Although handwriting recognition is an input form that the public has become accustomed to, it has not achieved widespread use in either desktop computers or laptops. It is still generally accepted that keyboard input is both faster and more reliable. As of 2006, many PDAs offer handwriting input, sometimes even accepting natural cursive handwriting, but accuracy is still a problem, and some people still find even a simple on-screen keyboard more efficient.

Brief Historical Notes

  • 1915: U.S. Patent on handwriting recognition user interface with a stylus[6] [7]
  • 1957: Stylator tablet: Tom Dimond demonstrates electronic tablet with pen for computer input and handwriting recognition [8]
  • 1961: RAND Tablet invented: better known than earlier Stylator system[9] [10]
  • 1962: Computer recognition of connected/script handwriting[11]
  • 1969: GRAIL system: handwriting recognition with electronic ink display, gesture commands[12]
  • 1973: Applicon CAD/CAM computer system [13] using the Ledeen recognizer for handwriting recognition [14]
  • 1980s: Retail handwriting-recognition systems: Pencept [3] and CIC [4] both offer personal computers for the consumer market using a tablet and handwriting recognition instead of a keyboard and mouse. Cadre Systems markets the Inforite point-of-sale terminal using handwriting recognition and a small electronic tablet and pen [15].
  • 1989: Portable handwriting recognition computer: GRiDPad [16] from GRiD Systems.

Off-line recognition

Off-line handwriting recognition involves the automatic conversion of text in an image into letter codes which are usable within computer and text-processing applications. The data obtained in this form is regarded as a static representation of handwriting.

The technology is successfully used by businesses that process large volumes of handwritten documents, such as insurance companies. The quality of recognition can be substantially increased by structuring the document (for example, by using forms).

Off-line handwriting recognition is comparatively difficult, as different people have different handwriting styles. Nevertheless, limiting the range of input can improve recognition. For example, ZIP code digits are generally read by computer to sort incoming mail.
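The effect of limiting the range of input can be shown with a small sketch. It assumes a classifier that has already assigned a score to each candidate character for one image region. The scores and the `best_label` helper are invented for illustration and do not come from any real recognizer.

```python
from typing import Dict, Iterable


def best_label(scores: Dict[str, float], allowed: Iterable[str]) -> str:
    """Pick the highest-scoring label among those the field allows."""
    allowed_set = set(allowed)
    candidates = {label: s for label, s in scores.items()
                  if label in allowed_set}
    if not candidates:
        raise ValueError("no allowed label received a score")
    return max(candidates, key=candidates.get)


# Hypothetical scores for a round glyph that could be a letter or a digit.
scores = {"O": 0.46, "0": 0.41, "D": 0.08, "6": 0.05}

# With no restriction, the letter "O" narrowly wins.
unconstrained = best_label(scores, scores.keys())

# In a ZIP code field only digits are possible, so the ambiguity vanishes.
zip_digit = best_label(scores, "0123456789")
```

The same idea explains why forms help: a field known to hold a date, an amount or a postal code rules out most of the alphabet before recognition even begins.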

Research

Handwriting recognition has an active community of academics studying it. The biggest conferences for handwriting recognition are the International Workshop on Frontiers in Handwriting Recognition (IWFHR), held in even-numbered years, and the International Conference on Document Analysis and Recognition (ICDAR), held in odd-numbered years. Both of these conferences are endorsed by the IEEE. Active areas of research include:

  • Online Recognition
  • Offline Recognition
  • Signature Verification
  • Postal-Address Interpretation
  • Bank-Check Processing

References

  1. ^ Pencept Penpad (TM) 200 Product Literature, Pencept, Inc., 1982-08-15, http://rwservices.no-ip.info:81/pens/biblio83.html#Pencept83 
  2. ^ Inforite Hand Character Recognition Terminal, Cadre Systems Limited, England, 1982-08-15, http://rwservices.no-ip.info:81/pens/biblio83.html#Inforite82 
  3. ^ a b Users Manual for Penpad 320, Pencept, Inc., 1984-06-15, http://users.erols.com/rwservices/pens/biblio85.html#Pencept84d 
  4. ^ a b Handwriter (R) GrafText (TM) System Model GT-5000, Communication Intelligence Corporation, 1985-01-15, http://rwservices.no-ip.info:81/pens/biblio85.html#CIC85 
  5. ^ Anoto Technology: Digital Pen and Paper, Anoto Group AB, http://www.anoto.com, retrieved on 2008-08-23 
  6. ^ Goldberg, H.E., Controller, United States Patent 1,117,184, http://users.erols.com/rwservices/pens/biblio70.html#GoldbergHE15 
  7. ^ Goldberg, H.E. (PDF), Controller, United States Patent 1,117,184 (full image), http://www.freepatentsonline.com/1117184.pdf 
  8. ^ Dimond, Tom (1957-12-01), Devices for reading handwritten characters, Proceedings of Eastern Joint Computer Conference, pp. 232-237, http://rwservices.no-ip.info:81/pens/biblio70.html#Dimond57, retrieved on 2008-08-23 
  9. ^ RAND Tablet, 1961-09-01, http://users.erols.com/rwservices/pens/biblio70.html#RAND61 
  10. ^ 50 Years of Looking Forward, RAND Corporation, 1998-09-01, http://www.rand.org/publications/randreview/issues/rr.fall.98/50.html 
  11. ^ Harmon, L.D. (1962-08-01), Handwriting reader recognizes whole words, Electronics, Vol 35, August 1962, http://users.erols.com/rwservices/pens/biblio70.html#Harmon62 
  12. ^ Ellis, T.O. (1969-09-01), The GRAIL Project: An Experiment in Man-Machine Communications, The RAND Corporation, RM-5999-ARPA, Santa Monica, California, September 1969, http://users.erols.com/rwservices/pens/biblio70.html#EllisTO69a 
  13. ^ Computerized Graphic Processing System: System User's Manual, Applicon Incorporated, 1973-09-01, http://users.erols.com/rwservices/pens/biblio75.html#Applicon73 
  14. ^ Newman, W.M. (1973-09-01), The Ledeen Character Recognizer, Principles of Interactive Computer Graphics, McGraw-Hill, pp. 575-582, http://users.erols.com/rwservices/pens/biblio75.html#NewmanWM73a 
  15. ^ Inforite Hand Character Recognition Terminal, Cadre Systems Limited, England, 1982-08-15, http://users.erols.com/rwservices/pens/biblio83.html#Inforite82 
  16. ^ The BYTE Awards: GRiD System's GRiDPad, BYTE Magazine, Vol 15. No 1, 1990-01-12, pp. 285, http://rwservices.no-ip.info:81/pens/biblio90.html#GridPad90a 
