Rich Text Format

Filename extension: .rtf
Internet media type: text/rtf
Type code: 'RTF '
Uniform Type Identifier: public.rtf
Magic number: {\rtf
Developed by: Microsoft
Type of format: document file format

The Rich Text Format (often abbreviated RTF) is a document file format developed by Microsoft in 1987 for cross-platform document interchange. Most word processors are able to read and write RTF documents.

It should not be confused with enriched text (MIME type "text/enriched", RFC 1896) or its predecessor Rich Text (MIME type "text/richtext", RFCs 1341 and 1521), which are completely different specifications.

History

Richard Brodie, Charles Simonyi, and David Luebbert, members of the Microsoft Word development team, developed the original RTF in the mid-to-late 1980s. Its syntax was influenced by the TeX typesetting language. The first RTF reader and writer shipped in 1987 as part of Microsoft Word 3.0 for Macintosh, which implemented version 1.0 of the RTF specification.

All subsequent releases of Microsoft Word for the Macintosh and all versions of Microsoft Word for Windows have included built-in RTF readers and writers which translate from RTF to Word's .doc format and from .doc to RTF.

Microsoft still owns the format; as of March 2008, the current version of the specification is 1.9.1.

RTF specification timeline

  • 1987: RTF 1.0
  • January 1994: RTF 1.3
  • April 1997: RTF 1.5
  • May 1999: RTF 1.6
  • August 2001: RTF 1.7
  • April 2004: RTF 1.8
  • March 2008: RTF 1.9.1

Sample RTF document

As an example, the following RTF code:

{\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0\pard
This is some {\b bold} text.\par
}

would be rendered like this when read by an appropriate word processor:

This is some bold text.

A backslash (\) starts an RTF control code. The \par control code marks the end of a paragraph, and \b switches to a bold typeface. Braces ({ and }) define a group; the example uses a group to limit the scope of the \b control code. Everything else is treated as plain text, that is, the text to be formatted. A valid RTF document is a group that starts with the \rtf control code.
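The structure described above (control codes, groups, and plain text) can be illustrated with a small tokenizer. This is a minimal sketch in Python, not a complete RTF reader: the names TOKEN, tokenize, and SAMPLE are hypothetical, and the sketch assumes that a single space after a control code is a delimiter rather than text and that line breaks in the source carry no meaning.

```python
import re

# One alternative per kind of token in the sample document.
TOKEN = re.compile(
    r"\\([a-zA-Z]+)(-?\d+)? ?"  # control code, optional number, optional delimiter space
    r"|\\(.)"                   # backslash followed by a non-letter (e.g. \{ or \')
    r"|([{}])"                  # group delimiters
    r"|([^\\{}]+)",             # everything else is plain text
    re.DOTALL,
)


def tokenize(rtf):
    """Split RTF source into (kind, value, parameter) tuples."""
    tokens = []
    for match in TOKEN.finditer(rtf):
        word, param, symbol, brace, text = match.groups()
        if word is not None:
            number = int(param) if param is not None else None
            tokens.append(("word", word, number))
        elif symbol is not None:
            tokens.append(("symbol", symbol, None))
        elif brace is not None:
            kind = "open" if brace == "{" else "close"
            tokens.append((kind, brace, None))
        else:
            # Assumption: raw line breaks are not part of the document text.
            text = text.replace("\r", "").replace("\n", "")
            if text:
                tokens.append(("text", text, None))
    return tokens


SAMPLE = r"""{\rtf1\ansi{\fonttbl\f0\fswiss Helvetica;}\f0\pard
This is some {\b bold} text.\par
}"""

for token in tokenize(SAMPLE):
    print(token)
```

Run on the sample document, the first two tokens are the opening brace and the \rtf control code with parameter 1, and the word "bold" appears as a text token inside its own group, directly after the \b control code.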

Character encoding

RTF files are normally written as 7-bit ASCII text, but RTF can encode characters beyond ASCII with escape sequences. The character escapes are of two types: code page escapes and Unicode escapes. In a code page escape, an apostrophe followed by two hexadecimal digits denotes a character taken from a Windows code page. For example, if control codes specifying Windows-1256 are present, the sequence \'c8 encodes the Arabic letter beh (ب).
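The code page escape can be checked with a few lines of Python. This is a sketch under stated assumptions: the function name decode_codepage_escapes is hypothetical, and the code page is passed in by the caller, whereas a real reader would determine it from the control codes in the document.

```python
import re


def decode_codepage_escapes(rtf_text, codepage="cp1256"):
    """Replace each \\'hh escape with the character it denotes in `codepage`."""

    def replace(match):
        byte_value = int(match.group(1), 16)         # two hex digits -> one byte
        return bytes([byte_value]).decode(codepage)  # byte -> character

    return re.sub(r"\\'([0-9a-fA-F]{2})", replace, rtf_text)


# Windows-1256 is available in Python as the "cp1256" codec.
print(decode_codepage_escapes(r"\'c8"))
```

With Windows-1256 selected, the escape \'c8 decodes to U+0628, the Arabic letter beh.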

If a Unicode escape is required, the control word \u is used, followed by a 16-bit signed decimal integer giving the Unicode code point; values above 32767 are therefore written as negative numbers. For the benefit of programs without Unicode support, the escape must be followed by the nearest representation of the character in the specified code page. For example, \u1576? encodes the Arabic letter beh and tells older programs without Unicode support to render it as a question mark instead.

The control word \uc0 can be used to indicate that subsequent Unicode escape sequences within the current group do not specify a substitution character.
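The Unicode escape can be sketched as a small encoder. The function name rtf_escape and its fallback parameter are hypothetical. The sketch always uses a single fallback character instead of looking up the nearest code page representation, and writing characters outside the 16-bit range as two consecutive escapes (one per UTF-16 code unit) is an assumption rather than something stated above.

```python
def rtf_escape(text, fallback="?"):
    """Escape `text` for RTF, writing non-ASCII characters as \\uN escapes."""
    out = []
    for ch in text:
        if ch in "\\{}":
            # Backslash and braces have syntactic meaning and must be escaped.
            out.append("\\" + ch)
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Assumption: characters beyond 16 bits become two escapes,
            # one per UTF-16 code unit.
            data = ch.encode("utf-16-le")
            for i in range(0, len(data), 2):
                unit = int.from_bytes(data[i:i + 2], "little")
                if unit > 32767:
                    unit -= 65536  # the number is a signed 16-bit integer
                out.append("\\u" + str(unit) + fallback)
    return "".join(out)


print(rtf_escape("\u0628"))  # \u1576?
print(rtf_escape("\ufb01"))  # \u-1279?
```

Passing fallback="" is only appropriate for text that follows a \uc0 control word in the same group, since otherwise a reader expects a substitution character after each escape.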

Human readability

Unlike most word processing formats, well-written RTF code can be human-readable: when an RTF file is opened in a text editor, the text is legible and the markup is not too distracting or counter-intuitive. However, the RTF files produced by most programs, such as MS Word, contain so many control codes for compatibility with older programs that they are often an order of magnitude larger than the raw text and very difficult to read. Formats such as MS Word's .doc are, in contrast, binary formats with only a few scraps of legible text.

Human-readable XML-based formats have since become more common, but at the time of RTF's release its level of readability was rare among document formats. The XML-based OpenDocument and Office Open XML formats are often not immediately human-readable, however, because each document is a bundle of several files within a ZIP archive.

Common implementations

Most word processors can import and export RTF, which often makes it a "common" format between otherwise incompatible word processing software.

The WordPad editor in Microsoft Windows creates RTF files by default. It once defaulted to the Microsoft Word 6.0 file format, but write support for Word documents was dropped in a security update.

The free and open-source word processors AbiWord, OpenOffice.org, and KWord can view and edit RTF files.

TextEdit, the default editor for Mac OS X, can view and edit both RTF and RTFD files.

SIL International’s Toolbox application for developing and publishing dictionaries uses RTF as its most common form of document output. RTF files produced by Toolbox are designed to be used in Microsoft Word, but can also be used by other RTF-aware word processors.
