Microsoft Compiled HTML Help

From Wikipedia, the free encyclopedia

Microsoft Compiled HTML Help
Filename extension: .chm
Developed by: Microsoft
Initial release: 1997
Extended to: .lit
Standard(s): No

Microsoft Compiled HTML Help is a proprietary format for online Help files, developed by Microsoft and first released in 1997 as the successor to the Microsoft WinHelp format. It was included with Windows 98 and is still supported and distributed on the Windows XP and Windows Vista platforms.

HTML Help files are produced with Help authoring tools. Microsoft ships HTML Help Workshop with supported versions of Microsoft Windows and also makes the tool available as a free download. A number of third-party Help authoring tools are available as well.

The Microsoft Reader .LIT file format is a modification of the HTML Help CHM format.[citation needed]

.chm files are sometimes used for e-books. Some users find the format inconvenient for this purpose because of its limited compatibility and the difficulty of converting it to other formats without dedicated tools.

In 2002, Microsoft disclosed security risks associated with the .CHM format and issued security bulletins and patches.[1] Microsoft has since announced that it does not intend to develop the .CHM format further and is moving to a new generation of Windows Help, called Microsoft Assistance Markup Language, in the Windows Vista operating system.

History

Month     Year  Description
February  1996  Microsoft announced plans to stop development of WinHelp and start development on HTML Help.
August    1997  HTML Help 1.0 (HH 1.0) was released with Internet Explorer 4.
February  1998  HTML Help 1.1a shipped with Windows 98.
January   2000  HTML Help 1.3 shipped with Windows 2000.
July      2000  HTML Help 1.32 released with Internet Explorer 5.5 and Windows Me.
October   2001  HTML Help 1.33 released with Internet Explorer 6 and Windows XP.
March           At the WritersUA (formerly WinWriters) conference, Microsoft announced plans for a new help platform, Help 2, also HTML-based.
January   2003  Microsoft decided not to release Microsoft Help 2 as a general Help platform.

File format

A CHM Help file has a ".chm" extension. It contains a set of web pages written in a subset of HTML, together with a hyperlinked table of contents. The CHM format is optimized for reading, as files are heavily indexed. All files are compressed together with LZX compression. Most CHM browsers can display the table of contents outside the body text of the Help file.

The file starts with the bytes "ITSF" (in ASCII), for "Info-Tech Storage Format". The format has been reverse engineered by Matthew Russotto with assistance from Peter Ferrie and an anonymous contributor known only as "pabs (at) zip (dot) to". Russotto's documentation is freely available at http://www.russotto.net/chm/chmformat.html.
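The signature described above gives a cheap way to tell whether a file is plausibly a CHM file before handing it to a viewer or extractor. The following is a minimal sketch in Python; the function name looks_like_chm is hypothetical, and the check covers only the four leading bytes, not the rest of the header.

```python
CHM_SIGNATURE = b"ITSF"  # the four ASCII bytes at the start of a CHM file


def looks_like_chm(path):
    """Return True if the file at `path` begins with the ITSF signature.

    This inspects only the first four bytes; it does not validate
    the remainder of the file.
    """
    with open(path, "rb") as handle:
        return handle.read(len(CHM_SIGNATURE)) == CHM_SIGNATURE
```

A file that passes this check may still be truncated or corrupt, so the result is best treated as a quick filter rather than as validation.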

On Windows computers, this Help file can be compiled using hhc.exe. There are some open source tools which can read and explore these files, but they lack various features of the Microsoft Windows tools, most importantly write support.

A CHM file can contain links to other CHM files. When opening such a CHM file for the first time, the HTML Help viewer creates an index file with the extension .CHW. The .CHW file contains all the index terms of the master and linked CHM files, and enables faster searching for indexed terms.[2]

Advantages

  • File size smaller than plain HTML
  • Range of formatting options that HTML gives for text presentation
  • Ability to search the full text
  • Ability to assemble several CHM files into one file with common TOC, index and search (see MSDN)
  • Ability to generate a TOC and topic folders containing international characters. Standard HTML Help will not generate these correctly.

Applications

This format was originally intended only for encoding Help files, but other uses have since been found. It is convenient for packing saved HTML pages into one compact, browsable archive and for creating compact e-books. Some people use it to keep personal notes, because it can organize them in an ordered hierarchical table and allows quick text searching.

Reading on other platforms

  • xCHM
  • Archmage[6]
  • DisplayCHM
  • CHM Reader Firefox addon[7]

Extracting to HTML

On Windows, a CHM file can be extracted to plain HTML with the command:

hh.exe -decompile extracted filename.chm

This decompresses all files embedded in filename.chm into the folder named extracted. The same result can be obtained with HTML Help Workshop.
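When several files need to be decompiled, the hh.exe command above can be driven from a script. The sketch below, in Python, simply reproduces that command line; the helper names build_decompile_command and decompile are hypothetical, and the sketch assumes hh.exe is on the PATH of a Windows machine. It makes no assumption about what exit status hh.exe reports.

```python
import subprocess


def build_decompile_command(chm_path, output_dir):
    """Build the argument list for: hh.exe -decompile <output_dir> <chm_path>."""
    return ["hh.exe", "-decompile", str(output_dir), str(chm_path)]


def decompile(chm_path, output_dir):
    """Run hh.exe to decompile one CHM file (Windows only).

    Returns the completed process so the caller can inspect it;
    no meaning is assigned to the return code here.
    """
    return subprocess.run(build_decompile_command(chm_path, output_dir), check=False)
```

Passing the arguments as a list rather than as one string avoids quoting problems with paths that contain spaces.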

As a workaround on Windows, select the topmost topic, right-click it and select Print. In the dialog that appears, select "Print this heading and all subtopics." and click OK. Before selecting a printer, look in %HOMEPATH%\Local Settings\Temp for a file named ~hh*.htm. This file contains the concatenated HTML. Image references in it still point back into the .chm file.
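Locating the temporary file by hand can be tedious when the Temp folder is crowded. As a rough sketch, the Python function below lists files matching the ~hh*.htm pattern in a given directory, newest first; the function name find_print_temp_files is hypothetical, and the directory is passed in explicitly because its location varies between Windows versions.

```python
from pathlib import Path


def find_print_temp_files(temp_dir):
    """Return the ~hh*.htm files in temp_dir, most recently modified first."""
    candidates = Path(temp_dir).glob("~hh*.htm")
    return sorted(candidates, key=lambda p: p.stat().st_mtime, reverse=True)
```

The file should be copied elsewhere before the print dialog is dismissed, since it is a temporary file and may not persist.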

On Linux systems that use apt as a packaging tool, such as Debian-based distributions, a CHM file can be extracted to plain HTML with the following commands (the first installs the required package):

 $ sudo apt-get install libchm-bin
 $ extract_chmLib tero.chm tero/

Another useful set of tools for CHM files in non-Windows environments is the CHM Tools Package. It's available as source code, and includes a program, chmdump, which extracts the HTML from a CHM file into a separate directory.

It's also available for Mac OS X via MacPorts.

If MacPorts is installed on your system, you can type:

  $ sudo port install chmdump

at a Terminal prompt to install the package. You can then extract a CHM file with:

  $ chmdump chmfile.chm outdir
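Both extract_chmLib and chmdump, as used above, take the CHM file followed by an output directory, so a script can use whichever one happens to be installed. The following Python sketch illustrates this under that assumption; the helper names pick_extractor and extraction_command are hypothetical.

```python
import shutil
import subprocess

# Command-line extractors mentioned in this section, in order of preference.
EXTRACTORS = ("extract_chmLib", "chmdump")


def pick_extractor(which=shutil.which):
    """Return the name of the first extractor found on PATH, or None."""
    for name in EXTRACTORS:
        if which(name):
            return name
    return None


def extraction_command(tool, chm_path, out_dir):
    """Build the argument list: <tool> <chm file> <output directory>."""
    return [tool, str(chm_path), str(out_dir)]


def extract(chm_path, out_dir):
    """Extract chm_path into out_dir with whichever tool is available."""
    tool = pick_extractor()
    if tool is None:
        raise RuntimeError("neither extract_chmLib nor chmdump was found on PATH")
    return subprocess.run(extraction_command(tool, chm_path, out_dir), check=False)
```

The which parameter exists only so that the lookup can be replaced in tests; ordinary callers can ignore it.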

Known problems

Some CHM files behave poorly under IE7. Printing topics can crash the CHM viewer when a topic contains malformed HTML.[citation needed]

Alternative competing formats

  • MHTML (MIME HTML)

References

  1. WinWriters - Security and Microsoft Help
  2. Build Help Indexes in Advance, Microsoft Office Online
