From Wikipedia, the free encyclopedia

This article contains special characters. Without proper rendering support, you may see question marks, boxes, or other symbols.



A dash is a punctuation mark. It is longer than a hyphen and is used differently. The most common versions of the dash are the en dash (–) and the em dash (—).


Common dashes

There are several forms of dash, of which the most common are:

Name            Glyph  Unicode[1]     HTML[2]  HTML/XML[3]          TeX
figure dash     ‒      U+2012 (8210)  none     &#x2012; or &#8210;  none
en dash         –      U+2013 (8211)  &ndash;  &#x2013; or &#8211;  --
em dash         —      U+2014 (8212)  &mdash;  &#x2014; or &#8212;  ---
horizontal bar  ―      U+2015 (8213)  none     &#x2015; or &#8213;
swung dash      ⁓      U+2053 (8275)  none     &#x2053; or &#8275;
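
The Unicode and HTML columns of the table above can be derived mechanically from the characters themselves. The following is a minimal Python sketch using only the standard library; the helper name dash_row is hypothetical, and the named-entity lookup assumes the HTML 4 entity list that ships in html.entities.

```python
import unicodedata
from html.entities import codepoint2name

# The five dashes listed in the table, written as escapes.
DASHES = ["\u2012", "\u2013", "\u2014", "\u2015", "\u2053"]

def dash_row(ch):
    """Return (name, U+ notation, named entity or None, hex ref, decimal ref)."""
    cp = ord(ch)
    entity = codepoint2name.get(cp)  # HTML 4 named entity, if one exists
    return (
        unicodedata.name(ch).lower(),
        f"U+{cp:04X} ({cp})",
        f"&{entity};" if entity else None,
        f"&#x{cp:04X};",
        f"&#{cp};",
    )

for ch in DASHES:
    print(dash_row(ch))
```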

Figure dash

The figure dash (‒) is so named because it is the same width as a digit, at least in typefaces with digits of equal width.

The figure dash is used when a dash must be used within numbers, for example with telephone numbers: 867‒5309. This does not indicate a range (the en dash is used for that), nor does it function as the minus sign (which has its own glyph).

The figure dash is often unavailable; in this case, one may use a hyphen-minus instead. In Unicode, the figure dash is U+2012 (decimal 8210). HTML authors must use the numeric forms &#8210; or &#x2012; to type it unless the file is in Unicode; there is no equivalent character entity. In TeX, the standard fonts have no figure dash; however, the digits normally all have the same width as the en dash, so an en dash can be substituted in TeX.

En dash

The en dash, or n dash, n-rule, etc., (–) is roughly the width of the letter n. It is shorter than an em dash.

The en dash is used in ranges, such as 6–10 years, read as "six to ten years".

Ranges of values

The en dash is commonly used to indicate a closed range (a range with clearly defined and non-infinite upper and lower boundaries) of values, such as those between dates, times, or numbers.[4][5][6][7]

Some examples of this usage:

  • June–July 1967
  • 1:00–2:00 p.m.
  • For ages 3–5
  • pp. 38–55
  • President Jimmy Carter (1977–1981)

The Guide for the Use of the International System of Units (SI) recommends that the word "to" be used instead of an en dash when a number range might be misconstrued as subtraction, such as a range of quantities with units. For example, "a voltage of 50 V to 100 V" rather than "a voltage of 50 – 100 V".

It is also considered inappropriate to use the en dash in place of the words "to" or "and" in phrases that follow the forms "from ... to ..." and "between ... and ...".[5][6]
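
The range conventions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name format_range and its unit parameter are hypothetical, and the rule it applies (en dash for bare values, the word "to" once units are attached) is a simplification of the guidance described here.

```python
EN_DASH = "\u2013"

def format_range(low, high, unit=None):
    """Format a closed range of values.

    Bare values are joined with an unspaced en dash. When a unit is
    given, the word "to" is used so the range cannot be misread as a
    subtraction.
    """
    if unit is not None:
        return f"{low} {unit} to {high} {unit}"
    return f"{low}{EN_DASH}{high}"

print(format_range(38, 55))         # page range joined by an en dash
print(format_range(50, 100, "V"))   # 50 V to 100 V
```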

Relationships and connections

The en dash can also be used to contrast values, or illustrate a relationship between two things.[4][7]

Some examples of this usage:

  • Notre Dame beat Miami 31–30.
  • New York–London flight (though some sources say that New York to London flight is more appropriate because New York is a single name composed of two valid words; with a dash the phrase is ambiguous and could mean either Flight from New York to London or New flight from York to London[7])
  • Mother–daughter relationship
  • The Supreme Court voted 5–4 to uphold the decision.
  • The McCain–Feingold bill
  • A C–C single bond

A "simple" compound used as an adjective is written with a hyphen; at least one authority considers name pairs, as in the Taft-Hartley Act, to be "simple",[5] while most consider an en dash appropriate there[citation needed] to represent the parallel relationship, as in the McCain–Feingold bill or Bose–Einstein statistics. (Note, however, that truly compound names are written with a hyphen; thus the Lennard-Jones potential is named after one person, while Bose and Einstein are two people.)

Note that The Chicago Manual of Style limits the use of the en dash to two main purposes: to indicate ranges of time, money, or other amounts (or in certain other cases where it replaces the word to); and in place of a hyphen in a compound adjective when one of the elements of the adjective is an open compound or when one of the elements is already hyphenated.[8] That is, the Chicago Manual of Style rules specify en dash in these:

  • Notre Dame beat Miami 31–30.
  • New York–London flight.
  • The Supreme Court voted 5–4 to uphold the decision.

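but hyphens in the other relationship examples listed earlier, such as mother-daughter relationship and the McCain-Feingold bill.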

Compound adjectives

The en dash can be used instead of a hyphen in compound adjectives in which one part consists of two words or a hyphenated word:[5][6]

  • The non–San Francisco part of the world
  • The post–MS-DOS era
  • High-priority–high-pressure tasks (tasks which are both high-priority and high-pressure).

Usage guidelines

The en dash is used instead of a hyphen in compound adjectives for which neither part of the adjective modifies the other, that is, when each part modifies the noun independently. This is common in science, where names compose an adjective, as in Bose–Einstein condensate. Compare this with "award-winning novel", in which "award" modifies "winning" and together they modify "novel". Contrast "Franco-Prussian War", "Anglo-Saxon", etc., in which the first element does not strictly modify the second, but a hyphen is still normally used. The Chicago Manual of Style recognizes but does not mandate this usage and uses a hyphen in Bose-Einstein condensate.[8]

En dashes that are used instead of hyphens to connect words normally do not have spaces around them. An exception is when excluding them may cause confusion or look odd (e.g., 12 June – 3 July; contrast 12 June–3 July). However, when an actual en dash is unavailable, one may use a hyphen-minus with a single space on each side (" - ").

Parenthetic and other uses at the sentence level

Like em dashes, en dashes can be used instead of colons, or pairs of commas that mark off a nested clause or phrase. They can also be used around parenthetical expressions – such as this one – in place of the em dashes preferred by some publishers, particularly where short columns are used, since em dashes can look awkward at the end of a line. See En dash versus em dash, below. In these situations, en dashes must have a single space on each side.

Electronic usage

In Unicode, the en dash is U+2013 (decimal 8211). In HTML, one may use the numeric forms &#8211; or &#x2013;; there is also an HTML entity &ndash;. In TeX, the en dash may normally (depending on the font) be input as a double hyphen-minus (--). On a computer running the Mac OS X operating system, most keyboard layouts map an en dash to Option-hyphen. On Microsoft Windows, an en dash may be entered as Alt+0150, where the digits are typed on the numeric keypad while holding the Alt key down.

The en dash is sometimes used as a substitute for the minus sign when the minus sign character is not available, since the en dash is usually the same width as a plus sign. For example, the original 8-bit Macintosh character set had an en dash, which could serve as a minus sign, years before Unicode made a dedicated minus sign available. The hyphen-minus is usually too narrow to make a typographically acceptable minus sign. The en dash cannot be used as a minus in programming languages, however, since their syntax usually requires a hyphen-minus; because program code is usually set in a fixed-pitch (monospaced) font, the hyphen-minus looks acceptable there.
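
Because program syntax expects the hyphen-minus, text copied from typeset sources may need normalizing before it is parsed as a number. The following is a minimal Python sketch of that idea; the names MINUS_LIKE and parse_int are hypothetical, and the set of characters treated as a minus is an assumption limited to the two discussed here.

```python
# Characters that may stand in for a minus sign in typeset text,
# mapped to the ASCII hyphen-minus that parsers expect.
MINUS_LIKE = str.maketrans({
    "\u2212": "-",  # MINUS SIGN
    "\u2013": "-",  # EN DASH used as a minus substitute
})

def parse_int(text):
    """Parse an integer, accepting a typographic minus or an en dash as the sign."""
    return int(text.strip().translate(MINUS_LIKE))

print(parse_int("\u22125"))  # typographic minus followed by 5
print(parse_int("-5"))       # plain hyphen-minus
```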

Em dash

The em dash (—), or m dash, m-rule, etc., often demarcates a parenthetical thought—like this one—or some similar interpolation.

It is also used to indicate that a sentence is unfinished because the speaker has been interrupted. Similarly, it can be used instead of an ellipsis to indicate aposiopesis, the rhetorical device by which a sentence is stopped short not because of interruption but because the speaker is too emotional to continue, such as Darth Vader's line "I sense something; a presence I have not felt since—" in Star Wars Episode IV: A New Hope.

The term em dash derives from its defined width of one em (originally the width of the letter m), which is the length, expressed in points, by which font sizes are typically specified. Thus in 9-point type, an em is 9 points wide, while the em of 24-point type is 24 points wide, and so on. (By comparison, the en dash, with its 1-en width, is 1/2 em wide in any font.[citation needed])
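
The arithmetic in the preceding paragraph can be written out as a short Python sketch. The function names are hypothetical, and the half-em value for the en is taken from the (uncited) statement above rather than from any font specification.

```python
def em_width(point_size):
    """Width of one em, in points: equal to the point size of the type."""
    return float(point_size)

def en_width(point_size):
    """Width of one en, in points, assuming an en is half an em."""
    return em_width(point_size) / 2

for size in (9, 24):
    print(size, em_width(size), en_width(size))
```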

The em dash is used in much the way a colon or set of parentheses is used: it can show an abrupt change in thought or be used where a full stop (or "period") is too strong and a comma too weak. Em dashes are sometimes used in lists or definitions, but that is a style guide issue; a colon is often recommended for use instead.

According to most American sources (e.g., The Chicago Manual of Style) and to some British sources (e.g., The Oxford Guide to Style), an em dash should always be set closed (not surrounded by spaces). But the practice in many parts of the English-speaking world, also the style recommended by The New York Times Manual of Style and Usage, sets it open (separates it from its surrounding words by using spaces or hair spaces (U+200A)) when it is being used parenthetically. Some writers, finding the em dash unappealingly long, prefer to use an open-set en dash. This "space, en dash, space" sequence is also the predominant style in German and French typography. See En dash versus em dash below.

Monospaced fonts (such as Courier) that mimic the look of a typewriter have the same width for all characters. Some of these fonts have em and en dashes which more or less fill the monospaced width they have available. For example, "- – — −" will show as a hyphen, en dash, em dash, and minus in a monospace font. Typewriters often only have a single hyphen glyph, so it is common to use two monospace hyphens strung together--like this--to serve as an em dash.

When an actual em dash is unavailable—as in the ASCII character set—a double ("--") or triple hyphen-minus ("---") is used. In Unicode, the em dash is U+2014 (decimal 8212). In HTML, one may use the numeric forms &#8212; or &#x2014;; there is also the HTML entity &mdash;. In TeX, the em dash may normally be input as a triple hyphen-minus (---). On a computer running the OS X operating system, most keyboard layouts map an em dash to Shift-Option-hyphen. On Microsoft Windows, an em dash may be entered as Alt+0151, where the digits are typed on the numeric keypad while holding the Alt key down. It can also be entered into Microsoft Office applications by using the Ctrl-Alt-hyphen combination.
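
The TeX-style input convention described in this article (two hyphen-minuses for an en dash, three for an em dash) can be imitated for plain text. This Python sketch is an approximation, not TeX's actual ligature mechanism, and the function name tex_dashes is hypothetical.

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"

def tex_dashes(text):
    """Convert "---" to an em dash and "--" to an en dash.

    The longer run is replaced first so that "---" is not consumed
    as "--" followed by a stray hyphen-minus.
    """
    return text.replace("---", EM_DASH).replace("--", EN_DASH)

print(tex_dashes("pp. 38--55"))
print(tex_dashes("a thought---like this one---interrupts"))
```

A single hyphen-minus is left untouched, so ordinary hyphenated words pass through unchanged.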

En dash versus em dash

The en dash is wider than the hyphen but not as wide as the em dash. The width of the en dash was originally the width of the typeset lowercase letter 'n', while the width of the em dash was the width of an uppercase 'M'—hence the names. A more correct definition of the em width is the point size of the currently used font, since the M character is not always the width of the point size.[9]

Traditionally an em dash—like so—or a spaced em dash — like so — has been used for a dash in running text. The Elements of Typographic Style recommends the more concise spaced en dash – like so – and argues that the length and visual magnitude of an em dash "belongs to the padded and corseted aesthetic of Victorian typography." The spaced en dash is also the house style for certain major publishers (Penguin, Cambridge University Press, and Routledge among them). However, some longstanding typographical guides such as The Chicago Manual of Style still recommend unspaced em dashes for this purpose. The Oxford Guide to Style (2002, section 5.10.10) acknowledges that this style is used by "other British publishers", but observes that Oxford University Press (OUP) does not use it. In practice, there is little consensus, and it is a matter of personal or house taste.

The en dash (always with spaces, in running text) and the spaced em dash both have a certain technical advantage over the unspaced em dash. In most typesetting and most word processing, the spacing between words is expected to be variable, so there can be full justification. Alone among punctuation that marks pauses or logical relations in text, the unspaced em dash disables this for the words between which it falls. The effect can be uneven spacing in the text.

En dashes are often preferred to em dashes when text is set in narrow columns (as in newspapers and similar publications).[citation needed]

The spaced em dash risks introducing excessive separation of words: it is already long, and the spaces increase the separation. In full justification, the adjacent spaces may be stretched, and the separation of words is further exaggerated.

Horizontal bar

The horizontal bar or quotation dash (―) is used to introduce quoted text. This is the standard method of printing dialogue in some languages (see the quotation dash section of the Quotation mark article for further details of how it is used).

If the quotation dash is unavailable, then the em dash can be used instead. In Unicode, the quotation dash is U+2015 (decimal 8213). In HTML, it can be input only with the numeric form, &#x2015; or &#8213;; there is no equivalent character entity. But for web pages one generally uses the em dash. There is no support in the standard TeX fonts, but one can use \hbox{---}\kern-.5em--- instead (or just use an em dash).

Swung dash

The swung dash (⁓ or ~) resembles a lengthened tilde, and is used to separate alternatives or approximates. In dictionaries, it is frequently used to stand in for the defined term in example text. This character was added in Unicode 4.0. Note that there are several similar characters: ⁓ (U+2053: SWUNG DASH, used in Western typography), ∼ (U+223C: TILDE OPERATOR, used in mathematics), and 〜 (U+301C: WAVE DASH, used in East Asian typography).


  • henceforth (adverb), from this time forth; from now on; "⁓ she will be known as Mrs. Smith".

The swung dash in Unicode is U+2053 (decimal 8275). In HTML, it can be input only with the numeric form, &#x2053; or &#8275;; there is no equivalent HTML entity.

In LaTeX2ε, one can use the math mode command $\sim$, which yields the tilde operator, a similar character.

In East Asian typography a similar character, the wave dash, is used instead, for a variety of purposes:

  • in Japanese, to indicate an extension of a vowel in slang;
  • in Japanese and Korean, often in place of an en dash;
  • in Chinese, interchangeably with the em dash to express a range.

Other dash-like characters

There are several characters which resemble dashes but have different meanings and uses. These include (though are by no means restricted to):

  • The hyphen-minus (-), Unicode U+002D, is the standard ASCII hyphen. It looks like a dash, but should only be used as such when proper dashes are unavailable. Sometimes this is used in groups to indicate different types of dash.
  • The tilde (~), U+007E, is a diacritic mark.
  • The underscore (_), U+005F, is either a diacritic mark, or a character replacing a standard space.
  • The macron (¯), U+00AF, is another diacritic mark.
  • The soft hyphen (U+00AD) is used to indicate where a line may break, as in a compound word or between syllables.
  • The hyphen (‐), U+2010, is a character which, unlike the ASCII hyphen, always represents a hyphen.[citation needed]
  • The hyphen bullet (⁃), U+2043, is a short horizontal line used as a list bullet.
  • The minus sign (−), U+2212, &minus;, is an arithmetic operator used in mathematics to represent subtraction or negative numbers.
  • The wave dash (〜), U+301C, and the wavy dash (〰), U+3030, are wavy lines found in some East Asian character sets. Typographically, they have the width of one CJK character cell (fullwidth form), and follow the direction of the text (horizontal for horizontal text, vertical for columnar). They are used as dashes, and occasionally as emphatic variants of the katakana vowel extender mark.
  • The Armenian hyphen (֊), U+058A, is a hyphen from the Armenian alphabet.
  • The Hebrew maqaf (־), U+05BE, is a hyphen-like character from the Hebrew alphabet.
  • The Mongolian todo hyphen (᠆), U+1806, is a hyphen from the Mongolian alphabet.
  • The Hangul Jungseong Eu (ㅡ, U+3161, or ᅳ, U+1173) is used in Korean to indicate the sound [ɨ].
  • The Japanese chōonpu (ー), U+30FC, is used in Japanese to indicate a long vowel.
  • The yī/ichi (一), U+4E00, is a Chinese character which means "one" in both Chinese and Japanese.
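
One way to tell true dashes from look-alikes in software is to consult the Unicode general category, in which dash punctuation is labelled "Pd". The sketch below uses Python's standard unicodedata module; the helper name is hypothetical, and the set of characters in category Pd is defined by Unicode and may not coincide exactly with the characters this article calls dashes.

```python
import unicodedata

def is_dash_punctuation(ch):
    """True if Unicode classifies the character as dash punctuation (Pd)."""
    return unicodedata.category(ch) == "Pd"

samples = {
    "hyphen-minus": "-",
    "en dash": "\u2013",
    "em dash": "\u2014",
    "minus sign": "\u2212",
    "yi/ichi": "\u4e00",
}

for label, ch in samples.items():
    print(label, unicodedata.category(ch), is_dash_punctuation(ch))
```

Look-alikes such as the minus sign (a math symbol) and the Chinese character for "one" (a letter) fall in other categories, which is what makes the check useful.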

Rendering dashes on computers

Typewriters and computers have traditionally had only a limited character set, often having no key with which to produce a dash. In consequence, it became common to substitute the nearest incorrect punctuation mark or symbol. Em dashes are often represented by a pair of spaces surrounding a single hyphen-minus (typical British usage) or by a pair of spaces surrounding two hyphen-minuses (mostly in the United States).

Modern computer software typically has support for many more characters, and is usually capable of rendering both the en and em dashes correctly—albeit sometimes with a little inconvenience for the user who has to input them. Some software, though, may operate in a more limited mode. Some text editors, for example, are restricted to working with a single 8-bit character encoding, and when unencodable characters are entered (e.g., by pasting from the clipboard), they are often blindly converted to question marks. Sometimes this happens to em and en dashes, even when the 8-bit encoding supports them, or when an alternative representation using hyphen-minuses would seem to be an option.
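
The behaviour described above is easy to reproduce. The following Python sketch encodes the two dashes in an 8-bit encoding that includes them (Windows-1252, where they occupy bytes 150 and 151, matching the Alt+0150 and Alt+0151 codes mentioned in this article) and in one that does not (ISO 8859-1). The ascii_fallback helper is hypothetical and shows the hyphen-minus substitution as an alternative to question marks.

```python
EN_DASH = "\u2013"
EM_DASH = "\u2014"

# Windows-1252 has both dashes, at bytes 0x96 (150) and 0x97 (151).
print(EN_DASH.encode("cp1252"))
print(EM_DASH.encode("cp1252"))

# ISO 8859-1 has neither; with errors="replace" they become question marks.
print(EM_DASH.encode("latin-1", errors="replace"))

def ascii_fallback(text):
    """Substitute hyphen-minuses for dashes instead of losing them."""
    return text.replace(EM_DASH, "--").replace(EN_DASH, "-")

print(ascii_fallback("1977\u20131981"))
```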

Any kind of dash can manifest directly in an HTML document, but HTML also allows them to be entered as character entity references. The entity names for the em dash and the en dash are mdash and ndash; therefore, they can be referenced in HTML as &mdash; and &ndash;. The equivalent numeric character references are &#8212; and &#8211;. Nearly all web browsers and operating systems used today are capable of rendering the numeric form, and almost as many correctly display the named form.
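
Conversion between literal dashes and character references can be sketched with Python's standard library. The function names here are hypothetical; html.unescape and the "xmlcharrefreplace" error handler are standard Python features.

```python
import html

def to_ascii_html(text):
    """Replace every non-ASCII character with a decimal numeric reference."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")

def from_html(text):
    """Resolve named and numeric character references to literal characters."""
    return html.unescape(text)

print(to_ascii_html("1977\u20131981"))   # 1977&#8211;1981
print(from_html("a&mdash;b") == "a\u2014b")  # True
```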

  • In Unicode, the figure dash, en dash, em dash, quotation dash, and swung dash correspond to characters U+2012, U+2013, U+2014, U+2015, and U+2053, respectively.
  • In Linux, under recent versions of X11, there are various methods of producing these dashes. For em dashes, one may use the compose key followed by three presses of the hyphen key. For en dashes, one may press the compose key followed by two hyphens and a period. For all dashes, one may press and hold ctrl and shift and then press u (and release them all), after which an underlined u will appear; then type the hexadecimal Unicode number (e.g. 2015) for the appropriate dash and press enter or the space bar. Also, dashes may be emulated by remapping other keys.[1]
  • In Mac OS using the Australian, British, Canadian, German, Irish, Irish Extended, Russian, U.S., or U.S. Extended keyboard layout, an en dash can be obtained by typing option-hyphen, while an em dash can be typed with option-shift-hyphen.
  • In TeX, an em dash is typed as three hyphens ("---"), an en dash as two hyphens ("--"), and a hyphen-minus as one hyphen ("-"). Mathematical minus is signified as "$-$".
  • On Plan 9 systems, an en or em dash may be entered by pressing the Compose key (usually left Alt), followed by typing en or em respectively.
  • In Microsoft Windows, an em dash can be typed with ctrl + alt + numeric hyphen (on the numeric keypad, usually in the top-right corner), and an en dash can be typed with ctrl + numeric hyphen. This will not work with the hyphen key on the main keyboard (usually between "0" and "="), which has completely different functions. Note also that this does not work in Windows Notepad. Alternatively, an en or em dash may be typed into most text areas by holding down the Alt key and pressing 0150 or 0151 respectively. The numbers must be typed on the numeric keypad with num lock turned on.
    In addition, the Character Map utility included with Windows can be used to copy and paste en and em dash characters (as well as accented letters and other non-English language characters) into most applications. It is usually in the Programs → Accessories → System Tools folder (or the Accessories folder on Windows Vista).
  • With Microsoft Word's default settings (both Windows and Macintosh versions), an em dash symbol (not always a true em dash from the font) is automatically produced by Autocorrect when two unspaced hyphens are entered between words ("word--word"). An en dash (again, not always a true en dash from the font) is automatically produced when one or two hyphens surrounded by spaces are entered: ("word - word") or ("word -- word"). This feature can be disabled by customizing Autocorrect. Other dashes, spaces, and special characters are possible, found through Tools → Customize… → Keyboard… → Common Symbols. Unassigned symbols (such as the true minus sign) can be assigned keyboard shortcuts through Insert → Symbol… → (select desired symbol) → Shortcut key… . To determine if the true en or em dash from the font are being used rather than a cross-referenced character from the Symbol font, copy and paste samples of the dashes into a text editor such as Windows Notepad. Using the true dash is important if one ever needs to share documents with other users in other applications or operating systems.

In professionally printed documents, a typographer sometimes adds a hair space or, rarely, a full inter-word space on either side of an em dash. In HTML it is possible to generate a hair space (U+200A) using the numeric character reference &#8202;, but current-generation web browsers are not uniformly supportive of this character, and may render it incorrectly.

References

  1. ^ Characters in Unicode are referenced in prose via the "U+" notation. The hexadecimal number after the "U+" is the character's Unicode code point. The decimal equivalent is shown in parentheses.
  2. ^ Specifically, the predefined character entity reference that can be used in an HTML document in place of a literal dash.
  3. ^ Specifically, the numeric character reference that can be used in an HTML or XML document in place of a literal dash.
  4. ^ a b Griffith, Benjamin W., et al. (2004). Pocket Guide to Correct Grammar. Barron's Pocket Guides. Woodbury, N.Y: Barron's Educational Series. ISBN 0-7641-2690-3. 
  5. ^ a b c d Judd, Karen (2001). Copyediting: A Practical Guide. Menlo Park, Calif: Crisp Publications. ISBN 1-56052-608-4. 
  6. ^ a b c Loberger, Gordon; Kate Shoup Welsh (2001). Webster's new world English grammar handbook. New York: Hungry Minds. ISBN 0-7645-6488-9. 
  7. ^ a b c Ives, George B. (1921). Text, Type and Style: A Compendium of Atlantic Usage. Boston: Atlantic Monthly Press. 
  8. ^ a b The Chicago Manual of Style (15th ed.). Chicago: University of Chicago Press. 2003. pp. 261–265. ISBN 0-226-10403-6.
  9. ^ "A glossary of typographic terms". Adobe. http://www.adobe.com/uk/type/topics/glossary.html#ememspaceemquad. Retrieved on 2007-10-18. 
