Umlaut (diacritic)

From Wikipedia, the free encyclopedia

Ä ä
Ǟ ǟ
Ë ë
Ï ï
Ö ö
Ȫ ȫ
Ü ü
Ǖ ǖ
Ǘ ǘ
Ǚ ǚ
Ǜ ǜ
Ÿ ÿ

The word umlaut is the name of a type of sound shift in spoken language (phonological umlaut) and of the diacritic mark used to represent it orthographically. The diacritic mark comprises a pair of dots or lines (¨) placed over the letter that represents the affected vowel sound. When the letter is an i, the diacritic replaces the tittle. In German, the three umlauts are ä, ö, and ü. The same name is used in other languages that share these symbols with German. The phonological phenomenon of umlaut also occurs in English (man ~ men; full ~ fill; goose ~ geese), in forms cognate with and parallel to the German ones, but English orthography does not write the sound shift using the umlaut diacritic. Instead, a different vowel letter is used.

A very similar diacritic is the diaeresis (or trema), and a distinction between umlaut and diaeresis characters is not always made. The diaeresis or trema is the diacritic mark ( ¨ ) used to indicate a phonological diaeresis, or, more generally, that a vowel should be pronounced apart from the letter which precedes it. For example, in the spelling coöperate, it reminds the reader that the word has four syllables [koʊˈɔpəreɪt], not three [ˈkuːpəreɪt]. The preceding letter is usually another vowel, but in Spanish the mark is used on the letter u when it is preceded by g and followed by e or i, to indicate that the u should be pronounced. In English, the trema is rare and not mandatory, but other languages such as Dutch, Spanish and French make regular use of it. By extension, the words trema and diaeresis also designate the same diacritic when used to denote other kinds of sound changes, such as marking the schwa ë in Albanian.

In modern computer systems (using Unicode), umlaut and diaeresis are represented identically.

Diaeresis or trema

History

Historically, the diaeresis mark or trema is far older than the umlaut mark.

The word trema is taken from the Byzantine Greek τρημα, meaning "perforation, orifice". This sign was first used in that language[citation needed] to indicate a phonological diaeresis, that is when two consecutive vowels are pronounced separately as a hiatus, rather than together in a diphthong. It is currently used with this purpose in several languages of western and southern Europe, among them Occitan, Modern Greek, Catalan, Dutch, and Welsh.

For example, according to the spelling rules of Catalan, the digraphs ei and iu are normally read as diphthongs, [ei̯] and [iu̯]. To indicate exceptions to this rule, a diaeresis mark is placed on the second vowel: without the trema the words veïna [bəˈinə] ("neighbour", feminine) and diürn [diˈurn] ("diurnal") would be read [ˈbei̯nə] and [ˈdiu̯rn], respectively.

Occitan use of diaeresis is very similar to Catalan: ai, ei, oi, au, eu, ou are diphthongs consisting of one syllable but aï, eï, oï, aü, eü, oü are groups consisting of two distinct syllables.

In French, some pairs of vowels that were originally true diphthongs later coalesced into monophthongs, which led to an extension of the value of this diacritic. It often now indicates that the second vowel is to be pronounced separately from the first, rather than merge with it into a single sound. For example, the French words païen [pajɛ̃], Anaïs [anais], and naïve [naiv] would be pronounced [pɛɛ̃], [anɛs], and [nɛv], respectively, without the diaeresis mark, since the digraph ai is pronounced [ɛ].

Another example is the Dutch spelling coëfficiënt, necessary because the digraphs oe and ie normally represent the simple vowels [u] and [i], respectively.

Ÿ is sometimes used in transcribed Greek, where it represents the Greek letter υ (upsilon) in the non-diphthong αυ (alpha upsilon). Examples are the transcription Artaÿctes of the Persian name Ἀρταΰκτης at the very end of Herodotus, and the name of Mount Taygetus on the southern Peloponnesus peninsula, spelled Ταΰγετος in modern Greek. It also occurs in French as a variant of ï, in rare proper nouns (for instance, the name of the Parisian suburb of L'Haÿ-les-Roses).

In some French words, a diaeresis is used to show what were historically two vowels in hiatus, although the first vowel has since fallen silent. So in "Saint-Saëns", the diaeresis shows that the combination ae is to be read like an e; since the a is silent, the name is pronounced as if written "Saint-Sens".

As a further extension, other languages began to use the trema wherever a vowel should be pronounced separately from the preceding letter (possibly a consonant), with which it would normally form a digraph according to the orthographic rules of that language. In the orthographies of Spanish, Catalan, Brazilian Portuguese, French, Galician and Occitan, the graphemes gu and qu normally represent a single sound, [g] or [k], before the front vowels e and i (before nearly all vowels in Occitan), for historical reasons. In the few exceptions where the u is pronounced before i or e, a trema is added to it. In French, the diaeresis in such cases is usually written over the following vowel.

Examples:

  • Spanish - vergüenza (shame), pingüino (penguin)
  • Catalan - aigües (waters), qüestió (matter)
  • Brazilian Portuguese - cinqüenta (fifty), qüinqüênio (quinquennial)
  • French - Noël (Christmas), aiguë (acute (fem.))
  • Occitan - lingüista (linguist), aqüatic (aquatic)
  • English - naïve
  • Welsh - trolïau (trolleys)
  • In Greek it can be used on its own (ακαδημαϊκός, "academic"), or in combination with an acute accent (πρωτεΐνη, "protein").

In English

The diaeresis mark has also been occasionally applied to English words of Latin origin (e.g., coöperate, reënact), as well as native English words (e.g., noöne), but this usage had become extremely rare by the 1940s. The New Yorker and MIT's Technology Review can be noted as some of the few publications that still spell coöperate with a diaeresis[citations needed]. Its use in English today, apart from words borrowed from other languages, is mostly limited to certain names, such as the surname Brontë and the given names Chloë and Zoë. It is relatively common in words that do not have an obvious divider at the diaeresis point (the diaeresis cannot be replaced by a preceding hyphen), such as naïve.

Other diacritical uses

  • In Dutch, a handwritten ij can resemble a ÿ (though the latter does not occur in Dutch).
  • Jacaltec, a Mayan language, and Malagasy are the only languages to allow a pair of dots over the letter "n", which is represented in Unicode as "n̈" (the letter n followed by the combining diaeresis U+0308).
  • In J. R. R. Tolkien's romanisation of his fictional words (in languages such as Quenya and Sindarin), double dots over trailing e characters (as in Manwë) indicate that the e is pronounced rather than silent, as it would normally be in English. (See Silent E for further information.) On other e it is used to indicate that it does not form a diphthong, e.g. Fëanor. In names beginning with ë the diacritic is moved to the second vowel, like Eärendil.
  • The usage of double dots over vowels, particularly ü, also occurs in the transcription of languages that do not use the Roman alphabet, such as Chinese. For example, 女 (female) is transcribed as nǚ in proper Mandarin Chinese pinyin, while nv is sometimes used as a replacement for convenience since the letter v is not used in pinyin.

Umlaut

History

Historically, the umlaut mark is far younger than the diaeresis mark, and has unrelated origins, though it has been speculated that an awareness of diaeresis might have influenced the final written form of the umlaut.

Development of the umlaut in Sütterlin: schoen becomes schön via schoͤn (“beautiful”)

Originally, phonological umlaut was denoted in written German by adding an e to the affected vowel, either after the vowel or, in small form, above it. (In medieval German manuscripts, other digraphs could also be written using superscripts: in bluome (“flower”), for example, the <o> was frequently placed above the <u>. Compare also the development of the tilde as a superscript ‘n’.) In blackletter handwriting as used in German manuscripts of the later Middle Ages, and also in many printed texts of the early modern period, the superscript <e> still had a form which would be recognisable to us as an <e>. However, in the forms of handwriting which emerged in the early modern period (of which Sütterlin is the latest and best known example), the letter <e> had two strong vertical lines, and the superscript <e> looked like two tiny strokes. Gradually these strokes were reduced to dots, and as early as the 16th century we find this handwritten convention being transferred sporadically to printed texts too.

In modern handwriting, the umlaut sometimes looks like a tilde, quotation mark, dash, or other small mark.

Printing conventions in German

When typing German, if umlaut letters are not available, the proper way is to replace them with the underlying vowel and a following <e>. So, for example, "Schröder" becomes "Schroeder". As the pronunciation differs greatly between the normal letter and the umlaut, simply omitting the dots is considered incorrect. The result might often be a different word, as in schon 'already', schön 'beautiful' or Mutter 'mother', Mütter 'mothers'.

Despite this, the umlauted letters are not considered part of the alphabet proper. When alphabetically sorting German words, the umlaut is usually treated like the underlying vowel; if two words differ only by an umlaut, the umlauted one comes second, for example:

  1. Schon
  2. Schön
  3. Schonen

There is a second system in limited use, mostly for sorting names (colloquially called "telephone directory sorting"), which treats ü like ue, and so on.

  1. Schön
  2. Schon
  3. Schonen

Austrian telephone directories insert ö after oz.

  1. Schon
  2. Schonen
  3. Schön
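
The three orderings listed above can be reproduced with simple sort keys. The following is a minimal sketch, not an implementation of any official collation standard: the helper names are hypothetical, and applying the Austrian rule to ä and ü as well as ö is an assumption that goes beyond the example given.

```python
# Sketch of the three German sorting schemes described above.
# All names are illustrative; none of them come from a standard library.

BASE = str.maketrans({"ä": "a", "ö": "o", "ü": "u"})
DIGRAPH = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue"})

# Assumption: the Austrian rule stated for ö is applied to ä and ü too.
LAST = chr(0x10FFFF)  # compares greater than any other character
AFTER_Z = str.maketrans({"ä": "az" + LAST, "ö": "oz" + LAST, "ü": "uz" + LAST})


def dictionary_key(word):
    """Treat the umlaut as its base vowel; on a tie the umlauted word comes second."""
    lower = word.lower()
    return (lower.translate(BASE), lower)


def phone_book_key(word):
    """Treat ä, ö, ü like ae, oe, ue."""
    return word.lower().translate(DIGRAPH)


def austrian_key(word):
    """Place each umlauted vowel after the base vowel followed by z."""
    return word.lower().translate(AFTER_Z)


words = ["Schonen", "Schön", "Schon"]
dictionary_order = sorted(words, key=dictionary_key)
phone_book_order = sorted(words, key=phone_book_key)
austrian_order = sorted(words, key=austrian_key)
```

Each key only rewrites the string used for comparison, so the words themselves are returned unchanged.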

In Switzerland, capital umlauts are sometimes printed as digraphs, in other words, <Ae>, <Oe>, <Ue>, instead of <Ä>, <Ö>, <Ü> (see German alphabet for an elaboration). This is because the Swiss keyboard contains the French accents on the same keys as the umlauts (selected by Shift). To write a capital umlaut, the ¨ key is pressed, followed by the capital letter to which the umlaut should apply.

Borrowing of German umlaut notation

Some languages have borrowed some of the forms of the German letters Ä, Ö, or Ü, including Estonian, Finnish, Hungarian, Karelian, some of the Sami languages, Slovak, Swedish and Turkish. With the exception of Swedish, use of the diacritic in these languages does not relate to instances of the historical phenomenon of Germanic umlaut, but it often indicates sounds similar to those for which it is used in German.

The Estonian alphabet has borrowed <ä>, <ö> and <ü> from German, Swedish and Finnish have <ä> and <ö>, and Slovak has <ä>. In Estonian, Swedish, Finnish and Sami <ä> and <ö> denote [æ] and [ø] respectively. Hungarian, on the other hand, has <ü> and <ö>. The Slovak language uses the letter <ä> to denote [ɛ] (or the somewhat archaic but still correct [æ]); the sign is called dve bodky ("two dots"), and the full name of the letter ä is a s dvomi bodkami ("a with two dots"). In all these languages, however, the replacement rule for situations where the umlaut character is not available is simply to use the underlying unaccented character instead (without a following e).

In Luxembourgish (Lëtzebuergesch), the umlaut diacritic in <ä> and <ë> represents a stressed schwa. Since the Luxembourgish language uses the mark to show stress, it cannot be used to modify the 'u' which therefore has to be 'ue'.

When Turkish switched from the Arabic to the Latin alphabet in 1928 it adopted a number of diacritics borrowed from various languages, including <ü>, which was taken from German (Turkey had a close relationship with Germany) and <ö> from Swedish, which in turn had borrowed this symbol from German. These Turkish graphemes represent similar sounds to their values in German (see Turkish alphabet).

As the borrowed diacritic has lost its relationship to Germanic i-mutation, the letters that carry it are in some languages considered independent graphemes, and cannot be replaced with <ae>, <oe>, or <ue> as in German. In Estonian and Finnish, for example, these latter letter sequences are diphthongs with independent meanings. Even some Germanic languages such as Swedish (which does have a transformation analogous to the German umlaut, called omljud) treat them as independent letters. In collation, this means they have their own positions in the alphabet, for example at the end ("A–Ö", not "A–Z") as in Swedish and Finnish, which means that the dictionary order is different from German. It also means that the transformations ä → ae and ö → oe are inappropriate for these languages.

When typing in Norwegian, the letters Ø and Æ might be replaced with Ö and Ä respectively if the former are not available. If neither are available, it is appropriate to use oe and ae. While ae has a great resemblance to the letter æ and therefore does not impede legibility, the digraph oe is likely to reduce the legibility of a Norwegian text. This especially applies to the digraph øy which would be rendered in the more cryptic form oey.

Early Volapük used Fraktur a, o and u as letters distinct from their Antiqua counterparts. Later, the Fraktur forms were replaced with umlauted vowels.

Use of the umlaut for special effect

The umlaut diacritic can be used in "sensational spellings" or foreign branding, for example in advertising, or for other special effects. Häagen-Dazs is an example of such usage.

As the German short /a/ is more open than the equivalent sound in English (/æ/), Germans sometimes use the diacritic <ä> to imitate the English sound in writing, giving an English "feel" to words used in advertising; in a McDonald's restaurant in Germany one can buy a "Big Mäc".

Since the letter ü is very common in Turkish, its inappropriate use can make a text in another language look "turkified", a purely visual mimicry. Because of the large number of Turks living in Germany, this again is a phenomenon familiar in German. The Turkish-German satirist Osman Engin, for example, wrote a book entitled Dütschlünd, Dütschlünd übür üllüs - the opening line of the first stanza from Das Lied der Deutschen, but turkified.

In the heavy metal scene, the umlaut diacritic can frequently be observed as a mere decoration (with no significance for the pronunciation) on the names of bands such as Blue Öyster Cult, Motörhead, Mötley Crüe, Queensrÿche, or Leftöver Crack. The group Spın̈al Tap places an umlaut over the <n>. A self-referential example is the Finnish group Ümlaut.

In mathematics and physics

The derivative with respect to time is often represented as a dot above a variable. Two dots represent the second derivative.

{\dot{a}} = {\mathrm{d}a \over \mathrm{d}t}
{\ddot{a}} = {\mathrm{d} ^2 a \over \mathrm{d} t^2}

This may be contrasted with the more general notation for a derivative using a prime:

f'(x) = {\mathrm{d} \over \mathrm{d}x} f(x)
f''(x) = {\mathrm{d}^2 \over \mathrm{d} x^2} f(x)

Computer usage

Most character encodings treat the umlaut and the diaeresis as the same diacritic mark.

Keyboard input

Umlauts on a German computer keyboard. The ligature eszett (ß) can also be seen.

Using Microsoft Word, the double dot is produced by pressing Ctrl+Shift+:, then the letter.

On a computer running Mac OS, double dots can be entered by pressing Option+U, followed by the vowel that is to carry the double dot.

X-based systems with the Compose key can usually enter characters with double dots by typing Compose, " followed by the letter.

Microsoft Windows allows users to set their US keyboard layout to International, which offers something similar by turning certain character keys into dead keys. If the user types ", nothing appears on screen until another character is typed; the two characters are then merged if possible, or both inserted separately if not.

On several operating systems, double-dotted characters can be entered even when the current keyboard layout has no umlauts or tremas, by using Alt codes. On Microsoft Windows, the character is entered by holding the left Alt key and typing, on the numeric keypad, the full decimal value of the character's position in the Windows code page, including the leading 0; this works provided that a compatible code page is used as the system code page. Numbers from code page 850 can also be used; these lack the leading 0.

Character   Windows code page   Code page 850
ä           Alt+0228            Alt+132
ë           Alt+0235            Alt+137
ï           Alt+0239            Alt+139
ö           Alt+0246            Alt+148
ü           Alt+0252            Alt+129
ÿ           Alt+0255            Alt+152
Ä           Alt+0196            Alt+142
Ë           Alt+0203            Alt+211
Ï           Alt+0207            Alt+216
Ö           Alt+0214            Alt+153
Ü           Alt+0220            Alt+154
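
The codes in this table are the decimal byte values of each character in the two code pages, so they can be derived rather than memorised. The sketch below assumes that the "Windows code page" is Windows-1252 (Western European), which the text above does not state explicitly; `alt_codes` is a hypothetical helper name.

```python
def alt_codes(char, windows_codec="cp1252", dos_codec="cp850"):
    """Return the (Windows code page, code page 850) Alt codes for one character.

    Assumes the system code page is Windows-1252; pass another codec name
    for a different locale. Raises UnicodeEncodeError if the character does
    not exist in one of the code pages.
    """
    windows_value = char.encode(windows_codec)[0]
    dos_value = char.encode(dos_codec)[0]
    # Windows code page values are typed with a leading 0, CP850 values without.
    return "Alt+0{}".format(windows_value), "Alt+{}".format(dos_value)


codes = {char: alt_codes(char) for char in "äöüÄÖÜ"}
```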

Character encodings

The ISO 8859-1 character encoding includes the letters ä, ë, ï, ö, ü, and their respective capital forms, as well as ÿ in lower case only, with Ÿ added in the revised edition ISO 8859-15.

Unicode provides the double dot as a combining character U+0308. Mainly for compatibility with older character encodings, dozens of codepoints with letters with double dots are available.

Both the combining character U+0308 and the precombined codepoints can be used as umlaut and as diaeresis.

Sometimes, there's a need to distinguish between the umlaut sign and the diaeresis sign. In these cases, the following recommendation by ISO/IEC JTC 1/SC 2/WG 2 should be followed:

  • To represent the umlaut use Combining Diaeresis (U+0308)
  • To represent the diaeresis use Combining Grapheme Joiner (CGJ, U+034F) + Combining Diaeresis (U+0308)
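
The relationship between the precombined code points, the combining character, and the CGJ sequence can be observed with Unicode normalization. This is a small sketch using Python's standard `unicodedata` module; the variable names are illustrative only.

```python
import unicodedata

precomposed = "\u00e4"      # ä as a single precombined code point
combining = "a\u0308"       # a + COMBINING DIAERESIS
with_cgj = "a\u034f\u0308"  # a + COMBINING GRAPHEME JOINER + COMBINING DIAERESIS

# NFC composes the two-code-point form into the precombined letter,
# and NFD splits the precombined letter back apart.
nfc_combining = unicodedata.normalize("NFC", combining)
nfd_precomposed = unicodedata.normalize("NFD", precomposed)

# The grapheme joiner sits between the letter and the mark, so NFC
# cannot compose them and the sequence stays distinguishable.
nfc_with_cgj = unicodedata.normalize("NFC", with_cgj)
```

Because the CGJ sequence survives normalization, text that marks a diaeresis this way remains distinct from text that uses the plain combining character or the precombined letter.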

HTML

In HTML, vowels with double dots can be entered with an entity reference of the form &?uml;, where ? can be any of a, e, i, o, u, y or their majuscule counterparts. With the exception of the uppercase Ÿ, these characters are also available in all of the ISO 8859 character sets and thus have the same codepoints in ISO-8859-1 (-2, -3, -4, -9, -10, -13, -14, -15, -16) and Unicode. The uppercase Ÿ is available in ISO 8859-15 and Unicode, and Unicode provides a number of other letters with double dots as well.

Umlauts
Character   Replacement   HTML     Unicode
ä           a or ae       &auml;   U+00E4
ö           o or oe       &ouml;   U+00F6
ü           u or ue       &uuml;   U+00FC
Ä           A or Ae       &Auml;   U+00C4
Ö           O or Oe       &Ouml;   U+00D6
Ü           U or Ue       &Uuml;   U+00DC

Other double dots
Character   HTML     Unicode
ë           &euml;   U+00EB
ï           &iuml;   U+00EF
ÿ           &yuml;   U+00FF
Ë           &Euml;   U+00CB
Ï           &Iuml;   U+00CF
Ÿ           &Yuml;   U+0178
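
The entity references in these tables can be converted in both directions with Python's standard `html` module. The following is a minimal sketch; `to_entity` is a hypothetical helper, not part of the library.

```python
import html
from html.entities import codepoint2name

# Decoding: entity references become the corresponding characters.
decoded = html.unescape("Sch&ouml;n, na&iuml;ve, &Yuml;")


def to_entity(char):
    """Return the named entity for a character, or the character itself if none exists."""
    name = codepoint2name.get(ord(char))
    return "&{};".format(name) if name else char


# Encoding: only characters that have a named entity are replaced.
encoded = "".join(to_entity(char) for char in "Zoë")
```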


Note: when replacing umlaut characters with plain ASCII, use ae, oe, etc. for German language, and the simple character replacements for all other languages.
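
The replacement rule in this note could be sketched as follows. The function name and the `german` flag are assumptions made for illustration, and the sketch handles only the double dot, not other diacritics or the eszett (ß).

```python
import unicodedata

GERMAN = str.maketrans(
    {"ä": "ae", "ö": "oe", "ü": "ue", "Ä": "Ae", "Ö": "Oe", "Ü": "Ue"}
)


def replace_double_dots(text, german=False):
    """Replace double-dotted letters with plain ones.

    For German text the vowel is followed by an e; for all other
    languages the dots are simply dropped.
    """
    if german:
        text = text.translate(GERMAN)
    # Decompose so that the double dot becomes a separate combining character,
    # drop it, then recompose whatever other accents remain.
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(char for char in decomposed if char != "\u0308")
    return unicodedata.normalize("NFC", stripped)
```

For example, `replace_double_dots("Schröder", german=True)` gives "Schroeder", while `replace_double_dots("Brontë")` gives "Bronte".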

TeX

TeX also allows double dots to be placed over letters in math mode, using "\ddot{}", or outside of math mode, with the \" control sequence:

 \mathrm{\ddot{a}\ddot{b}\ddot{c}\ddot{d}\ddot{e}\ddot{A}\ddot{B}\ddot{C}\ddot{D}\ddot{E}}

However, this gives trema-style dots that sit too far above the letter's body for good typographical umlauts. TeX's "German" package should be used if possible: it adds the " control sequence (without backslash), which gives umlauts.
