Bush hid the facts
From Wikipedia, the free encyclopedia
Bush hid the facts (sometimes also this app can break) is the common name for a bug present in the function IsTextUnicode of Windows NT 3.5 and its successors, which causes a file of text encoded in Windows-1252 or similar encoding to be interpreted by applications that uses it (such as Notepad) as if it was UTF-16, resulting in mojibake.
While "Bush hid the facts" is the sentence that is most commonly presented on the Internet, it does not exclusively occur with that phrase. The bug can be triggered by many sentences with alphabetic characters and spaces in a particular order (4-space-3-space-3-space-5), as well as other combinations that can be parsed into valid (if nonsensical) Chinese characters in Unicode.
The bug occurs when the ANSI string is passed to the Win32 charset detection function IsTextUnicode
with no other characters. Because of this bug, IsTextUnicode will return TRUE, which means that applications that uses it will incorrectly interpret it as UTF-16. For example, if you load a text file with the string into a text editor that uses IsTextUnicode, the text will be displayed as nine Chinese characters, or squares if the language pack has not been installed. With Notepad, to retrieve the original text, bring up the "Open a file" dialog box, select the file, select "ANSI" in the "Encoding" list box, and click Open.
[edit] Discovery
The bug appeared for the first time in Windows NT 3.5 but was not discovered until early 2004[1] and has since risen in popularity on the Internet.[citation needed]
Clearing the content by selecting, cutting and then repasting the text does not prevent reproduction as long as it is carefully done.
Because of this bug in IsTextUnicode, Notepad misinterprets the encoding of the file when it is re-opened. If the file is originally saved as "Unicode" rather than "ANSI" the text displays correctly.
Older versions of Notepad such as those that came with Windows 95, 98, ME, or NT 3.1 do not include Unicode support so the error does not occur.
Notepad2 (by Florian Balmer) also exhibits this behaviour because it also uses IsTextUnicode.
[edit] External links
[edit] References
- ^ Cumps, David (February 27, 2004). "Notepad bug? Encoding issue?". #region .Net Blog. http://weblogs.asp.net/cumpsd/archive/2004/02/27/81098.aspx. Retrieved on February 15, 2009.