Plaintext
Plaintext is a term with several meanings.
Plain text can refer to computer files in ASCII or other human-readable form. This usually excludes files stored with formatting, such as Microsoft Word files, but includes any file that can be opened, read, and edited with a text editor such as Notepad (on Microsoft Windows), pico, nano, vi, emacs (on UNIX), SimpleText (on Mac OS), or TextEdit (on Mac OS X). Most programming languages require source files to be stored in plain text, as do HTML and XML.
Cryptography
Plaintext is the input to a crypto system, or more simply, the message that will be encrypted.
In any crypto system, plaintext must be handled properly lest an attacker gain considerable advantage.
- First and most obviously, it should be guarded carefully. If the information was (or will be) important enough to entrust to a crypto system for protection, it is probably important enough not to lose track of in other ways.
- If printed out, it must be stored securely. Most file cabinets, locked office desk drawers, and many safes are easily opened. Offices are not always secured sensibly after hours or, in too many cases, even during working hours. Because dumpster diving is widely possible, and even shredded sheets can be reconstructed by those sufficiently committed to their recovery, discarded printed plaintexts must be thoroughly crosscut shredded, burned, or otherwise made unrecoverable.
- If kept in a computer file, the computer and its components must be secure. A removable disk (or a disk drive that can be extracted with nothing more than a screwdriver) is an obvious point of exposure. Laptop computers are a special problem. The US State Department, the British Secret Service, and the US Department of Defense have all had laptops containing secret information, presumably in readable text form, vanish in recent years. Discarded computers (and disks and disk drives) are also a source of plaintexts. Unerased files (including any plaintexts which may have been present) will still be readable. Erased files may be as well: most operating systems do not actually erase anything, but simply mark the disk space formerly occupied by the 'erased' file as 'available for use'. Even overwriting the portion of a disk occupied by a file before erasing it is insufficient in many cases. Peter Gutmann of the University of Auckland wrote a celebrated paper in 1996 about recovering overwritten information from magnetic disks. Some government agencies require that all disk drives be physically crushed when they are no longer needed.
- Second, possession of any plaintext whatsoever, whether it is itself meaningful (and perhaps sensitive) or merely some administrivia in a heading, makes several cryptanalytic attacks either possible or easier. This implies that it is best to process the information being sent in some way unhelpful to the attacker before it becomes actual plaintext. For instance, it is common in well-designed crypto systems to run all messages through a data compression algorithm and to submit the result (the actual plaintext for crypto processing) to the crypto system. This provides at least some masking for stereotyped headings and introductions in the original message. If the compressed plaintext is not retained (but see the difficulty in erasing files above), then the 'plaintext' actually processed won't be available at all. Russian copulation has also been used to obscure headings and introductions. In modern contexts, however, message material often cannot be readily 'decopulated' on simple inspection, so the technique has become less useful in practice.
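The two preprocessing steps just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the random choice of cut point, and the use of zlib as the compression algorithm are all hypothetical and not taken from any particular crypto system, and the encryption step itself is deliberately left out.

```python
import secrets
import zlib
from typing import Optional


def russian_copulate(message: bytes, cut: Optional[int] = None) -> bytes:
    """Cut the message at one point and swap the two parts.

    A stereotyped heading then no longer sits at a predictable
    position at the start of the text.
    """
    if len(message) < 2:
        return message
    if cut is None:
        # Assumed policy: pick a cut point strictly inside the message.
        cut = 1 + secrets.randbelow(len(message) - 1)
    return message[cut:] + message[:cut]


def prepare_plaintext(message: bytes, cut: Optional[int] = None) -> bytes:
    """Swap the parts, then compress.

    The return value is what would be handed to the crypto system
    as the actual plaintext.
    """
    return zlib.compress(russian_copulate(message, cut))


def recover_message(prepared: bytes) -> bytes:
    """Undo the compression only.

    The swapped text is returned as-is; in this sketch the recipient
    is expected to spot the join and reorder the parts by inspection.
    """
    return zlib.decompress(prepared)
```

For example, `russian_copulate(b"HEADERbody", 6)` moves the heading to the end of the message. A human reader can usually find the join in ordinary prose, which is the difficulty noted above for material that cannot be reordered by inspection.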