ASCII Character Set: Encoding
The ASCII character set is the most common format for English-language text files, and is
often assumed to be the default file format. For accented and other non-ASCII characters,
it is necessary to choose a character encoding. In many systems, this is chosen
on the basis of the default locale setting of the computer on which the file is read. Common character
encodings include ISO 8859-1 for many European languages.
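The effect of choosing an encoding can be sketched in Python. This is a minimal illustration rather than a description of how any particular system selects an encoding: the sample word and the helper name `decode_with` are assumptions made for the example. It shows the same text producing different bytes under ISO 8859-1 and UTF-8, and what happens when bytes are read with the wrong encoding.

```python
import locale

# Sample text containing one non-ASCII character (an assumption for illustration).
text = "café"

# The same text maps to different byte sequences under different encodings.
latin1_bytes = text.encode("iso-8859-1")  # "é" is one byte: 0xE9
utf8_bytes = text.encode("utf-8")         # "é" is two bytes: 0xC3 0xA9


def decode_with(data: bytes, encoding: str) -> str:
    """Decode bytes with the given encoding (hypothetical helper).

    Undecodable bytes are replaced with U+FFFD instead of raising an error.
    """
    return data.decode(encoding, errors="replace")


# Reading UTF-8 bytes as ISO 8859-1 succeeds but yields the wrong characters.
misread = decode_with(utf8_bytes, "iso-8859-1")

# Reading ISO 8859-1 bytes as UTF-8 hits an invalid byte sequence.
damaged = decode_with(latin1_bytes, "utf-8")

# The locale-derived default that many programs fall back on.
default_encoding = locale.getpreferredencoding(False)
```

Passing the encoding explicitly when opening a file, for example `open(path, encoding="iso-8859-1")`, avoids depending on the locale default.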
Because many encodings have only a limited repertoire of characters, they are often only usable
to represent text in a limited subset of human languages. Unicode is an attempt to create a
common standard for representing all known languages, and most known character sets are
subsets of the very large Unicode character set. Although there are multiple character encodings
available for Unicode, the most common is UTF-8, which has the advantage of being
backwards-compatible with ASCII; that is, every ASCII text file is also a UTF-8 text file with identical
meaning.
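The backwards compatibility of UTF-8 with ASCII can be checked directly. The following is a small sketch; the sample string and the helper name `is_ascii` are assumptions introduced for the example.

```python
# Sample ASCII-only text (an assumption for illustration).
ascii_text = "Plain ASCII text\n"

# Encoding ASCII-only text as ASCII or as UTF-8 gives identical bytes.
ascii_bytes = ascii_text.encode("ascii")
utf8_bytes = ascii_text.encode("utf-8")
same_bytes = ascii_bytes == utf8_bytes

# Bytes written as ASCII decode as UTF-8 to the same text.
round_trip = ascii_bytes.decode("utf-8")


def is_ascii(data: bytes) -> bool:
    """Return True if every byte is in the 7-bit ASCII range (hypothetical helper)."""
    return all(byte < 0x80 for byte in data)


# The reverse does not hold: UTF-8 text with non-ASCII characters is not valid ASCII.
non_ascii_bytes = "é".encode("utf-8")
```

The compatibility runs in one direction only: `is_ascii(non_ascii_bytes)` is false, so a UTF-8 file is also an ASCII file only when it contains no characters outside the ASCII range.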
Formats
On most operating systems, the name text file refers to a file format that allows only plain text
content with very little formatting (e.g., no bold or italic types). Such files can be viewed and
edited on text terminals or in simple text editors. Text f