Most elements take the language-related attribute dir to specify text direction,
for example "rtl" for right-to-left text in languages such as Arabic, Persian or
Hebrew.[79]
The ability to "escape" characters in this way allows for the characters < and &
(when written as &lt; and &amp;, respectively) to be interpreted as character data,
rather than markup. For example, a literal < normally indicates the start of a tag,
and & normally indicates the start of a character entity reference or numeric
character reference; writing it as &amp; or &#x26; or &#38; allows & to be included
in the content of an element or in the value of an attribute. The double-quote
character ("), when not used to quote an attribute value, must also be escaped as
&quot; or &#x22; or &#34; when it appears within the attribute value itself.
Similarly, the single-quote character ('), when not used to quote an attribute
value, must also be escaped as &#x27; or &#39; (or as &apos; in HTML5 or XHTML
documents[80][81]) when it appears within the attribute value itself. If document
authors overlook the need to escape such characters, some browsers can be very
forgiving and try to use context to guess their intent. The result is still invalid
markup, which makes the document less accessible to other browsers and to other
user agents that may try to parse the document, for example for search and indexing
purposes.
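As an illustration of the escaping rules described above, the following minimal sketch uses the html module from Python's standard library. The sample string and the variable names (raw, escaped, attribute) are made up for this example, and html.escape is only one of several tools that perform this substitution.

```python
import html

# Sample text containing &, <, " and ' (hypothetical content for illustration).
raw = 'Tom & Jerry say "1 < 2" and it\'s true'

# html.escape replaces &, < and >; with quote=True it also replaces " and '.
escaped = html.escape(raw, quote=True)
print(escaped)
# Tom &amp; Jerry say &quot;1 &lt; 2&quot; and it&#x27;s true

# The escaped text can now sit inside a double-quoted attribute value
# without ending the attribute or starting a tag.
attribute = '<p title="{}">...</p>'.format(escaped)
print(attribute)

# html.unescape turns the references back into the original characters.
assert html.unescape(escaped) == raw
```

Note that this function emits &#x27; rather than &apos; for the single quote, which matches the forms that are valid in older HTML versions as well as in HTML5 and XHTML.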
Escaping also allows for characters that are not easily typed, or that are not
available in the document's character encoding, to be represented within the
element and attribute content. For example, the acute-accented e (é), a character
typically found only on Western European and South American keyboards, can be
written in any HTML document as the entity reference &eacute; or as the numeric
references &#xE9; or &#233;, using characters that are available on all keyboards
and are supported in all character encodings. Unicode character encodings such as
UTF-8 are compatible with all modern browsers and allow direct access to almost all
the characters of the world's writing systems.[82]
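The equivalence of the three ways of writing é can be checked with a short sketch, again assuming Python's standard html module. The names references, decoded and ascii_form are illustrative only, and the xmlcharrefreplace error handler is shown as one possible way to produce numeric references for characters that an encoding cannot represent.

```python
import html

# Named, hexadecimal and decimal references for the same character.
references = ["&eacute;", "&#xE9;", "&#233;"]

# All three decode to U+00E9 (LATIN SMALL LETTER E WITH ACUTE).
decoded = [html.unescape(ref) for ref in references]
print(decoded)

# Going the other way: encode text for an ASCII-only document by replacing
# every character outside ASCII with a decimal numeric character reference.
ascii_form = "caf\u00e9".encode("ascii", "xmlcharrefreplace").decode("ascii")
print(ascii_form)
# caf&#233;
```

In a document served as UTF-8, the character can instead be written directly, so references like these are mainly useful when the encoding or the keyboard does not offer the character.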