02_html_syntax_2024
Darja Solodovņikova
HTML: What is It?
▪ HTML is a text markup language
▪ HTML document =
text document + markup
▪ Example: A demonstration → <h1>A demonstration</h1>
Types of Markup Languages
▪ "Macro" style languages
▪ Can be compiled down to "draw this letter
here" commands
▪ Examples: LaTeX, PostScript
▪ WYSIWYG languages
▪ Markup exists, but it is hidden from the user
▪ Example: Microsoft RTF
▪ Structural markup languages
▪ Markup defines only structure
▪ Examples: HTML, XML
Tags in HTML
▪ Each "tag" has its meaning and place
▪ If you read between the "tags", you get
readable text
▪ The computer uses "tags" to alter the
display of text
▪ "Tags" mark the beginning and end of
HTML elements and may contain
attributes
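As a small illustration of the points above, the fragment below shows one element with a start tag, an attribute and an end tag. The URL and the wording are placeholders chosen for this sketch, not something from the slides.

```html
<!-- start tag with one attribute, then the content, then the end tag -->
<p>Read the <a href="https://example.com/">course notes</a> before the lab.</p>
```

Reading only the text between the tags gives "Read the course notes before the lab."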
HTML CONCEPTS
Some Ground Rules
▪ A browser should try its best to display a
document (even one with syntax errors)
▪ If the document contains unknown
elements, ignore the unknown tags and show their content as simple text
– Compatibility feature so older browsers can
display newer markup (and vice versa)
– Also, users make lots of mistakes
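A minimal sketch of the unknown-element rule. The element name `shiny` is made up for this example and is assumed not to be defined in any HTML version.

```html
<!-- <shiny> is a made-up element name that no browser knows -->
<p>This sentence contains <shiny>unknown markup</shiny> and is still readable.</p>
```

A browser is expected to show the whole sentence, including the words "unknown markup", without any special styling.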
Whitespace and Line Breaks
▪ HTML source was designed to be formatted for readability
▪ Line breaks are allowed, to wrap long lines
▪ Additional whitespace can be added
▪ Rules are simple:
▪ Multiple consecutive whitespace characters are treated as one space
▪ A line break is treated as whitespace
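The two rules can be seen in a short sketch. It assumes default browser styling (no CSS that changes whitespace handling), and the sentence is borrowed from an earlier slide.

```html
<!-- Both paragraphs are expected to render identically -->
<p>HTML is a text markup language</p>

<p>HTML    is a
   text      markup
   language</p>
```

In the second paragraph, every run of spaces and line breaks collapses to a single space when displayed.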
Two Approaches
WYSIWYG
Code editing
Two Approaches of Web Authoring
▪ The two approaches still remain:
▪ WYSIWYG
▪ use visual formatting tools, let the computer take care of
markup
▪ no (not enough) control over what is generated
▪ Source code editing
▪ full control over what is generated
▪ author has to know syntax and semantics of HTML
elements
▪ slower on large amounts of data
▪ In this course we use the latter approach
▪ We are the IT guys, we have to know how it works
to fix it when it breaks
HTML STANDARDIZATION
The Browser Wars
▪ Facts
▪ In 1994, Mosaic Communications was founded and
started creating Netscape Navigator
▪ In 1995, Microsoft began developing Internet Explorer
▪ The first proposal for an HTML specification was published
in mid-1993
▪ The HTML 2.0 standard was published in November 1995
▪ Way too late!
▪ To attract developers and users, both vendors tried to
implement as many features as possible
▪ Features that differed from other browsers
▪ Features that were not documented elsewhere
▪ Features that worked "a bit differently" from the competitor's way
▪ Result: parts of the web worked only partway!
Standards Organizations: W3C
Global Attributes – Used on Any Element
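A brief sketch of what "used on any element" means: `id`, `class`, `lang` and `title` are global attributes, so they are valid on a heading and on a paragraph alike. All attribute values below are invented for illustration.

```html
<!-- The same four global attributes on two different elements -->
<h1 id="intro" class="slide-title" lang="en" title="Section heading">HTML: What Is It?</h1>
<p id="summary" class="note" lang="en" title="Short summary">HTML is a text markup language.</p>
```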
Why UTF-8?
▪ When did you last use View→Character
encoding while browsing the Web?
▪ How often did you do that in 2005?
▪ Historical issues in the 90s/00s
▪ Wrong encoding
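One common way to avoid wrong-encoding problems is to declare UTF-8 in the document itself. The page below is a minimal sketch with placeholder title and text; it assumes the file is also saved as UTF-8.

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <!-- Declare the encoding early in <head>; the file itself must also be saved as UTF-8 -->
  <meta charset="utf-8">
  <title>UTF-8 demonstration</title>
</head>
<body>
  <p>Letters such as ā, č, ē and ü can be typed directly.</p>
</body>
</html>
```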
Other Substitutes
&lt;      <   (less than)
&gt;      >   (greater than)
&quot;    "   (quote)
&amp;     &   (ampersand)
&hellip;  …   (ellipsis)
&nbsp;        (non-breaking space)
&euro;    €
&#252;    ü   (any Unicode symbol)
&mdash;   —   (em dash, the real "domuzīme")
▪ Other:
▪ Raquo, Laquo? http://www.raquo.net
▪ http://htmlhelp.com/reference/html40/entities/
▪ Note: if you're using UTF-8 everywhere, you don't need the
Unicode substitutes (except for < > " &)
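A short sketch of the substitutes in use. The sentences and the `title` value are invented for illustration.

```html
<!-- Characters that have meaning in markup are written as references -->
<p>Use &lt;h1&gt; for headings &amp; &lt;p&gt; for paragraphs.</p>

<!-- &quot; inside a quoted attribute value; &nbsp; keeps number and currency together -->
<p title="The &quot;real&quot; dash">10&nbsp;&euro; &mdash; 20&nbsp;&euro;</p>
```

The first paragraph is expected to display as "Use <h1> for headings & <p> for paragraphs." In a UTF-8 document the € and the dash in the second paragraph could also be typed directly.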
When Is Whitespace Not Welcome?
▪ Whitespace can be multiplied/added nearly
anywhere in textual content … with some
exceptions
▪ Do not put a space
▪ between the tag brace and the element name
▪ Wrong: < table
▪ Do put a space
▪ between an attribute's closing quote and the next attribute name
▪ Wrong: <table border="1"id="sampletable">
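The two exceptions can be compared side by side. In this sketch the wrong forms are kept inside comments so that the fragment itself stays valid; the table content is a placeholder.

```html
<!-- Wrong: a space right after "<" means this is not read as a tag -->
<!-- < table border="1"> -->

<!-- Wrong: no space between the closing quote and the next attribute name -->
<!-- <table border="1"id="sampletable"> -->

<!-- Right: no space after "<", one space between attributes -->
<table border="1" id="sampletable">
  <tr><td>cell</td></tr>
</table>
```

Browsers may recover from the missing space between attributes, but it is still a syntax error and should not be relied on.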