Introduction to HTML

In this lesson of the HTML tutorial, you will learn… 

  1. To create a simple HTML page.
  2. About HTML elements and attributes.
  3. The difference between HTML and XHTML.
  4. To create the skeleton of an HTML document.
  5. About whitespace and HTML.
  6. To output special characters in HTML.

HyperText Markup Language (HTML) is the language behind most Web pages. The language is made up of elements that describe the structure and format of the content on a Web page.

HTML is maintained by the World Wide Web Consortium (W3C). As of this writing, the latest versions are HTML 4.01 and XHTML 1.0. See the W3C website for the specifications. In this lesson, we will address the differences between HTML and XHTML. The rest of the course covers XHTML, but for simplicity, we’ll usually refer to it as HTML.

Getting Started

We’ll begin with a simple exercise.

Exercise: A Simple HTML Document

Duration: 5 to 15 minutes.

In this exercise, you will create your first HTML document by simply copying the text shown below. The purpose is to give you some sense of the structure of an HTML document.

  1. Open a simple text editor such as Notepad and create a new file. Do not use an HTML editor for this exercise.
  2. Save the file as HelloWorld.html in the HTMLBasics/Exercises folder.
  3. Type the following exactly as shown:
    <title>Hello world!</title>
    <h1>Hello world!</h1>
  4. Save the file again and then open it in your browser by navigating to the file in your folder system and double-clicking it. The page should display “Hello world!” as a large heading, and browsers typically show the same text in the title bar.

The HTML Skeleton

At its simplest, an HTML page contains what can be thought of as a skeleton – the main structure of the page. It looks like this:

Code Sample: HTMLBasics/Demos/Skeleton.html

 <!--Content that appears on the page-->
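
As a rough sketch, the skeleton consists of a head and a body nested inside a single html element. The empty title and the indentation below are assumptions made for illustration, not necessarily the exact contents of Skeleton.html:

```html
<html>
  <head>
    <!-- Information about the page; not displayed on the page itself -->
    <title></title>
  </head>
  <body>
    <!--Content that appears on the page-->
  </body>
</html>
```

The two children of the html element are described in turn in the sections that follow.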

The <head> Element

The <head> element contains content that is not displayed on the page itself. Some of the elements commonly found in the <head> are:

  • Title of the page (<title>). Browsers typically show the title in the “title bar” at the top of the browser window.
  • Meta tags, which contain descriptive information about the page (<meta />).
  • Script blocks, which contain JavaScript or VBScript code for adding functionality and interactivity to a page (<script>).
  • Style blocks, which contain Cascading Style Sheet rules (<style>).
  • References (or links) to external style sheets (<link />).
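
A head that uses each of the elements listed above might look like the following sketch. The title text, the description, the style rule, and the file name styles.css are all hypothetical values chosen for illustration:

```html
<head>
  <title>Example Page</title>
  <meta name="description" content="A short description of the page" />
  <script type="text/javascript">
    // Script code goes here
  </script>
  <style type="text/css">
    h1 { color: navy; }
  </style>
  <link rel="stylesheet" type="text/css" href="styles.css" />
</head>
```

None of this content is displayed in the page itself; only the title is shown, in the browser's title bar.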

The <body> Element

The <body> element contains all of the content that appears on the page itself. Body tags will be covered thoroughly throughout this manual.


Whitespace

Extra whitespace is ignored in HTML. This means that any run of hard returns, tabs, and spaces is condensed into a single space for display purposes.

Code Sample: HTMLBasics/Demos/Whitespace.html

 <title>Whitespace Example</title>
 <p>This is a sentence on a single line.</p>

 <p>
     sentence with
    extra whitespace
 </p>

Code Explanation

The two sentences in the code above are rendered in the same way: the extra line breaks and indentation in the second do not appear in the browser.

HTML Elements

HTML elements describe the structure and content of a Web page. Tags are used to indicate the beginning and end of elements. The syntax is as follows:

<tagname>Element content</tagname>


Attributes

Tags often have attributes for further defining the element. Attributes come in name-value pairs. HTML allows some attributes to be written without a value; this is not allowed in XHTML.

Note that attributes only appear in the open tag, like so:

<tagname att1="value" att2="value">Element content</tagname>

The order of attributes is not important.

Empty vs. Container Tags

The tags shown above are called container tags because they have both an open and close tag with content contained between them. Tags that do not contain content are called empty tags. The syntax is as follows:

<tagname />

or, with attributes:

<tagname att1="value" att2="value" />
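
Two familiar empty tags, written in XHTML style, give a concrete picture of the syntax above. The file name logo.gif and the alt text are hypothetical:

```html
<!-- A line break: no content and no attributes -->
<br />

<!-- An image: no content, but attributes describe it -->
<img src="logo.gif" alt="Company logo" />
```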

Blocks and Inline Elements

Block-level Elements

Block elements are elements that separate a block of content. For example, a paragraph (<p>) element is a block element. Other block elements include:

  • Lists (<ul> and <ol>)
  • Tables (<table>)
  • Forms (<form>)
  • Divs (<div>)

Inline Elements

Inline elements are elements that affect only snippets of content and do not block off a section of a page. Examples of inline elements include:

  • Links (<a>)
  • Images (<img>)
  • Formatting tags (<b>, <i>, <tt>, etc.)
  • Phrase elements (<em>, <strong>, <code>, etc.)
  • Spans (<span>)

Important: In the strict versions of HTML and XHTML (described later in this lesson), inline elements cannot be direct children of the body element. They must be contained within a block-level element.
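
The following sketch shows how the two kinds of elements fit together, with every inline element contained in a block-level element. The text and the link target page2.html are invented for illustration:

```html
<body>
  <!-- Block-level elements: each marks off its own block of content -->
  <h1>Page heading</h1>
  <p>
    <!-- Inline elements sit inside the paragraph and affect only snippets of it -->
    This paragraph contains <em>emphasized text</em>,
    a <a href="page2.html">link</a>, and
    a <span>span</span>.
  </p>
  <ul>
    <li>A list item</li>
  </ul>
</body>
```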


Comments

Comments are generally used for one of three purposes:

  1. To write helpful notes about the code; for example, why something is written in a specific way.
  2. To comment out some code that is not currently needed, but may be used sometime in the future.
  3. To debug a page.

HTML comments are enclosed in <!-- and -->. For example:

<!-- This is an HTML comment -->


HTML vs. XHTML

XHTML 1.0 and HTML 4.01 consist of the same sets of elements. The main difference is that HTML is fairly flexible, whereas XHTML has strict rules.

DOCTYPE Declarations

The DOCTYPE declaration goes at the beginning of the document and is used to indicate which version of (X)HTML the page uses. There are three versions of (X)HTML documents: strict, frameset and transitional (loose). In HTML, the DOCTYPE declaration is optional. In XHTML, it is required.


Strict

The strict versions of HTML and XHTML do not allow the use of tags and attributes that have been deprecated. They also do not support framesets.

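XHTML Strict
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

HTML Strict
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">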

Transitional (Loose)

The transitional (or loose) versions of HTML and XHTML allow for the use of deprecated tags and attributes. The transitional versions also do not support framesets.

XHTML Transitional
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

HTML Transitional
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">


Frameset

The frameset versions of HTML and XHTML are the same as the transitional versions, except that they also support frames.

XHTML Frameset
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-frameset.dtd">

HTML Frameset
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Frameset//EN"
"http://www.w3.org/TR/html4/frameset.dtd">

Closing Tags

HTML 4.01 allows some closing tags to be omitted. For example, in HTML, list item (<li>) tags do not require a matching close tag (</li>).

In XHTML, all tags must be closed. Empty tags are closed by adding a forward slash before the final angle bracket of the tag:

<tagname att1="value" att2="value" />

Note the space before the forward slash. Though this is not required by XHTML, it may keep older browsers from getting confused.

In HTML 4.01, the forward slash is not required:

<tagname att1="value" att2="value">
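
A brief sketch contrasting the two styles on the same content may help. The list text is invented for illustration:

```html
<!-- HTML 4.01: closing </li> tags omitted, empty tag written without a slash -->
<ul>
  <li>First item
  <li>Second item
</ul>
<hr>

<!-- XHTML: every tag closed, including the empty tag -->
<ul>
  <li>First item</li>
  <li>Second item</li>
</ul>
<hr />
```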

Case Sensitivity

In HTML, tag and attribute names are not case sensitive. In XHTML, all tag and attribute names must be in lowercase letters.


Quotes

In HTML, attribute values do not always have to be in quotes, whereas in XHTML quotes are required. Either single quotes or double quotes may be used.
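
The case and quoting rules can be seen side by side in a single link element. This is a sketch only; the target page2.html and the link text are hypothetical:

```html
<!-- Acceptable in HTML 4.01: uppercase names and an unquoted value -->
<A HREF=page2.html>Next page</A>

<!-- XHTML: lowercase names and a quoted value -->
<a href="page2.html">Next page</a>

<!-- XHTML: single quotes work as well -->
<a href='page2.html'>Next page</a>
```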


Nesting

In both HTML and XHTML, tags should be nested properly. Proper nesting requires nested tags to be closed in the reverse of the order in which they were opened. Another way to say this is that each element must be completely contained by its parent element. For example, the following line of code uses improper nesting:

<b><i>Some text</b></i>

The corrected line looks like this:

<b><i>Some text</i></b>


Some XML Stuff

The XML Declaration

XHTML documents are, by definition, XML documents. This means that they follow the rules of XML. Although not required, it is good practice to include an XML declaration in your XHTML documents. If included, the XML declaration must be at the very beginning of the document. The XML declaration looks like this:

<?xml version="1.0" encoding="UTF-8"?>

For best results, define the encoding in a meta tag as well. We’ll cover meta tags later in the manual. For now, note that you should include the following tag within the head tag:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

The XHTML Namespace

In XHTML documents, the html tag must contain an xmlns declaration for the XHTML namespace, which indicates that the document must conform to the rules defined in the XHTML namespace. The syntax is shown below:

<html xmlns="http://www.w3.org/1999/xhtml">
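
Putting the pieces of this section together, a complete XHTML document might look like the sketch below. The choice of the Strict DOCTYPE, the title, and the paragraph text are assumptions made for illustration:

```html
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <title>Example XHTML Page</title>
  </head>
  <body>
    <p>Hello world!</p>
  </body>
</html>
```

The XML declaration comes first, followed by the DOCTYPE and then the html element carrying the namespace declaration.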

Special Characters

Special characters (i.e., characters that do not show up on your keyboard) can be added to HTML pages using entity names and numbers. For example, a copyright symbol (©) can be added using &copy; or &#169;. Characters that have a special meaning in markup are written the same way. Some of the more common character references are &amp; (&), &lt; (<), &gt; (>), &quot; ("), and &nbsp; (non-breaking space).
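
A short sketch of character references in use follows. The company name is hypothetical:

```html
<p>
  Copyright &copy; Example Company.  <!-- named reference -->
  Copyright &#169; Example Company.  <!-- numeric reference, same symbol -->
  Use &lt;p&gt; for paragraphs &amp; &lt;h1&gt; for headings.
</p>
```

In the last line, the browser displays the angle brackets and the ampersand as literal characters instead of treating them as markup.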

lang and xml:lang

The lang and xml:lang attributes are used to tell the browser (or other user agent) the language contained within an element. The W3C recommends that both lang and xml:lang be included in the html tag of all XHTML documents, like so:

<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

According to the W3C, these attributes may be helpful in:

  • Assisting search engines
  • Assisting speech synthesizers
  • Helping a user agent select glyph variants for high quality typography
  • Helping a user agent choose a set of quotation marks
  • Helping a user agent make decisions about hyphenation, ligatures, and spacing
  • Assisting spell checkers and grammar checkers

Introduction to HTML Conclusion

In this lesson of the HTML tutorial, you have learned the basics of HTML. You should understand how an HTML page is structured, know the major differences between HTML and XHTML and understand the basic syntax of HTML tags.

