Monday, December 17, 2007

What is a DOCTYPE? Part 1 of 2 - TUTORIAL

Audience: Those with at least a basic understanding of HTML and CSS.

If you view the HTML code of a webpage, there is a good chance you will see the following text (or something similar) at the top:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Why is it there, and what does it mean?

To get a clear understanding, we need to go back in time...

When HTML first arrived in 1993, it was much simpler than it is today. For example, the first version of HTML had far fewer tags (there was no <div> and no <span>, among others), and CSS did not exist yet. HTML’s creator, Tim Berners-Lee, specified which tags could be used in HTML and what they would mean. For example, he said the <h1> tag could be used to mean a heading. This list of tags and their meanings became known as the “HTML specification”.

As the internet became more popular, competition between web browsers grew as well. To set themselves apart from others on the market, some web browsers began introducing new tags that were not part of the original “HTML specification”. This caused a major problem: some web pages were being created with tags that not all web browsers could understand. Eventually, many people agreed that the original “HTML specification” should be updated to a second version that included as many of the new tags as possible.

This is exactly what happened. HTML was upgraded to version 2.0, which added many new tags and removed some old ones. This created a new problem for web browsers. If HTML kept being upgraded to newer versions, older web pages written in older versions of HTML would still exist on the internet. How was a web browser supposed to know which version of HTML a web page was written in?

This problem was solved by asking web designers to place some special text at the top of a web page’s HTML code. This special piece of text lets the web browser know which version of HTML the web page is written in. This bit of text is called, in technical terms, a document type declaration, because it is declaring which type (or version) of HTML is being used.
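To make this concrete, here is a rough sketch (not how any real browser does it) of a small Python program that looks for the document type declaration at the top of a page. It uses `HTMLParser` from Python's standard library; the names `DoctypeFinder`, `find_doctype` and `page` are made up for this example, and the sample page assumes the standard XHTML 1.0 Transitional declaration.

```python
from html.parser import HTMLParser


class DoctypeFinder(HTMLParser):
    """Remembers the first document type declaration seen while parsing."""

    def __init__(self):
        super().__init__()
        self.doctype = None

    def handle_decl(self, decl):
        # handle_decl receives the text between "<!" and ">".
        if self.doctype is None and decl.lower().startswith("doctype"):
            self.doctype = decl


def find_doctype(html_text):
    """Return the declaration text, or None if the page has no DOCTYPE."""
    finder = DoctypeFinder()
    finder.feed(html_text)
    finder.close()
    return finder.doctype


# A tiny example page (assumed for illustration).
page = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">\n'
    "<html><head><title>Example</title></head><body><h1>Hi</h1></body></html>"
)
print(find_doctype(page))
```

A page with no declaration at all makes `find_doctype` return `None`, which is the situation a browser faces when it has to guess the version for itself.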

If a tag is found in a particular HTML version, then it is considered to be a “valid” HTML tag (for that version). Each time HTML is upgraded to a new version, the list of “valid” tags for that version is compiled together in a document called a document type definition (or DTD). It has this name because the DTD defines which tags are valid for the type (or version) of HTML being used.
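The idea of a tag being "valid for a version" can be sketched in a few lines of Python. The two tag lists below are toy stand-ins invented for this example; they are not real DTDs and are nowhere near complete. The names `TOY_DTDS` and `invalid_tags` are also made up.

```python
# Toy stand-ins for DTDs: each "version" maps to the set of tags it allows.
# These short lists are invented for illustration and are far from complete.
TOY_DTDS = {
    "early-html": {"html", "head", "title", "body", "h1", "p", "a"},
    "later-html": {"html", "head", "title", "body", "h1", "p", "a", "div", "span"},
}


def invalid_tags(tags_used, version):
    """Return the tags that the chosen toy version does not list, sorted."""
    allowed = TOY_DTDS[version]
    return sorted(set(tags_used) - allowed)


tags_on_page = ["html", "body", "h1", "div", "span"]
print(invalid_tags(tags_on_page, "early-html"))  # ['div', 'span']
print(invalid_tags(tags_on_page, "later-html"))  # []
```

The same page is "invalid" against one list and "valid" against the other, which is why the browser needs to be told which list to use.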

So that special bit of text you sometimes see at the top of HTML code is a document type declaration, which lets the web browser know which version of HTML the web page is written in. In the example at the top of this tutorial, it's XHTML 1.0 Transitional, one of the latest versions of HTML.

A complete declaration also contains a link to the document type definition for XHTML 1.0 Transitional (a list of all valid tags for this version of HTML). It is the web address that appears in quotation marks at the end of the declaration.

You can even type that web address into your browser and view the DTD, but please note that figuring out how to properly read it would require a tutorial of its own!
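If it helps to see the pieces separately, the following Python sketch splits a declaration into its parts. It assumes the standard XHTML 1.0 Transitional declaration and only handles declarations of exactly this PUBLIC shape; the function name `split_doctype` and the dictionary keys are made up for this example.

```python
import shlex


def split_doctype(declaration):
    """Split a PUBLIC document type declaration into its named parts."""
    # Drop the surrounding "<!" and ">", then split on spaces while
    # keeping each quoted string together as one piece.
    inner = declaration.strip()[2:-1]
    # Raises ValueError if the declaration does not have exactly five pieces.
    keyword, root, visibility, public_id, dtd_url = shlex.split(inner)
    return {
        "keyword": keyword,        # always DOCTYPE
        "root": root,              # the outermost tag of the page
        "visibility": visibility,  # PUBLIC
        "public_id": public_id,    # the name of the HTML version
        "dtd_url": dtd_url,        # where the DTD can be viewed
    }


declaration = (
    '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" '
    '"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'
)
for name, value in split_doctype(declaration).items():
    print(name, "=", value)
```

The last piece, `dtd_url`, is the web address you can paste into your browser to view the DTD.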

Additional tips:
* DOCTYPE is short for 'Document Type'.

NEXT: Part 2 of the 'What is a DOCTYPE?' tutorial.


Please comment on this tutorial, and let me know if there's any way I could have improved it. Most importantly, what could I have done to make it easier to understand?

1 comment:

scragar said...

maybe just add that DTD is short for "doctype declaration", just for those that don't know, and maybe a little info about what doctypes are available...