Headingology; using HTML headings

Most web pages do not use HTML heading elements (see Related documents [1] for a survey). When headings are used, they are often used incorrectly or ineffectively. In this article I set out the benefits of using headings and what I consider to be good practice guidelines, and I describe a few awkward choices that may have to be made.

Benefits

Correctly and effectively used headings allow users to quickly visually scan content to get an impression of the contents of a given page and find the section(s) that they are interested in.

Image: screencap of a browser side panel displaying a TOC which was automatically generated from a document's heading structure.

Headings also allow tools to automatically generate a node tree that displays the document's section structure (also known as the document outline), in which a user can click a displayed section name to navigate to that section. A browser side panel has more room to display the section structure than the viewport, and the node tree display conveys the hierarchy better than mere font size variations. It can function as a document's Table of Contents (TOC); see the nearby screencap.

Assistive Technology (AT) User Agents (UAs), such as talking browsers used by visually impaired users, can use document headings to offer their users a navigation mechanism. Without such a mechanism, users who access pages via the linear aural domain might be forced to listen to much more of the content on a page to find out whether it contains interesting information. The use of headings combined with a good UA heading navigation mechanism can help such users step through a document by sampling the section headings.

Correctly used headings allow Search Engines (SEs) to index content better. SE indexing algorithms are closely guarded secrets, but it is commonly accepted that SEs give heading content greater "weight" than regular text when establishing what a page is about. Section - Scope (explained later) can be used to give emphasis to keyword weighting per section (see Related documents [2] for more on this).

Section

Headings are the only HTML 4 elements capable of subdividing a document into structural sections.

The following paragraph in the HTML 4 specification could possibly result in the misunderstanding that <div>s can be used for that purpose:

Quote:

The DIV and SPAN elements, in conjunction with the id and class attributes, offer a generic mechanism for adding structure to documents. These elements define content to be inline (SPAN) or block-level (DIV) but impose no other presentational idioms on the content. Thus, authors may use these elements in conjunction with style sheets, the lang attribute, etc., to tailor HTML to their own needs and tastes.

Source: HTML 4.01 spec. Grouping elements: the DIV and SPAN elements

The use of the word "structure" in this description and the fact that it is contained in a section called "Grouping elements ..." does not mean that these elements can be used to section content in a document.

A div or a span can only be said to provide structure and/or act as grouping elements when they are used to code language information via the lang attribute or for styling purposes. This is supported by the HTML 4.01 DTD comments describing the div and span elements:

Quote:

generic language/style container

Source: HTML 4.01 DTD

Scope

When a heading is used in a document it establishes a section. The scope of this section may not be obvious, since the heading element doesn't encapsulate the content the way, for example, a div element would. The section started by a heading ends at the start of the next heading of the same or higher importance. So the section established by an h2 ends at the first h2 or h1 that follows it, or when the document ends (whichever comes first).
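
A minimal sketch of how these scope rules play out in markup. The heading texts are made up for illustration, and the comments mark where each section implicitly starts and ends; nothing in the markup itself closes a section.

```html
<h1>Document title</h1>     <!-- h1 section starts; it runs to the end of the document -->
<p>Introduction.</p>
<h2>First topic</h2>        <!-- first h2 section starts -->
<p>Content of the first topic.</p>
<h3>Detail</h3>             <!-- h3 section starts inside the first h2 section -->
<p>Content of the detail.</p>
<h2>Second topic</h2>       <!-- h2 again: ends the h3 section and the first h2 section -->
<p>Content of the second topic.</p>
<!-- End of document: ends the second h2 section and the h1 section -->
```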

Conveying section structure

For users it is often helpful to be aware of a document's section structure; authors should always be aware of the document hierarchy.

Users

The way graphic browsers indicate heading levels by using different font sizes doesn't convey the section hierarchy and scope particularly well. Emphasizing this structure can make it easier for a user to scan a document. See the Section styling examples.

Authors

Keeping track of a document's section hierarchy and scope when working on the source is difficult if the code is unformatted, and this can easily result in mistakes. Authors could manually indent the code to reflect the section hierarchy, as in this example:

   <h2>Section heading</h2>
   <p>Section content.</p>
      <h3>Section heading</h3>
      <p>Section content.</p>
         <h4>Section heading</h4>
         <p>Section content.</p>
   <h2>Section heading</h2>
   <p>Section content.</p>

Authors who use an automated code formatting utility could consider using div wrappers such as those used in the Section styling demo, even if they do not use them for styling:

   <div class="section">
      <h2>Section heading</h2>
      <p>Section content.</p>
      <div class="section">
         <h3>Section heading</h3>
         <p>Section content.</p>
         <div class="section">
            <h4>Section heading</h4>
            <p>Section content.</p>
         </div>
      </div>
   </div>
   <div class="section">
      <h2>Section heading</h2>
      <p>Section content.</p>
   </div>

The use of div wrapper elements makes it possible for the code formatter to automatically indent the sections.
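
If the wrappers are also used for styling, a style sheet along the following lines could make the nesting visible to users, as suggested under Users. This is only a sketch: the border and spacing values are arbitrary assumptions, not the rules used in the Section styling demo.

```html
<!-- In a real document the style element belongs in the head -->
<style type="text/css">
/* Each nested wrapper is indented and gets a left border,
   so the depth of a section is visible at a glance. */
div.section {
   margin: 1em 0 1em 1em;
   padding-left: 1em;
   border-left: 3px solid #999;
}
/* Let each heading sit at the top of its own wrapper. */
div.section h2, div.section h3, div.section h4 {
   margin-top: 0;
}
</style>

<div class="section">
   <h2>Section heading</h2>
   <p>Section content.</p>
   <div class="section">
      <h3>Section heading</h3>
      <p>Section content.</p>
   </div>
</div>
```

The border makes the scope of each section explicit: it runs alongside everything that belongs to the section, including its subsections.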

Section or divide

A document becomes unwieldy and difficult to use if there is too much content. Typically there is a direct relation between the amount of content and the number of headings if the content is effectively sectioned. For headings to be broadly useful there shouldn't be too many. Although sectioning a larger document with headings can make it substantially more usable, it cannot cure a problem of too much content per page.

Authors should consider dividing content over multiple documents if the nature of the content allows it and if doing so is likely to benefit users. Some authors might also want to consider that dividing content thematically into multiple smaller documents could benefit SE ranking for a theme, although the general principle of "design for humans, not for SEs" should remain the overriding guideline.

Levels

The previous section argued against putting too much content, and thus too many headings in an absolute sense, on one page. It should also be considered that the hierarchy can be too deep. For headings to be broadly useful the hierarchy depth needs to be conveyable to users; the greater the depth, the more difficult this becomes.

HTML 4 provides us with six heading levels: <h1> - <h6>. A question that should be asked is: "is using all six levels wise?" In my opinion it isn't for most situations. On the rare occasions that I thought I needed an <h5> or <h6>, on reflection I had to conclude either that I had too much content for one page and it would be better to divide the content over multiple documents, or that I was using headings presentationally (more on that later in the Pseudo headings section). Consequently I now only use the first four heading levels.

Authors who also wish to use <h5> and <h6> should consider the following issue: CSS 2.1's Default style sheet for HTML 4 suggests that browsers default to a font size for these headings that is smaller than normal body text:

h1{font-size:2em}
h2{font-size:1.5em}
h3{font-size:1.17em}
h4{font-size:1em}
h5{font-size:.83em}
h6{font-size:.75em}

This is an odd choice IMO. Text displayed at a size smaller than the user's configured default size can result in usability problems if users haven't also specified a minimum font size in their browser equal to their configured default body text size. If they have, the distinction between <h4>, <h5> and <h6> is likely to disappear.

As noted earlier, I don't find variation in font size to be a particularly effective way to convey section structure. A font size needs to increase by at least 20% per level for me to be able to distinguish different levels, although this partially depends on what font is used.

Font sizing demos (using a sans-serif font):

To avoid the previously noted problems that could result from the CSS 2.1 default stylesheet heading font size suggestion, I suggest that authors who intend to use all six heading levels and who rely on font size alone to convey heading levels should set the font size for <h6> headings to 100%, and then scale up other headings by 20% increments:

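h1{font-size:249%} /* 100% x 1.2^5 = 248.8% */
h2{font-size:207%} /* 100% x 1.2^4 = 207.4% */
h3{font-size:173%} /* 100% x 1.2^3 = 172.8% */
h4{font-size:144%} /* 100% x 1.2^2 = 144% */
h5{font-size:120%} /* 100% x 1.2 = 120% */
h6{font-size:100%} /* base: same size as body text */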

Consecution

The general opinion in the authoring community seems to be that heading levels should be consecutive. This means, for example, that the "child" heading(s) of an <h2> should be <h3>(s) and should not "skip" one or more levels. I support this practice because it appeals to the orderly side of my character, but I am not aware of solid arguments for it.
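
To make the distinction concrete, here is a small sketch of both cases. The heading texts are hypothetical.

```html
<!-- Consecutive: each child heading is exactly one level below its parent -->
<h1>Widgets</h1>
<h2>Products</h2>
<h3>Blue widget</h3>

<!-- Skipped: the h4 sits directly under an h2, so level h3 is missing -->
<h1>Widgets</h1>
<h2>Products</h2>
<h4>Blue widget</h4>
```

Under the scope rules both variants section the content identically; only the level numbering differs.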

An automatic TOC generator may display a less orderly node tree (depending on the generator used). In some cases skipping heading levels may even cause a utility to choke, as in the case of the "A Dynamic Table of Contents Script" example. But it seems fair to say that this should be labelled a flaw of that particular script.

A W3C note says this about the issue:

Quote:

Users should order heading elements properly. For example, in HTML, H2 elements should follow H1 elements, H3 elements should follow H2 elements, etc.

Source: W3C Note; HTML Techniques for Web Content Accessibility Guidelines 1.0; Section headings

But it isn't backed up with argumentation. Earlier in the paragraph we find this:

Quote:

Since some users skim through a document by navigating its headings, it is important to use them appropriately to convey document structure

Source: W3C Note; HTML Techniques for Web Content Accessibility Guidelines 1.0; Section headings

But it is unclear whether that is meant as an argument for the rule stated later, and regardless, I'm not aware of an AT UA in which skipped heading levels cause a problem.

One argument used in favour of skipping heading levels is that sometimes a heading that defaults to a larger font size, like an <h2>, "looks wrong" in certain places (especially when viewed sans CSS) and incorrectly represents the structure. A common counter argument to this point of view is that the font size of such headings should then be reduced using CSS. IMO this counter argument isn't valid: a document should look structurally sound without CSS, and if something "looks wrong" sans CSS, it is usually because it is wrong. The fact that there can be situations where certain headings "look wrong" is IMO caused by headings being used presentationally (more on that in the Pseudo headings section), or it is the result of using headings to mark up page foreign content (discussed next).

Foreign content

Modern web pages often contain in-viewport UI components like a navigation bar (navbar) and/or a promotional panel (promo panel) with short summaries of, and links to, other articles on the site. Inclusion of a navbar and a promo panel can be highly effective in getting people to explore more of the content on a site, so they are widely used.

Unfortunately HTML 4 is a markup language primarily designed for individual pages with a single subject. It offers no proper facilities for structuring in-viewport UI elements like a site navbar, or promo panel content that doesn't directly relate to the subject of the principal page content.

I use the phrase "foreign content" to refer to a navbar and/or a promo panel, "compound documents" to refer to documents that contain such components, and "native content" for the actual subject content of the page.

Usability problems

Foreign content is often repeated on every page of a site, which creates various usability problems.

Repetition

A navbar or promo panel typically doesn't hinder (it rather helps) people who access such pages in the two-dimensional visual domain, since they can quickly identify these sections and locate the native page content.

But to users that access such pages through a talking browser, foreign content can be a significant hindrance because these users have no way to quickly distinguish such sections. This could result in them having to listen to navbars and side panel content repeated on every page of a site.

Users of AT UAs do not have a way to navigate between page native and page foreign content. Authors could try to address this issue by including several "Skip to ..." links, but introducing more foreign content to alleviate a problem which was created by foreign content in the first place isn't exactly elegant.

Indexing

The higher the ratio of page native content to page foreign content, the better an SE can index the content that should be indexed. Promo panel prose especially can drag that ratio down, but navbars that contain a lot of links, such as nested pull-down menus, can also significantly affect this ratio.

The Pringles dilemma

As noted earlier, the only way to end a section is by following it with another heading of equal or greater importance, or by ending the document. This means that from the point in the document source where the first heading appears, all subsequent content should be correctly sectioned. As a catch phrase: Once you start, you can't stop (a bastardisation of a marketing slogan used by the potato chip maker Pringles: "Once you pop, you can't stop").

Content that isn't denoted by an appropriate heading will be falsely sectioned under the preceding heading. An example:

Footers

Footers can be inadvertently placed in a section. Consider the following content and code:

<h1>Widget Inc.</h1>
<p>World leading Widget manufacturer.</p>
<div class="section">
   <h2>Products</h2>
   <p>We make great Widgets.</p>
</div>
<div class="section">
   <h2>Contact</h2>
   <p>Phone us on 12345</p>
</div>
<p id="footer">© 2007 Widget Inc.</p>

In this example the likely intention is that the copyright footer should apply to the whole document, but as per the rules set out in Section - Scope, the footer is actually part of the "Contact" section. Visually it may appear outside of that section if section styling is used due to the fact that the footer has been placed outside of the wrapper div, but this has no effect on the actual sectioning.

Some authors use an <hr> element in an attempt to make clear that the main structure has ended. They seem to have assigned mythical qualities to the bizarre <hr> element in the illusion that it has structural significance. The reality is that an <hr> is 100% presentational: it has no structural significance and it has no effect on sectioning.

Within the HTML 4 vocabulary the only way to solve this problem is to place the copyright footer in its own section:

<h1>Widget Inc.</h1>
<p>World leading Widget manufacturer.</p>
<div class="section">
   <h2>Products</h2>
   <p>We make great Widgets.</p>
</div>
<div class="section">
   <h2>Contact</h2>
   <p>Phone us on 12345</p>
</div>
<div id="footer">
   <h2>Copyright</h2>
   <p>© 2007 Widget Inc.</p>
</div>

If desired the footer heading can be positioned off screen with CSS.
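
One common way to do this is absolute positioning with a large negative offset. The following is a sketch, not a recommendation of specific values: the -9999px offset is an arbitrary assumption, and the selector relies on the id="footer" wrapper from the example above.

```html
<!-- In a real document the style element belongs in the head -->
<style type="text/css">
/* Moves the heading out of the viewport without removing it from the
   document, so it still starts the footer section. Unlike display:none,
   which talking browsers commonly skip, the heading text should remain
   available to AT UAs. */
#footer h2 {
   position: absolute;
   left: -9999px;
}
</style>
```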

Native content first

The practice of placing page foreign content after the page native content in the document source is also affected by the Pringles dilemma. The arguments used in favour of placing native content first in the document source include:

Assume that this technique has been used and the order in the document source is: page native content, promo panel, navbar. As with the Footer example, if the navbar and the promo panel aren't denoted by their own headings, they will be incorrectly sectioned under the last heading of the page native content section.

Since neither the navbar nor the promo panel content pertains to the principal subject of the page, which is denoted by the <h1> above the page native content section, <h1>s should be used to denote the navbar and the promo panel. As with the Footer example, these headings can be positioned off screen by using CSS.
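
A sketch of the resulting source order, assuming hypothetical page names and heading texts. The comments show which section each <h1> opens.

```html
<h1>Blue widget</h1>            <!-- page native content -->
<p>Our best selling widget.</p>
<h2>Specifications</h2>
<p>Weight, size and colour.</p>

<h1>Other products</h1>         <!-- promo panel: ends the page native section -->
<p>The <a href="red_widget.html">Red widget</a> is new this month.</p>

<h1>Site navigation</h1>        <!-- navbar: ends the promo panel section -->
<ul>
   <li><a href="index.html">Home</a></li>
   <li><a href="products.html">Products</a></li>
   <li><a href="contact.html">Contact</a></li>
</ul>
```

Without the second and third <h1>, the promo panel and the navbar would both fall inside the "Specifications" section.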

This is controversial as many believe that a document should only contain one <h1>, and this makes perfect sense for non-compound documents. But compound documents offer significant benefits. If the puritanical ideal of not including page foreign content isn't an option, then authors need to compromise given the limited options that HTML 4 offers.

The choices are: also use <h1>s to denote things like a navbar and a promo panel, or make sure that a navbar and promo panel precede the first heading in the document source, and then add "Skip to ..." links.

Each method has its own pros and cons. On a practical level it should be considered that using <h1>s to denote page foreign content may have an effect on search engine indexing, although it is hard to predict what that effect might be. Placing a navbar and/or promo panel content before the page native content in the source will also prevent the false sectioning, but it replaces it with another problem: un-sectioned content, which users cannot reach by navigating the document structure via the document headings.

Users of AT UAs in particular are thought to use and appreciate heading navigation (see User behaviour research). The heading navigation support found in certain popular clients (see Heading navigation support in AT UAs) allows users to navigate directly from one <h1> to the next <h1>, regardless of any headings of lower importance in between the two. A significant advantage of using <h1>s to denote page foreign content is therefore that AT users can skip or navigate to a navbar or promo panel section with the same heading navigation method they already use to navigate page native content sections.

*$£!%* HTML 4

Instead of blaming HTML 4 for the inadequacies that require us authors to use these less than ideal solutions for compound documents, why not read the HTML 5 proposals, get involved if you are able to, and help create something better for the future?

Linking

Headings are an ideal destination for fragment identifier links. I advocate using the exact target heading text as the link text; that way users get confirmation that they ended up in the right place after the navigation has taken place. Using the exact heading text (which, if located on another site, is usually not under the author's control) as the link text often means that you have to adjust the text in which those links are embedded. Having used this method myself for a while, I've found this easy to do once I got used to it.

Linking to a heading using the heading content as the link text should make it easy for a user to find the section if the section being linked to does not appear at the top of the viewport, such as when the section is near the end of a document. A demonstration: on the "HTML is dead, long live HTML" blog post, if I follow the "skip to the point" link I am lost; I don't know where to start reading. A "The point" subheading would have solved that.

If the heading that is being linked to is a subheading whose text content doesn't make sufficient sense when isolated from the preceding section heading, then the preceding section heading text can be included in the target heading link text.

A practical example: when linking to the subsection on (section) Scope in this document I would use the link text Section - Scope. This should ensure that the link continues to make sense when it is taken out of context for example when a UA lists all links in a document (see nearby screencap image).
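
In markup this could look as follows. It is a sketch only: the id value "scope" and the sentence around the link are assumptions made for illustration.

```html
<!-- Target: the heading itself carries the fragment identifier -->
<h3 id="scope">Scope</h3>
<p>When a heading is used in a document it establishes a section.</p>

<!-- Link: the link text repeats the heading text, prefixed with the
     parent heading text so that it still makes sense out of context -->
<p>The footer is actually part of the "Contact" section, as per the
rules set out in <a href="#scope">Section - Scope</a>.</p>
```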

I've seen a lot of instances where authors hyperlink heading text; I consider this bad practice. Heading text should briefly describe the topic of the section it establishes, and the topic description should immediately follow the heading. It makes no sense to link a heading to another location.

Grammar

I'm strongly partial to conciseness in heading text; the briefer the description of the content the better as far as I'm concerned. I aim for staccato, not legato.

I'm not aware of a preferred convention on capitalising only the first word or every word. Both methods look OK to me depending on the text, provided that the same style is used across a site.

I avoid ending with a punctuation mark like a colon or semi-colon because they don't work when a heading is taken out of context. Although question marks can work out of context it is not common to have a section which really asks a question. For example this document contains a section called "Section or divide", but it represents a choice that is expanded upon, it is not a question.

Pseudo headings

I assume that we'd all agree that marking up content as a heading if all we want is a bold and bigger font is incorrect. I argue that it is equally incorrect to mark up something as a heading just because we want something to appear structurally separated from adjacent content (typical visual rendering: a preceding and trailing blank line).

The <th> element is used to mark up "headings" for table rows and columns. It isn't part of the normal heading hierarchy and it isn't included in the heading navigation option that certain browsers offer. Its content is typically rendered in bold at normal font size, and AFAIK it is not treated by search engines as having greater importance than normal body text.

IMO there are cases where such a "pseudo" heading is useful outside of the context of a table, list introductions for example. Consider the following content & code:

   <h2>Shopping list</h2>
   <p>Edgar struggled to compose a shopping list. Eventually
   he managed to write down what he needed whilst at the same
   time answering a telephone call.</p>
   <p><b>Shopping list</b></p>
   <ul>
      <li>Milk</li>
      <li>Eggs</li>
      <li>Bread</li>
      <li>Potatoes</li>
   </ul>

It is obvious that the shopping list itself is already in an appropriate section under an appropriate heading. Marking up the list introduction as an h3 heading would not only result in two headings with the same content, it would also serve no meaningful function, since the list cannot justifiably be considered a separate section. The list introduction as used in the example is not superfluous either; it is necessary. Imagine a possible rendering by a speaking browser if it were omitted: "... whilst at the same time answering a telephone call. [pause] Milk [pause] Eggs [pause] Bread [pause] Potatoes".

My use of a <p> element and (shock horror) the allegedly evil <b> element to construct such a pseudo heading will undoubtedly cause certain people to want to stick voodoo pins in an effigy of me for sinning against their semantic markup cult. But I suggest they wait until they've read another article that I hope to publish shortly that will expand on that practice (after which they'll probably want to stick pins in my effigy and burn it at the stake).

Heading or Pseudo heading

Based on the previous example I advocate that authors should use headings in a way that will result in a document outline likely to be useful as a Table of Contents and navigation tool, and consider using "pseudo headings" as illustrated by the previous example for situations where something merely needs to be structurally separated from adjacent content.

Graphic headings

It should be stressed that any use of images of text can cause problems, examples:

These potential problems should be carefully weighed before deciding to use images of text. On the other hand the desire amongst designers to use certain fonts and effects for headings which currently cannot be reliably achieved with CSS such as shadows and partial transparency is IMO understandable. The issue of people needing a bigger font size is arguably the most likely to cause a problem, consequently images of text should IMO only be considered for large size headings.

Another problem in addition to those mentioned earlier is that a search engine may not index the fall back text content, more on this issue in the next section.

Site branding

It is common to use site branding where an image displaying the company or site name is displayed site wide. Doing so should create a sense of consistency and familiarity for users as they navigate across a site. Such an image of text should typically be marked up as the <h1> content on the site's home page with an appropriate real text alternative.

On pages other than the home page the same image is essentially decorative and should be coded as such. The <h1> content on for example the "About" page on such a site should be "About". Authors who wish to make such an image on sub pages link to the home page can still do so (although I do not support this convention myself).
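
A sketch of how the top of such an "About" page might be coded. The id="branding" wrapper and the paragraph text are assumptions made for illustration.

```html
<!-- Sub page: the branding image is decorative here, so it gets an
     empty alt text and stays outside the heading -->
<div id="branding"><img src="widget_inc.png" alt=""></div>

<!-- The h1 describes this page, not the site -->
<h1>About</h1>
<p>Widget Inc. is a world leading Widget manufacturer.</p>
```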

Using the <img> element to embed the image like for example <h1><img src="widget_inc.png" alt="Widget Inc."></h1> can create a problem because search engines (wrongly) may not index the alt content. The so called "image replacement" techniques solve this problem and this could partially explain their popularity, but these image replacement techniques all introduce additional problems. There is a technique available that does not introduce additional problems. Instead of using the <img> element it uses the <object> element to embed the image. The text fall back for the image is coded as the element's content instead of attribute content, so the text will be indexed by SEs.

Alas Microsoft (MS) Internet Explorer (IE) 6 & 7 do not render images embedded with <object> properly.

This can be solved by serving IE an <img> construct instead with the help of Conditional Comments (CCs). The so called "downlevel-revealed" CC required for this as documented by MS does not validate, but this can be solved by modifying the MS code as has been done in the example that follows. Using CCs and two different elements results in slightly more verbose code, but this shouldn't significantly increase development or maintenance effort as it would only be used once on the home page.

Code example (width and height info omitted for brevity)

<h1>
   <!--[if !IE]>--><object type="image/png" data="widget_inc.png">Widget Inc.</object><!--<![endif]-->
   <!--[if IE]><img src="widget_inc.png" alt="Widget Inc."><![endif]-->
</h1>

A working demonstration that combines all the principles and techniques described: Widget Inc. Test both the home page and one of the "product" pages in turn with: CSS disabled & images enabled, CSS enabled & images disabled and CSS & images disabled (using a browser that supports this). The CSS used is placed in the HTML document's head for ease of viewing.

Subheadings

Subheadings can be split into two categories:

Promotional text

Widget Inc.
World leading Widget manufacturer

"World leading Widget manufacturer" would typically be referred to as a "subheading" in the print domain, and I don't see much wrong with that. But in HTML the term "subheading" means something else: "a heading that establishes a subsection". It is obvious that "World leading Widget manufacturer" doesn't establish a subsection; it is merely promotional sub-text. But the above effect is commonly coded as:

<h1>Widget Inc.</h1> 
<h2>World leading Widget manufacturer</h2>

This is incorrect and should be replaced by the following:

<h1>Widget Inc.</h1> 
<p>World leading Widget manufacturer.</p>

Presentational

Headingology,
using HTML headings

Take the title of this article. "Headingology" is a made-up word that is paradoxically both very useful for SE indexing (it wasn't listed by Google when I searched for it) and at the same time useless: unless the word is used to refer to this article, it is unlikely to feature in the search results of someone looking for opinions on HTML heading usage. So I added "using HTML headings". I wanted to display "using HTML headings" in a smaller font, so I decided on the following coding:

<h1>Headingology, <small style="display:block;font-size:50%">using HTML headings</small></h1> 

Possible problems

I've developed an allergy to the following situations:

No content between headings

<h1>Codewallop</h1> 
<h2>Good practice</h2>
<ul>
   <li>HTML Titles</li>
   <li>HTML Headings</li>
</ul>
<h2>Points of view</h2>
<ul>
   <li>css Zen Garden, The Road to Misconception</li>
   <li>The flaws of WAI 2</li>
</ul>

I don't have an argument to suggest that the previous example is wrong, but subjectively I prefer inserting a paragraph in between. Following the guidelines from the HTML Headings - Grammar section, headings should be as short and concise as possible. Perhaps it is the resulting desire to expand on the heading that makes me partial to inserting at least a short paragraph. An example:

<h1>Codewallop</h1> 
<p>Striking a blow for quality web code.</p>
<h2>Good practice</h2>
<ul>
   <li>HTML Titles</li>
   <li>HTML Headings</li>
</ul>
<h2>Points of view</h2>
<ul>
   <li>css Zen Garden, The Road to Misconception</li>
   <li>The flaws of WAI 2</li>
</ul>

Single subsection

This is another scenario that I view with some suspicion:

<h1>Codewallop</h1> 
<p>Striking a blow for quality web code.</p>
<h2>Good practice</h2>
<ul>
   <li>HTML Titles</li>
   <li>HTML Headings</li>
</ul>

The <h2> in the previous example is used to create a single subsection, which doesn't strike me as sensible. If subsectioning is to be used, shouldn't there be at least two subsections? I'd be inclined to use a pseudo heading instead of the <h2> in the previous scenario.

References

Links to various other documents, research and tools.

Tools

Heading navigation support in AT UAs

The following links to AT UA manuals suggest that they support heading navigation per level.

Other AT UAs.

AFAIK neither Orca nor NVDA lists headings yet, but they're both still in early development.

User behaviour research

A 2003 study of 16 blind users found that most tried to scan the page in various ways, but while a few used the headings features many were unaware they existed.

(Both these links write up the same study, but somewhat differently.)

A 2005 study of 8 screen reader users found that all participants said headings for different levels of navigation were very useful: Page Source Order & Accessibility

Peter Krantz studied screen reader users in 2005 and reported that the list of headings was one of the most used features of JAWS: Browsing habits of screen reader users

Techniques

Links to pages on some of the coding techniques used in the demos.

Credits

The "Heading navigation support in AT UAs" and "User behaviour research" references and prose were taken from a post by Benjamin Hawkes-Lewis to the WhatWG mailing list: [whatwg] Clarify how to indicate document hierarchy.

Revisions

Date            Revision
15 March 2007   Initial publication
22 March 2007   Removed the notice that IE7 by default blocks content embedded with an object element; this only appears to be the case when loading the HTML file from the local filesystem
