
HTML fundamentals explained

HTML (HyperText Markup Language) is the structural layer of the web. Browsers do not run HTML like a program; they parse it into a tree of nodes called the Document Object Model (DOM), then combine that tree with CSS and JavaScript to render what users see. Every page you ship, whether a static guide, a React SPA, or a server-rendered shop, ultimately reaches the browser as HTML or is built into the same DOM by script. Well-structured markup helps crawlers interpret the page, gives assistive technology the landmarks and labels it relies on, and keeps the critical rendering path short. This guide covers the document skeleton, semantic elements, text and lists, links and media, forms, head metadata, a Harbor product-page worked example, an element decision table, common pitfalls, and a production checklist.

The HTML document skeleton

A valid HTML5 page follows a predictable structure. The browser needs a <!DOCTYPE html> declaration so it renders in standards mode instead of legacy quirks mode. The root <html lang="en"> element wraps everything; the lang attribute tells assistive tech and search engines which language the primary content uses.

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Page title — unique per URL</title>
    <link rel="canonical" href="https://example.com/page/">
  </head>
  <body>
    <!-- visible content -->
  </body>
</html>

The <head> holds machine-readable metadata: character encoding, viewport for mobile layout, title, description, canonical URL, stylesheets, and scripts that should load before paint. The <body> holds everything users interact with. Keep one <h1> per page that describes the main topic; use <h2> through <h6> in order without skipping levels — screen readers and SEO both rely on that outline.
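As a rough illustration of the heading rule above, the sketch below shows one <h1> followed by nested levels with no skips. The heading text is made up for the example and does not come from a real page.

```html
<!-- Hypothetical outline: the heading text is illustrative only -->
<body>
  <h1>Choosing a hiking pack</h1>      <!-- the single page-level heading -->
  <h2>Capacity</h2>                    <!-- first major subsection -->
  <h3>Day trips</h3>                   <!-- nested under "Capacity" -->
  <h3>Multi-day trips</h3>
  <h2>Fit and adjustment</h2>          <!-- back up one level, nothing skipped -->
</body>
```

Jumping from the <h1> straight to an <h3> would leave a gap that screen reader users notice when they navigate by heading level.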

Semantic HTML: meaning, not just boxes

Early web pages used <div> and <span> for everything. HTML5 added semantic elements that describe role so browsers, crawlers, and assistive technology understand structure without reading class names.

Landmark elements

  • <header> — introductory content or site-wide navigation banner.
  • <nav> — a block of navigation links (primary menu, breadcrumbs).
  • <main> — the dominant unique content; at most one visible <main> per page.
  • <article> — self-contained composition (blog post, guide, card).
  • <section> — thematic grouping inside an article or page.
  • <aside> — tangentially related content (sidebar, pull quote).
  • <footer> — footer for its nearest sectioning ancestor or the page.
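The landmarks listed above might fit together roughly as follows. This is a minimal sketch: the link targets, labels, and text are placeholders chosen for illustration.

```html
<!-- Hypothetical page layout; URLs and text are placeholders -->
<body>
  <header>
    <nav aria-label="Primary">
      <a href="/">Home</a>
      <a href="/guides/">Guides</a>
    </nav>
  </header>
  <main>
    <article>
      <h1>HTML fundamentals explained</h1>
      <section>
        <h2>The HTML document skeleton</h2>
        <p>Section content goes here.</p>
      </section>
    </article>
    <aside>
      <h2>Related guides</h2>
    </aside>
  </main>
  <footer>
    <p>Site-wide footer content</p>
  </footer>
</body>
```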

Text-level semantics

Use <strong> for importance, <em> for emphasis, <code> for code fragments, <abbr title="..."> for abbreviations, and <time datetime="2026-06-08"> for dates. Prefer these over bold or italic styling alone — semantics survive when CSS is off or overridden.
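A short paragraph using each of these elements could look like the sketch below. The sentence content is invented for the example; only the element choices matter.

```html
<!-- Illustrative paragraph: the wording is a placeholder -->
<p>
  <strong>Warning:</strong> back up the file <em>before</em> running
  <code>rm -rf build/</code>. The
  <abbr title="Document Object Model">DOM</abbr> section was last revised on
  <time datetime="2026-06-08">8 June 2026</time>.
</p>
```

The datetime attribute carries the machine-readable date, so the visible text can use whatever format suits the audience.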

Block vs inline

Block elements (<p>, <ul>, <div>) start on a new line. Some of them, such as <div>, can contain other blocks, but <p> can hold only inline (phrasing) content. Inline elements (<a>, <span>, <img>) flow inside text. Nesting rules matter: never put a <div> inside a <p>; the browser will auto-close the paragraph and break your layout.
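To make the nesting rule concrete, the sketch below shows invalid markup, roughly what the parser builds from it, and one valid alternative. The class name note is a hypothetical example.

```html
<!-- As written (invalid: a <div> may not sit inside a <p>) -->
<p>Intro text
  <div class="note">Callout</div>
more text</p>

<!-- Roughly what the parser builds: the <p> closes early, and the
     stray </p> at the end produces an extra empty paragraph -->
<p>Intro text
  </p><div class="note">Callout</div>
more text<p></p>

<!-- One valid alternative: keep inline content inline -->
<p>Intro text <span class="note">Callout</span> more text</p>
```

Styles that target the paragraph no longer apply to the callout or the trailing text, which is usually where the broken layout comes from.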

Links, images, and media

Hyperlinks

The <a href="..."> element is the web's superpower. Use absolute URLs for external destinations and root-relative paths (/guides/) for internal links so they survive domain changes. Add rel="noopener noreferrer" when target="_blank" opens a new tab: noopener stops the new page from reaching back through window.opener, and noreferrer also withholds the Referer header. Current browsers already treat target="_blank" as implying noopener, so the explicit attribute mainly protects users on older ones. Descriptive link text ("read the HTTP guide") beats "click here" for accessibility and SEO.
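The two link patterns described above can be sketched like this. The /guides/http/ path and the example.org domain are placeholders, not real destinations.

```html
<!-- Internal link: root-relative path (hypothetical URL) -->
<a href="/guides/http/">Read the HTTP guide</a>

<!-- External link in a new tab: absolute URL (placeholder domain) -->
<a href="https://example.org/spec/" target="_blank" rel="noopener noreferrer">
  Read the specification (opens in a new tab)
</a>
```

Telling users in the link text that a new tab will open is optional, but it avoids surprising people who rely on the back button.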

Images

Every meaningful <img> needs an alt attribute. Decorative images get alt="" so screen readers skip them. Specify width and height (or use CSS aspect-ratio) to reserve space and avoid Cumulative Layout Shift. For responsive images, use srcset and sizes or the <picture> element — our image optimization guide covers formats and lazy loading.
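The image rules above might translate into markup along these lines. File names, widths, and the sizes breakpoint are assumptions made for the sketch.

```html
<!-- Meaningful image with responsive candidates (hypothetical files) -->
<img src="/img/pack-800.webp"
     srcset="/img/pack-400.webp 400w,
             /img/pack-800.webp 800w,
             /img/pack-1600.webp 1600w"
     sizes="(max-width: 600px) 100vw, 800px"
     alt="Hiker adjusting the hip belt on a gray backpack"
     width="800" height="600">

<!-- Decorative image: empty alt so screen readers skip it -->
<img src="/img/divider.svg" alt="" width="800" height="16">

<!-- Format fallback: the browser picks the first <source> it supports -->
<picture>
  <source srcset="/img/pack-800.avif" type="image/avif">
  <source srcset="/img/pack-800.webp" type="image/webp">
  <img src="/img/pack-800.jpg" alt="Gray backpack on a trail"
       width="800" height="600">
</picture>
```

The width and height attributes give the browser the aspect ratio before the file arrives, which is what reserves the space.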

Video and audio

Native <video> and <audio> elements support multiple <source> children for codec fallbacks. Always provide captions for video with speech and a transcript for audio-only content; WCAG 2 requires captions on prerecorded video at Level A (Success Criterion 1.2.2).
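A video with codec fallbacks and captions could be marked up roughly as follows. The media paths and the caption file are hypothetical.

```html
<!-- Hypothetical media files; the browser plays the first source it supports -->
<video controls width="640" height="360" preload="metadata">
  <source src="/media/pack-demo.webm" type="video/webm">
  <source src="/media/pack-demo.mp4" type="video/mp4">
  <!-- WebVTT captions, shown by default -->
  <track kind="captions" src="/media/pack-demo.en.vtt"
         srclang="en" label="English" default>
  <!-- Fallback content for browsers without <video> support -->
  <p>Your browser cannot play this video.
     <a href="/media/pack-demo.mp4">Download the MP4</a>.</p>
</video>
```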

Forms and user input

Forms are how users submit data. Wrap controls in <form> with an explicit action and method (usually POST for mutations). Every visible control needs a <label>, associated through a for attribute that matches the control's id or by nesting the control inside the label. Clicking the label then focuses the field, and screen readers announce the relationship.

  • type="text", email, url, tel — browser validation and mobile keyboards adapt to the type.
  • type="checkbox" and radio — group radios with the same name; use fieldset and legend for clusters.
  • type="submit" — prefer a real button over JavaScript-only submission so the form works without JS.
  • required, pattern, min/max — client-side hints; always re-validate server-side.

HTML5 added semantic inputs like date, number, and search. Use them when they match the data — they reduce custom widget code and improve mobile UX.
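Putting the form guidance together, a small sign-up form might look like the sketch below. The action URL, field names, and ids are placeholders.

```html
<!-- Hypothetical newsletter form; URL and field names are illustrative -->
<form action="/newsletter/subscribe" method="post">
  <label for="email">Email address</label>
  <input id="email" name="email" type="email" autocomplete="email" required>

  <!-- Radios share one name so only one can be selected -->
  <fieldset>
    <legend>How often should we write?</legend>
    <input id="freq-weekly" name="frequency" type="radio" value="weekly" checked>
    <label for="freq-weekly">Weekly</label>
    <input id="freq-monthly" name="frequency" type="radio" value="monthly">
    <label for="freq-monthly">Monthly</label>
  </fieldset>

  <input id="terms" name="terms" type="checkbox" required>
  <label for="terms">I accept the terms</label>

  <!-- A real submit button works without JavaScript -->
  <button type="submit">Subscribe</button>
</form>
```

The required and type="email" attributes only hint at validity in the browser; the server still has to check every field.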

Head metadata that matters

Beyond charset and viewport, production pages typically include:

  • Title — unique, descriptive, under ~60 characters when possible.
  • Meta description — summary for search snippets; not a ranking factor but affects click-through.
  • Canonical link — tells crawlers the preferred URL when duplicates exist (trailing slash, query params).
  • Open Graph / Twitter tags — control how links preview on social platforms.
  • JSON-LD — structured data for rich results; see our JSON-LD guide.
  • Favicon — <link rel="icon"> for browser tabs.

Stylesheets belong in <head> so the browser can fetch CSS before painting body content. Defer non-critical scripts with defer or async attributes to avoid blocking HTML parsing.
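A production head covering the items above might look roughly like this. Every URL, file path, and content string is a placeholder, and the exact set of social tags depends on the platforms you care about.

```html
<!-- Hypothetical head; all URLs and text values are placeholders -->
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Example page title</title>
  <meta name="description" content="One or two sentences summarizing this page.">
  <link rel="canonical" href="https://example.com/page/">

  <!-- Social previews -->
  <meta property="og:title" content="Example page title">
  <meta property="og:type" content="website">
  <meta property="og:url" content="https://example.com/page/">
  <meta property="og:image" content="https://example.com/img/preview.jpg">
  <meta name="twitter:card" content="summary_large_image">

  <link rel="icon" href="/favicon.ico">

  <!-- CSS in the head so it is fetched before body content paints -->
  <link rel="stylesheet" href="/css/site.css">
  <!-- defer: download in parallel, run after parsing finishes -->
  <script src="/js/site.js" defer></script>
</head>
```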

Worked example: Harbor product detail page

Harbor Gear sells outdoor equipment. A product page needs crawlable structure, accessible forms, and fast images. Here is a simplified skeleton:

<main>
  <article itemscope itemtype="https://schema.org/Product">
    <header>
      <h1 itemprop="name">Saltwind 40L Pack</h1>
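      <!-- price is a property of Offer, so it sits in a nested Offer item; USD is assumed from the "$" sign -->
      <p itemprop="offers" itemscope itemtype="https://schema.org/Offer">
        <meta itemprop="priceCurrency" content="USD">
        <data itemprop="price" value="129.00">$129.00</data>
      </p>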
    </header>
    <figure>
      <img src="/img/saltwind-pack.webp"
           alt="Saltwind 40L hiking backpack, slate gray"
           width="800" height="600" loading="lazy">
    </figure>
    <section aria-labelledby="features-heading">
      <h2 id="features-heading">Features</h2>
      <ul>
        <li>40L capacity, 1.2 kg empty weight</li>
        <li>Hydration sleeve fits 3L reservoir</li>
      </ul>
    </section>
    <form action="/cart/add" method="post">
      <label for="qty">Quantity</label>
      <input id="qty" name="quantity" type="number" min="1" max="10" value="1">
      <input type="hidden" name="sku" value="SG-PACK-40">
      <button type="submit">Add to cart</button>
    </form>
  </article>
</main>

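One <main>, one <h1>, sections with labelled headings, an image with explicit dimensions, and a form that works even if JavaScript fails. The loading="lazy" attribute suits images below the fold; if the product photo is the first thing in the viewport, omit it so the browser fetches the image immediately. JSON-LD in the head can duplicate Product schema for Google Merchant surfaces. CSS from a shared stylesheet handles layout, and the markup stays semantic so responsive rules reflow the same HTML on mobile and desktop.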
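The JSON-LD mentioned above could be sketched as follows. The product name, SKU, price, and image path come from the example page; the example.com domain, the page URL, the USD currency, and the InStock availability are assumptions added for illustration.

```html
<!-- Hypothetical JSON-LD for the Saltwind page.
     Assumed values: domain, page URL, currency (USD), availability. -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Saltwind 40L Pack",
  "sku": "SG-PACK-40",
  "image": "https://example.com/img/saltwind-pack.webp",
  "offers": {
    "@type": "Offer",
    "price": "129.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock",
    "url": "https://example.com/packs/saltwind-40l/"
  }
}
</script>
```

Keep these values in step with the visible page; structured data that disagrees with what users see is likely to be ignored or flagged.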

Element choice decision table

You need                | Use                     | Avoid
------------------------|-------------------------|--------------------------------------------
Site navigation block   | <nav> inside <header>   | Div soup with onclick handlers
Primary page content    | <main>                  | Multiple mains or skipping main entirely
Blog post or guide body | <article>               | Bare div with class "post"
Grouped subsection      | <section> + heading     | Section without a heading (orphan landmark)
Clickable button action | <button type="button">  | <div role="button"> without keyboard support
Navigate to URL         | <a href>                | Button that only runs location.href in JS
Tabular data            | <table> with <th scope> | CSS grid pretending to be a data table
Layout-only wrapper     | <div>                   | Semantic tag chosen for styling convenience
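Three of the rows above, sketched as markup. The id, the /packs/ URL, and the second table row are hypothetical; the Saltwind figures come from the worked example.

```html
<!-- Action on the current page: a real button, focusable and keyboard-operable -->
<button type="button" id="toggle-specs">Show specifications</button>

<!-- Navigation to another URL: a real link -->
<a href="/packs/">Browse all packs</a>

<!-- Tabular data: scope tells assistive tech which cells a header labels -->
<table>
  <caption>Pack comparison</caption>
  <thead>
    <tr>
      <th scope="col">Model</th>
      <th scope="col">Capacity</th>
      <th scope="col">Weight</th>
    </tr>
  </thead>
  <tbody>
    <tr><th scope="row">Saltwind 40L</th><td>40 L</td><td>1.2 kg</td></tr>
    <!-- Made-up second row, included only to show the pattern -->
    <tr><th scope="row">Example 25L</th><td>25 L</td><td>0.8 kg</td></tr>
  </tbody>
</table>
```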

Anti-patterns

  • Divitis — replacing every semantic tag with <div class="header"> loses landmarks; screen reader users cannot jump to main content.
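  • Multiple h1 tags — confuse the document outline; keep one h1 per page. Browsers never implemented the outline algorithm that would have let nested articles each carry their own h1, so assistive technology reports every h1 at the top level.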
  • Missing alt text — images without alt fail WCAG and hurt image search visibility.
  • Inline styles everywhere — hard to maintain; use classes and external CSS. Inline is fine for one-off email or critical above-the-fold tweaks.
  • Tables for layout — breaks responsive design and screen reader table navigation; use CSS Grid or Flexbox.
  • Blocking scripts in head — synchronous JS without defer delays first paint; hurts LCP.
  • Invalid nesting — interactive elements inside interactive elements (button inside link) create unpredictable behavior across browsers.
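A few of these anti-patterns next to a possible fix, as a rough sketch. The script path, cart URL, and class names are placeholders.

```html
<!-- Anti-pattern: synchronous script in <head> halts parsing until it runs -->
<script src="/js/app.js"></script>
<!-- Fix: defer downloads in parallel and runs after parsing -->
<script src="/js/app.js" defer></script>

<!-- Anti-pattern: interactive element nested inside another -->
<a href="/cart/"><button type="button">View cart</button></a>
<!-- Fix: one element per job; style the link as a button with CSS -->
<a href="/cart/" class="button-link">View cart</a>

<!-- Anti-pattern: a div standing in for a semantic element -->
<div class="header">Site name</div>
<!-- Fix: the semantic element, which can expose a landmark at page level -->
<header>Site name</header>
```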

Production checklist

  • Start with <!DOCTYPE html>, lang, charset, and viewport meta.
  • One <main>, one logical <h1>, heading hierarchy without skips.
  • Unique <title>, meta description, and canonical URL per page.
  • All images have appropriate alt; decorative images use alt="".
  • Forms: labels on every control; server-side validation mirrors client hints.
  • Reserve image dimensions; lazy-load below-the-fold media.
  • Validate markup with the W3C validator or IDE linting in CI.
  • Run keyboard-only and screen reader smoke tests on key flows.
  • Keep HTML declarative; push behavior to progressive-enhancement JavaScript.

Key takeaways

  • HTML describes structure and meaning; CSS handles presentation and JavaScript adds behavior.
  • Semantic elements improve SEO, accessibility, and maintainability compared to div-only markup.
  • The head carries encoding, viewport, title, canonical, and social metadata that affect discovery and sharing.
  • Forms and links should work without JavaScript as a baseline; enhance from there.
  • Good HTML is the foundation for fast rendering, valid structured data, and HTTP-delivered content that crawlers can index.
