HTML fundamentals explained
HTML (HyperText Markup Language) is the structural layer of the web. Browsers do not run HTML like a program; they parse it into a tree of nodes called the Document Object Model (DOM), then combine that tree with CSS and JavaScript to render what users see. Every page you ship, whether a static guide, a React SPA, or a server-rendered shop, starts from an HTML document and ends up as a DOM tree in the browser. Good markup helps crawlers index your content, gives assistive technology a structure to navigate, and keeps the critical rendering path short. This guide covers the document skeleton, semantic elements, text and lists, links and media, forms, head metadata, a Harbor product-page worked example, an element decision table, common pitfalls, and a production checklist.
The HTML document skeleton
A valid HTML5 page follows a predictable structure. The browser needs a
<!DOCTYPE html> declaration so it renders in standards mode
instead of legacy quirks mode. The root <html lang="en"> element
wraps everything; the lang attribute tells assistive tech and search
engines which language the primary content uses.
```html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Page title — unique per URL</title>
    <link rel="canonical" href="https://example.com/page/">
  </head>
  <body>
    <!-- visible content -->
  </body>
</html>
```
The <head> holds machine-readable metadata: character encoding,
viewport for mobile layout, title, description, canonical URL, stylesheets, and
scripts that should load before paint. The <body> holds everything
users interact with. Keep one <h1> per page that describes the
main topic; use <h2> through <h6> in order
without skipping levels — screen readers and SEO both rely on that outline.
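The heading rule above can be sketched as a small outline. The headings are hypothetical, and the indentation only makes the hierarchy visible; it has no effect on how the browser parses the markup.

```html
<!-- One h1 for the page topic; each deeper level nests under the one above it -->
<h1>Choosing a hiking pack</h1>
  <h2>Capacity</h2>
    <h3>Day trips</h3>
    <h3>Multi-day trips</h3>
  <h2>Fit and sizing</h2>
    <h3>Torso length</h3>

<!-- Avoid: jumping from h1 straight to h4 because the smaller text looks right.
     Pick the level for structure and adjust the size in CSS. -->
```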
Semantic HTML: meaning, not just boxes
Early web pages used <div> and <span> for
everything. HTML5 added semantic elements that describe role
so browsers, crawlers, and assistive technology understand structure without reading
class names.
Landmark elements
- <header>: introductory content or site-wide navigation banner.
- <nav>: a block of navigation links (primary menu, breadcrumbs).
- <main>: the dominant unique content; exactly one per page.
- <article>: self-contained composition (blog post, guide, card).
- <section>: thematic grouping inside an article or page.
- <aside>: tangentially related content (sidebar, pull quote).
- <footer>: footer for its nearest sectioning ancestor or the page.
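As a rough sketch, the landmark elements listed above might fit together like this. The link targets, headings, and copy are placeholders, not taken from a real site.

```html
<body>
  <header>
    <!-- aria-label distinguishes this nav from any other nav on the page -->
    <nav aria-label="Primary">
      <a href="/">Home</a>
      <a href="/guides/">Guides</a>
    </nav>
  </header>

  <main>
    <article>
      <h1>HTML fundamentals explained</h1>
      <section>
        <h2>The HTML document skeleton</h2>
        <p>Section content goes here.</p>
      </section>
    </article>
    <aside>
      <h2>Related guides</h2>
      <p>Tangential links or a pull quote.</p>
    </aside>
  </main>

  <footer>
    <p>Site-wide footer content.</p>
  </footer>
</body>
```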
Text-level semantics
Use <strong> for importance, <em> for emphasis,
<code> for code fragments, <abbr title="...">
for abbreviations, and <time datetime="2026-06-08"> for dates.
Prefer these over bold or italic styling alone — semantics survive when CSS is off
or overridden.
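A short paragraph using each of these elements could look like the following. The wording and the npm command are illustrative only.

```html
<p>
  <strong>Warning:</strong> run <code>npm install</code> <em>before</em> you start
  the dev server. The <abbr title="Document Object Model">DOM</abbr> section was
  last updated on <time datetime="2026-06-08">June 8, 2026</time>.
</p>
```

The `datetime` attribute carries the machine-readable value, so the visible text can use whatever date format suits your readers.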
Block vs inline
Block-level elements (<p>, <ul>, <div>) start on a new line. Generic
containers such as <div> can hold other blocks, but <p> accepts only inline
(phrasing) content. Inline elements (<a>, <span>, <img>) flow
inside text. Nesting rules matter: never put a <div> inside a
<p>; the parser auto-closes the paragraph before the <div>, which can break your layout.
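To make the auto-closing behavior concrete, here is a sketch of what is written versus roughly what the HTML parser builds. The class name is hypothetical.

```html
<!-- Written (invalid): a div inside a p -->
<p>Intro text <div class="note">Note</div> more text</p>

<!-- Roughly what the parser produces: the p ends before the div,
     the trailing text is orphaned, and the stray </p> becomes an empty paragraph -->
<p>Intro text </p>
<div class="note">Note</div>
 more text
<p></p>

<!-- Valid alternative: keep inline content inside the paragraph -->
<p>Intro text <span class="note">Note</span> more text</p>
```

Any CSS that targets the paragraph no longer applies to the trailing text in the second form, which is usually how the bug shows up.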
Links, images, and media
Hyperlinks
The <a href="..."> element is the web's superpower. Use absolute
URLs for external destinations and root-relative paths (/guides/) for
internal links so they survive domain changes. Add rel="noopener noreferrer"
when target="_blank" opens a new tab — it prevents the new page from
accessing window.opener. Descriptive link text ("read the HTTP guide")
beats "click here" for accessibility and SEO.
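The link guidance above might translate to markup like this. Both URLs are placeholders.

```html
<!-- Internal link: root-relative path and descriptive text -->
<p>For request and response basics, <a href="/guides/http/">read the HTTP guide</a>.</p>

<!-- External link opened in a new tab: absolute URL plus rel to cut off window.opener -->
<p>
  <a href="https://example.com/spec/" target="_blank" rel="noopener noreferrer">
    Example specification (opens in a new tab)
  </a>
</p>
```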
Images
Every meaningful <img> needs an alt attribute.
Decorative images get alt="" so screen readers skip them. Specify
width and height (or use CSS aspect-ratio) to reserve space
and avoid Cumulative Layout Shift.
For responsive images, use srcset and sizes or the
<picture> element; our image optimization guide
covers formats and lazy loading.
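A minimal sketch of these image patterns follows. The file names, the widths in `srcset`, and the 600px breakpoint are assumptions made for illustration.

```html
<!-- Meaningful image: alt text, intrinsic dimensions, and resolution candidates -->
<img src="/img/pack-800.webp"
     srcset="/img/pack-400.webp 400w,
             /img/pack-800.webp 800w,
             /img/pack-1600.webp 1600w"
     sizes="(max-width: 600px) 100vw, 800px"
     alt="Hiking backpack on a rocky trail"
     width="800" height="600">

<!-- Format fallback: the browser uses the first source type it supports -->
<picture>
  <source type="image/avif" srcset="/img/pack.avif">
  <source type="image/webp" srcset="/img/pack.webp">
  <img src="/img/pack.jpg" alt="Hiking backpack on a rocky trail"
       width="800" height="600">
</picture>

<!-- Decorative image: empty alt so screen readers skip it -->
<img src="/img/divider.svg" alt="" width="800" height="16">
```

The `width` and `height` attributes give the browser the aspect ratio before the file arrives, which is what reserves the space.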
Video and audio
Native <video> and <audio> elements support
multiple <source> children for codec fallbacks. Always provide
captions or transcripts for video with speech: WCAG Success Criterion 1.2.2 requires
captions for prerecorded video with audio at Level A.
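A video with codec fallbacks and captions might be marked up as follows. The media paths and the WebVTT file are hypothetical.

```html
<video controls width="640" height="360" preload="metadata"
       poster="/media/pack-tour.jpg">
  <!-- The browser plays the first source whose type it supports -->
  <source src="/media/pack-tour.webm" type="video/webm">
  <source src="/media/pack-tour.mp4" type="video/mp4">
  <!-- Captions in WebVTT format; "default" turns them on unless the user opts out -->
  <track kind="captions" src="/media/pack-tour.en.vtt" srclang="en"
         label="English" default>
  <!-- Fallback content for browsers without video support -->
  <p>Your browser cannot play this video.
     <a href="/media/pack-tour.mp4">Download the MP4</a>.</p>
</video>
```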
Forms and user input
Forms are how users submit data. Wrap controls in <form> with an
explicit action and method (usually POST for
mutations). Every input needs a <label> associated via
for matching the input's id — clicking the label focuses
the field, and screen readers announce the relationship.
- type="text", email, url, tel: browser validation and mobile keyboards adapt to the type.
- type="checkbox" and radio: group radios with the same name; use fieldset and legend for clusters.
- type="submit": prefer a real button over JavaScript-only submission so the form works without JS.
- required, pattern, min/max: client-side hints; always re-validate server-side.
HTML5 added semantic inputs like date, number, and
search. Use them when they match the data — they reduce custom widget
code and improve mobile UX.
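Pulling the form guidance together, a small signup form could look like this. The action URL, field names, and date limit are assumptions for the sake of the example.

```html
<form action="/newsletter/subscribe" method="post">
  <!-- for/id pairing: clicking the label focuses the input -->
  <label for="email">Email address</label>
  <input id="email" name="email" type="email" autocomplete="email" required>

  <!-- Radios share a name so only one can be selected; fieldset/legend names the group -->
  <fieldset>
    <legend>Update frequency</legend>
    <input id="freq-weekly" name="frequency" type="radio" value="weekly" checked>
    <label for="freq-weekly">Weekly</label>
    <input id="freq-monthly" name="frequency" type="radio" value="monthly">
    <label for="freq-monthly">Monthly</label>
  </fieldset>

  <!-- Native date picker; min is only a client-side hint -->
  <label for="start">Start date</label>
  <input id="start" name="start" type="date" min="2026-01-01">

  <button type="submit">Subscribe</button>
</form>
```

Because the form posts to a real URL with a real submit button, it still works if scripts fail to load. The server should repeat every check the attributes imply.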
Head metadata that matters
Beyond charset and viewport, production pages typically include:
- Title — unique, descriptive, under ~60 characters when possible.
- Meta description — summary for search snippets; not a ranking factor but affects click-through.
- Canonical link — tells crawlers the preferred URL when duplicates exist (trailing slash, query params).
- Open Graph / Twitter tags — control how links preview on social platforms.
- JSON-LD — structured data for rich results; see our JSON-LD guide.
- Favicon — <link rel="icon"> for browser tabs.
Stylesheets belong in <head> so the browser can fetch CSS before
painting body content. Defer non-critical scripts with defer or
async attributes to avoid blocking HTML parsing.
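One way the metadata described above could be assembled is sketched here. Every URL, title, and file path is a placeholder, and which Open Graph properties you need depends on the platforms you target.

```html
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">

  <title>HTML fundamentals explained</title>
  <meta name="description"
        content="How browsers parse HTML, and how to write semantic, accessible markup.">
  <link rel="canonical" href="https://example.com/guides/html-fundamentals/">

  <!-- Social previews -->
  <meta property="og:title" content="HTML fundamentals explained">
  <meta property="og:type" content="article">
  <meta property="og:url" content="https://example.com/guides/html-fundamentals/">
  <meta property="og:image" content="https://example.com/img/html-guide-cover.png">
  <meta name="twitter:card" content="summary_large_image">

  <link rel="icon" href="/favicon.ico">

  <!-- CSS in the head so it can be fetched before first paint -->
  <link rel="stylesheet" href="/css/site.css">

  <!-- defer: download in parallel, run in order after parsing finishes -->
  <script src="/js/app.js" defer></script>
  <!-- async: download in parallel, run as soon as ready (order not guaranteed) -->
  <script src="/js/analytics.js" async></script>
</head>
```

Use `defer` for scripts that depend on the DOM or on each other, and reserve `async` for independent scripts such as analytics.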
Worked example: Harbor product detail page
Harbor Gear sells outdoor equipment. A product page needs crawlable structure, accessible forms, and fast images. Here is a simplified skeleton:
```html
<main>
  <article itemscope itemtype="https://schema.org/Product">
    <header>
      <h1 itemprop="name">Saltwind 40L Pack</h1>
      <!-- schema.org defines price on an Offer, so nest one under the Product -->
      <p itemprop="offers" itemscope itemtype="https://schema.org/Offer">
        <!-- Currency is an assumption based on the "$" sign -->
        <meta itemprop="priceCurrency" content="USD">
        <data itemprop="price" value="129.00">$129.00</data>
      </p>
    </header>
    <figure>
      <!-- width/height reserve space before the image loads -->
      <img src="/img/saltwind-pack.webp"
           alt="Saltwind 40L hiking backpack, slate gray"
           width="800" height="600" loading="lazy">
    </figure>
    <section aria-labelledby="features-heading">
      <h2 id="features-heading">Features</h2>
      <ul>
        <li>40L capacity, 1.2 kg empty weight</li>
        <li>Hydration sleeve fits 3L reservoir</li>
      </ul>
    </section>
    <!-- Plain POST form: works without JavaScript -->
    <form action="/cart/add" method="post">
      <label for="qty">Quantity</label>
      <input id="qty" name="quantity" type="number" min="1" max="10" value="1">
      <input type="hidden" name="sku" value="SG-PACK-40">
      <button type="submit">Add to cart</button>
    </form>
  </article>
</main>
```
One <main>, one <h1>, sections with labelled
headings, lazy-loaded image with dimensions, and a form that works even if JavaScript
fails. If the product image sits above the fold, drop loading="lazy" so it does not
delay Largest Contentful Paint. JSON-LD in the head can duplicate Product schema for
Google Merchant surfaces. CSS from a shared stylesheet handles layout; the markup stays
semantic, so responsive rules reflow the same HTML on mobile and desktop.
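A JSON-LD version of the Product data might look like the sketch below. The name and SKU come from the example page; the absolute image URL, the USD currency, and the in-stock availability are assumptions, so replace them with real values.

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Saltwind 40L Pack",
  "image": "https://example.com/img/saltwind-pack.webp",
  "sku": "SG-PACK-40",
  "offers": {
    "@type": "Offer",
    "price": "129.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}
</script>
```

Keep the values in sync with what the visible page shows; structured data that contradicts the rendered price is likely to be ignored or flagged.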
Element choice decision table
| You need | Use | Avoid |
|---|---|---|
| Site navigation block | <nav> inside <header> | Div soup with onclick handlers |
| Primary page content | <main> | Multiple mains or skipping main entirely |
| Blog post or guide body | <article> | Bare div with class "post" |
| Grouped subsection | <section> + heading | Section without a heading (orphan landmark) |
| Clickable button action | <button type="button"> | <div role="button"> without keyboard support |
| Navigate to URL | <a href> | Button that only runs location.href in JS |
| Tabular data | <table> with <th scope> | CSS grid pretending to be a data table |
| Layout-only wrapper | <div> | Semantic tag chosen for styling convenience |
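Three rows of the table are easiest to judge with markup in front of you. The sketch below shows a data table with scoped headers, a button for an in-page action, and a link for navigation. The second product row, the ids, and the URL are hypothetical.

```html
<!-- Tabular data: th with scope tells assistive tech which cells a header describes -->
<table>
  <caption>Pack comparison</caption>
  <thead>
    <tr>
      <th scope="col">Model</th>
      <th scope="col">Capacity</th>
      <th scope="col">Empty weight</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th scope="row">Saltwind 40L</th>
      <td>40 L</td>
      <td>1.2 kg</td>
    </tr>
    <!-- Hypothetical second row, for illustration only -->
    <tr>
      <th scope="row">Example 25L</th>
      <td>25 L</td>
      <td>0.8 kg</td>
    </tr>
  </tbody>
</table>

<!-- In-page action: a button, focusable and keyboard-operable by default -->
<button type="button" id="toggle-specs" aria-expanded="false" aria-controls="specs">
  Show full specs
</button>
<div id="specs" hidden>Full specification list.</div>

<!-- Navigation: a link, so it can be opened in a new tab, copied, and crawled -->
<a href="/packs/">Browse all packs</a>
```

The button still needs a script to toggle the `hidden` attribute and update `aria-expanded`; the markup only provides the accessible starting point.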
Anti-patterns
- Divitis: replacing every semantic tag with <div class="header"> loses landmarks; screen reader users cannot jump to main content.
- Multiple h1 tags: confuses the document outline. Keep one h1 per page; browsers and assistive technology do not demote an h1 just because it sits inside a nested article or section.
- Missing alt text: images without alt fail WCAG and hurt image search visibility.
- Inline styles everywhere: hard to maintain; use classes and external CSS. Inline is fine for one-off email or critical above-the-fold tweaks.
- Tables for layout: breaks responsive design and screen reader table navigation; use CSS Grid or Flexbox.
- Blocking scripts in head: synchronous JS without defer delays first paint and hurts LCP.
- Invalid nesting: interactive elements inside interactive elements (button inside link) create unpredictable behavior across browsers.
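A before-and-after sketch of several of these anti-patterns follows. The class names, paths, and link targets are made up for the example.

```html
<!-- Before: blocking script, div-only structure, missing alt, button inside a link -->
<script src="/js/app.js"></script>
<div class="header">
  <div class="nav"><a href="/">Home</a></div>
</div>
<div class="content">
  <img src="/img/saltwind-pack.webp">
  <a href="/cart/"><button>View cart</button></a>
</div>

<!-- After: deferred script, landmarks, alt text with dimensions, a plain link -->
<script src="/js/app.js" defer></script>
<header>
  <nav aria-label="Primary"><a href="/">Home</a></nav>
</header>
<main>
  <img src="/img/saltwind-pack.webp"
       alt="Saltwind 40L hiking backpack, slate gray"
       width="800" height="600">
  <a href="/cart/">View cart</a>
</main>
```

If the cart link needs to look like a button, style the `<a>` with CSS instead of nesting a `<button>` inside it.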
Production checklist
- Start with <!DOCTYPE html>, lang, charset, and viewport meta.
- One <main>, one logical <h1>, heading hierarchy without skips.
- Unique <title>, meta description, and canonical URL per page.
- All images have appropriate alt; decorative images use alt="".
- Forms: labels on every control; server-side validation mirrors client hints.
- Reserve image dimensions; lazy-load below-the-fold media.
- Validate markup with the W3C validator or IDE linting in CI.
- Run keyboard-only and screen reader smoke tests on key flows.
- Keep HTML declarative; push behavior to progressive-enhancement JavaScript.
Key takeaways
- HTML describes structure and meaning; CSS handles presentation and JavaScript adds behavior.
- Semantic elements improve SEO, accessibility, and maintainability compared to div-only markup.
- The head carries encoding, viewport, title, canonical, and social metadata that affect discovery and sharing.
- Forms and links should work without JavaScript as a baseline; enhance from there.
- Good HTML is the foundation for fast rendering, valid structured data, and HTTP-delivered content that crawlers can index.
Related reading
- Browser critical rendering path explained — how HTML becomes pixels in the browser
- Responsive web design explained — mobile-first layout on top of semantic HTML
- Web accessibility (a11y) explained — WCAG, ARIA, and keyboard patterns beyond semantic tags
- SEO fundamentals explained — titles, canonicals, and structured data that start in your HTML head