Guide
Game shaders explained
A shader is a small program that runs on your graphics card (GPU) and decides how 3D geometry becomes colored pixels on screen. Shaders drive the glossy metal on a sci-fi rifle, the rippling water in an open world, the cel-shaded outline on a stylized hero, and the subtle fog that hides LOD pop-in at distance. Modern games rarely draw triangles with flat colors — every surface passes through a rendering pipeline of shader stages that transform vertices, sample textures, compute lighting, and apply post-effects. This guide explains how that pipeline works, the difference between vertex and fragment shaders, how physically based rendering (PBR) became the industry default, when to use node-based shader graphs vs handwritten GLSL, and the performance rules that keep browser and mobile games playable.
The GPU graphics pipeline in plain terms
Your CPU prepares a frame: which meshes to draw, where the camera sits, which lights are active. It sends draw calls to the GPU — batches of triangles with associated material data. The GPU then runs a fixed sequence of stages, several of which are programmable via shaders:
- Vertex processing — each corner of a mesh is transformed from model space (local coordinates) through world space and into clip space (what the camera sees). Normals and texture coordinates (UVs) are transformed alongside positions.
- Rasterization — the GPU fills in the triangles, producing a grid of fragments (potential pixels). This stage is fixed-function; you do not write code for it directly.
- Fragment processing — for each fragment, a fragment shader (historically called a pixel shader) decides the final color, often by sampling textures and evaluating lighting.
- Output merging — depth testing, blending with existing pixels, and writing to the framebuffer.
Shaders execute thousands of times in parallel — once per vertex or once per fragment. That parallelism is why GPUs excel at graphics and why a few lines of sloppy shader code can cost milliseconds when multiplied across a 4K display. Your game loop budget for rendering is shared with physics, AI, and networking; shader cost shows up directly in frame time.
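To put rough numbers on that multiplication, here is a back-of-the-envelope sketch in Python. The throughput figure and the count of extra operations are assumptions chosen for illustration, not measurements of any real GPU.

```python
# Rough fragment-cost estimate. The GPU throughput below is a hypothetical
# round number, not a benchmark of real hardware.

def fragments_per_frame(width: int, height: int) -> int:
    """Fragment shader invocations needed to cover the screen exactly once."""
    return width * height

def added_ms(extra_ops_per_fragment: int, fragments: int, ops_per_second: float) -> float:
    """Extra frame time in milliseconds caused by additional per-fragment work."""
    return extra_ops_per_fragment * fragments / ops_per_second * 1000.0

frags_4k = fragments_per_frame(3840, 2160)  # 3840 * 2160 = 8,294,400
# Assumption: 200 wasted operations per fragment on a GPU that sustains
# 1e12 simple shader operations per second.
cost = added_ms(200, frags_4k, 1e12)
print(f"{frags_4k:,} fragments, +{cost:.2f} ms per frame")
```

Under those assumptions the extra work costs about 1.66 ms, roughly a tenth of a 16.7 ms frame at 60 FPS, before any overdraw is counted.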
Vertex shaders: placement and deformation
A vertex shader receives one vertex at a time and outputs its position in clip space plus any data the fragment shader will need: typically the normal, UV coordinates, vertex color, and tangent vectors for normal mapping. The rasterizer interpolates these outputs across each triangle before the fragment stage reads them.
The minimum job is matrix multiplication: multiply the vertex position by model, view, and projection matrices so the mesh appears in the correct place on screen. Engines (Unity, Unreal, Godot) inject these matrices as uniforms — values constant across all vertices in a draw call.
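The same multiplication can be sketched on the CPU. This Python version assumes column vectors, row-major nested lists, and an OpenGL-style perspective matrix; the helper names are illustrative and do not belong to any engine's API.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def mat_vec(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def translation(tx, ty, tz):
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def perspective(fov_y_deg, aspect, near, far):
    """OpenGL-style projection: camera looks down -Z, clip w equals -z."""
    f = 1.0 / math.tan(math.radians(fov_y_deg) / 2.0)
    return [[f / aspect, 0.0, 0.0, 0.0],
            [0.0, f, 0.0, 0.0],
            [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
            [0.0, 0.0, -1.0, 0.0]]

IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def to_clip_space(position, model, view, projection):
    """What a minimal vertex shader does: projection * view * model * position."""
    mvp = mat_mul(projection, mat_mul(view, model))
    return mat_vec(mvp, [position[0], position[1], position[2], 1.0])

# A vertex at x = 1 on a mesh placed 5 units in front of the camera.
clip = to_clip_space((1.0, 0.0, 0.0), translation(0.0, 0.0, -5.0),
                     IDENTITY, perspective(90.0, 1.0, 1.0, 10.0))
ndc = [c / clip[3] for c in clip[:3]]  # perspective divide, done by the GPU
print(clip, ndc)
```

In a real engine the three matrices arrive as uniforms and the GPU performs the perspective divide after the vertex stage; the sketch only makes the order of operations visible.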
Vertex shaders also enable effects that do not belong in the fragment stage:
- Skeletal animation — blend bone weights to deform a mesh each frame, feeding data from your animation blending system.
- GPU particles — expand a single point into a billboard quad facing the camera.
- Displacement — offset vertices along normals using a height map for terrain detail without extra geometry.
- Wind and waves — sinusoidal offsets on foliage or water surface vertices.
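As an illustration of the last item, a sinusoidal vertex offset can be sketched like this. It is a CPU-side Python stand-in for what the vertex shader would compute; the parameter names and default values are assumptions.

```python
import math

def wave_offset(position, time, amplitude=0.1, wavelength=2.0, speed=1.0):
    """Displace a vertex vertically with a sine wave travelling along +x.

    position is an (x, y, z) tuple in model space. amplitude, wavelength
    and speed are hypothetical material parameters.
    """
    x, y, z = position
    phase = 2.0 * math.pi * (x / wavelength - speed * time)
    return (x, y + amplitude * math.sin(phase), z)

# The same vertex at two moments: the crest moves as time advances.
print(wave_offset((0.5, 0.0, 0.0), time=0.0))
print(wave_offset((0.5, 0.0, 0.0), time=0.25))
```

In a shader, time would be a uniform updated once per frame, so the whole surface animates without the CPU touching a single vertex.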
Because vertex shaders run once per vertex (not per pixel), they are usually cheap relative to fragment work: a mesh seen up close covers far more pixels than it has vertices. Moving math from fragment to vertex shaders is a classic optimization when the result can be interpolated smoothly across a triangle.
Fragment shaders: color, light, and material
The fragment shader runs once per fragment (roughly once per pixel, before depth rejection). It is where most of the visual identity of a game is authored. Typical inputs arrive interpolated from the vertex stage: UV coordinates, normals, world position. The shader samples textures — 2D images wrapped onto the mesh — and combines them with lighting equations.
From flat shading to PBR
Early real-time graphics used Phong or Blinn-Phong lighting: a diffuse term (Lambert cosine law) plus a specular highlight controlled by a shininess exponent. Artists tuned separate ambient, diffuse, and specular colors per material. It looked acceptable in controlled scenes but broke under varying light conditions — the same albedo texture could look plastic indoors and chalky outdoors.
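For reference, the Blinn-Phong model described above can be sketched as follows. This is a minimal single-light version in Python; the function and parameter names are illustrative, and real shaders add attenuation, multiple lights, and clamping.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return v if length == 0.0 else tuple(c / length for c in v)

def blinn_phong(normal, light_dir, view_dir, ambient, diffuse, specular, shininess):
    """Ambient + Lambert diffuse + Blinn-Phong specular for one light.

    light_dir and view_dir point away from the surface. The three colors
    are RGB tuples tuned per material, as in the classic workflow.
    """
    n = normalize(normal)
    l = normalize(light_dir)
    v = normalize(view_dir)
    h = normalize(tuple(a + b for a, b in zip(l, v)))  # half vector
    n_dot_l = max(dot(n, l), 0.0)                      # Lambert cosine term
    # No highlight on surfaces that face away from the light.
    spec = max(dot(n, h), 0.0) ** shininess if n_dot_l > 0.0 else 0.0
    return tuple(a + d * n_dot_l + s * spec
                 for a, d, s in zip(ambient, diffuse, specular))

head_on = blinn_phong((0, 0, 1), (0, 0, 1), (0, 0, 1),
                      ambient=(0.05, 0.05, 0.05), diffuse=(0.8, 0.2, 0.2),
                      specular=(0.1, 0.1, 0.1), shininess=32)
print(head_on)
```

The separate ambient, diffuse, and specular colors are exactly the knobs that made materials hard to reuse across lighting conditions.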
Physically based rendering (PBR) standardizes materials around parameters that approximate real-world optics:
- Albedo / base color — the surface color without lighting (no shadows baked in).
- Metallic — 0 for dielectrics (wood, skin), 1 for metals; metals have no diffuse component.
- Roughness — microsurface scatter; 0 is a mirror, 1 is fully rough.
- Normal map — perturbs surface normals from a texture to fake fine detail without extra polygons.
- Ambient occlusion (AO) — darkens creases where indirect light is blocked.
PBR uses a bidirectional reflectance distribution function (BRDF), commonly Cook-Torrance with a GGX normal distribution. The details are math-heavy, but the workflow benefit is consistency: an asset authored once reacts believably under different lighting setups. Unreal's default lit shader, Unity's HDRP/URP lit shaders, and glTF's standard material all follow this model, which simplifies asset pipelines and marketplace sharing.
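Two building blocks of that BRDF are small enough to sketch here: the GGX normal distribution term and the metallic-workflow split into diffuse color and base reflectance. The Python below assumes the common conventions alpha = roughness squared and a dielectric reflectance of 0.04. Engines differ in the details, so treat it as an illustration rather than any engine's exact shader.

```python
import math

def ggx_distribution(n_dot_h: float, roughness: float) -> float:
    """GGX (Trowbridge-Reitz) normal distribution term D."""
    # Assumption: perceptual roughness is squared to get alpha, and clamped
    # so that a roughness of 0 does not divide by zero.
    alpha = max(roughness, 1e-3) ** 2
    a2 = alpha * alpha
    denom = n_dot_h * n_dot_h * (a2 - 1.0) + 1.0
    return a2 / (math.pi * denom * denom)

def fresnel_schlick(v_dot_h: float, f0: float) -> float:
    """Schlick approximation of Fresnel reflectance for one channel."""
    return f0 + (1.0 - f0) * (1.0 - v_dot_h) ** 5

def split_metallic(albedo, metallic: float):
    """Return (diffuse_color, f0) for the metallic-roughness workflow."""
    diffuse = tuple(c * (1.0 - metallic) for c in albedo)          # metals: no diffuse
    f0 = tuple(0.04 * (1.0 - metallic) + c * metallic for c in albedo)
    return diffuse, f0

# Rougher surfaces spread the highlight: the peak of D drops as roughness rises.
print(ggx_distribution(1.0, 0.5), ggx_distribution(1.0, 1.0))
print(split_metallic((1.0, 0.8, 0.3), metallic=1.0))
```

A full Cook-Torrance specular term also needs a geometry (shadowing-masking) factor and the 4 (n·l)(n·v) denominator, which are omitted here for brevity.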
Specialized fragment techniques
Beyond standard PBR, fragment shaders implement stylized looks and effects:
- Cel shading — quantize lighting into bands for a comic or anime look.
- Dissolve and hit-flash — clip pixels based on noise thresholds for damage feedback.
- Screen-space effects — sample neighboring depth or color buffers for outlines, blur, or ambient occlusion approximations.
- Decals and projected textures — project blood splats or graffiti onto arbitrary surfaces.
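The first two techniques reduce to a few lines of per-fragment math. The Python sketch below shows one possible formulation; the band count and the threshold convention are assumptions for illustration.

```python
import math

def cel_shade(n_dot_l: float, bands: int = 3) -> float:
    """Quantize a lighting term into `bands` flat steps (bands >= 2)."""
    clamped = min(max(n_dot_l, 0.0), 1.0)
    index = min(math.floor(clamped * bands), bands - 1)
    return index / (bands - 1)

def dissolve_visible(noise_value: float, threshold: float) -> bool:
    """Keep a fragment only while its noise value is at or above the threshold.

    In a shader the False case would discard the fragment. Raising the
    threshold from 0 to 1 over time makes the surface dissolve.
    """
    return noise_value >= threshold

# Smooth lighting collapses into three flat bands: 0.0, 0.5 and 1.0.
print([cel_shade(x / 10.0) for x in range(11)])
print(dissolve_visible(0.3, 0.5), dissolve_visible(0.7, 0.5))
```

In practice the noise value comes from a texture sample, so the dissolve pattern follows the art rather than a formula.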
Shader languages and authoring workflows
GLSL, HLSL, and WGSL
Shaders are written in API-specific shading languages. GLSL (OpenGL Shading Language) powers WebGL and many mobile OpenGL ES pipelines. HLSL (High Level Shading Language) is Microsoft's dialect used by DirectX and Xbox. WGSL (WebGPU Shading Language) is the shading language of WebGPU, designed for safer memory rules and cross-platform consistency.
Syntax differs slightly (vec3 vs float3, GLSL's varying qualifier vs HLSL's semantics on interpolated values), but concepts transfer. A fragment shader that samples a diffuse map and applies a directional light in GLSL ports to HLSL mostly by swapping types, built-in function names, and semantics.
Shader graphs vs handwritten code
Unity Shader Graph, Unreal Material Editor, and Godot VisualShader let artists wire nodes instead of typing code. Benefits: live preview, reusable subgraphs, and accessibility for designers. Costs: graph bloat, harder version control diffs, and occasional hidden complexity when a node expands to dozens of instructions.
Handwritten shaders remain essential for unique effects, extreme optimization, and platforms where graph export is opaque. Many teams use graphs for hero props and code for foliage, water, and post-processing passes where every instruction counts.
Compute shaders
Beyond vertex and fragment stages, compute shaders run general parallel workloads on the GPU: fluid simulation, culling, texture compression, procedural mesh generation. They do not fit the classic raster pipeline but share the same hardware. In the browser, compute shaders are available through WebGPU, and support is growing; where WebGPU is unavailable, heavy compute often stays on the CPU or uses WebAssembly with SIMD for parallel tasks.
WebGL and browser game constraints
Browser games typically render through WebGL (WebGL 1.0 is based on OpenGL ES 2.0, WebGL 2.0 on OpenGL ES 3.0) or increasingly WebGPU. Neither WebGL version exposes compute shaders, and WebGL 1.0 also has limited texture formats. WebGL 2.0 makes instancing and multiple render targets core features (they are extensions in WebGL 1.0) and adds 3D textures, all useful for deferred techniques and post-processing.
Practical constraints for web shaders:
- Shader compile time: compiling GLSL on first use causes hitches; compile and link programs during loading screens.
- Precision qualifiers: mobile GPUs may use lower precision (mediump) unless you force highp; banding on gradients is a common symptom.
- Texture size limits: often 4096 or 8192 px max; atlasing small sprites saves draw calls.
- No geometry or tessellation shaders: neither WebGL 1 nor WebGL 2 exposes these stages, so geometry expansion must be simulated in vertex shaders or on the CPU.
- Thermal throttling on phones: sustained GPU load reduces clock speed; shader complexity affects battery and heat, not just FPS.
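The precision item is easy to demonstrate off the GPU. The sketch below assumes mediump is implemented as a 16-bit float, which is common on mobile hardware but not guaranteed by the specification, and uses Python's half-precision struct format to mimic it.

```python
import struct

def to_half(x: float) -> float:
    """Round-trip a value through IEEE 754 half precision (16-bit float)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

size = 4096  # texture width in texels

# UV coordinates at the centers of two neighbouring texels near the right edge.
u_a = (4093 + 0.5) / size
u_b = (4094 + 0.5) / size
print(u_a != u_b, to_half(u_a) == to_half(u_b))  # distinct inputs, same half value

# How many distinct coordinates survive across the right half of the texture?
distinct = len({to_half((i + 0.5) / size) for i in range(size // 2, size)})
print(f"{distinct} distinct values for {size // 2} texels")
```

Between 0.5 and 1.0 a half float advances in steps of 1/2048, so about half of the texel centers in that range collapse onto a neighbour. The same loss of resolution is what shows up as banding when a smooth gradient is computed at low precision.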
Libraries like Three.js, Babylon.js, and PixiJS wrap WebGL with material systems. You can still inject custom shader code when stock PBR is not enough, for example through Three.js's onBeforeCompile material hook or raw shader chunks, to add a toon outline pass or a scrolling UV effect on a casino felt table.
Performance: where shaders hurt frame rate
Shader cost scales with pixels shaded, not just triangle count. Overdraw, where the same pixel is shaded several times in one frame (stacked transparent layers are the usual culprit), multiplies fragment shader work. Common bottlenecks:
- Expensive fragment math: multiple texture samples, dynamic loops, and per-pixel noise functions add up on fill-rate limited GPUs (most mobile devices).
- Too many draw calls: each unique shader/material combination may break batching; instancing and texture atlases reduce state changes.
- Full-screen post passes: bloom, depth of field, and motion blur each read the entire framebuffer; combine passes where possible.
- Branch divergence: if statements on per-pixel data can serialize threads on some GPU architectures; prefer step functions and lerp blends.
- Unused shader variants: Unity and Unreal generate keyword permutations; stripping unused variants shrinks build size and compile time.
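The step-and-lerp pattern from the branch divergence item looks like this when written out. The Python functions mirror GLSL's step and mix (lerp in HLSL); the snow-line scenario and its names are made up for illustration.

```python
def step(edge: float, x: float) -> float:
    """Like GLSL step(): 0.0 when x < edge, otherwise 1.0.

    Python needs a conditional here; on a GPU this is a built-in that
    typically compiles to a select rather than a branch.
    """
    return 0.0 if x < edge else 1.0

def mix(a, b, t: float):
    """Like GLSL mix() / HLSL lerp(): blend two colors by t."""
    return tuple(x * (1.0 - t) + y * t for x, y in zip(a, b))

ROCK = (0.4, 0.35, 0.3)
SNOW = (0.95, 0.95, 1.0)

def shade_branchy(height: float, snow_line: float):
    if height >= snow_line:  # neighbouring pixels may take different paths
        return SNOW
    return ROCK

def shade_branchless(height: float, snow_line: float):
    # Every pixel runs the same instructions; only the blend weight differs.
    return mix(ROCK, SNOW, step(snow_line, height))

for h in (0.0, 4.9, 5.0, 8.0):
    print(h, shade_branchy(h, 5.0), shade_branchless(h, 5.0))
```

Swapping step for smoothstep would also soften the transition, which a hard branch cannot do. Whether the branchless form is actually faster depends on the GPU and compiler, so profile before rewriting.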
Profile with GPU tools: RenderDoc for desktop, Xcode GPU Frame Debugger for iOS, browser WebGL inspector extensions. If fragment time dominates, simplify materials, reduce render resolution with upscaling, or apply LOD so distant objects use cheaper shader variants with fewer texture lookups.
Debugging shaders that look wrong
Shader bugs are visual, not textual. A black mesh might mean wrong matrix order, backface culling with inverted normals, or a texture bound to the wrong unit. A pink/magenta material in Unity usually means a compile error in the shader pass.
Systematic debugging:
- Output intermediate values as color — visualize normals as RGB, UVs as gradient, roughness as grayscale.
- Strip lighting — return albedo only to confirm textures and UVs are correct before adding BRDF math.
- Check coordinate spaces — mixing world-space normals with view-space light directions produces nonsense highlights.
- Verify gamma — sRGB textures must be sampled with correct color space or surfaces look too dark or washed out.
- Compare against reference — render the same glTF model in a viewer vs your engine to isolate pipeline differences.
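Two of these checks rest on small formulas that are worth having at hand: the mapping used to view normals as colors, and the standard sRGB decoding curve. A Python sketch of both, with illustrative function names:

```python
def normal_to_color(normal):
    """Map unit normal components from [-1, 1] to displayable [0, 1] RGB."""
    return tuple(0.5 * c + 0.5 for c in normal)

def srgb_to_linear(c: float) -> float:
    """Decode one sRGB-encoded channel (0..1) to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4

# A normal pointing straight out of the surface shows as the familiar
# light blue of tangent-space normal maps.
print(normal_to_color((0.0, 0.0, 1.0)))

# Mid-gray in an sRGB texture is much darker in linear light. Lighting math
# applied to the undecoded value is one way surfaces end up washed out.
print(round(srgb_to_linear(0.5), 3))
```

GPUs perform this decode automatically when a texture is bound with an sRGB format; the function is mainly useful for checking values by hand.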
Production checklist
- Pick a lighting model early (PBR metallic-roughness is the safe default) and stick to it across the art team.
- Author textures at power-of-two sizes with mipmaps; use compression formats appropriate per platform (BC7 desktop, ASTC mobile).
- Define shader LOD tiers — full PBR near camera, simplified diffuse+normal mid-range, unlit impostors far away.
- Batch materials sharing the same shader; minimize unique uniforms per frame.
- Precompile shaders at load; cache compiled binaries on desktop platforms that support it.
- Document render queue order for transparent objects, UI overlays, and post-processing.
- Test on a mid-tier phone, not only developer laptops with discrete GPUs.
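The shader LOD tiers in the checklist can be as simple as a distance lookup. This Python sketch uses hypothetical tier names and thresholds; real projects usually tune them per platform and often switch on screen-space size rather than raw distance.

```python
def shader_tier(distance: float, near_limit: float = 20.0, mid_limit: float = 80.0) -> str:
    """Pick a shader variant from camera distance (thresholds are placeholders)."""
    if distance < near_limit:
        return "full_pbr"          # all maps, full BRDF
    if distance < mid_limit:
        return "diffuse_normal"    # fewer texture lookups
    return "unlit_impostor"        # flat billboard, no lighting math

for d in (5.0, 50.0, 200.0):
    print(d, shader_tier(d))
```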
Key takeaways
- Vertex shaders place and deform geometry; fragment shaders decide final pixel color and lighting.
- PBR (albedo, metallic, roughness, normal maps) is the standard material model for consistent lighting across scenes.
- Shader graphs accelerate authoring; handwritten shaders remain necessary for custom effects and tight optimization.
- Browser games use WebGL/WebGPU with stricter limits on precision, compile time, and thermal headroom.
- Performance is usually fill-rate bound — reduce overdraw, simplify fragment math, and tier shader complexity by distance.
Related reading
- Game LOD explained — mesh and texture LOD pairs naturally with shader complexity tiers
- Game animation blending explained — skeletal data fed into vertex shaders each frame
- Game physics explained — simulation runs on CPU; shaders visualize the results
- WebAssembly explained — near-native compute when GPU compute is unavailable in browser