Plotly fundamentals explained

You pass a pandas DataFrame to px.line(df, x="date", y="revenue", color="region") and a chart appears in your notebook: hover tooltips, zoom boxes, legend toggles, and pan — no JavaScript required. That is Plotly in practice: a Python-first visualization stack built on a declarative JSON figure schema rendered by Plotly.js in the browser. Analysts use it for exploratory charts; ML engineers embed confusion matrices and SHAP summaries in reports; product teams ship standalone HTML files stakeholders can open without a Jupyter install. Plotly sits between quick static plots (Matplotlib, Seaborn) and full dashboard frameworks (Streamlit, Dash). This guide covers Plotly Express versus Graph Objects, the figure/trace/layout model, interactivity and subplots, themes and annotations, export paths, embedding in notebooks and web apps, a Harbor Analytics funnel dashboard worked example, a tooling decision table, common pitfalls, and a production checklist. Pair it with our Python fundamentals guide and machine learning overview when building end-to-end analytics pipelines.

What Plotly is (and how it differs from Matplotlib or Altair)

Plotly (the open-source plotly Python package) serializes chart definitions to JSON and renders them interactively via Plotly.js. Matplotlib's default output is a static image; Plotly figures are live elements in the page (SVG or WebGL), so users zoom into outliers, hide series from the legend, and download PNG snapshots from the mode bar. Altair compiles Vega-Lite specs for fans of a declarative grammar of graphics; Plotly optimizes for batteries-included interactivity and a large catalog of trace types (3D surfaces, geographic choropleths, candlesticks, Sankey diagrams) with minimal schema learning.

The ecosystem splits into layers:

  • Plotly Express (px) — one-liner API over tidy DataFrames; best for 80% of EDA charts.
  • Graph Objects (go) — low-level control over every trace attribute; required for custom annotations and mixed chart types.
  • Figure Widgets (FigureWidget) — Jupyter-only bidirectional updates for linked brushing in notebooks.
  • Plotly Dash — separate framework for multi-page web dashboards with callbacks; uses the same figure JSON under the hood.
  • Kaleido / Orca — static image export engines for CI and slide decks; Kaleido is the current engine, and Orca is the deprecated legacy option.

Install with pip install plotly pandas kaleido. Kaleido is optional but needed for headless PNG/PDF export in Docker and GitHub Actions.

The figure model: traces, layout, and frames

Every Plotly chart is a Figure containing one or more traces (data layers) plus a layout (axes, titles, margins, legends, templates). Understanding this split prevents the common mistake of trying to set axis titles on a trace instead of fig.update_layout().

Plotly Express patterns

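import plotly.express as px

# df is assumed to be a tidy DataFrame with one row per campaign
# and the columns named below.
fig = px.scatter(
    df,
    x="ad_spend",
    y="conversions",
    color="channel",        # one trace per channel
    size="impressions",     # marker area scales with impressions
    hover_data=["campaign_id", "ctr"],
    trendline="ols",        # requires the statsmodels package
    title="Spend vs conversions by channel",
)
fig.update_layout(hovermode="closest")
fig.show()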

Each Express function fixes the trace type (px.scatter, px.line, px.bar, and so on); its arguments decide how the data is split into traces: color splits series, facet_col builds small multiples, animation_frame adds a slider over time. The returned object is a full go.Figure you can still mutate with update_traces and update_layout.

Graph Objects when Express is not enough

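import plotly.graph_objects as go
from plotly.subplots import make_subplots

# df is assumed to hold daily OHLC data in columns date, o, h, l, c, volume.
fig = make_subplots(rows=2, cols=1, shared_xaxes=True,
                    row_heights=[0.7, 0.3], vertical_spacing=0.05)
# Top panel: price candles. Bottom panel: traded volume.
fig.add_trace(go.Candlestick(x=df.date, open=df.o, high=df.h,
                             low=df.l, close=df.c), row=1, col=1)
fig.add_trace(go.Bar(x=df.date, y=df.volume, name="Volume"), row=2, col=1)
# Candlestick adds a range slider by default; hide it to save vertical space.
fig.update_layout(xaxis_rangeslider_visible=False)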

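Use make_subplots for stacked panels with independent y-axes. Mix trace types freely: scatter plus bar plus heatmap on one figure. When revenue (left axis) and margin percent (right axis) share the same x dimension, declare the second axis with specs=[[{"secondary_y": True}]] in make_subplots, then pass secondary_y=True to add_trace for the series that belongs on the right.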

Interactivity, styling, and accessibility

Default interactivity includes box zoom, autoscale, hover tooltips, and legend click-to-hide. Tune behavior explicitly:

  • hovermode="x unified" — one tooltip listing all series at the same x (great for time series).
  • fig.update_traces(hovertemplate="%{y:.1f}%<br>%{x|%b %Y}") — custom tooltip formatting without HTML injection risks.
  • config={"displayModeBar": False} — hide the mode bar when embedding in Streamlit sidebars.
  • fig.update_layout(dragmode="select") plus callback in Dash — filter tables by brushed region.

Themes and color

Built-in templates (plotly, plotly_white, plotly_dark, ggplot2, seaborn) set fonts, gridlines, and default colors. Apply with fig.update_layout(template="plotly_white") or globally via px.defaults.template. For brand consistency, define a custom template once and reuse it across reports. For categorical colors, use color_discrete_sequence in Express. For continuous scales, color by a numeric column: color_continuous_scale="Viridis" in Express, or marker=dict(color=<numeric column>, colorscale="Viridis") in Graph Objects. Always check contrast: Plotly does not enforce WCAG, so test charts with color-blind simulators when status colors carry meaning.

Annotations and shapes

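Add vertical event lines with fig.add_vline(x="2026-01-15", annotation_text="Launch"), shaded recession bands with add_vrect, and text callouts with fig.add_annotation(). Some plotly releases have raised an error when annotation_text is combined with a date-string x; if that happens, draw the line with add_vline and add the label separately with add_annotation. Shapes and annotations are layout objects, positioned through xref and yref either in an axis's data coordinates or in paper coordinates (0 to 1 across the plot area). Picking the wrong reference is a frequent source of misaligned labels, and on log-scale axes an annotation's position must be given as log10 of the data value.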

Export, embedding, and deployment

Standalone HTML

fig.write_html("report.html", include_plotlyjs="cdn") produces a single file stakeholders open in any browser. Use include_plotlyjs=True for air-gapped environments (larger file). auto_open=False suits CI pipelines that upload artifacts to S3.

Static images

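fig.write_image("chart.png", width=1200, height=600, scale=2) requires Kaleido. Pin the kaleido version in requirements, because the major versions differ in what they need on the host: Kaleido 0.2.x bundles its own browser engine, while Kaleido 1.x expects a Chrome or Chromium install. Headless servers need no display either way. For slide decks, export SVG when vector text matters; use PNG for email thumbnails.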

Jupyter, Streamlit, and FastAPI

  • Jupyter — fig.show() in VS Code or classic notebook; use FigureWidget for two-way selection.
  • Streamlit — st.plotly_chart(fig, use_container_width=True, config=...); cache the DataFrame, not necessarily the figure, when data is large.
  • FastAPI — return fig.to_json() to a React front end using react-plotly.js, or serve prebuilt HTML fragments.

Performance tip: downsample million-row series before plotting. Aggregate with something like df.resample("1h").mean() (older pandas releases document the alias as "1H"), or switch to Plotly's WebGL trace types such as Scattergl for dense point clouds. Rendering 500k markers in SVG mode can freeze the browser tab.

Worked example: Harbor Analytics conversion funnel dashboard

Harbor Analytics product managers needed to compare weekly funnel drop-off across acquisition channels without waiting for a BI ticket queue. The Plotly layer of their internal toolkit:

  • Funnel chart — px.funnel on aggregated stage counts (visit, signup, activate, pay) with color="channel" and hover_data=["median_hours_to_next"].
  • Cohort heatmap — px.imshow on a pivot of week-by-week retention; diverging colorscale centered at industry benchmark.
  • Drill-down scatter — Graph Objects scatter with customdata campaign IDs; Dash callback filters a DataTable when users box-select outliers.
  • Executive export — Monday cron runs write_html plus write_image for the all-hands deck; same Python module powers the live Dash app.
  • Shared theme — company template JSON sets fonts and primary green to match the public marketing site.

The funnel replaced a static spreadsheet that was always three days stale. PMs could hover a channel, see median time-to-convert, and paste a PNG into Slack in under a minute. Once the metric definitions stabilized, the SQL behind the charts moved into the Streamlit revenue dashboard for executives who preferred filters over Dash's callback model.

Tooling decision table

Goal | Favor | Avoid
Quick EDA in a notebook | Plotly Express on tidy DataFrames | Graph Objects boilerplate for a simple line chart
Publication-quality static figure for a paper | Matplotlib or Seaborn with explicit typographic control | Plotly when vector font embedding and LaTeX labels are mandatory
Shareable interactive HTML report | Plotly write_html with CDN plotly.js | Notebook-only show() when recipients lack Jupyter
Multi-page dashboard with linked filters | Plotly Dash or Streamlit with st.plotly_chart | Standalone Express charts without a hosting layer
Embedded charts in a custom React product | fig.to_json() + react-plotly.js | Iframe-heavy HTML exports with mismatched sizing
Millisecond-updating real-time telemetry | Specialized time-series front ends or uPlot | Plotly full redraw on every WebSocket tick
Grammar-of-graphics composition in Python | Altair or ggplot (plotnine) | Plotly when you want Vega-Lite’s compile-time validation

Common pitfalls

  • Plotting raw millions of rows — resample or aggregate first; use WebGL trace types when density is the message.
  • Setting axis properties on traces — titles, ranges, and log scales belong in update_layout or update_xaxes.
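  • Forgetting Kaleido in CI — write_image raises an error at runtime when the export engine is missing; pin the dependency and exercise the export in the pipeline so the failure shows up before the Monday report does.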
  • Hard-coded hex colors per series — breaks when categories change; map categories to a discrete sequence instead.
  • Mixing timezone-naive and aware datetimes — x-axis gaps or misordered points; normalize to UTC in pandas before px.
  • Huge self-contained HTML files — include_plotlyjs=True on every chart in a bundle embeds a multi-megabyte copy of plotly.js each time; prefer CDN or one shared script tag.
  • Overusing 3D charts — readability drops; 2D faceting plus color often communicates better.
  • Dash and Streamlit on the same port — pick one hosting model per app; do not nest Dash inside Streamlit without an iframe strategy.
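For the timezone pitfall, a small normalization helper applied before any px call keeps the x-axis consistent. This is a sketch: the helper name to_utc is hypothetical, and it assumes naive timestamps are already in UTC, which may not hold for your sources.

```python
import pandas as pd


def to_utc(series: pd.Series) -> pd.Series:
    """Return a tz-aware UTC series. Assumption: naive values are already UTC."""
    s = pd.to_datetime(series)
    if s.dt.tz is None:
        return s.dt.tz_localize("UTC")
    return s.dt.tz_convert("UTC")


# Two hypothetical sources: one naive, one with a +02:00 offset.
naive = pd.Series(pd.to_datetime(["2025-03-01 09:00", "2025-03-01 10:00"]))
aware = pd.Series(pd.to_datetime(["2025-03-01 11:30+02:00"]))

# Normalize each source, then combine and sort on a single UTC timeline.
events = pd.concat([to_utc(naive), to_utc(aware)], ignore_index=True).sort_values()
print(events.tolist())
```

Normalizing per source, before concatenation, matters: combining naive and aware values first tends to produce object-typed columns that sort and plot unpredictably.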

Production checklist

  • Pin plotly and kaleido versions in requirements.txt or lockfile.
  • Define a shared layout template (fonts, margins, color sequence) in one module.
  • Document hover templates and units in chart titles or subtitles.
  • Downsample or aggregate time series above ~50k points per trace.
  • Test HTML exports in Chrome and Safari; verify mode bar does not overlap content on mobile.
  • Run color-blind checks on status and threshold coloring.
  • CI smoke test: build one figure from fixture data and assert write_image succeeds.
  • Separate data SQL from figure code; inject DataFrames via parameterized queries.
  • Log figure generation latency; alert when warehouse queries dominate render time.
  • Version-control example figures or snapshot tests only when definitions are stable.

Key takeaways

  • Plotly Express covers most exploratory charts from tidy pandas tables; drop to Graph Objects for mixed subplots and custom annotations.
  • Figures are JSON documents rendered in the browser — interactivity is the default, not an add-on.
  • write_html and write_image bridge notebooks to stakeholders who never run Python.
  • Pair Plotly with Streamlit or Dash when filters and multipage layout matter; keep SQL and metric definitions in shared modules.
  • Performance and accessibility require deliberate choices — downsample data, tune templates, and validate color contrast.
