Plotly fundamentals explained
You pass a pandas DataFrame to px.line(df, x="date", y="revenue", color="region") and
a chart appears in your notebook with hover tooltips, box zoom, legend toggles, and
pan, no JavaScript required. That is Plotly in practice: a Python-first visualization
stack built on a declarative JSON figure schema rendered by Plotly.js in the browser.
Analysts use it for exploratory charts; ML engineers embed confusion matrices and SHAP
summaries in reports; product teams ship standalone HTML files stakeholders can open
without a Jupyter install. Plotly sits between quick static plots (Matplotlib, Seaborn)
and full dashboard frameworks (Streamlit, Dash). This guide covers Plotly Express versus
Graph Objects, the figure/trace/layout model, interactivity and subplots, themes and
annotations, export paths, embedding in notebooks and web apps, a Harbor Analytics
funnel dashboard worked example, a tooling decision table, common pitfalls, and a
production checklist. Pair it with our Python fundamentals guide and machine learning
overview when building end-to-end analytics pipelines.
What Plotly is (and how it differs from Matplotlib or Altair)
Plotly (the open-source plotly Python package) serializes
chart definitions to JSON and renders them interactively via Plotly.js. Unlike
Matplotlib, which draws static pixels to a canvas, Plotly figures are live DOM
components: users zoom into outliers, hide series from the legend, and download
PNG snapshots from the mode bar. Unlike Altair, which compiles
Vega-Lite specs for declarative grammar-of-graphics fans, Plotly optimizes for
batteries-included interactivity and a huge catalog of trace types (3D surfaces,
geographic choropleths, candlesticks, Sankey diagrams) with minimal schema learning.
The ecosystem splits into layers:
- Plotly Express (px): one-liner API over tidy DataFrames; best for 80% of EDA charts.
- Graph Objects (go): low-level control over every trace attribute; required for custom annotations and mixed chart types.
- Figure Widgets (FigureWidget): Jupyter-only bidirectional updates for linked brushing in notebooks.
- Plotly Dash: separate framework for multi-page web dashboards with callbacks; uses the same figure JSON under the hood.
- Kaleido / Orca: static image export engines for CI and slide decks. Kaleido is the current engine; Orca is the legacy one.
Install with pip install plotly pandas kaleido. Kaleido is optional
but needed for headless PNG/PDF export in Docker and GitHub Actions.
The figure model: traces, layout, and frames
Every Plotly chart is a Figure containing one or more
traces (data layers) plus a layout (axes, titles,
margins, legends, templates). Understanding this split prevents the common mistake
of trying to set axis titles on a trace instead of fig.update_layout().
Plotly Express patterns
    import plotly.express as px

    # df: a pandas DataFrame with the columns referenced below.
    # trendline="ols" also needs the statsmodels package installed.
    fig = px.scatter(
        df,
        x="ad_spend",
        y="conversions",
        color="channel",
        size="impressions",
        hover_data=["campaign_id", "ctr"],
        trendline="ols",
        title="Spend vs conversions by channel",
    )
    fig.update_layout(hovermode="closest")
    fig.show()
Each Express function fixes the trace type; its arguments map DataFrame columns to
visual encodings: color splits series, facet_col builds small multiples, and
animation_frame adds a slider over time. The returned object is a full go.Figure
you can still mutate with update_traces and update_layout.
Graph Objects when Express is not enough
    import plotly.graph_objects as go
    from plotly.subplots import make_subplots

    # df: a pandas DataFrame with date, o, h, l, c, and volume columns.
    # Two stacked rows sharing one x-axis: price on top, volume below.
    fig = make_subplots(rows=2, cols=1, shared_xaxes=True,
                        row_heights=[0.7, 0.3], vertical_spacing=0.05)
    fig.add_trace(go.Candlestick(x=df.date, open=df.o, high=df.h,
                                 low=df.l, close=df.c), row=1, col=1)
    fig.add_trace(go.Bar(x=df.date, y=df.volume, name="Volume"), row=2, col=1)
    # Hide the range slider that candlestick charts add by default.
    fig.update_layout(xaxis_rangeslider_visible=False)
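Use make_subplots for stacked panels with independent y-axes. Mix
trace types freely: scatter plus bar plus heatmap on one figure. When revenue
(left axis) and margin percent (right axis) share the same x dimension, enable a
second y-axis with specs=[[{"secondary_y": True}]] in make_subplots, then pass
secondary_y=True to add_trace for the series that belongs on the right.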
Interactivity, styling, and accessibility
Default interactivity includes box zoom, autoscale, hover tooltips, and legend click-to-hide. Tune behavior explicitly:
- hovermode="x unified": one tooltip listing all series at the same x (great for time series).
- fig.update_traces(hovertemplate="%{y:.1f}%<br>%{x|%b %Y}"): custom tooltip formatting; Plotly.js interprets only a small set of tags such as <br>, which limits HTML injection risk.
- config={"displayModeBar": False}: hide the mode bar when embedding in Streamlit sidebars; pass it to fig.show() or st.plotly_chart(), not to update_layout.
- fig.update_layout(dragmode="select") plus a callback in Dash: filter tables by brushed region.
Themes and color
Built-in templates (plotly, plotly_white,
plotly_dark, ggplot2, seaborn) set fonts,
gridlines, and default colors. Apply with fig.update_layout(template="plotly_white"),
set a default for Express via px.defaults.template, or set it for all figures via
plotly.io.templates.default. For brand consistency, define a custom template JSON
once and reuse it across reports. Use color_discrete_sequence in Express for
categorical colors, or marker=dict(color=df["ctr"], colorscale="Viridis") in Graph
Objects for continuous scales, where the color column must be numeric. Always check
contrast: Plotly does not enforce WCAG; test charts with color-blind simulators when
status colors carry meaning.
Annotations and shapes
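Add vertical event lines with fig.add_vline(x="2026-01-15", annotation_text="Launch"),
shaded recession bands with add_vrect, and text callouts with
fig.add_annotation(). Shapes and annotations are stored in the layout and are
positioned through xref and yref, which can point at data coordinates or at
figure-relative "paper" coordinates. Mixing the two, or forgetting that annotation
positions on a log axis are given as log10 of the value, is a frequent source of
misaligned labels on log-scale axes.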
Export, embedding, and deployment
Standalone HTML
fig.write_html("report.html", include_plotlyjs="cdn") produces a
single file stakeholders open in any browser with internet access. Use
include_plotlyjs=True for air-gapped environments (larger file). auto_open=False,
the default, suits CI pipelines that upload artifacts to S3.
Static images
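fig.write_image("chart.png", width=1200, height=600, scale=2) requires
Kaleido. Pin the kaleido version in requirements; headless servers need
no display. Recent Kaleido releases rely on a separately installed Chrome or
Chromium, so check the release notes for the version you pin. For slide decks,
export SVG when vector text matters; use PNG for email thumbnails.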
Jupyter, Streamlit, and FastAPI
- Jupyter: fig.show() in VS Code or the classic notebook; use FigureWidget for two-way selection.
- Streamlit: st.plotly_chart(fig, use_container_width=True, config=...); cache the DataFrame, not necessarily the figure, when data is large.
- FastAPI: return fig.to_json() to a React front end using react-plotly.js, or serve prebuilt HTML fragments.
Performance tip: downsample million-row series before plotting. Use
df.resample("1h").mean() (spelled "1H" on pandas releases before 2.2) or Plotly's
Scattergl and other WebGL trace types for dense point clouds. Rendering 500k
markers in SVG mode can freeze the browser tab.
Worked example: Harbor Analytics conversion funnel dashboard
Harbor Analytics product managers needed to compare weekly funnel drop-off across acquisition channels without waiting for a BI ticket queue. The Plotly layer of their internal toolkit:
- Funnel chart: px.funnel on aggregated stage counts (visit, signup, activate, pay) with color="channel" and hover_data=["median_hours_to_next"].
- Cohort heatmap: px.imshow on a pivot of week-by-week retention; diverging colorscale centered at industry benchmark.
- Drill-down scatter: Graph Objects scatter with customdata campaign IDs; Dash callback filters a DataTable when users box-select outliers.
- Executive export: Monday cron runs write_html plus write_image for the all-hands deck; same Python module powers the live Dash app.
- Shared theme: company template JSON sets fonts and primary green to match the public marketing site.
The funnel replaced a static spreadsheet that was always three days stale. PMs could hover a channel, see median time-to-convert, and paste a PNG into Slack in under a minute. When metric definitions stabilized, the SQL behind the charts moved into the Streamlit revenue dashboard for executives who preferred filters over Dash’s callback model.
Tooling decision table
| Goal | Favor | Avoid |
|---|---|---|
| Quick EDA in a notebook | Plotly Express on tidy DataFrames | Graph Objects boilerplate for a simple line chart |
| Publication-quality static figure for a paper | Matplotlib or Seaborn with explicit typographic control | Plotly when vector font embedding and LaTeX labels are mandatory |
| Shareable interactive HTML report | Plotly write_html with CDN plotly.js | Notebook-only show() when recipients lack Jupyter |
| Multi-page dashboard with linked filters | Plotly Dash or Streamlit with st.plotly_chart | Standalone Express charts without a hosting layer |
| Embedded charts in a custom React product | fig.to_json() + react-plotly.js | Iframe-heavy HTML exports with mismatched sizing |
| Millisecond-updating real-time telemetry | Specialized time-series front ends or uPlot | Plotly full redraw on every WebSocket tick |
| Grammar-of-graphics composition in Python | Altair or ggplot (plotnine) | Plotly when you want Vega-Lite’s compile-time validation |
Common pitfalls
- Plotting raw millions of rows: resample or aggregate first; use WebGL trace types when density is the message.
- Setting axis properties on traces: titles, ranges, and log scales belong in update_layout, update_xaxes, or update_yaxes.
- Forgetting Kaleido in CI: write_image raises an error at export time in pipelines where the dependency is not installed and pinned.
- Hard-coded hex colors per series: breaks when categories change; map categories to a discrete sequence instead.
- Mixing timezone-naive and aware datetimes: x-axis gaps or misordered points; normalize to UTC in pandas before px.
- Huge self-contained HTML files: include_plotlyjs=True on every chart in a bundle; prefer CDN or one shared script tag.
- Overusing 3D charts: readability drops; 2D faceting plus color often communicates better.
- Dash and Streamlit on the same port: pick one hosting model per app; do not nest Dash inside Streamlit without an iframe strategy.
Production checklist
- Pin plotly and kaleido versions in requirements.txt or a lockfile.
- Define a shared layout template (fonts, margins, color sequence) in one module.
- Document hover templates and units in chart titles or subtitles.
- Downsample or aggregate time series above ~50k points per trace.
- Test HTML exports in Chrome and Safari; verify the mode bar does not overlap content on mobile.
- Run color-blind checks on status and threshold coloring.
- CI smoke test: build one figure from fixture data and assert write_image succeeds.
- Separate data SQL from figure code; inject DataFrames via parameterized queries.
- Log figure generation latency; alert when warehouse queries dominate render time.
- Version-control example figures or snapshot tests only when definitions are stable.
Key takeaways
- Plotly Express covers most exploratory charts from tidy pandas tables; drop to Graph Objects for mixed subplots and custom annotations.
- Figures are JSON documents rendered in the browser — interactivity is the default, not an add-on.
- write_html and write_image bridge notebooks to stakeholders who never run Python.
- Pair Plotly with Streamlit or Dash when filters and multipage layout matter; keep SQL and metric definitions in shared modules.
- Performance and accessibility require deliberate choices — downsample data, tune templates, and validate color contrast.
Related reading
- Pandas fundamentals explained — DataFrames, groupby, and the tabular data Plotly Express expects
- Streamlit fundamentals explained — embedding Plotly charts in multipage data apps
- scikit-learn fundamentals explained — model metrics worth visualizing
- Python fundamentals explained — environments, packaging, and scripting conventions