Guide
Bash shell scripting explained
Every
Linux server,
CI pipeline, and
Docker entrypoint
eventually needs a Bash script: rotate logs, restart a failed
service, rsync a release, or wrap ten CLI tools into one operator command.
Bash (the Bourne Again SHell) is not a general-purpose language
like Python — it is a glue layer optimized for spawning processes, wiring stdin
and stdout, and automating the Unix toolbox. Scripts that ignore quoting rules or
skip error handling become Friday-night outages. This guide covers shebang lines
and permissions, variables and the three quoting modes, conditionals and loops,
pipes and redirects, the set -euo pipefail safety trio, functions and
argument parsing, a Harbor Fleet deploy automation worked example, a Bash vs
Python decision table, common pitfalls, and a production checklist.
What Bash is (and where it fits)
When you SSH into a VPS and type commands, you are talking to a shell. On most
distributions the default interactive shell is Bash (or Dash for
/bin/sh on Debian — know which interpreter your script targets).
A shell script is a text file of commands the shell reads
line by line, with variables, conditionals, and loops layered on top.
Strong fits: Bootstrapping servers, cron maintenance, Git hooks,
wrapping rsync/tar/curl, CI steps that
only orchestrate binaries, and glue between
systemd units (see our
Linux fundamentals guide)
and container runtimes. Bash starts instantly and needs no virtual environment.
Weak fits: JSON parsing at scale, complex data structures, network services, anything requiring unit tests with heavy mocking, or logic longer than ~200 lines. Reach for Python when the script grows parsers, HTTP clients, or database drivers — not because Bash is “bad,” but because maintainability crosses a threshold.
Your first script: shebang, permissions, execution
The shebang on line one tells the kernel which interpreter to invoke:
#!/usr/bin/env bash
# deploy.sh — Harbor Fleet release helper
echo "Starting deploy at $(date -u +%Y-%m-%dT%H:%M:%SZ)"
Use /usr/bin/env bash instead of hard-coding /bin/bash
so the script finds Bash on macOS Homebrew paths and NixOS layouts. Mark the
file executable with chmod +x deploy.sh and run
./deploy.sh, or invoke explicitly with bash deploy.sh
(useful when executable bit is not set in version control).
Scripts checked into Git often ship without the executable bit; CI can
chmod +x before running. Never commit secrets — load tokens from
the environment or a secrets manager referenced in our
secrets management
guide.
Variables, quoting, and expansion
Assign with NAME=value (no spaces around =). Read with
$NAME or ${NAME} — braces are required for
concatenation (${FILE}.bak) and defaults
(${PORT:-8080}).
Three quoting modes (memorize these)
- Unquoted — word splitting and glob expansion apply. Fine for controlled integers; dangerous for file names with spaces.
- Double quotes
"..."— preserves spaces; still expands$varsand$(commands). Default for human-readable strings. - Single quotes
'...'— literal text; no expansion. Use for regex patterns and static SQL fragments.
LOG_DIR="/var/log/harbor"
FILES=(access.log error.log) # indexed array (Bash 4+)
for f in "${FILES[@]}"; do
gzip -9 "${LOG_DIR}/${f}"
done
# Command substitution captures stdout
DISK_USED=$(df -h / | awk 'NR==2 {print $5}')
Always quote variable expansions unless you have a deliberate reason not to.
Unquoted $foo is the single largest source of scripts that work in
testing and break the first time a user name contains a space.
Special parameters
$0 script name, $1..$9 positional args,
$# count, $@ all args as separate words,
$? exit code of last command, $$ current PID.
Functions receive the same positional parameters as the caller unless you
save them first.
Conditionals, loops, and tests
if [[ -f /etc/harbor/config.yml ]]; then
echo "config present"
elif [[ -d /opt/harbor ]]; then
echo "install dir only"
else
echo "missing" >&2
exit 1
fi
for host in web-{01..03}.internal; do
ssh "deploy@${host}" 'sudo systemctl reload nginx'
done
while read -r line; do
echo "log: $line"
done < /var/log/harbor/access.log
Prefer [[ ... ]] (Bash builtin) over [ ... ] (legacy
test binary) for safer string comparisons and pattern matching.
-f file exists, -d directory, -z empty
string, -n non-empty. Combine with && and
|| for short-circuit one-liners, but avoid unreadable nested
chains — use if when logic branches multiply.
Brace expansion {01..03} generates sequences; it is not a loop by
itself. For numeric loops, for ((i=0; i<10; i++)) works in Bash.
Reading files line by line with while read handles whitespace in
paths if you use IFS= read -r line.
Pipes, redirects, and subprocesses
Unix philosophy: small programs connected by pipes. stdout redirect
> truncates; >> appends; stderr
2>; combine with 2>&1 to merge streams for
logging. Pipe | sends stdout of left command to stdin of right.
# Keep stderr on terminal, log stdout
./backup.sh > "/var/log/backup-$(date +%F).log"
# Pipeline — each stage is a separate process
journalctl -u nginx --since "1 hour ago" | grep -c " 502 "
# Subshell ( ) runs in forked environment; { } groups without fork
( cd /tmp && tar czf archive.tar.gz data/ )
Remember: variables set inside a pipeline subshell do not persist in the parent
shell unless you use process substitution or a different pattern. For counting
pipeline failures, enable set -o pipefail (covered below).
Error handling: set -euo pipefail
Production scripts should open with the safety trio:
set -euo pipefail
IFS=$'\n\t'
-e(errexit) — exit immediately if any command returns non-zero.-u(nounset) — treat unset variables as errors.-o pipefail— pipeline fails if any stage fails, not only the last.
Commands you expect to fail sometimes (e.g. grep finding no matches)
need an escape: grep pattern file || true, or temporarily
set +e. For fatal errors, print to stderr and exit with a distinct
code:
die() { echo "ERROR: $*" >&2; exit 1; }
[[ -n "${API_TOKEN:-}" ]] || die "API_TOKEN not set"
Pair scripts with structured logging in long-running services; Bash glue usually logs timestamped lines to syslog or a file consumed by your aggregator.
Functions and CLI argument parsing
usage() {
echo "Usage: $0 --env staging|prod [--dry-run]"
exit 1
}
DRY_RUN=false
ENV=""
while [[ $# -gt 0 ]]; do
case "$1" in
--env) ENV="$2"; shift 2 ;;
--dry-run) DRY_RUN=true; shift ;;
-h|--help) usage ;;
*) echo "unknown arg: $1" >&2; usage ;;
esac
done
[[ -n "$ENV" ]] || usage
Functions encapsulate repeated steps; local keeps variables scoped
inside the function. For richer CLIs, consider
getopts (POSIX) or delegating to Python with
click/typer when flags multiply. Document required
environment variables at the top of the script and in your runbook.
Essential toolbox commands
Bash scripts rarely do heavy lifting themselves — they orchestrate:
find+-execorxargs— batch file operations (watch GNU vs BSD flags on macOS).grep/rg— search logs and configs; preferripgrepin new scripts for speed.awkandsed— column extraction and stream edits; learn one-line patterns, not full programs.curl— HTTP health checks; always set--failand timeouts.jq— JSON parsing when APIs are unavoidable (not stdlib, but ubiquitous in ops).trap— run cleanup on EXIT, INT, or TERM (remove temp files, release locks).
TMPFILE=$(mktemp)
trap 'rm -f "$TMPFILE"' EXIT
Worked example: Harbor Fleet rolling deploy
Harbor Fleet runs three stateless API nodes behind nginx. The team replaces symlink-based releases with a Bash orchestrator invoked from CI after tests pass.
- Inputs —
--version v2.4.1,--env prod, artifact URL on internal S3-compatible storage. - Preflight — verify
sshagent, confirm all hosts answercurl -sf http://localhost/healthz, assert disk free > 2 GB viadf. - Download once —
curl -fL "$URL" -o "$TMPFILE"; checksum withsha256sum -c. - Rolling loop — for each host: drain from load balancer config,
rsynctarball to/opt/harbor/releases/$VERSION, flipcurrentsymlink,systemctl restart harbor-api, wait for healthz 200, re-enable in nginx upstream. - Failure path — on any non-zero step,
traplogs host name, sends Slack webhook viacurl, exits 2 so CI marks deploy failed without silent partial rollouts. - Audit — append one JSON line per deploy to
/var/log/harbor/deploy.logwith user, version, duration.
Total script stays under 120 lines because business logic lives in the compiled API binary; Bash only moves bits and restarts services — the right separation of concerns for glue code.
Language decision table
| Task | Prefer Bash | Prefer Python instead |
|---|---|---|
| Restart service if health check fails | Yes | Overkill |
| Parse nested JSON API responses | No — jq one-liners excepted | Yes |
| CI: build Docker image and push | Yes — orchestrate docker CLI | Either |
| ETL with pandas/SQLAlchemy | No | Yes |
| Git pre-commit hook (10 lines) | Yes | Either |
| Long-running daemon | No | Yes — or Go/Rust |
| Ad-hoc log grep + archive | Yes | Slower to write for ops |
| Cross-platform Windows + Linux | No | Yes — or PowerShell on Windows |
Common pitfalls
- Unquoted variables — filenames with spaces or glob characters break loops and
rmcommands. - Missing shebang — script runs under
/bin/sh(Dash) on Debian, where Bash arrays and[[fail. - Ignoring exit codes — without
set -e, a failedcdormkdirsilently corrupts later steps. - Parsing
lsoutput — neverfor f in $(ls); use globs orfind. - Integer comparison with
[[and strings — use(( count > 5 ))for numbers,[[ "$a" == "$b" ]]for strings. - Exporting secrets in process list — environment variables are visible in
ps; pass via file descriptor or agent. - CRLF line endings from Windows editors — causes
/bin/bash^M: bad interpreter; enforce LF in.gitattributes.
Practical checklist
- Start scripts with
#!/usr/bin/env bashandset -euo pipefail. - Quote all expansions; use
"${array[@]}"when iterating arrays. - Define
usage()and validate required args before side effects. - Use
mktempandtrap ... EXITfor temporary files. - Log to stderr; reserve stdout for data other tools might pipe.
- Test on a clean VM matching production (same distro, Dash vs Bash defaults).
- Run
shellcheck script.shin CI — catches quoting and portability issues. - Pin external tool assumptions (GNU tar flags,
date -dvs BSD) in comments. - Return distinct exit codes: 0 success, 1 usage/config error, 2 operational failure.
- When script exceeds ~200 lines or needs tests, rewrite hot path in Python and keep Bash as a thin wrapper.
Key takeaways
- Bash excels at orchestrating CLI tools on Linux — not at application logic.
- Quoting and
set -euo pipefailseparate scripts that survive production from ones that corrupt data quietly. - Pipes and redirects are the core abstraction; master
grep,awk,find, andcurlbefore writing custom parsers. - Functions + case blocks scale CLI parsing to a point; beyond that, use Python.
- Pair shell glue with solid Linux fundamentals and Git hooks for a complete ops toolkit.
Related reading
- Linux fundamentals explained — kernel, filesystem, processes, systemd, and SSH
- CI/CD pipelines explained — where Bash glue runs in build and deploy automation
- Python fundamentals explained — when scripts outgrow shell maintainability
- Git fundamentals explained — hooks and release tagging Bash scripts invoke