Guide

Bash shell scripting explained

Every Linux server, CI pipeline, and Docker entrypoint eventually needs a Bash script: rotate logs, restart a failed service, rsync a release, or wrap ten CLI tools into one operator command. Bash (the Bourne Again SHell) is not a general-purpose language like Python — it is a glue layer optimized for spawning processes, wiring stdin and stdout, and automating the Unix toolbox. Scripts that ignore quoting rules or skip error handling become Friday-night outages. This guide covers shebang lines and permissions, variables and the three quoting modes, conditionals and loops, pipes and redirects, the set -euo pipefail safety trio, functions and argument parsing, a Harbor Fleet deploy automation worked example, a Bash vs Python decision table, common pitfalls, and a production checklist.

What Bash is (and where it fits)

When you SSH into a VPS and type commands, you are talking to a shell. On most distributions the default interactive shell is Bash (or Dash for /bin/sh on Debian — know which interpreter your script targets). A shell script is a text file of commands the shell reads line by line, with variables, conditionals, and loops layered on top.

Strong fits: Bootstrapping servers, cron maintenance, Git hooks, wrapping rsync/tar/curl, CI steps that only orchestrate binaries, and glue between systemd units (see our Linux fundamentals guide) and container runtimes. Bash starts instantly and needs no virtual environment.

Weak fits: JSON parsing at scale, complex data structures, network services, anything requiring unit tests with heavy mocking, or logic longer than ~200 lines. Reach for Python when the script grows parsers, HTTP clients, or database drivers — not because Bash is “bad,” but because maintainability crosses a threshold.

Your first script: shebang, permissions, execution

The shebang on line one tells the kernel which interpreter to invoke:

#!/usr/bin/env bash
# deploy.sh — Harbor Fleet release helper

echo "Starting deploy at $(date -u +%Y-%m-%dT%H:%M:%SZ)"

Use /usr/bin/env bash instead of hard-coding /bin/bash so the script finds Bash on macOS Homebrew paths and NixOS layouts. Mark the file executable with chmod +x deploy.sh and run ./deploy.sh, or invoke explicitly with bash deploy.sh (useful when executable bit is not set in version control).

Scripts checked into Git often ship without the executable bit; CI can chmod +x before running. Never commit secrets — load tokens from the environment or a secrets manager referenced in our secrets management guide.

Variables, quoting, and expansion

Assign with NAME=value (no spaces around =). Read with $NAME or ${NAME} — braces are required for concatenation (${FILE}.bak) and defaults (${PORT:-8080}).

Three quoting modes (memorize these)

  • Unquoted — word splitting and glob expansion apply. Fine for controlled integers; dangerous for file names with spaces.
  • Double quotes "..." — preserves spaces; still expands $vars and $(commands). Default for human-readable strings.
  • Single quotes '...' — literal text; no expansion. Use for regex patterns and static SQL fragments.
LOG_DIR="/var/log/harbor"
FILES=(access.log error.log)   # indexed array (Bash 4+)

for f in "${FILES[@]}"; do
  gzip -9 "${LOG_DIR}/${f}"
done

# Command substitution captures stdout
DISK_USED=$(df -h / | awk 'NR==2 {print $5}')

Always quote variable expansions unless you have a deliberate reason not to. Unquoted $foo is the single largest source of scripts that work in testing and break the first time a user name contains a space.

Special parameters

$0 script name, $1..$9 positional args, $# count, $@ all args as separate words, $? exit code of last command, $$ current PID. Functions receive the same positional parameters as the caller unless you save them first.

Conditionals, loops, and tests

if [[ -f /etc/harbor/config.yml ]]; then
  echo "config present"
elif [[ -d /opt/harbor ]]; then
  echo "install dir only"
else
  echo "missing" >&2
  exit 1
fi

for host in web-{01..03}.internal; do
  ssh "deploy@${host}" 'sudo systemctl reload nginx'
done

while read -r line; do
  echo "log: $line"
done < /var/log/harbor/access.log

Prefer [[ ... ]] (Bash builtin) over [ ... ] (legacy test binary) for safer string comparisons and pattern matching. -f file exists, -d directory, -z empty string, -n non-empty. Combine with && and || for short-circuit one-liners, but avoid unreadable nested chains — use if when logic branches multiply.

Brace expansion {01..03} generates sequences; it is not a loop by itself. For numeric loops, for ((i=0; i<10; i++)) works in Bash. Reading files line by line with while read handles whitespace in paths if you use IFS= read -r line.

Pipes, redirects, and subprocesses

Unix philosophy: small programs connected by pipes. stdout redirect > truncates; >> appends; stderr 2>; combine with 2>&1 to merge streams for logging. Pipe | sends stdout of left command to stdin of right.

# Keep stderr on terminal, log stdout
./backup.sh > "/var/log/backup-$(date +%F).log"

# Pipeline — each stage is a separate process
journalctl -u nginx --since "1 hour ago" | grep -c " 502 "

# Subshell ( ) runs in forked environment; { } groups without fork
( cd /tmp && tar czf archive.tar.gz data/ )

Remember: variables set inside a pipeline subshell do not persist in the parent shell unless you use process substitution or a different pattern. For counting pipeline failures, enable set -o pipefail (covered below).

Error handling: set -euo pipefail

Production scripts should open with the safety trio:

set -euo pipefail
IFS=$'\n\t'
  • -e (errexit) — exit immediately if any command returns non-zero.
  • -u (nounset) — treat unset variables as errors.
  • -o pipefail — pipeline fails if any stage fails, not only the last.

Commands you expect to fail sometimes (e.g. grep finding no matches) need an escape: grep pattern file || true, or temporarily set +e. For fatal errors, print to stderr and exit with a distinct code:

die() { echo "ERROR: $*" >&2; exit 1; }

[[ -n "${API_TOKEN:-}" ]] || die "API_TOKEN not set"

Pair scripts with structured logging in long-running services; Bash glue usually logs timestamped lines to syslog or a file consumed by your aggregator.

Functions and CLI argument parsing

usage() {
  echo "Usage: $0 --env staging|prod [--dry-run]"
  exit 1
}

DRY_RUN=false
ENV=""

while [[ $# -gt 0 ]]; do
  case "$1" in
    --env) ENV="$2"; shift 2 ;;
    --dry-run) DRY_RUN=true; shift ;;
    -h|--help) usage ;;
    *) echo "unknown arg: $1" >&2; usage ;;
  esac
done

[[ -n "$ENV" ]] || usage

Functions encapsulate repeated steps; local keeps variables scoped inside the function. For richer CLIs, consider getopts (POSIX) or delegating to Python with click/typer when flags multiply. Document required environment variables at the top of the script and in your runbook.

Essential toolbox commands

Bash scripts rarely do heavy lifting themselves — they orchestrate:

  • find + -exec or xargs — batch file operations (watch GNU vs BSD flags on macOS).
  • grep/rg — search logs and configs; prefer ripgrep in new scripts for speed.
  • awk and sed — column extraction and stream edits; learn one-line patterns, not full programs.
  • curl — HTTP health checks; always set --fail and timeouts.
  • jq — JSON parsing when APIs are unavoidable (not stdlib, but ubiquitous in ops).
  • trap — run cleanup on EXIT, INT, or TERM (remove temp files, release locks).
TMPFILE=$(mktemp)
trap 'rm -f "$TMPFILE"' EXIT

Worked example: Harbor Fleet rolling deploy

Harbor Fleet runs three stateless API nodes behind nginx. The team replaces symlink-based releases with a Bash orchestrator invoked from CI after tests pass.

  1. Inputs--version v2.4.1, --env prod, artifact URL on internal S3-compatible storage.
  2. Preflight — verify ssh agent, confirm all hosts answer curl -sf http://localhost/healthz, assert disk free > 2 GB via df.
  3. Download oncecurl -fL "$URL" -o "$TMPFILE"; checksum with sha256sum -c.
  4. Rolling loop — for each host: drain from load balancer config, rsync tarball to /opt/harbor/releases/$VERSION, flip current symlink, systemctl restart harbor-api, wait for healthz 200, re-enable in nginx upstream.
  5. Failure path — on any non-zero step, trap logs host name, sends Slack webhook via curl, exits 2 so CI marks deploy failed without silent partial rollouts.
  6. Audit — append one JSON line per deploy to /var/log/harbor/deploy.log with user, version, duration.

Total script stays under 120 lines because business logic lives in the compiled API binary; Bash only moves bits and restarts services — the right separation of concerns for glue code.

Language decision table

TaskPrefer BashPrefer Python instead
Restart service if health check failsYesOverkill
Parse nested JSON API responsesNo — jq one-liners exceptedYes
CI: build Docker image and pushYes — orchestrate docker CLIEither
ETL with pandas/SQLAlchemyNoYes
Git pre-commit hook (10 lines)YesEither
Long-running daemonNoYes — or Go/Rust
Ad-hoc log grep + archiveYesSlower to write for ops
Cross-platform Windows + LinuxNoYes — or PowerShell on Windows

Common pitfalls

  • Unquoted variables — filenames with spaces or glob characters break loops and rm commands.
  • Missing shebang — script runs under /bin/sh (Dash) on Debian, where Bash arrays and [[ fail.
  • Ignoring exit codes — without set -e, a failed cd or mkdir silently corrupts later steps.
  • Parsing ls output — never for f in $(ls); use globs or find.
  • Integer comparison with [[ and strings — use (( count > 5 )) for numbers, [[ "$a" == "$b" ]] for strings.
  • Exporting secrets in process list — environment variables are visible in ps; pass via file descriptor or agent.
  • CRLF line endings from Windows editors — causes /bin/bash^M: bad interpreter; enforce LF in .gitattributes.

Practical checklist

  • Start scripts with #!/usr/bin/env bash and set -euo pipefail.
  • Quote all expansions; use "${array[@]}" when iterating arrays.
  • Define usage() and validate required args before side effects.
  • Use mktemp and trap ... EXIT for temporary files.
  • Log to stderr; reserve stdout for data other tools might pipe.
  • Test on a clean VM matching production (same distro, Dash vs Bash defaults).
  • Run shellcheck script.sh in CI — catches quoting and portability issues.
  • Pin external tool assumptions (GNU tar flags, date -d vs BSD) in comments.
  • Return distinct exit codes: 0 success, 1 usage/config error, 2 operational failure.
  • When script exceeds ~200 lines or needs tests, rewrite hot path in Python and keep Bash as a thin wrapper.

Key takeaways

  • Bash excels at orchestrating CLI tools on Linux — not at application logic.
  • Quoting and set -euo pipefail separate scripts that survive production from ones that corrupt data quietly.
  • Pipes and redirects are the core abstraction; master grep, awk, find, and curl before writing custom parsers.
  • Functions + case blocks scale CLI parsing to a point; beyond that, use Python.
  • Pair shell glue with solid Linux fundamentals and Git hooks for a complete ops toolkit.

Related reading