How We Build Anything

What this covers

Any programmatic thing: Chrome extensions, websites, CLIs, cron jobs, hooks, agents, scrapers, daemons. If it runs code, it lives under this SOP.

What makes this SOP valuable is that it bakes in observability from the first commit. Every app writes logs to a predictable location so “check the logs — something is wrong” works the same way every time, regardless of which app is failing. And every app with a UI shows its current build number and commit summary so we always know which version is running.

Build visibility — non-negotiable

Every built thing with a visible surface shows its current build number and the latest commit summary, somewhere a human will see it. Not buried in an About page. Not in DevTools. In the UI.

This matters because:

Caches lie. Chrome caches aggressively, CDNs cache aggressively, service workers cache aggressively. “Did my change deploy?” is answerable in one glance if the build tag is visible.
Bug reports get 10x more useful. “It is broken on Build 17 (Chrome extension SOP - programmatic logging)” tells me exactly which code path to look at. “It is broken” does not.
No duplicate state. The build number lives in build.txt. The summary lives in the latest Build X: summary commit message. Both are already canonical — the UI just reads them.

Data sources (same for every app)

Build number: build.txt at repo root. Plain integer. Bumped by bump-build.sh or echo $((... + 1)) > build.txt.
Commit summary: git log -1 --format=%s, then strip the Build X: prefix from the subject. What remains is the summary that renders. If the latest commit is not in Build format, show the raw subject.
Optional: short SHA via git rev-parse --short HEAD for extra certainty.
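
The three data sources above can be combined in a few lines. The following is a minimal sketch, not the reference implementation: the function names (parse_summary, format_badge, read_build_info) are hypothetical, and the regex assumes the subject looks exactly like "Build <number>: <summary>".

```python
import re
import subprocess
from pathlib import Path

# Assumed subject shape: "Build <number>: <summary>".
BUILD_SUBJECT = re.compile(r"^Build\s+(\d+):\s*(.+)$")


def parse_summary(subject: str) -> str:
    """Return the summary part of a Build-format subject, else the raw subject."""
    match = BUILD_SUBJECT.match(subject.strip())
    return match.group(2).strip() if match else subject.strip()


def format_badge(build: str, subject: str, sha: str = "") -> str:
    """Combine build number, parsed summary, and optional short SHA into one string."""
    badge = f"Build {build} - {parse_summary(subject)}"
    return f"{badge} ({sha})" if sha else badge


def read_build_info(repo: Path) -> str:
    """Read build.txt and the latest commit from a git checkout (hypothetical helper)."""
    def git(*args: str) -> str:
        result = subprocess.run(
            ["git", "-C", str(repo), *args],
            capture_output=True, text=True, check=True,
        )
        return result.stdout.strip()

    build = (repo / "build.txt").read_text().strip()
    subject = git("log", "-1", "--format=%s")
    sha = git("rev-parse", "--short", "HEAD")
    return format_badge(build, subject, sha)
```

format_badge("31", "Build 31: SOP update", "abc1234") yields "Build 31 - SOP update (abc1234)". Keeping the parsing separate from the git calls means the same function can serve a build script, a CLI, or a logger init.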

How each surface renders it

Surface type	Where the badge lives	How it gets the data
Chrome extension	Popup header, directly under the title	`chrome.runtime.getManifest().description`, parsed at runtime (see Chrome Extension SOP)
Astro / Starlight site	Site footer	Override of `@astrojs/starlight/components/Footer.astro` that reads `build.txt` + `git log` at build time (see `tms-internal/src/components/BuildFooter.astro` for the reference implementation)
Next.js / React site	Footer component	Props injected at build from `process.env.BUILD_NUM` or a generated `build-info.json`, produced with Node `readFileSync` + `execSync('git log -1 --format=%s')` in `next.config.js` or a build script
CLI	`--version` flag prints `Build X - summary`	Read `build.txt` + parse `git log` at install time and embed in the binary, or read at runtime
Web API / service	`/health` and `/version` endpoints return `{build, summary, sha}`; banner in admin UI if any	Same pattern — read files at startup, cache in memory
Cron / daemon / worker (no UI)	First log line at startup: `INFO [bootstrap] Build X - summary (sha)`	Part of the logger init
Packaged desktop app (Electron, Tauri…)	Window title / About: `Build X · summary`	The bundle is NOT a git checkout, so it cannot run `git log` at runtime. A `predist`/prebuild step bakes the summary in: write `git log -1 --format=%s` (stripped of the `Build X:` prefix) to a small file the app reads at runtime. Never hand-maintain that file — a hand-edited summary silently drifts behind the build number (better-screen-sharing hit this, Build 38→40). Reference: `better-screen-sharing/bin/gen-build-summary.sh` + the `predist` npm script.

The pattern

┌────────────────────────┐
│ EXTENSION / APP / SITE │
└────────────┬───────────┘
             │
   ┌─────────┴────────┐
   │                  │
 build.txt    git log -1 --format=%s
   │                  │
   └────────┬─────────┘
            │
    formatted badge
   ("Build 31 — SOP update (abc1234)")
            │
        rendered in UI

When to skip

Rare. For internal-only scripts with no user surface at all (single-use migration, one-shot import), the first log line is enough. Everything else gets a visible badge.

Logging is non-negotiable

Every app writes logs to ~/apps/{repo-name}/logs/YYYY-MM-DD.log. No exceptions.

One file per day, date-rotated.
Append only — never overwrite.
Directory must exist before the app starts (create it in the setup script or at boot).
.gitignore the logs/ directory so logs never hit GitHub.

This is what makes /review-logs and “check the logs” work as generic skills — the skill looks at ~/apps/{app}/logs/ and finds what it needs.
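
As an illustration of the location rules above, here is a small sketch that computes the dated log path and appends to it. The helper names (log_path, append_line) and the apps_root parameter are assumptions for the example, not part of any existing app.

```python
from datetime import date
from pathlib import Path


def log_path(repo_name: str, day: date, apps_root: Path = Path.home() / "apps") -> Path:
    """Return <apps_root>/<repo_name>/logs/YYYY-MM-DD.log for the given day."""
    return apps_root / repo_name / "logs" / f"{day.isoformat()}.log"


def append_line(path: Path, line: str) -> None:
    """Create logs/ if missing, then append exactly one line (never overwrite)."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        handle.write(line.rstrip("\n") + "\n")
```

Opening in "a" mode is what keeps the file append-only; creating the directory at write time is one way to satisfy the "directory must exist" rule when no setup script has run yet.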

Errors must page the human

Logging alone is not enough — if no one reads the log, an error is silent. Every app wraps its main loop/handlers in a try-catch (or try-except in Python), and on exception does two things:

Log it to the standard log file with [ERROR] level and full traceback.
Email James via Resend so the failure cannot be ignored.

The CRM already does this cleanly in ~/apps/claude-code-crm/notify.py via send_error_alert(source, error, traceback_str, context). It includes:

Source label — which script / workflow failed (e.g., "followup_engine", "webhook_handler")
Error message — the exception string
Traceback — full Python traceback, rendered inside the email
Context dict — relevant state (contact IDs, user IDs, request payloads) so you can jump straight to the broken thing
Cooldown window — 5 minutes per unique source+error[:80] so a burst of identical failures does not flood the inbox

Copy that file’s pattern when spinning up a new app. Python skeleton:

import logging
import traceback
from notify import send_error_alert  # copy from claude-code-crm

logger = logging.getLogger("my_app.main_loop")

def run_main_loop():
    try:
        # the actual work
        do_the_thing()
    except Exception as e:
        logger.exception("main loop failed")  # logs traceback at ERROR level
        send_error_alert(
            source="my_app.main_loop",
            error=str(e),
            traceback_str=traceback.format_exc(),
            # current_batch / current_user stand in for whatever state the app tracks
            context={"batch_id": current_batch, "user_id": current_user},
        )
        # re-raise or return depending on whether the app should keep running

Node equivalent: wrap async handlers in try/catch, send via fetch('https://api.resend.com/emails', ...) with the same structure. The Resend API key is in shared-secrets.env as RESEND_API_KEY.

Non-negotiable rules:

Always include traceback. An error message without a traceback is almost useless for debugging.
Always include context. The contact ID, the batch ID, the request path — anything that lets you reproduce or jump straight to the failing state.
Always include a cooldown. Without it, a 2-minute outage emails you 120 times.
Never swallow exceptions silently. except: pass is banned. If you truly need to continue on error, log + alert first, THEN continue.
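
The cooldown rule is the easiest one to get wrong, so here is a minimal sketch of it. This is not the code in notify.py: the class name AlertCooldown and the injectable clock are assumptions made for the example. The key follows the source + error[:80] convention described above.

```python
import time
from typing import Callable, Dict


class AlertCooldown:
    """Suppress repeat alerts for the same source + error prefix within a window."""

    def __init__(self, window_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so the behavior can be tested without sleeping
        self._last_sent: Dict[str, float] = {}

    def should_send(self, source: str, error: str) -> bool:
        """Return True and record the send time unless an identical alert is still cooling down."""
        key = f"{source}:{error[:80]}"
        now = self._clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self.window_seconds:
            return False
        self._last_sent[key] = now
        return True
```

Call should_send(source, str(e)) right before sending the email, and always log the error whether or not the email goes out. Note that this sketch keeps its state in memory, so a cron job that starts a fresh process on every run would need to persist the timestamps somewhere instead.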

Log format standard

Every log line:

2026-04-16T16:58:32-06:00 [LEVEL] [component vX.Y.Z] Message text with context

ISO-8601 timestamp with timezone offset (no implicit UTC).
LEVEL: DEBUG, INFO, WARN, ERROR, FATAL. One of those five.
Component tag with version: a short identifier for which part of the app wrote the line, followed by the build/version ([worker v1.0.17], [server build 42], [background v2.3.1]). The version comes from build.txt for Python/Node apps and chrome.runtime.getManifest().version for Chrome extensions. This is mandatory. Without it you cannot tell whether a log line is from the build you just shipped or a stale process — a real bug we hit on the read-aloud-extension on 2026-04-19 when we could not confirm whether Build 17 was the one writing the logs.
Message: human-readable, includes relevant IDs or state. No trailing punctuation required — the tail of the line IS the message.

The version goes in the component tag, not the message body. Resolve it once at logger construction time so every line carries it automatically — never rely on the developer to remember to include it in each log call.

Python example:

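import logging
from datetime import datetime
from pathlib import Path

APP_DIR = Path(__file__).resolve().parent
LOG_DIR = APP_DIR / "logs"
LOG_DIR.mkdir(exist_ok=True)
BUILD = (APP_DIR / "build.txt").read_text().strip()  # resolved once, at logger setup

# The standard's level names are WARN and FATAL, not WARNING and CRITICAL.
logging.addLevelName(logging.WARNING, "WARN")
logging.addLevelName(logging.CRITICAL, "FATAL")

class DailyFileHandler(logging.Handler):
    # Appends each record to logs/YYYY-MM-DD.log, picked by the record's local date.
    def emit(self, record):
        try:
            day = datetime.fromtimestamp(record.created).strftime("%Y-%m-%d")
            with open(LOG_DIR / f"{day}.log", "a", encoding="utf-8") as f:
                f.write(self.format(record) + "\n")
        except Exception:
            self.handleError(record)

class OffsetFormatter(logging.Formatter):
    # ISO-8601 local time with an explicit offset, e.g. 2026-04-16T16:58:32-06:00.
    def formatTime(self, record, datefmt=None):
        return datetime.fromtimestamp(record.created).astimezone().isoformat(timespec="seconds")

formatter = OffsetFormatter(f"%(asctime)s [%(levelname)s] [%(name)s build {BUILD}] %(message)s")
handlers = [DailyFileHandler(), logging.StreamHandler()]
for h in handlers:
    h.setFormatter(formatter)
logging.basicConfig(level=logging.INFO, handlers=handlers)

logger = logging.getLogger("worker")  # the logger name becomes the component tag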

Node equivalent: use pino with a file transport pointed at logs/YYYY-MM-DD.log. Chrome extensions: see the scaffold section below.

Repo creation checklist

Run this on every new app:

Create the private GitHub repo: gh repo create ojhurst/{name} --private --clone && cd {name}
Copy the CLAUDE.md template: cp ~/apps/cc/templates/CLAUDE.md ./CLAUDE.md, then fill in placeholders.
Initialize build.txt: echo 1 > build.txt
Set up .gitignore with the baseline: logs/, node_modules/, .env, .env.local, .env.*.local, __pycache__/, .venv/, .DS_Store, *.db.
Create .env.example at the repo root with every config key the app needs and safe placeholders. See the Env Files SOP for the full pattern. Then cp .env.example .env locally and fill in real values.
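Create logs/ directory (gitignored, but needs to exist so the app does not crash on first run): mkdir -p logs && touch logs/.gitkeep. If you want the directory tracked, change the ignore rule from logs/ to logs/* and add !logs/.gitkeep after it; Git cannot re-include a file whose parent directory is itself excluded, so !logs/.gitkeep has no effect under a plain logs/ rule.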
First commit: git add -A && git commit -m "Build 1: initial scaffold" && git push -u origin main

Chrome extension scaffold

See the full Chrome Extension SOP — folder layout, bump-build.sh, manifest rules, programmatic logging, and popup build visibility.

Summary of the logging path:

One shared receiver — ~/apps/cc/chrome-log-receiver.py under launchd (com.cc.chrome-log-receiver), listening on http://127.0.0.1:9876. Runs always, used by every extension.
Every extension has extension/logger.js (copied from ~/apps/cc/templates/chrome-extension-logger.js) that POSTs to /log/<source-name>.
Logs land in ~/apps/cc/logs/<source>.log — a single canonical location Claude reads directly. No per-extension server.
Never tell James to open DevTools. Error messages must not say “check the service worker console.” If the info needed is not in the log file, add more logging.

Manifest v3 only. Nested extension/ folder. Popup shows Build number + summary.

Website scaffold

See the existing Cloudflare Pages Deploy SOP for the full deploy path.

Server-side code (FastAPI, Express, etc.) follows the Python/Node logging pattern above. Client-side code logs to console.* — no need to ship client logs to a server unless you are debugging production, in which case use a service like Sentry.

CLI scaffold

For a Python or Node CLI:

stdout = structured output meant for piping (JSON, CSV, etc.)
stderr = human-readable messages, progress, errors
log file = everything — timestamps, full detail, for post-mortem

If a user runs the CLI and something goes wrong, they should see useful output on stderr immediately, and the log file should have the full story for debugging later.

Use argparse (Python) or commander (Node) for argument parsing.
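
A minimal argparse sketch of the stdout/stderr split, under stated assumptions: the program name example-cli, the VERSION placeholder, and the toy "length" payload are all invented for illustration, and the log-file leg is left out to keep the example short.

```python
import argparse
import json
import sys
from typing import List, Optional

# Assumption: a real CLI would derive this from build.txt + git log.
VERSION = "Build 0 - placeholder"


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="example-cli")  # hypothetical name
    parser.add_argument("--version", action="version", version=VERSION)
    parser.add_argument("names", nargs="+", help="items to process")
    return parser


def main(argv: Optional[List[str]] = None) -> int:
    args = build_parser().parse_args(argv)
    # Human-readable progress goes to stderr so it never pollutes a pipe.
    print(f"processing {len(args.names)} item(s)", file=sys.stderr)
    results = [{"name": name, "length": len(name)} for name in args.names]
    # Structured output goes to stdout, one JSON document per run.
    json.dump(results, sys.stdout)
    sys.stdout.write("\n")
    return 0
```

In a real entry point the module would end with sys.exit(main()), so that a shell can pipe stdout into another tool (for example jq) while progress and errors still show on the terminal.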

Cron / daemon scaffold

Cron scripts run with a minimal environment: a bare-bones $PATH (typically just /usr/bin:/bin), sometimes no $HOME, and none of your shell setup.

Start the script with #!/usr/bin/env python3 or #!/bin/bash with explicit paths.
Export PATH explicitly at the top: export PATH="/usr/local/bin:/usr/bin:/bin:$PATH".
Run from an absolute path: cd "$(dirname "$0")" or pass --cwd.
Log to the standard location.
Redirect the job’s stdout and stderr so cron does not email you the output of every run: 30 3 * * * /path/to/script.sh > /dev/null 2>&1.
If the script fails in a way the log captures, run a separate watchdog that reads logs/ and alerts.

Example crontab entry:

0 3 * * * /Users/ojhurst/apps/update-manager/nightly.sh >> /Users/ojhurst/apps/update-manager/logs/cron.log 2>&1
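
For a cron job written in Python, the PATH and working-directory rules can be applied at the top of the script itself. A rough sketch, assuming a hypothetical helper named prepare_cron_env and the same PATH prefix as the shell example above:

```python
import os
from pathlib import Path
from typing import MutableMapping

CRON_SAFE_PATH = "/usr/local/bin:/usr/bin:/bin"


def prepare_cron_env(script_file: str,
                     environ: MutableMapping[str, str] = os.environ) -> Path:
    """Pin PATH and the working directory so the job behaves the same under cron and a shell."""
    existing = environ.get("PATH", "")
    environ["PATH"] = f"{CRON_SAFE_PATH}:{existing}" if existing else CRON_SAFE_PATH
    script_dir = Path(script_file).resolve().parent
    os.chdir(script_dir)  # relative paths such as logs/ now resolve next to the script
    return script_dir
```

Calling prepare_cron_env(__file__) as the first statement of the job mirrors the export PATH and cd "$(dirname "$0")" lines of a bash script.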

Future-self readable: launchd plist naming

ProgramArguments[0] is what macOS shows in Background Items notifications. If you set it to /bin/bash and pass the script as the second array entry, your future self gets a toast that reads "bash" can run in the background. You can manage background activity in Login Items Extensions. Six months later, you have no idea which of your LaunchAgents that is.

Rule: the first ProgramArguments entry must be a script with a meaningful filename and its own shebang. Never the interpreter.

Wrong — macOS labels this job bash:

<key>ProgramArguments</key>
<array>
    <string>/bin/bash</string>
    <string>/path/to/screenshots-roll.sh</string>
</array>

Right — macOS labels this job screenshots-roll. Script starts with #!/usr/bin/env bash and is chmod +x:

<key>ProgramArguments</key>
<array>
    <string>/path/to/screenshots-roll.sh</string>
</array>

Python or Node entry points — same rule. Do not put /usr/bin/env python3 server.py in ProgramArguments. Write a thin launcher script with a descriptive name:

#!/usr/bin/env bash
exec /usr/bin/env python3 "$(dirname "$0")/server.py"

Name it claude-code-chrome-ext-server (no extension), chmod +x, point ProgramArguments at it. macOS shows claude-code-chrome-ext-server in notifications — actually useful.

Drop RunAtLoad for periodic-only jobs. A StartCalendarInterval-only job does not need <key>RunAtLoad</key><true/> — every reload otherwise fires a fresh “App Background Activity” notification. Keep RunAtLoad only when the daemon must come up at boot.

Drift example — 2026-05-18

I shipped a com.claude-code-chrome-ext.screenshots-roll launchd job with /bin/bash as ProgramArguments[0] and RunAtLoad=true. The first toast that fired said "bash" can run in the background. James caught it immediately:

“I cannot have it be called Bash. It can run in the background. I will never know what that is in six months.”

Drift root cause: the launchd plist template I had in my head was the “bash + script-as-arg” idiom that works fine on a server but produces meaningless macOS notifications on a desktop machine. The script’s filename was already meaningful (screenshots-roll.sh); I just was not pointing macOS at it directly.

Guard now in place at ~/apps/cc/hooks/launchd-plist-name-guard.py — a PreToolUse hook that blocks any .plist whose first ProgramArguments entry is a generic interpreter (/bin/bash, /usr/bin/env, python3, node, etc.). The block message tells future-Claude exactly how to fix it.
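
To make the check concrete, here is a sketch of the core test such a guard could run. It is not the code of the actual hook, whose logic and interpreter list may differ; GENERIC_INTERPRETERS and both function names are assumptions for illustration.

```python
import plistlib
from pathlib import PurePosixPath

# Assumed list of generic interpreter names; the real hook's list may differ.
GENERIC_INTERPRETERS = {"bash", "sh", "zsh", "env", "python", "python3", "node"}


def first_program_argument(plist_bytes: bytes) -> str:
    """Return ProgramArguments[0] from a launchd plist, or "" if it is absent."""
    data = plistlib.loads(plist_bytes)
    args = data.get("ProgramArguments") or []
    return args[0] if args else ""


def is_generic_interpreter(plist_bytes: bytes) -> bool:
    """True when the job would be labeled with an interpreter name instead of a script name."""
    first = first_program_argument(plist_bytes)
    return PurePosixPath(first).name in GENERIC_INTERPRETERS
```

Comparing only the basename means /bin/bash, /usr/bin/env, and a bare python3 are all caught by the same rule.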

The lesson is bigger than launchd: anything the user sees in a notification, a process list, or a six-months-later forensic search needs a name that means something. Default to the meaningful name, never the interpreter.

The “check the logs” workflow

When James says “check the logs” without a specific app name:

Pull the Thought Catcher / current context to infer which app he means.
If still unclear, ask once: “Check logs for which app? Options: CRM, Update Manager, auto-journal, voice-first…”
Look at ~/apps/{app}/logs/YYYY-MM-DD.log (today’s file).
If nothing relevant in today’s, walk back a day at a time up to 7 days.
Look for [ERROR] or [WARN] lines first, then [INFO] for context around them.
Report: last error, when it happened, what was happening around it (lines before + after), suggested cause.
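
The walk-back steps above can be sketched as a single function. This is an illustration under assumptions, not the code behind /review-logs: the name find_latest_problem is hypothetical, the level match relies on the [LEVEL] format from the log standard, and [FATAL] is included alongside [ERROR] and [WARN] as an assumption.

```python
import re
from datetime import date, timedelta
from pathlib import Path
from typing import List, Optional, Tuple

# [FATAL] is included as an assumption; the workflow names ERROR and WARN.
PROBLEM = re.compile(r"\[(ERROR|WARN|FATAL)\]")


def find_latest_problem(log_dir: Path, today: date, max_days: int = 7,
                        context: int = 2) -> Optional[Tuple[Path, List[str]]]:
    """Return the newest log file with a problem line, plus that line and its neighbors."""
    for offset in range(max_days + 1):  # today first, then back one day at a time
        day = today - timedelta(days=offset)
        path = log_dir / f"{day.isoformat()}.log"
        if not path.exists():
            continue
        lines = path.read_text(encoding="utf-8").splitlines()
        hits = [i for i, line in enumerate(lines) if PROBLEM.search(line)]
        if hits:
            last = hits[-1]  # the most recent problem line in that file
            return path, lines[max(0, last - context): last + context + 1]
    return None
```

The returned slice carries the lines before and after the last problem line, which is the raw material for the "what was happening around it" part of the report.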

The /review-logs and /mts-logs skills already encode this pattern for their respective apps — follow the same shape when adding a new one.

Testing is non-negotiable

Every app ships with a smoke.sh at its repo root that runs one real end-to-end call and exits 0 or 1. No mocking. No unit test scaffolding. One call that proves the app is alive.

For read-aloud-extension, that is a curl against the local TTS server asking for four bytes of synthesized audio. If the server is broken, the curl fails, the script exits non-zero, and we know inside ten seconds.

Why this matters. In April 2026 a brew upgrade moved Python from 3.14.3 to 3.14.4 and stranded the read-aloud-extension TTS server on a deleted framework path. Every synth call failed with ModuleNotFoundError: No module named 'concurrent.futures.thread'. The extension looked broken. The fix was a 30-second launchd kick, but the bug sat live for a day because nothing checked. A ten-line smoke test would have caught it the next time anything touched Python on this machine.

The `smoke.sh` contract

Lives at the repo root. Executable.
Exits 0 on success, non-zero on failure.
Prints a one-line summary on success; full diagnostic on failure.
Runs in under ten seconds. If it cannot, it is not a smoke test — it is an integration test, and those live elsewhere.
Declares runtime dependencies as a comment header so the master runner can filter:

#!/usr/bin/env bash
# smoke-deps: python, edge-tts, launchd-tts-server
set -euo pipefail

curl -fsS -o /dev/null \
  -X POST http://127.0.0.1:9877/tts \
  -H "Content-Type: application/json" \
  -d '{"text":"ok","voice":"en-US-AndrewNeural"}' \
  || { echo "FAIL: TTS server not responding on :9877"; exit 1; }

echo "OK: read-aloud-extension TTS server responding"

The smoke-deps header is the contract with change management — it is how the master runner knows which smoke tests to run when something upstream changes.
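
A sketch of how a runner might read that header, assuming the comment sits within the first few lines of the script and uses a comma-separated list as in the example above. The function names and the ten-line limit are assumptions, not the master runner's actual behavior.

```python
import re
from typing import Set

HEADER = re.compile(r"^#\s*smoke-deps:\s*(.*)$")


def parse_smoke_deps(script_text: str, max_header_lines: int = 10) -> Set[str]:
    """Return the dependency tags declared in a smoke.sh header, or an empty set."""
    for line in script_text.splitlines()[:max_header_lines]:
        match = HEADER.match(line.strip())
        if match:
            return {dep.strip() for dep in match.group(1).split(",") if dep.strip()}
    return set()


def matches_filter(script_text: str, wanted: str) -> bool:
    """True when the script declares the requested dependency tag."""
    return wanted in parse_smoke_deps(script_text)
```

A script with no header parses to an empty set, so it matches no filter. Whether such a script should still run under an "all" filter is a decision for the runner, not the parser.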

The master runner

~/apps/cc/smoke.sh walks every repo in ~/apps/, reads each smoke.sh header, and runs only the ones whose smoke-deps match the filter:

smoke --python     # every smoke test tagged `python`
smoke --chrome     # every Chrome extension
smoke --ghl        # everything that talks to GoHighLevel
smoke --all        # every smoke test in every app

This is the piece that makes change management cheap. Before upgrading Python, smoke --python. After upgrading Python, smoke --python. If the after-run is red and the before-run was green, the upgrade broke something and you know exactly which app to look at.

Preflight before push

Any repo with a build script in package.json runs npm run build locally before the push is allowed to leave the machine. A pre-push git hook enforces it.

Why. CI-side build failures are silent by default — tms-internal Build 37 (2026-04-17) pushed clean, failed on GitHub Actions because of a YAML frontmatter quirk, and the CF Pages deploy sat broken while the verifier timed out on the Cloudflare Access auth page. A 3-second local build would have caught it before the push landed.

The hook. Central script at ~/apps/cc/hooks/preflight-build.sh. Installed into every repo’s .git/hooks/pre-push by ~/apps/cc/hooks/install-pre-push-hook.sh, which is called from setup.sh on every pull. The hook skips silently for repos with no package.json or no build script, so Python/FastAPI/MCP/raw-HTML repos are unaffected.
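
The skip rule is simple enough to express as one predicate. The sketch below is a Python illustration of that decision only; it is not a port of preflight-build.sh, and the function name needs_preflight_build is hypothetical.

```python
import json
from pathlib import Path


def needs_preflight_build(repo: Path) -> bool:
    """True only when the repo has a package.json that defines a non-empty build script."""
    package_json = repo / "package.json"
    if not package_json.is_file():
        return False  # Python, FastAPI, raw-HTML repos: nothing to build
    try:
        manifest = json.loads(package_json.read_text(encoding="utf-8"))
    except json.JSONDecodeError:
        return False  # assumption: an unreadable manifest is treated as "skip"
    if not isinstance(manifest, dict):
        return False
    scripts = manifest.get("scripts")
    return isinstance(scripts, dict) and bool(scripts.get("build"))
```

Treating a malformed package.json as "skip" is a choice made for this sketch; a stricter hook could reasonably block the push in that case instead.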

Behavior on fail.

Prints the last 30 lines of the build output.
Points to the full log at /tmp/preflight-build-$$.log.
Exits non-zero. Push is blocked.
Escape hatch: git push --no-verify if you know what you are doing.

Behavior on pass. Silent success, push proceeds.

Custom pre-push hooks are preserved. If a repo already has its own pre-push (e.g., mytechsupport has a build-number increment check), the installer leaves it alone. Those repos should call ~/apps/cc/hooks/preflight-build.sh from inside their own hook if they want the build check too.

Installation. Runs automatically on every pullall (via cc/setup.sh). Manual re-install: bash ~/apps/cc/hooks/install-pre-push-hook.sh.

One change per commit

Work on multiple things in parallel if you want — the rule is that each change lands as its own distinct, concrete, reversible unit. Never bake two unrelated changes into a single commit, even when you wrote them in the same session.

Why. If both changes ship together and one of them is wrong, you have to unwind the good one to back out the bad one. Separate commits keep every change independently revertable. git revert <sha> is cheap; surgery inside a mixed commit is not.

Test: “If I wanted to back out only this change, could I?” If no, split the commit.

Example. Live View Build 37 refactored the map from SwiftUI Map to an MKMapView wrapper so the fly-zoom duration slider actually works. Build 38 added a pre-record hotspot overlay showing tap targets. Two unrelated features, two commits, two build.txt bumps. Either one can be reverted without touching the other.

Not this: “Build 37: map refactor + hotspot overlay + slight color tweak.” That is three changes in one wrapper. If the map refactor breaks pinch-zoom in a way we only notice a week later, we cannot revert without losing the hotspots and the color tweak too.

Change management — when we upgrade anything

Personal stack, enterprise spirit. Four steps. Stripped to what one person can sustain.

1. Classify the change

Class	What it looks like	Blast radius
Patch	Config tweak, bug fix in one app	That app only
Minor	New feature, dependency bump inside one repo	That app, maybe its direct callers
Major	Runtime upgrade (Python, Node, Chrome), shared library swap, infrastructure change	Every app sharing that runtime

A brew upgrade python@3.14 is a major change even when the version bump looks cosmetic. Treat it accordingly. The cost of classifying up is one extra smoke run. The cost of classifying down is a day of silent breakage.

2. Check the dependency graph

Every app’s CLAUDE.md declares its runtime dependencies near the top — Python, Node, Chrome extension APIs, shared services like GHL or Edge TTS. Before a major change, grep across the fleet:

grep -l "python@3.14" ~/apps/*/CLAUDE.md
grep -l "edge-tts"    ~/apps/*/CLAUDE.md

If the grep returns an app that is not already in your head, read its README and smoke.sh header before touching the runtime. Stale dependency declarations are worse than none — if a repo’s CLAUDE.md claims a dependency the app no longer uses, or omits one it does use, the graph lies. Keep the declarations honest or the whole mechanism rots.
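
The same fleet-wide lookup can be done from Python when the result needs to feed another script. A small sketch, assuming a hypothetical function name apps_declaring and the one-CLAUDE.md-per-repo layout described above:

```python
from pathlib import Path
from typing import List


def apps_declaring(dependency: str, apps_root: Path) -> List[str]:
    """Return the names of app directories whose CLAUDE.md mentions the dependency."""
    matches = []
    for claude_md in sorted(apps_root.glob("*/CLAUDE.md")):
        text = claude_md.read_text(encoding="utf-8", errors="replace")
        if dependency in text:
            matches.append(claude_md.parent.name)
    return matches
```

Like the grep, this is a plain substring match, so it is only as accurate as the declarations it reads.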

3. Run smoke before and after

smoke --python       # baseline — everything green right now?
brew upgrade python@3.14
launchctl kickstart -k gui/$UID/com.read-aloud-extension.tts-server   # kick any daemons
smoke --python       # did anything regress?

If the baseline is already red, stop. Fix the existing breakage first. Never change a system you cannot prove is healthy — the post-change red becomes impossible to attribute.
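
The before/after comparison reduces to two small checks. In this sketch the smoke results are modeled as a mapping from app name to pass/fail. That shape, and both function names, are assumptions for illustration rather than the master runner's actual output.

```python
from typing import Dict, List


def already_red(baseline: Dict[str, bool]) -> List[str]:
    """Apps failing before the change; if this is non-empty, stop and fix them first."""
    return sorted(app for app, ok in baseline.items() if not ok)


def regressions(baseline: Dict[str, bool], after: Dict[str, bool]) -> List[str]:
    """Apps that were green in the baseline and are red (or missing) after the change."""
    return sorted(app for app, ok in baseline.items() if ok and not after.get(app, False))
```

An app that disappears from the after-run is counted as a regression here, on the reasoning that a smoke test which stopped reporting is not proof of health.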

4. Rollback plan and changelog

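Rollback plan: written before the change, not after. For Homebrew: there is no one-command downgrade (brew switch was removed in Homebrew 2.6.0), so the plan is to reinstall the previous version from an older copy of the formula, e.g. brew extract --version=3.14.3 python@3.14 <your-tap>, and brew pin it until the breakage is understood. For npm: pin in package.json and npm ci. For pip: pip install pkg==previous_version. If there is no rollback path, the change is not ready.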
Changelog — build.txt + commit message. Both already exist in every repo. A commit like Build 47: upgrade Python 3.14.3 → 3.14.4, smoke --python green tells future-me exactly what shipped and what was validated. No separate changelog file.

Why this is lighter than it sounds

None of this is new process. build.txt, CLAUDE.md, logging, and commit messages are already in every repo because of the rest of this SOP. The only new artifacts are the per-app smoke.sh and the master runner at ~/apps/cc/smoke.sh. Once both exist, every change — patch, minor, or major — follows the same three commands: baseline, change, verify.

Build it light

Every email and every externally-rendered HTML output uses a light background — white card, dark text. Never navy, never #1a1a2e, never dark blue. Dark backgrounds get color-inverted by dark-mode email clients and force-dark browsers into unreadable dark-on-dark, and we do not control the recipient’s renderer. This is a correctness rule, not a style choice.

Full pattern, the inversion fingerprint, and the reference template: Build It Light, Not Dark.

Deprecate an app

When an app is no longer used:

Archive the GitHub repo (settings → archive). Keeps it readable but disables writes.
Remove it from ~/apps/ if it is taking up space, OR move to ~/apps/_deprecated/{name}/.
Add a row to Deprecated Apps (create the page if it does not exist yet) noting: name, what it did, when retired, why, where successor lives.
Leave the logs in place if they might matter for historical debugging — compress with tar -czf logs.tar.gz logs/ to save space.

Build It Light, Not Dark — light backgrounds for emails and rendered output
Env Files — .env and .env.example — the config pattern every new repo uses
Adding a Page to tms-internal — meta-SOP for writing pages like this one
Cloudflare Pages Deploy — website deploy path
Update Manager — fleet-wide software inventory
Claude Code Issues Filed — running log of upstream asks