Verify Deploys
The Rule
Never tell James “should be live in 30 seconds” or “give it a minute.” Poll the deployed thing for its current build number, compare it to the one just pushed, and only report “Build X is live” after the numbers match. On timeout, say so explicitly: a timeout is data, and it means the deploy got stuck somewhere.
Caches, CDNs, service workers, Vercel preview URLs, and GitHub Actions can all silently serve stale content or fail without surfacing the error, which is why verification cannot be skipped.
This SOP assumes the Build Visibility section of “How We Build Anything” is in place — every deployed thing exposes its build number somewhere readable. Verify Deploys is how to USE that.
Local preflight is step 0. Before the push is allowed, the pre-push preflight hook runs npm run build and blocks the push on failure. That catches CI build breaks on the laptop instead of on Cloudflare Pages three minutes later. Remote verify-deploy is step N: it confirms the pushed build is actually being served.
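A pre-push hook can be any executable, so the preflight could be sketched in Python as below. This is a minimal sketch and not the actual hook: the names preflight and main are hypothetical, and only the npm run build command comes from the text above.

```python
import subprocess
import sys


def preflight(command=("npm", "run", "build")):
    """Run the build command and return its exit code (0 means the push may proceed)."""
    try:
        result = subprocess.run(list(command))
    except FileNotFoundError:
        # The build tool is not installed: treat that as a failed preflight.
        return 127
    return result.returncode


def main():
    """Hook entry point: report a failed build and hand the exit code back to Git."""
    code = preflight()
    if code != 0:
        print("preflight: build failed, push blocked", file=sys.stderr)
    return code
```

Saved as an executable pre-push hook, the script would end with sys.exit(main()). Git aborts the push when that hook exits non-zero.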
The core pattern
1. Push code with Build N in build.txt
2. Wait a few seconds (let the CI hook fire)
3. Poll the deployed thing for its current build number
4. Keep polling until:
       build == N           → "Build N is live"
       OR: timeout elapsed  → "Deploy did not land in {T}s, check GH Actions"
       OR: build > N        → bug ("how did that happen?"), stop and investigate
5. Only AFTER a successful match, open the page in James's browser and tell him

No step in that flow involves estimating. Every step is observable and deterministic.
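The polling loop in step 4 can be sketched as follows. This is an illustration under assumptions, not the real verifier: poll_until_live, its parameters, and the three status strings are hypothetical names, and fetch_build stands in for whatever reads the build number from the deployed surface.

```python
import time


def poll_until_live(fetch_build, expected, timeout=180.0, interval=3.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_build() until it returns `expected` or `timeout` seconds pass.

    Returns (status, last_seen), where status is "live", "timeout" or "unexpected".
    """
    deadline = clock() + timeout
    last_seen = None
    while True:
        try:
            last_seen = fetch_build()
        except Exception:
            # A failed fetch just means "not live yet"; keep the previous value.
            pass
        if last_seen == expected:
            return "live", last_seen
        if last_seen is not None and last_seen > expected:
            # A build newer than the one just pushed is a bug, not a delay.
            return "unexpected", last_seen
        if clock() >= deadline:
            return "timeout", last_seen
        sleep(interval)
```

Passing sleep and clock as parameters keeps the loop deterministic in tests: a fake clock can drive the timeout branch without waiting three minutes.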
Every deployed thing exposes its build
The “How We Build Anything” SOP already makes this non-negotiable. Quick recap of where the build number lives per surface type:
| Surface | Exposed at |
|---|---|
| Static site (Astro, Starlight) | Footer + <meta name="build" content="N"> in <head> |
| Next.js / React | /api/version returning {build, summary, sha} + footer |
| FastAPI / Express service | /version returning {build, summary, sha} + /health |
| Chrome extension | chrome.runtime.getManifest().description starts with Build N - ... |
| CLI | --version prints Build N - summary |
If a deploy target does not expose its build, the fix is upstream — add the build surface first, then come back here.
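For the surfaces that expose a "Build N - summary" string (the CLI --version output and the extension description), reading the number back could look like the sketch below. The function name parse_build_line is hypothetical, and the sketch assumes the string is a single line in exactly that shape.

```python
import re

# "Build", one or more digits, a hyphen, then the free-text summary.
_BUILD_LINE = re.compile(r"^Build\s+(\d+)\s*-\s*(.*)$")


def parse_build_line(text):
    """Return (build, summary) from a 'Build N - summary' string, or None if it does not match."""
    match = _BUILD_LINE.match(text.strip())
    if match is None:
        return None
    return int(match.group(1)), match.group(2).strip()
```

Returning None instead of raising lets a caller treat an unreadable surface the same way as a build that has not landed yet.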
The verifier
Canonical path: ~/apps/cc/verify-deploy.sh (currently specific to themarketingshow; to be generalized over time).
Standard interface:
    bash ~/apps/cc/verify-deploy.sh <project> <expected-build> [--timeout=180]

- <project>: one of the configured adapters (tms, tms-internal, mytechsupport, ath, crm, etc.)
- <expected-build>: the integer we just pushed
- --timeout: seconds before bailing (default 180)
Exit codes:
- 0: verified live
- 2: timed out
- 3: unexpected build (the polled number differs from the expected one in a way that waiting will not fix, for example it is higher than the build just pushed)
Inside, the script dispatches to a per-provider adapter. New adapters get added as new deploy targets land.
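The interface and exit codes above could be mirrored in a small argument parser like the one below. This is a hedged sketch and does not describe how verify-deploy.sh is actually written: parse_args, exit_code_for, and the constant names are hypothetical, and KNOWN_PROJECTS lists only the adapters named above (the real list is longer, per the "etc.").

```python
import argparse

EXIT_LIVE = 0        # verified live
EXIT_TIMEOUT = 2     # timed out
EXIT_UNEXPECTED = 3  # unexpected build

# Only the adapters named in the interface description; the real list is longer.
KNOWN_PROJECTS = ("tms", "tms-internal", "mytechsupport", "ath", "crm")


def parse_args(argv):
    """Parse: <project> <expected-build> [--timeout=180]."""
    parser = argparse.ArgumentParser(prog="verify-deploy")
    parser.add_argument("project", choices=KNOWN_PROJECTS)
    parser.add_argument("expected_build", type=int, metavar="expected-build")
    parser.add_argument("--timeout", type=int, default=180)
    return parser.parse_args(argv)


def exit_code_for(status):
    """Map a poll status ("live", "timeout", "unexpected") to the documented exit code."""
    return {"live": EXIT_LIVE, "timeout": EXIT_TIMEOUT, "unexpected": EXIT_UNEXPECTED}[status]
```

Restricting project to a fixed set of choices makes a typo fail immediately instead of polling a nonexistent adapter until the timeout.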
Per-provider recipes
Cloudflare Pages (Astro, Starlight, static)
    expected=$1
    url="https://themarketingshow.com/build.txt"   # or meta tag / footer

    # Poll up to 60 times, 3 seconds apart (about 180 seconds in total)
    for i in $(seq 1 60); do
      got=$(curl -s "$url" | tr -d ' \n')
      if [ "$got" = "$expected" ]; then
        echo "live at Build $expected"
        exit 0
      fi
      sleep 3
    done

    echo "TIMEOUT: build stuck at $got"
    exit 2

Tips:
- Expose build.txt at the site root by including it in the public/ folder, OR
- Serve a <meta name="build"> tag and read it with curl -s $url | grep build.
- Cache-bust: append ?v=$(date +%s) to the URL.
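The meta tag and cache-busting tips can be sketched together. The helper names build_from_html and cache_busted are hypothetical, and the sketch assumes the tag has exactly the form shown in the surface table, <meta name="build" content="N">.

```python
import time
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


class _BuildMetaParser(HTMLParser):
    """Collects the content attribute of the first <meta name="build"> tag."""

    def __init__(self):
        super().__init__()
        self.build = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and attributes.get("name") == "build" and self.build is None:
            self.build = attributes.get("content")


def build_from_html(html):
    """Return the build number from the meta tag as an int, or None if absent or non-numeric."""
    parser = _BuildMetaParser()
    parser.feed(html)
    content = (parser.build or "").strip()
    return int(content) if content.isdecimal() else None


def cache_busted(url, now=None):
    """Append v=<unix seconds> to the query string, keeping any existing parameters."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("v", str(int(time.time() if now is None else now))))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Parsing the tag rather than grepping for the word "build" avoids false matches on unrelated markup, such as a script path that happens to contain "build".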
Vercel (Next.js)
Expose /api/version in the app:
    import buildInfo from '../../build-info.json'

    export default function handler(req, res) {
      res.status(200).json(buildInfo)
    }

Where build-info.json is generated at build time from build.txt and git log -1 --format=%s.
Verifier polls https://app.com/api/version and compares .build.
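On the verifier side, comparing .build means parsing the JSON body defensively, because a stale or misrouted deploy may return an HTML error page instead. A minimal sketch, assuming the payload shape {build, summary, sha} from the surface table; the function name build_from_version_payload is hypothetical.

```python
import json


def build_from_version_payload(body):
    """Extract the integer `build` field from a /api/version JSON body, or None if unusable."""
    try:
        payload = json.loads(body)
    except (TypeError, ValueError):
        # Not JSON at all (for example an HTML error or login page).
        return None
    if not isinstance(payload, dict):
        return None
    build = payload.get("build")
    # Accept 42 or "42"; reject booleans, floats and anything else.
    if isinstance(build, bool):
        return None
    if isinstance(build, int):
        return build
    if isinstance(build, str) and build.strip().isdecimal():
        return int(build.strip())
    return None
```

The returned integer can then be compared directly with the expected build. A None result is best treated as "not live yet" and polled again.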
VPS / FastAPI / Express
Add a /version endpoint (FastAPI shown):
    from pathlib import Path
    import subprocess

    # `app` is the service's existing FastAPI() instance.
    @app.get("/version")
    def version():
        # build.txt sits next to this module and holds the integer build number
        build = int(Path(__file__).parent.joinpath("build.txt").read_text().strip())
        # Subject line and short SHA of the commit currently checked out
        summary = subprocess.check_output(["git", "log", "-1", "--format=%s"], text=True).strip()
        sha = subprocess.check_output(["git", "rev-parse", "--short", "HEAD"], text=True).strip()
        return {"build": build, "summary": summary, "sha": sha}

Verifier curls https://crm.gokartpark.com/version, checks .build.
Cloudflare Access–gated sites (tms-internal)
curl against the live URL returns a 302 to the Access auth page, so the usual HTML scrape does not work. Two options:
Option A — Wrangler API (preferred):
    wrangler pages deployment list --project-name tms-internal \
      --json 2>/dev/null \
      | jq -r '.[0] | select(.latest_stage.status == "success") | .url'

Requires wrangler installed and authenticated with the same Cloudflare account.
Option B — GitHub Actions workflow status:
    gh run list --repo ojhurst/tms-internal --workflow deploy.yml --limit 1 --json status,conclusion

Wait for conclusion == "success", then trust it. This does NOT confirm the HTML is serving the new build, only that the workflow finished. Use Option A when you can.
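For Option B, the JSON printed by the gh command can be classified before deciding whether to keep waiting. A sketch under assumptions: workflow_state and its three return values are hypothetical names, and the input is assumed to be the array of {status, conclusion} objects that the --json status,conclusion flag requests.

```python
import json


def workflow_state(gh_json):
    """Classify the newest run from `gh run list --limit 1 --json status,conclusion`.

    Returns "success", "failed" or "pending".
    """
    runs = json.loads(gh_json)
    if not runs:
        return "pending"  # no run registered yet
    run = runs[0]
    if run.get("status") != "completed":
        return "pending"  # queued or still in progress
    return "success" if run.get("conclusion") == "success" else "failed"
```

A "failed" result is a reason to stop polling and read the workflow logs. Only "pending" justifies another poll.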
Chrome extensions
Extensions deploy to the local filesystem or to the Chrome Web Store. There is no “URL to poll.” Verification path:
- Unpacked (local dev): reload the extension via the chrome.management API or manually (chrome://extensions). Open the popup, read the build badge in the header. If it matches, done.
- Published to Web Store: the store review process is NOT deterministic. Track the expected build in a log, and hit the public listing page periodically to confirm the version number updated. This path is hours-to-days, not seconds.
What to tell James
Only AFTER the verifier exits 0:
- “Build N is live at [url].”
- Open the page in his browser with a text-fragment highlight pointing at the change.
- Include the build number in the text so he sees it in the voice read-aloud too.
On timeout:
- “Build N is not live yet — verifier timed out at {T} seconds. Last seen build was {M}. Checking GH Actions.”
- DO NOT say “should be up soon” or “give it another minute.”
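Generating the message from the verifier result, rather than typing it by hand, keeps estimates out of the report. A minimal sketch with wording adapted from the templates above; the function name report and the status strings "live" and "timeout" are hypothetical.

```python
def report(status, expected, last_seen=None, timeout=180, url=None):
    """Build the one-line status message for a verifier result."""
    if status == "live":
        where = f" at {url}" if url else ""
        return f"Build {expected} is live{where}."
    if status == "timeout":
        seen = "unknown" if last_seen is None else str(last_seen)
        return (f"Build {expected} is not live yet: verifier timed out at {timeout} seconds. "
                f"Last seen build was {seen}. Checking GH Actions.")
    # No template exists for other outcomes; those need investigation, not a canned message.
    raise ValueError(f"no message template for status {status!r}")
```

Both messages carry the build number in the text itself, so it survives a voice read-aloud.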
Common failure modes
- Build failed on the CI side. Check GH Actions:
      gh run list --repo ojhurst/{repo} --limit 3 --json status,conclusion,headSha
- Stale cache served. Force a fresh fetch with ?v=$(date +%s) on the URL, or clear the Cloudflare cache via API if it persists.
- Service worker intercepted the request. In a browser, Dev Tools → Application → Service Workers → Unregister. In the verifier, always hit the URL with curl (for example curl --max-time 5), which never goes through a service worker or the browser cache.
- Vercel preview vs production URL confusion. A push to main deploys to production; every other branch makes a preview. Always verify against the production URL for real deploys.
- DNS propagation for a new deploy target. A brand new hostname can take up to a few minutes. Use dig or nslookup to confirm DNS is resolving before blaming the deploy.
Related
- How We Build Anything: the Build Visibility section this SOP depends on
- Cloudflare Pages Deploy: the deploy SOP for CF Pages sites
- Adding a Page to tms-internal: meta-SOP for pages like this one