
Troubleshooting

Symptom table

| Symptom | Likely cause | Fix |
|---|---|---|
| Local next dev logs [shadow-canary] middleware passthrough (VERCEL_ENV != 'production') on every request | VERCEL_GIT_REPO_SLUG not set locally | Run vercel env pull OR add VERCEL_GIT_REPO_SLUG=<your-repo-slug> to .env.local. Middleware degrades to passthrough in dev. The behavior is correct, but the warning tells you the Edge Config key can’t be derived |
| Production deploy returns 500 on every request with VERCEL_GIT_REPO_SLUG is not set | The Vercel project isn’t linked to a Git repo, OR the env var was explicitly overridden to empty | Re-link the Git repo in Vercel Project Settings > Git, then redeploy. VERCEL_GIT_REPO_SLUG is auto-injected on every linked-project deploy |
| 404 on JS/CSS chunks after deploy | Skew Protection is OFF | Enable Skew Protection in Vercel Project Settings (Pro/Enterprise required) |
| /admin shows “unconfigured” or fails to load Edge Config data | Edge Config store not linked to the project, OR shadow-<repo-slug>-canary key not yet populated | Vercel dashboard > Storage > your store > Connected Projects: add your project. Run deploy-shadow.yml once to populate the key |
| Cross-deploy rewrites return 401 | Deployment Protection is blocking shadow/previous deploy URLs | Disable SSO Protection or enable Protection Bypass for Automation (see Prerequisites) |
| Canary cron does not fire | Default branch is not master | Rename default branch to master (GitHub Settings > Branches) or update the on.push.branches trigger in the workflow files |
| Canary stuck at 4% (or any low %) | canaryPaused: true in Edge Config | Use the admin UI Resume button, or set canaryPaused: false directly in Edge Config |
| Canary stuck at 0% after deploy | First deploy with no previous prod URL | Use [skip-canary] on first production push, or push again after the bootstrap deploy |
| Shadow deploy gets 0% traffic | trafficShadowPercent: 0 in Edge Config | Set trafficShadowPercent: 1 in Edge Config (propagates in 60s) |
| /debug page shows wrong branch | You are hitting the shadow or previous prod URL directly | This is expected: those deploys always show their own branch. Visit via the custom domain to see the routing in action |
| Cookie does not stick across requests | Cookie set with wrong domain or SameSite mismatch | Ensure the middleware sets sameSite: 'lax' and path: '/'; check that the custom domain matches what the browser expects |
| SLO check always fails | /api/slo returns non-200 | Check the endpoint response: curl https://your-app.vercel.app/api/slo. If it is a stub, it returns 200 by default; something is wrong with your custom implementation |
| vercel promote fails in CI | Token does not have team scope or wrong org ID | Regenerate the token with team scope and verify VERCEL_ORG_ID matches the team’s orgId in .vercel/project.json |
| Admin login returns 401 | ADMIN_USER or ADMIN_PASS env var not set, or wrong value | Check Vercel Project Settings > Environment Variables. Defaults are admin / 12345 if vars are absent |
| Edge Config reads fail at runtime | EDGE_CONFIG connection string not injected | Vercel injects this automatically when the store is linked. Re-link the store to the project and redeploy |
| Middleware runs on shadow/previous deploy and routes again | x-shadow-routed header not set or stripped | Verify rewriteTo sets x-shadow-routed: 1. Check that no other middleware or proxy strips it before reaching the target deploy |
| Rollback button in admin returns 500 | VERCEL_API_TOKEN not set or expired | Add/refresh VERCEL_API_TOKEN in Vercel env vars |
| Workflow fails with ::error::Edge Config read failed (HTTP NNN) | Vercel API transient: 401/403 token expired or wrong scope, 429 rate limit, 5xx Vercel outage, 000 runner network error | Fail-safe, not a bug. This step refuses to write Edge Config when the read can’t be trusted, preventing the historical state-clobber bug. Check Vercel status, wait for recovery, then re-trigger the workflow from the Actions UI. Edge Config was not mutated; the step exits before the PATCH. For 401/403, regenerate VERCEL_TOKEN with team scope. For 404 on the project lookup specifically, verify VERCEL_PROJECT_ID matches the project the token can access |
| Workflow fails with ::error:: … body is not valid JSON | Vercel returned 200 OK with a non-JSON body (CDN error page during an incident) | Same recovery as the HTTP-NNN error above: re-trigger once the API is healthy. The fail-fast is intentional: a malformed 200 response would otherwise be parsed as {} and clobber state |
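
The last two rows describe a read-before-write guard: the workflow only writes Edge Config when the preceding read returned HTTP 200 with a parseable JSON body. The sketch below shows that idea in TypeScript. It is not the workflow's actual code; the function name, the result type, and the error strings are assumptions made for illustration.

```typescript
// Hypothetical result type; not taken from the workflow source.
type ReadResult =
  | { ok: true; state: Record<string, unknown> }
  | { ok: false; error: string };

// Decide whether an Edge Config read can be trusted before any write.
// `status` is the HTTP status (0 stands in for a runner network error),
// `body` is the raw response text.
function parseEdgeConfigRead(status: number, body: string): ReadResult {
  if (status !== 200) {
    // Covers 401/403, 429, 5xx and network failures: never fall back to {}.
    return { ok: false, error: `Edge Config read failed (HTTP ${status})` };
  }
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    // A 200 with an HTML error page lands here instead of becoming {}.
    return { ok: false, error: "body is not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, error: "body is not a JSON object" };
  }
  return { ok: true, state: parsed as Record<string, unknown> };
}
```

A caller would stop before the PATCH whenever `ok` is false, which is why Edge Config stays untouched after either failure.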

Canary stuck: detailed checklist

If trafficProdCanaryPercent has not changed in over 15 minutes:

  1. Check GitHub Actions > Canary ramp — is the workflow running? Look for a failed or skipped run.
  2. In the failing run, check the “Skip if no canary” step. If it reports paused=true, resume the canary from the admin UI.
  3. Check the “Run 2 SLO checks” step — what HTTP code is /api/slo returning?
  4. Verify VERCEL_TOKEN has not expired and VERCEL_EDGE_CONFIG_ID is correct.
  5. Manually trigger the workflow (Actions > Canary ramp > Run workflow) to test.
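
The checklist is ordered so that the first failing check is usually the cause. As a rough illustration, the same order can be written as a small triage function. The `CanarySnapshot` shape and its field names are hypothetical; you would fill them in by hand from the Actions UI, Edge Config, and a request to /api/slo.

```typescript
// Hypothetical snapshot of the facts the checklist asks you to gather.
interface CanarySnapshot {
  lastRunConclusion: "success" | "failure" | "skipped" | "none"; // step 1
  canaryPaused: boolean;    // step 2: paused flag reported by the run
  sloStatus: number | null; // step 3: HTTP status of /api/slo, null on timeout
  tokenExpired: boolean;    // step 4: VERCEL_TOKEN validity
}

// Return the first likely cause, walking the checklist top to bottom.
function diagnoseStuckCanary(s: CanarySnapshot): string {
  if (s.lastRunConclusion === "none") {
    return "Canary ramp has not run; trigger it manually";
  }
  if (s.canaryPaused) {
    return "Canary is paused; resume from the admin UI";
  }
  if (s.sloStatus === null) {
    return "/api/slo timed out";
  }
  if (s.sloStatus !== 200) {
    return `/api/slo returned HTTP ${s.sloStatus}`;
  }
  if (s.tokenExpired) {
    return "VERCEL_TOKEN expired; regenerate it";
  }
  return "No obvious cause; re-run the workflow and inspect its logs";
}
```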

Shadow not routing

If you visit /debug from multiple incognito windows and never get the shadow deploy:

  • Verify deploymentDomainShadow is set in Edge Config (not empty string)
  • Verify trafficShadowPercent is greater than 0
  • Check that the middleware is running on the production deploy (not preview) — VERCEL_ENV must be production
  • Bot detection may be filtering your client — check the user-agent

At 1%, a shadow assignment happens about once per 100 requests on average, so a handful of incognito windows will rarely hit it. Use shadowForceIPs or the /debug Force Shadow button for testing.
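
To put numbers on that, the sketch below computes the chance of seeing at least one shadow assignment in n tries, and how many tries are needed for a given confidence. It assumes each fresh, cookie-less visit is an independent draw with probability trafficShadowPercent / 100; that is a modelling assumption, not a statement about how the middleware samples. The function names are illustrative.

```typescript
// Probability of at least one shadow assignment in n independent tries,
// where p is the per-try probability (0.01 for trafficShadowPercent: 1).
function probAtLeastOne(p: number, n: number): number {
  if (p < 0 || p > 1 || n < 0) throw new RangeError("need 0 <= p <= 1 and n >= 0");
  return 1 - Math.pow(1 - p, n);
}

// Smallest number of tries so that probAtLeastOne(p, n) >= confidence.
function triesForConfidence(p: number, confidence: number): number {
  if (p <= 0 || p > 1) throw new RangeError("need 0 < p <= 1");
  if (confidence <= 0 || confidence >= 1) throw new RangeError("need 0 < confidence < 1");
  if (p === 1) return 1; // every try is routed to shadow
  return Math.ceil(Math.log(1 - confidence) / Math.log(1 - p));
}

const after100 = probAtLeastOne(0.01, 100);     // about 0.634
const for95 = triesForConfidence(0.01, 0.95);   // 299
```

Under this model, 100 tries at 1% give only about a 63% chance of landing on shadow at least once, which is why forcing the assignment is the practical way to test.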

SLO false positives

If the canary rolls back but the deploy looks healthy:

  • The SLO endpoint may be timing out. The cron calls it with --max-time 10, so a check that makes slow external API calls can be cut off at 10 seconds.
  • The SLO endpoint may be returning 500 due to an unrelated dependency issue. Make the endpoint fail open if monitoring is unavailable.
  • Two checks 30 seconds apart is a small sample. If your error rate is naturally spiky (e.g. a scheduled job that briefly spikes errors), the check timing may coincide with a spike. Consider adding a moving average or increasing the check interval.
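
One way to apply the moving-average suggestion is to decide on a window of recent checks instead of a single pair. The sketch below is an assumption-laden illustration, not the cron's logic: the names, the window size, and the choice to hold off while the window is not yet full are all hypothetical.

```typescript
// Fraction of failed checks among the most recent `window` samples.
// Each sample is true when the SLO check passed.
function recentFailureRate(samples: boolean[], window: number): number {
  const recent = samples.slice(-window);
  if (recent.length === 0) return 0;
  const failures = recent.filter((passed) => !passed).length;
  return failures / recent.length;
}

// Roll back only when the window is full and the failure rate exceeds the
// threshold, so one brief spike does not trigger a rollback on its own.
function shouldRollback(
  samples: boolean[],
  window: number,
  maxFailureRate: number,
): boolean {
  if (window <= 0) throw new RangeError("window must be positive");
  if (samples.length < window) return false; // assumption: wait for enough data
  return recentFailureRate(samples, window) > maxFailureRate;
}
```

With a window of 5 and a threshold of 0.5, a single failed check among four passes gives a failure rate of 0.2 and does not roll back. The trade-off is slower detection of real regressions, so keep the window short.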

Related:

  • Workflows — workflow internals and customization
  • Edge Config — field reference for manual edits
  • Dashboard — Pause, Resume, Cancel controls