Scheduled jobs
This page explains how to wire up the Better Comply cron endpoints and what each one does.
It is written for operators setting up or maintaining a production deployment.
How secret gating works
These endpoints are not protected by a user JWT. They use a shared secret passed as the x-internal-secret request header. In production, every endpoint refuses to run if its secret is not set. Store secrets in Secret Manager and pass them as environment variables; never bake them into the image.
All scan/cron routes use the same pattern:
- The caller sends POST /v1/<endpoint> with the header x-internal-secret: <secret>.
- The backend compares the header to the configured secret using a constant-time equality check.
- If the environment is production and the secret is not configured, the endpoint refuses with 503 even before comparing.
- A mismatched secret returns 401.
Secrets are logged as [REDACTED] in all server logs.
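The gating pattern above can be sketched in a few lines. This is a minimal illustration, not the backend's actual code: the function name check_internal_secret and its parameters are hypothetical, and the behaviour when the secret is unset outside production is an assumption, since this page only specifies the production case.

```python
import hmac
from typing import Optional


def check_internal_secret(
    header_value: Optional[str],
    configured_secret: Optional[str],
    is_production: bool,
) -> int:
    """Return the HTTP status the gate would produce (200 means "proceed")."""
    if not configured_secret:
        # In production, a missing secret refuses before any comparison.
        if is_production:
            return 503
        # ASSUMPTION: outside production an unset secret lets the call through.
        return 200
    supplied = (header_value or "").encode("utf-8")
    expected = configured_secret.encode("utf-8")
    # hmac.compare_digest is the standard-library constant-time comparison.
    if not hmac.compare_digest(supplied, expected):
        return 401
    return 200
```

Checking for the missing secret first matters: comparing against an empty configured value would otherwise let an empty header match.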
Endpoints and cadences
Recertification scan
Endpoint: POST /v1/recertification-scan
Secret variable: RECERTIFICATION_SCAN_SECRET
Recommended cadence: Daily at 06:00 UTC (0 6 * * *)
Scans evidence.recertification_due_at for each user:
- Due within the next 30 days: creates a reminder notification.
- Past due: creates a new assignment in an auto-generated Recertification: <training name> campaign and sends a notification.
The operation is idempotent - the UNIQUE(user_id, campaign_id) constraint and the recertification_status state machine make double-fires safe.
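As a rough sketch of the scan's decision logic, the two cases can be modelled as below. The names classify_evidence and assign_once are hypothetical, the 30-day boundary is assumed to be inclusive, and the set stands in for the UNIQUE(user_id, campaign_id) constraint. The recertification_status state machine is not modelled.

```python
from datetime import datetime, timedelta
from typing import Optional, Set, Tuple

REMINDER_WINDOW = timedelta(days=30)


def classify_evidence(due_at: Optional[datetime], now: datetime) -> str:
    """Return "reassign", "remind", or "none" for one evidence row."""
    if due_at is None:
        return "none"
    if due_at < now:
        return "reassign"  # past due: new assignment plus notification
    if due_at <= now + REMINDER_WINDOW:
        return "remind"  # due within the next 30 days (inclusive, by assumption)
    return "none"


def assign_once(assignments: Set[Tuple[str, str]], user_id: str, campaign_id: str) -> bool:
    """Insert (user_id, campaign_id) unless it already exists; True if inserted."""
    key = (user_id, campaign_id)
    if key in assignments:
        return False  # a double-fire is a no-op, like a unique-constraint conflict
    assignments.add(key)
    return True
```

Calling assign_once twice with the same pair leaves a single assignment, which is the property that makes a repeated scan safe.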
Cloud Scheduler setup:
gcloud scheduler jobs create http better-comply-recert \
--schedule="0 6 * * *" \
--uri="https://<cloud-run-url>/v1/recertification-scan" \
--http-method=POST \
--headers="x-internal-secret=$(gcloud secrets versions access latest --secret=better-comply-recert-secret),content-type=application/json" \
--time-zone=UTC \
--message-body='{}'
GitHub Actions alternative: .github/workflows/recertification-scan.yml runs the same 0 6 * * * schedule using the BACKEND_URL and RECERTIFICATION_SCAN_SECRET repository secrets. Do not enable both Cloud Scheduler and GitHub Actions against the same backend.
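Whichever scheduler is used, the call itself is a plain POST with the secret header. The sketch below shows roughly what a scheduled caller needs to do, using only the Python standard library. It is not the contents of the workflow file, and the helper names build_scan_request and send_scan are hypothetical. The environment variable names are the ones mentioned above.

```python
import os
import urllib.request


def build_scan_request(backend_url: str, secret: str) -> urllib.request.Request:
    """Build the POST request for the recertification scan without sending it."""
    url = backend_url.rstrip("/") + "/v1/recertification-scan"
    return urllib.request.Request(
        url,
        data=b"{}",  # same empty JSON body as the Cloud Scheduler job
        method="POST",
        headers={
            "x-internal-secret": secret,
            "content-type": "application/json",
        },
    )


def send_scan() -> int:
    """Send the request using BACKEND_URL and RECERTIFICATION_SCAN_SECRET."""
    req = build_scan_request(
        os.environ["BACKEND_URL"],
        os.environ["RECERTIFICATION_SCAN_SECRET"],
    )
    # urlopen raises urllib.error.HTTPError on 401 or 503, so a bad or
    # unconfigured secret surfaces as a failed run rather than a silent pass.
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.status
```

Keeping request construction separate from sending makes it easy to check the URL and headers without touching a live backend.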
Supervisor report scan
Endpoint: POST /v1/supervisor-report-scan
Secret variable: SUPERVISOR_REPORT_SCAN_SECRET
Recommended cadence: Mondays at 07:00 UTC (0 7 * * 1)
Generates and emails the weekly training status digest to each team lead and line lead. Each supervisor receives only their own team's data - the SQL views enforce this scoping. The operation:
- Reads report_assignment_status fail-loud (a read failure aborts the run).
- Sends email to each supervisor via the configured email provider.
- Records a row in report_deliveries for each recipient (outcome: sent, skipped_no_email, skipped_no_data, or failed).
- Emits a best-effort send_supervisor_report audit event (one failed recipient does not abort the rest of the batch).
Supervisors without an email address, including pseudonymous accounts, are skipped gracefully. If EMAIL_API_KEY is missing, delivery falls back to console logging rather than failing hard.
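The per-recipient outcomes can be illustrated with a small sketch. The function deliver_reports, its arguments, and the order of the two skip checks are assumptions made for illustration. Only the four outcome values come from this page.

```python
from typing import Callable, Dict, List, Optional


def deliver_reports(
    supervisors: List[Dict[str, Optional[str]]],
    rows_by_supervisor: Dict[str, list],
    send_email: Callable[[str, list], None],
) -> List[Dict[str, Optional[str]]]:
    """Return one delivery record per supervisor, mirroring report_deliveries."""
    deliveries = []
    for sup in supervisors:
        email = sup.get("email")
        rows = rows_by_supervisor.get(sup["id"], [])
        if not email:
            outcome = "skipped_no_email"
        elif not rows:
            outcome = "skipped_no_data"
        else:
            try:
                send_email(email, rows)
                outcome = "sent"
            except Exception:
                # One failed recipient does not abort the rest of the batch.
                outcome = "failed"
        deliveries.append({"supervisor_id": sup["id"], "outcome": outcome})
    return deliveries
```

Catching the send error inside the loop is what keeps the batch going. A failure while reading the source data would happen before this loop and, as described above, abort the run.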
Cloud Scheduler setup:
gcloud scheduler jobs create http better-comply-supervisor-report \
--schedule="0 7 * * 1" \
--uri="https://<cloud-run-url>/v1/supervisor-report-scan" \
--http-method=POST \
--headers="x-internal-secret=$(gcloud secrets versions access latest --secret=better-comply-report-secret),content-type=application/json" \
--time-zone=UTC \
--message-body='{}'
GitHub Actions alternative: .github/workflows/supervisor-report-scan.yml runs the same 0 7 * * 1 schedule.
Assignment recompute drain
Endpoint: POST /v1/drain-assignment-recompute
Secret variable: RECERTIFICATION_SCAN_SECRET (reuses the same secret)
Recommended cadence: Every 5 minutes (*/5 * * * *)
Drains the assignment_recompute_queue table, which is populated by a database trigger whenever a user's profile attributes (activity, plant, line, workstation) change. The drain calls recomputeAssignmentsForUser for each queued entry and assigns any matching onboarding campaigns.
Recompute is additive and idempotent - it never deletes or supersedes existing assignments.
You can also trigger recompute synchronously on a per-user basis through the POST /v1/apply-onboarding-rules route (user-authenticated, admin-initiated).
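The additive, idempotent property can be made concrete with a short sketch. The Python functions below are hypothetical stand-ins, not the real recomputeAssignmentsForUser. Assignments are modelled as a set of (user_id, campaign_id) pairs and the queue as a simple list.

```python
from typing import Callable, Iterable, List, Set, Tuple

Assignment = Tuple[str, str]  # (user_id, campaign_id)


def recompute_for_user(
    existing: Set[Assignment], user_id: str, matching_campaigns: Iterable[str]
) -> Set[Assignment]:
    """Return existing assignments plus any newly matching campaigns."""
    # Set union only adds: nothing already assigned is deleted or replaced.
    return existing | {(user_id, campaign) for campaign in matching_campaigns}


def drain_queue(
    queue: List[str],
    existing: Set[Assignment],
    match: Callable[[str], Iterable[str]],
) -> Set[Assignment]:
    """Process every queued user id in order, emptying the queue."""
    while queue:
        user_id = queue.pop(0)
        existing = recompute_for_user(existing, user_id, match(user_id))
    return existing
```

Because the update is a union, a user queued twice, or a drain that runs again with nothing changed, produces the same result.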
Document-processing queue drain
Endpoint: POST /v1/process-document-queue
Secret variable: RECERTIFICATION_SCAN_SECRET (reuses the same secret)
Recommended cadence: Every 1 minute (*/1 * * * *) or via Cloud Tasks (see below)
Drains the RAG indexing queue by processing documents in pending status. Only relevant when DOCUMENT_PROCESSING_MODE=worker. See Document processing for the full explanation of modes.
Cloud Scheduler setup (for worker mode):
gcloud scheduler jobs create http better-comply-doc-queue \
--schedule="*/1 * * * *" \
--uri="https://<cloud-run-url>/v1/process-document-queue" \
--http-method=POST \
--headers="x-internal-secret=<secret>,content-type=application/json" \
--time-zone=UTC \
--message-body='{}'
Document-conversion cleanup
Endpoint: POST /v1/cleanup-document-conversions
Secret variable: RECERTIFICATION_SCAN_SECRET (reuses the same secret)
Recommended cadence: Daily (0 2 * * *)
Removes orphaned staging blobs from the knowledge-documents bucket:
- conversions/<id>/... - extracted images from a conversion job that was never committed.
- imports/<uuid>.<ext> - source binaries staged by the import page but abandoned (tab-close, etc.).
The default retention is 24 hours (olderThanHours=24). A beforeunload browser cleanup is not reliable for async storage deletes, so this cron is the durable garbage collector.
gcloud scheduler jobs create http better-comply-doc-cleanup \
--schedule="0 2 * * *" \
--uri="https://<cloud-run-url>/v1/cleanup-document-conversions" \
--http-method=POST \
--headers="x-internal-secret=<secret>,content-type=application/json" \
--time-zone=UTC \
--message-body='{}'
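The retention rule for this cleanup can be sketched as a filter over blob names and timestamps. This is an illustration under assumptions: select_orphans is a hypothetical helper, age is measured from a creation timestamp, and the sketch does not model how the backend tells a committed conversion from an abandoned one.

```python
from datetime import datetime, timedelta
from typing import Iterable, List, Tuple

STAGING_PREFIXES = ("conversions/", "imports/")


def select_orphans(
    blobs: Iterable[Tuple[str, datetime]],
    now: datetime,
    older_than_hours: int = 24,
) -> List[str]:
    """Return names of staging blobs created before the retention cutoff."""
    cutoff = now - timedelta(hours=older_than_hours)
    return [
        name
        for name, created_at in blobs
        # str.startswith accepts a tuple, so both staging prefixes are covered.
        if name.startswith(STAGING_PREFIXES) and created_at < cutoff
    ]
```

The 24-hour default gives an in-progress import time to finish before its staged files become eligible for deletion.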
Summary table
| Endpoint | Secret variable | Recommended cadence | Purpose |
|---|---|---|---|
| /v1/recertification-scan | RECERTIFICATION_SCAN_SECRET | Daily (0 6 * * *) | Re-enrol overdue learners, send reminders |
| /v1/supervisor-report-scan | SUPERVISOR_REPORT_SCAN_SECRET | Mondays (0 7 * * 1) | Weekly team digest email to supervisors |
| /v1/drain-assignment-recompute | RECERTIFICATION_SCAN_SECRET | Every 5 min (*/5 * * * *) | Drain attribute-change assignment queue |
| /v1/process-document-queue | RECERTIFICATION_SCAN_SECRET | Every 1 min (*/1 * * * *) | RAG indexing drain (worker mode only) |
| /v1/cleanup-document-conversions | RECERTIFICATION_SCAN_SECRET | Daily (0 2 * * *) | Remove orphaned staging blobs |
Secret rotation
- Add a new secret version in Secret Manager.
- Redeploy the Cloud Run service.
- Update each Cloud Scheduler job to pass the new header value.
- Verify the next scheduled run succeeds before deleting the old secret version.
Related
- Environment variables - RECERTIFICATION_SCAN_SECRET and SUPERVISOR_REPORT_SCAN_SECRET
- Email delivery - configuring the email provider for supervisor reports
- Document processing - full explanation of DOCUMENT_PROCESSING_MODE
- Reports and notifications - the supervisor digest from the admin perspective