Scheduled jobs

This page explains how to wire up the Better Comply cron endpoints and what each one does.

Who this is for

Operators setting up or maintaining a production deployment.

How secret gating works

Not user-authenticated

These endpoints are not protected by a user JWT. They use a shared secret passed as the x-internal-secret request header. In production, every endpoint refuses to run if its secret is not set. Store secrets in Secret Manager and pass them as environment variables; never bake them into the image.

All scan/cron routes use the same pattern:

  1. The caller sends POST /v1/<endpoint> with the header x-internal-secret: <secret>.
  2. If the environment is production and the secret is not configured, the backend refuses with 503 before any comparison takes place.
  3. Otherwise, the backend compares the header to the configured secret using a constant-time equality check.
  4. A mismatched secret returns 401.

Secrets are logged as [REDACTED] in all server logs.
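
To check the gate by hand, the sketch below maps the HTTP status of a manual call to the outcomes listed above. The helper name explain_cron_status is hypothetical, and so are the BACKEND_URL and RECERTIFICATION_SCAN_SECRET shell variables in the usage comment. Treating any 2xx status as success is also an assumption, since this page does not state the success status.

```shell
# Hypothetical helper: translate the HTTP status of a cron call into the
# gate outcome described above. Treating any 2xx as success is an assumption.
explain_cron_status() {
  case "$1" in
    2??) echo "accepted: secret matched" ;;
    401) echo "rejected: x-internal-secret did not match" ;;
    503) echo "refused: secret not configured on the backend" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Usage sketch (assumes BACKEND_URL and RECERTIFICATION_SCAN_SECRET are
# exported in your shell; both variable names are placeholders):
#
#   status=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
#     -H "x-internal-secret: ${RECERTIFICATION_SCAN_SECRET}" \
#     -H "content-type: application/json" \
#     -d '{}' "${BACKEND_URL}/v1/recertification-scan")
#   explain_cron_status "$status"
```

A manual call with a valid secret runs the real job, so prefer an endpoint that is documented as idempotent when testing this way.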

Endpoints and cadences

Recertification scan

Endpoint: POST /v1/recertification-scan
Secret variable: RECERTIFICATION_SCAN_SECRET
Recommended cadence: Daily at 06:00 UTC (0 6 * * *)

Scans evidence.recertification_due_at for each user:

  • Due within the next 30 days: creates a reminder notification.
  • Past due: creates a new assignment in an auto-generated Recertification: <training name> campaign and sends a notification.

The operation is idempotent - the UNIQUE(user_id, campaign_id) constraint and the recertification_status state machine make double-fires safe.

Cloud Scheduler setup:

# Create the daily recertification job. The $(...) substitution runs once, when
# this command is executed, so the job stores the secret value resolved at that time.
gcloud scheduler jobs create http better-comply-recert \
  --schedule="0 6 * * *" \
  --uri="https://<cloud-run-url>/v1/recertification-scan" \
  --http-method=POST \
  --headers="x-internal-secret=$(gcloud secrets versions access latest --secret=better-comply-recert-secret),content-type=application/json" \
  --time-zone=UTC \
  --message-body='{}'

GitHub Actions alternative: .github/workflows/recertification-scan.yml runs the same 0 6 * * * schedule using the BACKEND_URL and RECERTIFICATION_SCAN_SECRET repository secrets. Do not enable both Cloud Scheduler and GitHub Actions against the same backend.


Supervisor report scan

Endpoint: POST /v1/supervisor-report-scan
Secret variable: SUPERVISOR_REPORT_SCAN_SECRET
Recommended cadence: Mondays at 07:00 UTC (0 7 * * 1)

Generates and emails the weekly training status digest to each team lead and line lead. Each supervisor receives only their own team's data - the SQL views enforce this scoping. The operation:

  1. Reads report_assignment_status and fails loudly: a read failure aborts the run.
  2. Sends email to each supervisor via the configured email provider.
  3. Records a row in report_deliveries for each recipient (outcome: sent, skipped_no_email, skipped_no_data, or failed).
  4. Emits a best-effort send_supervisor_report audit event (one failed recipient does not abort the rest of the batch).

Supervisors with no email address, including pseudonymous accounts, are skipped rather than failing the run. If EMAIL_API_KEY is missing, the backend falls back to console logging instead of raising a hard failure.

Cloud Scheduler setup:

# Create the weekly supervisor report job. The $(...) substitution runs once, when
# this command is executed, so the job stores the secret value resolved at that time.
gcloud scheduler jobs create http better-comply-supervisor-report \
  --schedule="0 7 * * 1" \
  --uri="https://<cloud-run-url>/v1/supervisor-report-scan" \
  --http-method=POST \
  --headers="x-internal-secret=$(gcloud secrets versions access latest --secret=better-comply-report-secret),content-type=application/json" \
  --time-zone=UTC \
  --message-body='{}'

GitHub Actions alternative: .github/workflows/supervisor-report-scan.yml runs the same 0 7 * * 1 schedule.


Assignment recompute drain

Endpoint: POST /v1/drain-assignment-recompute
Secret variable: RECERTIFICATION_SCAN_SECRET (reuses the same secret)
Recommended cadence: Every 5 minutes (*/5 * * * *)

Drains the assignment_recompute_queue table, which is populated by a database trigger whenever a user's profile attributes (activity, plant, line, workstation) change. The drain calls recomputeAssignmentsForUser for each queued entry and assigns any matching onboarding campaigns.

Recompute is additive and idempotent - it never deletes or supersedes existing assignments.

You can also trigger recompute synchronously on a per-user basis through the POST /v1/apply-onboarding-rules route (user-authenticated, admin-initiated).
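
This section does not include a Cloud Scheduler command. Following the pattern used for the other endpoints, a job for the drain might be created as sketched below. The function name and the job name better-comply-assignment-drain are placeholders. Reading the value from better-comply-recert-secret assumes that this is the Secret Manager entry behind RECERTIFICATION_SCAN_SECRET.

```shell
# Sketch: create the every-5-minutes drain job.
# Assumptions: the job name is a placeholder, and better-comply-recert-secret
# is the Secret Manager entry that backs RECERTIFICATION_SCAN_SECRET.
create_assignment_drain_job() {
  backend_url=$1   # for example https://<cloud-run-url>

  # Resolve the secret once; the job stores the resolved value.
  secret=$(gcloud secrets versions access latest \
    --secret=better-comply-recert-secret) || return 1

  gcloud scheduler jobs create http better-comply-assignment-drain \
    --schedule="*/5 * * * *" \
    --uri="${backend_url}/v1/drain-assignment-recompute" \
    --http-method=POST \
    --headers="x-internal-secret=${secret},content-type=application/json" \
    --time-zone=UTC \
    --message-body='{}'
}

# Usage: create_assignment_drain_job "https://<cloud-run-url>"
```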


Document-processing queue drain

Endpoint: POST /v1/process-document-queue
Secret variable: RECERTIFICATION_SCAN_SECRET (reuses the same secret)
Recommended cadence: Every 1 minute (*/1 * * * *) or via Cloud Tasks (see below)

Drains the RAG indexing queue by processing documents in pending status. Only relevant when DOCUMENT_PROCESSING_MODE=worker. See Document processing for the full explanation of modes.

Cloud Scheduler setup (for worker mode):

# Create the every-minute queue drain job.
# Replace <secret> with the value of RECERTIFICATION_SCAN_SECRET.
gcloud scheduler jobs create http better-comply-doc-queue \
  --schedule="*/1 * * * *" \
  --uri="https://<cloud-run-url>/v1/process-document-queue" \
  --http-method=POST \
  --headers="x-internal-secret=<secret>,content-type=application/json" \
  --time-zone=UTC \
  --message-body='{}'

Document-conversion cleanup

Endpoint: POST /v1/cleanup-document-conversions
Secret variable: RECERTIFICATION_SCAN_SECRET (reuses the same secret)
Recommended cadence: Daily at 02:00 UTC (0 2 * * *)

Removes orphaned staging blobs from the knowledge-documents bucket:

  • conversions/<id>/... - extracted images from a conversion job that was never committed.
  • imports/<uuid>.<ext> - source binaries staged by the import page but abandoned (tab-close, etc.).

The default retention is 24 hours (olderThanHours=24). A beforeunload browser cleanup is not reliable for async storage deletes, so this cron is the durable garbage collector.
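
The selection rule above can be written as a small predicate. This is only a sketch: is_cleanup_candidate is a hypothetical name and the caller supplies the object age in seconds. The real job must also establish that the conversion or import was never committed, which this sketch does not model.

```shell
# Hypothetical predicate: succeeds (exit 0) when an object path sits under one
# of the two staging prefixes and is older than the retention window.
# $1 = object path inside the bucket
# $2 = object age in seconds
# $3 = retention in hours (defaults to 24, mirroring olderThanHours=24)
is_cleanup_candidate() {
  case "$1" in
    conversions/*|imports/*) ;;   # staging prefixes named above
    *) return 1 ;;                # anything else is never a candidate
  esac
  [ "$2" -gt $(( ${3:-24} * 3600 )) ]
}

# Usage: is_cleanup_candidate "imports/<uuid>.<ext>" 90000 && echo "candidate"
```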

# Create the daily staging cleanup job.
# Replace <secret> with the value of RECERTIFICATION_SCAN_SECRET.
gcloud scheduler jobs create http better-comply-doc-cleanup \
  --schedule="0 2 * * *" \
  --uri="https://<cloud-run-url>/v1/cleanup-document-conversions" \
  --http-method=POST \
  --headers="x-internal-secret=<secret>,content-type=application/json" \
  --time-zone=UTC \
  --message-body='{}'

Summary table

Endpoint                         | Secret variable               | Recommended cadence       | Purpose
---------------------------------|-------------------------------|---------------------------|------------------------------------------
/v1/recertification-scan         | RECERTIFICATION_SCAN_SECRET   | Daily (0 6 * * *)         | Re-enrol overdue learners, send reminders
/v1/supervisor-report-scan       | SUPERVISOR_REPORT_SCAN_SECRET | Mondays (0 7 * * 1)       | Weekly team digest email to supervisors
/v1/drain-assignment-recompute   | RECERTIFICATION_SCAN_SECRET   | Every 5 min (*/5 * * * *) | Drain attribute-change assignment queue
/v1/process-document-queue       | RECERTIFICATION_SCAN_SECRET   | Every 1 min (*/1 * * * *) | RAG indexing drain (worker mode only)
/v1/cleanup-document-conversions | RECERTIFICATION_SCAN_SECRET   | Daily (0 2 * * *)         | Remove orphaned staging blobs

Secret rotation

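  1. Add a new secret version in Secret Manager.
  2. Redeploy the Cloud Run service so that it picks up the new version.
  3. Update each Cloud Scheduler job to pass the new header value. If you use the GitHub Actions workflows instead, update the corresponding repository secrets.
  4. Verify the next scheduled run succeeds before deleting the old secret version.

Between steps 2 and 3, callers still send the old value and are rejected with 401, so perform both steps back to back.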
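
As a rough illustration, the first three steps could be scripted with gcloud as follows. The function name rotate_cron_secret is hypothetical and every resource name is passed in by the caller. The sketch assumes the Cloud Run service receives the secret through a Secret Manager reference, that a default region and scheduler location are configured for gcloud, and that openssl is available to generate the new value.

```shell
# Sketch of rotation steps 1 to 3. Nothing here is specific to a real deployment.
# $1 = Secret Manager secret name
# $2 = Cloud Run service name
# $3 = environment variable the backend reads
# $4... = Cloud Scheduler job names that send this secret
rotate_cron_secret() {
  secret_name=$1; service=$2; env_var=$3
  shift 3

  # 1. Generate a random value and store it as a new secret version.
  new_value=$(openssl rand -hex 32) || return 1
  printf '%s' "$new_value" |
    gcloud secrets versions add "$secret_name" --data-file=- || return 1

  # 2. Roll out a new revision that reads the latest secret version.
  gcloud run services update "$service" \
    --update-secrets="${env_var}=${secret_name}:latest" || return 1

  # 3. Point each scheduler job at the new header value.
  for job in "$@"; do
    gcloud scheduler jobs update http "$job" \
      --update-headers="x-internal-secret=${new_value}" || return 1
  done
}

# Usage:
#   rotate_cron_secret better-comply-recert-secret <service> \
#     RECERTIFICATION_SCAN_SECRET better-comply-recert
#   gcloud scheduler jobs run better-comply-recert   # step 4: force a run to verify
```

The new value appears in the gcloud command line in step 3, so run this from a trusted workstation rather than a shared host.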