Neuron — Workers

The Neuron package (chaoscypher-neuron) provides background job processing through a unified worker that manages two independent queues.

Architecture

Dual Queue System

The worker runs a single process with two independent asyncio pollers:

Queue	Concurrency	Max Retries	Purpose
LLM (`llm`)	1 (dynamic with multi-instance Ollama)	5	Chat, extraction, tool LLM calls
Operations (`operations`)	8	5	Source processing, exports, batch ops

Why Separate Queues?

The LLM queue runs with a concurrency of 1 because LLM calls are GPU-bound: running several LLM requests at once doesn't improve throughput and can cause out-of-memory (OOM) errors with local models. Within this queue, interactive chat gets priority over background extraction.

The operations queue runs with a concurrency of 8 because its work is I/O-bound (file processing, database writes, network requests), so higher concurrency improves throughput.
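The two-poller layout can be sketched roughly as follows. This is a minimal illustration, not the Neuron implementation: the `Poller` class, its in-memory `pending` deque (standing in for the Valkey-backed queue), and the `peak` counter are all hypothetical names introduced here to show how each queue can enforce its own concurrency limit inside one process.

```python
import asyncio
from collections import deque


class Poller:
    """One queue poller with its own concurrency limit (illustrative only)."""

    def __init__(self, name, max_concurrent, poll_interval=0.5):
        self.name = name
        self.max_concurrent = max_concurrent
        self.poll_interval = poll_interval
        self.pending = deque()   # stand-in for the Valkey-backed queue
        self.in_flight = set()   # handler tasks currently running
        self.peak = 0            # highest concurrency observed

    async def poll(self, handler, stop):
        while not stop.is_set():
            # Claim work only while a concurrency slot is free.
            while self.pending and len(self.in_flight) < self.max_concurrent:
                task = asyncio.create_task(handler(self.pending.popleft()))
                self.in_flight.add(task)
                task.add_done_callback(self.in_flight.discard)
                self.peak = max(self.peak, len(self.in_flight))
            await asyncio.sleep(self.poll_interval)


async def demo():
    llm = Poller("llm", max_concurrent=1, poll_interval=0.01)
    ops = Poller("operations", max_concurrent=8, poll_interval=0.01)
    llm.pending.extend(f"chat-{i}" for i in range(3))
    ops.pending.extend(f"export-{i}" for i in range(20))
    finished = []

    async def handler(payload):
        await asyncio.sleep(0.01)  # pretend to do some work
        finished.append(payload)

    stop = asyncio.Event()
    pollers = [asyncio.create_task(p.poll(handler, stop)) for p in (llm, ops)]
    while len(finished) < 23:
        await asyncio.sleep(0.01)
    stop.set()
    await asyncio.gather(*pollers)
    return llm.peak, ops.peak
```

Running `asyncio.run(demo())` returns the peak concurrency each poller reached. Because the pollers share nothing but the event loop, a backlog on one queue does not consume the other queue's slots.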

Priority System

The LLM queue supports priority-based scheduling:

Priority	Value	Use Case
Interactive	100	User-initiated chat, interactive operations
Background	50	Background extraction, batch processing

Higher values mean higher priority: tasks are popped highest score first (ZPOPMAX), so interactive chat is picked up ahead of background extraction.
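To make the ordering concrete, here is a small in-memory model of a sorted set with ZPOPMAX semantics. The `PendingSet` class and the task names are hypothetical. Whether Neuron uses the bare priority value as the score, or folds in something like an enqueue timestamp, is not stated above, so the sketch assumes the bare value.

```python
INTERACTIVE = 100
BACKGROUND = 50


class PendingSet:
    """In-memory stand-in for a Valkey sorted set (member -> score)."""

    def __init__(self):
        self._scores = {}

    def zadd(self, member, score):
        self._scores[member] = score

    def zpopmax(self):
        """Remove and return the (member, score) pair with the highest score."""
        if not self._scores:
            return None
        # Equal scores fall back to the lexicographically greatest member,
        # which is how sorted sets order ties.
        member = max(self._scores, key=lambda m: (self._scores[m], m))
        return member, self._scores.pop(member)


def drain_order():
    pending = PendingSet()
    pending.zadd("extract-1", BACKGROUND)
    pending.zadd("chat-1", INTERACTIVE)   # enqueued after extract-1
    pending.zadd("extract-2", BACKGROUND)
    order = []
    while (item := pending.zpopmax()) is not None:
        order.append(item[0])
    return order
```

Under these assumptions the chat task is popped first even though it was enqueued second. Note that with bare priority scores, tasks of equal priority come out in member order rather than arrival order.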

Worker Lifecycle

Startup

Load worker configuration for both queues
Initialize shared resources: database, settings, LLM provider, Valkey. Database initialization runs the tier-aware migration runner, which auto-applies any pending Alembic migrations (see ADR-0006)
Upgrade gate: if the database is blocked on a migration that requires operator confirmation, the worker waits, polling the upgrade state, and registers no handlers until the block clears, so no work is claimed mid-upgrade
Register handlers for LLM and operations queues
Start background settings listener (watches for config changes)
Run extraction recovery (find orphaned tasks, unstick stuck sources)
Begin polling both queues
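The startup order above, and the upgrade gate in particular, can be sketched as a polling loop that must finish before handlers are registered. All names here (`wait_for_upgrade_gate`, `is_blocked`, the step labels) are hypothetical, and the poll interval is an assumed parameter, not a documented setting.

```python
import asyncio


async def wait_for_upgrade_gate(is_blocked, poll_interval=1.0):
    """Poll until is_blocked() reports False; return how many times we waited."""
    waits = 0
    while await is_blocked():
        waits += 1
        await asyncio.sleep(poll_interval)
    return waits


async def startup(is_blocked, poll_interval=1.0):
    """Return the startup steps in the order they ran."""
    steps = ["load_config", "init_shared_resources"]
    # Nothing below this line runs while the database is blocked on an upgrade.
    waits = await wait_for_upgrade_gate(is_blocked, poll_interval)
    steps.append(f"upgrade_gate_cleared_after_{waits}_waits")
    steps += [
        "register_handlers",
        "start_settings_listener",
        "run_extraction_recovery",
        "start_pollers",
    ]
    return steps


async def demo():
    # Simulate an operator confirming the migration after two polls.
    states = iter([True, True, False])

    async def is_blocked():
        return next(states)

    return await startup(is_blocked, poll_interval=0.01)
```

Placing the gate before handler registration means the worker cannot claim a task against a schema that is still mid-upgrade.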

Runtime

Independent pollers for each queue
Configurable poll interval (default: 0.5s)
Periodic reconcile loops run alongside the pollers:
- Source recovery (default: every 60s) — scans for non-terminal sources whose queued work was dropped (e.g. after a crash) and re-dispatches the missing queue tasks
- Orphan file cleanup (default: every 24h) — removes staging directories left behind when a hard kill lands between the file write and the source-record commit
- Orphan task cleanup (default: every 24h) — deletes stale orphaned chunk-extraction tasks past the retention window
Health reporting at configurable intervals
Settings listener monitors for configuration changes
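A periodic reconcile loop of the kind listed above might look like the sketch below. `run_periodically` and the job names are hypothetical, and the sketch assumes each job runs once immediately and then once per interval; the actual worker may wait a full interval before the first run.

```python
import asyncio


async def run_periodically(interval, job, stop):
    """Call job() every `interval` seconds until `stop` is set."""
    while not stop.is_set():
        await job()
        try:
            # Sleep for the interval, but wake early if shutdown is requested.
            await asyncio.wait_for(stop.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass


async def demo():
    counts = {"source_recovery": 0, "orphan_file_cleanup": 0}

    async def recover_sources():
        counts["source_recovery"] += 1

    async def clean_orphan_files():
        counts["orphan_file_cleanup"] += 1

    stop = asyncio.Event()
    loops = [
        # Interval shortened from the 60s default so the demo finishes quickly.
        asyncio.create_task(run_periodically(0.01, recover_sources, stop)),
        asyncio.create_task(run_periodically(24 * 3600, clean_orphan_files, stop)),
    ]
    await asyncio.sleep(0.1)
    stop.set()
    await asyncio.gather(*loops)
    return counts
```

Waiting on the stop event instead of a plain sleep lets a loop with a 24-hour interval exit promptly at shutdown.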

Shutdown

Cancel settings listener
Drain in-flight tasks (configurable timeout)
Close worker session
Disconnect storage adapter
Disconnect Valkey
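The drain step can be sketched with `asyncio.wait` and a timeout. The `drain` helper is hypothetical, and the text above does not say what happens to tasks still running when the timeout expires; this sketch assumes they are cancelled before resources are closed.

```python
import asyncio


async def drain(in_flight, timeout):
    """Wait up to `timeout` seconds for in-flight tasks, then cancel the rest.

    Returns (finished, cancelled) counts.
    """
    if not in_flight:
        return 0, 0
    done, pending = await asyncio.wait(in_flight, timeout=timeout)
    for task in pending:
        task.cancel()
    # Let cancelled tasks unwind before sessions and connections are closed.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(done), len(pending)


async def demo():
    quick = asyncio.create_task(asyncio.sleep(0.01))
    stuck = asyncio.create_task(asyncio.sleep(10))
    return await drain({quick, stuck}, timeout=0.1)
```

Draining before closing the worker session and disconnecting storage and Valkey keeps handlers from failing on resources that have already gone away.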

Worker Context

All handlers share a WorkerContext dictionary containing:

Application settings, engine settings, and database name
Storage adapter (database)
Graph repository
Search repository
LLM provider and LLM service
Config manager
Worker session (database session)

This context is initialized once at startup and shared across all job handlers.
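As an illustration, the shared context could be typed along these lines. The key names below are guesses that mirror the list above, not the real WorkerContext keys, and the values are typed as `Any` because the concrete classes are not described here.

```python
from typing import Any, TypedDict


class WorkerContext(TypedDict):
    """Hypothetical shape of the shared handler context."""

    settings: Any            # application settings
    engine_settings: Any
    database_name: str
    storage: Any             # storage adapter (database)
    graph_repository: Any
    search_repository: Any
    llm_provider: Any
    llm_service: Any
    config_manager: Any
    session: Any             # worker (database) session


def describe_task(ctx: WorkerContext, task_id: str) -> str:
    """Example handler helper that reads from the shared context."""
    return f"{ctx['database_name']}:{task_id}"
```

Because one context object is shared by every handler, handlers should treat it as read-mostly and avoid storing per-task state in it.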

Task Cancellation

Both queued and running tasks can be cancelled.

Queued tasks are removed from the pending sorted set immediately and marked as cancelled.

Running tasks use a cooperative cancellation mechanism:

A cancellation flag (queue:cancel:{task_id}) is set in Valkey with a TTL sized to outlive the worst-case handler lifetime: the LLM worker timeout plus 5 minutes, or about 65 minutes with the default 1-hour timeout
The task status is updated to cancelled immediately (so the UI reflects the change)
The worker handler checks for the flag between processing batches
When detected, the handler raises CancelledError and exits gracefully

This design means the UI updates instantly while the handler finishes its current batch and cleans up. Orphaned tasks (marked as running but no longer in the running set) are handled gracefully.
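The cooperative mechanism can be sketched as follows. `FakeStore` is an in-memory stand-in for Valkey, `run_batches` and `cancel_ttl` are hypothetical helpers, and the sketch assumes the handler raises `asyncio.CancelledError`; the real worker may use its own exception type.

```python
import asyncio

DEFAULT_LLM_TIMEOUT_S = 3600   # default 1-hour LLM worker timeout
CANCEL_GRACE_S = 300           # 5-minute margin on top of the timeout


def cancel_key(task_id):
    return f"queue:cancel:{task_id}"


def cancel_ttl(worker_timeout_s=DEFAULT_LLM_TIMEOUT_S):
    """TTL that outlives the worst-case handler lifetime."""
    return worker_timeout_s + CANCEL_GRACE_S


class FakeStore:
    """In-memory stand-in for Valkey; records TTLs but never expires keys."""

    def __init__(self):
        self.data = {}

    def set(self, key, value, ex):
        self.data[key] = (value, ex)

    def exists(self, key):
        return key in self.data


async def run_batches(task_id, batches, process, store):
    for batch in batches:
        # The flag is checked only between batches, never mid-batch.
        if store.exists(cancel_key(task_id)):
            raise asyncio.CancelledError(f"task {task_id} cancelled")
        await process(batch)


async def demo():
    store = FakeStore()
    processed = []

    async def process(batch):
        processed.append(batch)
        if batch == 2:
            # Simulate the API cancelling the task while batch 2 is running.
            store.set(cancel_key("t1"), "1", ex=cancel_ttl())

    try:
        await run_batches("t1", [1, 2, 3, 4], process, store)
    except asyncio.CancelledError:
        return processed, "cancelled"
    return processed, "completed"
```

In this run the flag is set during batch 2, so batch 2 still completes and the handler stops before batch 3. With the default timeout, `cancel_ttl()` is 3900 seconds (65 minutes).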

Batch cancellation is supported for cancelling multiple tasks in a single API call, using Valkey pipelines for efficiency.

Configuration

Worker behavior can be customized via workers.yaml:

llm_worker:
  max_concurrent: 1
  max_tries: 5

operations_worker:
  max_concurrent: 8
  max_tries: 5

Changes to workers.yaml require a worker restart. Settings changes via the API are picked up by the settings listener without restart.
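One way to validate the parsed file at startup is sketched below. It assumes workers.yaml has already been parsed into a plain dict by a YAML library. `QueueConfig` and `load_worker_config` are hypothetical names, and falling back to the documented defaults for missing keys is an assumption about the loader's behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueConfig:
    max_concurrent: int
    max_tries: int


# Defaults taken from the queue table and the example above.
DEFAULTS = {
    "llm_worker": QueueConfig(max_concurrent=1, max_tries=5),
    "operations_worker": QueueConfig(max_concurrent=8, max_tries=5),
}


def load_worker_config(raw):
    """Build per-queue config from a parsed workers.yaml dict."""
    config = {}
    for section, default in DEFAULTS.items():
        values = raw.get(section) or {}
        cfg = QueueConfig(
            max_concurrent=int(values.get("max_concurrent", default.max_concurrent)),
            max_tries=int(values.get("max_tries", default.max_tries)),
        )
        if cfg.max_concurrent < 1 or cfg.max_tries < 1:
            raise ValueError(f"{section}: values must be at least 1")
        config[section] = cfg
    return config
```

For example, `load_worker_config({"operations_worker": {"max_concurrent": 4}})` keeps the LLM defaults and lowers only the operations concurrency. Failing fast on invalid values is useful here because the file is read only at worker start.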

Running

# Via entry point
cc-neuron

# Via Docker
make docker-dev  # Starts worker as part of the stack

Monitoring

The queue monitor is available at http://localhost/queues. On the multi-container development stack, the UI is served by the Vite dev server at http://localhost:3000/queues. The monitor shows:

Active jobs per queue
Queue depth
Job status and progress
Error details for failed jobs
