Architecture
This document describes some high-level design concepts.
Workers are queue based
The following diagram shows (almost) all the processes involved in a Worker.
- The Producer fetches jobs from the Faktory server and enqueues them on the Job Queue.
- Multiple Consumers dequeue jobs from the Job Queue, process them, and enqueue the results onto the Report Queue.
- The Reporter dequeues results and reports the corresponding ack or fail messages to the Faktory server.
The number of jobs that can be processed concurrently is equal to the number of Consumers, which is set by the concurrency option.
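The flow above can be sketched with plain processes and two in-memory queues. This is only an illustration of the Producer, Consumer, and Reporter roles, not the library's actual code: the Sketch.BlockingQueue and Sketch.Pipeline modules are hypothetical, a list of terms stands in for jobs fetched from Faktory, and messages sent back to the caller stand in for ack and fail messages.

```elixir
defmodule Sketch.BlockingQueue do
  # Hypothetical helper: a FIFO queue whose pop/1 blocks until an item arrives.
  use GenServer

  def start_link, do: GenServer.start_link(__MODULE__, :ok)
  def push(queue, item), do: GenServer.cast(queue, {:push, item})
  def pop(queue), do: GenServer.call(queue, :pop, :infinity)

  @impl true
  def init(:ok), do: {:ok, {:queue.new(), :queue.new()}}

  @impl true
  def handle_cast({:push, item}, {items, waiters}) do
    case :queue.out(waiters) do
      # A consumer is already waiting: hand the item over directly.
      {{:value, from}, rest} ->
        GenServer.reply(from, item)
        {:noreply, {items, rest}}

      {:empty, _} ->
        {:noreply, {:queue.in(item, items), waiters}}
    end
  end

  @impl true
  def handle_call(:pop, from, {items, waiters}) do
    case :queue.out(items) do
      {{:value, item}, rest} -> {:reply, item, {rest, waiters}}
      # Nothing queued yet: park the caller until the next push.
      {:empty, _} -> {:noreply, {items, :queue.in(from, waiters)}}
    end
  end
end

defmodule Sketch.Pipeline do
  alias Sketch.BlockingQueue, as: Queue

  # Runs `jobs` through `concurrency` consumers (assumed >= 1) and returns
  # the reports the reporter saw, in the order it saw them.
  def run(jobs, concurrency, perform) do
    {:ok, job_queue} = Queue.start_link()
    {:ok, report_queue} = Queue.start_link()
    caller = self()

    # Producer: a plain list stands in for fetching from the Faktory server.
    spawn(fn -> Enum.each(jobs, &Queue.push(job_queue, &1)) end)

    # Consumers: one process per unit of concurrency.
    consumers =
      for _ <- 1..concurrency do
        spawn(fn -> consume(job_queue, report_queue, perform) end)
      end

    # Reporter: sending to the caller stands in for sending ack/fail to Faktory.
    reporter = spawn(fn -> report(report_queue, caller) end)

    reports =
      for _ <- jobs do
        receive do
          {:report, report} -> report
        end
      end

    # Tear the sketch down once every job has been reported.
    Enum.each([reporter | consumers], &Process.exit(&1, :kill))
    GenServer.stop(job_queue)
    GenServer.stop(report_queue)
    reports
  end

  defp consume(job_queue, report_queue, perform) do
    job = Queue.pop(job_queue)

    report =
      try do
        perform.(job)
        {:ack, job}
      rescue
        _error -> {:fail, job}
      end

    Queue.push(report_queue, report)
    consume(job_queue, report_queue, perform)
  end

  defp report(report_queue, caller) do
    send(caller, {:report, Queue.pop(report_queue)})
    report(report_queue, caller)
  end
end
```

Calling Sketch.Pipeline.run(jobs, 2, perform) processes at most two jobs at a time, because only two consumer processes ever pull from the job queue. Report order is not deterministic once concurrency is greater than one.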
Worker Connections
A worker makes only three connections to the Faktory server, no matter what the concurrency is set to:
- Producer (for fetching jobs)
- Reporter (for acking or failing jobs)
- Heartbeat (for sending a required keepalive message every 15 seconds)
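A periodic keepalive like the Heartbeat can be sketched as a small GenServer that reschedules itself with Process.send_after/3. The Sketch.Heartbeat module and its beat_fun argument are assumptions made for illustration. In a real worker, beat_fun would write the keepalive message to the heartbeat connection; that part is not shown here.

```elixir
defmodule Sketch.Heartbeat do
  # Hypothetical module: calls `beat_fun` every `interval_ms` milliseconds.
  use GenServer

  # The default interval mirrors the 15 seconds mentioned above.
  def start_link(beat_fun, interval_ms \\ 15_000) do
    GenServer.start_link(__MODULE__, {beat_fun, interval_ms})
  end

  @impl true
  def init({beat_fun, interval_ms}) do
    schedule(interval_ms)
    {:ok, {beat_fun, interval_ms}}
  end

  @impl true
  def handle_info(:beat, {beat_fun, interval_ms} = state) do
    # If beat_fun raises, this process crashes and a supervisor can restart it.
    beat_fun.()
    schedule(interval_ms)
    {:noreply, state}
  end

  defp schedule(interval_ms) do
    Process.send_after(self(), :beat, interval_ms)
  end
end
```

Because the heartbeat lives in its own process with its own connection, a slow job in a Consumer cannot delay the keepalive.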
Connection
The actual connections to the Faktory server use the Connection library, which aids in error handling and reconnecting. Calls to a connection are wrapped with retryable_ex. If the retries fail, the process should die and supervisors will take over.
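The retry-then-crash idea can be sketched with a hand-rolled helper. This deliberately does not use the retryable_ex API, whose options are not described in this document. Sketch.Retry.with_retries/3 is a hypothetical stand-in that retries a function a fixed number of times and re-raises the last error once the attempts are used up.

```elixir
defmodule Sketch.Retry do
  # Hypothetical helper, NOT the retryable_ex API.
  # Calls `fun` up to `tries` times, sleeping `sleep_ms` between attempts.
  def with_retries(fun, tries, sleep_ms \\ 0) when tries >= 1 do
    try do
      fun.()
    rescue
      error ->
        if tries > 1 do
          Process.sleep(sleep_ms)
          with_retries(fun, tries - 1, sleep_ms)
        else
          # Out of attempts: let the error propagate so the process dies.
          reraise error, __STACKTRACE__
        end
    end
  end
end
```

When the final attempt raises, the calling process crashes. If that process runs under a supervisor, the supervisor restarts it with a fresh connection, which matches the "let it die" behavior described above.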
Lost jobs?
No job should ever be lost, thanks to Faktory's acking semantics. If one is, it's Faktory's fault, not faktory_worker_ex's... ;)
At worst, a job will be processed more than once because a process crashed before issuing the ack.
Memory bloat?
Every job is executed in its own BEAM process. Processors and connection pools are the only long-running processes.
It's the Resque model, but efficient because it uses BEAM processes rather than Unix processes.
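Running each job in its own short-lived process can be sketched with spawn_monitor/1. How the library actually starts and watches job processes is not described in this document, so the Sketch.JobRunner module and its mapping of exit reasons onto :ack and :fail are assumptions.

```elixir
defmodule Sketch.JobRunner do
  # Hypothetical module: runs `fun` in a fresh process and waits for it to exit.
  def run(fun) do
    {pid, ref} = spawn_monitor(fun)

    receive do
      # A normal exit means the job finished, so it would be acked.
      {:DOWN, ^ref, :process, ^pid, :normal} -> :ack
      # Any other exit reason (crash, raise, kill) would be reported as a fail.
      {:DOWN, ^ref, :process, ^pid, _reason} -> :fail
    end
  end
end
```

The job's heap is freed when its process exits, so memory used by one job does not accumulate in the long-running processes. A crash in the job also cannot take the caller down, because the two processes are monitored rather than linked.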