Queue congestion

The basic flow for background queues is something like this:
 * something queues up a background item [T_start]
 * some amount of time passes [dT_waiting]
 * a background queue process pulls that item and processes it, which takes some amount of time [dT_processing]

For most background tasks, the time taken to actually process [dT_processing] should be pretty small, and is a function of the actual task to be performed. Delays here are probably due to waiting on a foreign web server to respond, etc.

Much more variable is the time between queueing and processing [dT_waiting]. This is dependent on what else is in the system -- how the low-level queues are split up, how many other items need to be processed in front of us in the queues, and how long they're going to take.

As long as there are relatively few items to be processed, and they process quickly, dT_waiting will usually be very small. As long as dT_waiting + dT_processing is smaller than the typical interval between enqueueings, we can treat them as constant values that are "small enough" to ignore for most purposes.

When we suddenly flood thousands of items into the same queue, dT_waiting starts to dominate for items farther back in the queue. Assuming a single worker draining the queue and the whole flood enqueued at once, each item waits for everything ahead of it to be processed:

dT_waiting(item) = sum(dT_processing(item') for item' in predecessors(item))
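The growth of dT_waiting under a flood can be sketched with a toy single-worker model (hypothetical code, not the actual queue daemon):

```python
# Toy model: a single worker draining a FIFO queue. With a flood of
# items enqueued at once, each item's wait is the sum of the
# processing times of everything ahead of it.

def waiting_times(processing_times):
    """Return dT_waiting for each item, all enqueued simultaneously."""
    waits = []
    elapsed = 0.0
    for dt in processing_times:
        waits.append(elapsed)   # this item waits for all predecessors
        elapsed += dt           # worker then spends dt processing it
    return waits

flood = [0.5] * 10000           # 10k items, half a second each
waits = waiting_times(flood)
print(waits[0])                 # first item: no wait at all
print(waits[-1])                # last item waits 4999.5 seconds
```

Even with a fast per-item dT_processing, the tail of the flood sees waits that dwarf it.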

There are several unpleasant consequences:
 * more items sit idle in the queues than we're used to, which in large enough numbers could cause memory pressure on the queue server
 * any other items added to the same queue later won't be processed until the whole flood is complete, so they suffer a huge delay as well
 * it may take some time to get dT_waiting back down to normal -- once we've processed the flood itself, we still have to finish everything that was delayed behind it.

One big queue
Funneling most of our high-level queue transports/handler types into a single shared queue keeps ActiveMQ happy with thousands of sites, but can contribute to high latency when things are backed up, as each individual queue is processed roughly linearly.

Breakout queues
We have, for instance, the 'dist' queue for inbox distributions separated out into its own low-level queue. This helps ensure that whatever's at the head of that queue gets handled as soon as possible, even if there's a huge backup on other queues.

It doesn't, however, help to prioritize items later in the same queue: if we get 10,000 items in 'dist', it's still going to be a long time before we reach item number 10,000.

Giant item
This is roughly what we do now for on-site inbox delivery and outgoing OMB posts. We enqueue a single event ("send this notice out to a bunch of people") and a single worker tries to process all the deliveries at once.

Good:
 * only adds a single item to the queue
 * ties up only a single worker thread -- other queued items can run on other threads

Bad:
 * if we die partway through, we have to run the whole thing again from the beginning
 * risks losing later sub-events entirely if we die enough times for the queue item to be discarded as unprocessable
 * can only make use of a single worker thread -- we lose the benefits of parallelization, and might take a lot longer than necessary to complete
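The giant-item pattern can be sketched like this (hypothetical handler and deliver function, not the real code):

```python
# "Giant item" pattern sketch: one queue item carries the whole
# recipient list, and a single handler loops over it in sequence.

def handle_giant_item(notice, recipients, deliver):
    # One worker does every delivery one by one. If we crash partway
    # through, the whole item is retried from the very beginning.
    for recipient in recipients:
        deliver(notice, recipient)
```

The single loop is exactly why a mid-run failure forces a full restart and why no other worker thread can share the load.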

Fan-out flood
This is basically what we're doing now for outgoing OStatus delivery. We first enqueue a single event ("send this notice out to a bunch of people"). When we process that, we look up the actual destinations, then enqueue one item per destination server ("send this notice to this server").

Good:
 * error recovery / reprocessing can work individually for each destination
 * separate items can be handled by separate worker threads, letting us parallelize delivery and finish a lot faster

Bad:
 * a large number of items can delay processing of other events in the same queue group until the entire flood has been dealt with
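The fan-out pattern can be sketched as two handlers (hypothetical names; the enqueue and deliver callables stand in for the real transport):

```python
# Fan-out pattern sketch: the initial event expands into one queue
# item per destination, which separate workers can pick up in parallel.

def handle_fanout_event(notice, lookup_destinations, enqueue):
    # Stage 1: expand the single event into per-destination items.
    for dest in lookup_destinations(notice):
        enqueue(("deliver", notice, dest))

def handle_delivery_item(item, deliver):
    # Stage 2: each item delivers to exactly one destination, so a
    # failure retries only that destination, not the whole batch.
    _, notice, dest = item
    deliver(notice, dest)
```

Note that stage 1 dumps the entire destination list into the queue at once, which is where the flood problem comes from.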

Rolling fan-out
A possible hybrid strategy...

We first enqueue a single event ("send this notice out to a bunch of people"). When processed, that pulls the full list of destinations and breaks it up into smaller subsets of N items.

Now we enqueue some items:
 * each individual item from the first subset of N
 * a fan-out event with a list of the remaining items.

After the first subset's individual items get processed, we'll reach the fan-out event, which continues the pattern -- enqueue the next N items, then another fan-out event with the rest.
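The rolling scheme can be sketched as a self-continuing handler (hypothetical code; N and the event tuples are illustrative):

```python
# Rolling fan-out sketch with batch size N: enqueue the first N
# destinations as individual items, plus one continuation event
# carrying the remainder of the list.

N = 100

def handle_rolling_fanout(notice, remaining, enqueue):
    batch, rest = remaining[:N], remaining[N:]
    for dest in batch:
        enqueue(("deliver", notice, dest))
    if rest:
        # The continuation re-runs this handler later with the shorter
        # list, interleaving with whatever else lands in the queue
        # in the meantime.
        enqueue(("fanout", notice, rest))
```

Each pass adds at most N+1 items, so other events enqueued in between get a fair shot at a worker instead of sitting behind the whole flood.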

Good:
 * error recovery / reprocessing can work individually for each item
 * separate items can be handled by separate worker threads, letting us parallelize delivery and finish a lot faster
 * pushing only a relatively small number of items into the queue at once lets us interleave smoothly with other events being enqueued while we work

Bad:
 * if one of the fan-out events fails too many times and gets discarded, we lose the entire rest of the batch (though since it consists solely of dumping things into the queue, it's less likely to fail)
 * may take longer to deal with the complete set, as we'll end up interleaved with other events