Very useful (and dense) talk that goes into the details of Kafka consumer groups.
Production/consumption always occurs against the leader replica (for a partition); all other replicas are simply standbys.
The `__consumer_offsets` topic contains the committed offsets for each connected consumer group, which provides fault tolerance.
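As a minimal sketch of the consumer side of this (broker address, group id, and topic name are placeholders), committing offsets is what writes to `__consumer_offsets`, so a restarted consumer in the same group resumes from the last committed position:

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class CommitExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");
        // Commit manually so it's explicit where the offset write happens
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("demo-topic"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            for (ConsumerRecord<String, String> record : records) {
                // ... process record ...
            }
            // This is the write that lands in __consumer_offsets
            consumer.commitSync();
        }
    }
}
```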
Consumers in a group figure out what partitions to consume using a homegrown protocol.
The protocol first used ZooKeeper, which couldn't scale (expected load is 30-50k consumer groups, with 100+ consumers per group)
v2 was a custom protocol that had a broker make all the assignments, which was unsuitable because every new assignment strategy required broker-side changes
v3 (current) moves this responsibility over to a consumer in each group
FindCoordinator: hash the group id, modulo the number of partitions in `__consumer_offsets`, to get a partition; the leader of that partition is the coordinator (guaranteed to be live). See the sketch after this list.
JoinGroup: agree on an assignment algorithm + receive a member id + choose one consumer as the leader (the coordinator keeps a linked list of consumers in the order they joined; the head of the list is the leader, and entries are removed when a consumer dies)
SyncGroup: the leader decides the partition assignment and tells the coordinator, which hands this assignment to all other consumers
Heartbeat: consumers tell the coordinator they're still alive. Response can be just OK, or a command to rebalance, which is triggered when a consumer enters/leaves the group, topic metadata changes, etc.
LeaveGroup: consumer deliberately leaves the group
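To make the FindCoordinator step concrete, here is a sketch of the lookup (the helper is hypothetical, but the arithmetic mirrors the broker's hash-then-modulo; 50 is the default `offsets.topic.num.partitions`):

```java
public class CoordinatorLookup {
    // Hash the group id, mask the sign bit so negative hashCodes still map
    // to a valid partition, then take it modulo the partition count of
    // __consumer_offsets. The leader of that partition is the coordinator.
    static int coordinatorPartitionFor(String groupId, int numOffsetsPartitions) {
        return (groupId.hashCode() & 0x7fffffff) % numOffsetsPartitions;
    }

    public static void main(String[] args) {
        System.out.println(coordinatorPartitionFor("demo-group", 50));
    }
}
```

Any broker can answer a FindCoordinator request by doing this computation and returning the current leader of that partition.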
Can specify multiple assignment strategies when joining a group, in priority order. This is useful for upgrades, when each consumer may support a different set of assignment strategies. The coordinator settles on the highest-priority strategy that all consumers can support.
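For instance, listing two built-in assignors under `partition.assignment.strategy` (group id chosen for illustration) lets a rolling upgrade proceed, since the coordinator keeps choosing the highest-priority assignor that every member still supports:

```java
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.CooperativeStickyAssignor;
import org.apache.kafka.clients.consumer.RangeAssignor;

public class AssignorUpgradeConfig {
    public static Properties consumerProps() {
        Properties props = new Properties();
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "demo-group");
        // Priority order: prefer the cooperative assignor, but keep
        // RangeAssignor as a fallback while older members only support it
        props.put(ConsumerConfig.PARTITION_ASSIGNMENT_STRATEGY_CONFIG,
                CooperativeStickyAssignor.class.getName() + ","
                        + RangeAssignor.class.getName());
        return props;
    }
}
```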
As far as I can tell, consumers don't talk directly to each other
Extend `AbstractCoordinator` to implement a custom coordination strategy (at the client level, not the broker); essentially, customize consumer behavior for each of the 5 steps above
Create a new `ConsumerPartitionAssignor` to implement just a custom partition assignment strategy
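A minimal sketch of that second extension point, assuming the `ConsumerPartitionAssignor` interface from Kafka 2.4+ (the class and protocol names are made up); it just deals all partitions out round-robin:

```java
import java.util.*;
import org.apache.kafka.clients.consumer.ConsumerPartitionAssignor;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.TopicPartition;

public class EvenSpreadAssignor implements ConsumerPartitionAssignor {
    @Override
    public String name() {
        return "even-spread"; // protocol name advertised during JoinGroup
    }

    @Override
    public GroupAssignment assign(Cluster metadata, GroupSubscription group) {
        Map<String, Subscription> subs = group.groupSubscription();
        List<String> members = new ArrayList<>(subs.keySet());
        Collections.sort(members); // deterministic member order

        // Gather every partition of every subscribed topic
        SortedSet<String> topics = new TreeSet<>();
        for (Subscription s : subs.values()) topics.addAll(s.topics());
        List<TopicPartition> all = new ArrayList<>();
        for (String topic : topics) {
            Integer count = metadata.partitionCountForTopic(topic);
            if (count == null) continue; // topic missing from metadata
            for (int p = 0; p < count; p++) all.add(new TopicPartition(topic, p));
        }

        // Deal partitions out round-robin across members
        Map<String, List<TopicPartition>> perMember = new HashMap<>();
        for (String m : members) perMember.put(m, new ArrayList<>());
        for (int i = 0; i < all.size(); i++)
            perMember.get(members.get(i % members.size())).add(all.get(i));

        Map<String, Assignment> out = new HashMap<>();
        perMember.forEach((m, tps) -> out.put(m, new Assignment(tps)));
        return new GroupAssignment(out);
    }
}
```

Registering it works the same way as the built-ins: pass the class name via `partition.assignment.strategy`, as in the earlier config snippet.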
There was some explanation, towards the end, of ways this protocol is used outside the core use case, but it went by too fast for me to actually grasp:
The schema registry overrides `AbstractCoordinator` to provide a leader election system (does this depend on a Kafka broker or is it entirely standalone?)
Kafka Connect uses this to assign generic tasks (copy table X from a given source database, for example) to workers, so there's some scope to use this more generically, not just with partitions.
Kafka Streams uses a custom assignor to have a consumer join data from two source topics. This is possible because the protocol allows for arbitrary user metadata, which this strategy (ab)uses.
Kafka Streams uses RocksDB for local state, but I didn't really understand the context/motivation
I'm curious about fault-tolerance strategies for when the coordinator goes down, which this talk didn't cover.