Zookeeper coordinates servers in failover scenarios and stores naming information, configuration information and other metadata. For more information, visit: https://zookeeper.apache.org/
Kafka needs Zookeeper to maintain its state and to store topic, ACL and cluster information. For high availability, Zookeeper has to maintain a quorum, which is why an odd number of Zookeeper nodes is normally installed.
At any point in time, a majority of the ZK nodes must be up; otherwise the whole Kafka cluster goes down.
For example:
A Kafka cluster of 4 Kafka nodes would need at least 3 ZK nodes to form a quorum that can tolerate the loss of one ZK node.
In production, 5 ZK nodes are recommended, distributed across 3 datacentres (for disaster recovery). In this case, if any one DC goes down, a majority of the Zookeeper nodes (3 or 4 of the 5, which is >= 3) is still running, so both ZK and the Kafka cluster remain available.
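The datacentre scenario can be checked with a small sketch. The 2 + 2 + 1 split, the DC names and the function name are assumptions made for illustration; the text only states that 5 nodes are spread over 3 datacentres.

```python
def survives_dc_loss(layout):
    """Return True if a majority of ZK nodes stays up after losing any single DC.

    layout maps a (hypothetical) datacentre name to the number of ZK nodes in it.
    """
    total = sum(layout.values())
    majority = total // 2 + 1  # strict majority of the full ensemble
    # Check the worst case for every DC: all of its nodes disappear at once.
    return all(total - nodes >= majority for nodes in layout.values())


# Assumed layout: 5 ZK nodes over 3 datacentres (2 + 2 + 1).
print(survives_dc_loss({"dc1": 2, "dc2": 2, "dc3": 1}))  # True

# The same 5 nodes over only 2 datacentres: losing dc1 leaves 2 of 5.
print(survives_dc_loss({"dc1": 3, "dc2": 2}))            # False
```

The second call suggests why 3 datacentres are used: with only 2, one of them always holds the majority, and losing it takes ZK down.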
The number of nodes needed can be defined with a formula:
q = 2n + 1
Here q is the total number of ZK nodes and n is the number of node failures that can be tolerated.
If n = 2, the total is 5 nodes, and any 3 of them form a majority.
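As a rough illustration of the formula, the sketch below computes the node count for a given number of tolerated failures, and the reverse. The function names are made up for this example and are not part of Zookeeper or Kafka.

```python
def ensemble_size(allowed_failures):
    """q = 2n + 1: total nodes needed to tolerate n failed nodes."""
    return 2 * allowed_failures + 1


def tolerated_failures(total_nodes):
    """Inverse of the formula: how many nodes may fail while a majority stays up."""
    return (total_nodes - 1) // 2


def majority(total_nodes):
    """Smallest number of nodes that is more than half of the total."""
    return total_nodes // 2 + 1


print(ensemble_size(2))  # 5

for q in (3, 4, 5, 6):
    print(q, "nodes: majority", majority(q), "tolerates", tolerated_failures(q))
# 3 nodes: majority 2 tolerates 1
# 4 nodes: majority 3 tolerates 1
# 5 nodes: majority 3 tolerates 2
# 6 nodes: majority 4 tolerates 2
```

The loop shows why odd numbers are preferred: going from 3 to 4 nodes, or from 5 to 6, does not increase the number of failures that can be tolerated.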