Druid Cluster Setup
What is Druid? Apache Druid is a real-time analytics database designed for fast slice-and-dice analytics (“ OLAP ” queries) on large data sets. Druid is most often used as a database for powering use cases where real-time ingest, fast query performance, and high uptime are important. As such, Druid is commonly used for powering GUIs of analytical applications, or as a backend for highly-concurrent APIs that need fast aggregations. Druid works best with event-oriented data. Ideal Druid Setup Architecture This section describes the Druid processes and the suggested Master/Query/Data server organisation, as shown in the architecture diagram above. Processes and Servers : Druid has several process types, briefly described below: Coordinator processes manage data availability on the cluster. Overlord processes control the assignment of data ingestion workloads. Broker processes handle queries from external clients. Router processes are optional processes that c...