Posts

Showing posts from April, 2021

Druid Cluster Setup

Image
  What is Druid? Apache Druid is a real-time analytics database designed for fast slice-and-dice analytics (“ OLAP ” queries) on large data sets. Druid is most often used as a database for powering use cases where real-time ingest, fast query performance, and high uptime are important. As such, Druid is commonly used for powering GUIs of analytical applications, or as a backend for highly-concurrent APIs that need fast aggregations. Druid works best with event-oriented data. Ideal Druid Setup Architecture This section describes the Druid processes and the suggested Master/Query/Data server organisation, as shown in the architecture diagram above. Processes and Servers : Druid has several process types, briefly described below: Coordinator  processes manage data availability on the cluster. Overlord  processes control the assignment of data ingestion workloads. Broker  processes handle queries from external clients. Router  processes are optional processes that can route requests to Bro