SadServers
  • Scenarios
  • Labs
    All Labs Linux & Bash Web Servers Databases Data Processing Docker Kubernetes CI/CD Infrastructure as Code Tooling / Applications
  • Dashboard
  • Solutions
    For Individuals For Businesses
  • Ranking
  • Newsletter
  • Documentation
    FAQ Support Pro Accounts Pro+ Accounts Business Accounts Gift API CLI/TUI Privacy Troubleshooting Interviews
  • Blog
  • Pricing
  • Gift
    Gift Purchase Gift Redeem
  • About
Log In - Sign Up
  1. Labs
  2. ClickHouse
  3. Guide

Guide

Concepts and learning path

Troubleshooting

Failure modes and fixes

Cheatsheet

Commands to keep handy

ClickHouse guide

What ClickHouse does in production

ClickHouse is optimized for analytical workloads: aggregations, scans over billions of rows, and time-series queries. Unlike row-oriented OLTP databases (Postgres, MySQL), it stores data by column — compression is excellent and queries read only the columns they need. It is not a general-purpose transactional database; use it for append-heavy analytics and reporting.

MergeTree and data on disk

Most production tables use the MergeTree engine family. Inserts create new parts on disk; background merges combine smaller parts into larger ones. Data is partitioned (often by date) and sorted by a primary key within each part. Too many small parts — from bursty inserts or failed merges — degrades performance and can block writes.

How a query is processed

A simplified workflow when you run a SELECT:

  1. Client sends query — via clickhouse-client (native port 9000) or HTTP (8123)
  2. Parser and planner — query is parsed; relevant parts are selected using the primary key and partition key
  3. Execution — data is read column-by-column from disk; aggregation/filtering runs in memory up to limits
  4. Result — rows stream back to the client; query metrics appear in system.query_log

Heavy queries hit max_memory_usage or max_execution_time and are killed — check settings at user, profile, or query level.

Server layout and ports

  • HTTP — 8123 (curl, JDBC HTTP, monitoring)
  • Native TCP — 9000 (clickhouse-client, inter-server replication)
  • Config — /etc/clickhouse-server/config.xml, users in users.xml or users.d/
  • Data — /var/lib/clickhouse/ (parts, metadata)
  • Logs — /var/log/clickhouse-server/clickhouse-server.log

System tables to know

  • system.parts — parts per table, size, rows, active status
  • system.merges — in-progress background merges
  • system.processes — currently running queries
  • system.query_log — finished query history and resource usage
  • system.replicas — replication queue, lag, readonly status (ReplicatedMergeTree)
  • system.disks — disk space per storage volume

Replication

ReplicatedMergeTree tables use ClickHouse Keeper (or ZooKeeper) to coordinate replicas. A replica can go readonly if it loses coordination or falls behind. Monitor system.replicas for queue_size, absolute_delay, and is_readonly.

Learning resources

  • ClickHouse documentation — clickhouse.com/docs (official docs)
  • MergeTree engine — clickhouse.com/docs — MergeTree
  • System tables — clickhouse.com/docs — system tables
  • Replication — clickhouse.com/docs — replication

Practice scenarios

Hands-on ClickHouse scenarios on live Linux VMs: clickhouse

Troubleshooting →
SadServersSadServers

Real-world Linux and DevOps scenarios for hands-on learning and technical assessment.

Uptime Robot ratio (30 days)
Product
  • Scenarios
  • For Individuals
  • For Businesses
  • Pricing
Resources
  • FAQ
  • Blog
  • Newsletter
Company
  • About Us
  • Support
  • Privacy Policy
  • Terms of Service
  • Contact
Connect With Us
info@sadservers.com

Made in Canada 🇨🇦
Updated: 2026-06-13 16:06 UTC – 2d2950a