๐Ÿ•ท๏ธ Crawler Inspector

URL Lookup

Direct Parameter Lookup

Raw Queries and Responses

1. Shard Calculation

Query:
Response:
Calculated Shard: 195 (from laksa198)

2. Crawled Status Check

Query:
Response:

3. Robots.txt Check

Query:
Response:

4. Spam/Ban Check

Query:
Response:

5. Seen Status Check

โ„น๏ธ Skipped - page is already crawled

📄 INDEXABLE
✅ CRAWLED (3 hours ago)
🤖 ROBOTS ALLOWED

Page Info Filters

| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | `download_http_code = 200` | HTTP 200 |
| Age cutoff | PASS | `download_stamp > now() - 6 MONTH` | 0 months ago |
| History drop | PASS | `isNull(history_drop_reason)` | No drop reason |
| Spam/ban | PASS | `fh_dont_index != 1 AND ml_spam_score = 0` | ml_spam_score=0 |
| Canonical | PASS | `meta_canonical IS NULL OR = '' OR = src_unparsed` | Not set |
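As a minimal sketch, the five filter conditions above can be combined into a single ClickHouse-style predicate. The column names come from the filter table; the table name `pages` is an assumption for illustration:

```sql
-- Hypothetical: select indexable pages by applying all five filters at once.
SELECT src_unparsed
FROM pages
WHERE download_http_code = 200                       -- HTTP status
  AND download_stamp > now() - INTERVAL 6 MONTH      -- Age cutoff
  AND isNull(history_drop_reason)                    -- History drop
  AND fh_dont_index != 1 AND ml_spam_score = 0       -- Spam/ban
  AND (meta_canonical IS NULL
       OR meta_canonical = ''
       OR meta_canonical = src_unparsed)             -- Canonical
```

A page passes only when every filter passes, which matches the all-PASS verdict shown above.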

Page Details

| Property | Value |
|---|---|
| URL | https://kb.altinity.com/altinity-kb-queries-and-syntax/group-by/ |
| Last Crawled | 2026-04-07 11:47:12 (3 hours ago) |
| First Indexed | 2022-06-23 07:53:42 (3 years ago) |
| HTTP Status Code | 200 |
| Meta Title | GROUP BY \| Altinity® Knowledge Base for ClickHouse® |
| Meta Description | Learn about the GROUP BY clause in ClickHouse® |
| Meta Canonical | null |
Boilerpipe Text
Learn about the GROUP BY clause in ClickHouse®

Internal implementation

ClickHouse® uses non-blocking hash tables, so each thread has at least one hash table. This makes it easier to avoid synchronization between multiple threads, but it has disadvantages:

- Bigger memory usage.
- The per-thread hash tables need to be merged afterwards.

Because the second step can be a bottleneck in case of a really big GROUP BY with a lot of distinct keys, another solution was made.

Two-Level

https://youtu.be/SrucFOs8Y6c?t=2132

| name | value | changed | description | min | max | readonly | type |
|---|---|---|---|---|---|---|---|
| group_by_two_level_threshold | 100000 | 0 | From what number of keys a two-level aggregation starts. 0 - the threshold is not set. | NULL | NULL | 0 | UInt64 |
| group_by_two_level_threshold_bytes | 50000000 | 0 | From what size of the aggregation state in bytes a two-level aggregation begins to be used. 0 - the threshold is not set. Two-level aggregation is used when at least one of the thresholds is triggered. | NULL | NULL | 0 | UInt64 |

To parallelize the merging of hash tables, i.e. execute the merge via multiple threads, ClickHouse uses a two-level approach:

- In the first step, ClickHouse creates 256 buckets for each thread (determined by one byte of the hash function).
- In the second step, ClickHouse can merge those 256 buckets independently by multiple threads.

https://github.com/ClickHouse/ClickHouse/blob/1ea637d996715d2a047f8cd209b478e946bdbfb0/src/Common/HashTable/TwoLevelHashTable.h#L6

GROUP BY in external memory

It utilizes a two-level GROUP BY and dumps those buckets to disk. At the last stage ClickHouse reads those buckets from disk one by one and merges them, so you should have enough RAM to hold one bucket (1/256 of the whole GROUP BY size).
https://clickhouse.com/docs/en/sql-reference/statements/select/group-by/#select-group-by-in-external-memory

optimize_aggregation_in_order GROUP BY

It usually works slower than a regular GROUP BY, because ClickHouse needs to read and process data in a specific order, which makes it much harder to parallelize reading and aggregating. But it uses much less memory, because ClickHouse can stream the result set and does not need to keep it in memory.

Last item cache

ClickHouse saves the value of the previous hash calculation, in case the next value is the same.

https://github.com/ClickHouse/ClickHouse/pull/5417
https://github.com/ClickHouse/ClickHouse/blob/808d9afd0f8110faba5ae027051bf0a64e506da3/src/Common/ColumnsHashingImpl.h#L40

StringHashMap

It actually uses 5 different hash tables:

- For empty strings
- For strings < 8 bytes
- For strings < 16 bytes
- For strings < 24 bytes
- For strings > 24 bytes

```sql
SELECT count()
FROM
(
    SELECT materialize('1234567890123456') AS key -- length(key) = 16
    FROM numbers(1000000000)
)
GROUP BY key

-- Aggregator: Aggregation method: key_string
-- Elapsed: 8.888 sec. Processed 1.00 billion rows, 8.00 GB (112.51 million rows/s., 900.11 MB/s.)

SELECT count()
FROM
(
    SELECT materialize('12345678901234567') AS key -- length(key) = 17
    FROM numbers(1000000000)
)
GROUP BY key

-- Aggregator: Aggregation method: key_string
-- Elapsed: 9.089 sec. Processed 1.00 billion rows, 8.00 GB (110.03 million rows/s., 880.22 MB/s.)

SELECT count()
FROM
(
    SELECT materialize('123456789012345678901234') AS key -- length(key) = 24
    FROM numbers(1000000000)
)
GROUP BY key

-- Aggregator: Aggregation method: key_string
-- Elapsed: 9.134 sec. Processed 1.00 billion rows, 8.00 GB (109.49 million rows/s., 875.94 MB/s.)

SELECT count()
FROM
(
    SELECT materialize('1234567890123456789012345') AS key -- length(key) = 25
    FROM numbers(1000000000)
)
GROUP BY key

-- Aggregator: Aggregation method: key_string
-- Elapsed: 12.566 sec. Processed 1.00 billion rows, 8.00 GB (79.58 million rows/s., 636.67 MB/s.)
```

| length | elapsed (sec) |
|---|---|
| 16 | 8.89 |
| 17 | 9.09 |
| 24 | 9.13 |
| 25 | 12.57 |

What the GROUP BY statement uses memory for

Hash tables. They grow with:

- The number of unique key combinations participating in GROUP BY
- The size of the keys participating in GROUP BY

States of aggregation functions. Be careful with functions whose state can use an unrestricted amount of memory and grow indefinitely:

- groupArray (groupArray(1000))
- uniqExact (uniq, uniqCombined)
- quantileExact (medianExact) (quantile, quantileTDigest)
- windowFunnel
- groupBitmap
- sequenceCount (sequenceMatch)
- *Map

Why does my GROUP BY eat all the RAM?

- Run your query with `set send_logs_level='trace'`.
- Remove all aggregation functions from the query to understand how much memory a plain GROUP BY will take.
- Then remove aggregation functions from the query one by one to understand which one takes the most memory.
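The debugging recipe above can be sketched as a short session. The table `t` and columns `key` and `user_id` are hypothetical placeholders; `FORMAT Null` discards the result set so only aggregation cost is measured:

```sql
-- Step 1: enable trace logs so the server reports the aggregation
-- method and memory usage for each query.
SET send_logs_level = 'trace'

-- Step 2: run the bare GROUP BY with no aggregation functions
-- to see the cost of the hash table alone.
SELECT key FROM t GROUP BY key FORMAT Null

-- Step 3: re-add aggregation functions one at a time; whichever
-- addition makes memory jump is the expensive state.
SELECT key, uniqExact(user_id) FROM t GROUP BY key FORMAT Null
```

Comparing the peak memory reported in the trace logs between steps 2 and 3 isolates how much each aggregation function's state contributes.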
zookeeper](https://kb.altinity.com/altinity-kb-useful-queries/table-meta-in-zookeeper/) - [Compare query\_log for 2 intervals](https://kb.altinity.com/altinity-kb-useful-queries/compare_query_log_for_2_intervals/) - [Debug hanging thing](https://kb.altinity.com/altinity-kb-useful-queries/debug-hang/) - [Handy queries for system.query\_log](https://kb.altinity.com/altinity-kb-useful-queries/query_log/) - [Ingestion metrics from system.part\_log](https://kb.altinity.com/altinity-kb-useful-queries/ingestion-rate-part_log/) - [Remove block numbers from zookeeper for removed partitions](https://kb.altinity.com/altinity-kb-useful-queries/remove_unneeded_block_numbers/) - [Removing tasks in the replication queue related to empty partitions](https://kb.altinity.com/altinity-kb-useful-queries/remove_empty_partitions_from_rq/) - [Can detached parts be dropped?](https://kb.altinity.com/altinity-kb-useful-queries/detached-parts/ "Can detached parts in ClickHouseยฎ be dropped?") - [Database Size - Table - Column size](https://kb.altinity.com/altinity-kb-useful-queries/altinity-kb-database-size-table-column-size/) - [Notes on Various Errors with respect to replication and distributed connections](https://kb.altinity.com/altinity-kb-useful-queries/connection-issues-distributed-parts/) - [Number of active parts in a partition](https://kb.altinity.com/altinity-kb-useful-queries/altinity-kb-number-of-active-parts-in-a-partition/) - [Parts consistency](https://kb.altinity.com/altinity-kb-useful-queries/parts-consistency/) - [Schema design](https://kb.altinity.com/altinity-kb-schema-design/) - [ClickHouse limitations](https://kb.altinity.com/altinity-kb-schema-design/how-much-is-too-much/ "ClickHouseยฎ limitations") - [ClickHouseยฎ row-level deduplication](https://kb.altinity.com/altinity-kb-schema-design/row-level-deduplication/) - [Column backfilling from dictionary](https://kb.altinity.com/altinity-kb-schema-design/backfill_column/ "Column backfilling with alter/update using a 
dictionary") - [Functions to count uniqs](https://kb.altinity.com/altinity-kb-schema-design/uniq-functions/) - [How to change ORDER BY](https://kb.altinity.com/altinity-kb-schema-design/change-order-by/) - [Ingestion of AggregateFunction](https://kb.altinity.com/altinity-kb-schema-design/ingestion-aggregate-function/) - [insert deduplication](https://kb.altinity.com/altinity-kb-schema-design/insert_deduplication/ "Insert Deduplication / Insert Idempotency") - [JSONEachRow, tuple, map and MVs](https://kb.altinity.com/altinity-kb-schema-design/altinity-kb-jsoneachrow-tuples-and-mvs/ "JSONEachRow, Tuples, Maps and Materialized Views") - [Pre-Aggregation approaches](https://kb.altinity.com/altinity-kb-schema-design/preaggregations/) - [SnowflakeID](https://kb.altinity.com/altinity-kb-schema-design/snowflakeid/ "SnowflakeID for Efficient Primary Keys ") - [Two columns indexing](https://kb.altinity.com/altinity-kb-schema-design/two-columns-indexing/) - [Best schema for storing many metrics registered from the single source](https://kb.altinity.com/altinity-kb-schema-design/best-schema-for-storing-many-metrics-registered-from-the-single-source/) - [Codecs](https://kb.altinity.com/altinity-kb-schema-design/codecs/) - [Codecs on array columns](https://kb.altinity.com/altinity-kb-schema-design/codecs/codecs-on-array-columns/) - [Codecs speed](https://kb.altinity.com/altinity-kb-schema-design/codecs/codecs-speed/) - [How to test different compression codecs](https://kb.altinity.com/altinity-kb-schema-design/codecs/altinity-kb-how-to-test-different-compression-codecs/) - [Dictionaries vs LowCardinality](https://kb.altinity.com/altinity-kb-schema-design/altinity-kb-dictionaries-vs-lowcardinality/) - [Flattened table](https://kb.altinity.com/altinity-kb-schema-design/flattened-table/) - [Floats vs Decimals](https://kb.altinity.com/altinity-kb-schema-design/floats-vs-decimals/) - [Ingestion performance and 
formats](https://kb.altinity.com/altinity-kb-schema-design/ingestion-performance-and-formats/) - [IPs/masks](https://kb.altinity.com/altinity-kb-schema-design/how-to-store-ips/) - [JSONAsString and Mat. View as JSON parser](https://kb.altinity.com/altinity-kb-schema-design/altinity-kb-jsonasstring-and-mat.-view-as-json-parser/) - [LowCardinality](https://kb.altinity.com/altinity-kb-schema-design/lowcardinality/) - [Materialized Views](https://kb.altinity.com/altinity-kb-schema-design/materialized-views/ "ClickHouseยฎ Materialized Views") - [Idempotent insert MV](https://kb.altinity.com/altinity-kb-schema-design/materialized-views/idempotent_inserts_mv/ "Idempotent inserts into a materialized view") - [Backfill/populate MV in a controlled manner](https://kb.altinity.com/altinity-kb-schema-design/materialized-views/backfill-populate-mv-in-a-controlled-manner/) - [Using the Altinity Kubernetes Operator for ClickHouseยฎ](https://kb.altinity.com/altinity-kb-kubernetes/) - [Istio Issues](https://kb.altinity.com/altinity-kb-kubernetes/altinity-kb-istio-user-issue-k8s/) - [Interfaces](https://kb.altinity.com/altinity-kb-interfaces/) - [clickhouse-client](https://kb.altinity.com/altinity-kb-interfaces/altinity-kb-clickhouse-client/) - [Upgrade](https://kb.altinity.com/upgrade/) - [Vulnerabilities](https://kb.altinity.com/upgrade/vulnerabilities/) - [ClickHouseยฎ Function/Engines/Settings Report](https://kb.altinity.com/upgrade/clickhouse-feature-report/) - [Removing empty parts](https://kb.altinity.com/upgrade/removing-empty-parts/) - [Removing lost parts](https://kb.altinity.com/upgrade/removing-lost-parts/) - [Dictionaries](https://kb.altinity.com/altinity-kb-dictionaries/) - [Dictionaries & arrays](https://kb.altinity.com/altinity-kb-dictionaries/dictionaries-and-arrays/) - [Dictionary on the top of several tables using VIEW](https://kb.altinity.com/altinity-kb-dictionaries/dictionary-on-top-tables/) - [Dimension table 
design](https://kb.altinity.com/altinity-kb-dictionaries/dimension_table_desing/) - [Example of PostgreSQL dictionary](https://kb.altinity.com/altinity-kb-dictionaries/example-of-postgresql-dictionary/) - [MySQL8 source for dictionaries](https://kb.altinity.com/altinity-kb-dictionaries/mysql8-source-for-dictionaries/) - [Partial updates](https://kb.altinity.com/altinity-kb-dictionaries/partial-updates/) - [range\_hashed example - open intervals](https://kb.altinity.com/altinity-kb-dictionaries/altinity-kb-range_hashed-example-open-intervals/) - [Security named collections](https://kb.altinity.com/altinity-kb-dictionaries/security-named-collections/) - [SPARSE\_HASHED VS HASHED vs HASHED\_ARRAY](https://kb.altinity.com/altinity-kb-dictionaries/altinity-kb-sparse_hashed-vs-hashed/) - [Using This Knowledge Base](https://kb.altinity.com/using-this-knowledgebase/) - [Mermaid Example](https://kb.altinity.com/using-this-knowledgebase/mermaid_example/) - [ClickHouseยฎ Upgrade eBook](https://hubs.la/Q03ccpmq0 "Upgrade eBook") - [ClickHouse TTLs in Practice (eBook)](https://altinity.com/clickhouse-ttl/?utm_campaign=34417448-CH_TTL_eBook&utm_source=KB&utm_medium=CTA_Sidebar) [Edit this page](https://github.com/Altinity/altinityknowledgebase/edit/main/content/en/altinity-kb-queries-and-syntax/group-by/_index.md) [Create child 
page](https://github.com/Altinity/altinityknowledgebase/new/main/content/en/altinity-kb-queries-and-syntax/group-by/_index.md?filename=change-me.md&value=---%0Atitle%3A+%22Long+Page+Title%22%0AlinkTitle%3A+%22Short+Nav+Title%22%0Aweight%3A+100%0Adescription%3A+%3E-%0A+++++Page+description+for+heading+and+indexes.%0A---%0A%0A%23%23+Heading%0A%0AEdit+this+template+to+create+your+new+page.%0A%0A%2A+Give+it+a+good+name%2C+ending+in+%60.md%60+-+e.g.+%60getting-started.md%60%0A%2A+Edit+the+%22front+matter%22+section+at+the+top+of+the+page+%28weight+controls+how+its+ordered+amongst+other+pages+in+the+same+directory%3B+lowest+number+first%29.%0A%2A+Add+a+good+commit+message+at+the+bottom+of+the+page+%28%3C80+characters%3B+use+the+extended+description+field+for+more+detail%29.%0A%2A+Create+a+new+branch+so+you+can+preview+your+new+file+and+request+a+review+via+Pull+Request.%0A) [Create documentation issue](https://github.com/Altinity/altinityknowledgebase/issues/new?title=/altinity-kb-queries-and-syntax/group-by/) [Print entire section](http://kb.altinity.com/altinity-kb-queries-and-syntax/group-by/printview/) - [Internal implementation](https://kb.altinity.com/altinity-kb-queries-and-syntax/group-by/#internal-implementation) - [Two-Level](https://kb.altinity.com/altinity-kb-queries-and-syntax/group-by/#two-level) - [GROUP BY in external memory](https://kb.altinity.com/altinity-kb-queries-and-syntax/group-by/#group-by-in-external-memory) - [optimize\_aggregation\_in\_order GROUP BY](https://kb.altinity.com/altinity-kb-queries-and-syntax/group-by/#optimize_aggregation_in_order-group-by) - [Last item cache](https://kb.altinity.com/altinity-kb-queries-and-syntax/group-by/#last-item-cache) - [StringHashMap](https://kb.altinity.com/altinity-kb-queries-and-syntax/group-by/#stringhashmap) - [For what GROUP BY statement use memory](https://kb.altinity.com/altinity-kb-queries-and-syntax/group-by/#for-what-group-by-statement-use-memory) - [Why my GROUP BY eat all the 
# GROUP BY

Learn about the GROUP BY clause in ClickHouseยฎ

## Internal implementation

[Code](https://github.com/ClickHouse/ClickHouse/blob/8ab5270ded39c8b044f60f73c1de00c8117ab8f2/src/Interpreters/Aggregator.cpp#L382)

ClickHouseยฎ uses per-thread (non-blocking) hash tables, so each thread has at least one hash table of its own. This removes the need to synchronize between threads, but it has disadvantages:

1. Bigger memory usage.
2. The per-thread hash tables have to be merged afterwards.

Because the second step can become a bottleneck for a really big GROUP BY with a lot of distinct keys, another solution was implemented.
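The scheme above can be sketched as a toy Python model (illustrative only, not ClickHouse code; all names are made up): each "thread" aggregates its own slice into a private hash table with no locking, and the partial tables are merged in a second step.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def aggregate_slice(rows):
    # Per-thread hash table: key -> count(); no synchronization needed.
    table = Counter()
    for key in rows:
        table[key] += 1
    return table

def parallel_group_by(rows, num_threads=4):
    # Split the input into one slice per thread.
    chunk = (len(rows) + num_threads - 1) // num_threads
    slices = [rows[i:i + chunk] for i in range(0, len(rows), chunk)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        partials = list(pool.map(aggregate_slice, slices))
    # Second step: merge the per-thread tables. With many distinct keys
    # this merge is the part that becomes a bottleneck.
    merged = Counter()
    for part in partials:
        merged.update(part)
    return merged

counts = parallel_group_by(['a', 'b', 'a', 'c', 'b', 'a'])
# counts == Counter({'a': 3, 'b': 2, 'c': 1})
```

Note that every thread may hold an entry for the same key, which is exactly the "bigger memory usage" drawback mentioned above.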
## Two-Level

<https://youtu.be/SrucFOs8Y6c?t=2132>

```
name                                 value     type    description
group_by_two_level_threshold         100000    UInt64  From what number of keys a two-level aggregation
                                                       starts. 0 - the threshold is not set.
group_by_two_level_threshold_bytes   50000000  UInt64  From what size of the aggregation state in bytes
                                                       a two-level aggregation begins to be used.
                                                       0 - the threshold is not set. Two-level
                                                       aggregation is used when at least one of the
                                                       thresholds is triggered.
```

To parallelize the merging of the hash tables, i.e. to execute the merge with multiple threads, ClickHouse uses a two-level approach:

In the first step, each thread splits its hash table into 256 buckets, determined by one byte of the key hash.

In the second step, ClickHouse can merge each of those 256 buckets independently, using multiple threads.

<https://github.com/ClickHouse/ClickHouse/blob/1ea637d996715d2a047f8cd209b478e946bdbfb0/src/Common/HashTable/TwoLevelHashTable.h#L6>

## GROUP BY in external memory

It uses two-level aggregation and dumps those buckets to disk. At the last stage ClickHouse reads the buckets back from disk one by one and merges them, so you only need enough RAM to hold one bucket (1/256 of the whole GROUP BY state).
<https://clickhouse.com/docs/en/sql-reference/statements/select/group-by/#select-group-by-in-external-memory>

## optimize\_aggregation\_in\_order GROUP BY

Usually it works slower than a regular GROUP BY, because ClickHouse needs to read and process data in a specific order, which makes it much harder to parallelize reading and aggregating. But it uses much less memory, because ClickHouse can stream the result set and does not need to keep it in memory.

## Last item cache

ClickHouse saves the result of the previous hash calculation, in case the next value turns out to be the same.

<https://github.com/ClickHouse/ClickHouse/pull/5417>

<https://github.com/ClickHouse/ClickHouse/blob/808d9afd0f8110faba5ae027051bf0a64e506da3/src/Common/ColumnsHashingImpl.h#L40>

## StringHashMap

Actually uses 5 different hash tables:

1. For empty strings
2. For strings of 1-8 bytes
3. For strings of 9-16 bytes
4. For strings of 17-24 bytes
5. For strings longer than 24 bytes

```
SELECT count()
FROM
(
    SELECT materialize('1234567890123456') AS key -- length(key) = 16
    FROM numbers(1000000000)
)
GROUP BY key

Aggregator: Aggregation method: key_string

Elapsed: 8.888 sec. Processed 1.00 billion rows, 8.00 GB (112.51 million rows/s., 900.11 MB/s.)

SELECT count()
FROM
(
    SELECT materialize('12345678901234567') AS key -- length(key) = 17
    FROM numbers(1000000000)
)
GROUP BY key

Aggregator: Aggregation method: key_string

Elapsed: 9.089 sec. Processed 1.00 billion rows, 8.00 GB (110.03 million rows/s., 880.22 MB/s.)

SELECT count()
FROM
(
    SELECT materialize('123456789012345678901234') AS key -- length(key) = 24
    FROM numbers(1000000000)
)
GROUP BY key

Aggregator: Aggregation method: key_string

Elapsed: 9.134 sec. Processed 1.00 billion rows, 8.00 GB (109.49 million rows/s., 875.94 MB/s.)

SELECT count()
FROM
(
    SELECT materialize('1234567890123456789012345') AS key -- length(key) = 25
    FROM numbers(1000000000)
)
GROUP BY key

Aggregator: Aggregation method: key_string

Elapsed: 12.566 sec. Processed 1.00 billion rows, 8.00 GB (79.58 million rows/s., 636.67 MB/s.)
```

| length(key) | Elapsed, sec |
|-------------|--------------|
| 16          | 8.89         |
| 17          | 9.09         |
| 24          | 9.13         |
| 25          | 12.57        |

## What GROUP BY uses memory for

1. Hash tables. They grow with:
   - the number of unique combinations of keys participating in GROUP BY;
   - the size of the keys participating in GROUP BY.
2. States of aggregation functions. Be careful with functions whose state can use an unrestricted amount of memory and grow indefinitely:
   - groupArray (cap it, e.g. groupArray(1000)())
   - uniqExact (consider uniq or uniqCombined instead)
   - quantileExact, medianExact (consider quantile or quantileTDigest instead)
   - windowFunnel
   - groupBitmap
   - sequenceCount, sequenceMatch
   - \*Map

## Why does my GROUP BY eat all the RAM?

1. Run your query with `set send_logs_level='trace'`.
2. Remove all aggregation functions from the query and check how much memory the bare GROUP BY takes.
3. Remove the aggregation functions from the query one by one to find out which of them takes most of the memory.

***

##### [GROUP BY tricks](https://kb.altinity.com/altinity-kb-queries-and-syntax/group-by/tricks/)

Tricks for GROUP BY memory usage optimization

Last modified 2025.01.16: [Streamlined page metadata, simplified directory structure (afe0f3c)](https://github.com/Altinity/altinityknowledgebase/commit/afe0f3c3e76e848e6941903e93f05dd41fccfea0)
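As a footnote to the memory discussion: the difference between a bounded and an unbounded aggregation state can be shown with a toy Python model (illustrative only, not ClickHouse code). A count-style state stays a single integer no matter how many rows it sees, while a uniqExact-style state stores every distinct value and grows without limit.

```python
import sys

class CountState:
    """Fixed-size state: a single integer, regardless of input size."""
    def __init__(self):
        self.n = 0
    def add(self, value):
        self.n += 1

class UniqExactState:
    """Unbounded state: keeps every distinct value it has seen."""
    def __init__(self):
        self.values = set()
    def add(self, value):
        self.values.add(value)

count_state, uniq_state = CountState(), UniqExactState()
for i in range(100_000):
    count_state.add(i)
    uniq_state.add(i)

# The set now holds 100,000 elements and keeps growing with the number of
# distinct values; the counter is still a single integer.
```

This is why replacing uniqExact with uniq or uniqCombined (which keep a fixed-size sketch instead of all values) can cut GROUP BY memory dramatically at the cost of a small approximation error.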