βΉοΈ Skipped - page is already crawled
| Filter | Status | Condition | Details |
|---|---|---|---|
| HTTP status | PASS | download_http_code = 200 | HTTP 200 |
| Age cutoff | PASS | download_stamp > now() - 6 MONTH | 0.2 months ago |
| History drop | PASS | isNull(history_drop_reason) | No drop reason |
| Spam/ban | PASS | fh_dont_index != 1 AND ml_spam_score = 0 | ml_spam_score=0 |
| Canonical | PASS | meta_canonical IS NULL OR = '' OR = src_unparsed | Not set |
| Property | Value |
|---|---|
| URL | https://clickhouse.com/docs/sql-reference/statements/select/group-by |
| Last Crawled | 2026-04-04 06:15:57 (5 days ago) |
| First Indexed | 2025-02-20 23:37:52 (1 year ago) |
| HTTP Status Code | 200 |
| Meta Title | GROUP BY Clause | ClickHouse Docs |
| Meta Description | Documentation for GROUP BY Clause |
| Meta Canonical | null |
| Boilerpipe Text | GROUP BY
clause switches the
SELECT
query into an aggregation mode, which works as follows:
GROUP BY
clause contains a list of expressions (or a single expression, which is considered to be the list of length one). This list acts as a "grouping key", while each individual expression will be referred to as a "key expression".
All the expressions in the
SELECT
,
HAVING
, and
ORDER BY
clauses
must
be calculated based on key expressions
or
on
aggregate functions
over non-key expressions (including plain columns). In other words, each column selected from the table must be used either in a key expression or inside an aggregate function, but not both.
Result of aggregating
SELECT
query will contain as many rows as there were unique values of "grouping key" in source table. Usually, this significantly reduces the row count, often by orders of magnitude, but not necessarily: row count stays the same if all "grouping key" values were distinct.
When you want to group data in the table by column numbers instead of column names, enable the setting
enable_positional_arguments
.
Note
There's an additional way to run aggregation over a table. If a query contains table columns only inside aggregate functions, the
GROUP BY clause
can be omitted, and aggregation by an empty set of keys is assumed. Such queries always return exactly one row.
NULL Processing
β
For grouping, ClickHouse interprets
NULL
as a value, and
NULL==NULL
. It differs from
NULL
processing in most other contexts.
Here's an example to show what this means.
Assume you have this table:
ββxββ¬ββββyββ
β 1 β 2 β
β 2 β α΄Ία΅α΄Έα΄Έ β
β 3 β 2 β
β 3 β 3 β
β 3 β α΄Ία΅α΄Έα΄Έ β
βββββ΄βββββββ
The query
SELECT sum(x), y FROM t_null_big GROUP BY y
results in:
ββsum(x)ββ¬ββββyββ
β 4 β 2 β
β 3 β 3 β
β 5 β α΄Ία΅α΄Έα΄Έ β
ββββββββββ΄βββββββ
You can see that
GROUP BY
for
y = NULL
summed up
x
, as if
NULL
is this value.
If you pass several keys to
GROUP BY
, the result will give you all the combinations of the selection, as if
NULL
were a specific value.
ROLLUP Modifier
β
ROLLUP
modifier is used to calculate subtotals for the key expressions, based on their order in the
GROUP BY
list. The subtotals rows are added after the result table.
The subtotals are calculated in the reverse order: at first subtotals are calculated for the last key expression in the list, then for the previous one, and so on up to the first key expression.
In the subtotals rows the values of already "grouped" key expressions are set to
0
or empty line.
Note
Mind that
HAVING
clause can affect the subtotals results.
Example
Consider the table t:
ββyearββ¬βmonthββ¬βdayββ
β 2019 β 1 β 5 β
β 2019 β 1 β 15 β
β 2020 β 1 β 5 β
β 2020 β 1 β 15 β
β 2020 β 10 β 5 β
β 2020 β 10 β 15 β
ββββββββ΄ββββββββ΄ββββββ
Query:
SELECT
year
,
month
,
day
,
count
(
*
)
FROM
t
GROUP
BY
ROLLUP
(
year
,
month
,
day
)
;
As
GROUP BY
section has three key expressions, the result contains four tables with subtotals "rolled up" from right to left:
GROUP BY year, month, day
;
GROUP BY year, month
(and
day
column is filled with zeros);
GROUP BY year
(now
month, day
columns are both filled with zeros);
and totals (and all three key expression columns are zeros).
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2020 β 10 β 15 β 1 β
β 2020 β 1 β 5 β 1 β
β 2019 β 1 β 5 β 1 β
β 2020 β 1 β 15 β 1 β
β 2019 β 1 β 15 β 1 β
β 2020 β 10 β 5 β 1 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2019 β 1 β 0 β 2 β
β 2020 β 1 β 0 β 2 β
β 2020 β 10 β 0 β 2 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2019 β 0 β 0 β 2 β
β 2020 β 0 β 0 β 4 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 0 β 0 β 6 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
The same query also can be written using
WITH
keyword.
SELECT
year
,
month
,
day
,
count
(
*
)
FROM
t
GROUP
BY
year
,
month
,
day
WITH ROLLUP
;
See also
group_by_use_nulls
setting for SQL standard compatibility.
CUBE Modifier
β
CUBE
modifier is used to calculate subtotals for every combination of the key expressions in the
GROUP BY
list. The subtotals rows are added after the result table.
In the subtotals rows the values of all "grouped" key expressions are set to
0
or empty line.
Note
Mind that
HAVING
clause can affect the subtotals results.
Example
Consider the table t:
ββyearββ¬βmonthββ¬βdayββ
β 2019 β 1 β 5 β
β 2019 β 1 β 15 β
β 2020 β 1 β 5 β
β 2020 β 1 β 15 β
β 2020 β 10 β 5 β
β 2020 β 10 β 15 β
ββββββββ΄ββββββββ΄ββββββ
Query:
SELECT
year
,
month
,
day
,
count
(
*
)
FROM
t
GROUP
BY
CUBE
(
year
,
month
,
day
)
;
As
GROUP BY
section has three key expressions, the result contains eight tables with subtotals for all key expression combinations:
GROUP BY year, month, day
GROUP BY year, month
GROUP BY year, day
GROUP BY year
GROUP BY month, day
GROUP BY month
GROUP BY day
and totals.
Columns, excluded from
GROUP BY
, are filled with zeros.
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2020 β 10 β 15 β 1 β
β 2020 β 1 β 5 β 1 β
β 2019 β 1 β 5 β 1 β
β 2020 β 1 β 15 β 1 β
β 2019 β 1 β 15 β 1 β
β 2020 β 10 β 5 β 1 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2019 β 1 β 0 β 2 β
β 2020 β 1 β 0 β 2 β
β 2020 β 10 β 0 β 2 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2020 β 0 β 5 β 2 β
β 2019 β 0 β 5 β 1 β
β 2020 β 0 β 15 β 2 β
β 2019 β 0 β 15 β 1 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2019 β 0 β 0 β 2 β
β 2020 β 0 β 0 β 4 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 1 β 5 β 2 β
β 0 β 10 β 15 β 1 β
β 0 β 10 β 5 β 1 β
β 0 β 1 β 15 β 2 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 1 β 0 β 4 β
β 0 β 10 β 0 β 2 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 0 β 5 β 3 β
β 0 β 0 β 15 β 3 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 0 β 0 β 6 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
The same query also can be written using
WITH
keyword.
SELECT
year
,
month
,
day
,
count
(
*
)
FROM
t
GROUP
BY
year
,
month
,
day
WITH
CUBE
;
See also
group_by_use_nulls
setting for SQL standard compatibility.
WITH TOTALS Modifier
β
If the
WITH TOTALS
modifier is specified, another row will be calculated. This row will have key columns containing default values (zeros or empty lines), and columns of aggregate functions with the values calculated across all the rows (the "total" values).
This extra row is only produced in
JSON*
,
TabSeparated*
, and
Pretty*
formats, separately from the other rows:
In
XML
and
JSON*
formats, this row is output as a separate 'totals' field.
In
TabSeparated*
,
CSV*
and
Vertical
formats, the row comes after the main result, preceded by an empty row (after the other data).
In
Pretty*
formats, the row is output as a separate table after the main result.
In
Template
format, the row is output according to specified template.
In the other formats it is not available.
Note
totals is output in the results of
SELECT
queries, and is not output in
INSERT INTO ... SELECT
.
WITH TOTALS
can be run in different ways when
HAVING
is present. The behavior depends on the
totals_mode
setting.
Configuring Totals Processing
β
By default,
totals_mode = 'before_having'
. In this case, 'totals' is calculated across all rows, including the ones that do not pass through HAVING and
max_rows_to_group_by
.
The other alternatives include only the rows that pass through HAVING in 'totals', and behave differently with the setting
max_rows_to_group_by
and
group_by_overflow_mode = 'any'
.
after_having_exclusive
β Don't include rows that didn't pass through
max_rows_to_group_by
. In other words, 'totals' will have less than or the same number of rows as it would if
max_rows_to_group_by
were omitted.
after_having_inclusive
β Include all the rows that didn't pass through 'max_rows_to_group_by' in 'totals'. In other words, 'totals' will have more than or the same number of rows as it would if
max_rows_to_group_by
were omitted.
after_having_auto
β Count the number of rows that passed through HAVING. If it is more than a certain amount (by default, 50%), include all the rows that didn't pass through 'max_rows_to_group_by' in 'totals'. Otherwise, do not include them.
totals_auto_threshold
β By default, 0.5. The coefficient for
after_having_auto
.
If
max_rows_to_group_by
and
group_by_overflow_mode = 'any'
are not used, all variations of
after_having
are the same, and you can use any of them (for example,
after_having_auto
).
You can use
WITH TOTALS
in subqueries, including subqueries in the
JOIN
clause (in this case, the respective total values are combined).
GROUP BY ALL
β
GROUP BY ALL
is equivalent to listing all the SELECT-ed expressions that are not aggregate functions.
For example:
SELECT
a
*
2
,
b
,
count
(
c
)
,
FROM
t
GROUP
BY
ALL
is the same as
SELECT
a
*
2
,
b
,
count
(
c
)
,
FROM
t
GROUP
BY
a
*
2
,
b
For a special case that if there is a function having both aggregate functions and other fields as its arguments, the
GROUP BY
keys will contain the maximum non-aggregate fields we can extract from it.
For example:
SELECT
substring
(
a
,
4
,
2
)
,
substring
(
substring
(
a
,
1
,
2
)
,
1
,
count
(
b
)
)
FROM
t
GROUP
BY
ALL
is the same as
SELECT
substring
(
a
,
4
,
2
)
,
substring
(
substring
(
a
,
1
,
2
)
,
1
,
count
(
b
)
)
FROM
t
GROUP
BY
substring
(
a
,
4
,
2
)
,
substring
(
a
,
1
,
2
)
Examples
β
Example:
SELECT
count
(
)
,
median
(
FetchTiming
>
60
?
60
: FetchTiming
)
,
count
(
)
-
sum
(
Refresh
)
FROM
hits
As opposed to MySQL (and conforming to standard SQL), you can't get some value of some column that is not in a key or aggregate function (except constant expressions). To work around this, you can use the 'any' aggregate function (get the first encountered value) or 'min/max'.
Example:
SELECT
domainWithoutWWW
(
URL
)
AS
domain
,
count
(
)
,
any
(
Title
)
AS
title
-- getting the first occurred page header for each domain.
FROM
hits
GROUP
BY
domain
For every different key value encountered,
GROUP BY
calculates a set of aggregate function values.
GROUPING SETS modifier
β
This is the most general modifier.
This modifier allows manually specifying several aggregation key sets (grouping sets).
Aggregation is performed separately for each grouping set, and after that, all results are combined.
If a column is not presented in a grouping set, it's filled with a default value.
In other words, modifiers described above can be represented via
GROUPING SETS
.
Despite the fact that queries with
ROLLUP
,
CUBE
and
GROUPING SETS
modifiers are syntactically equal, they may perform differently.
When
GROUPING SETS
try to execute everything in parallel,
ROLLUP
and
CUBE
are executing the final merging of the aggregates in a single thread.
In the situation when source columns contain default values, it might be hard to distinguish if a row is a part of the aggregation which uses those columns as keys or not.
To solve this problem
GROUPING
function must be used.
Example
The following two queries are equivalent.
-- Query 1
SELECT
year
,
month
,
day
,
count
(
*
)
FROM
t
GROUP
BY
year
,
month
,
day
WITH ROLLUP
;
-- Query 2
SELECT
year
,
month
,
day
,
count
(
*
)
FROM
t
GROUP
BY
GROUPING SETS
(
(
year
,
month
,
day
)
,
(
year
,
month
)
,
(
year
)
,
(
)
)
;
See also
group_by_use_nulls
setting for SQL standard compatibility.
Implementation Details
β
Aggregation is one of the most important features of a column-oriented DBMS, and thus it's implementation is one of the most heavily optimized parts of ClickHouse. By default, aggregation is done in memory using a hash-table. It has 40+ specializations that are chosen automatically depending on "grouping key" data types.
GROUP BY Optimization Depending on Table Sorting Key
β
The aggregation can be performed more effectively, if a table is sorted by some key, and
GROUP BY
expression contains at least prefix of sorting key or injective functions. In this case when a new key is read from table, the in-between result of aggregation can be finalized and sent to client. This behaviour is switched on by the
optimize_aggregation_in_order
setting. Such optimization reduces memory usage during aggregation, but in some cases may slow down the query execution.
GROUP BY in External Memory
β
You can enable dumping temporary data to the disk to restrict memory usage during
GROUP BY
.
The
max_bytes_before_external_group_by
setting determines the threshold RAM consumption for dumping
GROUP BY
temporary data to the file system. If set to 0 (the default), it is disabled.
Alternatively, you can set
max_bytes_ratio_before_external_group_by
, which allows to use
GROUP BY
in external memory only once the query reaches certain threshold of used memory.
When using
max_bytes_before_external_group_by
, we recommend that you set
max_memory_usage
about twice as high (or
max_bytes_ratio_before_external_group_by=0.5
). This is necessary because there are two stages to aggregation: reading the data and forming intermediate data (1) and merging the intermediate data (2). Dumping data to the file system can only occur during stage 1. If the temporary data wasn't dumped, then stage 2 might require up to the same amount of memory as in stage 1.
For example, if
max_memory_usage
was set to 10000000000 and you want to use external aggregation, it makes sense to set
max_bytes_before_external_group_by
to 10000000000, and
max_memory_usage
to 20000000000. When external aggregation is triggered (if there was at least one dump of temporary data), maximum consumption of RAM is only slightly more than
max_bytes_before_external_group_by
.
With distributed query processing, external aggregation is performed on remote servers. In order for the requester server to use only a small amount of RAM, set
distributed_aggregation_memory_efficient
to 1.
When merging data flushed to the disk, as well as when merging results from remote servers when the
distributed_aggregation_memory_efficient
setting is enabled, consumes up to
1/256 * the_number_of_threads
from the total amount of RAM.
When external aggregation is enabled, if there was less than
max_bytes_before_external_group_by
of data (i.e.Β data was not flushed), the query runs just as fast as without external aggregation. If any temporary data was flushed, the run time will be several times longer (approximately three times).
If you have an
ORDER BY
with a
LIMIT
after
GROUP BY
, then the amount of used RAM depends on the amount of data in
LIMIT
, not in the whole table. But if the
ORDER BY
does not have
LIMIT
, do not forget to enable external sorting (
max_bytes_before_external_sort
). |
| Markdown | [Skip to main content](https://clickhouse.com/docs/sql-reference/statements/select/group-by#__docusaurus_skipToContent_fallback)
[](https://clickhouse.com/)
- [Products](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [ClickHouse Cloud Best way to use ClickHouse. Available on AWS, GCP, and Azure.](https://clickhouse.com/cloud)
- [BYOC (Bring Your Own Cloud) The fully managed ClickHouse Cloud service, Can be deployed in your AWS account.](https://clickhouse.com/cloud/bring-your-own-cloud)
- [ClickHouse Set up a database with open-source ClickHouse. ClickHouse](https://clickhouse.com/clickhouse)
- [Discover more than 100 integrations.](https://clickhouse.com/integrations)
[Discover more than 100 integrations.](https://clickhouse.com/integrations)
- [Use cases](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [Real-time analytics](https://clickhouse.com/use-cases/real-time-analytics)
- [Machine Learning & Generative AI](https://clickhouse.com/use-cases/machine-learning-and-data-science)
- [Business Intelligence](https://clickhouse.com/use-cases/data-warehousing)
- [Logs, Events, Traces](https://clickhouse.com/use-cases/observability)
- [All use cases](https://clickhouse.com/use-cases)
[All use cases](https://clickhouse.com/use-cases)
- [Documentation](https://clickhouse.com/docs)
- [Resources](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [User stories](https://clickhouse.com/user-stories)
- [Blog](https://clickhouse.com/blog)
- [Events](https://clickhouse.com/company/events)
- [Learning and certification](https://clickhouse.com/learn)
- [Comparison](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [BigQuery](https://clickhouse.com/comparison/bigquery)
- [PostgreSQL](https://clickhouse.com/comparison/postgresql)
- [Redshift](https://clickhouse.com/comparison/redshift)
- [Rockset](https://clickhouse.com/comparison/rockset)
- [Snowflake](https://clickhouse.com/comparison/snowflake)
- [Video](https://clickhouse.com/videos)
- [Demo](https://clickhouse.com/demos)
- [Pricing](https://clickhouse.com/pricing)
- [Contact](https://clickhouse.com/company/contact?loc=nav)
[46\.7k](https://github.com/ClickHouse/ClickHouse?utm_source=clickhouse&utm_medium=website&utm_campaign=website-nav)
[Search`Ctrl``K`](https://clickhouse.com/docs/search)
[Sign in](https://console.clickhouse.cloud/signIn?loc=docs-nav-signIn-cta&glxid=8659e246-673d-4fa4-88eb-6c4e61f70f24&pagePath=%2Fdocs%2Fsql-reference%2Fstatements%2Fselect%2Fgroup-by&origPath=%2Fdocs%2Fsql-reference%2Fstatements%2Fselect%2Fgroup-by&utm_ga=GA1.1.92860196.1775283359)
[Get started](https://console.clickhouse.cloud/signUp?loc=docs-nav-signUp-cta&glxid=8659e246-673d-4fa4-88eb-6c4e61f70f24&pagePath=%2Fdocs%2Fsql-reference%2Fstatements%2Fselect%2Fgroup-by&origPath=%2Fdocs%2Fsql-reference%2Fstatements%2Fselect%2Fgroup-by&utm_ga=GA1.1.92860196.1775283359)
[Get started](https://clickhouse.com/docs/introduction-clickhouse)
[Cloud](https://clickhouse.com/docs/cloud/overview)
[Manage data](https://clickhouse.com/docs/updating-data)
[Server admin](https://clickhouse.com/docs/guides/manage-and-deploy-index)
[Reference](https://clickhouse.com/docs/sql-reference)
[Integrations](https://clickhouse.com/docs/integrations)
[ClickStack](https://clickhouse.com/docs/use-cases/observability/clickstack/overview)
[chDB](https://clickhouse.com/docs/chdb)
[About](https://clickhouse.com/docs/about)
[Knowledge Base](https://clickhouse.com/docs/knowledgebase)
[English](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [English](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [ζ₯ζ¬θͺ](https://clickhouse.com/docs/jp/sql-reference/statements/select/group-by)
- [δΈζ](https://clickhouse.com/docs/zh/sql-reference/statements/select/group-by)
- [Π ΡΡΡΠΊΠΈΠΉ](https://clickhouse.com/docs/ru/sql-reference/statements/select/group-by)
- [νκ΅μ΄](https://clickhouse.com/docs/ko/sql-reference/statements/select/group-by)
[Skip to main content](https://clickhouse.com/docs/sql-reference/statements/select/group-by#__docusaurus_skipToContent_fallback)
[](https://clickhouse.com/)
- [Products](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [ClickHouse Cloud Best way to use ClickHouse. Available on AWS, GCP, and Azure.](https://clickhouse.com/cloud)
- [BYOC (Bring Your Own Cloud) The fully managed ClickHouse Cloud service, Can be deployed in your AWS account.](https://clickhouse.com/cloud/bring-your-own-cloud)
- [ClickHouse Set up a database with open-source ClickHouse. ClickHouse](https://clickhouse.com/clickhouse)
- [Discover more than 100 integrations.](https://clickhouse.com/integrations)
[Discover more than 100 integrations.](https://clickhouse.com/integrations)
- [Use cases](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [Real-time analytics](https://clickhouse.com/use-cases/real-time-analytics)
- [Machine Learning & Generative AI](https://clickhouse.com/use-cases/machine-learning-and-data-science)
- [Business Intelligence](https://clickhouse.com/use-cases/data-warehousing)
- [Logs, Events, Traces](https://clickhouse.com/use-cases/observability)
- [All use cases](https://clickhouse.com/use-cases)
[All use cases](https://clickhouse.com/use-cases)
- [Documentation](https://clickhouse.com/docs)
- [Resources](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [User stories](https://clickhouse.com/user-stories)
- [Blog](https://clickhouse.com/blog)
- [Events](https://clickhouse.com/company/events)
- [Learning and certification](https://clickhouse.com/learn)
- [Comparison](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [BigQuery](https://clickhouse.com/comparison/bigquery)
- [PostgreSQL](https://clickhouse.com/comparison/postgresql)
- [Redshift](https://clickhouse.com/comparison/redshift)
- [Rockset](https://clickhouse.com/comparison/rockset)
- [Snowflake](https://clickhouse.com/comparison/snowflake)
- [Video](https://clickhouse.com/videos)
- [Demo](https://clickhouse.com/demos)
- [Pricing](https://clickhouse.com/pricing)
- [Contact](https://clickhouse.com/company/contact?loc=nav)
[46\.7k](https://github.com/ClickHouse/ClickHouse?utm_source=clickhouse&utm_medium=website&utm_campaign=website-nav)
[Search`Ctrl``K`](https://clickhouse.com/docs/search)
[Sign in](https://console.clickhouse.cloud/signIn?loc=docs-nav-signIn-cta&glxid=8659e246-673d-4fa4-88eb-6c4e61f70f24&pagePath=%2Fdocs%2Fsql-reference%2Fstatements%2Fselect%2Fgroup-by&origPath=%2Fdocs%2Fsql-reference%2Fstatements%2Fselect%2Fgroup-by&utm_ga=GA1.1.92860196.1775283359)
[Get started](https://console.clickhouse.cloud/signUp?loc=docs-nav-signUp-cta&glxid=8659e246-673d-4fa4-88eb-6c4e61f70f24&pagePath=%2Fdocs%2Fsql-reference%2Fstatements%2Fselect%2Fgroup-by&origPath=%2Fdocs%2Fsql-reference%2Fstatements%2Fselect%2Fgroup-by&utm_ga=GA1.1.92860196.1775283359)
[Get started](https://clickhouse.com/docs/introduction-clickhouse)
[Cloud](https://clickhouse.com/docs/cloud/overview)
[Manage data](https://clickhouse.com/docs/updating-data)
[Server admin](https://clickhouse.com/docs/guides/manage-and-deploy-index)
[Reference](https://clickhouse.com/docs/sql-reference)
[Integrations](https://clickhouse.com/docs/integrations)
[ClickStack](https://clickhouse.com/docs/use-cases/observability/clickstack/overview)
[chDB](https://clickhouse.com/docs/chdb)
[About](https://clickhouse.com/docs/about)
[Knowledge Base](https://clickhouse.com/docs/knowledgebase)
[English](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [English](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [ζ₯ζ¬θͺ](https://clickhouse.com/docs/jp/sql-reference/statements/select/group-by)
- [δΈζ](https://clickhouse.com/docs/zh/sql-reference/statements/select/group-by)
- [Π ΡΡΡΠΊΠΈΠΉ](https://clickhouse.com/docs/ru/sql-reference/statements/select/group-by)
- [νκ΅μ΄](https://clickhouse.com/docs/ko/sql-reference/statements/select/group-by)
[Search`Ctrl``K`](https://clickhouse.com/docs/search)
- [Introduction](https://clickhouse.com/docs/sql-reference)
- [Syntax](https://clickhouse.com/docs/sql-reference/syntax)
- [Input and Output Formats](https://clickhouse.com/docs/sql-reference/formats)
- [Data types](https://clickhouse.com/docs/sql-reference/data-types)
- [Statements](https://clickhouse.com/docs/sql-reference/statements)
- [SELECT](https://clickhouse.com/docs/sql-reference/statements/select)
- [ALL](https://clickhouse.com/docs/sql-reference/statements/select/all)
- [APPLY](https://clickhouse.com/docs/sql-reference/statements/select/apply-modifier)
- [ARRAY JOIN](https://clickhouse.com/docs/sql-reference/statements/select/array-join)
- [DISTINCT](https://clickhouse.com/docs/sql-reference/statements/select/distinct)
- [EXCEPT](https://clickhouse.com/docs/sql-reference/statements/select/except)
- [EXCEPT](https://clickhouse.com/docs/sql-reference/statements/select/except-modifier)
- [FORMAT](https://clickhouse.com/docs/sql-reference/statements/select/format)
- [FROM](https://clickhouse.com/docs/sql-reference/statements/select/from)
- [GROUP BY](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [HAVING](https://clickhouse.com/docs/sql-reference/statements/select/having)
- [INTERSECT](https://clickhouse.com/docs/sql-reference/statements/select/intersect)
- [INTO OUTFILE](https://clickhouse.com/docs/sql-reference/statements/select/into-outfile)
- [JOIN](https://clickhouse.com/docs/sql-reference/statements/select/join)
- [LIMIT BY](https://clickhouse.com/docs/sql-reference/statements/select/limit-by)
- [LIMIT](https://clickhouse.com/docs/sql-reference/statements/select/limit)
- [OFFSET](https://clickhouse.com/docs/sql-reference/statements/select/offset)
- [ORDER BY](https://clickhouse.com/docs/sql-reference/statements/select/order-by)
- [PREWHERE](https://clickhouse.com/docs/sql-reference/statements/select/prewhere)
- [QUALIFY](https://clickhouse.com/docs/sql-reference/statements/select/qualify)
- [REPLACE](https://clickhouse.com/docs/sql-reference/statements/select/replace-modifier)
- [SAMPLE](https://clickhouse.com/docs/sql-reference/statements/select/sample)
- [UNION](https://clickhouse.com/docs/sql-reference/statements/select/union)
- [WHERE](https://clickhouse.com/docs/sql-reference/statements/select/where)
- [WITH](https://clickhouse.com/docs/sql-reference/statements/select/with)
- [INSERT INTO](https://clickhouse.com/docs/sql-reference/statements/insert-into)
- [CREATE](https://clickhouse.com/docs/sql-reference/statements/create)
- [ALTER](https://clickhouse.com/docs/sql-reference/statements/alter)
- [DELETE](https://clickhouse.com/docs/sql-reference/statements/delete)
- [SYSTEM](https://clickhouse.com/docs/sql-reference/statements/system)
- [SHOW](https://clickhouse.com/docs/sql-reference/statements/show)
- [GRANT](https://clickhouse.com/docs/sql-reference/statements/grant)
- [EXPLAIN](https://clickhouse.com/docs/sql-reference/statements/explain)
- [REVOKE](https://clickhouse.com/docs/sql-reference/statements/revoke)
- [UPDATE](https://clickhouse.com/docs/sql-reference/statements/update)
- [ATTACH](https://clickhouse.com/docs/sql-reference/statements/attach)
- [CHECK TABLE](https://clickhouse.com/docs/sql-reference/statements/check-table)
- [DESCRIBE TABLE](https://clickhouse.com/docs/sql-reference/statements/describe-table)
- [DETACH](https://clickhouse.com/docs/sql-reference/statements/detach)
- [DROP](https://clickhouse.com/docs/sql-reference/statements/drop)
- [EXISTS](https://clickhouse.com/docs/sql-reference/statements/exists)
- [KILL](https://clickhouse.com/docs/sql-reference/statements/kill)
- [OPTIMIZE](https://clickhouse.com/docs/sql-reference/statements/optimize)
- [RENAME](https://clickhouse.com/docs/sql-reference/statements/rename)
- [EXCHANGE](https://clickhouse.com/docs/sql-reference/statements/exchange)
- [SET](https://clickhouse.com/docs/sql-reference/statements/set)
- [SET ROLE](https://clickhouse.com/docs/sql-reference/statements/set-role)
- [TRUNCATE](https://clickhouse.com/docs/sql-reference/statements/truncate)
- [EXECUTE AS](https://clickhouse.com/docs/sql-reference/statements/execute_as)
- [PARALLEL WITH](https://clickhouse.com/docs/sql-reference/statements/parallel_with)
- [USE](https://clickhouse.com/docs/sql-reference/statements/use)
- [WATCH](https://clickhouse.com/docs/sql-reference/statements/watch)
- [MOVE](https://clickhouse.com/docs/sql-reference/statements/move)
- [CHECK GRANT](https://clickhouse.com/docs/sql-reference/statements/check-grant)
- [UNDROP](https://clickhouse.com/docs/sql-reference/statements/undrop)
- [Operators](https://clickhouse.com/docs/sql-reference/operators)
- [Engines](https://clickhouse.com/docs/engines)
- [Database Engines](https://clickhouse.com/docs/engines/database-engines)
- [Table Engines](https://clickhouse.com/docs/engines/table-engines)
- [Functions](https://clickhouse.com/docs/sql-reference/functions)
- [Regular functions](https://clickhouse.com/docs/sql-reference/functions/regular-functions)
- [Aggregate functions](https://clickhouse.com/docs/sql-reference/aggregate-functions)
- [Table functions](https://clickhouse.com/docs/sql-reference/table-functions)
- [Window functions](https://clickhouse.com/docs/sql-reference/window-functions)
- [Formats](https://clickhouse.com/docs/interfaces/formats)
- [Data Lakes](https://clickhouse.com/docs/sql-reference/datalakes)
- [Introduction](https://clickhouse.com/docs/sql-reference)
- [Statements](https://clickhouse.com/docs/sql-reference/statements)
- [SELECT](https://clickhouse.com/docs/sql-reference/statements/select)
- GROUP BY
[Edit this page](https://github.com/ClickHouse/ClickHouse/tree/master/docs/en/sql-reference/statements/select/group-by.md)
# GROUP BY Clause
`GROUP BY` clause switches the `SELECT` query into an aggregation mode, which works as follows:
- `GROUP BY` clause contains a list of expressions (or a single expression, which is considered to be the list of length one). This list acts as a "grouping key", while each individual expression will be referred to as a "key expression".
- All the expressions in the [SELECT](https://clickhouse.com/docs/sql-reference/statements/select), [HAVING](https://clickhouse.com/docs/sql-reference/statements/select/having), and [ORDER BY](https://clickhouse.com/docs/sql-reference/statements/select/order-by) clauses **must** be calculated based on key expressions **or** on [aggregate functions](https://clickhouse.com/docs/sql-reference/aggregate-functions) over non-key expressions (including plain columns). In other words, each column selected from the table must be used either in a key expression or inside an aggregate function, but not both.
- Result of aggregating `SELECT` query will contain as many rows as there were unique values of "grouping key" in source table. Usually, this significantly reduces the row count, often by orders of magnitude, but not necessarily: row count stays the same if all "grouping key" values were distinct.
When you want to group data in the table by column numbers instead of column names, enable the setting [enable\_positional\_arguments](https://clickhouse.com/docs/operations/settings/settings#enable_positional_arguments).
Note
There's an additional way to run aggregation over a table. If a query contains table columns only inside aggregate functions, the `GROUP BY clause` can be omitted, and aggregation by an empty set of keys is assumed. Such queries always return exactly one row.
## NULL Processing[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#null-processing "Direct link to NULL Processing")
For grouping, ClickHouse interprets [NULL](https://clickhouse.com/docs/sql-reference/syntax#null) as a value, and `NULL==NULL`. It differs from `NULL` processing in most other contexts.
Here's an example to show what this means.
Assume you have this table:
```
ββxββ¬ββββyββ
β 1 β 2 β
β 2 β α΄Ία΅α΄Έα΄Έ β
β 3 β 2 β
β 3 β 3 β
β 3 β α΄Ία΅α΄Έα΄Έ β
βββββ΄βββββββ
```
The query `SELECT sum(x), y FROM t_null_big GROUP BY y` results in:
```
ββsum(x)ββ¬ββββyββ
β 4 β 2 β
β 3 β 3 β
β 5 β α΄Ία΅α΄Έα΄Έ β
ββββββββββ΄βββββββ
```
You can see that `GROUP BY` for `y = NULL` summed up `x`, as if `NULL` is this value.
If you pass several keys to `GROUP BY`, the result will give you all the combinations of the selection, as if `NULL` were a specific value.
## ROLLUP Modifier[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#rollup-modifier "Direct link to ROLLUP Modifier")
`ROLLUP` modifier is used to calculate subtotals for the key expressions, based on their order in the `GROUP BY` list. The subtotals rows are added after the result table.
The subtotals are calculated in the reverse order: at first subtotals are calculated for the last key expression in the list, then for the previous one, and so on up to the first key expression.
In the subtotals rows the values of already "grouped" key expressions are set to `0` or empty line.
Note
Mind that [HAVING](https://clickhouse.com/docs/sql-reference/statements/select/having) clause can affect the subtotals results.
**Example**
Consider the table t:
```
ββyearββ¬βmonthββ¬βdayββ
β 2019 β 1 β 5 β
β 2019 β 1 β 15 β
β 2020 β 1 β 5 β
β 2020 β 1 β 15 β
β 2020 β 10 β 5 β
β 2020 β 10 β 15 β
ββββββββ΄ββββββββ΄ββββββ
```
Query:
```
SELECT year, month, day, count(*) FROM t GROUP BY ROLLUP(year, month, day);
```
As `GROUP BY` section has three key expressions, the result contains four tables with subtotals "rolled up" from right to left:
- `GROUP BY year, month, day`;
- `GROUP BY year, month` (and `day` column is filled with zeros);
- `GROUP BY year` (now `month, day` columns are both filled with zeros);
- and totals (and all three key expression columns are zeros).
```
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2020 β 10 β 15 β 1 β
β 2020 β 1 β 5 β 1 β
β 2019 β 1 β 5 β 1 β
β 2020 β 1 β 15 β 1 β
β 2019 β 1 β 15 β 1 β
β 2020 β 10 β 5 β 1 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2019 β 1 β 0 β 2 β
β 2020 β 1 β 0 β 2 β
β 2020 β 10 β 0 β 2 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2019 β 0 β 0 β 2 β
β 2020 β 0 β 0 β 4 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 0 β 0 β 6 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
```
The same query also can be written using `WITH` keyword.
```
SELECT year, month, day, count(*) FROM t GROUP BY year, month, day WITH ROLLUP;
```
**See also**
- [group\_by\_use\_nulls](https://clickhouse.com/docs/operations/settings/settings#group_by_use_nulls) setting for SQL standard compatibility.
## CUBE Modifier[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#cube-modifier "Direct link to CUBE Modifier")
`CUBE` modifier is used to calculate subtotals for every combination of the key expressions in the `GROUP BY` list. The subtotals rows are added after the result table.
In the subtotals rows the values of all "grouped" key expressions are set to `0` or empty line.
Note
Mind that [HAVING](https://clickhouse.com/docs/sql-reference/statements/select/having) clause can affect the subtotals results.
**Example**
Consider the table t:
```
ββyearββ¬βmonthββ¬βdayββ
β 2019 β 1 β 5 β
β 2019 β 1 β 15 β
β 2020 β 1 β 5 β
β 2020 β 1 β 15 β
β 2020 β 10 β 5 β
β 2020 β 10 β 15 β
ββββββββ΄ββββββββ΄ββββββ
```
Query:
```
SELECT year, month, day, count(*) FROM t GROUP BY CUBE(year, month, day);
```
As `GROUP BY` section has three key expressions, the result contains eight tables with subtotals for all key expression combinations:
- `GROUP BY year, month, day`
- `GROUP BY year, month`
- `GROUP BY year, day`
- `GROUP BY year`
- `GROUP BY month, day`
- `GROUP BY month`
- `GROUP BY day`
- and totals.
Columns, excluded from `GROUP BY`, are filled with zeros.
```
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2020 β 10 β 15 β 1 β
β 2020 β 1 β 5 β 1 β
β 2019 β 1 β 5 β 1 β
β 2020 β 1 β 15 β 1 β
β 2019 β 1 β 15 β 1 β
β 2020 β 10 β 5 β 1 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2019 β 1 β 0 β 2 β
β 2020 β 1 β 0 β 2 β
β 2020 β 10 β 0 β 2 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2020 β 0 β 5 β 2 β
β 2019 β 0 β 5 β 1 β
β 2020 β 0 β 15 β 2 β
β 2019 β 0 β 15 β 1 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2019 β 0 β 0 β 2 β
β 2020 β 0 β 0 β 4 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 1 β 5 β 2 β
β 0 β 10 β 15 β 1 β
β 0 β 10 β 5 β 1 β
β 0 β 1 β 15 β 2 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 1 β 0 β 4 β
β 0 β 10 β 0 β 2 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 0 β 5 β 3 β
β 0 β 0 β 15 β 3 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 0 β 0 β 6 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
```
The same query also can be written using `WITH` keyword.
```
SELECT year, month, day, count(*) FROM t GROUP BY year, month, day WITH CUBE;
```
**See also**
- [group\_by\_use\_nulls](https://clickhouse.com/docs/operations/settings/settings#group_by_use_nulls) setting for SQL standard compatibility.
## WITH TOTALS Modifier[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#with-totals-modifier "Direct link to WITH TOTALS Modifier")
If the `WITH TOTALS` modifier is specified, another row will be calculated. This row will have key columns containing default values (zeros or empty lines), and columns of aggregate functions with the values calculated across all the rows (the "total" values).
This extra row is only produced in `JSON*`, `TabSeparated*`, and `Pretty*` formats, separately from the other rows:
- In `XML` and `JSON*` formats, this row is output as a separate 'totals' field.
- In `TabSeparated*`, `CSV*` and `Vertical` formats, the row comes after the main result, preceded by an empty row (after the other data).
- In `Pretty*` formats, the row is output as a separate table after the main result.
- In `Template` format, the row is output according to specified template.
- In the other formats it is not available.
Note
totals is output in the results of `SELECT` queries, and is not output in `INSERT INTO ... SELECT`.
`WITH TOTALS` can be run in different ways when [HAVING](https://clickhouse.com/docs/sql-reference/statements/select/having) is present. The behavior depends on the `totals_mode` setting.
### Configuring Totals Processing[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#configuring-totals-processing "Direct link to Configuring Totals Processing")
By default, `totals_mode = 'before_having'`. In this case, 'totals' is calculated across all rows, including the ones that do not pass through HAVING and `max_rows_to_group_by`.
The other alternatives include only the rows that pass through HAVING in 'totals', and behave differently with the setting `max_rows_to_group_by` and `group_by_overflow_mode = 'any'`.
`after_having_exclusive` β Don't include rows that didn't pass through `max_rows_to_group_by`. In other words, 'totals' will have less than or the same number of rows as it would if `max_rows_to_group_by` were omitted.
`after_having_inclusive` β Include all the rows that didn't pass through 'max\_rows\_to\_group\_by' in 'totals'. In other words, 'totals' will have more than or the same number of rows as it would if `max_rows_to_group_by` were omitted.
`after_having_auto` β Count the number of rows that passed through HAVING. If it is more than a certain amount (by default, 50%), include all the rows that didn't pass through 'max\_rows\_to\_group\_by' in 'totals'. Otherwise, do not include them.
`totals_auto_threshold` β By default, 0.5. The coefficient for `after_having_auto`.
If `max_rows_to_group_by` and `group_by_overflow_mode = 'any'` are not used, all variations of `after_having` are the same, and you can use any of them (for example, `after_having_auto`).
You can use `WITH TOTALS` in subqueries, including subqueries in the [JOIN](https://clickhouse.com/docs/sql-reference/statements/select/join) clause (in this case, the respective total values are combined).
## GROUP BY ALL[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#group-by-all "Direct link to GROUP BY ALL")
`GROUP BY ALL` is equivalent to listing all the SELECT-ed expressions that are not aggregate functions.
For example:
```
SELECT
a * 2,
b,
count(c),
FROM t
GROUP BY ALL
```
is the same as
```
SELECT
a * 2,
b,
count(c),
FROM t
GROUP BY a * 2, b
```
For a special case that if there is a function having both aggregate functions and other fields as its arguments, the `GROUP BY` keys will contain the maximum non-aggregate fields we can extract from it.
For example:
```
SELECT
substring(a, 4, 2),
substring(substring(a, 1, 2), 1, count(b))
FROM t
GROUP BY ALL
```
is the same as
```
SELECT
substring(a, 4, 2),
substring(substring(a, 1, 2), 1, count(b))
FROM t
GROUP BY substring(a, 4, 2), substring(a, 1, 2)
```
## Examples[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#examples "Direct link to Examples")
Example:
```
SELECT
count(),
median(FetchTiming > 60 ? 60 : FetchTiming),
count() - sum(Refresh)
FROM hits
```
As opposed to MySQL (and conforming to standard SQL), you can't get some value of some column that is not in a key or aggregate function (except constant expressions). To work around this, you can use the 'any' aggregate function (get the first encountered value) or 'min/max'.
Example:
```
SELECT
domainWithoutWWW(URL) AS domain,
count(),
any(Title) AS title -- getting the first occurred page header for each domain.
FROM hits
GROUP BY domain
```
For every different key value encountered, `GROUP BY` calculates a set of aggregate function values.
## GROUPING SETS modifier[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#grouping-sets-modifier "Direct link to GROUPING SETS modifier")
This is the most general modifier. This modifier allows manually specifying several aggregation key sets (grouping sets). Aggregation is performed separately for each grouping set, and after that, all results are combined. If a column is not presented in a grouping set, it's filled with a default value.
In other words, modifiers described above can be represented via `GROUPING SETS`. Despite the fact that queries with `ROLLUP`, `CUBE` and `GROUPING SETS` modifiers are syntactically equal, they may perform differently. When `GROUPING SETS` try to execute everything in parallel, `ROLLUP` and `CUBE` are executing the final merging of the aggregates in a single thread.
In the situation when source columns contain default values, it might be hard to distinguish if a row is a part of the aggregation which uses those columns as keys or not. To solve this problem `GROUPING` function must be used.
**Example**
The following two queries are equivalent.
```
-- Query 1
SELECT year, month, day, count(*) FROM t GROUP BY year, month, day WITH ROLLUP;
-- Query 2
SELECT year, month, day, count(*) FROM t GROUP BY
GROUPING SETS
(
(year, month, day),
(year, month),
(year),
()
);
```
**See also**
- [group\_by\_use\_nulls](https://clickhouse.com/docs/operations/settings/settings#group_by_use_nulls) setting for SQL standard compatibility.
## Implementation Details[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#implementation-details "Direct link to Implementation Details")
Aggregation is one of the most important features of a column-oriented DBMS, and thus it's implementation is one of the most heavily optimized parts of ClickHouse. By default, aggregation is done in memory using a hash-table. It has 40+ specializations that are chosen automatically depending on "grouping key" data types.
### GROUP BY Optimization Depending on Table Sorting Key[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#group-by-optimization-depending-on-table-sorting-key "Direct link to GROUP BY Optimization Depending on Table Sorting Key")
The aggregation can be performed more effectively, if a table is sorted by some key, and `GROUP BY` expression contains at least prefix of sorting key or injective functions. In this case when a new key is read from table, the in-between result of aggregation can be finalized and sent to client. This behaviour is switched on by the [optimize\_aggregation\_in\_order](https://clickhouse.com/docs/operations/settings/settings#optimize_aggregation_in_order) setting. Such optimization reduces memory usage during aggregation, but in some cases may slow down the query execution.
### GROUP BY in External Memory[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#group-by-in-external-memory "Direct link to GROUP BY in External Memory")
You can enable dumping temporary data to the disk to restrict memory usage during `GROUP BY`. The [max\_bytes\_before\_external\_group\_by](https://clickhouse.com/docs/operations/settings/settings#max_bytes_before_external_group_by) setting determines the threshold RAM consumption for dumping `GROUP BY` temporary data to the file system. If set to 0 (the default), it is disabled. Alternatively, you can set [max\_bytes\_ratio\_before\_external\_group\_by](https://clickhouse.com/docs/operations/settings/settings#max_bytes_ratio_before_external_group_by), which allows to use `GROUP BY` in external memory only once the query reaches certain threshold of used memory.
When using `max_bytes_before_external_group_by`, we recommend that you set `max_memory_usage` about twice as high (or `max_bytes_ratio_before_external_group_by=0.5`). This is necessary because there are two stages to aggregation: reading the data and forming intermediate data (1) and merging the intermediate data (2). Dumping data to the file system can only occur during stage 1. If the temporary data wasn't dumped, then stage 2 might require up to the same amount of memory as in stage 1.
For example, if [max\_memory\_usage](https://clickhouse.com/docs/operations/settings/settings#max_memory_usage) was set to 10000000000 and you want to use external aggregation, it makes sense to set `max_bytes_before_external_group_by` to 10000000000, and `max_memory_usage` to 20000000000. When external aggregation is triggered (if there was at least one dump of temporary data), maximum consumption of RAM is only slightly more than `max_bytes_before_external_group_by`.
With distributed query processing, external aggregation is performed on remote servers. In order for the requester server to use only a small amount of RAM, set `distributed_aggregation_memory_efficient` to 1.
When merging data flushed to the disk, as well as when merging results from remote servers when the `distributed_aggregation_memory_efficient` setting is enabled, consumes up to `1/256 * the_number_of_threads` from the total amount of RAM.
When external aggregation is enabled, if there was less than `max_bytes_before_external_group_by` of data (i.e. data was not flushed), the query runs just as fast as without external aggregation. If any temporary data was flushed, the run time will be several times longer (approximately three times).
If you have an [ORDER BY](https://clickhouse.com/docs/sql-reference/statements/select/order-by) with a [LIMIT](https://clickhouse.com/docs/sql-reference/statements/select/limit) after `GROUP BY`, then the amount of used RAM depends on the amount of data in `LIMIT`, not in the whole table. But if the `ORDER BY` does not have `LIMIT`, do not forget to enable external sorting (`max_bytes_before_external_sort`).
[Previous FROM](https://clickhouse.com/docs/sql-reference/statements/select/from)
[Next HAVING](https://clickhouse.com/docs/sql-reference/statements/select/having)
- [NULL Processing](https://clickhouse.com/docs/sql-reference/statements/select/group-by#null-processing)
- [ROLLUP Modifier](https://clickhouse.com/docs/sql-reference/statements/select/group-by#rollup-modifier)
- [CUBE Modifier](https://clickhouse.com/docs/sql-reference/statements/select/group-by#cube-modifier)
- [WITH TOTALS Modifier](https://clickhouse.com/docs/sql-reference/statements/select/group-by#with-totals-modifier)
- [Configuring Totals Processing](https://clickhouse.com/docs/sql-reference/statements/select/group-by#configuring-totals-processing)
- [GROUP BY ALL](https://clickhouse.com/docs/sql-reference/statements/select/group-by#group-by-all)
- [Examples](https://clickhouse.com/docs/sql-reference/statements/select/group-by#examples)
- [GROUPING SETS modifier](https://clickhouse.com/docs/sql-reference/statements/select/group-by#grouping-sets-modifier)
- [Implementation Details](https://clickhouse.com/docs/sql-reference/statements/select/group-by#implementation-details)
- [GROUP BY Optimization Depending on Table Sorting Key](https://clickhouse.com/docs/sql-reference/statements/select/group-by#group-by-optimization-depending-on-table-sorting-key)
- [GROUP BY in External Memory](https://clickhouse.com/docs/sql-reference/statements/select/group-by#group-by-in-external-memory)
Was this page helpful?
###### Try ClickHouse Cloud for FREE
Separation of storage and compute, automatic scaling, built-in SQL console, and lots more. \$300 in free credits when signing up.
[Try it for Free](https://console.clickhouse.cloud/signUp?loc=doc-card-banner&glxid=8659e246-673d-4fa4-88eb-6c4e61f70f24&pagePath=%2Fdocs%2Fsql-reference%2Fstatements%2Fselect%2Fgroup-by&origPath=%2Fdocs%2Fsql-reference%2Fstatements%2Fselect%2Fgroup-by&utm_ga=GA1.1.92860196.1775283359)
Β© 2016β2026 ClickHouse, Inc.
[Trademark](https://clickhouse.com/legal/trademark-policy)Β·[Privacy](https://clickhouse.com/legal/privacy-policy)Β·[Security](https://trust.clickhouse.com/)Β·[Terms of Service](https://clickhouse.com/legal/agreements/terms-of-service)

Β© 2016β2026 ClickHouse, Inc.
[Trademark](https://clickhouse.com/legal/trademark-policy)Β·[Privacy](https://clickhouse.com/legal/privacy-policy)Β·[Security](https://trust.clickhouse.com/)Β·[Terms of Service](https://clickhouse.com/legal/agreements/terms-of-service)

[](https://clickhouse.com/)
EN
- Get startedβΌ
- CloudβΌ
- Manage dataβΌ
- Server adminβΌ
- ReferenceβΌ
- IntegrationsβΌ
- ClickStackβΌ
- chDBβΌ
- AboutβΌ
[](https://clickhouse.com/)
EN
main-menu
- IntroductionβΌ
- [Syntax](https://clickhouse.com/docs/sql-reference/syntax)
- [Input and Output Formats](https://clickhouse.com/docs/sql-reference/formats)
- Data typesβΌ
- StatementsβΌ
- SELECTβΌ
- [ALL](https://clickhouse.com/docs/sql-reference/statements/select/all)
- [APPLY](https://clickhouse.com/docs/sql-reference/statements/select/apply-modifier)
- [ARRAY JOIN](https://clickhouse.com/docs/sql-reference/statements/select/array-join)
- [DISTINCT](https://clickhouse.com/docs/sql-reference/statements/select/distinct)
- [EXCEPT](https://clickhouse.com/docs/sql-reference/statements/select/except)
- [EXCEPT](https://clickhouse.com/docs/sql-reference/statements/select/except-modifier)
- [FORMAT](https://clickhouse.com/docs/sql-reference/statements/select/format)
- [FROM](https://clickhouse.com/docs/sql-reference/statements/select/from)
- [GROUP BY](https://clickhouse.com/docs/sql-reference/statements/select/group-by)
- [HAVING](https://clickhouse.com/docs/sql-reference/statements/select/having)
- [INTERSECT](https://clickhouse.com/docs/sql-reference/statements/select/intersect)
- [INTO OUTFILE](https://clickhouse.com/docs/sql-reference/statements/select/into-outfile)
- [JOIN](https://clickhouse.com/docs/sql-reference/statements/select/join)
- [LIMIT BY](https://clickhouse.com/docs/sql-reference/statements/select/limit-by)
- [LIMIT](https://clickhouse.com/docs/sql-reference/statements/select/limit)
- [OFFSET](https://clickhouse.com/docs/sql-reference/statements/select/offset)
- [ORDER BY](https://clickhouse.com/docs/sql-reference/statements/select/order-by)
- [PREWHERE](https://clickhouse.com/docs/sql-reference/statements/select/prewhere)
- [QUALIFY](https://clickhouse.com/docs/sql-reference/statements/select/qualify)
- [REPLACE](https://clickhouse.com/docs/sql-reference/statements/select/replace-modifier)
- [SAMPLE](https://clickhouse.com/docs/sql-reference/statements/select/sample)
- [UNION](https://clickhouse.com/docs/sql-reference/statements/select/union)
- [WHERE](https://clickhouse.com/docs/sql-reference/statements/select/where)
- [WITH](https://clickhouse.com/docs/sql-reference/statements/select/with)
- [INSERT INTO](https://clickhouse.com/docs/sql-reference/statements/insert-into)
- CREATEβΌ
- ALTERβΌ
- [DELETE](https://clickhouse.com/docs/sql-reference/statements/delete)
- [SYSTEM](https://clickhouse.com/docs/sql-reference/statements/system)
- [SHOW](https://clickhouse.com/docs/sql-reference/statements/show)
- [GRANT](https://clickhouse.com/docs/sql-reference/statements/grant)
- [EXPLAIN](https://clickhouse.com/docs/sql-reference/statements/explain)
- [REVOKE](https://clickhouse.com/docs/sql-reference/statements/revoke)
- [UPDATE](https://clickhouse.com/docs/sql-reference/statements/update)
- [ATTACH](https://clickhouse.com/docs/sql-reference/statements/attach)
- [CHECK TABLE](https://clickhouse.com/docs/sql-reference/statements/check-table)
- [DESCRIBE TABLE](https://clickhouse.com/docs/sql-reference/statements/describe-table)
- [DETACH](https://clickhouse.com/docs/sql-reference/statements/detach)
- [DROP](https://clickhouse.com/docs/sql-reference/statements/drop)
- [EXISTS](https://clickhouse.com/docs/sql-reference/statements/exists)
- [KILL](https://clickhouse.com/docs/sql-reference/statements/kill)
- [OPTIMIZE](https://clickhouse.com/docs/sql-reference/statements/optimize)
- [RENAME](https://clickhouse.com/docs/sql-reference/statements/rename)
- [EXCHANGE](https://clickhouse.com/docs/sql-reference/statements/exchange)
- [SET](https://clickhouse.com/docs/sql-reference/statements/set)
- [SET ROLE](https://clickhouse.com/docs/sql-reference/statements/set-role)
- [TRUNCATE](https://clickhouse.com/docs/sql-reference/statements/truncate)
- [EXECUTE AS](https://clickhouse.com/docs/sql-reference/statements/execute_as)
- [PARALLEL WITH](https://clickhouse.com/docs/sql-reference/statements/parallel_with)
- [USE](https://clickhouse.com/docs/sql-reference/statements/use)
- [WATCH](https://clickhouse.com/docs/sql-reference/statements/watch)
- [MOVE](https://clickhouse.com/docs/sql-reference/statements/move)
- [CHECK GRANT](https://clickhouse.com/docs/sql-reference/statements/check-grant)
- [UNDROP](https://clickhouse.com/docs/sql-reference/statements/undrop)
- OperatorsβΌ
- EnginesβΌ
- FunctionsβΌ
- FormatsβΌ
- [Data Lakes](https://clickhouse.com/docs/sql-reference/datalakes) |
| Readable Markdown | `GROUP BY` clause switches the `SELECT` query into an aggregation mode, which works as follows:
- `GROUP BY` clause contains a list of expressions (or a single expression, which is considered to be the list of length one). This list acts as a "grouping key", while each individual expression will be referred to as a "key expression".
- All the expressions in the [SELECT](https://clickhouse.com/docs/sql-reference/statements/select), [HAVING](https://clickhouse.com/docs/sql-reference/statements/select/having), and [ORDER BY](https://clickhouse.com/docs/sql-reference/statements/select/order-by) clauses **must** be calculated based on key expressions **or** on [aggregate functions](https://clickhouse.com/docs/sql-reference/aggregate-functions) over non-key expressions (including plain columns). In other words, each column selected from the table must be used either in a key expression or inside an aggregate function, but not both.
- Result of aggregating `SELECT` query will contain as many rows as there were unique values of "grouping key" in source table. Usually, this significantly reduces the row count, often by orders of magnitude, but not necessarily: row count stays the same if all "grouping key" values were distinct.
When you want to group data in the table by column numbers instead of column names, enable the setting [enable\_positional\_arguments](https://clickhouse.com/docs/operations/settings/settings#enable_positional_arguments).
Note
There's an additional way to run aggregation over a table. If a query contains table columns only inside aggregate functions, the `GROUP BY clause` can be omitted, and aggregation by an empty set of keys is assumed. Such queries always return exactly one row.
## NULL Processing[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#null-processing "Direct link to NULL Processing")
For grouping, ClickHouse interprets [NULL](https://clickhouse.com/docs/sql-reference/syntax#null) as a value, and `NULL==NULL`. It differs from `NULL` processing in most other contexts.
Here's an example to show what this means.
Assume you have this table:
```
ββxββ¬ββββyββ
β 1 β 2 β
β 2 β α΄Ία΅α΄Έα΄Έ β
β 3 β 2 β
β 3 β 3 β
β 3 β α΄Ία΅α΄Έα΄Έ β
βββββ΄βββββββ
```
The query `SELECT sum(x), y FROM t_null_big GROUP BY y` results in:
```
ββsum(x)ββ¬ββββyββ
β 4 β 2 β
β 3 β 3 β
β 5 β α΄Ία΅α΄Έα΄Έ β
ββββββββββ΄βββββββ
```
You can see that `GROUP BY` for `y = NULL` summed up `x`, as if `NULL` is this value.
If you pass several keys to `GROUP BY`, the result will give you all the combinations of the selection, as if `NULL` were a specific value.
## ROLLUP Modifier[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#rollup-modifier "Direct link to ROLLUP Modifier")
`ROLLUP` modifier is used to calculate subtotals for the key expressions, based on their order in the `GROUP BY` list. The subtotals rows are added after the result table.
The subtotals are calculated in the reverse order: at first subtotals are calculated for the last key expression in the list, then for the previous one, and so on up to the first key expression.
In the subtotals rows the values of already "grouped" key expressions are set to `0` or empty line.
Note
Mind that [HAVING](https://clickhouse.com/docs/sql-reference/statements/select/having) clause can affect the subtotals results.
**Example**
Consider the table t:
```
ββyearββ¬βmonthββ¬βdayββ
β 2019 β 1 β 5 β
β 2019 β 1 β 15 β
β 2020 β 1 β 5 β
β 2020 β 1 β 15 β
β 2020 β 10 β 5 β
β 2020 β 10 β 15 β
ββββββββ΄ββββββββ΄ββββββ
```
Query:
```
SELECT year, month, day, count(*) FROM t GROUP BY ROLLUP(year, month, day);
```
As `GROUP BY` section has three key expressions, the result contains four tables with subtotals "rolled up" from right to left:
- `GROUP BY year, month, day`;
- `GROUP BY year, month` (and `day` column is filled with zeros);
- `GROUP BY year` (now `month, day` columns are both filled with zeros);
- and totals (and all three key expression columns are zeros).
```
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2020 β 10 β 15 β 1 β
β 2020 β 1 β 5 β 1 β
β 2019 β 1 β 5 β 1 β
β 2020 β 1 β 15 β 1 β
β 2019 β 1 β 15 β 1 β
β 2020 β 10 β 5 β 1 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2019 β 1 β 0 β 2 β
β 2020 β 1 β 0 β 2 β
β 2020 β 10 β 0 β 2 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2019 β 0 β 0 β 2 β
β 2020 β 0 β 0 β 4 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 0 β 0 β 6 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
```
The same query also can be written using `WITH` keyword.
```
SELECT year, month, day, count(*) FROM t GROUP BY year, month, day WITH ROLLUP;
```
**See also**
- [group\_by\_use\_nulls](https://clickhouse.com/docs/operations/settings/settings#group_by_use_nulls) setting for SQL standard compatibility.
## CUBE Modifier[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#cube-modifier "Direct link to CUBE Modifier")
`CUBE` modifier is used to calculate subtotals for every combination of the key expressions in the `GROUP BY` list. The subtotals rows are added after the result table.
In the subtotals rows the values of all "grouped" key expressions are set to `0` or empty line.
Note
Mind that [HAVING](https://clickhouse.com/docs/sql-reference/statements/select/having) clause can affect the subtotals results.
**Example**
Consider the table t:
```
ββyearββ¬βmonthββ¬βdayββ
β 2019 β 1 β 5 β
β 2019 β 1 β 15 β
β 2020 β 1 β 5 β
β 2020 β 1 β 15 β
β 2020 β 10 β 5 β
β 2020 β 10 β 15 β
ββββββββ΄ββββββββ΄ββββββ
```
Query:
```
SELECT year, month, day, count(*) FROM t GROUP BY CUBE(year, month, day);
```
As `GROUP BY` section has three key expressions, the result contains eight tables with subtotals for all key expression combinations:
- `GROUP BY year, month, day`
- `GROUP BY year, month`
- `GROUP BY year, day`
- `GROUP BY year`
- `GROUP BY month, day`
- `GROUP BY month`
- `GROUP BY day`
- and totals.
Columns, excluded from `GROUP BY`, are filled with zeros.
```
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2020 β 10 β 15 β 1 β
β 2020 β 1 β 5 β 1 β
β 2019 β 1 β 5 β 1 β
β 2020 β 1 β 15 β 1 β
β 2019 β 1 β 15 β 1 β
β 2020 β 10 β 5 β 1 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2019 β 1 β 0 β 2 β
β 2020 β 1 β 0 β 2 β
β 2020 β 10 β 0 β 2 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2020 β 0 β 5 β 2 β
β 2019 β 0 β 5 β 1 β
β 2020 β 0 β 15 β 2 β
β 2019 β 0 β 15 β 1 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 2019 β 0 β 0 β 2 β
β 2020 β 0 β 0 β 4 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 1 β 5 β 2 β
β 0 β 10 β 15 β 1 β
β 0 β 10 β 5 β 1 β
β 0 β 1 β 15 β 2 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 1 β 0 β 4 β
β 0 β 10 β 0 β 2 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 0 β 5 β 3 β
β 0 β 0 β 15 β 3 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
ββyearββ¬βmonthββ¬βdayββ¬βcount()ββ
β 0 β 0 β 0 β 6 β
ββββββββ΄ββββββββ΄ββββββ΄ββββββββββ
```
The same query also can be written using `WITH` keyword.
```
SELECT year, month, day, count(*) FROM t GROUP BY year, month, day WITH CUBE;
```
**See also**
- [group\_by\_use\_nulls](https://clickhouse.com/docs/operations/settings/settings#group_by_use_nulls) setting for SQL standard compatibility.
## WITH TOTALS Modifier[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#with-totals-modifier "Direct link to WITH TOTALS Modifier")
If the `WITH TOTALS` modifier is specified, another row will be calculated. This row will have key columns containing default values (zeros or empty lines), and columns of aggregate functions with the values calculated across all the rows (the "total" values).
This extra row is only produced in `JSON*`, `TabSeparated*`, and `Pretty*` formats, separately from the other rows:
- In `XML` and `JSON*` formats, this row is output as a separate 'totals' field.
- In `TabSeparated*`, `CSV*` and `Vertical` formats, the row comes after the main result, preceded by an empty row (after the other data).
- In `Pretty*` formats, the row is output as a separate table after the main result.
- In `Template` format, the row is output according to specified template.
- In the other formats it is not available.
Note
totals is output in the results of `SELECT` queries, and is not output in `INSERT INTO ... SELECT`.
`WITH TOTALS` can be run in different ways when [HAVING](https://clickhouse.com/docs/sql-reference/statements/select/having) is present. The behavior depends on the `totals_mode` setting.
### Configuring Totals Processing[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#configuring-totals-processing "Direct link to Configuring Totals Processing")
By default, `totals_mode = 'before_having'`. In this case, 'totals' is calculated across all rows, including the ones that do not pass through HAVING and `max_rows_to_group_by`.
The other alternatives include only the rows that pass through HAVING in 'totals', and behave differently with the setting `max_rows_to_group_by` and `group_by_overflow_mode = 'any'`.
`after_having_exclusive` β Don't include rows that didn't pass through `max_rows_to_group_by`. In other words, 'totals' will have less than or the same number of rows as it would if `max_rows_to_group_by` were omitted.
`after_having_inclusive` β Include all the rows that didn't pass through 'max\_rows\_to\_group\_by' in 'totals'. In other words, 'totals' will have more than or the same number of rows as it would if `max_rows_to_group_by` were omitted.
`after_having_auto` β Count the number of rows that passed through HAVING. If it is more than a certain amount (by default, 50%), include all the rows that didn't pass through 'max\_rows\_to\_group\_by' in 'totals'. Otherwise, do not include them.
`totals_auto_threshold` β By default, 0.5. The coefficient for `after_having_auto`.
If `max_rows_to_group_by` and `group_by_overflow_mode = 'any'` are not used, all variations of `after_having` are the same, and you can use any of them (for example, `after_having_auto`).
You can use `WITH TOTALS` in subqueries, including subqueries in the [JOIN](https://clickhouse.com/docs/sql-reference/statements/select/join) clause (in this case, the respective total values are combined).
## GROUP BY ALL[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#group-by-all "Direct link to GROUP BY ALL")
`GROUP BY ALL` is equivalent to listing all the SELECT-ed expressions that are not aggregate functions.
For example:
```
SELECT
a * 2,
b,
count(c),
FROM t
GROUP BY ALL
```
is the same as
```
SELECT
a * 2,
b,
count(c),
FROM t
GROUP BY a * 2, b
```
For a special case that if there is a function having both aggregate functions and other fields as its arguments, the `GROUP BY` keys will contain the maximum non-aggregate fields we can extract from it.
For example:
```
SELECT
substring(a, 4, 2),
substring(substring(a, 1, 2), 1, count(b))
FROM t
GROUP BY ALL
```
is the same as
```
SELECT
substring(a, 4, 2),
substring(substring(a, 1, 2), 1, count(b))
FROM t
GROUP BY substring(a, 4, 2), substring(a, 1, 2)
```
## Examples[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#examples "Direct link to Examples")
Example:
```
SELECT
count(),
median(FetchTiming > 60 ? 60 : FetchTiming),
count() - sum(Refresh)
FROM hits
```
As opposed to MySQL (and conforming to standard SQL), you can't get some value of some column that is not in a key or aggregate function (except constant expressions). To work around this, you can use the 'any' aggregate function (get the first encountered value) or 'min/max'.
Example:
```
SELECT
domainWithoutWWW(URL) AS domain,
count(),
any(Title) AS title -- getting the first occurred page header for each domain.
FROM hits
GROUP BY domain
```
For every different key value encountered, `GROUP BY` calculates a set of aggregate function values.
## GROUPING SETS modifier[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#grouping-sets-modifier "Direct link to GROUPING SETS modifier")
This is the most general modifier. This modifier allows manually specifying several aggregation key sets (grouping sets). Aggregation is performed separately for each grouping set, and after that, all results are combined. If a column is not presented in a grouping set, it's filled with a default value.
In other words, modifiers described above can be represented via `GROUPING SETS`. Despite the fact that queries with `ROLLUP`, `CUBE` and `GROUPING SETS` modifiers are syntactically equal, they may perform differently. When `GROUPING SETS` try to execute everything in parallel, `ROLLUP` and `CUBE` are executing the final merging of the aggregates in a single thread.
In the situation when source columns contain default values, it might be hard to distinguish if a row is a part of the aggregation which uses those columns as keys or not. To solve this problem `GROUPING` function must be used.
**Example**
The following two queries are equivalent.
```
-- Query 1
SELECT year, month, day, count(*) FROM t GROUP BY year, month, day WITH ROLLUP;
-- Query 2
SELECT year, month, day, count(*) FROM t GROUP BY
GROUPING SETS
(
(year, month, day),
(year, month),
(year),
()
);
```
**See also**
- [group\_by\_use\_nulls](https://clickhouse.com/docs/operations/settings/settings#group_by_use_nulls) setting for SQL standard compatibility.
## Implementation Details[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#implementation-details "Direct link to Implementation Details")
Aggregation is one of the most important features of a column-oriented DBMS, and thus it's implementation is one of the most heavily optimized parts of ClickHouse. By default, aggregation is done in memory using a hash-table. It has 40+ specializations that are chosen automatically depending on "grouping key" data types.
### GROUP BY Optimization Depending on Table Sorting Key[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#group-by-optimization-depending-on-table-sorting-key "Direct link to GROUP BY Optimization Depending on Table Sorting Key")
The aggregation can be performed more effectively, if a table is sorted by some key, and `GROUP BY` expression contains at least prefix of sorting key or injective functions. In this case when a new key is read from table, the in-between result of aggregation can be finalized and sent to client. This behaviour is switched on by the [optimize\_aggregation\_in\_order](https://clickhouse.com/docs/operations/settings/settings#optimize_aggregation_in_order) setting. Such optimization reduces memory usage during aggregation, but in some cases may slow down the query execution.
### GROUP BY in External Memory[β](https://clickhouse.com/docs/sql-reference/statements/select/group-by#group-by-in-external-memory "Direct link to GROUP BY in External Memory")
You can enable dumping temporary data to the disk to restrict memory usage during `GROUP BY`. The [max\_bytes\_before\_external\_group\_by](https://clickhouse.com/docs/operations/settings/settings#max_bytes_before_external_group_by) setting determines the threshold RAM consumption for dumping `GROUP BY` temporary data to the file system. If set to 0 (the default), it is disabled. Alternatively, you can set [max\_bytes\_ratio\_before\_external\_group\_by](https://clickhouse.com/docs/operations/settings/settings#max_bytes_ratio_before_external_group_by), which allows to use `GROUP BY` in external memory only once the query reaches certain threshold of used memory.
When using `max_bytes_before_external_group_by`, we recommend that you set `max_memory_usage` about twice as high (or `max_bytes_ratio_before_external_group_by=0.5`). This is necessary because there are two stages to aggregation: reading the data and forming intermediate data (1) and merging the intermediate data (2). Dumping data to the file system can only occur during stage 1. If the temporary data wasn't dumped, then stage 2 might require up to the same amount of memory as in stage 1.
For example, if [max\_memory\_usage](https://clickhouse.com/docs/operations/settings/settings#max_memory_usage) was set to 10000000000 and you want to use external aggregation, it makes sense to set `max_bytes_before_external_group_by` to 10000000000, and `max_memory_usage` to 20000000000. When external aggregation is triggered (if there was at least one dump of temporary data), maximum consumption of RAM is only slightly more than `max_bytes_before_external_group_by`.
With distributed query processing, external aggregation is performed on remote servers. In order for the requester server to use only a small amount of RAM, set `distributed_aggregation_memory_efficient` to 1.
When merging data flushed to the disk, as well as when merging results from remote servers when the `distributed_aggregation_memory_efficient` setting is enabled, consumes up to `1/256 * the_number_of_threads` from the total amount of RAM.
When external aggregation is enabled, if there was less than `max_bytes_before_external_group_by` of data (i.e. data was not flushed), the query runs just as fast as without external aggregation. If any temporary data was flushed, the run time will be several times longer (approximately three times).
If you have an [ORDER BY](https://clickhouse.com/docs/sql-reference/statements/select/order-by) with a [LIMIT](https://clickhouse.com/docs/sql-reference/statements/select/limit) after `GROUP BY`, then the amount of used RAM depends on the amount of data in `LIMIT`, not in the whole table. But if the `ORDER BY` does not have `LIMIT`, do not forget to enable external sorting (`max_bytes_before_external_sort`). |
| Shard | 89 (laksa) |
| Root Hash | 12633450985039531489 |
| Unparsed URL | com,clickhouse!/docs/sql-reference/statements/select/group-by s443 |