# Parametric aggregate functions

Some aggregate functions can accept not only argument columns (used for compression), but a set of parameters – constants for initialization. The syntax is two pairs of brackets instead of one. The first is for parameters, and the second is for arguments.
## histogram

Calculates an adaptive histogram. It does not guarantee precise results.

```sql
histogram(number_of_bins)(values)
```
The function uses A Streaming Parallel Decision Tree Algorithm. The borders of histogram bins are adjusted as new data enters the function. In the common case, the widths of bins are not equal.
**Arguments**

- `values` – Expression resulting in input values.

**Parameters**

- `number_of_bins` – Upper limit for the number of bins in the histogram. The function automatically calculates the number of bins. It tries to reach the specified number of bins, but if it fails, it uses fewer bins.
**Returned values**

`Array` of `Tuple`s of the following format:

```text
[(lower_1, upper_1, height_1), ... (lower_N, upper_N, height_N)]
```

- `lower` – Lower bound of the bin.
- `upper` – Upper bound of the bin.
- `height` – Calculated height of the bin.
**Example**

```sql
SELECT histogram(5)(number + 1)
FROM (
    SELECT *
    FROM system.numbers
    LIMIT 20
)
```

```text
┌─histogram(5)(plus(number, 1))───────────────────────────────────────────┐
│ [(1,4.5,4),(4.5,8.5,4),(8.5,12.75,4.125),(12.75,17,4.625),(17,20,3.25)] │
└─────────────────────────────────────────────────────────────────────────┘
```
You can visualize a histogram with the `bar` function, for example:

```sql
WITH histogram(5)(rand() % 100) AS hist
SELECT
    arrayJoin(hist).3 AS height,
    bar(height, 0, 6, 5) AS bar
FROM (
    SELECT *
    FROM system.numbers
    LIMIT 20
)
```

```text
┌─height─┬─bar───┐
│  2.125 │ █▊    │
│   3.25 │ ██▋   │
│  5.625 │ ████▋ │
│  5.625 │ ████▋ │
│  3.375 │ ██▋   │
└────────┴───────┘
```
In this case, you should remember that you do not know the histogram bin borders.
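To make the adaptive behavior concrete, here is a small Python sketch of the streaming idea behind this kind of algorithm: every incoming value starts as its own (center, count) bin, and whenever the bin count exceeds the limit, the two closest adjacent bins are merged. This is an illustrative model of the approach, not ClickHouse's actual implementation.

```python
def streaming_histogram(values, number_of_bins):
    """Toy model of an adaptive histogram: keep at most `number_of_bins`
    [center, count] bins, merging the closest adjacent pair when over
    the limit. Illustrative only, not ClickHouse's implementation."""
    bins = []  # list of [center, count], kept sorted by center
    for v in values:
        bins.append([float(v), 1])
        bins.sort(key=lambda b: b[0])
        if len(bins) > number_of_bins:
            # find the adjacent pair with the smallest gap and merge it
            i = min(range(len(bins) - 1), key=lambda i: bins[i + 1][0] - bins[i][0])
            c1, n1 = bins[i]
            c2, n2 = bins[i + 1]
            bins[i:i + 2] = [[(c1 * n1 + c2 * n2) / (n1 + n2), n1 + n2]]
    return bins

hist = streaming_histogram(range(1, 21), 5)
print(len(hist))                 # 5 -- the bin count never exceeds the limit
print(sum(n for _, n in hist))   # 20 -- every input value is accounted for
```

Because merges depend on the order in which data arrives, the resulting bin borders shift as new data enters, which is exactly why the borders are not known in advance.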
## sequenceMatch

Checks whether the sequence contains an event chain that matches the pattern.

**Syntax**

```sql
sequenceMatch(pattern)(timestamp, cond1, cond2, ...)
```
> **Note**
>
> Events that occur during the same second may appear in the sequence in an undefined order, affecting the result.
**Arguments**

- `timestamp` – Column considered to contain time data. Typical data types are `Date` and `DateTime`. You can also use any of the supported `UInt` data types.
- `cond1`, `cond2` – Conditions that describe the chain of events. Data type: `UInt8`. You can pass up to 32 condition arguments. The function takes only the events described in these conditions into account. If the sequence contains data that isn't described in a condition, the function skips it.

**Parameters**

- `pattern` – Pattern string. See Pattern syntax.

**Returned values**

- 1, if the pattern is matched.
- 0, if the pattern isn't matched.

Type: `UInt8`.
**Pattern syntax**

- `(?N)` – Matches the condition argument at position `N`. Conditions are numbered in the `[1, 32]` range. For example, `(?1)` matches the argument passed to the `cond1` parameter.
- `.*` – Matches any number of events. You do not need conditional arguments to match this element of the pattern.
- `(?t operator value)` – Sets the time in seconds that should separate two events. For example, the pattern `(?1)(?t>1800)(?2)` matches events that occur more than 1800 seconds apart. An arbitrary number of any events can lie between these events. You can use the `>=`, `>`, `<`, `<=`, `==` operators.
**Examples**

Consider data in the `t` table:

```text
┌─time─┬─number─┐
│    1 │      1 │
│    2 │      3 │
│    3 │      2 │
└──────┴────────┘
```

Perform the query:

```sql
SELECT sequenceMatch('(?1)(?2)')(time, number = 1, number = 2)
FROM t
```

```text
┌─sequenceMatch('(?1)(?2)')(time, equals(number, 1), equals(number, 2))─┐
│                                                                     1 │
└───────────────────────────────────────────────────────────────────────┘
```
The function found the event chain where number 2 follows number 1. It skipped number 3 between them, because the number is not described as an event. If we want to take this number into account when searching for the event chain given in the example, we should make a condition for it.
```sql
SELECT sequenceMatch('(?1)(?2)')(time, number = 1, number = 2, number = 3)
FROM t
```

```text
┌─sequenceMatch('(?1)(?2)')(time, equals(number, 1), equals(number, 2), equals(number, 3))─┐
│                                                                                        0 │
└──────────────────────────────────────────────────────────────────────────────────────────┘
```
In this case, the function couldn't find the event chain matching the pattern, because the event for number 3 occurred between 1 and 2. If in the same case we checked the condition for number 4, the sequence would match the pattern.
```sql
SELECT sequenceMatch('(?1)(?2)')(time, number = 1, number = 2, number = 4)
FROM t
```

```text
┌─sequenceMatch('(?1)(?2)')(time, equals(number, 1), equals(number, 2), equals(number, 4))─┐
│                                                                                        1 │
└──────────────────────────────────────────────────────────────────────────────────────────┘
```
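The skipping behavior shown in these three examples can be modeled in a few lines of Python: events matching no condition are dropped, the rest are mapped to symbols, and the pattern becomes a regular expression. This is a simplified sketch for intuition (it assumes each event satisfies at most one condition and ignores `(?t ...)` time conditions); it is not ClickHouse's implementation.

```python
import re

def sequence_match(pattern, events, conds):
    """Toy model of sequenceMatch: `events` is a list of (timestamp, row)
    pairs, `conds` a list of predicates. Rows matching no condition are
    skipped, as the real function does. Assumes each row satisfies at
    most one condition and at most 26 conditions are used."""
    symbols = ""
    for _, row in sorted(events):
        for i, cond in enumerate(conds):
            if cond(row):
                symbols += chr(ord("a") + i)  # condition i -> a letter
                break
    # translate (?N) to the letter of condition N; .* passes through
    regex = re.sub(r"\(\?(\d+)\)", lambda m: chr(ord("a") + int(m.group(1)) - 1), pattern)
    return 1 if re.search(regex, symbols) else 0

rows = [(1, 1), (2, 3), (3, 2)]  # (time, number), as in the table above
print(sequence_match("(?1)(?2)", rows, [lambda n: n == 1, lambda n: n == 2]))                    # 1
print(sequence_match("(?1)(?2)", rows, [lambda n: n == 1, lambda n: n == 2, lambda n: n == 3]))  # 0
```

Adding `number = 3` as a condition makes the intervening event visible, so the chain `(?1)(?2)` is broken, matching the second example above.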
**See Also**

- sequenceCount
## sequenceCount

Counts the number of event chains that matched the pattern. The function searches for event chains that do not overlap: it starts to search for the next chain only after the current chain is matched.

> **Note**
>
> Events that occur during the same second may appear in the sequence in an undefined order, affecting the result.

**Syntax**

```sql
sequenceCount(pattern)(timestamp, cond1, cond2, ...)
```
**Arguments**

- `timestamp` – Column considered to contain time data. Typical data types are `Date` and `DateTime`. You can also use any of the supported `UInt` data types.
- `cond1`, `cond2` – Conditions that describe the chain of events. Data type: `UInt8`. You can pass up to 32 condition arguments. The function takes only the events described in these conditions into account. If the sequence contains data that isn't described in a condition, the function skips it.

**Parameters**

- `pattern` – Pattern string. See Pattern syntax.

**Returned values**

Number of non-overlapping event chains that are matched.

Type: `UInt64`.
**Example**

Consider data in the `t` table:

```text
┌─time─┬─number─┐
│    1 │      1 │
│    2 │      3 │
│    3 │      2 │
│    4 │      1 │
│    5 │      3 │
│    6 │      2 │
└──────┴────────┘
```

Count how many times the number 2 occurs after the number 1 with any amount of other numbers between them:

```sql
SELECT sequenceCount('(?1).*(?2)')(time, number = 1, number = 2)
FROM t
```

```text
┌─sequenceCount('(?1).*(?2)')(time, equals(number, 1), equals(number, 2))─┐
│                                                                       2 │
└─────────────────────────────────────────────────────────────────────────┘
```
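Non-overlapping counting can be sketched the same way as the `sequenceMatch` model, with one twist: chains are kept as short as possible so that the next search can start right after the current match. A hedged Python sketch (assuming each event satisfies at most one condition, no `(?t ...)` conditions):

```python
import re

def sequence_count(pattern, events, conds):
    """Toy model of sequenceCount: map events to symbols (rows matching no
    condition are skipped), then count non-overlapping regex matches.
    `.*` is made lazy so each chain ends as early as possible."""
    symbols = ""
    for _, row in sorted(events):
        for i, cond in enumerate(conds):
            if cond(row):
                symbols += chr(ord("a") + i)
                break
    regex = re.sub(r"\(\?(\d+)\)", lambda m: chr(ord("a") + int(m.group(1)) - 1), pattern)
    regex = regex.replace(".*", ".*?")  # shortest chains, so chains don't swallow each other
    return len(re.findall(regex, symbols))

rows = [(1, 1), (2, 3), (3, 2), (4, 1), (5, 3), (6, 2)]  # the table above
print(sequence_count("(?1).*(?2)", rows, [lambda n: n == 1, lambda n: n == 2]))  # 2
```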
## sequenceMatchEvents

Returns the event timestamps of the longest event chain that matched the pattern.

> **Note**
>
> Events that occur during the same second may appear in the sequence in an undefined order, affecting the result.

**Syntax**

```sql
sequenceMatchEvents(pattern)(timestamp, cond1, cond2, ...)
```
**Arguments**

- `timestamp` – Column considered to contain time data. Typical data types are `Date` and `DateTime`. You can also use any of the supported `UInt` data types.
- `cond1`, `cond2` – Conditions that describe the chain of events. Data type: `UInt8`. You can pass up to 32 condition arguments. The function takes only the events described in these conditions into account. If the sequence contains data that isn't described in a condition, the function skips it.

**Parameters**

- `pattern` – Pattern string. See Pattern syntax.

**Returned values**

Array of timestamps for the matched condition arguments `(?N)` from the event chain. The position in the array matches the position of the condition argument in the pattern.

Type: `Array`.
**Example**

Consider data in the `t` table:

```text
┌─time─┬─number─┐
│    1 │      1 │
│    2 │      3 │
│    3 │      2 │
│    4 │      1 │
│    5 │      3 │
│    6 │      2 │
└──────┴────────┘
```

Return the timestamps of events for the longest chain:

```sql
SELECT sequenceMatchEvents('(?1).*(?2).*(?1)(?3)')(time, number = 1, number = 2, number = 4)
FROM t
```

```text
┌─sequenceMatchEvents('(?1).*(?2).*(?1)(?3)')(time, equals(number, 1), equals(number, 2), equals(number, 4))─┐
│ [1,3,4]                                                                                                    │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```
**See Also**

- sequenceMatch
## windowFunnel

Searches for event chains in a sliding time window and calculates the maximum number of events that occurred from the chain.

The function works according to the algorithm:

1. The function searches for data that triggers the first condition in the chain and sets the event counter to 1. This is the moment when the sliding window starts.
2. If events from the chain occur sequentially within the window, the counter is incremented. If the sequence of events is disrupted, the counter isn't incremented.
3. If the data has multiple event chains at varying points of completion, the function only outputs the size of the longest chain.
**Syntax**

```sql
windowFunnel(window, [mode, [mode, ...]])(timestamp, cond1, cond2, ..., condN)
```
**Arguments**

- `timestamp` – Name of the column containing the timestamp. Data types supported: `Date`, `DateTime` and other unsigned integer types (note that even though `timestamp` supports the `UInt64` type, its value can't exceed the Int64 maximum, which is 2^63 - 1).
- `cond` – Conditions or data describing the chain of events. Data type: `UInt8`.
**Parameters**

- `window` – Length of the sliding window: the time interval between the first and the last condition. The unit of `window` depends on the `timestamp` itself and varies. Determined using the expression `timestamp of cond1 <= timestamp of cond2 <= ... <= timestamp of condN <= timestamp of cond1 + window`.
- `mode` – An optional argument. One or more modes can be set:
  - `'strict_deduplication'` – If the same condition holds for the sequence of events, such a repeating event interrupts further processing. Note: it may work unexpectedly if several conditions hold for the same event.
  - `'strict_order'` – Don't allow interventions of other events. E.g. in the case of `A->B->D->C`, it stops finding `A->B->C` at the `D` and the max event level is 2.
  - `'strict_increase'` – Apply conditions only to events with strictly increasing timestamps.
  - `'strict_once'` – Count each event only once in the chain even if it meets the condition several times.
  - `'allow_reentry'` – Ignore events that violate the strict order. E.g. in the case of `A->A->B->C`, it finds `A->B->C` by ignoring the redundant `A`, and the max event level is 3.
**Returned value**

The maximum number of consecutive triggered conditions from the chain within the sliding time window. All the chains in the selection are analyzed.

Type: `Integer`.
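The three algorithm steps above can be sketched in Python for the default mode (no extra modes set). This is a simplified model for intuition, not the actual implementation: conditions are plain predicates, timestamps are integers, and events that match no pending condition are simply passed over.

```python
def window_funnel(window, events, conds):
    """Toy model of windowFunnel in default mode: for each event matching
    the first condition, open a window of length `window` and advance
    through the remaining conditions in timestamp order; return the
    longest prefix of the chain that was reached."""
    events = sorted(events)
    best = 0
    for start, (t0, row0) in enumerate(events):
        if not conds[0](row0):
            continue  # a chain can only begin at a cond1 event
        level, next_cond = 1, 1
        for t, row in events[start + 1:]:
            if t > t0 + window:
                break  # the sliding window has closed
            if next_cond < len(conds) and conds[next_cond](row):
                level += 1
                next_cond += 1
        best = max(best, level)
    return best

# The four events from the example below, in timestamp order
# (1003 login, 1009 order, 1007 search, 1010 repeat order).
events = [(1, 1003), (2, 1009), (3, 1007), (4, 1010)]
conds = [lambda e: e == 1003, lambda e: e == 1009,
         lambda e: e == 1007, lambda e: e == 1010]
print(window_funnel(100, events, conds))  # 4
```

Shrinking `window` cuts the chain short: with `window = 1` only the first two conditions fit, so the level drops to 2.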
**Example**

Determine if a set period of time is enough for the user to select a phone and purchase it twice in the online store.

Set the following chain of events:

1. The user logged in to their account on the store (`eventID = 1003`).
2. The user searched for a phone (`eventID = 1007, product = 'phone'`).
3. The user placed an order (`eventID = 1009`).
4. The user made the order again (`eventID = 1010`).
Input table:

```text
┌─event_date─┬─user_id─┬───────────timestamp─┬─eventID─┬─product─┐
│ 2019-01-28 │       1 │ 2019-01-29 10:00:00 │    1003 │ phone   │
└────────────┴─────────┴─────────────────────┴─────────┴─────────┘
┌─event_date─┬─user_id─┬───────────timestamp─┬─eventID─┬─product─┐
│ 2019-01-31 │       1 │ 2019-01-31 09:00:00 │    1007 │ phone   │
└────────────┴─────────┴─────────────────────┴─────────┴─────────┘
┌─event_date─┬─user_id─┬───────────timestamp─┬─eventID─┬─product─┐
│ 2019-01-30 │       1 │ 2019-01-30 08:00:00 │    1009 │ phone   │
└────────────┴─────────┴─────────────────────┴─────────┴─────────┘
┌─event_date─┬─user_id─┬───────────timestamp─┬─eventID─┬─product─┐
│ 2019-02-01 │       1 │ 2019-02-01 08:00:00 │    1010 │ phone   │
└────────────┴─────────┴─────────────────────┴─────────┴─────────┘
```
Find out how far the user `user_id` could get through the chain in the January–February 2019 period.
Query:

```sql
SELECT
    level,
    count() AS c
FROM
(
    SELECT
        user_id,
        windowFunnel(6048000000000000)(timestamp, eventID = 1003, eventID = 1009, eventID = 1007, eventID = 1010) AS level
    FROM trend
    WHERE (event_date >= '2019-01-01') AND (event_date <= '2019-02-02')
    GROUP BY user_id
)
GROUP BY level
ORDER BY level ASC;
```
Result:

```text
┌─level─┬─c─┐
│     4 │ 1 │
└───────┴───┘
```
**Example with allow_reentry mode**

This example demonstrates how `allow_reentry` mode works with user reentry patterns:

```sql
-- Sample data: user visits checkout -> product detail -> checkout again -> payment
-- Without allow_reentry: stops at level 2 (product detail page)
-- With allow_reentry: reaches level 4 (payment completion)
SELECT
    level,
    count() AS users
FROM
(
    SELECT
        user_id,
        windowFunnel(3600, 'strict_order', 'allow_reentry')(
            timestamp,
            action = 'begin_checkout',      -- Step 1: Begin checkout
            action = 'view_product_detail', -- Step 2: View product detail
            action = 'begin_checkout',      -- Step 3: Begin checkout again (reentry)
            action = 'complete_payment'     -- Step 4: Complete payment
        ) AS level
    FROM user_events
    WHERE event_date = today()
    GROUP BY user_id
)
GROUP BY level
ORDER BY level ASC;
```
## retention

The function takes as arguments a set of conditions, from 1 to 32 arguments of type `UInt8`, that indicate whether a certain condition was met for the event.

Any condition can be specified as an argument (as in `WHERE`).

The conditions, except the first, apply in pairs: the result of the second will be true if the first and second are true, of the third if the first and third are true, and so on.
**Syntax**

```sql
retention(cond1, cond2, ..., cond32);
```
**Arguments**

- `cond` – An expression that returns a `UInt8` result (1 or 0).

**Returned value**

An array of 1s and 0s:

- 1 – the condition was met for the event.
- 0 – the condition wasn't met for the event.

Type: `UInt8`.
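The pairing rule above can be modeled in a few lines of Python (an illustrative sketch, not the server-side implementation): the first condition stands on its own, and every later condition only counts when the first one also held.

```python
def retention(conds):
    """Toy model of retention: element 0 is cond1 itself; element i (i > 0)
    is cond1 AND cond_{i+1}."""
    first = 1 if conds[0] else 0
    return [first] + [first & int(bool(c)) for c in conds[1:]]

print(retention([True, True, False]))  # [1, 1, 0]
print(retention([False, True, True]))  # [0, 0, 0] -- nothing counts without cond1
```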
**Example**

Let's consider an example of calculating the `retention` function to determine site traffic.

**1.** Create a table to illustrate the example.
```sql
CREATE TABLE retention_test(date Date, uid Int32)
ENGINE = Memory;

INSERT INTO retention_test SELECT '2020-01-01', number FROM numbers(5);
INSERT INTO retention_test SELECT '2020-01-02', number FROM numbers(10);
INSERT INTO retention_test SELECT '2020-01-03', number FROM numbers(15);
```
Input table:

Query:

```sql
SELECT * FROM retention_test
```

Result:

```text
┌───────date─┬─uid─┐
│ 2020-01-01 │   0 │
│ 2020-01-01 │   1 │
│ 2020-01-01 │   2 │
│ 2020-01-01 │   3 │
│ 2020-01-01 │   4 │
└────────────┴─────┘
┌───────date─┬─uid─┐
│ 2020-01-02 │   0 │
│ 2020-01-02 │   1 │
│ 2020-01-02 │   2 │
│ 2020-01-02 │   3 │
│ 2020-01-02 │   4 │
│ 2020-01-02 │   5 │
│ 2020-01-02 │   6 │
│ 2020-01-02 │   7 │
│ 2020-01-02 │   8 │
│ 2020-01-02 │   9 │
└────────────┴─────┘
┌───────date─┬─uid─┐
│ 2020-01-03 │   0 │
│ 2020-01-03 │   1 │
│ 2020-01-03 │   2 │
│ 2020-01-03 │   3 │
│ 2020-01-03 │   4 │
│ 2020-01-03 │   5 │
│ 2020-01-03 │   6 │
│ 2020-01-03 │   7 │
│ 2020-01-03 │   8 │
│ 2020-01-03 │   9 │
│ 2020-01-03 │  10 │
│ 2020-01-03 │  11 │
│ 2020-01-03 │  12 │
│ 2020-01-03 │  13 │
│ 2020-01-03 │  14 │
└────────────┴─────┘
```
**2.** Group users by unique ID `uid` using the `retention` function.

Query:

```sql
SELECT
    uid,
    retention(date = '2020-01-01', date = '2020-01-02', date = '2020-01-03') AS r
FROM retention_test
WHERE date IN ('2020-01-01', '2020-01-02', '2020-01-03')
GROUP BY uid
ORDER BY uid ASC
```
Result:

```text
┌─uid─┬─r───────┐
│   0 │ [1,1,1] │
│   1 │ [1,1,1] │
│   2 │ [1,1,1] │
│   3 │ [1,1,1] │
│   4 │ [1,1,1] │
│   5 │ [0,0,0] │
│   6 │ [0,0,0] │
│   7 │ [0,0,0] │
│   8 │ [0,0,0] │
│   9 │ [0,0,0] │
│  10 │ [0,0,0] │
│  11 │ [0,0,0] │
│  12 │ [0,0,0] │
│  13 │ [0,0,0] │
│  14 │ [0,0,0] │
└─────┴─────────┘
```
**3.** Calculate the total number of site visits per day.

Query:
```sql
SELECT
    sum(r[1]) AS r1,
    sum(r[2]) AS r2,
    sum(r[3]) AS r3
FROM
(
    SELECT
        uid,
        retention(date = '2020-01-01', date = '2020-01-02', date = '2020-01-03') AS r
    FROM retention_test
    WHERE date IN ('2020-01-01', '2020-01-02', '2020-01-03')
    GROUP BY uid
)
```
Result:

```text
┌─r1─┬─r2─┬─r3─┐
│  5 │  5 │  5 │
└────┴────┴────┘
```
Where:

- `r1` – the number of unique visitors who visited the site during 2020-01-01 (the `cond1` condition).
- `r2` – the number of unique visitors who visited the site during a specific time period between 2020-01-01 and 2020-01-02 (the `cond1` and `cond2` conditions).
- `r3` – the number of unique visitors who visited the site during a specific time period on 2020-01-01 and 2020-01-03 (the `cond1` and `cond3` conditions).
## uniqUpTo(N)(x)

Calculates the number of different values of the argument up to a specified limit, `N`. If the number of different argument values is greater than `N`, this function returns `N` + 1; otherwise it calculates the exact value.

Recommended for use with small `N`s, up to 10. The maximum value of `N` is 100.

For the state of an aggregate function, this function uses an amount of memory equal to 1 + `N` * the size of one value in bytes.

When dealing with strings, this function stores a non-cryptographic 8-byte hash; the calculation is approximate for strings.
For example, suppose you have a table that logs every search query made by users on your website. Each row in the table represents a single search query, with columns for the user ID, the search query, and the timestamp of the query. You can use `uniqUpTo` to generate a report that shows only the keywords that produced at least 5 unique users:

```sql
SELECT SearchPhrase
FROM SearchLog
GROUP BY SearchPhrase
HAVING uniqUpTo(4)(UserID) >= 5
```

`uniqUpTo(4)(UserID)` calculates the number of unique `UserID` values for each `SearchPhrase`, but it only counts up to 4 unique values. If there are more than 4 unique `UserID` values for a `SearchPhrase`, the function returns 5 (4 + 1). The `HAVING` clause then filters out the `SearchPhrase` values for which the number of unique `UserID` values is less than 5. This gives you a list of search keywords that were used by at least 5 unique users.
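The capped-counting behavior can be sketched in Python. This is an illustrative model only: it tracks raw values in a set rather than the 8-byte hashes the real function uses, but it shows why the state never needs more than `N` + 1 entries.

```python
def uniq_up_to(n, values):
    """Toy model of uniqUpTo(N)(x): count distinct values exactly while the
    count is <= N; as soon as more than N distinct values are seen, give up
    and return N + 1."""
    seen = set()
    for v in values:
        seen.add(v)
        if len(seen) > n:
            return n + 1  # the state is capped, so memory stays bounded
    return len(seen)

print(uniq_up_to(4, [1, 2, 3]))           # 3 (exact: at most 4 distinct values)
print(uniq_up_to(4, [1, 2, 3, 4, 5, 6]))  # 5 (= N + 1, meaning "more than N")
```

This is why `uniqUpTo(4)(UserID) >= 5` works as an "at least 5 unique users" filter: any group with 5 or more distinct users collapses to exactly 5.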
## sumMapFiltered

This function behaves the same as sumMap except that it also accepts an array of keys to filter with as a parameter. This can be especially useful when working with a high cardinality of keys.

**Syntax**

```sql
sumMapFiltered(keys_to_keep)(keys, values)
```
**Parameters**

- `keys_to_keep`: Array of keys to filter with.
- `keys`: Array of keys.
- `values`: Array of values.

**Returned Value**

Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys.
**Example**

Query:

```sql
CREATE TABLE sum_map
(
    `date` Date,
    `timeslot` DateTime,
    `statusMap` Nested(status UInt16, requests UInt64)
)
ENGINE = Log;

INSERT INTO sum_map VALUES
    ('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10]);

SELECT sumMapFiltered([1, 4, 8])(statusMap.status, statusMap.requests) FROM sum_map;
```

Result:

```text
   ┌─sumMapFiltered([1, 4, 8])(statusMap.status, statusMap.requests)─┐
1. │ ([1,4,8],[10,20,10])                                            │
   └─────────────────────────────────────────────────────────────────┘
```
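The filter-then-sum semantics can be modeled in plain Python: each row contributes parallel `keys`/`values` arrays, values are summed per key, and only the keys in `keys_to_keep` survive. This is a sketch of the semantics, not the aggregate-state machinery ClickHouse actually uses.

```python
def sum_map_filtered(keys_to_keep, rows):
    """Toy model of sumMapFiltered: `rows` is a list of (keys, values)
    array pairs; sum values per key, keep only keys in `keys_to_keep`,
    and return the surviving keys sorted with their sums."""
    keep = set(keys_to_keep)
    totals = {}
    for keys, values in rows:
        for k, v in zip(keys, values):
            if k in keep:
                totals[k] = totals.get(k, 0) + v
    sorted_keys = sorted(totals)
    return (sorted_keys, [totals[k] for k in sorted_keys])

rows = [  # the four inserted rows from the example above
    ([1, 2, 3], [10, 10, 10]),
    ([3, 4, 5], [10, 10, 10]),
    ([4, 5, 6], [10, 10, 10]),
    ([6, 7, 8], [10, 10, 10]),
]
print(sum_map_filtered([1, 4, 8], rows))  # ([1, 4, 8], [10, 20, 10])
```

Key 4 appears in two rows, so its values sum to 20; keys 1 and 8 each appear once.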
## sumMapFilteredWithOverflow

This function behaves the same as sumMap except that it also accepts an array of keys to filter with as a parameter. This can be especially useful when working with a high cardinality of keys. It differs from the sumMapFiltered function in that it does summation with overflow, i.e. it returns the same data type for the summation as the argument data type.

**Syntax**

```sql
sumMapFilteredWithOverflow(keys_to_keep)(keys, values)
```
**Parameters**

- `keys_to_keep`: Array of keys to filter with.
- `keys`: Array of keys.
- `values`: Array of values.

**Returned Value**

Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys.
**Example**

In this example we create a table `sum_map`, insert some data into it, and then use both `sumMapFilteredWithOverflow` and `sumMapFiltered` together with the `toTypeName` function to compare the results. Where `requests` was of type `UInt8` in the created table, `sumMapFiltered` has promoted the type of the summed values to `UInt64` to avoid overflow, whereas `sumMapFilteredWithOverflow` has kept the type as `UInt8`, which is not large enough to store the result, i.e. overflow has occurred.
Query:

```sql
CREATE TABLE sum_map
(
    `date` Date,
    `timeslot` DateTime,
    `statusMap` Nested(status UInt8, requests UInt8)
)
ENGINE = Log;

INSERT INTO sum_map VALUES
    ('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10]),
    ('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10]);

SELECT sumMapFilteredWithOverflow([1, 4, 8])(statusMap.status, statusMap.requests) AS summap_overflow, toTypeName(summap_overflow) FROM sum_map;

SELECT sumMapFiltered([1, 4, 8])(statusMap.status, statusMap.requests) AS summap, toTypeName(summap) FROM sum_map;
```

Result:

```text
   ┌─summap_overflow──────┬─toTypeName(summap_overflow)───────┐
1. │ ([1,4,8],[10,20,10]) │ Tuple(Array(UInt8), Array(UInt8)) │
   └──────────────────────┴───────────────────────────────────┘
   ┌─summap───────────────┬─toTypeName(summap)─────────────────┐
1. │ ([1,4,8],[10,20,10]) │ Tuple(Array(UInt8), Array(UInt64)) │
   └──────────────────────┴────────────────────────────────────┘
```
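The sums in this example happen to fit in a `UInt8`, so both functions return the same numbers. To see what "summation with overflow" means when the sum does not fit, here is a hedged Python sketch of 8-bit wrapping arithmetic (illustrative numbers, not from the table above):

```python
def sum_uint8_with_overflow(values):
    """Toy model of summation 'with overflow': the result stays UInt8, so
    the running sum wraps modulo 256 instead of being promoted to UInt64."""
    total = 0
    for v in values:
        total = (total + v) & 0xFF  # wrap at 8 bits, like UInt8 arithmetic
    return total

print(sum_uint8_with_overflow([10, 10]))    # 20 -- fits, same answer either way
print(sum_uint8_with_overflow([200, 100]))  # 44 -- 300 wraps: 300 - 256 = 44
```

With `sumMapFiltered`, 200 + 100 would be promoted to a wider type and yield 300; with the `WithOverflow` variant it silently wraps to 44.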
## sequenceNextNode

Returns a value of the next event that matched an event chain.

This is an experimental function; `SET allow_experimental_funnel_functions = 1` to enable it.

**Syntax**

```sql
sequenceNextNode(direction, base)(timestamp, event_column, base_condition, event1, event2, event3, ...)
```
**Parameters**

- `direction` – Used to navigate to directions.
  - `forward` – Moving forward.
  - `backward` – Moving backward.
- `base` – Used to set the base point.
  - `head` – Set the base point to the first event.
  - `tail` – Set the base point to the last event.
  - `first_match` – Set the base point to the first matched `event1`.
  - `last_match` – Set the base point to the last matched `event1`.
**Arguments**

- `timestamp` – Name of the column containing the timestamp. Data types supported: `Date`, `DateTime` and other unsigned integer types.
- `event_column` – Name of the column containing the value of the next event to be returned. Data types supported: `String` and `Nullable(String)`.
- `base_condition` – Condition that the base point must fulfill.
- `event1`, `event2`, ... – Conditions describing the chain of events. Data type: `UInt8`.

**Returned values**

- `event_column[next_index]` – if the pattern is matched and the next value exists.
- `NULL` – if the pattern isn't matched or the next value doesn't exist.

Type: `Nullable(String)`.
**Example**

It can be used when events are A->B->C->D->E and you want to know the event following B->C, which is D.

The query statement searching the event following A->B:

```sql
CREATE TABLE test_flow (
    dt DateTime,
    id int,
    page String)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(dt)
ORDER BY id;

INSERT INTO test_flow VALUES (1, 1, 'A') (2, 1, 'B') (3, 1, 'C') (4, 1, 'D') (5, 1, 'E');

SELECT id, sequenceNextNode('forward', 'head')(dt, page, page = 'A', page = 'A', page = 'B') AS next_flow FROM test_flow GROUP BY id;
```

Result:

```text
┌─id─┬─next_flow─┐
│  1 │ C         │
└────┴───────────┘
```
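The `forward`/`head` case can be sketched in Python to make the base-point logic concrete: the first event is the base point, it must satisfy `base_condition`, the event conditions must match consecutive events starting from the base, and the value right after the chain is returned. This is a simplified model of one mode only, not the actual implementation.

```python
def sequence_next_node_forward_head(events, base_condition, event_conds):
    """Toy model of sequenceNextNode('forward', 'head') for one id:
    `events` is a list of (timestamp, value) rows. Returns the value
    following the matched chain, or None when anything fails."""
    values = [v for _, v in sorted(events)]
    if not values or not base_condition(values[0]):
        return None  # the head is the base point; it must satisfy base_condition
    for i, cond in enumerate(event_conds):
        if i >= len(values) or not cond(values[i]):
            return None  # the chain must match consecutive events from the base
    # the "next node" is the value right after the matched chain
    return values[len(event_conds)] if len(event_conds) < len(values) else None

rows = [(1, 'A'), (2, 'B'), (3, 'C'), (4, 'D'), (5, 'E')]  # the example above
conds = [lambda p: p == 'A', lambda p: p == 'B']
print(sequence_next_node_forward_head(rows, lambda p: p == 'A', conds))  # 'C'
```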
**Behavior for `forward` and `head`**

```sql
ALTER TABLE test_flow DELETE WHERE 1 = 1 SETTINGS mutations_sync = 1;

INSERT INTO test_flow VALUES (1, 1, 'Home') (2, 1, 'Gift') (3, 1, 'Exit');
INSERT INTO test_flow VALUES (1, 2, 'Home') (2, 2, 'Home') (3, 2, 'Gift') (4, 2, 'Basket');
INSERT INTO test_flow VALUES (1, 3, 'Gift') (2, 3, 'Home') (3, 3, 'Gift') (4, 3, 'Basket');

SELECT id, sequenceNextNode('forward', 'head')(dt, page, page = 'Home', page = 'Home', page = 'Gift') FROM test_flow GROUP BY id;
```

```text
                 dt   id   page
1970-01-01 09:00:01    1   Home   // Base point, matched with Home
1970-01-01 09:00:02    1   Gift   // Matched with Gift
1970-01-01 09:00:03    1   Exit   // The result

1970-01-01 09:00:01    2   Home   // Base point, matched with Home
1970-01-01 09:00:02    2   Home   // Unmatched with Gift
1970-01-01 09:00:03    2   Gift
1970-01-01 09:00:04    2   Basket

1970-01-01 09:00:01    3   Gift   // Base point, unmatched with Home
1970-01-01 09:00:02    3   Home
1970-01-01 09:00:03    3   Gift
1970-01-01 09:00:04    3   Basket
```
**Behavior for `backward` and `tail`**

```sql
SELECT id, sequenceNextNode('backward', 'tail')(dt, page, page = 'Basket', page = 'Basket', page = 'Gift') FROM test_flow GROUP BY id;
```

```text
                 dt   id   page
1970-01-01 09:00:01    1   Home
1970-01-01 09:00:02    1   Gift
1970-01-01 09:00:03    1   Exit   // Base point, unmatched with Basket

1970-01-01 09:00:01    2   Home
1970-01-01 09:00:02    2   Home   // The result
1970-01-01 09:00:03    2   Gift   // Matched with Gift
1970-01-01 09:00:04    2   Basket // Base point, matched with Basket

1970-01-01 09:00:01    3   Gift
1970-01-01 09:00:02    3   Home   // The result
1970-01-01 09:00:03    3   Gift   // Matched with Gift
1970-01-01 09:00:04    3   Basket // Base point, matched with Basket
```
**Behavior for `forward` and `first_match`**

```sql
SELECT id, sequenceNextNode('forward', 'first_match')(dt, page, page = 'Gift', page = 'Gift') FROM test_flow GROUP BY id;
```

```text
                 dt   id   page
1970-01-01 09:00:01    1   Home
1970-01-01 09:00:02    1   Gift   // Base point
1970-01-01 09:00:03    1   Exit   // The result

1970-01-01 09:00:01    2   Home
1970-01-01 09:00:02    2   Home
1970-01-01 09:00:03    2   Gift   // Base point
1970-01-01 09:00:04    2   Basket // The result

1970-01-01 09:00:01    3   Gift   // Base point
1970-01-01 09:00:02    3   Home   // The result
1970-01-01 09:00:03    3   Gift
1970-01-01 09:00:04    3   Basket
```

```sql
SELECT id, sequenceNextNode('forward', 'first_match')(dt, page, page = 'Gift', page = 'Gift', page = 'Home') FROM test_flow GROUP BY id;
```

```text
                 dt   id   page
1970-01-01 09:00:01    1   Home
1970-01-01 09:00:02    1   Gift   // Base point
1970-01-01 09:00:03    1   Exit   // Unmatched with Home

1970-01-01 09:00:01    2   Home
1970-01-01 09:00:02    2   Home
1970-01-01 09:00:03    2   Gift   // Base point
1970-01-01 09:00:04    2   Basket // Unmatched with Home

1970-01-01 09:00:01    3   Gift   // Base point
1970-01-01 09:00:02    3   Home   // Matched with Home
1970-01-01 09:00:03    3   Gift   // The result
1970-01-01 09:00:04    3   Basket
```
**Behavior for `backward` and `last_match`**

```sql
SELECT id, sequenceNextNode('backward', 'last_match')(dt, page, page = 'Gift', page = 'Gift') FROM test_flow GROUP BY id;
```

```text
                 dt   id   page
1970-01-01 09:00:01    1   Home   // The result
1970-01-01 09:00:02    1   Gift   // Base point
1970-01-01 09:00:03    1   Exit

1970-01-01 09:00:01    2   Home
1970-01-01 09:00:02    2   Home   // The result
1970-01-01 09:00:03    2   Gift   // Base point
1970-01-01 09:00:04    2   Basket

1970-01-01 09:00:01    3   Gift
1970-01-01 09:00:02    3   Home   // The result
1970-01-01 09:00:03    3   Gift   // Base point
1970-01-01 09:00:04    3   Basket
```

```sql
SELECT id, sequenceNextNode('backward', 'last_match')(dt, page, page = 'Gift', page = 'Gift', page = 'Home') FROM test_flow GROUP BY id;
```

```text
                 dt   id   page
1970-01-01 09:00:01    1   Home   // Matched with Home, the result is null
1970-01-01 09:00:02    1   Gift   // Base point
1970-01-01 09:00:03    1   Exit

1970-01-01 09:00:01    2   Home   // The result
1970-01-01 09:00:02    2   Home   // Matched with Home
1970-01-01 09:00:03    2   Gift   // Base point
1970-01-01 09:00:04    2   Basket

1970-01-01 09:00:01    3   Gift   // The result
1970-01-01 09:00:02    3   Home   // Matched with Home
1970-01-01 09:00:03    3   Gift   // Base point
1970-01-01 09:00:04    3   Basket
```
**Behavior for `base_condition`**

```sql
CREATE TABLE test_flow_basecond
(
    `dt` DateTime,
    `id` int,
    `page` String,
    `ref` String
)
ENGINE = MergeTree
PARTITION BY toYYYYMMDD(dt)
ORDER BY id;

INSERT INTO test_flow_basecond VALUES (1, 1, 'A', 'ref4') (2, 1, 'A', 'ref3') (3, 1, 'B', 'ref2') (4, 1, 'B', 'ref1');
```

```sql
SELECT id, sequenceNextNode('forward', 'head')(dt, page, ref = 'ref1', page = 'A') FROM test_flow_basecond GROUP BY id;
```

```text
                 dt   id   page   ref
1970-01-01 09:00:01    1   A      ref4 // The head cannot be the base point because its ref column does not match 'ref1'.
1970-01-01 09:00:02    1   A      ref3
1970-01-01 09:00:03    1   B      ref2
1970-01-01 09:00:04    1   B      ref1
```

```sql
SELECT id, sequenceNextNode('backward', 'tail')(dt, page, ref = 'ref4', page = 'B') FROM test_flow_basecond GROUP BY id;
```

```text
                 dt   id   page   ref
1970-01-01 09:00:01    1   A      ref4
1970-01-01 09:00:02    1   A      ref3
1970-01-01 09:00:03    1   B      ref2
1970-01-01 09:00:04    1   B      ref1 // The tail cannot be the base point because its ref column does not match 'ref4'.
```

```sql
SELECT id, sequenceNextNode('forward', 'first_match')(dt, page, ref = 'ref3', page = 'A') FROM test_flow_basecond GROUP BY id;
```

```text
                 dt   id   page   ref
1970-01-01 09:00:01    1   A      ref4 // This row cannot be the base point because its ref column does not match 'ref3'.
1970-01-01 09:00:02    1   A      ref3 // Base point
1970-01-01 09:00:03    1   B      ref2 // The result
1970-01-01 09:00:04    1   B      ref1
```

```sql
SELECT id, sequenceNextNode('backward', 'last_match')(dt, page, ref = 'ref2', page = 'B') FROM test_flow_basecond GROUP BY id;
```

```text
                 dt   id   page   ref
1970-01-01 09:00:01    1   A      ref4
1970-01-01 09:00:02    1   A      ref3 // The result
1970-01-01 09:00:03    1   B      ref2 // Base point
1970-01-01 09:00:04    1   B      ref1 // This row cannot be the base point because its ref column does not match 'ref2'.
```
- [Real-time analytics](https://clickhouse.com/use-cases/real-time-analytics)
- [Machine Learning & Generative AI](https://clickhouse.com/use-cases/machine-learning-and-data-science)
- [Business Intelligence](https://clickhouse.com/use-cases/data-warehousing)
- [Logs, Events, Traces](https://clickhouse.com/use-cases/observability)
- [All use cases](https://clickhouse.com/use-cases)
[All use cases](https://clickhouse.com/use-cases)
- [Documentation](https://clickhouse.com/docs)
- [Resources](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions)
- [User stories](https://clickhouse.com/user-stories)
- [Blog](https://clickhouse.com/blog)
- [Events](https://clickhouse.com/company/events)
- [Learning and certification](https://clickhouse.com/learn)
- [Comparison](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions)
- [BigQuery](https://clickhouse.com/comparison/bigquery)
- [PostgreSQL](https://clickhouse.com/comparison/postgresql)
- [Redshift](https://clickhouse.com/comparison/redshift)
- [Rockset](https://clickhouse.com/comparison/rockset)
- [Snowflake](https://clickhouse.com/comparison/snowflake)
- [Video](https://clickhouse.com/videos)
- [Demo](https://clickhouse.com/demos)
- [Pricing](https://clickhouse.com/pricing)
- [Contact](https://clickhouse.com/company/contact?loc=nav)
[46\.9k](https://github.com/ClickHouse/ClickHouse?utm_source=clickhouse&utm_medium=website&utm_campaign=website-nav)
[Search`Ctrl``K`](https://clickhouse.com/docs/search)
[Sign in](https://console.clickhouse.cloud/signIn?loc=docs-nav-signIn-cta&glxid=d8f5ecb6-b6ab-448c-86ef-529140b7035d&pagePath=%2Fdocs%2Fsql-reference%2Faggregate-functions%2Fparametric-functions&origPath=%2Fdocs%2Fsql-reference%2Faggregate-functions%2Fparametric-functions&utm_ga=GA1.1.420162415.1776423582)
[Get started](https://console.clickhouse.cloud/signUp?loc=docs-nav-signUp-cta&glxid=d8f5ecb6-b6ab-448c-86ef-529140b7035d&pagePath=%2Fdocs%2Fsql-reference%2Faggregate-functions%2Fparametric-functions&origPath=%2Fdocs%2Fsql-reference%2Faggregate-functions%2Fparametric-functions&utm_ga=GA1.1.420162415.1776423582)
[Get started](https://clickhouse.com/docs/introduction-clickhouse)
[Cloud](https://clickhouse.com/docs/cloud/overview)
[Manage data](https://clickhouse.com/docs/updating-data)
[Server admin](https://clickhouse.com/docs/guides/manage-and-deploy-index)
[Reference](https://clickhouse.com/docs/sql-reference)
[Integrations](https://clickhouse.com/docs/integrations)
[ClickStack](https://clickhouse.com/docs/use-cases/observability/clickstack/overview)
[chDB](https://clickhouse.com/docs/chdb)
[About](https://clickhouse.com/docs/about)
[Knowledge Base](https://clickhouse.com/docs/knowledgebase)
[English](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions)
- [English](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions)
- [ๆฅๆฌ่ช](https://clickhouse.com/docs/jp/sql-reference/aggregate-functions/parametric-functions)
- [ไธญๆ](https://clickhouse.com/docs/zh/sql-reference/aggregate-functions/parametric-functions)
- [ะ ัััะบะธะน](https://clickhouse.com/docs/ru/sql-reference/aggregate-functions/parametric-functions)
- [ํ๊ตญ์ด](https://clickhouse.com/docs/ko/sql-reference/aggregate-functions/parametric-functions)
[Search`Ctrl``K`](https://clickhouse.com/docs/search)
- [Introduction](https://clickhouse.com/docs/sql-reference)
- [Syntax](https://clickhouse.com/docs/sql-reference/syntax)
- [Input and Output Formats](https://clickhouse.com/docs/sql-reference/formats)
- [Data types](https://clickhouse.com/docs/sql-reference/data-types)
- [Statements](https://clickhouse.com/docs/sql-reference/statements)
- [Operators](https://clickhouse.com/docs/sql-reference/operators)
- [Engines](https://clickhouse.com/docs/engines)
- [Database Engines](https://clickhouse.com/docs/engines/database-engines)
- [Table Engines](https://clickhouse.com/docs/engines/table-engines)
- [Functions](https://clickhouse.com/docs/sql-reference/functions)
- [Regular functions](https://clickhouse.com/docs/sql-reference/functions/regular-functions)
- [Aggregate functions](https://clickhouse.com/docs/sql-reference/aggregate-functions)
- [Aggregate Functions](https://clickhouse.com/docs/sql-reference/aggregate-functions/reference)
- [Combinators](https://clickhouse.com/docs/sql-reference/aggregate-functions/combinators)
- [Parametric](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions)
- [GROUPING](https://clickhouse.com/docs/sql-reference/aggregate-functions/grouping_function)
- [Combinator examples](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions)
- [Table functions](https://clickhouse.com/docs/sql-reference/table-functions)
- [Window functions](https://clickhouse.com/docs/sql-reference/window-functions)
- [Formats](https://clickhouse.com/docs/interfaces/formats)
- [Data Lakes](https://clickhouse.com/docs/sql-reference/datalakes)
- [Functions](https://clickhouse.com/docs/sql-reference/functions)
- [Aggregate functions](https://clickhouse.com/docs/sql-reference/aggregate-functions)
- Parametric
[Edit this page](https://github.com/ClickHouse/ClickHouse/tree/master/docs/en/sql-reference/aggregate-functions/parametric-functions.md)
# Parametric aggregate functions
Some aggregate functions can accept not only argument columns but also a set of parameters — constants for initialization. The syntax uses two pairs of brackets instead of one: the first for parameters, the second for arguments.
## histogram
Calculates an adaptive histogram. It does not guarantee precise results.
```
histogram(number_of_bins)(values)
```
The function uses [A Streaming Parallel Decision Tree Algorithm](http://jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf). The borders of histogram bins are adjusted as new data enters the function. In the general case, the widths of the bins are not equal.
**Arguments**
`values` — [Expression](https://clickhouse.com/docs/sql-reference/syntax#expressions) resulting in input values.
**Parameters**
`number_of_bins` — Upper limit for the number of bins in the histogram. The function automatically calculates the number of bins, trying to reach the specified number; if it fails, it uses fewer bins.
**Returned values**
- [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of [Tuples](https://clickhouse.com/docs/sql-reference/data-types/tuple) of the following format:
```
[(lower_1, upper_1, height_1), ... (lower_N, upper_N, height_N)]
```
- `lower` — Lower bound of the bin.
- `upper` — Upper bound of the bin.
- `height` — Calculated height of the bin.
**Example**
```
SELECT histogram(5)(number + 1)
FROM (
SELECT *
FROM system.numbers
LIMIT 20
)
```
```
┌─histogram(5)(plus(number, 1))───────────────────────────────────────────┐
│ [(1,4.5,4),(4.5,8.5,4),(8.5,12.75,4.125),(12.75,17,4.625),(17,20,3.25)] │
└─────────────────────────────────────────────────────────────────────────┘
```
You can visualize a histogram with the [bar](https://clickhouse.com/docs/sql-reference/functions/other-functions#bar) function, for example:
```
WITH histogram(5)(rand() % 100) AS hist
SELECT
arrayJoin(hist).3 AS height,
bar(height, 0, 6, 5) AS bar
FROM
(
SELECT *
FROM system.numbers
LIMIT 20
)
```
```
┌─height─┬─bar───┐
│  2.125 │ █▋    │
│   3.25 │ ██▋   │
│  5.625 │ ████▋ │
│  5.625 │ ████▋ │
│  3.375 │ ██▋   │
└────────┴───────┘
```
In this case, you should remember that you do not know the histogram bin borders.
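If you also need the bin borders, all three tuple elements can be unpacked, for example with `ARRAY JOIN`. This is a sketch in the spirit of the example above; the aliases are illustrative:

```
SELECT
    bin.1 AS lower,
    bin.2 AS upper,
    bin.3 AS height
FROM
(
    SELECT histogram(5)(number) AS hist
    FROM (SELECT number FROM system.numbers LIMIT 20)
)
ARRAY JOIN hist AS bin
```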
## sequenceMatch
Checks whether the sequence contains an event chain that matches the pattern.
**Syntax**
```
sequenceMatch(pattern)(timestamp, cond1, cond2, ...)
```
Note
Events that occur during the same second may appear in the sequence in an undefined order, affecting the result.
**Arguments**
- `timestamp` — Column considered to contain time data. Typical data types are `Date` and `DateTime`. You can also use any of the supported [UInt](https://clickhouse.com/docs/sql-reference/data-types/int-uint) data types.
- `cond1`, `cond2` — Conditions that describe the chain of events. Data type: `UInt8`. You can pass up to 32 condition arguments. The function takes into account only the events described in these conditions. If the sequence contains data that isn't described in a condition, the function skips it.
**Parameters**
- `pattern` — Pattern string. See [Pattern syntax](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions#pattern-syntax).
**Returned values**
- 1, if the pattern is matched.
- 0, if the pattern isn't matched.
Type: `UInt8`.
#### Pattern syntax
- `(?N)` — Matches the condition argument at position `N`. Conditions are numbered in the `[1, 32]` range. For example, `(?1)` matches the argument passed to the `cond1` parameter.
- `.*` — Matches any number of events. You do not need condition arguments to match this element of the pattern.
- `(?t operator value)` — Sets the time in seconds that should separate two events. For example, the pattern `(?1)(?t>1800)(?2)` matches events that occur more than 1800 seconds apart. An arbitrary number of any events can lie between these events. You can use the `>=`, `>`, `<`, `<=`, `==` operators.
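As an illustration of the time condition, assuming a hypothetical table `events` with a `DateTime` column `ts` and a `String` column `action` (both names are illustrative):

```
-- Matches when a 'signup' is followed by a 'purchase' more than
-- 1800 seconds later. Other events may occur in between, because
-- only events matching the listed conditions are considered.
SELECT sequenceMatch('(?1)(?t>1800)(?2)')(ts, action = 'signup', action = 'purchase')
FROM events
```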
**Examples**
Consider data in the `t` table:
```
┌─time─┬─number─┐
│    1 │      1 │
│    2 │      3 │
│    3 │      2 │
└──────┴────────┘
```
Perform the query:
```
SELECT sequenceMatch('(?1)(?2)')(time, number = 1, number = 2) FROM t
```
```
┌─sequenceMatch('(?1)(?2)')(time, equals(number, 1), equals(number, 2))─┐
│                                                                     1 │
└───────────────────────────────────────────────────────────────────────┘
```
The function found the event chain where number 2 follows number 1. It skipped number 3 between them, because the number is not described as an event. If we want to take this number into account when searching for the event chain given in the example, we should make a condition for it.
```
SELECT sequenceMatch('(?1)(?2)')(time, number = 1, number = 2, number = 3) FROM t
```
```
┌─sequenceMatch('(?1)(?2)')(time, equals(number, 1), equals(number, 2), equals(number, 3))─┐
│                                                                                        0 │
└──────────────────────────────────────────────────────────────────────────────────────────┘
```
In this case, the function couldn't find the event chain matching the pattern, because the event for number 3 occurred between 1 and 2. If in the same case we checked the condition for number 4, the sequence would match the pattern.
```
SELECT sequenceMatch('(?1)(?2)')(time, number = 1, number = 2, number = 4) FROM t
```
```
┌─sequenceMatch('(?1)(?2)')(time, equals(number, 1), equals(number, 2), equals(number, 4))─┐
│                                                                                        1 │
└──────────────────────────────────────────────────────────────────────────────────────────┘
```
**See Also**
- [sequenceCount](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions#sequencecount)
## sequenceCount
Counts the number of event chains that matched the pattern. The function searches event chains that do not overlap. It starts to search for the next chain after the current chain is matched.
Note
Events that occur during the same second may appear in the sequence in an undefined order, affecting the result.
**Syntax**
```
sequenceCount(pattern)(timestamp, cond1, cond2, ...)
```
**Arguments**
- `timestamp` — Column considered to contain time data. Typical data types are `Date` and `DateTime`. You can also use any of the supported [UInt](https://clickhouse.com/docs/sql-reference/data-types/int-uint) data types.
- `cond1`, `cond2` — Conditions that describe the chain of events. Data type: `UInt8`. You can pass up to 32 condition arguments. The function takes into account only the events described in these conditions. If the sequence contains data that isn't described in a condition, the function skips it.
**Parameters**
- `pattern` — Pattern string. See [Pattern syntax](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions#pattern-syntax).
**Returned values**
- Number of non-overlapping event chains that are matched.
Type: `UInt64`.
**Example**
Consider data in the `t` table:
```
┌─time─┬─number─┐
│    1 │      1 │
│    2 │      3 │
│    3 │      2 │
│    4 │      1 │
│    5 │      3 │
│    6 │      2 │
└──────┴────────┘
```
Count how many times the number 2 occurs after the number 1 with any amount of other numbers between them:
```
SELECT sequenceCount('(?1).*(?2)')(time, number = 1, number = 2) FROM t
```
```
┌─sequenceCount('(?1).*(?2)')(time, equals(number, 1), equals(number, 2))─┐
│                                                                       2 │
└─────────────────────────────────────────────────────────────────────────┘
```
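Without `.*`, the matched events must be adjacent among the events selected by the conditions. On the same table `t`, adding a condition for the number 3 pulls it into the filtered sequence, so the strict pattern no longer matches anywhere:

```
SELECT sequenceCount('(?1)(?2)')(time, number = 1, number = 2, number = 3) FROM t
```

This returns 0, for the same reason as the `sequenceMatch` example above with the extra condition.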
## sequenceMatchEvents
Returns the event timestamps of the longest event chain that matched the pattern.
Note
Events that occur during the same second may appear in the sequence in an undefined order, affecting the result.
**Syntax**
```
sequenceMatchEvents(pattern)(timestamp, cond1, cond2, ...)
```
**Arguments**
- `timestamp` — Column considered to contain time data. Typical data types are `Date` and `DateTime`. You can also use any of the supported [UInt](https://clickhouse.com/docs/sql-reference/data-types/int-uint) data types.
- `cond1`, `cond2` — Conditions that describe the chain of events. Data type: `UInt8`. You can pass up to 32 condition arguments. The function takes into account only the events described in these conditions. If the sequence contains data that isn't described in a condition, the function skips it.
**Parameters**
- `pattern` — Pattern string. See [Pattern syntax](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions#pattern-syntax).
**Returned values**
- Array of timestamps for the matched condition arguments `(?N)` from the event chain. The position in the array matches the position of the condition argument in the pattern.
Type: Array.
**Example**
Consider data in the `t` table:
```
┌─time─┬─number─┐
│    1 │      1 │
│    2 │      3 │
│    3 │      2 │
│    4 │      1 │
│    5 │      3 │
│    6 │      2 │
└──────┴────────┘
```
Return the timestamps of events for the longest chain:
```
SELECT sequenceMatchEvents('(?1).*(?2).*(?1)(?3)')(time, number = 1, number = 2, number = 4) FROM t
```
```
┌─sequenceMatchEvents('(?1).*(?2).*(?1)(?3)')(time, equals(number, 1), equals(number, 2), equals(number, 4))─┐
│ [1,3,4]                                                                                                    │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```
**See Also**
- [sequenceMatch](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions#sequencematch)
## windowFunnel
Searches for event chains in a sliding time window and calculates the maximum number of events that occurred from the chain.
The function works according to the algorithm:
- The function searches for data that triggers the first condition in the chain and sets the event counter to 1. This is the moment when the sliding window starts.
- If events from the chain occur sequentially within the window, the counter is incremented. If the sequence of events is disrupted, the counter isn't incremented.
- If the data has multiple event chains at varying points of completion, the function will only output the size of the longest chain.
**Syntax**
```
windowFunnel(window, [mode, [mode, ... ]])(timestamp, cond1, cond2, ..., condN)
```
**Arguments**
- `timestamp` — Name of the column containing the timestamp. Data types supported: [Date](https://clickhouse.com/docs/sql-reference/data-types/date), [DateTime](https://clickhouse.com/docs/sql-reference/data-types/datetime) and other unsigned integer types (note that although `timestamp` supports the `UInt64` type, its value can't exceed the `Int64` maximum, which is 2^63 - 1).
- `cond` — Conditions or data describing the chain of events. [UInt8](https://clickhouse.com/docs/sql-reference/data-types/int-uint).
**Parameters**
- `window` — Length of the sliding window: the time interval between the first and the last condition. The unit of `window` depends on the `timestamp` itself. Determined using the expression `timestamp of cond1 <= timestamp of cond2 <= ... <= timestamp of condN <= timestamp of cond1 + window`.
- `mode` — Optional argument. One or more modes can be set.
- `'strict_deduplication'` — If the same condition holds for the sequence of events, such a repeating event interrupts further processing. Note: it may work unexpectedly if several conditions hold for the same event.
- `'strict_order'` — Don't allow interventions of other events. E.g. in the case of `A->B->D->C`, it stops finding `A->B->C` at `D` and the max event level is 2.
- `'strict_increase'` — Apply conditions only to events with strictly increasing timestamps.
- `'strict_once'` — Count each event only once in the chain, even if it meets the condition several times.
- `'allow_reentry'` — Ignore events that violate the strict order. E.g. in the case of `A->A->B->C`, it finds `A->B->C` by ignoring the redundant `A`, and the max event level is 3.
**Returned value**
The maximum number of consecutive triggered conditions from the chain within the sliding time window. All the chains in the selection are analyzed.
Type: `Integer`.
**Example**
Determine if a set period of time is enough for the user to select a phone and purchase it twice in the online store.
Set the following chain of events:
1. The user logged in to their account on the store (`eventID = 1003`).
2. The user searches for a phone (`eventID = 1007, product = 'phone'`).
3. The user placed an order (`eventID = 1009`).
4. The user made the order again (`eventID = 1010`).
Input table:
```
┌─event_date─┬─user_id─┬───────────timestamp─┬─eventID─┬─product─┐
│ 2019-01-28 │       1 │ 2019-01-29 10:00:00 │    1003 │ phone   │
└────────────┴─────────┴─────────────────────┴─────────┴─────────┘
┌─event_date─┬─user_id─┬───────────timestamp─┬─eventID─┬─product─┐
│ 2019-01-31 │       1 │ 2019-01-31 09:00:00 │    1007 │ phone   │
└────────────┴─────────┴─────────────────────┴─────────┴─────────┘
┌─event_date─┬─user_id─┬───────────timestamp─┬─eventID─┬─product─┐
│ 2019-01-30 │       1 │ 2019-01-30 08:00:00 │    1009 │ phone   │
└────────────┴─────────┴─────────────────────┴─────────┴─────────┘
┌─event_date─┬─user_id─┬───────────timestamp─┬─eventID─┬─product─┐
│ 2019-02-01 │       1 │ 2019-02-01 08:00:00 │    1010 │ phone   │
└────────────┴─────────┴─────────────────────┴─────────┴─────────┘
```
Find out how far the user `user_id` could get through the chain in the period January-February 2019.
Query:
```
SELECT
level,
count() AS c
FROM
(
SELECT
user_id,
windowFunnel(6048000000000000)(timestamp, eventID = 1003, eventID = 1009, eventID = 1007, eventID = 1010) AS level
FROM trend
WHERE (event_date >= '2019-01-01') AND (event_date <= '2019-02-02')
GROUP BY user_id
)
GROUP BY level
ORDER BY level ASC;
```
Result:
```
┌─level─┬─c─┐
│     4 │ 1 │
└───────┴───┘
```
**Example with allow_reentry mode**
This example demonstrates how `allow_reentry` mode works with user reentry patterns:
```
-- Sample data: user visits checkout -> product detail -> checkout again -> payment
-- Without allow_reentry: stops at level 2 (product detail page)
-- With allow_reentry: reaches level 4 (payment completion)
SELECT
level,
count() AS users
FROM
(
SELECT
user_id,
windowFunnel(3600, 'strict_order', 'allow_reentry')(
timestamp,
action = 'begin_checkout', -- Step 1: Begin checkout
action = 'view_product_detail', -- Step 2: View product detail
action = 'begin_checkout', -- Step 3: Begin checkout again (reentry)
action = 'complete_payment' -- Step 4: Complete payment
) AS level
FROM user_events
WHERE event_date = today()
GROUP BY user_id
)
GROUP BY level
ORDER BY level ASC;
```
## retention
The function takes as arguments a set of 1 to 32 conditions of type `UInt8` that indicate whether a certain condition was met for the event. Any condition can be specified as an argument (as in [WHERE](https://clickhouse.com/docs/sql-reference/statements/select/where)).
The conditions, except the first, apply in pairs: the result of the second is true if the first and second are true, of the third if the first and third are true, and so on.
**Syntax**
```
retention(cond1, cond2, ..., cond32);
```
**Arguments**
- `cond` — An expression that returns a `UInt8` result (1 or 0).
**Returned value**
An array of 1s and 0s.
- 1 — the condition was met for the event.
- 0 — the condition wasn't met for the event.
Type: `UInt8`.
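As a minimal sketch of the pairwise behaviour, using constant conditions purely for illustration: the first element of the result is `cond1`, and each later element is `cond1 AND condN`.

```
-- Constant conditions over a single row, for illustration only.
SELECT retention(1, 0, 1) AS r FROM numbers(1)
-- r = [1,0,1]: the third flag is set because the first and third
-- conditions hold; if the first condition were 0, every element would be 0.
```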
**Example**
Let's consider an example of calculating the `retention` function to determine site traffic.
**1.** Create a table to illustrate the example.
```
CREATE TABLE retention_test(date Date, uid Int32) ENGINE = Memory;
INSERT INTO retention_test SELECT '2020-01-01', number FROM numbers(5);
INSERT INTO retention_test SELECT '2020-01-02', number FROM numbers(10);
INSERT INTO retention_test SELECT '2020-01-03', number FROM numbers(15);
```
Input table:
Query:
```
SELECT * FROM retention_test
```
Result:
```
┌───────date─┬─uid─┐
│ 2020-01-01 │   0 │
│ 2020-01-01 │   1 │
│ 2020-01-01 │   2 │
│ 2020-01-01 │   3 │
│ 2020-01-01 │   4 │
└────────────┴─────┘
┌───────date─┬─uid─┐
│ 2020-01-02 │   0 │
│ 2020-01-02 │   1 │
│ 2020-01-02 │   2 │
│ 2020-01-02 │   3 │
│ 2020-01-02 │   4 │
│ 2020-01-02 │   5 │
│ 2020-01-02 │   6 │
│ 2020-01-02 │   7 │
│ 2020-01-02 │   8 │
│ 2020-01-02 │   9 │
└────────────┴─────┘
┌───────date─┬─uid─┐
│ 2020-01-03 │   0 │
│ 2020-01-03 │   1 │
│ 2020-01-03 │   2 │
│ 2020-01-03 │   3 │
│ 2020-01-03 │   4 │
│ 2020-01-03 │   5 │
│ 2020-01-03 │   6 │
│ 2020-01-03 │   7 │
│ 2020-01-03 │   8 │
│ 2020-01-03 │   9 │
│ 2020-01-03 │  10 │
│ 2020-01-03 │  11 │
│ 2020-01-03 │  12 │
│ 2020-01-03 │  13 │
│ 2020-01-03 │  14 │
└────────────┴─────┘
```
**2.** Group users by unique ID `uid` using the `retention` function.
Query:
```
SELECT
uid,
retention(date = '2020-01-01', date = '2020-01-02', date = '2020-01-03') AS r
FROM retention_test
WHERE date IN ('2020-01-01', '2020-01-02', '2020-01-03')
GROUP BY uid
ORDER BY uid ASC
```
Result:
```
┌─uid─┬─r───────┐
│   0 │ [1,1,1] │
│   1 │ [1,1,1] │
│   2 │ [1,1,1] │
│   3 │ [1,1,1] │
│   4 │ [1,1,1] │
│   5 │ [0,0,0] │
│   6 │ [0,0,0] │
│   7 │ [0,0,0] │
│   8 │ [0,0,0] │
│   9 │ [0,0,0] │
│  10 │ [0,0,0] │
│  11 │ [0,0,0] │
│  12 │ [0,0,0] │
│  13 │ [0,0,0] │
│  14 │ [0,0,0] │
└─────┴─────────┘
```
**3.** Calculate the total number of site visits per day.
Query:
```
SELECT
sum(r[1]) AS r1,
sum(r[2]) AS r2,
sum(r[3]) AS r3
FROM
(
SELECT
uid,
retention(date = '2020-01-01', date = '2020-01-02', date = '2020-01-03') AS r
FROM retention_test
WHERE date IN ('2020-01-01', '2020-01-02', '2020-01-03')
GROUP BY uid
)
```
Result:
```
┌─r1─┬─r2─┬─r3─┐
│  5 │  5 │  5 │
└────┴────┴────┘
```
Where:
- `r1` — the number of unique visitors who visited the site during 2020-01-01 (the `cond1` condition).
- `r2` — the number of unique visitors who visited the site on both 2020-01-01 and 2020-01-02 (`cond1` and `cond2` conditions).
- `r3` — the number of unique visitors who visited the site on both 2020-01-01 and 2020-01-03 (`cond1` and `cond3` conditions).
## uniqUpTo(N)(x)
Calculates the number of different values of the argument up to a specified limit, `N`. If the number of different argument values is greater than `N`, this function returns `N` + 1, otherwise it calculates the exact value.
Recommended for use with small `N`s, up to 10. The maximum value of `N` is 100.
For the state of an aggregate function, this function uses an amount of memory equal to 1 + `N` * (size of one value) bytes. For strings, it stores a non-cryptographic 8-byte hash; the calculation is approximate for strings.
For example, suppose you have a table that logs every search query made by users on your website. Each row represents a single search query, with columns for the user ID, the search query, and the timestamp of the query. You can use `uniqUpTo` to generate a report that shows only the keywords that were used by at least 5 unique users.
```
SELECT SearchPhrase
FROM SearchLog
GROUP BY SearchPhrase
HAVING uniqUpTo(4)(UserID) >= 5
```
`uniqUpTo(4)(UserID)` calculates the number of unique `UserID` values for each `SearchPhrase`, but it only counts up to 4 unique values. If there are more than 4 unique `UserID` values for a `SearchPhrase`, the function returns 5 (4 + 1). The `HAVING` clause then filters out the `SearchPhrase` values for which the number of unique `UserID` values is less than 5. This will give you a list of search keywords that were used by at least 5 unique users.
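To see the capping in isolation, you can compare `uniqUpTo` with `uniq` on synthetic data, a sketch using `system.numbers`: there are 10 distinct values, so the capped variant returns 4 + 1 = 5.

```
SELECT
    uniq(number) AS exact,          -- 10 distinct values
    uniqUpTo(4)(number) AS capped   -- more than 4 distinct values, so 5
FROM (SELECT number % 10 AS number FROM system.numbers LIMIT 100)
```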
## sumMapFiltered
This function behaves the same as [sumMap](https://clickhouse.com/docs/sql-reference/aggregate-functions/reference/summap) except that it also accepts an array of keys to filter with as a parameter. This can be especially useful when working with a high cardinality of keys.
**Syntax**
`sumMapFiltered(keys_to_keep)(keys, values)`
**Parameters**
- `keys_to_keep`: [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of keys to filter with.
- `keys`: [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of keys.
- `values`: [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of values.
**Returned Value**
- Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys.
**Example**
Query:
```
CREATE TABLE sum_map
(
`date` Date,
`timeslot` DateTime,
`statusMap` Nested(status UInt16, requests UInt64)
)
ENGINE = Log;
INSERT INTO sum_map VALUES
('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10]);
```
```
SELECT sumMapFiltered([1, 4, 8])(statusMap.status, statusMap.requests) FROM sum_map;
```
Result:
```
   ┌─sumMapFiltered([1, 4, 8])(statusMap.status, statusMap.requests)─┐
1. │ ([1,4,8],[10,20,10])                                            │
   └─────────────────────────────────────────────────────────────────┘
```
## sumMapFilteredWithOverflow
This function behaves the same as [sumMap](https://clickhouse.com/docs/sql-reference/aggregate-functions/reference/summap) except that it also accepts an array of keys to filter with as a parameter. This can be especially useful when working with a high cardinality of keys. It differs from the [sumMapFiltered](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions#summapfiltered) function in that it performs summation with overflow - i.e. it returns the same data type for the summation as the argument data type.
**Syntax**
`sumMapFilteredWithOverflow(keys_to_keep)(keys, values)`
**Parameters**
- `keys_to_keep`: [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of keys to filter with.
- `keys`: [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of keys.
- `values`: [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of values.
**Returned Value**
- Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys.
**Example**
In this example we create a table `sum_map`, insert some data into it, and then use both `sumMapFilteredWithOverflow` and `sumMapFiltered` together with the `toTypeName` function to compare the results. While `requests` has type `UInt8` in the created table, `sumMapFiltered` promotes the type of the summed values to `UInt64` to avoid overflow, whereas `sumMapFilteredWithOverflow` keeps the type as `UInt8`, which may not be large enough to store the result - i.e. overflow can occur.
Query:
```
CREATE TABLE sum_map
(
`date` Date,
`timeslot` DateTime,
`statusMap` Nested(status UInt8, requests UInt8)
)
ENGINE = Log;
INSERT INTO sum_map VALUES
('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10]);
```
```
SELECT sumMapFilteredWithOverflow([1, 4, 8])(statusMap.status, statusMap.requests) as summap_overflow, toTypeName(summap_overflow) FROM sum_map;
```
```
SELECT sumMapFiltered([1, 4, 8])(statusMap.status, statusMap.requests) as summap, toTypeName(summap) FROM sum_map;
```
Result:
```
   ┌─summap_overflow──────┬─toTypeName(summap_overflow)───────┐
1. │ ([1,4,8],[10,20,10]) │ Tuple(Array(UInt8), Array(UInt8)) │
   └──────────────────────┴───────────────────────────────────┘
```
```
   ┌─summap───────────────┬─toTypeName(summap)─────────────────┐
1. │ ([1,4,8],[10,20,10]) │ Tuple(Array(UInt8), Array(UInt64)) │
   └──────────────────────┴────────────────────────────────────┘
```
## sequenceNextNode
Returns the value of the next event that matched an event chain.
*This is an experimental function; `SET allow_experimental_funnel_functions = 1` to enable it.*
**Syntax**
```
sequenceNextNode(direction, base)(timestamp, event_column, base_condition, event1, event2, event3, ...)
```
**Parameters**
- `direction` — Used to set the direction of navigation.
  - forward — Moving forward.
  - backward — Moving backward.
- `base` — Used to set the base point.
  - head — Set the base point to the first event.
  - tail — Set the base point to the last event.
  - `first_match` — Set the base point to the first matched `event1`.
  - `last_match` — Set the base point to the last matched `event1`.
**Arguments**
- `timestamp` — Name of the column containing the timestamp. Data types supported: [Date](https://clickhouse.com/docs/sql-reference/data-types/date), [DateTime](https://clickhouse.com/docs/sql-reference/data-types/datetime) and other unsigned integer types.
- `event_column` — Name of the column containing the value of the next event to be returned. Data types supported: [String](https://clickhouse.com/docs/sql-reference/data-types/string) and [Nullable(String)](https://clickhouse.com/docs/sql-reference/data-types/nullable).
- `base_condition` — Condition that the base point must fulfill.
- `event1`, `event2`, ... — Conditions describing the chain of events. [UInt8](https://clickhouse.com/docs/sql-reference/data-types/int-uint).
**Returned values**
- `event_column[next_index]` — if the pattern is matched and the next value exists.
- `NULL` — if the pattern isn't matched or the next value doesn't exist.
Type: [Nullable(String)](https://clickhouse.com/docs/sql-reference/data-types/nullable).
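As a rough mental model of the `'forward'`/`'head'` combination, the sketch below (a simplified assumption about the semantics, not the ClickHouse implementation) checks `base_condition` on the earliest event, matches `event1`, `event2`, ... against the events in order starting at the head, and returns the value of the event right after the matched chain:

```python
def sequence_next_node_forward_head(events, base_cond, event_conds):
    """events: list of (timestamp, value) pairs for one grouping key."""
    events = sorted(events, key=lambda e: e[0])
    if not events or not base_cond(events[0][1]):
        return None                      # the head fails base_condition
    for offset, cond in enumerate(event_conds):
        if offset >= len(events) or not cond(events[offset][1]):
            return None                  # the event chain is broken
    next_index = len(event_conds)        # event right after the chain
    return events[next_index][1] if next_index < len(events) else None

flow = [(1, 'A'), (2, 'B'), (3, 'C'), (4, 'D'), (5, 'E')]
print(sequence_next_node_forward_head(
    flow, lambda p: p == 'A', [lambda p: p == 'A', lambda p: p == 'B']))  # C
```

With the A, B, C, D, E flow this returns C, matching the documented example; if any condition in the chain fails, the sketch returns `None` (ClickHouse returns `NULL`).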
**Example**
It can be used when events are A->B->C->D->E and you want to know the event following B->C, which is D.
The query below searches for the event following A->B:
```
CREATE TABLE test_flow (
dt DateTime,
id int,
page String)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(dt)
ORDER BY id;
INSERT INTO test_flow VALUES (1, 1, 'A') (2, 1, 'B') (3, 1, 'C') (4, 1, 'D') (5, 1, 'E');
SELECT id, sequenceNextNode('forward', 'head')(dt, page, page = 'A', page = 'A', page = 'B') as next_flow FROM test_flow GROUP BY id;
```
Result:
```
┌─id─┬─next_flow─┐
│  1 │ C         │
└────┴───────────┘
```
**Behavior for `forward` and `head`**
```
ALTER TABLE test_flow DELETE WHERE 1 = 1 settings mutations_sync = 1;
INSERT INTO test_flow VALUES (1, 1, 'Home') (2, 1, 'Gift') (3, 1, 'Exit');
INSERT INTO test_flow VALUES (1, 2, 'Home') (2, 2, 'Home') (3, 2, 'Gift') (4, 2, 'Basket');
INSERT INTO test_flow VALUES (1, 3, 'Gift') (2, 3, 'Home') (3, 3, 'Gift') (4, 3, 'Basket');
```
```
SELECT id, sequenceNextNode('forward', 'head')(dt, page, page = 'Home', page = 'Home', page = 'Gift') FROM test_flow GROUP BY id;
dt id page
1970-01-01 09:00:01 1 Home // Base point, Matched with Home
1970-01-01 09:00:02 1 Gift // Matched with Gift
1970-01-01 09:00:03 1 Exit // The result
1970-01-01 09:00:01 2 Home // Base point, Matched with Home
1970-01-01 09:00:02 2 Home // Unmatched with Gift
1970-01-01 09:00:03 2 Gift
1970-01-01 09:00:04 2 Basket
1970-01-01 09:00:01 3 Gift // Base point, Unmatched with Home
1970-01-01 09:00:02 3 Home
1970-01-01 09:00:03 3 Gift
1970-01-01 09:00:04 3 Basket
```
**Behavior for `backward` and `tail`**
```
SELECT id, sequenceNextNode('backward', 'tail')(dt, page, page = 'Basket', page = 'Basket', page = 'Gift') FROM test_flow GROUP BY id;
dt id page
1970-01-01 09:00:01 1 Home
1970-01-01 09:00:02 1 Gift
1970-01-01 09:00:03 1 Exit // Base point, Unmatched with Basket
1970-01-01 09:00:01 2 Home
1970-01-01 09:00:02 2 Home // The result
1970-01-01 09:00:03 2 Gift // Matched with Gift
1970-01-01 09:00:04 2 Basket // Base point, Matched with Basket
1970-01-01 09:00:01 3 Gift
1970-01-01 09:00:02 3 Home // The result
1970-01-01 09:00:03 3 Gift // Matched with Gift
1970-01-01 09:00:04 3 Basket // Base point, Matched with Basket
```
**Behavior for `forward` and `first_match`**
```
SELECT id, sequenceNextNode('forward', 'first_match')(dt, page, page = 'Gift', page = 'Gift') FROM test_flow GROUP BY id;
dt id page
1970-01-01 09:00:01 1 Home
1970-01-01 09:00:02 1 Gift // Base point
1970-01-01 09:00:03 1 Exit // The result
1970-01-01 09:00:01 2 Home
1970-01-01 09:00:02 2 Home
1970-01-01 09:00:03 2 Gift // Base point
1970-01-01 09:00:04 2 Basket // The result
1970-01-01 09:00:01 3 Gift // Base point
1970-01-01 09:00:02 3 Home // The result
1970-01-01 09:00:03 3 Gift
1970-01-01 09:00:04 3 Basket
```
```
SELECT id, sequenceNextNode('forward', 'first_match')(dt, page, page = 'Gift', page = 'Gift', page = 'Home') FROM test_flow GROUP BY id;
dt id page
1970-01-01 09:00:01 1 Home
1970-01-01 09:00:02 1 Gift // Base point
1970-01-01 09:00:03 1 Exit // Unmatched with Home
1970-01-01 09:00:01 2 Home
1970-01-01 09:00:02 2 Home
1970-01-01 09:00:03 2 Gift // Base point
1970-01-01 09:00:04 2 Basket // Unmatched with Home
1970-01-01 09:00:01 3 Gift // Base point
1970-01-01 09:00:02 3 Home // Matched with Home
1970-01-01 09:00:03 3 Gift // The result
1970-01-01 09:00:04 3 Basket
```
**Behavior for `backward` and `last_match`**
```
SELECT id, sequenceNextNode('backward', 'last_match')(dt, page, page = 'Gift', page = 'Gift') FROM test_flow GROUP BY id;
dt id page
1970-01-01 09:00:01 1 Home // The result
1970-01-01 09:00:02 1 Gift // Base point
1970-01-01 09:00:03 1 Exit
1970-01-01 09:00:01 2 Home
1970-01-01 09:00:02 2 Home // The result
1970-01-01 09:00:03 2 Gift // Base point
1970-01-01 09:00:04 2 Basket
1970-01-01 09:00:01 3 Gift
1970-01-01 09:00:02 3 Home // The result
1970-01-01 09:00:03 3 Gift // Base point
1970-01-01 09:00:04 3 Basket
```
```
SELECT id, sequenceNextNode('backward', 'last_match')(dt, page, page = 'Gift', page = 'Gift', page = 'Home') FROM test_flow GROUP BY id;
dt id page
1970-01-01 09:00:01 1 Home // Matched with Home, the result is null
1970-01-01 09:00:02 1 Gift // Base point
1970-01-01 09:00:03 1 Exit
1970-01-01 09:00:01 2 Home // The result
1970-01-01 09:00:02 2 Home // Matched with Home
1970-01-01 09:00:03 2 Gift // Base point
1970-01-01 09:00:04 2 Basket
1970-01-01 09:00:01 3 Gift // The result
1970-01-01 09:00:02 3 Home // Matched with Home
1970-01-01 09:00:03 3 Gift // Base point
1970-01-01 09:00:04 3 Basket
```
**Behavior for `base_condition`**
```
CREATE TABLE test_flow_basecond
(
`dt` DateTime,
`id` int,
`page` String,
`ref` String
)
ENGINE = MergeTree
PARTITION BY toYYYYMMDD(dt)
ORDER BY id;
INSERT INTO test_flow_basecond VALUES (1, 1, 'A', 'ref4') (2, 1, 'A', 'ref3') (3, 1, 'B', 'ref2') (4, 1, 'B', 'ref1');
```
```
SELECT id, sequenceNextNode('forward', 'head')(dt, page, ref = 'ref1', page = 'A') FROM test_flow_basecond GROUP BY id;
dt id page ref
1970-01-01 09:00:01 1 A ref4 // The head cannot be the base point because its ref column does not match 'ref1'.
1970-01-01 09:00:02 1 A ref3
1970-01-01 09:00:03 1 B ref2
1970-01-01 09:00:04 1 B ref1
```
```
SELECT id, sequenceNextNode('backward', 'tail')(dt, page, ref = 'ref4', page = 'B') FROM test_flow_basecond GROUP BY id;
dt id page ref
1970-01-01 09:00:01 1 A ref4
1970-01-01 09:00:02 1 A ref3
1970-01-01 09:00:03 1 B ref2
1970-01-01 09:00:04 1 B ref1 // The tail cannot be the base point because its ref column does not match 'ref4'.
```
```
SELECT id, sequenceNextNode('forward', 'first_match')(dt, page, ref = 'ref3', page = 'A') FROM test_flow_basecond GROUP BY id;
dt id page ref
1970-01-01 09:00:01 1 A ref4 // This row cannot be the base point because its ref column does not match 'ref3'.
1970-01-01 09:00:02 1 A ref3 // Base point
1970-01-01 09:00:03 1 B ref2 // The result
1970-01-01 09:00:04 1 B ref1
```
```
SELECT id, sequenceNextNode('backward', 'last_match')(dt, page, ref = 'ref2', page = 'B') FROM test_flow_basecond GROUP BY id;
dt id page ref
1970-01-01 09:00:01 1 A ref4
1970-01-01 09:00:02 1 A ref3 // The result
1970-01-01 09:00:03 1 B ref2 // Base point
1970-01-01 09:00:04 1 B ref1 // This row cannot be the base point because its ref column does not match 'ref2'.
```
© 2016–2026 ClickHouse, Inc.
[Trademark](https://clickhouse.com/legal/trademark-policy) · [Privacy](https://clickhouse.com/legal/privacy-policy) · [Security](https://trust.clickhouse.com/) · [Terms of Service](https://clickhouse.com/legal/agreements/terms-of-service)

Some aggregate functions can accept not only argument columns (used for compression), but a set of parameters — constants for initialization. The syntax is two pairs of brackets instead of one. The first is for parameters, and the second is for arguments.
## histogram
Calculates an adaptive histogram. It does not guarantee precise results.
```
histogram(number_of_bins)(values)
```
The function uses [A Streaming Parallel Decision Tree Algorithm](http://jmlr.org/papers/volume11/ben-haim10a/ben-haim10a.pdf). The borders of histogram bins are adjusted as new data enters the function. In the common case, the widths of the bins are not equal.
**Arguments**
`values` — [Expression](https://clickhouse.com/docs/sql-reference/syntax#expressions) resulting in input values.
**Parameters**
`number_of_bins` — Upper limit for the number of bins in the histogram. The function automatically calculates the number of bins. It tries to reach the specified number of bins, but if it fails, it uses fewer bins.
**Returned values**
- [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of [Tuples](https://clickhouse.com/docs/sql-reference/data-types/tuple) of the following format:
```
[(lower_1, upper_1, height_1), ... (lower_N, upper_N, height_N)]
```
- `lower` — Lower bound of the bin.
- `upper` — Upper bound of the bin.
- `height` — Calculated height of the bin.
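To make the adaptive behaviour concrete, here is a minimal Python sketch of the streaming-histogram idea from the referenced algorithm: keep at most `number_of_bins` (centroid, count) pairs and merge the two closest bins whenever the limit is exceeded. ClickHouse's actual bin bookkeeping (and the lower/upper/height output) is more involved; this only illustrates the merging principle:

```python
def streaming_histogram(values, number_of_bins):
    """Maintain at most number_of_bins (centroid, count) pairs over a stream."""
    bins = []  # kept sorted by centroid
    for v in values:
        bins.append([float(v), 1])
        bins.sort(key=lambda b: b[0])
        if len(bins) > number_of_bins:
            # merge the adjacent pair with the smallest centroid gap
            i = min(range(len(bins) - 1), key=lambda j: bins[j + 1][0] - bins[j][0])
        
            (c1, n1), (c2, n2) = bins[i], bins[i + 1]
            bins[i:i + 2] = [[(c1 * n1 + c2 * n2) / (n1 + n2), n1 + n2]]
    return [(c, n) for c, n in bins]

print(streaming_histogram(range(1, 21), 5))  # 5 bins whose counts sum to 20
```

Because bins merge as data streams in, the final bin boundaries depend on the input order, which is why the function does not guarantee precise results.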
**Example**
```
SELECT histogram(5)(number + 1)
FROM (
SELECT *
FROM system.numbers
LIMIT 20
)
```
```
┌─histogram(5)(plus(number, 1))───────────────────────────────────────────┐
│ [(1,4.5,4),(4.5,8.5,4),(8.5,12.75,4.125),(12.75,17,4.625),(17,20,3.25)] │
└─────────────────────────────────────────────────────────────────────────┘
```
You can visualize a histogram with the [bar](https://clickhouse.com/docs/sql-reference/functions/other-functions#bar) function, for example:
```
WITH histogram(5)(rand() % 100) AS hist
SELECT
arrayJoin(hist).3 AS height,
bar(height, 0, 6, 5) AS bar
FROM
(
SELECT *
FROM system.numbers
LIMIT 20
)
```
```
┌─height─┬─bar───┐
│  2.125 │ ██    │
│   3.25 │ ███   │
│  5.625 │ █████ │
│  5.625 │ █████ │
│  3.375 │ ███   │
└────────┴───────┘
```
In this case, you should remember that you do not know the histogram bin borders.
## sequenceMatch
Checks whether the sequence contains an event chain that matches the pattern.
**Syntax**
```
sequenceMatch(pattern)(timestamp, cond1, cond2, ...)
```
Note
Events that occur during the same second may appear in the sequence in an undefined order, affecting the result.
**Arguments**
- `timestamp` — Column considered to contain time data. Typical data types are `Date` and `DateTime`. You can also use any of the supported [UInt](https://clickhouse.com/docs/sql-reference/data-types/int-uint) data types.
- `cond1`, `cond2` — Conditions that describe the chain of events. Data type: `UInt8`. You can pass up to 32 condition arguments. The function takes only the events described in these conditions into account. If the sequence contains data that isn't described in a condition, the function skips them.
**Parameters**
- `pattern` — Pattern string. See [Pattern syntax](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions#pattern-syntax).
**Returned values**
- 1, if the pattern is matched.
- 0, if the pattern isn't matched.
Type: `UInt8`.
#### Pattern syntax
- `(?N)` — Matches the condition argument at position `N`. Conditions are numbered in the `[1, 32]` range. For example, `(?1)` matches the argument passed to the `cond1` parameter.
- `.*` — Matches any number of events. You do not need conditional arguments to match this element of the pattern.
- `(?t operator value)` — Sets the time in seconds that should separate two events. For example, the pattern `(?1)(?t>1800)(?2)` matches events that occur more than 1800 seconds apart. An arbitrary number of any events can lie between these events. You can use the `>=`, `>`, `<`, `<=`, `==` operators.
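For patterns without `(?t ...)` time operators, the matching can be viewed as a regular-expression search over per-event condition labels. The hypothetical Python sketch below (limited to at most 9 conditions; not how ClickHouse implements it) labels each event with the first condition it satisfies, skips events matching no condition, and rewrites `(?N)` to the literal digit `N`:

```python
import re

def sequence_match(pattern, events, conds):
    """events must be sorted by timestamp; conds are predicates over an event."""
    labels = ''.join(
        next((str(i) for i, cond in enumerate(conds, start=1) if cond(e)), '')
        for e in events  # events matching no condition contribute nothing
    )
    regex = re.sub(r'\(\?(\d)\)', r'\1', pattern)  # (?N) -> literal digit N
    return 1 if re.search(regex, labels) else 0

events = [1, 3, 2]
print(sequence_match('(?1)(?2)', events, [lambda n: n == 1, lambda n: n == 2]))  # 1
print(sequence_match('(?1)(?2)', events,
                     [lambda n: n == 1, lambda n: n == 2, lambda n: n == 3]))    # 0
```

With two conditions the intervening 3 is skipped (labels become `12`), so the pattern matches; describing 3 as a third condition yields labels `132`, and `(?1)(?2)` no longer matches.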
**Examples**
Consider data in the `t` table:
```
┌─time─┬─number─┐
│    1 │      1 │
│    2 │      3 │
│    3 │      2 │
└──────┴────────┘
```
Perform the query:
```
SELECT sequenceMatch('(?1)(?2)')(time, number = 1, number = 2) FROM t
```
```
┌─sequenceMatch('(?1)(?2)')(time, equals(number, 1), equals(number, 2))─┐
│                                                                     1 │
└───────────────────────────────────────────────────────────────────────┘
```
The function found the event chain where number 2 follows number 1. It skipped number 3 between them, because the number is not described as an event. If we want to take this number into account when searching for the event chain given in the example, we should make a condition for it.
```
SELECT sequenceMatch('(?1)(?2)')(time, number = 1, number = 2, number = 3) FROM t
```
```
┌─sequenceMatch('(?1)(?2)')(time, equals(number, 1), equals(number, 2), equals(number, 3))─┐
│                                                                                        0 │
└──────────────────────────────────────────────────────────────────────────────────────────┘
```
In this case, the function couldn't find the event chain matching the pattern, because the event for number 3 occurred between 1 and 2. If in the same case we checked the condition for number 4, the sequence would match the pattern.
```
SELECT sequenceMatch('(?1)(?2)')(time, number = 1, number = 2, number = 4) FROM t
```
```
┌─sequenceMatch('(?1)(?2)')(time, equals(number, 1), equals(number, 2), equals(number, 4))─┐
│                                                                                        1 │
└──────────────────────────────────────────────────────────────────────────────────────────┘
```
**See Also**
- [sequenceCount](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions#sequencecount)
## sequenceCount
Counts the number of event chains that matched the pattern. The function searches event chains that do not overlap. It starts to search for the next chain after the current chain is matched.
Note
Events that occur during the same second may appear in the sequence in an undefined order, affecting the result.
**Syntax**
```
sequenceCount(pattern)(timestamp, cond1, cond2, ...)
```
**Arguments**
- `timestamp` — Column considered to contain time data. Typical data types are `Date` and `DateTime`. You can also use any of the supported [UInt](https://clickhouse.com/docs/sql-reference/data-types/int-uint) data types.
- `cond1`, `cond2` — Conditions that describe the chain of events. Data type: `UInt8`. You can pass up to 32 condition arguments. The function takes only the events described in these conditions into account. If the sequence contains data that isn't described in a condition, the function skips them.
**Parameters**
- `pattern` — Pattern string. See [Pattern syntax](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions#pattern-syntax).
**Returned values**
- Number of non-overlapping event chains that are matched.
Type: `UInt64`.
**Example**
Consider data in the `t` table:
```
┌─time─┬─number─┐
│    1 │      1 │
│    2 │      3 │
│    3 │      2 │
│    4 │      1 │
│    5 │      3 │
│    6 │      2 │
└──────┴────────┘
```
Count how many times the number 2 occurs after the number 1 with any amount of other numbers between them:
```
SELECT sequenceCount('(?1).*(?2)')(time, number = 1, number = 2) FROM t
```
```
┌─sequenceCount('(?1).*(?2)')(time, equals(number, 1), equals(number, 2))─┐
│                                                                       2 │
└─────────────────────────────────────────────────────────────────────────┘
```
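The non-overlapping search can be mimicked in Python with `re.findall`, which resumes scanning after each match. Each event is labelled with the first condition (1–9) it satisfies, and `.*` is made lazy so a chain ends as early as possible before the next search starts (an illustration of the semantics, not ClickHouse's algorithm):

```python
import re

def sequence_count(pattern, events, conds):
    """events must be sorted by timestamp; supports up to 9 conditions."""
    labels = ''.join(
        next((str(i) for i, cond in enumerate(conds, start=1) if cond(e)), '')
        for e in events  # events matching no condition contribute nothing
    )
    # (?N) -> literal digit N; make .* lazy so each chain is as short as possible
    regex = re.sub(r'\(\?(\d)\)', r'\1', pattern).replace('.*', '.*?')
    return len(re.findall(regex, labels))

print(sequence_count('(?1).*(?2)', [1, 3, 2, 1, 3, 2],
                     [lambda n: n == 1, lambda n: n == 2]))  # 2
```

On the sample data the labels are `1212`, and the lazy pattern `1.*?2` matches twice without overlap, matching the documented result.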
## sequenceMatchEvents
Returns the event timestamps of the longest event chain that matched the pattern.
Note
Events that occur during the same second may appear in the sequence in an undefined order, affecting the result.
**Syntax**
```
sequenceMatchEvents(pattern)(timestamp, cond1, cond2, ...)
```
**Arguments**
- `timestamp` — Column considered to contain time data. Typical data types are `Date` and `DateTime`. You can also use any of the supported [UInt](https://clickhouse.com/docs/sql-reference/data-types/int-uint) data types.
- `cond1`, `cond2` — Conditions that describe the chain of events. Data type: `UInt8`. You can pass up to 32 condition arguments. The function takes only the events described in these conditions into account. If the sequence contains data that isn't described in a condition, the function skips them.
**Parameters**
- `pattern` — Pattern string. See [Pattern syntax](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions#pattern-syntax).
**Returned values**
- Array of timestamps for the matched condition arguments `(?N)` from the event chain. The position in the array matches the position of the condition argument in the pattern.
Type: Array.
**Example**
Consider data in the `t` table:
```
┌─time─┬─number─┐
│    1 │      1 │
│    2 │      3 │
│    3 │      2 │
│    4 │      1 │
│    5 │      3 │
│    6 │      2 │
└──────┴────────┘
```
Return the timestamps of the events in the longest matched chain:
```
SELECT sequenceMatchEvents('(?1).*(?2).*(?1)(?3)')(time, number = 1, number = 2, number = 4) FROM t
```
```
┌─sequenceMatchEvents('(?1).*(?2).*(?1)(?3)')(time, equals(number, 1), equals(number, 2), equals(number, 4))─┐
│ [1,3,4]                                                                                                    │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
```
**See Also**
- [sequenceMatch](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions#sequencematch)
## windowFunnel
Searches for event chains in a sliding time window and calculates the maximum number of events that occurred from the chain.
The function works according to the algorithm:
- The function searches for data that triggers the first condition in the chain and sets the event counter to 1. This is the moment when the sliding window starts.
- If events from the chain occur sequentially within the window, the counter is incremented. If the sequence of events is disrupted, the counter isn't incremented.
- If the data has multiple event chains at varying points of completion, the function will only output the size of the longest chain.
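The steps above can be sketched in Python for the default mode (no `strict_*` options). This is a deliberately naive, quadratic model with hypothetical helper names, not ClickHouse's implementation: every event satisfying `cond1` opens a window, and we count how far the condition chain advances within it:

```python
def window_funnel(window, events, conds):
    """events: (timestamp, payload) pairs sorted by timestamp."""
    best = 0
    for i, (start_ts, start_e) in enumerate(events):
        if not conds[0](start_e):
            continue                       # only cond1 opens a sliding window
        level = 1
        for ts, e in events[i + 1:]:
            if ts > start_ts + window:
                break                      # event falls outside the window
            if level < len(conds) and conds[level](e):
                level += 1                 # next condition in the chain matched
        best = max(best, level)            # keep the longest chain seen
    return best

conds = [lambda e: e == 'A', lambda e: e == 'B', lambda e: e == 'C']
print(window_funnel(10, [(1, 'A'), (2, 'B'), (3, 'D'), (4, 'C')], conds))  # 3
```

The intervening `D` is skipped here; under `'strict_order'` the chain would stop at level 2 instead.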
**Syntax**
```
windowFunnel(window, [mode, [mode, ... ]])(timestamp, cond1, cond2, ..., condN)
```
**Arguments**
- `timestamp` — Name of the column containing the timestamp. Data types supported: [Date](https://clickhouse.com/docs/sql-reference/data-types/date), [DateTime](https://clickhouse.com/docs/sql-reference/data-types/datetime) and other unsigned integer types (note that even though `timestamp` supports the `UInt64` type, its value can't exceed the Int64 maximum, which is 2^63 - 1).
- `cond` — Conditions or data describing the chain of events. [UInt8](https://clickhouse.com/docs/sql-reference/data-types/int-uint).
**Parameters**
- `window` — Length of the sliding window; it is the time interval between the first and the last condition. The unit of `window` depends on the `timestamp` itself and varies. Determined using the expression `timestamp of cond1 <= timestamp of cond2 <= ... <= timestamp of condN <= timestamp of cond1 + window`.
- `mode` — An optional argument. One or more modes can be set.
  - `'strict_deduplication'` — If the same condition holds for the sequence of events, such a repeating event interrupts further processing. Note: it may work unexpectedly if several conditions hold for the same event.
  - `'strict_order'` — Don't allow other events to intervene. E.g. in the case of `A->B->D->C`, it stops finding `A->B->C` at the `D` and the max event level is 2.
  - `'strict_increase'` — Apply conditions only to events with strictly increasing timestamps.
  - `'strict_once'` — Count each event only once in the chain, even if it meets the condition several times.
  - `'allow_reentry'` — Ignore events that violate the strict order. E.g. in the case of `A->A->B->C`, it finds `A->B->C` by ignoring the redundant `A`, and the max event level is 3.
**Returned value**
The maximum number of consecutive triggered conditions from the chain within the sliding time window. All the chains in the selection are analyzed.
Type: `Integer`.
**Example**
Determine if a set period of time is enough for the user to select a phone and purchase it twice in the online store.
Set the following chain of events:
1. The user logged in to their account on the store (`eventID = 1003`).
2. The user searches for a phone (`eventID = 1007, product = 'phone'`).
3. The user placed an order (`eventID = 1009`).
4. The user made the order again (`eventID = 1010`).
Input table:
```
┌─event_date─┬─user_id─┬───────────timestamp─┬─eventID─┬─product─┐
│ 2019-01-28 │       1 │ 2019-01-29 10:00:00 │    1003 │ phone   │
└────────────┴─────────┴─────────────────────┴─────────┴─────────┘
┌─event_date─┬─user_id─┬───────────timestamp─┬─eventID─┬─product─┐
│ 2019-01-31 │       1 │ 2019-01-31 09:00:00 │    1007 │ phone   │
└────────────┴─────────┴─────────────────────┴─────────┴─────────┘
┌─event_date─┬─user_id─┬───────────timestamp─┬─eventID─┬─product─┐
│ 2019-01-30 │       1 │ 2019-01-30 08:00:00 │    1009 │ phone   │
└────────────┴─────────┴─────────────────────┴─────────┴─────────┘
┌─event_date─┬─user_id─┬───────────timestamp─┬─eventID─┬─product─┐
│ 2019-02-01 │       1 │ 2019-02-01 08:00:00 │    1010 │ phone   │
└────────────┴─────────┴─────────────────────┴─────────┴─────────┘
```
Find out how far the user `user_id` could get through the chain in the period January-February 2019.
Query:
```
SELECT
level,
count() AS c
FROM
(
SELECT
user_id,
windowFunnel(6048000000000000)(timestamp, eventID = 1003, eventID = 1009, eventID = 1007, eventID = 1010) AS level
FROM trend
WHERE (event_date >= '2019-01-01') AND (event_date <= '2019-02-02')
GROUP BY user_id
)
GROUP BY level
ORDER BY level ASC;
```
Result:
```
┌─level─┬─c─┐
│     4 │ 1 │
└───────┴───┘
```
**Example with `allow_reentry` mode**
This example demonstrates how `allow_reentry` mode works with user reentry patterns:
```
-- Sample data: user visits checkout -> product detail -> checkout again -> payment
-- Without allow_reentry: stops at level 2 (product detail page)
-- With allow_reentry: reaches level 4 (payment completion)
SELECT
level,
count() AS users
FROM
(
SELECT
user_id,
windowFunnel(3600, 'strict_order', 'allow_reentry')(
timestamp,
action = 'begin_checkout', -- Step 1: Begin checkout
action = 'view_product_detail', -- Step 2: View product detail
action = 'begin_checkout', -- Step 3: Begin checkout again (reentry)
action = 'complete_payment' -- Step 4: Complete payment
) AS level
FROM user_events
WHERE event_date = today()
GROUP BY user_id
)
GROUP BY level
ORDER BY level ASC;
```
## retention
The function takes as arguments a set of 1 to 32 conditions of type `UInt8` that indicate whether a certain condition was met for the event. Any condition can be specified as an argument (as in [WHERE](https://clickhouse.com/docs/sql-reference/statements/select/where)).
The conditions, except the first, apply in pairs: the result of the second will be true if the first and second are true, of the third if the first and third are true, etc.
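The pairing rule reads more simply as code. Below is a hypothetical Python sketch of the per-row result (not a ClickHouse API): `cond1` stands alone, and every later condition is AND-ed with `cond1`.

```python
def retention(conds):
    """conds: list of booleans for one row; later conditions are ANDed with the first."""
    first = conds[0]
    return [int(first)] + [int(first and c) for c in conds[1:]]

print(retention([True, True, False]))  # [1, 1, 0]
print(retention([False, True, True]))  # [0, 0, 0]
```

Note that if the first condition is false, every element of the result is 0 regardless of the other conditions.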
**Syntax**
```
retention(cond1, cond2, ..., cond32);
```
**Arguments**
- `cond` — An expression that returns a `UInt8` result (1 or 0).
**Returned value**
An array of 1s and 0s.
- 1 — the condition was met for the event.
- 0 — the condition wasn't met for the event.
Type: `UInt8`.
**Example**
Let's consider an example of calculating the `retention` function to determine site traffic.
**1\.** Create a table to illustrate an example.
```
CREATE TABLE retention_test(date Date, uid Int32) ENGINE = Memory;
INSERT INTO retention_test SELECT '2020-01-01', number FROM numbers(5);
INSERT INTO retention_test SELECT '2020-01-02', number FROM numbers(10);
INSERT INTO retention_test SELECT '2020-01-03', number FROM numbers(15);
```
Input table:
Query:
```
SELECT * FROM retention_test
```
Result:
```
┌───────date─┬─uid─┐
│ 2020-01-01 │   0 │
│ 2020-01-01 │   1 │
│ 2020-01-01 │   2 │
│ 2020-01-01 │   3 │
│ 2020-01-01 │   4 │
└────────────┴─────┘
┌───────date─┬─uid─┐
│ 2020-01-02 │   0 │
│ 2020-01-02 │   1 │
│ 2020-01-02 │   2 │
│ 2020-01-02 │   3 │
│ 2020-01-02 │   4 │
│ 2020-01-02 │   5 │
│ 2020-01-02 │   6 │
│ 2020-01-02 │   7 │
│ 2020-01-02 │   8 │
│ 2020-01-02 │   9 │
└────────────┴─────┘
┌───────date─┬─uid─┐
│ 2020-01-03 │   0 │
│ 2020-01-03 │   1 │
│ 2020-01-03 │   2 │
│ 2020-01-03 │   3 │
│ 2020-01-03 │   4 │
│ 2020-01-03 │   5 │
│ 2020-01-03 │   6 │
│ 2020-01-03 │   7 │
│ 2020-01-03 │   8 │
│ 2020-01-03 │   9 │
│ 2020-01-03 │  10 │
│ 2020-01-03 │  11 │
│ 2020-01-03 │  12 │
│ 2020-01-03 │  13 │
│ 2020-01-03 │  14 │
└────────────┴─────┘
```
**2\.** Group users by unique ID `uid` using the `retention` function.
Query:
```
SELECT
uid,
retention(date = '2020-01-01', date = '2020-01-02', date = '2020-01-03') AS r
FROM retention_test
WHERE date IN ('2020-01-01', '2020-01-02', '2020-01-03')
GROUP BY uid
ORDER BY uid ASC
```
Result:
```
┌─uid─┬─r───────┐
│   0 │ [1,1,1] │
│   1 │ [1,1,1] │
│   2 │ [1,1,1] │
│   3 │ [1,1,1] │
│   4 │ [1,1,1] │
│   5 │ [0,0,0] │
│   6 │ [0,0,0] │
│   7 │ [0,0,0] │
│   8 │ [0,0,0] │
│   9 │ [0,0,0] │
│  10 │ [0,0,0] │
│  11 │ [0,0,0] │
│  12 │ [0,0,0] │
│  13 │ [0,0,0] │
│  14 │ [0,0,0] │
└─────┴─────────┘
```
**3\.** Calculate the total number of site visits per day.
Query:
```
SELECT
sum(r[1]) AS r1,
sum(r[2]) AS r2,
sum(r[3]) AS r3
FROM
(
SELECT
uid,
retention(date = '2020-01-01', date = '2020-01-02', date = '2020-01-03') AS r
FROM retention_test
WHERE date IN ('2020-01-01', '2020-01-02', '2020-01-03')
GROUP BY uid
)
```
Result:
```
┌─r1─┬─r2─┬─r3─┐
│  5 │  5 │  5 │
└────┴────┴────┘
```
Where:
- `r1` — the number of unique visitors who visited the site during 2020-01-01 (the `cond1` condition).
- `r2` — the number of unique visitors who visited the site during a specific time period between 2020-01-01 and 2020-01-02 (`cond1` and `cond2` conditions).
- `r3` — the number of unique visitors who visited the site during specific time periods on 2020-01-01 and 2020-01-03 (`cond1` and `cond3` conditions).
## uniqUpTo(N)(x)
Calculates the number of different values of the argument up to a specified limit, `N`. If the number of different argument values is greater than `N`, this function returns `N` + 1, otherwise it calculates the exact value.
Recommended for use with small `N`s, up to 10. The maximum value of `N` is 100.
For the state of an aggregate function, this function uses an amount of memory equal to 1 + `N` \* the size of one value, in bytes. When dealing with strings, this function stores a non-cryptographic 8-byte hash; the calculation is approximate for strings.
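The counting semantics can be sketched in a few lines of Python (an illustrative model, not the ClickHouse state machinery): keep a set of at most `N` distinct values and report `N + 1` as soon as a value beyond those appears.

```python
def uniq_up_to(n, values):
    seen = set()
    for v in values:
        if v not in seen:
            if len(seen) == n:
                return n + 1   # more than N distinct values: report N + 1
            seen.add(v)
    return len(seen)           # exact count while at most N distinct values

print(uniq_up_to(4, [1, 2, 3]))           # 3
print(uniq_up_to(4, [1, 2, 3, 4, 5, 6]))  # 5
```

This is why the memory cost is bounded by `N` regardless of how many distinct values the input contains.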
For example, suppose you have a table that logs every search query made by users on your website. Each row in the table represents a single search query, with columns for the user ID, the search query, and the timestamp of the query. You can use `uniqUpTo` to generate a report that shows only the keywords that were used by at least 5 unique users.
```
SELECT SearchPhrase
FROM SearchLog
GROUP BY SearchPhrase
HAVING uniqUpTo(4)(UserID) >= 5
```
`uniqUpTo(4)(UserID)` calculates the number of unique `UserID` values for each `SearchPhrase`, but it only counts up to 4 unique values. If there are more than 4 unique `UserID` values for a `SearchPhrase`, the function returns 5 (4 + 1). The `HAVING` clause then filters out the `SearchPhrase` values for which the number of unique `UserID` values is less than 5. This will give you a list of search keywords that were used by at least 5 unique users.
## sumMapFiltered
This function behaves the same as [sumMap](https://clickhouse.com/docs/sql-reference/aggregate-functions/reference/summap) except that it also accepts an array of keys to filter with as a parameter. This can be especially useful when working with a high cardinality of keys.
**Syntax**
`sumMapFiltered(keys_to_keep)(keys, values)`
**Parameters**
- `keys_to_keep`: [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of keys to filter with.
- `keys`: [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of keys.
- `values`: [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of values.
**Returned Value**
- Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys.
**Example**
Query:
```
CREATE TABLE sum_map
(
`date` Date,
`timeslot` DateTime,
`statusMap` Nested(status UInt16, requests UInt64)
)
ENGINE = Log;
INSERT INTO sum_map VALUES
('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10]);
```
```
SELECT sumMapFiltered([1, 4, 8])(statusMap.status, statusMap.requests) FROM sum_map;
```
Result:
```
   ┌─sumMapFiltered([1, 4, 8])(statusMap.status, statusMap.requests)─┐
1. │ ([1,4,8],[10,20,10])                                            │
   └─────────────────────────────────────────────────────────────────┘
```
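The filter-then-sum semantics can be modelled in a few lines. This is a hypothetical Python sketch, not ClickHouse's implementation; it reproduces the result of the query above from the inserted rows:

```python
# Hypothetical model of sumMapFiltered(keys_to_keep)(keys, values):
# sum values per key over all rows, keeping only the requested keys,
# and return (sorted keys, matching sums).
def sum_map_filtered(keys_to_keep, keys_arrays, values_arrays):
    keep = set(keys_to_keep)
    totals = {}
    for keys, values in zip(keys_arrays, values_arrays):
        for k, v in zip(keys, values):
            if k in keep:
                totals[k] = totals.get(k, 0) + v
    out_keys = sorted(totals)
    return out_keys, [totals[k] for k in out_keys]

# The four rows inserted into sum_map above:
status = [[1, 2, 3], [3, 4, 5], [4, 5, 6], [6, 7, 8]]
requests = [[10, 10, 10]] * 4
assert sum_map_filtered([1, 4, 8], status, requests) == ([1, 4, 8], [10, 20, 10])
```

Key 4 appears in two rows, hence its sum of 20, while keys 1 and 8 each appear once.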
## sumMapFilteredWithOverflow
This function behaves the same as [sumMap](https://clickhouse.com/docs/sql-reference/aggregate-functions/reference/summap) except that it also accepts an array of keys to filter with as a parameter. This can be especially useful when working with a high cardinality of keys. It differs from the [sumMapFiltered](https://clickhouse.com/docs/sql-reference/aggregate-functions/parametric-functions#summapfiltered) function in that it performs summation with overflow, i.e. it returns the same data type for the sum as the argument data type.
**Syntax**
`sumMapFilteredWithOverflow(keys_to_keep)(keys, values)`
**Parameters**
- `keys_to_keep`: [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of keys to filter with.
- `keys`: [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of keys.
- `values`: [Array](https://clickhouse.com/docs/sql-reference/data-types/array) of values.
**Returned Value**
- Returns a tuple of two arrays: keys in sorted order, and values summed for the corresponding keys.
**Example**
In this example we create a table `sum_map`, insert some data into it, and then use both `sumMapFilteredWithOverflow` and `sumMapFiltered` together with the `toTypeName` function to compare the results. While `requests` is of type `UInt8` in the created table, `sumMapFiltered` promotes the type of the summed values to `UInt64` to avoid overflow, whereas `sumMapFilteredWithOverflow` keeps the type as `UInt8`, which can be too small to hold larger sums, i.e. the sum can overflow.
Query:
```
CREATE TABLE sum_map
(
`date` Date,
`timeslot` DateTime,
`statusMap` Nested(status UInt8, requests UInt8)
)
ENGINE = Log;
INSERT INTO sum_map VALUES
('2000-01-01', '2000-01-01 00:00:00', [1, 2, 3], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:00:00', [3, 4, 5], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:01:00', [4, 5, 6], [10, 10, 10]),
('2000-01-01', '2000-01-01 00:01:00', [6, 7, 8], [10, 10, 10]);
```
```
SELECT sumMapFilteredWithOverflow([1, 4, 8])(statusMap.status, statusMap.requests) as summap_overflow, toTypeName(summap_overflow) FROM sum_map;
```
```
SELECT sumMapFiltered([1, 4, 8])(statusMap.status, statusMap.requests) as summap, toTypeName(summap) FROM sum_map;
```
Result:
```
   ┌─summap_overflow──────┬─toTypeName(summap_overflow)───────┐
1. │ ([1,4,8],[10,20,10]) │ Tuple(Array(UInt8), Array(UInt8)) │
   └──────────────────────┴───────────────────────────────────┘
```
```
   ┌─summap───────────────┬─toTypeName(summap)─────────────────┐
1. │ ([1,4,8],[10,20,10]) │ Tuple(Array(UInt8), Array(UInt64)) │
   └──────────────────────┴────────────────────────────────────┘
```
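The only difference between the two functions is the result type of the sum. A hypothetical Python sketch of the `WithOverflow` behaviour, assuming `UInt8` values that wrap modulo 256:

```python
# Hypothetical model of sumMapFilteredWithOverflow for UInt8 values:
# identical to sumMapFiltered, except sums wrap around at 256 instead
# of being promoted to a wider type.
def sum_map_filtered_with_overflow_u8(keys_to_keep, keys_arrays, values_arrays):
    keep = set(keys_to_keep)
    totals = {}
    for keys, values in zip(keys_arrays, values_arrays):
        for k, v in zip(keys, values):
            if k in keep:
                totals[k] = (totals.get(k, 0) + v) & 0xFF  # UInt8 wraps at 256
    out_keys = sorted(totals)
    return out_keys, [totals[k] for k in out_keys]

# Small sums fit in UInt8 and are unchanged, as in the example above:
status = [[1, 2, 3], [3, 4, 5], [4, 5, 6], [6, 7, 8]]
requests = [[10, 10, 10]] * 4
assert sum_map_filtered_with_overflow_u8([1, 4, 8], status, requests) == ([1, 4, 8], [10, 20, 10])
# Larger sums wrap: 200 + 100 = 300, which overflows UInt8 to 44.
assert sum_map_filtered_with_overflow_u8([1], [[1], [1]], [[200], [100]]) == ([1], [44])
```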
## sequenceNextNode
Returns the value of the next event after a matched event chain.
*Experimental function, `SET allow_experimental_funnel_functions = 1` to enable it.*
**Syntax**
```
sequenceNextNode(direction, base)(timestamp, event_column, base_condition, event1, event2, event3, ...)
```
**Parameters**
- `direction` – Sets the direction to move through the event chain.
    - forward – Move forward.
    - backward – Move backward.
- `base` – Sets the base point.
    - head – Set the base point to the first event.
    - tail – Set the base point to the last event.
    - first\_match – Set the base point to the first matched `event1`.
    - last\_match – Set the base point to the last matched `event1`.
**Arguments**
- `timestamp` – Name of the column containing the timestamp. Supported data types: [Date](https://clickhouse.com/docs/sql-reference/data-types/date), [DateTime](https://clickhouse.com/docs/sql-reference/data-types/datetime) and other unsigned integer types.
- `event_column` – Name of the column containing the value of the next event to be returned. Supported data types: [String](https://clickhouse.com/docs/sql-reference/data-types/string) and [Nullable(String)](https://clickhouse.com/docs/sql-reference/data-types/nullable).
- `base_condition` – Condition that the base point must fulfill.
- `event1`, `event2`, ... – Conditions describing the chain of events. [UInt8](https://clickhouse.com/docs/sql-reference/data-types/int-uint).
**Returned values**
- `event_column[next_index]` – If the pattern is matched and a next value exists.
- `NULL` – If the pattern isn't matched or no next value exists.
Type: [Nullable(String)](https://clickhouse.com/docs/sql-reference/data-types/nullable).
**Example**
It can be used when events are A->B->C->D->E and you want to know the event following B->C, which is D.
The query statement searching for the event following A->B:
```
CREATE TABLE test_flow (
dt DateTime,
id int,
page String)
ENGINE = MergeTree()
PARTITION BY toYYYYMMDD(dt)
ORDER BY id;
INSERT INTO test_flow VALUES (1, 1, 'A') (2, 1, 'B') (3, 1, 'C') (4, 1, 'D') (5, 1, 'E');
SELECT id, sequenceNextNode('forward', 'head')(dt, page, page = 'A', page = 'A', page = 'B') as next_flow FROM test_flow GROUP BY id;
```
Result:
```
┌─id─┬─next_flow─┐
│  1 │ C         │
└────┴───────────┘
```
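The `forward`/`head` matching logic can be sketched as follows. This hypothetical Python model treats one group's events as a timestamp-sorted list and the conditions as predicates; it reproduces the result `C` from the query above:

```python
# Hypothetical model of sequenceNextNode('forward', 'head') for one
# GROUP BY key; not the real implementation, just the matching logic.
def sequence_next_node_forward_head(events, base_cond, chain):
    # The head (first event) must satisfy the base condition.
    if not events or not base_cond(events[0]):
        return None
    # The chain conditions must match consecutive events from the head.
    for i, cond in enumerate(chain):
        if i >= len(events) or not cond(events[i]):
            return None
    # Return the event right after the matched chain, if it exists.
    return events[len(chain)] if len(chain) < len(events) else None

# Reproduces the example: chain A -> B over A,B,C,D,E yields C.
pages = ['A', 'B', 'C', 'D', 'E']
assert sequence_next_node_forward_head(
    pages, lambda p: p == 'A', [lambda p: p == 'A', lambda p: p == 'B']) == 'C'
```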
**Behavior for `forward` and `head`**
```
ALTER TABLE test_flow DELETE WHERE 1 = 1 settings mutations_sync = 1;
INSERT INTO test_flow VALUES (1, 1, 'Home') (2, 1, 'Gift') (3, 1, 'Exit');
INSERT INTO test_flow VALUES (1, 2, 'Home') (2, 2, 'Home') (3, 2, 'Gift') (4, 2, 'Basket');
INSERT INTO test_flow VALUES (1, 3, 'Gift') (2, 3, 'Home') (3, 3, 'Gift') (4, 3, 'Basket');
```
```
SELECT id, sequenceNextNode('forward', 'head')(dt, page, page = 'Home', page = 'Home', page = 'Gift') FROM test_flow GROUP BY id;
dt id page
1970-01-01 09:00:01 1 Home // Base point, Matched with Home
1970-01-01 09:00:02 1 Gift // Matched with Gift
1970-01-01 09:00:03 1 Exit // The result
1970-01-01 09:00:01 2 Home // Base point, Matched with Home
1970-01-01 09:00:02 2 Home // Unmatched with Gift, so the result is null
1970-01-01 09:00:03 2 Gift
1970-01-01 09:00:04 2 Basket
1970-01-01 09:00:01 3 Gift // Base point, Unmatched with Home
1970-01-01 09:00:02 3 Home
1970-01-01 09:00:03 3 Gift
1970-01-01 09:00:04 3 Basket
```
**Behavior for `backward` and `tail`**
```
SELECT id, sequenceNextNode('backward', 'tail')(dt, page, page = 'Basket', page = 'Basket', page = 'Gift') FROM test_flow GROUP BY id;
dt id page
1970-01-01 09:00:01 1 Home
1970-01-01 09:00:02 1 Gift
1970-01-01 09:00:03 1 Exit // Base point, Unmatched with Basket
1970-01-01 09:00:01 2 Home
1970-01-01 09:00:02 2 Home // The result
1970-01-01 09:00:03 2 Gift // Matched with Gift
1970-01-01 09:00:04 2 Basket // Base point, Matched with Basket
1970-01-01 09:00:01 3 Gift
1970-01-01 09:00:02 3 Home // The result
1970-01-01 09:00:03 3 Gift // Matched with Gift
1970-01-01 09:00:04 3 Basket // Base point, Matched with Basket
```
**Behavior for `forward` and `first_match`**
```
SELECT id, sequenceNextNode('forward', 'first_match')(dt, page, page = 'Gift', page = 'Gift') FROM test_flow GROUP BY id;
dt id page
1970-01-01 09:00:01 1 Home
1970-01-01 09:00:02 1 Gift // Base point
1970-01-01 09:00:03 1 Exit // The result
1970-01-01 09:00:01 2 Home
1970-01-01 09:00:02 2 Home
1970-01-01 09:00:03 2 Gift // Base point
1970-01-01 09:00:04 2 Basket // The result
1970-01-01 09:00:01 3 Gift // Base point
1970-01-01 09:00:02 3 Home // The result
1970-01-01 09:00:03 3 Gift
1970-01-01 09:00:04 3 Basket
```
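The `first_match` base point can be modelled similarly. In this hypothetical Python sketch, the base point is the first event satisfying both the base condition and `event1`, and the chain is then checked forward from there:

```python
# Hypothetical model of sequenceNextNode('forward', 'first_match') for
# one GROUP BY key; not the real implementation, just the matching logic.
def sequence_next_node_forward_first_match(events, base_cond, chain):
    # Base point: first event satisfying the base condition and event1.
    for b, e in enumerate(events):
        if base_cond(e) and chain[0](e):
            break
    else:
        return None  # no valid base point
    # The chain conditions must match consecutive events from the base.
    for i, cond in enumerate(chain):
        j = b + i
        if j >= len(events) or not cond(events[j]):
            return None
    nxt = b + len(chain)
    return events[nxt] if nxt < len(events) else None

# id 1 from the example: Home, Gift, Exit with chain (Gift) -> next is Exit.
assert sequence_next_node_forward_first_match(
    ['Home', 'Gift', 'Exit'], lambda p: p == 'Gift', [lambda p: p == 'Gift']) == 'Exit'
```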
```
SELECT id, sequenceNextNode('forward', 'first_match')(dt, page, page = 'Gift', page = 'Gift', page = 'Home') FROM test_flow GROUP BY id;
dt id page
1970-01-01 09:00:01 1 Home
1970-01-01 09:00:02 1 Gift // Base point
1970-01-01 09:00:03 1 Exit // Unmatched with Home
1970-01-01 09:00:01 2 Home
1970-01-01 09:00:02 2 Home
1970-01-01 09:00:03 2 Gift // Base point
1970-01-01 09:00:04 2 Basket // Unmatched with Home
1970-01-01 09:00:01 3 Gift // Base point
1970-01-01 09:00:02 3 Home // Matched with Home
1970-01-01 09:00:03 3 Gift // The result
1970-01-01 09:00:04 3 Basket
```
**Behavior for `backward` and `last_match`**
```
SELECT id, sequenceNextNode('backward', 'last_match')(dt, page, page = 'Gift', page = 'Gift') FROM test_flow GROUP BY id;
dt id page
1970-01-01 09:00:01 1 Home // The result
1970-01-01 09:00:02 1 Gift // Base point
1970-01-01 09:00:03 1 Exit
1970-01-01 09:00:01 2 Home
1970-01-01 09:00:02 2 Home // The result
1970-01-01 09:00:03 2 Gift // Base point
1970-01-01 09:00:04 2 Basket
1970-01-01 09:00:01 3 Gift
1970-01-01 09:00:02 3 Home // The result
1970-01-01 09:00:03 3 Gift // Base point
1970-01-01 09:00:04 3 Basket
```
```
SELECT id, sequenceNextNode('backward', 'last_match')(dt, page, page = 'Gift', page = 'Gift', page = 'Home') FROM test_flow GROUP BY id;
dt id page
1970-01-01 09:00:01 1 Home // Matched with Home, the result is null
1970-01-01 09:00:02 1 Gift // Base point
1970-01-01 09:00:03 1 Exit
1970-01-01 09:00:01 2 Home // The result
1970-01-01 09:00:02 2 Home // Matched with Home
1970-01-01 09:00:03 2 Gift // Base point
1970-01-01 09:00:04 2 Basket
1970-01-01 09:00:01 3 Gift // The result
1970-01-01 09:00:02 3 Home // Matched with Home
1970-01-01 09:00:03 3 Gift // Base point
1970-01-01 09:00:04 3 Basket
```
**Behavior for `base_condition`**
```
CREATE TABLE test_flow_basecond
(
`dt` DateTime,
`id` int,
`page` String,
`ref` String
)
ENGINE = MergeTree
PARTITION BY toYYYYMMDD(dt)
ORDER BY id;
INSERT INTO test_flow_basecond VALUES (1, 1, 'A', 'ref4') (2, 1, 'A', 'ref3') (3, 1, 'B', 'ref2') (4, 1, 'B', 'ref1');
```
```
SELECT id, sequenceNextNode('forward', 'head')(dt, page, ref = 'ref1', page = 'A') FROM test_flow_basecond GROUP BY id;
dt id page ref
1970-01-01 09:00:01 1 A ref4 // The head cannot be the base point because its ref column does not match 'ref1'.
1970-01-01 09:00:02 1 A ref3
1970-01-01 09:00:03 1 B ref2
1970-01-01 09:00:04 1 B ref1
```
```
SELECT id, sequenceNextNode('backward', 'tail')(dt, page, ref = 'ref4', page = 'B') FROM test_flow_basecond GROUP BY id;
dt id page ref
1970-01-01 09:00:01 1 A ref4
1970-01-01 09:00:02 1 A ref3
1970-01-01 09:00:03 1 B ref2
1970-01-01 09:00:04 1 B ref1 // The tail cannot be the base point because its ref column does not match 'ref4'.
```
```
SELECT id, sequenceNextNode('forward', 'first_match')(dt, page, ref = 'ref3', page = 'A') FROM test_flow_basecond GROUP BY id;
dt id page ref
1970-01-01 09:00:01 1 A ref4 // This row cannot be the base point because its ref column does not match 'ref3'.
1970-01-01 09:00:02 1 A ref3 // Base point
1970-01-01 09:00:03 1 B ref2 // The result
1970-01-01 09:00:04 1 B ref1
```
```
SELECT id, sequenceNextNode('backward', 'last_match')(dt, page, ref = 'ref2', page = 'B') FROM test_flow_basecond GROUP BY id;
dt id page ref
1970-01-01 09:00:01 1 A ref4
1970-01-01 09:00:02 1 A ref3 // The result
1970-01-01 09:00:03 1 B ref2 // Base point
1970-01-01 09:00:04 1 B ref1 // This row cannot be the base point because its ref column does not match 'ref2'.
```
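The role of `base_condition` is easiest to see when it tests a different column than the event chain. A hypothetical Python sketch of the `backward`/`last_match` case above, with events represented as `(page, ref)` tuples:

```python
# Hypothetical model of sequenceNextNode('backward', 'last_match') with a
# base condition on a separate column; not the real implementation.
def seq_next_node_backward_last_match(events, base_cond, chain):
    # Base point: last event satisfying the base condition and event1.
    for b in range(len(events) - 1, -1, -1):
        if base_cond(events[b]) and chain[0](events[b]):
            break
    else:
        return None  # no valid base point
    # Walk the chain backwards from the base point.
    for i, cond in enumerate(chain):
        j = b - i
        if j < 0 or not cond(events[j]):
            return None
    nxt = b - len(chain)
    return events[nxt][0] if nxt >= 0 else None

# The last query above: base is the row with ref = 'ref2' and page = 'B';
# the event before it has page 'A', which is the result.
rows = [('A', 'ref4'), ('A', 'ref3'), ('B', 'ref2'), ('B', 'ref1')]
assert seq_next_node_backward_last_match(
    rows, lambda e: e[1] == 'ref2', [lambda e: e[0] == 'B']) == 'A'
```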