Documentation Index
Fetch the complete documentation index at: https://private-7c7dfe99-mintlify-0a08f105.mintlify.app/llms.txt
Use this file to discover all available pages before exploring further.
Description
pg_clickhouse is a PostgreSQL extension that enables remote query execution on ClickHouse databases, including a foreign data wrapper. It supports PostgreSQL 13 and higher and ClickHouse 23 and higher.Getting Started
The simplest way to try pg_clickhouse is the Docker image, which contains the standard PostgreSQL Docker image with the pg_clickhouse and re2 extensions:Usage
Versioning Policy
pg_clickhouse adheres to Semantic Versioning for its public releases.- The major version increments for API changes
- The minor version increments for backward compatible SQL changes
- The patch version increments for binary-only changes
- The library version (defined by
PG_MODULE_MAGICon PostgreSQL 18 and higher) includes the full semantic version, visible in the output of thepgch_version()function or the Postgrespg_get_loaded_modules()function. - The extension version (defined in the control file) includes only the major
and minor versions, visible in the
pg_catalog.pg_extensiontable, the output of thepg_available_extension_versions()function, and\dx pg_clickhouse.
v0.1.0 to v0.1.1, benefits all databases that have loaded v0.1 and
don’t need to run ALTER EXTENSION to benefit from the upgrade.
A release that increments the minor or major versions, on the other hand, will
be accompanied by SQL upgrade scripts, and all existing database that contain
the extension must run ALTER EXTENSION pg_clickhouse UPDATE to benefit from
the upgrade.
DDL SQL Reference
The following SQL DDL expressions use pg_clickhouse.CREATE EXTENSION
Use CREATE EXTENSION to add pg_clickhouse to a database:WITH SCHEMA to install it into a specific schema (recommended):
ALTER EXTENSION
Use ALTER EXTENSION to change pg_clickhouse. Examples:-
After installing a new release of pg_clickhouse, use the
UPDATEclause: -
Use
SET SCHEMAto move the extension to a new schema:
DROP EXTENSION
Use DROP EXTENSION to remove pg_clickhouse from a database:CASCADE clause to drop them, too:
CREATE SERVER
Use CREATE SERVER to create a foreign server that connects to a ClickHouse server. Example:driver: The ClickHouse connection driver to use, either “binary” or “http”. Required.dbname: The ClickHouse database to use upon connecting. Defaults to “default”.fetch_size: Approximate batch size in bytes for HTTP streaming. Batches split on row boundaries. Defaults to50000000(50 MB).0disables streaming and buffers the full response. Foreign tables can override this value.host: The host name of the ClickHouse server. Defaults to “localhost”;port: The port to connect to on the ClickHouse server. Defaults as follows:- 9440 if
driveris “binary” andhostis a ClickHouse Cloud host - 9004 if
driveris “binary” andhostisn’t a ClickHouse Cloud host - 8443 if
driveris “http” andhostis a ClickHouse Cloud host - 8123 if
driveris “http” andhostisn’t a ClickHouse Cloud host
- 9440 if
ALTER SERVER
Use ALTER SERVER to change a foreign server. Example:DROP SERVER
Use DROP SERVER to remove a foreign server:CASCADE to
also drop those dependencies:
CREATE USER MAPPING
Use CREATE USER MAPPING to map a PostgreSQL user to a ClickHouse user. For example, to map the current PostgreSQL user to the remote ClickHouse user when connecting with thetaxi_srv foreign server:
user: The name of the ClickHouse user. Defaults to “default”.password: The password of the ClickHouse user.
ALTER USER MAPPING
Use ALTER USER MAPPING to change the definition of a user mapping:DROP USER MAPPING
Use DROP USER MAPPING to remove a user mapping:IMPORT FOREIGN SCHEMA
Use IMPORT FOREIGN SCHEMA to import all the tables defines in a ClickHouse database as foreign tables into a PostgreSQL schema:LIMIT TO to limit the import to specific tables:
EXCEPT to exclude tables:
CREATE FOREIGN TABLE
Use CREATE FOREIGN TABLE to create a foreign table that can query data from a ClickHouse database:database: The name of the remote database. Defaults to the database defined for the foreign server.fetch_size: Approximate batch size in bytes for HTTP streaming. Overrides server-levelfetch_size. Defaults to50000000(50 MB).0disables streaming and buffers the full response.table_name: The name of the remote table. Default to the name specified for the foreign table.engine: The table engine used by the ClickHouse table. ForCollapsingMergeTree()andAggregatingMergeTree(), pg_clickhouse automatically applies the parameters to function expressions executed on the table.
-
column_name: The name of the column on the ClickHouse side, used in preference to the PostgreSQL attribute name when deparsing queries and inserts. Useful for mapping unquoted lowercase PostgreSQL column names to case-sensitive ClickHouse columns, e.g., -
AggregateFunction: The name of the aggregate function applied to an AggregateFunction Type column. Map the data type to the ClickHouse type passed to the function and specify the name of the aggregate function via the appropriate column option and pg_clickhouse will automatically appendMergeto an aggregate function evaluating the column. -
SimpleAggregateFunction: The name of the aggregate function applied to an SimpleAggregateFunction Type column. Map the data type to the ClickHouse type passed to the function and specify the name of the aggregate function via the appropriate column option.
ALTER FOREIGN TABLE
Use ALTER FOREIGN TABLE to change the definition of a foreign table:DROP FOREIGN TABLE
Use DROP FOREIGN TABLE to remove a foreign table:CASCADE clause to drop them, too:
DML SQL Reference
The SQL DML expressions below may use pg_clickhouse. Examples depend on these ClickHouse tables:EXPLAIN
The EXPLAIN command works as expected, but theVERBOSE option triggers the
ClickHouse “Remote SQL” query to be emitted:
SELECT
Use the SELECT statement to execute queries on pg_clickhouse tables just like any other tables:nodes table and join to it instead of the remote table:
node_id instead of the local column, and then join
to the lookup table later:
node_id, reducing
the number of rows that must be pulled back into Postgres from 1000 (all of
them) to just 8, one for each node.
PREPARE, EXECUTE, DEALLOCATE
As of v0.1.2, pg_clickhouse supports parameterized queries, mainly created by the PREPARE command:{param:type}-style query parameters:
parameters:
INSERT
Use the INSERT command to insert values into a remote ClickHouse table:COPY
Use the COPY command to insert a batch of rows into a remote ClickHouse table:⚠️ Batch API Limitations pg_clickhouse hasn’t yet implemented support for the PostgreSQL FDW batch insert API. Thus COPY currently uses INSERT statements to insert records. This will be improved in a future release.
LOAD
Use LOAD to load the pg_clickhouse shared library:SET
Use SET to set the pg_clickhouse custom configuration parameters.pg_clickhouse.session_settings
The pg_clickhouse.session_settings parameter configures ClickHouse
settings to be set on subsequent queries. Example:
join_use_nulls 1, group_by_use_nulls 1, final 1. Set it to an
empty string to fall back on the ClickHouse server’s settings.
date_time_output_format: the http driver requires it to be “iso”format_tsv_null_representation: the http driver requires the defaultoutput_format_tsv_crlf_end_of_linethe http driver requires the default
pg_clickhouse.session_settings; either use shared library preloading or
simply use one of the objects in the extension to ensure it loads.
pg_clickhouse.pushdown_regex
The pg_clickhouse.pushdown_regex parameter controls whether pg_clickhouse
pushes down regular expression functions and operators. It does so by default;
set this parameter to false to prevent them from being pushed down:
ALTER ROLE
Use ALTER ROLE’sSET command to preload pg_clickhouse
and/or SET its parameters for specific roles:
RESET command to reset pg_clickhouse preloading
and/or parameters:
Preloading
If every or nearly every Postgres connection needs to use pg_clickhouse, consider using shared library preloading to automatically load it:session_preload_libraries
Loads the shared library for every new connection to PostgreSQL:
shared_preload_libraries
Loads the shared library into the PostgreSQL parent process at startup time:
Data Types
pg_clickhouse maps the following ClickHouse data types to PostgreSQL data types. IMPORT FOREIGN SCHEMA uses the first type in the PostgreSQL column when importing columns; additional types may be used in CREATE FOREIGN TABLE statements:| ClickHouse | PostgreSQL | Notes |
|---|---|---|
| Bool | boolean | |
| Date | date | |
| Date32 | date | |
| DateTime | timestamptz | |
| Decimal | numeric | |
| Float32 | real | |
| Float64 | double precision | |
| IPv4 | inet | |
| IPv6 | inet | |
| Int16 | smallint | |
| Int32 | integer | |
| Int64 | bigint | |
| Int8 | smallint | |
| JSON | jsonb, json | |
| String | text, bytea | |
| UInt16 | integer | |
| UInt32 | bigint | |
| UInt64 | bigint | Errors on values > BIGINT max |
| UInt8 | smallint | |
| UUID | uuid |
BYTEA
ClickHouse does not provide the equivalent of the PostgreSQL BYTEA type, but allows any bytes to be stored in String type. In general ClickHouse strings should be mapped to the PostgreSQL TEXT, but when using binary data, map it to BYTEA. Example:SELECT query will output:
Function and Operator Reference
Functions
These functions provide the interface to query a ClickHouse database.clickhouse_raw_query
host=localhost port=8123. The supported connection
parameters are:
host: The host to connect to; required.port: The HTTP port to connect to; defaults to8123unlesshostis a ClickHouse Cloud host, in which case it defaults to8443dbname: The name of the database to connect to.username: The username to connect as; defaults todefaultpassword: The password to use to authenticate; defaults to no password
EXECUTE access to this function; consider GRANTing
access only to roles that legitimately need to execute ad-hoc ClickHouse
queries, e.g., a dedicated ClickHouse admin role:
Useful for queries that return no records, but queries that do return values
will be returned as a single text value:
Pushdown functions
pg_clickhouse pushes down a subset of the PostgreSQL builtin functions used
in conditionals (HAVING and WHERE clauses). That subset maps to ClickHouse
equivalents as follows:
abs: absfactorial: factorialmod(int2/int4/int8/numeric): modulopow&power(float8/numeric): powround: roundsin,cos,tan,atan,atan2,sinh,cosh,tanh,asinh,degrees,radians,pi: ClickHouse math functions of the same name.asin,acos,atanh,acoshare not pushed down: PG raises on out-of-range input where CH returnsNaN.date_part:date_part('day'): toDayOfMonthdate_part('doy'): toDayOfYeardate_part('dow'): toDayOfWeekdate_part('year'): toYeardate_part('month'): toMonthdate_part('hour'): toHourdate_part('minute'): toMinutedate_part('second'): toSeconddate_part('quarter'): toQuarterdate_part('isoyear'): toISOYeardate_part('week'): toISOYeardate_part('epoch'): toISOYear
date_trunc:date_trunc('week'): toMondaydate_trunc('second'): toStartOfSeconddate_trunc('minute'): toStartOfMinutedate_trunc('hour'): toStartOfHourdate_trunc('day'): toStartOfDaydate_trunc('month'): toStartOfMonthdate_trunc('quarter'): toStartOfQuarterdate_trunc('year'): toStartOfYear
extract(field FROM source): same mappings asdate_partdate(timestamp)&date(timestamptz): toDate (deparsed as CH aliasdate)array_position: indexOfarray_cat: arrayConcatarray_append: arrayPushBackarray_prepend: arrayPushFrontarray_remove: arrayRemovearray_length&cardinality: lengtharray_to_string: arrayStringConcatstring_to_array: splitByStringsplit_part: splitByString + array subscripttrim_array: arrayResizearray_fill: arrayWithConstantarray_reverse: arrayReversearray_shuffle: arrayShufflearray_sample: arrayRandomSamplearray_sort: arraySort / arrayReverseSortbtrim: trimBothltrim: ltrimrtrim: rtrimconcat_ws: concatWithSeparatorlower(text): lowerUTF8upper(text): upperUTF8substring(text, ...)&substr(text, ...): substringUTF8substring(bytea, ...)&substr(bytea, ...): substringlength(text): lengthUTF8length(bytea)&octet_length: lengthreverse(text): reverseUTF8reverse(bytea): reversestrpos: positionUTF8regexp_like: matchregexp_replace: replaceRegexpOne or replaceRegexpOne when thegflag is presentregexp_split_to_array: splitByRegexpmd5: MD5json_extract_path_text: sub-column syntaxjson_extract_path: toJSONString + sub-column syntaxjsonb_extract_path_text: sub-column syntaxjsonb_extract_path: toJSONString + sub-column syntaxbit_count(bytea): bitCountto_timestamp(float8): fromUnixTimestampto_char(timestamp[tz], fmt): formatDateTime whenfmtis a string constant whose every keyword has a faithful ClickHouse equivalent. See to_char() under Compatibility Notes for the supported keywords. Otherwise the function evaluates locally in PostgreSQL.statement_timestamp,transaction_timestamp, &clock_timestamp: nowInBlock64 (nowInBlock64(9, $session_timezone))CURRENT_DATE: now and toDate (toDate(now($session_timezone)))now,CURRENT_TIMESTAMP, &LOCALTIMESTAMP: now64 (now64(9, $session_timezone))CURRENT_TIMESTAMP(n)&LOCALTIMESTAMP(n): now64 (now64(n, $session_timezone))CURRENT_DATABASE: Passed as value from PostgreSQL function.CURRENT_SCHEMA: Passed as value from PostgreSQL function.CURRENT_CATALOG: Passed as value from PostgreSQL function.CURRENT_USER: Passed as value from PostgreSQL function.USER: Passed as value from PostgreSQL function.CURRENT_ROLE: Passed as value from PostgreSQL function.SESSION_USER: Passed as value from PostgreSQL function.
Pushdown operators
- Array slice (
arr[L:U]): arraySlice @>(array contains): hasAll<@(array contained by): hasAll&&(array overlap): hasAny~(regexp match): match!~(regexp not match): match~*(case insensitive regexp no match): match!~*(case insensitive regexp not match): match->>(JSON/JSONB extract element as text): sub-column syntax->(JSON/JSONB extract): toJSONString + sub-column syntax
Custom functions
These custom functions created bypg_clickhouse provide foreign query
pushdown for select ClickHouse functions with no PostgreSQL equivalents. If
any of these functions can’t be pushed down they will raise an exception.
Extension pushdown
pg_clickhouse recognizes functions from select core and third-party extensions, pushing them down to their ClickHouse equivalents.re2
All re2 extension functions push down 1:1 to ClickHouse:re2match→ matchre2extract→ extractre2extractall→ extractAllre2regexpextract→ regexpExtractre2extractgroups→ extractGroupsre2replaceregexpone→ replaceRegexpOnere2replaceregexpall→ replaceRegexpAllre2countmatches→ countMatchesre2countmatchescaseinsensitive→ countMatchesCaseInsensitivere2multimatchany→ multiMatchAnyre2multimatchanyindex→ multiMatchAnyIndexre2multimatchallindices→ multiMatchAllIndices
intarray
One intarray function pushes down to ClickHouse:idx→ indexOf
fuzzystrmatch
Two fuzzystrmatch functions push down to ClickHouse:soundex: soundexlevenshtein(2-arg): editDistanceUTF8
Pushdown casts
pg_clickhouse pushes down casts such asCAST(x AS bigint) for compatible
data types. For incompatible types the pushdown will fail; if x in this
example is a ClickHouse UInt64, ClickHouse will refuse to cast the value.
In order to push down casts to incompatible data types, pg_clickhouse provides
the following functions. They raise an exception in PostgreSQL if they’re not
pushed down.
Pushdown aggregates
These PostgreSQL aggregate functions pushdown to ClickHouse.Custom aggregates
These custom aggregate functions created bypg_clickhouse provide foreign
query pushdown for select ClickHouse aggregate functions with no PostgreSQL
equivalents. If any of these functions can’t be pushed down they will raise
an exception.
Pushdown ordered set aggregates
These ordered-set aggregate functions map to ClickHouse Parametric aggregate functions by passing their direct argument as a parameter and theirORDER BY expressions as arguments. For example, this PostgreSQL query:
ORDER BY suffixes DESC and NULLS FIRST
aren’t supported and will raise an error.
percentile_cont(double): quantilequantile(double): quantilequantileExact(double): quantileExact
Pushdown window functions
These PostgreSQL window functions push down to ClickHouse withOVER (PARTITION BY ... ORDER BY ...) clauses, including frame specifications where
applicable.
- row_number
- rank
- dense_rank
- ntile
- cume_dist
- percent_rank
- lead
- lag
- first_value
- last_value
- nth_value
min/max(withOVERclause)
row_number, rank, dense_rank, ntile, cume_dist,
percent_rank) omit their frame clause during pushdown because ClickHouse
rejects frame specifications on these functions.
Compatibility notes
Regular expressions
While pg_clickhouse pushes down regular expressions to ClickHouse equivalents when pg_clickhouse.pushdown_regex is true (the default), and makes an effort to ensure a basic level of compatibility, be aware of the differences between the two and how pg_clickhouse handles them.-
PostgreSQL supports POSIX Regular Expressions while ClickHouse supports
RE2 Regular Expressions. Beware of differences in behavior: write RE2
when the regular expression will be evaluated by ClickHouse (e.g., in a
WHEREclause) and POSIX when it will be evaluated by Postgres (e.g., in aSELECTclause). -
pg_clickhouse pushes down the Postgres [Regex flags] by prepending them to
ClickHouse regular expression inside
(?). For example:BecomesNote the inclusion of-s; this aligns the behavior with Postgres regular expressions by disablings, which ClickHouse enables by default. pg_clickhouse will not include-sif the flags in the Postgres function call includes. Unfortunately, this behavior breaks the compatibility of some regular expression in Postgres 24 and earlier. -
The only flags both support, and therefore can be used when evaluated by
ClickHouse, are:
i: case-insensitivem: multi-line mode:s: let.match\np: partial newline-sensitive matching (treated the same ass)t: tight syntax (the default, removed by pg_clickhouse)
- Any other flags passed to regular expression functions will cause the function not to be pushed down.
-
The exception is
regexp_replace(), which also supports thegflag. Whengis set, pg_clickhouse usesreplaceRegexpAll()instead ofreplaceRegexpOne()and removes the flag before prepending other flags. -
The replacement argument to Postgres
regexp_replace()supports\&to refer to the entire match, while in ClickHouse supports\0for the entire match. Be sure to use\0when the function pushes down to ClickHouse.
to_char()
PostgreSQL to_char() for timestamp and timestamp with time zone
pushes down to ClickHouse formatDateTime only when the format argument
is a non-NULL string constant whose every PostgreSQL keyword has a
byte-for-byte identical ClickHouse equivalent. If the format is dynamic
(not a Const), or contains any unsupported keyword or modifier, the
call falls back to local evaluation in PostgreSQL — pushdown is never
attempted with a partial translation, so output stays PG-compatible.
Two-argument to_char() forms over numeric, interval, and other
non-timestamp types never push down; ClickHouse formatDateTime only
formats date-time values.
Translated keywords
| PostgreSQL | ClickHouse | Meaning |
|---|---|---|
YYYY, yyyy | %Y | 4-digit year |
YY, yy | %y | 2-digit year |
MM, mm | %m | zero-padded month (01–12) |
DD, dd | %d | zero-padded day of month (01–31) |
DDD, ddd | %j | zero-padded day of year (001–366) |
HH24, hh24 | %H | zero-padded 24-hour (00–23) |
HH, hh, HH12, hh12 | %I | zero-padded 12-hour (01–12) |
MI, mi | %i | zero-padded minute (00–59) |
SS, ss | %S | zero-padded second (00–59) |
Q, q | %Q | quarter (1–4) |
Mon | %b | abbreviated month name, e.g., Oct |
Dy | %a | abbreviated weekday name, e.g., Mon |
AM, PM | %p | meridiem indicator, always uppercase |
Quoted text and literals
Text wrapped in"..." passes through verbatim, with any literal %
doubled to %% to escape ClickHouse’s specifier prefix. A \" outside
quotes also passes through as a literal ". Inside "...", backslash
only escapes "; other backslash sequences are treated as literal text.