Query Identifier

From the beginning, Datasentinel has relied on an MD5 hash of the query text to compute query identifiers.

This approach was beneficial for older PostgreSQL versions, prior to the introduction of the native query_id field in the pg_stat_activity view and the compute_query_id parameter (available since PostgreSQL 14).

It enables mapping entries between pg_stat_activity and pg_stat_statements to build the database workload of active sessions, and remains beneficial for recent versions.

Starting with Agent 3.9.0, you can switch to using PostgreSQL’s native queryid as the query identifier.

A new configuration parameter, use-postgres-queryid, allows switching from the md5-based identifier to the internal queryid computed by PostgreSQL. This parameter is disabled by default.

The existing query_md5_id tag is preserved in the UI.

When this option is enabled, it contains the PostgreSQL queryid value instead of an md5 hash.

This option also provides a workaround for a rare issue in pg_stat_statements where an invalid query text encoding (for example, containing a null byte) can trigger the following error when collecting metrics from the extension.

ERROR: invalid byte sequence for encoding "UTF8": 0x00

If you are satisfied with the MD5 approach, there is no need to switch.

Enabling the parameter

The parameter can be configured at multiple levels:

Instance level, to enable it for a specific monitored instance

# Using Agent CLI
datasentinel update connection "<<CONNECTION_NAME>>" use-postgres-queryid on
# restart agent
datasentinel stop agent
datasentinel start agent

or modify agent configuration file $HOME/.datasentinel/agent.yml

# Connection configuration (stop agent first)
    use-postgres-queryid: true

Agent level, in which case it applies to all instances monitored by that agent

# Using Agent CLI
datasentinel set use-postgres-queryid on
# restart agent
datasentinel stop agent
datasentinel start agent

or modify agent configuration file $HOME/.datasentinel/agent.yml

# File configuration section (stop agent first)
agent:
    use-postgres-queryid: true

Platform level when using the Agentless feature, in which case it applies to all monitored instances.

Available since version 2026.02.

/datasentinel/soft/datasentinel_backend/config/datasentinel.yml

agent:
 use-postgres-queryid: true

Points of Attention

There are situations to be aware of when using PostgreSQL queryids with Datasentinel (non-exhaustive):

query_id missing from pg_stat_activity even when the backend state is active

The query_id is not available in pg_stat_activity during the initial Parsing and Analysis phases, or if the backend blocks on a lock during these steps. This occurs because the id is computed and reported only after the analysis phase successfully completes. If the query_id is not available for a query, Datasentinel computes an md5 hash after removing literal values from the query text.

Example

## Session 1: Exclusive lock
pgbench=# begin;lock table pgbench_branches in exclusive mode;
BEGIN
LOCK TABLE
pgbench=*#

## Session 2: Waiting
pgbench=# update pgbench_branches set bbalance=1 where bid=10;

### pg_stat_activity
postgres=# select query_id, state, pid , wait_event_type, wait_event, query 
from pg_stat_activity where query ~* 'pgbench_branches';
       query_id       |        state        |  pid   | wait_event_type | wait_event |                                                          query                                                           
----------------------+---------------------+--------+-----------------+------------+--------------------------------------------------------------------------------------------------------------------------
                      | active              | 214446 | Lock            | relation   | update pgbench_branches set bbalance=1 where bid=10;
  8734522334460367230 | idle in transaction | 214326 | Client          | ClientRead | lock table pgbench_branches in exclusive mode;

Multiple query IDs for queries that look the same

PostgreSQL computes the query_id by hashing the internal structure of a parsed statement. This fingerprinting process is sensitive to how the parser classifies data types, particularly numeric literals, which can lead to different query_id values for queries that appear structurally identical.
When the PostgreSQL parser processes a query containing numbers, it categorizes them based on their magnitude and format.
Even though both queries perform a comparison against a number, the internal representation of that number can differ (integer vs bigint for example). This difference in the parse tree structure results in distinct hashes, generating different query_id values.
In the Datasentinel UI, different query_id values result in separate queries, while the md5 method aggregates their metrics into one.

Example

select /*test_queryid*/ * from pgbench_branches where bid = 10000000000000;
 bid | bbalance | filler 
-----+----------+--------
(0 rows)

select /*test_queryid*/ * from pgbench_branches where bid = 10;
 bid | bbalance  | filler 
-----+-----------+--------
  10 | 333100390 | 
(1 row)

select queryid, calls, query from pg_stat_statements where query ~* 'test_queryid';
       queryid       | calls |                             query                              
---------------------+-------+----------------------------------------------------------------
 3003798125598319662 |     1 | select /*test_queryid*/ * from pgbench_branches where bid = $1
 4063077833847830339 |     1 | select /*test_queryid*/ * from pgbench_branches where bid = $1
(2 rows)

select queryid, planid, calls, plan from pg_store_plans where queryid in (select queryid from pg_stat_statements where query ~* 'test_query');
       queryid       |       planid        | calls |                                              plan                                              
---------------------+---------------------+-------+------------------------------------------------------------------------------------------------
 3003798125598319662 | 8997990410952493373 |     1 | Index Scan using pgbench_branches_pkey on pgbench_branches  (cost=0.27..8.30 rows=1 width=364)+
                     |                     |       |   Index Cond: (bid = '10000000000000'::bigint)
 4063077833847830339 | 7082055953033278291 |     1 | Index Scan using pgbench_branches_pkey on pgbench_branches  (cost=0.27..8.30 rows=1 width=364)+
                     |                     |       |   Index Cond: (bid = 1)
(2 rows)

PreviousAgentless & System Metrics NextSimplified pg_instance Display

Last updated 27 days ago

hashtagEnabling the parameter

hashtagPoints of Attention

Enabling the parameter

Points of Attention