# Query Identifier

This approach was beneficial for older PostgreSQL versions, prior to the introduction of the native `query_id` field in the [`pg_stat_activity`](https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ACTIVITY-VIEW) view and the [`compute_query_id`](https://postgresqlco.nf/doc/en/param/compute_query_id/) parameter (available since PostgreSQL 14).&#x20;

It enables mapping entries between `pg_stat_activity` and `pg_stat_statements` to build the database [workload of active sessions](https://docs.datasentinel.io/manual/features/key-features/session-history), and remains beneficial for recent versions.&#x20;

Starting with [Agent 3.9.0](https://docs.datasentinel.io/manual/implementation/agent-usage/release-notes#v3.9.0), you can switch to using PostgreSQL’s native `queryid` as the query identifier.&#x20;

A new configuration parameter, **`use-postgres-queryid`**, allows switching from the md5-based identifier to the internal `queryid` computed by PostgreSQL. This parameter is **disabled** by default.

{% hint style="info" %}
The existing **query\_md5\_id** tag is preserved in the UI.

When this option is enabled, it contains the PostgreSQL **queryid** value instead of an **md5 hash**.
{% endhint %}

This option also provides a workaround for a rare issue in **`pg_stat_statements`** where an invalid query text encoding (for example, containing a null byte) can trigger the following error when collecting metrics from the extension.

```
ERROR: invalid byte sequence for encoding "UTF8": 0x00
```

{% hint style="warning" %}
If you are satisfied with the MD5 approach, there is no need to switch.
{% endhint %}

***

## Enabling the parameter

The parameter can be configured at multiple levels:

* **Instance level**, to enable it for a specific monitored instance

```bash
# Using Agent CLI
datasentinel update connection "<<CONNECTION_NAME>>" use-postgres-queryid on
# restart agent
datasentinel stop agent
datasentinel start agent
```

or modify agent configuration file `$HOME/.datasentinel/agent.yml`

```yaml
# Connection configuration (stop agent first)
    use-postgres-queryid: true  
```

* **Agent level**, in which case it applies to **all instances monitored by that agent**

```bash
# Using Agent CLI
datasentinel set use-postgres-queryid on
# restart agent
datasentinel stop agent
datasentinel start agent
```

or modify agent configuration file `$HOME/.datasentinel/agent.yml`

```yaml
# File configuration section (stop agent first)
agent:
    use-postgres-queryid: true                                               
```

* **Platform level** when using the **Agentless** **feature,** in which case it applies to all monitored instances.&#x20;

{% hint style="info" %}
Available since version 2026.02.
{% endhint %}

{% code title="/datasentinel/soft/datasentinel\_backend/config/datasentinel.yml" %}

```bash
agent:
 use-postgres-queryid: true  
```

{% endcode %}

***

## Points of Attention

There are situations to be aware of when using PostgreSQL queryids with Datasentinel (non-exhaustive):

* **`query_id`** missing from **`pg_stat_activity`** even when the backend `state` is `active`

> The **`query_id`** is not available in **`pg_stat_activity`** during the initial Parsing and Analysis phases, or if the backend blocks on a lock during these steps. \
> This occurs because the id is computed and reported only after the analysis phase successfully completes.\
> \
> \&#xNAN;*If the query\_id is not available for a query, Datasentinel computes an md5 hash after removing literal values from the query text.*

*Example*

```sql
## Session 1: Exclusive lock
pgbench=# begin;lock table pgbench_branches in exclusive mode;
BEGIN
LOCK TABLE
pgbench=*# 
```

```sql
## Session 2: Waiting
pgbench=# update pgbench_branches set bbalance=1 where bid=10;

```

```sql
### pg_stat_activity
postgres=# select query_id, state, pid , wait_event_type, wait_event, query 
from pg_stat_activity where query ~* 'pgbench_branches';
       query_id       |        state        |  pid   | wait_event_type | wait_event |                                                          query                                                           
----------------------+---------------------+--------+-----------------+------------+--------------------------------------------------------------------------------------------------------------------------
                      | active              | 214446 | Lock            | relation   | update pgbench_branches set bbalance=1 where bid=10;
  8734522334460367230 | idle in transaction | 214326 | Client          | ClientRead | lock table pgbench_branches in exclusive mode;

```

* Multiple query IDs for queries that look the same

> PostgreSQL computes the **`query_id`** by hashing the internal structure of a parsed statement. This fingerprinting process is sensitive to how the parser classifies data types, particularly **numeric literals**, which can lead to different **`query_id`** values for queries that appear structurally identical.
>
> When the PostgreSQL parser processes a query containing numbers, it categorizes them based on their magnitude and format.
>
> Even though both queries perform a comparison against a number, the internal representation of that number can differ (integer vs bigint for example). This difference in the parse tree structure results in distinct hashes, generating different query\_id values.<br>
>
> *In the Datasentinel UI, different query\_id values result in separate queries, while the md5 method aggregates their metrics into one.*

*Example*

```sql
select /*test_queryid*/ * from pgbench_branches where bid = 10000000000000;
 bid | bbalance | filler 
-----+----------+--------
(0 rows)

select /*test_queryid*/ * from pgbench_branches where bid = 10;
 bid | bbalance  | filler 
-----+-----------+--------
  10 | 333100390 | 
(1 row)

select queryid, calls, query from pg_stat_statements where query ~* 'test_queryid';
       queryid       | calls |                             query                              
---------------------+-------+----------------------------------------------------------------
 3003798125598319662 |     1 | select /*test_queryid*/ * from pgbench_branches where bid = $1
 4063077833847830339 |     1 | select /*test_queryid*/ * from pgbench_branches where bid = $1
(2 rows)

select queryid, planid, calls, plan from pg_store_plans where queryid in (select queryid from pg_stat_statements where query ~* 'test_query');
       queryid       |       planid        | calls |                                              plan                                              
---------------------+---------------------+-------+------------------------------------------------------------------------------------------------
 3003798125598319662 | 8997990410952493373 |     1 | Index Scan using pgbench_branches_pkey on pgbench_branches  (cost=0.27..8.30 rows=1 width=364)+
                     |                     |       |   Index Cond: (bid = '10000000000000'::bigint)
 4063077833847830339 | 7082055953033278291 |     1 | Index Scan using pgbench_branches_pkey on pgbench_branches  (cost=0.27..8.30 rows=1 width=364)+
                     |                     |       |   Index Cond: (bid = 1)
(2 rows)

```
