# Query Identifier

This approach was beneficial for older PostgreSQL versions, prior to the introduction of the native `query_id` field in the [`pg_stat_activity`](https://www.postgresql.org/docs/current/monitoring-stats.html#MONITORING-PG-STAT-ACTIVITY-VIEW) view and the [`compute_query_id`](https://postgresqlco.nf/doc/en/param/compute_query_id/) parameter (available since PostgreSQL 14).

It enables mapping entries between `pg_stat_activity` and `pg_stat_statements` to build the database [workload of active sessions](/manual/features/key-features/session-history.md), and remains beneficial for recent versions.

Starting with [Agent 3.9.0](https://docs.datasentinel.io/manual/features/tips-and-hints/pages/MAP2YFfUJ0hfgplw8cAI#v3.9.0), you can switch to using PostgreSQL’s native `queryid` as the query identifier.

A new configuration parameter, **`use-postgres-queryid`**, allows switching from the md5-based identifier to the internal `queryid` computed by PostgreSQL. This parameter is **disabled** by default.

{% hint style="info" %}
The existing **query\_md5\_id** tag is preserved in the UI.

When this option is enabled, it contains the PostgreSQL **queryid** value instead of an **md5 hash**.
{% endhint %}

This option also provides a workaround for a rare issue in **`pg_stat_statements`** where an invalid query text encoding (for example, containing a null byte) can trigger the following error when collecting metrics from the extension.

```
ERROR: invalid byte sequence for encoding "UTF8": 0x00
```

{% hint style="warning" %}
If you are satisfied with the MD5 approach, there is no need to switch.
{% endhint %}

***

## Enabling the parameter

The parameter can be configured at multiple levels:

* **Instance level**, to enable it for a specific monitored instance

```bash
# Using Agent CLI
datasentinel update connection "<<CONNECTION_NAME>>" use-postgres-queryid on
# restart agent
datasentinel stop agent
datasentinel start agent
```

or modify agent configuration file `$HOME/.datasentinel/agent.yml`

```yaml
# Connection configuration (stop agent first)
    use-postgres-queryid: true  
```

* **Agent level**, in which case it applies to **all instances monitored by that agent**

```bash
# Using Agent CLI
datasentinel set use-postgres-queryid on
# restart agent
datasentinel stop agent
datasentinel start agent
```

or modify agent configuration file `$HOME/.datasentinel/agent.yml`

```yaml
# File configuration section (stop agent first)
agent:
    use-postgres-queryid: true                                               
```

* **Platform level** when using the **Agentless** **feature,** in which case it applies to all monitored instances.

{% hint style="info" %}
Available since version 2026.02.
{% endhint %}

{% code title="/datasentinel/soft/datasentinel\_backend/config/datasentinel.yml" %}

```bash
agent:
 use-postgres-queryid: true  
```

{% endcode %}

***

## Points of Attention

There are situations to be aware of when using PostgreSQL queryids with Datasentinel (non-exhaustive):

* **`query_id`** missing from **`pg_stat_activity`** even when the backend `state` is `active`

> The **`query_id`** is not available in **`pg_stat_activity`** during the initial Parsing and Analysis phases, or if the backend blocks on a lock during these steps.\
> This occurs because the id is computed and reported only after the analysis phase successfully completes.\
> \
> \&#xNAN;*If the query\_id is not available for a query, Datasentinel computes an md5 hash after removing literal values from the query text.*

*Example*

```sql
## Session 1: Exclusive lock
pgbench=# begin;lock table pgbench_branches in exclusive mode;
BEGIN
LOCK TABLE
pgbench=*# 
```

```sql
## Session 2: Waiting
pgbench=# update pgbench_branches set bbalance=1 where bid=10;

```

```sql
### pg_stat_activity
postgres=# select query_id, state, pid , wait_event_type, wait_event, query 
from pg_stat_activity where query ~* 'pgbench_branches';
       query_id       |        state        |  pid   | wait_event_type | wait_event |                                                          query                                                           
----------------------+---------------------+--------+-----------------+------------+--------------------------------------------------------------------------------------------------------------------------
                      | active              | 214446 | Lock            | relation   | update pgbench_branches set bbalance=1 where bid=10;
  8734522334460367230 | idle in transaction | 214326 | Client          | ClientRead | lock table pgbench_branches in exclusive mode;

```

* Multiple query IDs for queries that look the same

> PostgreSQL computes the **`query_id`** by hashing the internal structure of a parsed statement. This fingerprinting process is sensitive to how the parser classifies data types, particularly **numeric literals**, which can lead to different **`query_id`** values for queries that appear structurally identical.
>
> When the PostgreSQL parser processes a query containing numbers, it categorizes them based on their magnitude and format.
>
> Even though both queries perform a comparison against a number, the internal representation of that number can differ (integer vs bigint for example). This difference in the parse tree structure results in distinct hashes, generating different query\_id values.<br>
>
> *In the Datasentinel UI, different query\_id values result in separate queries, while the md5 method aggregates their metrics into one.*

*Example*

```sql
select /*test_queryid*/ * from pgbench_branches where bid = 10000000000000;
 bid | bbalance | filler 
-----+----------+--------
(0 rows)

select /*test_queryid*/ * from pgbench_branches where bid = 10;
 bid | bbalance  | filler 
-----+-----------+--------
  10 | 333100390 | 
(1 row)

select queryid, calls, query from pg_stat_statements where query ~* 'test_queryid';
       queryid       | calls |                             query                              
---------------------+-------+----------------------------------------------------------------
 3003798125598319662 |     1 | select /*test_queryid*/ * from pgbench_branches where bid = $1
 4063077833847830339 |     1 | select /*test_queryid*/ * from pgbench_branches where bid = $1
(2 rows)

select queryid, planid, calls, plan from pg_store_plans where queryid in (select queryid from pg_stat_statements where query ~* 'test_query');
       queryid       |       planid        | calls |                                              plan                                              
---------------------+---------------------+-------+------------------------------------------------------------------------------------------------
 3003798125598319662 | 8997990410952493373 |     1 | Index Scan using pgbench_branches_pkey on pgbench_branches  (cost=0.27..8.30 rows=1 width=364)+
                     |                     |       |   Index Cond: (bid = '10000000000000'::bigint)
 4063077833847830339 | 7082055953033278291 |     1 | Index Scan using pgbench_branches_pkey on pgbench_branches  (cost=0.27..8.30 rows=1 width=364)+
                     |                     |       |   Index Cond: (bid = 1)
(2 rows)

```


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.datasentinel.io/manual/features/tips-and-hints/query-identifier.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
