You may notice that some parts of an application suddenly perform very slowly, and the database may be the culprit. When you face performance issues in PostgreSQL, you need query execution statistics — how many times a query was executed, its total and average execution time, and so on — so that you can analyze those queries and clear up the bottlenecks.
While doing performance optimization, it is also very important to identify the long-running queries on the database server.
PostgreSQL provides a module (extension) called pg_stat_statements that covers all of these requirements.
What is pg_stat_statements?
pg_stat_statements is a shared library that is preloaded into the server in order to add this functionality to PostgreSQL.
Once loaded, pg_stat_statements tracks planning and execution statistics of ALL SQL statements executed on the server.
You will need to configure the pg_stat_statements module or extension so that you can easily use the required statistics for performance tuning.
pg_stat_statements in brief
When pg_stat_statements is active, it tracks statistics across all databases on the server. To access and manipulate these statistics, the module provides the views pg_stat_statements and pg_stat_statements_info and the utility functions pg_stat_statements_reset and pg_stat_statements.
The pg_stat_statements View
The statistics gathered by the pg_stat_statements module are made available via a view named pg_stat_statements. This view contains one row for each distinct combination of database ID, user ID, query ID and whether it's a top-level statement or not.
Please see the PostgreSQL documentation for pg_stat_statements for a description of the view's columns.
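As an illustration, a few of the most commonly used columns can be read like this (the column names assume PostgreSQL 13 or later, where total_time was split into total_exec_time and mean_exec_time):

```sql
-- A quick look at the most frequently used columns of the view.
-- Column names assume PostgreSQL 13+ (total_exec_time, mean_exec_time).
SELECT queryid,
       calls,            -- number of times the statement was executed
       total_exec_time,  -- total execution time, in milliseconds
       mean_exec_time,   -- average execution time per call
       rows              -- total number of rows retrieved or affected
FROM pg_stat_statements
ORDER BY calls DESC
LIMIT 5;
```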
The pg_stat_statements_info View
The statistics of the pg_stat_statements module itself are tracked and made available via a view named pg_stat_statements_info. This view contains only a single row.
dealloc : Total number of times pg_stat_statements entries about the least-executed statements were deallocated because more distinct statements than pg_stat_statements.max were observed.
stats_reset : Time at which all statistics in the pg_stat_statements view were last reset.
Please see the PostgreSQL documentation for pg_stat_statements_info for a description of the view's columns.
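Since the view holds only one row, it can simply be inspected directly (pg_stat_statements_info is available in PostgreSQL 14 and later):

```sql
-- Check how often entries were evicted and when statistics were last reset.
-- Available in PostgreSQL 14 and later.
SELECT dealloc, stats_reset FROM pg_stat_statements_info;
```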
Configuration Parameters
You will need to set the configuration parameters in the postgresql.conf file. After changing postgresql.conf, you will need to restart the PostgreSQL service for the changes to take effect.
Please see the PostgreSQL documentation for the Configuration Parameters for more details.
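As a sketch, a typical postgresql.conf fragment for this module might look like the following; the values shown are illustrative, not recommendations:

```
# postgresql.conf -- example values, adjust for your workload
shared_preload_libraries = 'pg_stat_statements'  # requires a server restart
pg_stat_statements.max = 10000     # max number of distinct statements tracked
pg_stat_statements.track = all     # top / all / none
pg_stat_statements.track_utility = on  # also track utility commands
pg_stat_statements.save = on       # persist statistics across server restarts
```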
Functions
pg_stat_statements_reset discards statistics gathered so far by pg_stat_statements corresponding to the specified userid, dbid and queryid. By default, this function can only be executed by superusers.
Please see the PostgreSQL documentation for the Functions for more details.
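For example, all statistics can be reset at once, or — assuming a hypothetical database named appdb — only the entries of that one database can be discarded, since a 0 argument means "match all":

```sql
-- Reset everything:
SELECT pg_stat_statements_reset();

-- Reset only the entries of one database ('appdb' is a placeholder name);
-- passing 0 for userid and queryid means "match all".
SELECT pg_stat_statements_reset(
         0,
         (SELECT oid FROM pg_database WHERE datname = 'appdb'),
         0);
```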
Who can access and see the content of pg_stat_statements?
The pg_stat_statements view contains sensitive information (the text of executed queries), so to see the statistics of queries executed by other users you must be a superuser or a member of the pg_read_all_stats role. Other users can still query the view, but the query text of statements they did not execute is hidden.
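For instance, a monitoring account can be given full read access by granting it membership in the built-in role (alice below is a hypothetical user name):

```sql
-- Grant membership in the predefined pg_read_all_stats role (PostgreSQL 10+);
-- 'alice' is a placeholder user name.
GRANT pg_read_all_stats TO alice;
```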
Steps to configure and enable the pg_stat_statements module:
- First, install the pg_stat_statements extension:
CREATE EXTENSION pg_stat_statements;
Once the extension is installed, it starts logging the required query execution information, which you can read through the pg_stat_statements view.
- Select the data of pg_stat_statements:
SELECT * FROM pg_stat_statements;
- If you get the error below, you need to change a few parameters in the postgresql.conf file:
ERROR: pg_stat_statements must be loaded via shared_preload_libraries
You can resolve this error by setting the following parameters in postgresql.conf. After changing postgresql.conf, you need to restart the PostgreSQL service:
pg_stat_statements.max = 10000
pg_stat_statements.track = all
shared_preload_libraries = 'pg_stat_statements'
- After setting the parameters in postgresql.conf and restarting the server, check the data in pg_stat_statements again using
SELECT * FROM pg_stat_statements;
Parameter                | Description
pg_stat_statements.max   | Sets the maximum number of statements tracked by pg_stat_statements.
pg_stat_statements.track | Selects which statements are tracked by pg_stat_statements.
shared_preload_libraries | Lists shared libraries to preload into the server.
Reset pg_stat_statements:
You can reset the statistics collected by pg_stat_statements using the function below:
SELECT * FROM pg_stat_statements_reset();
How to disable pg_stat_statements?
To disable pg_stat_statements, drop the extension and remove it from shared_preload_libraries in postgresql.conf (a server restart is required for the latter):
DROP EXTENSION pg_stat_statements;
BigAnimal settings for pg_stat_statements
By default, on BigAnimal the extension is already installed, and the configuration is already set.
You can change the parameter values using the BigAnimal portal. Note that the cluster is restarted only when you change the pg_stat_statements.max value.
Scripts
Top 10 I/O-intensive queries (blk_read_time and blk_write_time are populated only when track_io_timing is enabled)
SELECT
pss.userid::regrole,
pd.datname,
pss.query
FROM pg_stat_statements AS pss
INNER JOIN pg_database AS pd
ON pss.dbid = pd.oid
ORDER BY (pss.blk_read_time + pss.blk_write_time)/pss.calls DESC
LIMIT 10;
Top 10 time-consuming queries
SELECT
pss.userid::regrole,
pd.datname,
pss.query
FROM pg_stat_statements as pss
INNER JOIN pg_database AS pd
ON pss.dbid = pd.oid
ORDER BY pss.min_exec_time DESC
LIMIT 10;
SELECT
pss.userid::regrole,
pd.datname,
pss.query
FROM pg_stat_statements as pss
INNER JOIN pg_database AS pd
ON pss.dbid = pd.oid
ORDER BY pss.total_exec_time DESC
LIMIT 10;
Top 10 response-time outliers
SELECT
pss.userid::regrole,
pd.datname,
pss.query
FROM pg_stat_statements as pss
INNER JOIN pg_database AS pd
ON pss.dbid = pd.oid
ORDER BY pss.stddev_exec_time DESC
LIMIT 10;
Top 10 queries by memory usage
SELECT
pss.userid::regrole,
pd.datname,
pss.query
FROM pg_stat_statements as pss
INNER JOIN pg_database AS pd
ON pss.dbid = pd.oid
ORDER BY (pss.shared_blks_hit + pss.shared_blks_dirtied) DESC
LIMIT 10;
Top 10 consumers of temporary space
SELECT
pss.userid::regrole,
pd.datname,
pss.query
FROM pg_stat_statements as pss
INNER JOIN pg_database AS pd
ON pss.dbid = pd.oid
ORDER BY pss.temp_blks_written DESC
LIMIT 10;
Top 10 long executed queries
SELECT
pd.datname,
pss.query AS SQLQuery,
pss.rows AS TotalRowCount,
(pss.total_exec_time / 1000 / 60) AS TotalMinute,
((pss.total_exec_time / 1000 / 60)/calls) as TotalAverageTime
FROM pg_stat_statements AS pss
INNER JOIN pg_database AS pd
ON pss.dbid = pd.oid
ORDER BY 4 DESC
LIMIT 10;
List queries with the total number of calls, total rows, and buffer cache hit percentage:
SELECT
pd.datname,
pss.query,
pss.calls,
pss.total_exec_time,
pss.rows,
100.0 * pss.shared_blks_hit / nullif(pss.shared_blks_hit + pss.shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements AS pss
INNER JOIN pg_database AS pd
ON pss.dbid = pd.oid
ORDER BY pss.total_exec_time DESC
LIMIT 10;
The most hit and read objects:
SELECT
pd.datname,
pss.query,
pss.calls,
pss.total_exec_time,
pss.rows,
100.0 * pss.shared_blks_hit / nullif(pss.shared_blks_hit + pss.shared_blks_read, 0) AS hit_percent
FROM pg_stat_statements AS pss
INNER JOIN pg_database AS pd
ON pss.dbid = pd.oid
ORDER BY (pss.shared_blks_hit + pss.shared_blks_read) DESC
LIMIT 10;
List queries by total_time to see which queries spend the most time in the database:
SELECT
pd.datname,
round(pss.total_exec_time*1000)/1000 AS total_time,
pss.query
FROM pg_stat_statements AS pss
INNER JOIN pg_database AS pd
ON pss.dbid = pd.oid
ORDER BY pss.total_exec_time DESC
LIMIT 10;
List queries on a per-execution basis, to sample them over time:
SELECT
pd.datname,
pss.queryid,
pss.query,
pss.calls,
pss.total_exec_time/pss.calls AS avg_exec_time_ms,
pss.rows/pss.calls AS avg_rows,
pss.temp_blks_read/pss.calls AS avg_temp_blks_read,
pss.temp_blks_written/pss.calls AS avg_temp_blks_written
FROM pg_stat_statements AS pss
INNER JOIN pg_database AS pd
ON pss.dbid = pd.oid
WHERE pss.calls != 0
ORDER BY pss.total_exec_time DESC
LIMIT 10;