Troubleshooting PostgreSQL Performance from Slow Queries

PostgreSQL Performance Troubleshooting with Slow Queries



Introduction  

If you are doing a very detailed Performance Diagnostics / Forensics then we strongly recommend you to understand the Data Access Path of underlying queries, cost of query execution, wait events / locks and system resource usage by PostgreSQL infrastructure operations. MinervaDB Performance Engineering Team measures performance by “Response Time” , So finding slow queries in PostgreSQL will be the most appropriate point to start this blog. PostgreSQL Server is highly configurable to collect details on query performance: slow query log, auditing execution plans with auto_explain and querying pg_stat_statements  . 

Using PostgreSQL slow query log to troubleshoot the performance

Step 1 – Open postgresql.conf file in your favorite text editor ( In Ubuntu, postgreaql.conf is available on /etc/postgresql/ ) and update configuration parameter log_min_duration_statement , By default configuration the slow query log is not active, To enable the slow query log on globally, you can change postgresql.conf:

log_min_duration_statement = 2000

In the above configuration, PostgreSQL will log queries, which take longer than 2 seconds.

Step 2 – A “reload” (by simply calling the SQL function) is sufficient, there is no need for a PostgreSQL server restart and Don’t worry, it won’t interrupt any active connections:

postgres=# SELECT pg_reload_conf();
 pg_reload_conf 
----------------
 t 
(1 row)

Note: It’s often too heavy for PostgreSQL infrastructure if you change slow query log settings in postgresql.conf , Therefore it makes more sensible to change only for a selected database or user:

postgres=# ALTER DATABASE minervadb SET log_min_duration_statement = 2000;
ALTER DATABASE

To complete the detailed performance forensics / diagnostics of high latency queries you can use aut0_explain , We have explained same below for queries exceeding certain threshold in PostgreSQL to send plan to the log file:

postgres=# LOAD 'auto_explain';
LOAD
postgres=# SET auto_explain.log_analyze TO on;
SET
postgres=# SET auto_explain.log_min_duration TO 2000;
SET

You can also enable auto explain in postgresql.conf with the settings below:

session_preload_libraries = 'auto_explain';

Note: Please do not forget to call pg_reload_conf() after the change made to postgresql.conf

More examples on PostgreSQL auto explain is copied below:

postgres=# CREATE TABLE minervdb_bench  AS
postgres-# SELECT * FROM generate_series(1, 10000000) AS id;
SELECT 10000000
postgres=# CREATE INDEX idx_id ON minervadb_bench(id);

postgres=# CREATE INDEX idx_id ON minervdb_bench(id);
CREATE INDEX
postgres=# ANALYZE;
ANALYZE

postgres=# LOAD 'auto_explain';
LOAD
postgres=# SET auto_explain.log_analyze TO on;
SET
postgres=# SET auto_explain.log_min_duration TO 200;
SET

postgres=# explain SELECT * FROM minervdb_bench  WHERE id < 5000;
                                      QUERY PLAN                                       
---------------------------------------------------------------------------------------
 Index Only Scan using idx_id on minervdb_bench  (cost=0.43..159.25 rows=4732 width=4)
   Index Cond: (id < 5000)
(2 rows)

postgres=# explain SELECT * FROM minervdb_bench  WHERE id < 200000;
                                        QUERY PLAN                                        
------------------------------------------------------------------------------------------
 Index Only Scan using idx_id on minervdb_bench  (cost=0.43..6550.25 rows=198961 width=4)
   Index Cond: (id < 200000)
(2 rows)

postgres=# explain SELECT count(*) FROM minervdb_bench GROUP BY id % 2;
                                     QUERY PLAN                                      
-------------------------------------------------------------------------------------
 GroupAggregate  (cost=1605360.71..1805360.25 rows=9999977 width=12)
   Group Key: ((id % 2))
   ->  Sort  (cost=1605360.71..1630360.65 rows=9999977 width=4)
         Sort Key: ((id % 2))
         ->  Seq Scan on minervdb_bench  (cost=0.00..169247.71 rows=9999977 width=4)
 JIT:
   Functions: 6
   Options: Inlining true, Optimization true, Expressions true, Deforming true
(8 rows)

Using pg_stat_statements

We can use pg_stat_statements to group the identical PostgreSQL queries by latency, To enable pg_stat_statements you have to add the following line to postgresql.conf and restart PostgreSQL server:

# postgresql.conf
shared_preload_libraries = 'pg_stat_statements'

pg_stat_statements.max = 10000
pg_stat_statements.track = all

Run “CREATE EXTENSION pg_stat_statements” in your database so that PostgreSQL will create a view for you:

postgres=# CREATE EXTENSION pg_stat_statements;
CREATE EXTENSION

postgres=# \d pg_stat_statements
                    View "public.pg_stat_statements"
       Column        |       Type       | Collation | Nullable | Default 
---------------------+------------------+-----------+----------+---------
 userid              | oid              |           |          | 
 dbid                | oid              |           |          | 
 queryid             | bigint           |           |          | 
 query               | text             |           |          | 
 calls               | bigint           |           |          | 
 total_time          | double precision |           |          | 
 min_time            | double precision |           |          | 
 max_time            | double precision |           |          | 
 mean_time           | double precision |           |          | 
 stddev_time         | double precision |           |          | 
 rows                | bigint           |           |          | 
 shared_blks_hit     | bigint           |           |          | 
 shared_blks_read    | bigint           |           |          | 
 shared_blks_dirtied | bigint           |           |          | 
 shared_blks_written | bigint           |           |          | 
 local_blks_hit      | bigint           |           |          | 
 local_blks_read     | bigint           |           |          | 
 local_blks_dirtied  | bigint           |           |          | 
 local_blks_written  | bigint           |           |          | 
 temp_blks_read      | bigint           |           |          | 
 temp_blks_written   | bigint           |           |          | 
 blk_read_time       | double precision |           |          | 
 blk_write_time      | double precision |           |          | 

postgres=#

pg_stat_statements view columns explained (Source: https://www.postgresql.org/docs/12/pgstatstatements.html)

Name Type References Description
userid oid pg_authid.oid OID of user who executed the statement
dbid oid pg_database.oid OID of database in which the statement was executed
queryid bigint Internal hash code, computed from the statement’s parse tree
query text Text of a representative statement
calls bigint Number of times executed
total_time double precision Total time spent in the statement, in milliseconds
min_time double precision Minimum time spent in the statement, in milliseconds
max_time double precision Maximum time spent in the statement, in milliseconds
mean_time double precision Mean time spent in the statement, in milliseconds
stddev_time double precision Population standard deviation of time spent in the statement, in milliseconds
rows bigint Total number of rows retrieved or affected by the statement
shared_blks_hit bigint Total number of shared block cache hits by the statement
shared_blks_read bigint Total number of shared blocks read by the statement
shared_blks_dirtied bigint Total number of shared blocks dirtied by the statement
shared_blks_written bigint Total number of shared blocks written by the statement
local_blks_hit bigint Total number of local block cache hits by the statement
local_blks_read bigint Total number of local blocks read by the statement
local_blks_dirtied bigint Total number of local blocks dirtied by the statement
local_blks_written bigint Total number of local blocks written by the statement
temp_blks_read bigint Total number of temp blocks read by the statement
temp_blks_written bigint Total number of temp blocks written by the statement
blk_read_time double precision Total time the statement spent reading blocks, in milliseconds (if track_io_timing is enabled, otherwise zero)
blk_write_time double precision Total time the statement spent writing blocks, in milliseconds (if track_io_timing is enabled, otherwise zero)

You can list queries by latency / Response Time in PostgreSQL  by querying pg_stat_statements:

postgres=# \x
Expanded display is on.

select query,calls,total_time,min_time,max_time,mean_time,stddev_time,rows from pg_stat_statements order by mean_time desc;

-[ RECORD 1 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
query       | SELECT count(*) FROM minervdb_bench GROUP BY id % $1
calls       | 6
total_time  | 33010.533078
min_time    | 4197.876021
max_time    | 6485.33594
mean_time   | 5501.755512999999
stddev_time | 826.3716429081501
rows        | 72
-[ RECORD 2 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
query       | CREATE INDEX idx_id ON minervdb_bench(id)
calls       | 1
total_time  | 4560.808456
min_time    | 4560.808456
max_time    | 4560.808456
mean_time   | 4560.808456
stddev_time | 0
rows        | 0
-[ RECORD 3 ]---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
query       | ANALYZE
calls       | 1
total_time  | 441.725223
min_time    | 441.725223
max_time    | 441.725223
mean_time   | 441.725223
stddev_time | 0
rows        | 0

-[ RECORD 4 ]--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------
query       | SELECT a.attname,                                                                                                                                                         
                                                                                                                                                +
            |   pg_catalog.format_type(a.atttypid, a.atttypmod),                                                                                                                        
                                                                                                                                                +
            |   (SELECT substring(pg_catalog.pg_get_expr(d.adbin, d.adrelid, $1) for $2)                                                                                                                                                                                                                                                +
            |    FROM pg_catalog.pg_attrdef d                                                                                                                                                                                                                                                                                           +
            |    WHERE d.adrelid = a.attrelid AND d.adnum = a.attnum AND a.atthasdef),                                                                                                                                                                                                                                                  +
            |   a.attnotnull,                                                                                                                                                                                                                                                                                                           +
            |   (SELECT c.collname FROM pg_catalog.pg_collation c, pg_catalog.pg_type t                                                                                                                                                                                                                                                 +
            |    WHERE c.oid = a.attcollation AND t.oid = a.atttypid AND a.attcollation <> t.typcollation) AS attcollation,                                                                                                                                                                                                             +
            |   a.attidentity,                                                                                                                                                                                                                                                                                                          +
            |   a.attgenerated                                                                                                                                                                                                                                                                                                          +
            | FROM pg_catalog.pg_attribute a                                                                                                                                                                                                                                                                                            +
            | WHERE a.attrelid = $3 AND a.attnum > $4 AND NOT a.attisdropped                                                                                                                                                                                                                                                            +
            | ORDER BY a.attnum
calls       | 4
total_time  | 1.053107
min_time    | 0.081565
max_time    | 0.721785
mean_time   | 0.26327675000000006
stddev_time | 0.2658756938743884
rows        | 86

If you already know the epicenter of the bottleneck is a particular query or event / time, you can reset statistics just before query / event to monitor the problematic components in the PostgreSQL performance, You can do that by just calling pg_stat_statements_reset() as copied below:

postgres= SELECT pg_stat_statements_reset();

Conclusion

Performance tuning is the process of optimizing PostgreSQL performance by streamlining the execution of multiple SQL statements. In other words, performance tuning simplifies the process of accessing and altering information contained by the database with the intention of improving query response times and database application operations.

 

About MinervaDB Corporation 88 Articles
Independent and vendor neutral consulting, support, remote DBA services and training for MySQL, MariaDB, Percona Server, PostgreSQL and ClickHouse with core expertize in performance, scalability and high availability . We are an virtual corporation, all of us work from home on multiple timezones and stay connected via Email, Skype, Google Hangouts, Phone and IRC supporting over 250 customers worldwide
UA-155183614-1