Deprecated: Function Yoast\WP\SEO\Conditionals\Schema_Blocks_Conditional::get_feature_flag is deprecated since version Yoast SEO 20.5 with no alternative available. in /home1/minerho3/public_html/wp-includes/functions.php on line 6078

Deprecated: Function Yoast\WP\SEO\Conditionals\Schema_Blocks_Conditional::get_feature_flag is deprecated since version Yoast SEO 20.5 with no alternative available. in /home1/minerho3/public_html/wp-includes/functions.php on line 6078

Deprecated: Function Yoast\WP\SEO\Conditionals\Schema_Blocks_Conditional::get_feature_flag is deprecated since version Yoast SEO 20.5 with no alternative available. in /home1/minerho3/public_html/wp-includes/functions.php on line 6078

Warning: Cannot modify header information - headers already sent by (output started at /home1/minerho3/public_html/wp-includes/functions.php:6078) in /home1/minerho3/public_html/wp-includes/rest-api/class-wp-rest-server.php on line 1831

Warning: Cannot modify header information - headers already sent by (output started at /home1/minerho3/public_html/wp-includes/functions.php:6078) in /home1/minerho3/public_html/wp-includes/rest-api/class-wp-rest-server.php on line 1831

Warning: Cannot modify header information - headers already sent by (output started at /home1/minerho3/public_html/wp-includes/functions.php:6078) in /home1/minerho3/public_html/wp-includes/rest-api/class-wp-rest-server.php on line 1831

Warning: Cannot modify header information - headers already sent by (output started at /home1/minerho3/public_html/wp-includes/functions.php:6078) in /home1/minerho3/public_html/wp-includes/rest-api/class-wp-rest-server.php on line 1831

Warning: Cannot modify header information - headers already sent by (output started at /home1/minerho3/public_html/wp-includes/functions.php:6078) in /home1/minerho3/public_html/wp-includes/rest-api/class-wp-rest-server.php on line 1831

Warning: Cannot modify header information - headers already sent by (output started at /home1/minerho3/public_html/wp-includes/functions.php:6078) in /home1/minerho3/public_html/wp-includes/rest-api/class-wp-rest-server.php on line 1831

Warning: Cannot modify header information - headers already sent by (output started at /home1/minerho3/public_html/wp-includes/functions.php:6078) in /home1/minerho3/public_html/wp-includes/rest-api/class-wp-rest-server.php on line 1831

Warning: Cannot modify header information - headers already sent by (output started at /home1/minerho3/public_html/wp-includes/functions.php:6078) in /home1/minerho3/public_html/wp-includes/rest-api/class-wp-rest-server.php on line 1831
{"id":3684,"date":"2020-04-29T20:43:07","date_gmt":"2020-04-29T20:43:07","guid":{"rendered":"http:\/\/minervadb.com\/?p=3684"},"modified":"2020-04-29T20:46:34","modified_gmt":"2020-04-29T20:46:34","slug":"when-postgresql-vacuum-wont-remove-dead-rows-from-a-table","status":"publish","type":"post","link":"https:\/\/minervadb.com\/index.php\/2020\/04\/29\/when-postgresql-vacuum-wont-remove-dead-rows-from-a-table\/","title":{"rendered":"When PostgreSQL vacuum won’t remove dead rows from a Table\u00a0"},"content":{"rendered":"

When PostgreSQL vacuum won’t remove dead rows from a Table ?<\/span><\/h1>\n
\n

What is VACUUM in PostgreSQL ?<\/h4>\n

In PostgreSQL, whenever rows in a table deleted, The existing row or tuple is marked as dead ( will not be physically removed) and during an update, it marks corresponding exiting tuple as dead and inserts a new tuple so in PostgreSQL UPDATE operations = DELETE + INSERT. These dead tuples consumes unnecessary storage and eventually you have a bloated PostgreSQL database, This is a serious issue to solve for a PostgreSQL DBA. VACUUM reclaims in the storage occupied by dead tuples, Please keep this noted that reclaimed storage space is never given back to the resident operating system rather they are just defragmented within the same database page and so storage for reuse to future data inserts in the same table. Does the pain stops here ? No, It doesn\u2019t. Bloat seriously affect the PostgreSQL query performance, In PostgreSQL tables and indexes are stored as array of fixed-size pages ( usually 8KB in size). Whenever a query requests for rows, the PostgreSQL instance loads these pages into the memory and dead rows causes expensive disk I\/O during data loading.<\/p>\n

How to monitor autovacuum has processed bloated tables ?<\/h4>\n

If you are suspecting bloated tables in your PostgreSQL infrastructure, The \u00a0first thing to check is vacuum has processed those bloated table. We use following script to collect last processed vacuum data:<\/p>\n

SELECT schemaname, relname, n_live_tup, n_dead_tup, last_autovacuum\r\nFROM pg_stat_all_tables\r\nORDER BY n_dead_tup\r\n    \/ (n_live_tup\r\n       * current_setting('autovacuum_vacuum_scale_factor')::float8\r\n          + current_setting('autovacuum_vacuum_threshold')::float8)\r\n     DESC\r\nLIMIT 10;\r\n<\/pre>\n

But sometimes you can see vacuum is run recently and still didn’t free the dead tuples:<\/p>\n

 schemaname |    relname   | n_live_tup | n_dead_tup |   last_autovacuum\r\n-----------+--------------+------------+------------+---------------------\r\n ad_ops_dB  | ad_clicks    |      96100 |      96100 | 2020-04-18 16:33:47\r\n pg_catalog | pg_attribute |         11 |        259 |\r\n pg_catalog | pg_amop      |        193 |        81  |\r\n pg_catalog | pg_class     |         61 |         29 |\r\n pg_catalog | pg_type      |         39 |         14 |\r\n pg_catalog | pg_index     |          8 |         21 |\r\n pg_catalog | pg_depend    |       7349 |        962 |\r\n pg_catalog | pg_trigger   |          6 |         37 |\r\n pg_catalog | pg_proc      |        469 |         87 |\r\n pg_catalog | pg_shdepend  |         18 |         11 |\r\n(10 rows)<\/pre>\n

So it’s evident now that sometimes PostgreSQL vacuum is not removing dead rows.<\/p>\n

When PostgreSQL Vacuum won’t remove the dead rows ?<\/h4>\n

Transactions in PostgreSQL are identified with xid<\/span><\/a>\u00a0 ( transaction or “xact” ID), PostgreSQL will assign a transaction xid only if it starts modifying the data because it’s only from that point where other process need to start tracking it changes, These are not applicable for read only transactions.<\/p>\n

We have \u00a0copied \u00a0below the PostgreSQL data structure ( from proc.c<\/a><\/span> ) that handle transactions:<\/p>\n

typedef struct PGXACT\r\n{\r\n    TransactionId xid;   \/* id of top-level transaction currently being\r\n                          * executed by this proc, if running and XID\r\n                          * is assigned; else InvalidTransactionId *\/\r\n\r\n    TransactionId xmin;  \/* minimal running XID as it was when we were\r\n                          * starting our xact, excluding LAZY VACUUM:\r\n                          * vacuum must not remove tuples deleted by\r\n                          * xid >= xmin ! *\/\r\n\r\n    ...\r\n} PGXACT;<\/pre>\n

PostgreSQL Vacuum removes only the dead rows that are not in use anymore. A tuple is considered not needed when transaction ID of the deleting transaction is older than oldest transaction which is still active in the PostgreSQL database. The vacuum processes calculate the minimum boundary of data that they need to retain by tracking the minimum of xmins of all active transactions.<\/p>\n

The following are three situations which holds back xmin horizon in a PostgreSQL infrastructure:<\/p>\n

1. Long running transactions<\/h4>\n

You can find the details of long running queries and their respective xmin values form the query copied below:<\/p>\n

SELECT pid, datname, usename, state, backend_xmin\r\nFROM pg_stat_activity\r\nWHERE backend_xmin IS NOT NULL\r\nORDER BY age(backend_xmin) DESC;<\/pre>\n

P.S. If you think those transactions are no longer required, Please use pg_terminate_backend()\u00a0<\/a><\/span>\u00a0to terminate PostgreSQL sessions blocking Vacuum processes<\/p>\n

2. Abandoned replication slots<\/h4>\n

In PostgreSQL a replication slot is a data structure to control PostgreSQL from deleting the data that are still required by a standby server to catch-up with the primary database instance. If ever the replication to a \u00a0standby server \/ slave is delayed or slave PostgreSQL instance goes down for longer duration then replication slot will prevent vacuum from deleting old records \/ rows. To monitor replication slots and their relative xmin value please use the query below<\/p>\n

SELECT slot_name, slot_type, database, xmin\r\nFROM pg_replication_slots\r\nORDER BY age(xmin) DESC;\r\n<\/pre>\n

P.S. – To drop replication slots that are no longer needed, please use the function pg_drop_replication_slot<\/a>( )<\/span><\/p>\n

3. Orphaned prepared transactions<\/h4>\n

In a two-phase commit PostgreSQL database infrastructure, a distributed transaction is first prepared using PREPARE<\/a><\/span> statement and later committed with COMMIT PREPARED<\/a><\/span> statement. To monitor all prepared transactions and their respective xmin value please run the query below:<\/p>\n

SELECT gid, prepared, owner, database, transaction AS xmin\r\nFROM pg_prepared_xacts\r\nORDER BY age(transaction) DESC;<\/pre>\n

P.S. – We recommend ROLLBACK PREPARED<\/a><\/span> SQL statement to remove prepared transactions<\/p>\n

Conclusion<\/h4>\n

PostgreSQL Autovacuum addresses table bloating efficiently but there are situations where vacuum does’t work as expected so we strongly recommend to regularly check on how vacuum is processing the bloated tables
\n\"\"<\/a><\/p>\n

Reference links<\/p>\n