MySQL - MariaDB - ClickHouse - InnoDB - Galera Cluster - MySQL Support - MariaDB Support - MySQL Consulting - MariaDB Consulting - MySQL Remote DBA - MariaDB Remote DBA - Emergency DBA Support - Remote DBA - Database Migration - PostgreSQL - PostgreSQL Consulting - PostgreSQL Support - PostgreSQL Remote DBA

PostgreSQL DBA Daily Checklist

MinervaDB Corporation — Fri, 18 Dec 2020 20:01:34 +0000

The daily checklist of a PostgreSQL DBA

We often get this question, What are the most important things a PostgreSQL DBA should do to guarantee optimal performance and reliability, Do we have checklist for PostgreSQL DBAs to follow daily ? Since we are getting this question too often, Thought let’s note it as blog post and share with community of PostgreSQL ecosystem. The only objective this post is to share the information, Please don’t consider this as a run-book or recommendation from MinervaDB PostgreSQL support. We at MinervaDB are not accountable of any negative performance in you PostgreSQL performance with running these scripts in production database infrastructure of your business, The following is a simple daily checklist for PostgreSQL DBA:

Task 1: Check that all the PostgreSQL instances are up and operational:

pgrep -u postgres -fa -- -D

What if you have several instances of PostgreSQL are running:

pgrep -fa -- -D |grep postgres

Task 2: Monitoring PostgreSQL logsRecord PostgreSQL error logs: Open postgresql.conf configuration file, Under the ERROR REPORTING AND LOGGING section of the file, use following config parameters:

log_destination = 'stderr'
logging_collector = on
log_directory = 'pg_log'
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'
log_truncate_on_rotation = off
log_rotation_age = 1d
log_min_duration_statement = 0
log_connections = on
log_duration = on
log_hostname = on
log_timezone = 'UTC'

Save the postgresql.conf file and restart the postgres server.

sudo service postgresql restart

Task 3: Confirm PostgreSQL backup completed successfully

Use backup logs (possible only with PostgreSQL logical backup) to audit backup quality:

$ pg_dumpall > /backup-path/pg-backup-dump.sql > /var/log/postgres/pg-backup-dump.log

Task 4: Monitoring PostgreSQL Database Size:

select datname, pg_size_pretty(pg_database_size(datname)) from pg_database order by pg_database_size(datname);

Task 5: Monitor all PostgreSQL queries running (please repeat this task every 90 minutes during business / peak hours):

SELECT pid, age(clock_timestamp(), query_start), usename, query 
FROM pg_stat_activity 
WHERE query != '' AND query NOT ILIKE '%pg_stat_activity%' 
ORDER BY query_start desc;

Task 6: Inventory of indexes in PostgreSQL database:

select
    t.relname as table_name,
    i.relname as index_name,
    string_agg(a.attname, ',') as column_name
from
    pg_class t,
    pg_class i,
    pg_index ix,
    pg_attribute a
where
    t.oid = ix.indrelid
    and i.oid = ix.indexrelid
    and a.attrelid = t.oid
    and a.attnum = ANY(ix.indkey)
    and t.relkind = 'r'
    and t.relname not like 'pg_%'
group by  
    t.relname,
    i.relname
order by
    t.relname,
    i.relname;

Task 7: Finding the largest databases in your PostgreSQL cluster

SELECT d.datname as Name,  pg_catalog.pg_get_userbyid(d.datdba) as Owner,
    CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
        THEN pg_catalog.pg_size_pretty(pg_catalog.pg_database_size(d.datname))
        ELSE 'No Access'
    END as Size
FROM pg_catalog.pg_database d
    order by
    CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')
        THEN pg_catalog.pg_database_size(d.datname)
        ELSE NULL
    END desc -- nulls first
    LIMIT 20

Task 8: when you are suspecting some serious performance bottleneck in PostgreSQL ? Especially when you suspecting transactions blocking each other:

WITH RECURSIVE l AS (
  SELECT pid, locktype, mode, granted,
ROW(locktype,database,relation,page,tuple,virtualxid,transactionid,classid,objid,objsubid) obj
  FROM pg_locks
), pairs AS (
  SELECT w.pid waiter, l.pid locker, l.obj, l.mode
  FROM l w
  JOIN l ON l.obj IS NOT DISTINCT FROM w.obj AND l.locktype=w.locktype AND NOT l.pid=w.pid AND l.granted
  WHERE NOT w.granted
), tree AS (
  SELECT l.locker pid, l.locker root, NULL::record obj, NULL AS mode, 0 lvl, locker::text path, array_agg(l.locker) OVER () all_pids
  FROM ( SELECT DISTINCT locker FROM pairs l WHERE NOT EXISTS (SELECT 1 FROM pairs WHERE waiter=l.locker) ) l
  UNION ALL
  SELECT w.waiter pid, tree.root, w.obj, w.mode, tree.lvl+1, tree.path||'.'||w.waiter, all_pids || array_agg(w.waiter) OVER ()
  FROM tree JOIN pairs w ON tree.pid=w.locker AND NOT w.waiter = ANY ( all_pids )
)
SELECT (clock_timestamp() - a.xact_start)::interval(3) AS ts_age,
       replace(a.state, 'idle in transaction', 'idletx') state,
       (clock_timestamp() - state_change)::interval(3) AS change_age,
       a.datname,tree.pid,a.usename,a.client_addr,lvl,
       (SELECT count(*) FROM tree p WHERE p.path ~ ('^'||tree.path) AND NOT p.path=tree.path) blocked,
       repeat(' .', lvl)||' '||left(regexp_replace(query, 's+', ' ', 'g'),100) query
FROM tree
JOIN pg_stat_activity a USING (pid)
ORDER BY path;

Task 9: Identify bloated Tables in PostgreSQL :

WITH constants AS (
    -- define some constants for sizes of things
    -- for reference down the query and easy maintenance
    SELECT current_setting('block_size')::numeric AS bs, 23 AS hdr, 8 AS ma
),
no_stats AS (
    -- screen out table who have attributes
    -- which dont have stats, such as JSON
    SELECT table_schema, table_name, 
        n_live_tup::numeric as est_rows,
        pg_table_size(relid)::numeric as table_size
    FROM information_schema.columns
        JOIN pg_stat_user_tables as psut
           ON table_schema = psut.schemaname
           AND table_name = psut.relname
        LEFT OUTER JOIN pg_stats
        ON table_schema = pg_stats.schemaname
            AND table_name = pg_stats.tablename
            AND column_name = attname 
    WHERE attname IS NULL
        AND table_schema NOT IN ('pg_catalog', 'information_schema')
    GROUP BY table_schema, table_name, relid, n_live_tup
),
null_headers AS (
    -- calculate null header sizes
    -- omitting tables which dont have complete stats
    -- and attributes which aren't visible
    SELECT
        hdr+1+(sum(case when null_frac <> 0 THEN 1 else 0 END)/8) as nullhdr,
        SUM((1-null_frac)*avg_width) as datawidth,
        MAX(null_frac) as maxfracsum,
        schemaname,
        tablename,
        hdr, ma, bs
    FROM pg_stats CROSS JOIN constants
        LEFT OUTER JOIN no_stats
            ON schemaname = no_stats.table_schema
            AND tablename = no_stats.table_name
    WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
        AND no_stats.table_name IS NULL
        AND EXISTS ( SELECT 1
            FROM information_schema.columns
                WHERE schemaname = columns.table_schema
                    AND tablename = columns.table_name )
    GROUP BY schemaname, tablename, hdr, ma, bs
),
data_headers AS (
    -- estimate header and row size
    SELECT
        ma, bs, hdr, schemaname, tablename,
        (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
        (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
    FROM null_headers
),
table_estimates AS (
    -- make estimates of how large the table should be
    -- based on row and page size
    SELECT schemaname, tablename, bs,
        reltuples::numeric as est_rows, relpages * bs as table_bytes,
    CEIL((reltuples*
            (datahdr + nullhdr2 + 4 + ma -
                (CASE WHEN datahdr%ma=0
                    THEN ma ELSE datahdr%ma END)
                )/(bs-20))) * bs AS expected_bytes,
        reltoastrelid
    FROM data_headers
        JOIN pg_class ON tablename = relname
        JOIN pg_namespace ON relnamespace = pg_namespace.oid
            AND schemaname = nspname
    WHERE pg_class.relkind = 'r'
),
estimates_with_toast AS (
    -- add in estimated TOAST table sizes
    -- estimate based on 4 toast tuples per page because we dont have 
    -- anything better.  also append the no_data tables
    SELECT schemaname, tablename, 
        TRUE as can_estimate,
        est_rows,
        table_bytes + ( coalesce(toast.relpages, 0) * bs ) as table_bytes,
        expected_bytes + ( ceil( coalesce(toast.reltuples, 0) / 4 ) * bs ) as expected_bytes
    FROM table_estimates LEFT OUTER JOIN pg_class as toast
        ON table_estimates.reltoastrelid = toast.oid
            AND toast.relkind = 't'
),
table_estimates_plus AS (
-- add some extra metadata to the table data
-- and calculations to be reused
-- including whether we cant estimate it
-- or whether we think it might be compressed
    SELECT current_database() as databasename,
            schemaname, tablename, can_estimate, 
            est_rows,
            CASE WHEN table_bytes > 0
                THEN table_bytes::NUMERIC
                ELSE NULL::NUMERIC END
                AS table_bytes,
            CASE WHEN expected_bytes > 0 
                THEN expected_bytes::NUMERIC
                ELSE NULL::NUMERIC END
                    AS expected_bytes,
            CASE WHEN expected_bytes > 0 AND table_bytes > 0
                AND expected_bytes <= table_bytes
                THEN (table_bytes - expected_bytes)::NUMERIC
                ELSE 0::NUMERIC END AS bloat_bytes
    FROM estimates_with_toast
    UNION ALL
    SELECT current_database() as databasename, 
        table_schema, table_name, FALSE, 
        est_rows, table_size,
        NULL::NUMERIC, NULL::NUMERIC
    FROM no_stats
),
bloat_data AS (
    -- do final math calculations and formatting
    select current_database() as databasename,
        schemaname, tablename, can_estimate, 
        table_bytes, round(table_bytes/(1024^2)::NUMERIC,3) as table_mb,
        expected_bytes, round(expected_bytes/(1024^2)::NUMERIC,3) as expected_mb,
        round(bloat_bytes*100/table_bytes) as pct_bloat,
        round(bloat_bytes/(1024::NUMERIC^2),2) as mb_bloat,
        table_bytes, expected_bytes, est_rows
    FROM table_estimates_plus
)
-- filter output for bloated tables
SELECT databasename, schemaname, tablename,
    can_estimate,
    est_rows,
    pct_bloat, mb_bloat,
    table_mb
FROM bloat_data
-- this where clause defines which tables actually appear
-- in the bloat chart
-- example below filters for tables which are either 50%
-- bloated and more than 20mb in size, or more than 25%
-- bloated and more than 4GB in size
WHERE ( pct_bloat >= 50 AND mb_bloat >= 10 )
    OR ( pct_bloat >= 25 AND mb_bloat >= 1000 )
ORDER BY pct_bloat DESC;

Task 10: Identify bloated indexes in PostgreSQL :

-- btree index stats query
-- estimates bloat for btree indexes
WITH btree_index_atts AS (
    SELECT nspname, 
        indexclass.relname as index_name, 
        indexclass.reltuples, 
        indexclass.relpages, 
        indrelid, indexrelid,
        indexclass.relam,
        tableclass.relname as tablename,
        regexp_split_to_table(indkey::text, ' ')::smallint AS attnum,
        indexrelid as index_oid
    FROM pg_index
    JOIN pg_class AS indexclass ON pg_index.indexrelid = indexclass.oid
    JOIN pg_class AS tableclass ON pg_index.indrelid = tableclass.oid
    JOIN pg_namespace ON pg_namespace.oid = indexclass.relnamespace
    JOIN pg_am ON indexclass.relam = pg_am.oid
    WHERE pg_am.amname = 'btree' and indexclass.relpages > 0
         AND nspname NOT IN ('pg_catalog','information_schema')
    ),
index_item_sizes AS (
    SELECT
    ind_atts.nspname, ind_atts.index_name, 
    ind_atts.reltuples, ind_atts.relpages, ind_atts.relam,
    indrelid AS table_oid, index_oid,
    current_setting('block_size')::numeric AS bs,
    8 AS maxalign,
    24 AS pagehdr,
    CASE WHEN max(coalesce(pg_stats.null_frac,0)) = 0
        THEN 2
        ELSE 6
    END AS index_tuple_hdr,
    sum( (1-coalesce(pg_stats.null_frac, 0)) * coalesce(pg_stats.avg_width, 1024) ) AS nulldatawidth
    FROM pg_attribute
    JOIN btree_index_atts AS ind_atts ON pg_attribute.attrelid = ind_atts.indexrelid AND pg_attribute.attnum = ind_atts.attnum
    JOIN pg_stats ON pg_stats.schemaname = ind_atts.nspname
          -- stats for regular index columns
          AND ( (pg_stats.tablename = ind_atts.tablename AND pg_stats.attname = pg_catalog.pg_get_indexdef(pg_attribute.attrelid, pg_attribute.attnum, TRUE)) 
          -- stats for functional indexes
          OR   (pg_stats.tablename = ind_atts.index_name AND pg_stats.attname = pg_attribute.attname))
    WHERE pg_attribute.attnum > 0
    GROUP BY 1, 2, 3, 4, 5, 6, 7, 8, 9
),
index_aligned_est AS (
    SELECT maxalign, bs, nspname, index_name, reltuples,
        relpages, relam, table_oid, index_oid,
        coalesce (
            ceil (
                reltuples * ( 6 
                    + maxalign 
                    - CASE
                        WHEN index_tuple_hdr%maxalign = 0 THEN maxalign
                        ELSE index_tuple_hdr%maxalign
                      END
                    + nulldatawidth 
                    + maxalign 
                    - CASE /* Add padding to the data to align on MAXALIGN */
                        WHEN nulldatawidth::integer%maxalign = 0 THEN maxalign
                        ELSE nulldatawidth::integer%maxalign
                      END
                )::numeric 
              / ( bs - pagehdr::NUMERIC )
              +1 )
         , 0 )
      as expected
    FROM index_item_sizes
),
raw_bloat AS (
    SELECT current_database() as dbname, nspname, pg_class.relname AS table_name, index_name,
        bs*(index_aligned_est.relpages)::bigint AS totalbytes, expected,
        CASE
            WHEN index_aligned_est.relpages <= expected 
                THEN 0
                ELSE bs*(index_aligned_est.relpages-expected)::bigint 
            END AS wastedbytes,
        CASE
            WHEN index_aligned_est.relpages <= expected
                THEN 0 
                ELSE bs*(index_aligned_est.relpages-expected)::bigint * 100 / (bs*(index_aligned_est.relpages)::bigint) 
            END AS realbloat,
        pg_relation_size(index_aligned_est.table_oid) as table_bytes,
        stat.idx_scan as index_scans
    FROM index_aligned_est
    JOIN pg_class ON pg_class.oid=index_aligned_est.table_oid
    JOIN pg_stat_user_indexes AS stat ON index_aligned_est.index_oid = stat.indexrelid
),
format_bloat AS (
SELECT dbname as database_name, nspname as schema_name, table_name, index_name,
        round(realbloat) as bloat_pct, round(wastedbytes/(1024^2)::NUMERIC) as bloat_mb,
        round(totalbytes/(1024^2)::NUMERIC,3) as index_mb,
        round(table_bytes/(1024^2)::NUMERIC,3) as table_mb,
        index_scans
FROM raw_bloat
)
-- final query outputting the bloated indexes
-- change the where and order by to change
-- what shows up as bloated
SELECT *
FROM format_bloat
WHERE ( bloat_pct > 50 and bloat_mb > 10 )
ORDER BY bloat_mb DESC;

Task 11: Monitor blocked and blocking activities in PostgreSQL:

 SELECT blocked_locks.pid     AS blocked_pid,
         blocked_activity.usename  AS blocked_user,
         blocking_locks.pid     AS blocking_pid,
         blocking_activity.usename AS blocking_user,
         blocked_activity.query    AS blocked_statement,
         blocking_activity.query   AS current_statement_in_blocking_process
   FROM  pg_catalog.pg_locks         blocked_locks
    JOIN pg_catalog.pg_stat_activity blocked_activity  ON blocked_activity.pid = blocked_locks.pid
    JOIN pg_catalog.pg_locks         blocking_locks 
        ON blocking_locks.locktype = blocked_locks.locktype
        AND blocking_locks.database IS NOT DISTINCT FROM blocked_locks.database
        AND blocking_locks.relation IS NOT DISTINCT FROM blocked_locks.relation
        AND blocking_locks.page IS NOT DISTINCT FROM blocked_locks.page
        AND blocking_locks.tuple IS NOT DISTINCT FROM blocked_locks.tuple
        AND blocking_locks.virtualxid IS NOT DISTINCT FROM blocked_locks.virtualxid
        AND blocking_locks.transactionid IS NOT DISTINCT FROM blocked_locks.transactionid
        AND blocking_locks.classid IS NOT DISTINCT FROM blocked_locks.classid
        AND blocking_locks.objid IS NOT DISTINCT FROM blocked_locks.objid
        AND blocking_locks.objsubid IS NOT DISTINCT FROM blocked_locks.objsubid
        AND blocking_locks.pid != blocked_locks.pid

    JOIN pg_catalog.pg_stat_activity blocking_activity ON blocking_activity.pid = blocking_locks.pid
   WHERE NOT blocked_locks.granted;

Task 12: Monitoring PostgreSQL Disk I/O performance

-- perform a "select pg_stat_reset();" when you want to reset counter statistics
with 
all_tables as
(
SELECT  *
FROM    (
    SELECT  'all'::text as table_name, 
        sum( (coalesce(heap_blks_read,0) + coalesce(idx_blks_read,0) + coalesce(toast_blks_read,0) + coalesce(tidx_blks_read,0)) ) as from_disk, 
        sum( (coalesce(heap_blks_hit,0)  + coalesce(idx_blks_hit,0)  + coalesce(toast_blks_hit,0)  + coalesce(tidx_blks_hit,0))  ) as from_cache    
    FROM    pg_statio_all_tables  --> change to pg_statio_USER_tables if you want to check only user tables (excluding postgres's own tables)
    ) a
WHERE   (from_disk + from_cache) > 0 -- discard tables without hits
),
tables as 
(
SELECT  *
FROM    (
    SELECT  relname as table_name, 
        ( (coalesce(heap_blks_read,0) + coalesce(idx_blks_read,0) + coalesce(toast_blks_read,0) + coalesce(tidx_blks_read,0)) ) as from_disk, 
        ( (coalesce(heap_blks_hit,0)  + coalesce(idx_blks_hit,0)  + coalesce(toast_blks_hit,0)  + coalesce(tidx_blks_hit,0))  ) as from_cache    
    FROM    pg_statio_all_tables --> change to pg_statio_USER_tables if you want to check only user tables (excluding postgres's own tables)
    ) a
WHERE   (from_disk + from_cache) > 0 -- discard tables without hits
)
SELECT  table_name as "table name",
    from_disk as "disk hits",
    round((from_disk::numeric / (from_disk + from_cache)::numeric)*100.0,2) as "% disk hits",
    round((from_cache::numeric / (from_disk + from_cache)::numeric)*100.0,2) as "% cache hits",
    (from_disk + from_cache) as "total hits"
FROM    (SELECT * FROM all_tables UNION ALL SELECT * FROM tables) a
ORDER   BY (case when table_name = 'all' then 0 else 1 end), from_disk desc

References

☛ MinervaDB is trusted by top companies worldwide

The post PostgreSQL DBA Daily Checklist appeared first on The WebScale Database Infrastructure Operations Experts.

MinervaDB Athena 2020 -Role of Venture Capital Companies in building Database Systems Companies by Anandamoy Roychowdhary

MinervaDB Corporation — Tue, 15 Dec 2020 10:32:37 +0000

Role of Venture Capital Companies in building Database Systems Companies by Anandamoy Roychowdhary

Anandamoy Roychowdhary (a.k.a Roy) talked about how Venture Capital companies like Sequoia Capital can help you in building successfully Database Systems companies both strategically and technically. In this talk, Roy shared the most common metrics followed by Sequoia Capital to measure business from seed stage to the growth capital phase with more insights to how open source projects get evolved to wider adoption, community development and institutionalization of the business.

⊗ Talk from Roy on how venture capital companies can help you to build the Database Systems startups / business for scale

The post MinervaDB Athena 2020 -Role of Venture Capital Companies in building Database Systems Companies by Anandamoy Roychowdhary appeared first on The WebScale Database Infrastructure Operations Experts.

MinervaDB Athena 2020 – Profiling Linux Operations for Performance and Troubleshooting by Tanel Poder

MinervaDB Corporation — Mon, 14 Dec 2020 19:30:23 +0000

MinervaDB Athena 2020 – Profiling Linux Operations for Performance and

Troubleshooting by Tanel Poder

In Athena 2020, Tanel Poder talked about troubleshooting Linux operations performance based on the detailed forensics and evidence collection through0x.tools (Linux Process Snapper from Tanel), Which is a free, open source /proc file system sampling tool which annotates Linux thread handling activities more intuitively. The talk was really interesting for Open Source Database Systems folks who attended the conference because they spend most of their professional hours troubleshooting performance of their Database Infrastructure Operations for optimal performance and also in this talk Tanel’s approach was bottom-up (interpreting performance metrics with Linux probes) compared top-down approach (depending on Database Systems metadata views), You can download the PDF of the talk here

⊗ Profiling Linux Operations for Performance and vTroubleshooting by Tanel Poder

The post MinervaDB Athena 2020 – Profiling Linux Operations for Performance and Troubleshooting by Tanel Poder appeared first on The WebScale Database Infrastructure Operations Experts.

MinervaDB Athena 2020 Key note – Building Database Infrastructure for Performance and Reliability

MinervaDB Corporation — Sun, 13 Dec 2020 19:01:53 +0000

MinervaDB Athena 2020 Key note – Building Database Infrastructure for

Performance and Reliability

MinervaDB Athena 2020, Open Source WebScale Database Systems Infrastructure Operations Virtual Conference hosted by MinervaDB Inc. happened on Friday, 11 December 2020 (09:00 AM PST to 07:00 PM PST). My talk was “Building Database Infrastructure for Performance and Reliability“, This talk is about what it takes to build optimal and reliable Database Infrastructure Operations for WebScale. The talk covered topics like capacity planning / sizing, observability, performance optimization / tuning, scale-out / replication, sharding, database reliability engineering, database security etc. , So why I choose this topic for speaking ? For last several years when I talk to customers about building their MySQL / MariaDB / PostgreSQL for performance or reliability, The standardization of Database Infrastructure Operations at a scale was getting complex and quality of DBA ops. was getting compromised occasionally, this was also worrying me professionally so I decided to define a framework for optimal DBA Ops. This talk is about all the required components for building database infrastructure addressing performance and reliability. To conclude, I will take this opportunity to thank all attendees and speakers of MinervaDB Athena 2020, Looking forward to your support in the future conferences.

MinervaDB Athena 2020 Key note – Building Database Infrastructure for Performance and Reliability

The post MinervaDB Athena 2020 Key note – Building Database Infrastructure for Performance and Reliability appeared first on The WebScale Database Infrastructure Operations Experts.

Benchmarking MySQL 8.0 Performance on Amazon EC2

MinervaDB Corporation — Mon, 05 Oct 2020 18:24:14 +0000

MySQL 8.0 Performance Benchmarking on Amazon EC2

The scope of performance benchmarking

The core objective of this benchmarking exercise is to measure MySQL 8.0 performance, This include INSERTs , SELECTs and complex transaction processing (both INSERTs and SELECTs) without any tuning of MySQL 8 instance’s my.cnf. We agree tuning my.cnf will greatly improve performance but in this activity we wanted to benchmark MySQL 8 transaction processing capabilities and technically in MinervaDB we measure performance by Response Time and believe you can build high performance MySQL applications by writing optimal SQL. We have used Sysbench (https://github.com/MinervaDB/MinervaDB-Sysbench release 1.0.20) for this benchmarking activity. This is not a paid / sponsored benchmarking effort by any of the software or hardware vendors, We will remain forever an vendor neutral and independent web-scale database infrastructure operations company with core expertise in performance, scalability, high availability and database reliability engineering. You can download detailed copy of this benchmarking here

Note: This MySQL 8.0 performance benchmarking paper is published by MinervaDB Performance Engineering Team, You are free to copy the entire content for research and publishing without copyrighting the content. This document is distributed in the hope that it will be useful but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

☛ A low cost and instant gratification health check-up for your MySQL infrastructure operations from MinervaDB

Highly responsive and proactive MySQL performance health check-up, diagnostics and forensics.
Detailed report on your MySQL configuration, expensive SQL, index operations, performance, scalability and reliability.
Recommendations for building an optimal, scalable, highly available and reliable MySQL infrastructure operations.
Per MySQL instance performance audit, detailed report and recommendations.
Security Audit – Detailed Database Security Audit Report which includes the results of the audit and an actionable Compliance and Security Plan for fixing vulnerabilities and ensuring the ongoing security of your data.

** You are paying us only for the MySQL instance we have worked for :

MySQL Health Check-up	Rate ( plus GST / Goods and Services Tax where relevant )
MySQL infrastructure operations detailed health check-up, diagnostics report and recommendations	US $7,500 / MySQL instance

☛ MinervaDB contacts – Sales & General Inquiries

Business Function	Contact
CONTACT GLOBAL SALES (24*7)	(844) 588-7287 (USA) (415) 212-6625 (USA) (778) 770-5251 (Canada)
TOLL FREE PHONE (24*7)	(844) 588-7287
MINERVADB FAX	+1 (209) 314-2364
MinervaDB Email - General / Sales / Consulting	contact@minervadb.com
MinervaDB Email - Support	support@minervadb.com
MinervaDB Email -Remote DBA	remotedba@minervadb.com
Shiv Iyer Email - Founder and Principal	shiv@minervadb.com
CORPORATE ADDRESS: CALIFORNIA	MinervaDB Inc., 340 S LEMON AVE #9718 WALNUT 91789 CA, US
CORPORATE ADDRESS: DELAWARE	MinervaDB Inc., PO Box 2093 PHILADELPHIA PIKE #3339 CLAYMONT, DE 19703
CORPORATE ADDRESS: HOUSTON	MinervaDB Inc., 1321 Upland Dr. PMB 19322, Houston, TX 77043, US

The post Benchmarking MySQL 8.0 Performance on Amazon EC2 appeared first on The WebScale Database Infrastructure Operations Experts.

MinervaDB Database Platforms Virtual Conference 2020

MinervaDB Corporation — Wed, 02 Sep 2020 16:36:16 +0000

MinervaDB Database Platforms Virtual Conference 2020

Friday, 11 December 2020 (09:00 AM PST to 07:00 PM PST)

MinervaDB is excited to announce an full-stack Open Source Database Systems Conference – MinervaDB Database Platforms Virtual Conference 2020 focussing on MySQL, MariaDB, MyRocks, PostgreSQL, ClickHouse, NoSQL, Columnar Stores, Big Data, SRE and DevOps. addressing performance, scalability and reliability. This is a free virtual conference (hosted on GoToWebinar) and so you don’t have to plan any travels or leave your family / friends during this global pandemic. This conference is scheduled for Friday, 11 December 2020 (09:00 AM PST to 07:00 PM PST) and you can register (100% free) for the conference here. Our call for papers / talks / speakers will be open very soon. If you are interested to be on MinervaDB Database Platforms Virtual Conference 2020 conference committee , Please contact our Founder and Principal Shiv Iyer directly – shiv@minervadb.com and for event sponsorship / advertisements please contact sponsorship@minervadb.com

The post MinervaDB Database Platforms Virtual Conference 2020 appeared first on The WebScale Database Infrastructure Operations Experts.

MariaDB S3 Storage Engine – MariaDB 10.5.4 New Feature

MinervaDB Corporation — Sat, 08 Aug 2020 01:04:12 +0000

Introducing MariaDB S3 Storage Engine for database archiving and performance

MariaDB S3 storage engine

MariaDB S3 is a read only storage engine based on Aria code which allows you to archive MariaDB tables in Amazon S3 (on the size of s3_block_size) or any third-party public or private cloud that implements S3 API (of which there are many), but still have them accessible for reading in MariaDB. Internally, The S3 storage inherits from Aria code and hooks that change reads, So that instead if reading from local disk it reads from S3. We recommend S3 storage engine for our MariaDB 10.5 customers (if you are planning the update to MariaDB 10.5, We strongly recommend you to read our blogs MariaDB 10.5. New Features- Upgrade from MariaDB 10.4 to MariaDB 10.5 ) with very high volume tables which would become fairly inactive, but are still important so that they can not be removed. In that case, an option is to move such a table to an archiving service, which is accessible through an S3 API. So MariaDB S3 is a technically and commercially cost efficient storage engine built for MariaDB archiving.

Installing MariaDB S3 Storage Engine

The S3 storage engine is available from MariaDB 10.5.4.

[mysqld]
plugin-maturity = alpha

Note: S3 storage engine is currently in alpha maturity , So you cannot load by default S3 storage engine on MariaDB Server stable release ( this happens due to default value of system variable plugin_maturity ). To enable S3 storage engine you have to set plugin-maturity = alpha and restart the server.

Install plugin library:

INSTALL SONAME 'ha_s3';

How can you move data to S3 storage engine ?

If you want to move data from an existing table to S3 storage engine, Please run:

ALTER TABLE table_name ENGINE=S3

To move data back to a regular InnoDB table, Please run:

ALTER TABLE s3_table_name ENGINE=INNODB

MariaDB S3 storage engine performance

The following are some best practices to consider for optimal performance on MariaDB S3 storage engine:

MariaDB S3 Tables supports all the options (including ALTER TABLE) supported by Aria engine:
- ALTER TABLE
- DROP TABLE
- SELECT
- SHOW TABLES
- MariaDB S3 TABLES can also participate actively on Partitioning
s3_block_size is the default block size of the table, For most of the load the default size 4M is sufficient and we don’t recommend increasing this system variable
Be conservative about querying information_schema tables as S3 has to check if there is new tables in S3.
DROP statements on non existing tables are slower as S3 has to check if the table is in S3.
MariaDB S3 Tables can use COMPRESSION_ALGORITHM=zlib to reduce the amount of data transferred from S3 to the local cache.
If you are expecting high volume MariaDB S3 tables, Then we strongly recommend you to increase s3_pagecache_buffer_size ( sequel to innodb_buffer_pool_size of InnoDB and default value is 128M ) for optimal caching and better index handling.
MariaDB S3 Tables I/O performance also greatly depend on your connection speed to your S3 provider.

Conclusion

MariaDB powers database infrastructure for some of the largest planet-scale internet properties, This means handling database volume, performance, scalability, reliability, capacity planning / sizing and archiving are some of the most complex problems to solve. Thanks to MariaDB Server Engineering Team for coming up with S3 storage engine which is seamlessly addressing database archiving across multiple (vendor neutral and independent ) S3 providers.

☛ Want to engage MinervaDB for MariaDB Consulting, Enterprise-Class 24*7 Support and Remote DBA Services ?

Business Function	Contact
CONTACT GLOBAL SALES (24*7)	(844) 588-7287 (USA) (415) 212-6625 (USA) (778) 770-5251 (Canada)
TOLL FREE PHONE (24*7)	(844) 588-7287
MINERVADB FAX	+1 (209) 314-2364
MinervaDB Email - General / Sales / Consulting	contact@minervadb.com
MinervaDB Email - Support	support@minervadb.com
MinervaDB Email -Remote DBA	remotedba@minervadb.com
Shiv Iyer Email - Founder and Principal	shiv@minervadb.com
CORPORATE ADDRESS: CALIFORNIA	MinervaDB Inc., 340 S LEMON AVE #9718 WALNUT 91789 CA, US
CORPORATE ADDRESS: DELAWARE	MinervaDB Inc., PO Box 2093 PHILADELPHIA PIKE #3339 CLAYMONT, DE 19703
CORPORATE ADDRESS: HOUSTON	MinervaDB Inc., 1321 Upland Dr. PMB 19322, Houston, TX 77043, US

The post MariaDB S3 Storage Engine – MariaDB 10.5.4 New Feature appeared first on The WebScale Database Infrastructure Operations Experts.

MySQL Backup Strategies and Tools – MinervaDB Webinar

MinervaDB Corporation — Fri, 19 Jun 2020 08:58:44 +0000

MinervaDB Webinar – MySQL Backup Strategies and Tools

Most often Database Systems outages happen due to user error and it is also the biggest reason for data loss / damage or corruption. In these type of failures, It is application modifying or destroying the data on its own or through a user choice. Hardware failure also contributes to database infrastructure crashes and corruption. To address these sort of data reliability issues, you must recover and restore to the point in time before the corruption occurred. Disaster Recover tools returns the data to its original state at the cost of any other changes that were being made to the data since the point the corruption took place. MinervaDB founder and Principal, hosted a webinar (Thursday, June 18, 2020 – 06:00 PM to 07:00 PM PDT) on MySQL backup strategies and tools addressing the topics below:

Proactive MySQL DR – From strategy to execution
Building capacity for reliable MySQL DR
MySQL DR strategies
MySQL Backup tools
Managing MySQL DR Ops. for very large databases
Testing MySQL Backups
Biggest MySQL DR mistakes
MySQL DR Best Practices and Checklist

You can download the PDF (slides) of webinar here

☛ MinervaDB contacts for MySQL Database Backup and Database Reliability Engineering Services

Business Function	Contact
CONTACT GLOBAL SALES (24*7)	(844) 588-7287 (USA) (415) 212-6625 (USA) (778) 770-5251 (Canada)
TOLL FREE PHONE (24*7)	(844) 588-7287
MINERVADB FAX	+1 (209) 314-2364
MinervaDB Email - General / Sales / Consulting	contact@minervadb.com
MinervaDB Email - Support	support@minervadb.com
MinervaDB Email -Remote DBA	remotedba@minervadb.com
Shiv Iyer Email - Founder and Principal	shiv@minervadb.com
CORPORATE ADDRESS: CALIFORNIA	MinervaDB Inc., 340 S LEMON AVE #9718 WALNUT 91789 CA, US
CORPORATE ADDRESS: DELAWARE	MinervaDB Inc., PO Box 2093 PHILADELPHIA PIKE #3339 CLAYMONT, DE 19703
CORPORATE ADDRESS: HOUSTON	MinervaDB Inc., 1321 Upland Dr. PMB 19322, Houston, TX 77043, US

The post MySQL Backup Strategies and Tools – MinervaDB Webinar appeared first on The WebScale Database Infrastructure Operations Experts.

MySQL Backup and Disaster Recovery Webinar

MinervaDB Corporation — Fri, 12 Jun 2020 00:43:11 +0000

MySQL Backup and Disaster Recovery Webinar

(Thursday, June 18, 2020 – 06:00 PM to 07:00 PM PDT)

There can be several reasons for a MySQL database outage: hardware failure, power outage, human error, natural disaster etc. We may not be able prevent all the disaster from happening but investing on a robust disaster recovery plan is very important for building fault-tolerant database infrastructure operations on MySQL. Every MySQL DBA is accountable for developing a disaster recovery plan addressing data sensitivity, data loss tolerance and data security. Join Shiv Iyer, Founder and Principal of MinervaDB to lean about the best practices for building highly reliable MySQL DR strategy and operations on Thursday, June 18, 2020 – 06:00 PM to 07:00 PM PDT. Building DR for a high traffic MySQL database infrastructure means deep understanding of multiple backup strategies and choosing optimal ones which are best suited for performance and reliability. Most of the data intensive MySQL infrastructure will have a combination of multiple backup methods and tools, In this webinar Shiv talks about his experiences in the past and present on building MySQL DR Ops, tools and zero tolerance data loss methods.

Join this webinar to learn more about:

Proactive MySQL DR – From strategy to execution
Building capacity for reliable MySQL DR
MySQL DR strategies
MySQL Backup tools
Managing MySQL DR Ops. for very large databases
Testing MySQL Backups
Biggest MySQL DR mistakes
MySQL DR Best Practices and Checklist

The post MySQL Backup and Disaster Recovery Webinar appeared first on The WebScale Database Infrastructure Operations Experts.

MySQL Backup Strategies – Building MySQL DR Solutions

MinervaDB Corporation — Thu, 11 Jun 2020 03:16:56 +0000

MySQL Backup Strategies – What you should know before considering MySQL DR solutions ?

MySQL powers all the major internet properties, Which include Google, Facebook, Twitter, LinkedIn, Uber etc. So how do we plan for MySQL disaster recovery and what are the most common MySQL DR tools used today for building highly reliable database infrastructure operations ? There can be several reasons for a MySQL database outage: hardware failure, power outage, human error, natural disaster etc. We may not be able prevent all the disaster from happening but investing on a robust disaster recovery plan is very important for building fault-tolerant database infrastructure operations on MySQL. Every MySQL DBA is accountable for developing a disaster recovery plan addressing data sensitivity, data loss tolerance and data security. Functionally you have several database backup strategies available with MySQL:

Full backup – Full backup backs up the whole database, This also include transaction log so that the full database can be recovered after a full database backup is restored. Full database backups represent the database at the time the backup finished. Full backups are storage resource intensive and takes more time to finish. If you have for a large database, we strongly recommend to supplement a full database backup with a series of differential database backups.
Differential backup – A differential backup is based on the most recent, previous full data backup. A differential backup captures only the data that has changed since that full backup. The full backup upon which a differential backup is based is known as the base of the differential. Full backups, except for copy-only backups, can serve as the base for a series of differential backups, including database backups, partial backups, and file backups. The base backup for a file differential backup can be contained within a full backup, a file backup, or a partial backup. The differential backups are most recommended when the subset of a database is modified more frequently than the rest of the database.
Incremental backup – A incremental backup contains all changes to the data since the last backup. Both differential and incremental backup does only backing up changed files. But they differ significantly in how they do it, and how useful the result is.while an incremental backup only includes the data that has changed since the previous backup, a differential backup contains all of the data that has changed since the last full backup. The advantage that differential backup offers over incremental backups is a shorter restore time. Because, the backup has to be reconstituted from the last full backup and all the incremental backups since.

MySQL Backup tools

The following are the list of MySQL backup tools (logical and physical) discussed in this blog:

mysqldump

mysqldump is a MySQL client utility which can be used to perform logical backups , The mysqldump generate output in SQL ( default and most commonly used to reproduce MySQL schema objects and data), CSV, other delimited text or XML format. We have copied below the restrictions of mysqldump:

mysqldump does not dump performance_schema or sys schema be default. To enforce dumping or logical backup of any of these schema objects, You have to explicitly mention them –databases option or if you want to just dump performance_schema use –skip-lock-tables option.
mysqldump does not dump the INFORMATION_SCHEMA schema.
mysqldump does not dump the InnoDB CREATE TABLESPACE statements.
mysqldump does not dump the NDB Cluster ndbinfo information database.
mysqldump includes statements required to recreate the general_log and slow_query_log tables for dumps of the mysql database. But, Log table contents are not dumped

Script to dump all the databases:

shell> mysqldump --all-databases > all_databases.sql

Script to dump the entire database:

shell> mysqldump db_name > db_name_dump.sql

Script to dump several databases with one command:

shell> mysqldump --databases db_name1 [db_name2 ...] > databases_dump.sql

mysqlpump

The mysqlpump is another client utility for logical backup of MySQL database like mysqldump which is capable of parallel processing of databases and other schema objects with databases to perform high performance dumping process, We listed below mysqlpump most compelling features:

MySQL dump with parallel processing capabilities for databases and other objects within databases.
MySQL user accounts will be dumped as account-management statements (CREATE USER, GRANT) rather than as inserts into the mysql system database.
By using mysqlpump you can create a compressed output.
Much faster compared to mysqldump. Because, The InnoDB secondary indexes are created after rows are inserted to the table.

P.S. – We have blogged about “How to use mysqlpump for faster MySQL logical backup ? ” here

MySQL Enterprise Backup

MySQL Enterprise Backup is a hot / online backup tool for MySQL ( optimized for InnoDB only though capable of backup and restore of tables created on other storage engines supported by MySQL ) capable of performing full, incremental and differential backup. MySQL Enterprise Backup also support cloud storage backup, backup encryption and compression. We have explained most compelling MySQL Enterprise Backup 8.0 features below:

Transparent page compression for InnoDB.
Backup history available for all members of Group Replication by making sure backup_history table is updated on primary node after each mysqlbackup operation.
Storage engine of the mysql.backup_history table on a backed-up server has switched from CSV to InnoDB.
mysqlbackup now supports encrypted InnoDB undo logs .
mysqlbackup now supports high performance incremental backup by setting page tracking functionality on MySQL (set –incremental=page-track).
Much better MySQL Enterprise Backup 8.0 troubleshooting with now mysqlbackup prints a stack trace after being terminated by a signal.
Selective restores of tables or schema from full backup for Table-Level Recovery (TLR)

Percona XtraBackup

Percona XtraBackup is an open source MySQL hot backup solution from Percona addressing incremental, fast, compressed and secured backup for InnoDB optimally. Most of our customers users Percona XtraBackup for DR of their MySQL infrastructure, The following features makes Percona XtraBackup obvious choice for MySQL Backup and DR:

Hot backup solution for InnoDB without blocking / locking transaction processing.
Point-in-time recovery for InnoDB.
MySQL incremental backup support.
Percona XtraBackup supports incremental compressed backups.
High performance streaming backup support for InnoDB.
Parallel backup and copy-back support for faster backup and restoration.
Secondary indexes defragmentation support for InnoDB.
Percona XtraBackup support rsync to minimize locking.
Track Percona XtraBackup history with Backup history table.
Percona XtraBackup supports offline backup.

Conclusion

We always recommend a combination of multiple backup strategies / tools for maximum data reliability and optimal restoration. We cannot have a common backup strategy for all the customers, It depends on factors like infrastructure, MySQL distribution, database size, SLA etc. Backups are most import components in database infrastructure operations and we follow zero tolerance DR for building highly available and fault-tolerant MySQL infrastructure.

Do you want to engage MinervaDB Remote DBA for MySQL Disaster Recovery (DR) and Database Reliability Engineering (Data SRE) ?

Business Function	Contact
CONTACT GLOBAL SALES (24*7)	(844) 588-7287 (USA) (415) 212-6625 (USA) (778) 770-5251 (Canada)
TOLL FREE PHONE (24*7)	(844) 588-7287
MINERVADB FAX	+1 (209) 314-2364
MinervaDB Email - General / Sales / Consulting	contact@minervadb.com
MinervaDB Email - Support	support@minervadb.com
MinervaDB Email -Remote DBA	remotedba@minervadb.com
Shiv Iyer Email - Founder and Principal	shiv@minervadb.com
CORPORATE ADDRESS: CALIFORNIA	MinervaDB Inc., 340 S LEMON AVE #9718 WALNUT 91789 CA, US
CORPORATE ADDRESS: DELAWARE	MinervaDB Inc., PO Box 2093 PHILADELPHIA PIKE #3339 CLAYMONT, DE 19703
CORPORATE ADDRESS: HOUSTON	MinervaDB Inc., 1321 Upland Dr. PMB 19322, Houston, TX 77043, US

The post MySQL Backup Strategies – Building MySQL DR Solutions appeared first on The WebScale Database Infrastructure Operations Experts.