Maximizing rsync throughput in 2 easy commands

rsync is an excellent tool for copying files between Linux systems. However, it can’t yet run multiple copy processes in parallel. If you are limited by how fast you can read filesystem metadata (i.e. list the files), or you have a very fast network connection and lots of cores on both servers, you can significantly speed up a copy by running several rsync processes in parallel. For example, one process might copy files at perhaps 50MB/sec, but with a 16-core server, a 10gbps network connection and a fast SSD array you can copy data at 1GB/sec (gigabytes). Here’s how:

Firstly, you need to get ssh set up so you can connect between the machines without using a password. Even if you are copying between two remote systems and you use ssh-agent key forwarding (which I highly recommend), the agent can become a significant bottleneck when many connections are opened in parallel, so it’s best to generate a new key on the source system:

ssh-keygen -f rsync

Hit enter when it prompts for a passphrase so that the key is generated without needing a password to open it. This will create two files: rsync, which is your private key, and rsync.pub, which you want to add to your authorized keys on the remote host using something like:

ssh-copy-id -i rsync.pub user@remote_host

You should then be able to ssh without needing a password by doing:

ssh -i rsync user@remote_host

Next, we need to go to the remote host and allow lots of ssh sessions to be opened at once. Open up /etc/ssh/sshd_config on remote_host and append or change these lines, then reload sshd so that they take effect:

MaxSessions 100
MaxStartups 100

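Now, on your source host, run the following command in the directory where you generated the key so that rsync uses it. The variable has to be exported so that the rsync processes started later can see it, and the key path has to be absolute because we change directory before copying:

export RSYNC_RSH="ssh -i $PWD/rsync"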

Now for the good stuff – first we need to mirror the directory structure but not the files:

rsync -za --include='*/' --exclude='*' /local/path/ user@remote_host:/remote/path/

And now we can do the copy in parallel (you might need to install the parallel command using something like apt-get install parallel):

cd /local/path/; find -L . -type f | parallel -j 30 rsync -za {} user@remote_host:/remote/path/{}

# To exclude some files append as many of these as you want to the find command: \! -name file_to_exclude
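
Before starting a large copy, it can help to run the find part on its own and look at what it would hand to parallel. The sketch below wraps it in a hypothetical helper, list_files_to_copy, and excludes *.tmp and *.log files purely as an illustration; neither the helper name nor those patterns come from the setup described here.

```shell
# Hypothetical helper: list the files that would be handed to parallel.
# $1 is the local source directory; the excluded patterns are examples only.
list_files_to_copy() {
    # Run in a subshell so the caller's working directory is left alone.
    (
        cd "$1" || exit 1
        find -L . -type f \! -name '*.tmp' \! -name '*.log'
    )
}
```

Running list_files_to_copy /local/path | wc -l then gives the number of files that would be copied.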

This will copy 30 files in parallel, using compression. Play with the exact number, but 1.5 times the number of cores in your boxes should be enough. You can monitor the disk bandwidth with iostat -mx or the network throughput with a tool like iptraf. One of those, or the CPU usage, should now be saturated, and your copy should be going as fast as is physically possible. You can re-run this afterwards to synchronise even quicker than a normal rsync; however, because each rsync process only ever sees a single file, you won’t be able to use it to delete files that have been removed from the source.
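
The 1.5 times cores rule of thumb can be computed rather than hard-coded. This is a minimal sketch: parallel_jobs is a hypothetical helper name, and nproc (from GNU coreutils) is assumed to be available to report the local core count.

```shell
# Hypothetical helper: 1.5 x the given core count, rounded down.
# With no argument it falls back to what nproc reports for this machine.
parallel_jobs() {
    cores=${1:-$(nproc)}
    # Integer arithmetic: multiply before dividing so the half is not lost.
    echo $(( cores * 3 / 2 ))
}
```

The result can then stand in for the fixed 30, for example by passing -j "$(parallel_jobs)" to parallel.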

Running lots of postgres commands in parallel

Postgres is great, however one limitation is that you can only run one command at a time in a psql session. Sometimes, however, when you are doing administrative work over multiple tables (for example (re)creating indexes or vacuuming) and you have a nice powerful box, you can run many of these commands in parallel for an easy speedup. Here’s an easy way to do it.

Firstly, create a text file with one command per line. For example:

vacuum full a;
vacuum full b;
vacuum full c;
vacuum full d;

Then, ensure that you have your .pgpass file set up correctly so that you can just run psql [database] [user] without being prompted for a password.

Finally, run the following command:

xargs -d "\n" -n 1 -P 20 psql database_name username -c < list_of_commands.txt

-P 20 specifies the number of jobs to run in parallel so change this to what your server can cope with.
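
Writing the command file by hand gets tedious with many tables. One possible way to generate it, sketched here, is a small shell function that turns table names into commands. make_command_list is a hypothetical name, and the table names are assumed to be plain identifiers that need no quoting.

```shell
# Hypothetical helper: read table names, one per line, from stdin and
# print one "vacuum full" command per name. Blank lines are skipped.
# Assumes the names are plain identifiers that need no quoting.
make_command_list() {
    while IFS= read -r table; do
        if [ -n "$table" ]; then
            printf 'vacuum full %s;\n' "$table"
        fi
    done
}
```

For example, printf '%s\n' a b c d | make_command_list > list_of_commands.txt produces the four-line file from the example earlier in this post.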

Finding the amount of space wasted in postgres tables

Because of the way that postgres handles transaction isolation (i.e. using MVCC), when you modify or delete a row in a table it marks the old row version as deleted, and then frees the space at a later point in time using (auto)vacuum. However, unless you use the heavy-weight VACUUM FULL command (which takes an exclusive lock on the table and totally rewrites it, causing anything trying to access it to block until the command is finished), the space is generally not returned to the operating system: a plain vacuum only makes it available for reuse within the table, and can hand back only empty pages at the very end of the file. Normally this is not a problem. If you have a heavily used table with 20MB of data in it, it probably has 5-10MB of overhead from dead rows, reclaimed free space and so on, which is acceptable. However, there are a few situations where it is useful to know exactly what the overhead is:

  1. Sometimes if your table changes very quickly, is large, and your disks or autovacuum parameters are unable to keep up, it can end up growing massively. For example, we had a table that contained 3GB of data but was taking up 45GB because autovacuum couldn’t keep up with the frequency of changes in the table.
  2. If you are using table partitioning to store historic data, then to make the best use of space you want to see whether a VACUUM FULL would be advantageous to run or not. For example, if you have a table that records data collected each day, on some days it may be mostly just inserts and so doesn’t need vacuuming; on other days it may have a number of changes made and so have quite a lot of free space that can be reclaimed. Additionally, VACUUM FULL rewrites the table compactly and rebuilds its indexes, which can make it more performant.
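
To put a number on the first case, the share of a table’s file that is not live data follows directly from the two sizes. A minimal sketch, using a hypothetical helper name and whole-number sizes in the same unit:

```shell
# Hypothetical helper: percentage of a table's on-disk size that is not
# live data. $1 = live data size, $2 = size on disk (same unit, integers).
overhead_percent() {
    echo $(( ($2 - $1) * 100 / $2 ))
}

overhead_percent 3 45   # the table from case 1: prints 93
```

That figure lumps together dead rows and already-reclaimed free space, so it says how much could be saved, not which of the two is responsible.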

In the first case, looking at the output of a command like

SELECT
    psut.relname,
    to_char(psut.last_vacuum, 'YYYY-MM-DD HH24:MI') as last_vacuum,
    to_char(psut.last_autovacuum, 'YYYY-MM-DD HH24:MI') as last_autovacuum,
    pg_class.reltuples::bigint AS n_tup,
    psut.n_dead_tup::bigint AS dead_tup,
    CASE WHEN pg_class.reltuples > 0 THEN
        (psut.n_dead_tup / pg_class.reltuples * 100)::int
    ELSE 0
    END AS perc_dead,
    CAST(current_setting('autovacuum_vacuum_threshold') AS bigint) + (CAST(current_setting('autovacuum_vacuum_scale_factor') AS numeric) * pg_class.reltuples) AS av_threshold,
    CASE WHEN CAST(current_setting('autovacuum_vacuum_threshold') AS bigint) + (CAST(current_setting('autovacuum_vacuum_scale_factor') AS numeric) * pg_class.reltuples) < psut.n_dead_tup THEN
        '*'
    ELSE ''
    END AS expect_av
FROM pg_stat_user_tables psut
    JOIN pg_class on psut.relid = pg_class.oid
ORDER BY 5 desc, 4 desc;

(sorry I can’t remember where I found this) should show you that there are a very large number of dead tuples waiting to be reclaimed (i.e. turned into free space) in the table.

However, if your disks were struggling at one point, but then you tweaked autovacuum so it reclaimed the dead tuples correctly (as in case 1 above), your table could now be 90% free space but there is no easy way to find this out within postgres.

Fortunately, there is an excellent extension called pgstattuple (enable it once per database with CREATE EXTENSION pgstattuple) which allows you to find out the amount of free space within a table file that has been reclaimed but not released to the operating system. The following query, which relies on the pgstattuple_approx function available from PostgreSQL 9.5, lists all tables which are over 100MB in size, have more than 10MB of free space and have more than 20% free space (you can tweak these numbers; I just chose them for our platform, where our typical table size is 1GB+):

select
    table_schema,
    table_name,
    free_percent,
    pg_size_pretty( free_space ) AS space_free,
    pg_size_pretty( pg_relation_size( quoted_name ) ) AS total_size
from (
    select
        table_schema,
        table_name,
        quoted_name,
        space_stats.approx_free_percent AS free_percent,
        space_stats.approx_free_space AS free_space
    from
        ( select *,
            quote_ident( table_schema ) || '.' || quote_ident( table_name ) AS quoted_name
            from information_schema.tables
            where
                table_type = 'BASE TABLE' and table_schema not in ('information_schema', 'pg_catalog')
                and pg_relation_size( quote_ident( table_schema ) || '.' || quote_ident( table_name ) ) > 100000000
        ) t, pgstattuple_approx( quoted_name ) AS space_stats
) t
where
    free_percent > 20
    AND free_space > 10000000
ORDER BY free_space DESC;

This only uses an approximate count, but even so it can be a bit slow (it just took 10 minutes here) on a system with many tables and heavy IO. You can use it to find the tables that would most benefit from a VACUUM FULL command being run.