Using multi-threaded zstd with tar on Linux

A way to get a high compression ratio with faster compression speeds

Introduction

When it comes to compressing files, zstd offers a high compression ratio and fast decompression. In my experience packing Linux tarballs and archiving my old codebases, zstd gave me by far the best compression ratio.

In my use, zstd shrinks binary and text files by roughly 50-70% or more. Audio and video files may barely shrink at all: if a file is already encoded with a codec that compresses it, there is little redundancy left for zstd to remove.
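To see the difference between compressible and already-compressed data, here is a minimal sketch that compresses two throwaway files and prints how much each one shrank. The file names are made up for the example, and the random bytes are only a stand-in for encoded audio or video, so treat the numbers as an illustration rather than a benchmark.

```shell
#!/bin/sh
set -eu

# Scratch directory, removed again when the script exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Highly repetitive text: lots of redundancy for zstd to remove.
yes 'the quick brown fox jumps over the lazy dog' | head -c 1000000 > "$work/text.txt"
# Random bytes stand in for already-compressed media: almost no redundancy left.
head -c 1000000 /dev/urandom > "$work/random.bin"

for f in "$work/text.txt" "$work/random.bin"; do
    # -q silences progress output, -k keeps the input, -o names the output file.
    zstd -q -19 -k "$f" -o "$f.zst"
    before=$(wc -c < "$f")
    after=$(wc -c < "$f.zst")
    echo "$(basename "$f"): $before -> $after bytes ($(( 100 - after * 100 / before ))% saved)"
done
```

The text file should shrink to a tiny fraction of its size, while the random file stays at about its original size.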

Let's get into the command.

The command

tar -I 'zstd -T$(nproc) -19' -cpf target.tar.zstd target_dir

Options

  • -I sets the compression program for tar

  • 'zstd -T$(nproc) -19' is the zstd compression program, where -T sets the number of threads to use, -19 sets the compression level, and $(nproc) expands to the number of processing units available to the current process (as the nproc help says)

  • -cpf: c puts tar in create mode, p preserves the permissions of the target directory/file, and f defines the archive file name

This tells zstd to use as many threads as there are processing units, at compression level 19 (regular levels run from 1 to 19; levels 20 to 22 require --ultra), which gives multithreaded zstd compression.
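The same command can be tried safely on a throwaway directory first. This is a minimal sketch, assuming GNU tar and zstd are installed; the sample directory, the file inside it, and the threads and level variables are hypothetical names picked for the example.

```shell
#!/bin/sh
set -eu

# Scratch directory, removed again when the script exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical sample directory standing in for target_dir.
mkdir "$work/target_dir"
seq 1 20000 > "$work/target_dir/numbers.txt"

threads=$(nproc)   # number of processing units available
level=19           # highest regular zstd level

# Double quotes let the shell substitute the variables before tar sees the command.
tar -I "zstd -T$threads -$level" -cpf "$work/target.tar.zstd" -C "$work" target_dir

# List the archive contents to confirm it is readable.
tar -tf "$work/target.tar.zstd"
```

zstd also accepts -T0, which asks it to detect the number of cores itself, so 'zstd -T0 -19' is an alternative if you would rather not call nproc.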

Decompression

Decompression doesn't need any special arguments, because GNU tar detects the compression format when it reads an archive. Here is a simple decompression command.

tar -xpf target.tar.zstd -C target_dir

Just make sure the directory you are decompressing into exists; if it doesn't, create it with mkdir target_dir.
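A full round trip can be sketched like this, assuming a GNU tar recent enough to recognise zstd archives on its own (1.31 or newer, as far as I know). The src and restore_dir names are made up for the example.

```shell
#!/bin/sh
set -eu

# Scratch directory, removed again when the script exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Build a small archive to extract (hypothetical sample data).
mkdir "$work/src"
echo 'hello zstd' > "$work/src/note.txt"
tar -I 'zstd -T0' -cpf "$work/target.tar.zstd" -C "$work" src

# mkdir -p creates the destination if needed and does nothing if it already exists.
mkdir -p "$work/restore_dir"
tar -xpf "$work/target.tar.zstd" -C "$work/restore_dir"

# Show the restored file to confirm the round trip worked.
cat "$work/restore_dir/src/note.txt"
```

On an older tar that does not detect zstd automatically, passing the program explicitly with tar -I zstd -xpf should do the same job.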


Advanced

You can use other tar arguments for advanced usage, like --exclude to skip a directory or -v to list each file as it is processed.

Here is a command I wrote to compress the complete Linux filesystem with tar and zstd:

tar \
    -I 'zstd -T8 -19' \
    --exclude={/dev/*,/sys/*,/proc/*,/run/*} \
    --exclude=/backup.tar.zstd \
    -cvpf /backup.tar.zstd /

This gave a ~72% compression ratio. Although I didn't exclude caches or media files, it is still by far the best result I have seen.
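Before pointing a command like this at /, it is worth checking on a small tree that the exclude patterns do what you expect. Here is a minimal sketch with a hypothetical miniature tree (root/etc and root/cache are invented for the example), using a low compression level so it runs quickly.

```shell
#!/bin/sh
set -eu

# Scratch directory, removed again when the script exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical miniature tree: one directory to keep, one to skip.
mkdir -p "$work/root/etc" "$work/root/cache"
echo 'keep me' > "$work/root/etc/app.conf"
echo 'skip me' > "$work/root/cache/blob.tmp"

# Exclude the contents of cache/ but keep the empty directory itself,
# mirroring the /dev/* style patterns of the full-system command.
tar -I 'zstd -T0 -3' --exclude='root/cache/*' \
    -cpf "$work/backup.tar.zstd" -C "$work" root

# The listing should contain root/etc/app.conf but not blob.tmp.
tar -tf "$work/backup.tar.zstd"
```

Keeping the directory but not its contents is handy for places like /proc and /sys, since the empty mount points are still there after a restore.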

zstd also supports even higher compression with levels 20 to 22 via --ultra, but at the cost of 100% CPU usage and much more memory. Just don't combine multithreading with ultra compression if you have a potato CPU.
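If you still want to try the ultra levels on a modest machine, one option is to leave a core free. This is only a sketch under my own assumptions: the "all cores minus one" rule and the sample directory are made up for the example, not a zstd recommendation.

```shell
#!/bin/sh
set -eu

# Scratch directory, removed again when the script exits.
work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Hypothetical sample directory standing in for target_dir.
mkdir "$work/target_dir"
seq 1 50000 > "$work/target_dir/numbers.txt"

# Leave one core free so the machine stays responsive; never go below one thread.
cores=$(nproc)
threads=$(( cores > 1 ? cores - 1 : 1 ))

# Levels 20 to 22 are only accepted together with --ultra.
tar -I "zstd --ultra -20 -T$threads" -cpf "$work/ultra.tar.zstd" -C "$work" target_dir

# Show the resulting archive and its size.
ls -l "$work/ultra.tar.zstd"
```

Without --ultra, zstd warns and falls back to level 19, so the flag has to be part of the -I string.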