Details
- Improvement
- Status: Done
- High
- Resolution: Fixed
Description
Currently the "--compression" option is hidden: it is not shown in "--help" output, but it is still accepted. It supports "gzip" (a zlib-family algorithm) and "none", with gzip as the current default.
(The backup/restore code also implements lz4 and snappy compression, but these are not exposed through the --compression option at the moment.)
Because a pbm-agent writes a single archive file per replica set, compression is applied to a single stream. The current implementation uses the default gzip library, which is single-threaded: it uses one core, unlike a library such as pigz, which parallelizes compression of a single stream across multiple cores.
The CPU bottleneck of the gzip ("zlib-deflate") algorithm currently restricts dump speed significantly. Keeping the same compression as mongodump was attractive because, in theory, it allows mongorestore to be used on pbm backup files.
Task
- Find compression libraries that parallelize across CPU cores to replace the ones currently used
- Start compression pipes with parallelism set to use only half the CPUs, because a backup should not be allowed to monopolize the CPU
- Test which algorithm is best and make it the new default
Issue Links
- relates to PBM-452 Add compression ratio to pbm-speed-test (Done)