[nSLUG] tar + compression
George N. White III
gnwiii at gmail.com
Tue Jan 25 15:04:57 AST 2011
On Tue, Jan 25, 2011 at 11:28 AM, Peter Dobcsanyi <petrus at ftml.net> wrote:
> Using tar with various compression methods.
> System: Ubuntu 10.10
> CPU: Pentium(R) 4 CPU 3.20GHz
> source: Django's mercurial repo with build/ and built docs/ (i.e. mainly text)
Large repositories are tricky: a single archive is painful to deal
with. I'd prefer an archive of smaller compressed chunks, so you can
extract a portion of the archive more quickly.
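
A rough sketch of what I mean by chunks, assuming one gzip-compressed
tar per top-level entry of the tree is fine-grained enough. The function
name and the output directory are made up for illustration. Note that
the shell glob skips hidden entries such as .hg, so those would need
separate handling.

```shell
#!/bin/sh
# Sketch only: write one .tar.gz per top-level entry of SRC_DIR.
# Usage: archive_chunks SRC_DIR OUT_DIR   (hypothetical helper)
archive_chunks() {
    src=$1
    out=$2
    mkdir -p "$out" || return 1
    for entry in "$src"/*; do
        # If SRC_DIR is empty the glob stays literal; skip it.
        [ -e "$entry" ] || continue
        name=$(basename "$entry")
        # -C makes member paths relative to SRC_DIR.
        tar -czf "$out/$name.tar.gz" -C "$src" "$name" || return 1
    done
}
```

With that, something like `archive_chunks django /tmp/d.chunks` followed
by `tar -xzf /tmp/d.chunks/docs.tar.gz` would unpack only docs/ and never
touch the other chunks.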
> tar commands with running times:
> tar cf /tmp/d.tar django 0.10s user 0.90s system 29% cpu 3.404 total
> tar czf /tmp/d.tar.gz django 16.95s user 1.78s system 101% cpu 18.387 total
> tar cjf /tmp/d.tar.bz2 django 81.88s user 2.26s system 100% cpu 1:23.57 total
> tar cJf /tmp/d.tar.xz django 166.25s user 3.80s system 100% cpu 2:48.91 total
What about memory requirements? There is a parallel xz implementation
that might help when running on newer multi-core CPUs.
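
As a sketch of the parallel route, assuming an xz new enough to have the
-T (threads) option, which xz 5.2 and later provide; older systems would
need a separate parallel tool instead. The function name is made up.
Expect threaded compression to use more memory than a single-threaded
run.

```shell
#!/bin/sh
# Sketch only: pipe tar into a multi-threaded xz.
# Assumes xz >= 5.2, where -T0 means "use as many threads as there are cores".
# Usage: tar_xz_parallel SRC_DIR OUT_FILE   (hypothetical helper)
tar_xz_parallel() {
    src=$1
    out=$2
    # -c makes xz write the compressed stream to stdout.
    tar -cf - "$src" | xz -T0 -c > "$out"
}
```

To see peak memory, and assuming GNU time is installed as /usr/bin/time,
the xz step can be wrapped:
`tar -cf - django | /usr/bin/time -v xz -T0 -c > /tmp/d.tar.xz`
and the "Maximum resident set size" line read from the report on stderr.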
> -rw------- 1 peter peter 184176640 2011-01-25 10:23 /tmp/d.tar 100.00%
> -rw------- 1 peter peter 128148318 2011-01-25 10:21 /tmp/d.tar.gz 69.58%
> -rw------- 1 peter peter 115184718 2011-01-25 10:20 /tmp/d.tar.bz2 62.54%
> -rw------- 1 peter peter 93680468 2011-01-25 10:17 /tmp/d.tar.xz 50.86%
> My conclusion: gzip is a good compromise between time and compression ratio.
The tradeoffs depend on the application. If you plan to store many
compressed files for a long time, then smaller size can represent
significant cost savings in storage capacity and transfer times, so
the smallest size wins.
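
To put a number on it, here is a small sketch using the byte counts
from the listing above. The helper name is made up, and the arithmetic
is integer-only, so the result is truncated to a whole percent.

```shell
#!/bin/sh
# Sizes in bytes, copied from the quoted ls output.
gz_size=128148318
xz_size=93680468

# percent_saved BIG SMALL: whole-percent reduction going from BIG to SMALL.
# (hypothetical helper; integer division truncates the result)
percent_saved() {
    echo $(( ($1 - $2) * 100 / $1 ))
}

# How much smaller is the xz archive than the gzip one?
percent_saved "$gz_size" "$xz_size"    # prints 26
```

If that ratio held across a thousand archives of this size, xz would
need roughly 34 GB less storage than gzip, at the price of the longer
compression times shown earlier.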
George N. White III <aa056 at chebucto.ns.ca>
Head of St. Margarets Bay, Nova Scotia