Generate tarballs in Java
We're currently generating tarballs using the tar and xz command-line tools triggered by a cronjob. While this is very fast, it doesn't integrate that well with the rest of our code. For example, it would be much easier to extract descriptor types, publication times, and file digests for #31204 (moved) if tarball generation happened in Java.
One possible issue might be that generated tarballs are larger or that compression takes longer. This is something I wanted to figure out early, which is why I ran some tests today:
| Compression preset level | xz size | xz time | XZ for Java 1.6 size | XZ for Java 1.6 time |
|--------------------------|---------|---------|----------------------|----------------------|
| 1 | 269M | 1m27.522s | 269M | 1m22.905s |
| 3 | 77M | 1m6.590s | 77M | 1m15.03s |
| 6 | 30M | 3m8.426s | 30M | 4m54.837s |
| 9 | 18M | 2m56.801s | 18M | 5m6.998s |
| 9e | 16M | 7m2.364s | NA | NA |
We're currently using xz -9e, but I can't find this option in XZ for Java. The closest is compression preset level 9. That means that our tarballs would be 18M/16M = 1.125 times the current size, i.e., 12.5% larger, and would be created in 306.998s/422.364s = 73% of the current time.
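For anyone who wants to redo this comparison with other rows of the table, the arithmetic can be sketched in a few lines of Java. The class and method names below are made up for illustration, and the parser only assumes durations in the "XmY.YYYs" form used in the table.

```java
import java.util.Locale;

// Hypothetical helper, not part of any existing code base.
class PresetComparison {

  /** Parses a duration in the "5m6.998s" form used in the table into seconds. */
  static double parseSeconds(String duration) {
    int minuteMark = duration.indexOf('m');
    int minutes = Integer.parseInt(duration.substring(0, minuteMark));
    // Drop the trailing 's' before parsing the seconds part.
    double seconds = Double.parseDouble(
        duration.substring(minuteMark + 1, duration.length() - 1));
    return minutes * 60 + seconds;
  }

  /** Returns by how many percent the candidate exceeds the baseline. */
  static double percentLarger(double candidate, double baseline) {
    return (candidate / baseline - 1.0) * 100.0;
  }

  /** Returns the candidate as a percentage of the baseline. */
  static double percentOf(double candidate, double baseline) {
    return candidate / baseline * 100.0;
  }

  public static void main(String[] args) {
    // Sizes in MiB and times taken from the rows for level 9 (Java) and 9e (xz).
    double sizeIncrease = percentLarger(18.0, 16.0);
    double timeShare = percentOf(
        parseSeconds("5m6.998s"), parseSeconds("7m2.364s"));
    System.out.printf(Locale.ROOT,
        "size: +%.1f%%, time: %.0f%% of current%n", sizeIncrease, timeShare);
  }
}
```

Note that the sizes in the table are rounded to whole megabytes, so the 12.5% figure is only as precise as that rounding.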
Is it a blocker that our tarballs would be 12.5% larger? If so, we might try harder to configure XZ for Java in the same way as xz -9e operates, even though that would very likely increase tarball generation time.
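A possible starting point for such a configuration, as an untested sketch: XZ for Java has no "extreme" flag, but LZMA2Options exposes the individual encoder settings. The values below (normal mode, BT4 match finder, nice length 273, depth limit 512) are my understanding of what liblzma's extreme flag sets for level 9 and should be checked against the xz sources before relying on them. The class and method names are hypothetical, and I have not measured the resulting size or run time.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

import org.tukaani.xz.LZMA2Options;
import org.tukaani.xz.UnsupportedOptionsException;
import org.tukaani.xz.XZOutputStream;

// Hypothetical sketch; requires XZ for Java (org.tukaani:xz) on the classpath.
class ExtremePresetSketch {

  /** Starts from preset 9 and overrides settings to approximate xz -9e. */
  static LZMA2Options approximateExtreme9() throws UnsupportedOptionsException {
    LZMA2Options options = new LZMA2Options(9);
    options.setMode(LZMA2Options.MODE_NORMAL);
    options.setMatchFinder(LZMA2Options.MF_BT4);
    // Assumed values for the extreme variant; verify against liblzma.
    options.setNiceLen(LZMA2Options.NICE_LEN_MAX);
    options.setDepthLimit(512);
    return options;
  }

  /** Compresses an existing tar file into an .xz file using those options. */
  static void compress(Path tarFile, Path xzFile) throws IOException {
    try (InputStream in = Files.newInputStream(tarFile);
        OutputStream out = Files.newOutputStream(xzFile);
        XZOutputStream xz = new XZOutputStream(out, approximateExtreme9())) {
      byte[] buffer = new byte[8192];
      int read;
      while ((read = in.read(buffer)) != -1) {
        xz.write(buffer, 0, read);
      }
    }
  }
}
```

Encoding with a 64 MiB dictionary needs a lot of memory, so the JVM heap size may have to be raised for this to run at all.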
Not working on this at the moment, just leaving my thoughts here for discussion and for picking this up as time permits.