Opened 3 months ago

Closed 5 weeks ago

#31264 closed defect (fixed)

tar.gz output files contain nonreproducible timestamps

Reported by: JeremyRand
Owned by: boklm
Priority: Medium
Milestone:
Component: Applications/rbm
Version:
Severity: Normal
Keywords: TorBrowserTeam201909R
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Steps to reproduce:

Run the following command twice:

./rbm/rbm build gocompress --target nightly --target torbrowser-linux-x86_64

Expected results:

The output .tar.gz files should be identical.

Observed results:

The gzip header contains different timestamps per build, based on when the build was done. See the following Diffoscope:

https://try.diffoscope.org/kpqdeyggzdec.html

Text version of Diffoscope output in case the above link expires:

--- a/gocompress-cc9eb1d7ad76-linux-x86_64-4fd18e.tar.gz
+++ b/gocompress-cc9eb1d7ad76-linux-x86_64-4fd18e.tar.gz
├── filetype from file(1)
│ @@ -1 +1 @@
│ -gzip compressed data, last modified: Tue Jul 30 00:09:03 2019, from Unix, original size 20551680
│ +gzip compressed data, last modified: Tue Jul 30 00:11:48 2019, from Unix, original size 20551680
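
The differing field can also be inspected directly. The following Python sketch is illustrative only and is not part of rbm: it assumes the RFC 1952 header layout, in which bytes 4 to 7 of a gzip member hold a little-endian MTIME, and it uses two made-up timestamps 165 seconds apart to stand in for two builds. The helper names gzip_header_mtime and compress are hypothetical.

```python
import gzip
import io
import struct


def gzip_header_mtime(data: bytes) -> int:
    """Return the MTIME field (seconds since the Unix epoch) of a gzip member."""
    # RFC 1952: bytes 0-1 are the magic number, byte 2 the compression method,
    # byte 3 the flags, and bytes 4-7 the little-endian MTIME.
    if data[:2] != b"\x1f\x8b":
        raise ValueError("not a gzip stream")
    (mtime,) = struct.unpack("<I", data[4:8])
    return mtime


def compress(payload: bytes, mtime: int) -> bytes:
    """Gzip payload in memory, stamping the header with the given mtime."""
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb", mtime=mtime) as gz:
        gz.write(payload)
    return buf.getvalue()


# Identical input, compressed at two different (illustrative) times.
payload = b"same tarball contents" * 1000
first = compress(payload, mtime=1564445343)
second = compress(payload, mtime=1564445508)

print(gzip_header_mtime(first), gzip_header_mtime(second))
print("identical outputs:", first == second)
print("identical apart from MTIME:",
      first[:4] == second[:4] and first[8:] == second[8:])
```

In this sketch the two outputs differ only in the four MTIME bytes, which matches the single-line difference in the Diffoscope report above.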

Other notes:

Switching from .tar.gz to .tar.xz fixes the issue and results in reproducible binaries. Given that .xz has much better compression than .gz and (AFAIK) is usually readily available on GNU/Linux and macOS systems just like .gz, my recommendation is to simply switch the .tar.gz to .tar.xz in tor-browser-build, and add a warning to the "tar" entry in rbm's options_misc.asc saying that .gz compression should not be used because it breaks reproducibility.

Since this issue affects both rbm and Tor Browser, I'm not sure which component to select for this ticket. I'm going with rbm, but feel free to change that if you like. Or feel free to split it into 2 tickets if that makes it easier to make sure that both components get a fix.

Change History (11)

comment:1 Changed 3 months ago by JeremyRand

If you agree with my recommended fix, I'm happy to code up patches for it.

comment:2 Changed 3 months ago by boklm

> Given that .xz has much better compression than .gz and (AFAIK) is usually readily available on GNU/Linux and macOS systems just like .gz

The reason to use gzip to compress source tarballs is that xz is a lot slower than gzip. So switching to xz would save some space on disk, but would probably slow down the build. Especially for components like firefox where we might spend several minutes just to compress it with xz.

However it looks like we could easily fix the gzip reproducibility issue by using the -n or --no-name option:
https://wiki.debian.org/ReproducibleBuilds/TimestampsInGzipHeaders
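
As a rough illustration of what omitting the name and timestamp achieves, here is a Python sketch (not rbm code, and it does not call the gzip binary) that writes a gzip stream with no stored file name and a zeroed MTIME using the standard library's gzip.GzipFile. The function name gzip_no_name is hypothetical.

```python
import gzip
import io
import time


def gzip_no_name(payload: bytes) -> bytes:
    """Compress payload with no stored file name and a zeroed MTIME field."""
    buf = io.BytesIO()
    # filename="" leaves the FNAME header flag unset; mtime=0 zeroes MTIME.
    with gzip.GzipFile(filename="", fileobj=buf, mode="wb", mtime=0) as gz:
        gz.write(payload)
    return buf.getvalue()


payload = b"source tarball bytes" * 500
first = gzip_no_name(payload)
time.sleep(1.1)  # let the wall clock advance between the two "builds"
second = gzip_no_name(payload)

print("byte-identical:", first == second)
```

Because nothing in the header depends on the clock or on a file name, the two streams compare equal even though they were produced at different times.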

comment:3 Changed 3 months ago by JeremyRand

> The reason to use gzip to compress source tarballs is that xz is a lot slower than gzip. So switching to xz would save some space on disk, but would probably slow down the build. Especially for components like firefox where we might spend several minutes just to compress it with xz.

Ah, that makes sense, thanks for the explanation.

> However it looks like we could easily fix the gzip reproducibility issue by using the -n or --no-name option:
> https://wiki.debian.org/ReproducibleBuilds/TimestampsInGzipHeaders

Nice, I wasn't aware of that trick.

So, would a good solution be to patch http://jqs44zhtxl2uo6gk.onion/builders/rbm.git/tree/lib/RBM/DefaultConfig.pm?id=e04f03f9626e993bb66d7784d258f95ca07bc769#n578 , replacing this:

tar --no-recursion [% IF c('gnu_utils') -%]

With this:

GZIP="--no-name" tar --no-recursion [% IF c('gnu_utils') -%]

Or is there a better place to put that flag?

comment:4 in reply to: 3 Changed 3 months ago by boklm

Replying to JeremyRand:

> So, would a good solution be to patch http://jqs44zhtxl2uo6gk.onion/builders/rbm.git/tree/lib/RBM/DefaultConfig.pm?id=e04f03f9626e993bb66d7784d258f95ca07bc769#n578 , replacing this:
>
> tar --no-recursion [% IF c('gnu_utils') -%]
>
> With this:
>
> GZIP="--no-name" tar --no-recursion [% IF c('gnu_utils') -%]

Yes, that would work.

Another place we should patch is the maketar function in lib/RBM.pm, which is used to generate the source tarballs:

diff --git a/lib/RBM.pm b/lib/RBM.pm
index 75912af..087ebe3 100644
--- a/lib/RBM.pm
+++ b/lib/RBM.pm
@@ -582,7 +582,7 @@ sub maketar {
     }
     my %compress = (
         xz  => ['xz', '-f'],
-        gz  => ['gzip', '-f'],
+        gz  => ['gzip', '--no-name', '-f'],
         bz2 => ['bzip2', '-f'],
     );
     if (my $c = project_config($project, 'compress_tar', $options)) {

comment:5 Changed 3 months ago by JeremyRand

Patch at https://notabug.org/JeremyRand/rbm/src/gzip-timestamps , Git commit hash ee2d4f2e53277055105dbc85832ed3ebeeb45f45.

It does occur to me that this patch will replace whatever the pre-existing value of GZIP was, which might conceivably cause problems for some users. Would it be better to replace GZIP="--no-name" with GZIP="--no-name ${GZIP}"?
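
The difference between replacing and prepending can be sketched as follows. This Python snippet is only an illustration of the idea, not the rbm patch itself; the function name gzip_env is hypothetical, and unlike the shell form it trims the trailing space when GZIP was previously unset.

```python
import os


def gzip_env(base_env: dict) -> dict:
    """Return a copy of base_env whose GZIP variable starts with --no-name."""
    env = dict(base_env)
    existing = env.get("GZIP", "")
    # Prepend so that any user-supplied options are kept rather than replaced.
    env["GZIP"] = f"--no-name {existing}".strip()
    return env


print(gzip_env({})["GZIP"])
print(gzip_env({"GZIP": "-9"})["GZIP"])
print(gzip_env(dict(os.environ))["GZIP"])
```

The resulting dictionary could be passed as the env argument of a subprocess call that runs tar, so that options a user had already set in GZIP are kept.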

comment:6 Changed 3 months ago by boklm

Keywords: TorBrowserTeam201907R added
Status: new → needs_review

comment:7 Changed 3 months ago by gk

Keywords: TorBrowserTeam201908R added; TorBrowserTeam201907R removed

No July any longer.

comment:8 Changed 6 weeks ago by gk

Keywords: TorBrowserTeam201909R added; TorBrowserTeam201908R removed

No August anymore.

comment:9 in reply to: 5 Changed 6 weeks ago by boklm

Replying to JeremyRand:

> Patch at https://notabug.org/JeremyRand/rbm/src/gzip-timestamps , Git commit hash ee2d4f2e53277055105dbc85832ed3ebeeb45f45.

This patch looks good to me, thanks.

> It does occur to me that this patch will replace whatever the pre-existing value of GZIP was, which might conceivably cause problems for some users. Would it be better to replace GZIP="--no-name" with GZIP="--no-name ${GZIP}"?

Probably not a very common use case, but maybe some day someone will want to set gzip options this way, so that might be useful. Please update the patch with this change if you think that's useful.

comment:10 Changed 5 weeks ago by JeremyRand

> Probably not a very common use case, but maybe some day someone will want to set gzip options this way, so that might be useful. Please update the patch with this change if you think that's useful.

Done; updated patch at https://notabug.org/JeremyRand/rbm/src/gzip-timestamps , Git commit hash 5a41aae4a0d745f74b675d3c9c142b3d5fb3ca09.

comment:11 in reply to: 10 Changed 5 weeks ago by boklm

Resolution: fixed
Status: needs_review → closed

Replying to JeremyRand:

> Done; updated patch at https://notabug.org/JeremyRand/rbm/src/gzip-timestamps , Git commit hash 5a41aae4a0d745f74b675d3c9c142b3d5fb3ca09.

Thanks. I merged this patch to rbm.git and updated the rbm submodule in tor-browser-build with commit 873865aa9dc3d3f734458e86dbe542db67ad1929.