Okay. That does not work for me, at least not for tor-android-service and TOPL: my lists are a) way longer (attached is the one for TOPL) and b) the patch does not help me create them. First of all, specifying the --offline flag just breaks for me. Are you sure the things needed get downloaded if you specify that flag? It seems a bit counter-intuitive to me. What works, though, is replacing --offline with --debug and removing -Dmaven.repo.local=$gradle_repo. Still, I don't get your lists in #30460 (moved) and #30461 (moved).
The problem still persists. It seems I got some files downloaded several times from different locations (that's for tor-android-service as an example):
Since the de-duping is done on the artifact name (and not the entire URL), we reverse the characters and sort. This puts identical artifact names on adjacent lines. Then we take only unique values for the first field (the artifact name).
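As a compressed sketch of that trick (the file name is an assumption based on this thread; a keyed sort -u does the sort and unique steps in one go):
{{{
# Reversing each line puts the (reversed) artifact file name into the first
# "/"-separated field, so sorting uniquely on that field keeps exactly one
# URL per artifact name.
rev download-urls.txt | sort -t/ -k1,1 -u | rev
}}}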
I found out what was going on here. The logs show an attempt to download a URL, and later on the build logs the artifact as not found; then it moves on to try the same artifact at another repo. This records two entries (the new grep I provided potentially filters out the good site and keeps the bad one, so we shouldn't use it).
I have no idea how to correlate the log flow to see whether a download succeeded without writing a custom program to track state. All of this changes if we move to another version of Gradle, since the logs are not in a consistent format. I'm not sure it is worth the time to create a custom log parser.
In my case, I would have multiple entries in download-urls.txt. When I process it, my program excludes any artifact it doesn't find, so everything ends up correct in the gradle-dependencies file.
Looking through the logs further, I think one way to solve this is to parse out missing resource entries into another file. Then remove those missing resource entries from the download-urls.txt file.
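A hedged sketch of that filtering step, assuming the debug output went to a file called gradle-debug.log and that missing resources are flagged with a "Resource missing" marker (the exact wording varies between Gradle versions, as noted above):
{{{
# Collect the resources Gradle reported as missing into download-fails.txt.
grep 'Resource missing' gradle-debug.log \
  | grep -o 'https\?://[^] ]*' | sort -u > download-fails.txt
}}}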
No dice. Testing the tor-onion-proxy-library deps, I can see that the checked-in list has 277 entries while the list I get following your steps has 313. On closer inspection, it turns out that your commands do not eliminate all duplicates. E.g. on my list I get:
{{{
08:50:17.105 [INFO] [org.gradle.internal.resource.transfer.DefaultCacheAwareExternalResourceAccessor] Found locally available resource with matching checksum: [https://repo.spring.io/plugins-release/com/android/tools/analytics-library/protos/26.1.0/protos-26.1.0.jar, /var/tmp/dist/android-toolchain/gradle/caches/modules-2/files-2.1/com.android.tools.analytics-library/protos/26.1.0/1e2d35fcc1668730a6c6be29e61fbd059278f243/protos-26.1.0.jar]
}}}
So, it attempts the download but later decides that the actual fetch is not needed, as there is already a matching resource cached in the toolchain.
I wonder why the first fetch is needed at all then, and whether we could just nuke the Gradle cache when constructing the list. I'll try the latter.
Hrm. The cache is empty at the beginning. I guess it gets populated with the result of the first download, and then, when the repo is switched, the resource is taken from that cache?
Alright. So, what about not including every download attempt, but only the GET ones? We are not interested in the HEAD ones anyway. If I do that, put the missing resources into download-fails.txt, and then keep only the lines of the attempts file that are not among the fails, like so:
{{{
comm -23 download-attempts.txt download-fails.txt > download-urls.txt
}}}
I get a file matching the one we checked in (that's for TOPL; I have not checked tor-android-service) after sorting the latter (and removing the comments).
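Spelled out end to end, the pipeline could look like this sketch. The "HTTP GET" and "Resource missing" patterns are assumptions about the Gradle debug log format (which, as noted earlier, varies between Gradle versions), and comm(1) needs both inputs sorted, which sort -u takes care of:
{{{
grep 'HTTP GET' gradle-debug.log \
  | grep -o 'https\?://[^] ]*' | sort -u > download-attempts.txt
grep 'Resource missing' gradle-debug.log \
  | grep -o 'https\?://[^] ]*' | sort -u > download-fails.txt
# Keep only attempts that did not end up as missing resources.
comm -23 download-attempts.txt download-fails.txt > download-urls.txt
}}}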
As I recall, some artifact links are 301 redirects to URL paths that do not contain the Maven URL path. This made it impossible to reconstruct the groupId/artifactId/version directory paths we needed from the URL output in download-urls.txt.
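To illustrate why that breaks things, here is a hedged bash sketch of the kind of groupId/artifactId/version reconstruction that only works while the URL keeps the standard Maven layout under a known repo root; the URL is taken from the log line quoted earlier, everything else is illustrative:
{{{
# Works only for <repo>/<group path>/<artifact>/<version>/<file>; a 301
# redirect to an arbitrary non-Maven path breaks every step below.
url="https://repo.spring.io/plugins-release/com/android/tools/analytics-library/protos/26.1.0/protos-26.1.0.jar"
repo="https://repo.spring.io/plugins-release"
path="${url#$repo/}"          # com/android/.../protos/26.1.0/protos-26.1.0.jar
ver_dir="${path%/*}"          # strip the file name
version="${ver_dir##*/}"      # 26.1.0
art_dir="${ver_dir%/*}"
artifact="${art_dir##*/}"     # protos
group="${art_dir%/*}"
group="${group//\//.}"        # com.android.tools.analytics-library
echo "$group:$artifact:$version"
}}}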
Okay, my idea does not work for tor-android-service; we need to get back to the drawing board. :(
I was actually wrong:
{{{
07:07 <+GeKo> the crucial part i missed was making sure there is no local cache passed to gradle
07:07 <+GeKo> otherwise gradle seems just to check the sha1 value of what is already there
07:08 <+GeKo> instead of downloading the resources (again)
}}}
So, if we are (re-)constructing the Gradle dependencies, it seems to be important to also omit -Dmaven.repo.local=$gradle_repo, and not just replace --offline with --debug and allow network access. We should note that in the doc.
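For the doc, the note could come with an invocation sketch along these lines (the task name and log file name are placeholders; the point is only which flags are absent):
{{{
# Reconstruction run: --debug instead of --offline, and *no*
# -Dmaven.repo.local=$gradle_repo, so Gradle cannot satisfy requests via
# sha1 checks against a pre-seeded local repo and has to fetch everything.
gradle --debug assembleRelease > gradle-debug.log 2>&1
}}}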
sisbell: So, I now have the sorted URLs; how do I actually create the dependencies file? It seems the how-to is missing some command that would help with that? Or maybe the existing commands just need better explaining?
If we are running in a local container, the repo will always start out empty unless there is something in the dependency list. I was thinking that if we want to reconstruct the list, we would just empty out the gradle-dependencies-list file. But that requires commenting out some sections of the build that deal with moving the repo directories around, so it is maybe not ideal either.
Sure, we can go with the approach of removing the maven.repo flag. I'll add that to the documentation.
In regards to creating the dependencies file, I was just using a program I wrote (in Rust) to pull down the artifacts and generate the SHA. I guess everyone is doing their own thing here. I'm not sure my program would be for everybody since it's done in Rust. Is there a simple script that others are using for this?
{{{
You will then need to add the new dependency URLs and SHA-256 values into the
projects/$project/gradle-dependencies-list.txt file. The format of this file is
pipe delimited
sha256sum | url
}}}
Are you thinking this needs to be expanded?
Well, I originally thought there were some commands for how to actually create the dependency txt file, including the sha256sums. But I guess what we have is okay-ish in that regard, and one could use whatever one wants to actually assemble the file (a script, a Rust program, etc.).
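For what it's worth, a minimal shell sketch of one way to assemble the file in the documented "sha256sum | url" format; the input and output file names are assumptions based on this thread:
{{{
# Fetch each URL, hash it, and emit one "sha256sum | url" line per artifact.
while read -r url; do
  tmp=$(mktemp)
  if wget -q -O "$tmp" "$url"; then
    printf '%s | %s\n' "$(sha256sum "$tmp" | cut -d' ' -f1)" "$url"
  fi
  rm -f "$tmp"
done < download-urls.txt > gradle-dependencies-list.txt
}}}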
Following your instructions, I now get 313 dependencies for TOPL even though we currently have only 277 on file. Thus, this needs another round of revisions. Maybe comment:15 could still help? Or is that not working for the firefox project for some reason?
I get 313 as well, so that part matches what you are getting. I'm not really sure why it used to pull in 277. Since this gives consistent results now, is this something we want to investigate? The method I used previously didn't filter out bad URLs at the log-parse level; they would be filtered out by the program I use for downloading and generating the hash.
The idea behind this script is to assemble the smallest set of dependencies needed for building, because any binary dependency is a risk that is hard to mitigate by reproducible builds.
Now, current master builds with 277 dependencies, so we should not introduce dozens of new deps which do who knows what. So, yes, we need to figure out what is going on here.
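One hedged way to narrow down the extra entries would be to diff the URL columns of the two lists (file names assumed from this thread):
{{{
# Strip the "sha256sum | " prefix from the checked-in list and compare the
# URLs against the freshly generated list; comm(1) needs sorted inputs.
sed 's/^.*| *//' gradle-dependencies-list.txt | sort > old-urls.txt
sort download-urls.txt > new-urls.txt
comm -13 old-urls.txt new-urls.txt   # entries only in the new list
}}}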
Looks good now, thanks! Just as a note: it turns out that not all platforms need the same set of dependencies. For instance, on x86_64 two fewer deps are needed for the Firefox dependencies (the multidex-instrumentation ones). Not sure whether it would be worth the effort to have per-arch dependencies in case they differ; probably not.
Anyway, I cherry-picked the fix to master (commit 5d8507dd6ae08f9d3552aba0b34cab918c35e341) and added my points. sisbell, please do the same.