2017-10-25 02:09:20,131 WARN o.t.c.b.SanitizedBridgesWriter:952 Unrecognized line 'bridge-distribution-request any'. Skipping.
We should investigate what the exact specification of this line is, decide whether it's harmless to keep, add it to CollecTor's bridge descriptor module, try it out locally, get it reviewed, put out a release, and deploy that.
To upload designs, you'll need to enable LFS and have an admin enable hashed storage. More information
Child items 0
Show closed items
No child items are currently assigned. Use child items to break down this issue into smaller parts.
Linked items 0
Link issues together to show that they're related.
Learn more.
2017-10-25 05:09:01,147 WARN o.t.c.b.SanitizedBridgesWriter:952 Unrecognized line 'bridge-distribution-request any'. Skipping.2017-10-25 05:09:01,488 WARN o.t.c.b.SanitizedBridgesWriter:959 Could not parse server descriptor.java.lang.NullPointerException: null at org.torproject.collector.bridgedescs.SanitizedBridgesWriter.scrubIpv4Address(SanitizedBridgesWriter.java:265) at org.torproject.collector.bridgedescs.SanitizedBridgesWriter.sanitizeAndStoreServerDescriptor(SanitizedBridgesWriter.java:751) at org.torproject.collector.bridgedescs.BridgeDescriptorParser.parse(BridgeDescriptorParser.java:44) at org.torproject.collector.bridgedescs.BridgeSnapshotReader.<init>(BridgeSnapshotReader.java:200) at org.torproject.collector.bridgedescs.SanitizedBridgesWriter.startProcessing(SanitizedBridgesWriter.java:217) at org.torproject.collector.cron.CollecTorMain.run(CollecTorMain.java:67) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748)2017-10-25 05:09:01,865 WARN o.t.c.b.SanitizedBridgesWriter:952 Unrecognized line 'bridge-distribution-request any'. Skipping.
"bridge-distribution-request" SP Method NL [At most once, bridges only.] The "Method" describes how a Bridge address is distributed by BridgeDB. Recognized methods are: "none", "any", "https", "email", "moat", "hyphae". If set to "none", BridgeDB will avoid distributing your bridge address. If set to "any", BridgeDB will choose how to distribute your bridge address. Choosing any of the other methods will tell BridgeDB to distribute your bridge via a specific method: - "https" specifies distribution via the web interface at https://bridges.torproject.org; - "email" specifies distribution via the email autoresponder at bridges@torproject.org; - "moat" specifies distribution via an interactive menu inside Tor Browser; and - "hyphae" specifies distribution via a cryptographically-secure, invitation-based system. Potential future "Method" specifiers must be as follows: Method = (KeywordChar | "_") + All bridges SHOULD include this line. Non-bridges MUST NOT include it. (It is not currently implemented, or obeyed by Bridge DB.) BridgeDB SHOULD treat unrecognized Method values as if they were "none". (Default: "any")
Hmm. What do you mean by "single out", and why would we have to scrub this information?
We did have bridge pool assignment files a while back and only stopped providing them, because BridgeDB stopped providing them to us reliably. But we did not scrub the information how a bridge was distributed there, and here we're looking at information from bridges how they want to be distributed, which may or may not be obeyed by BridgeDB.
Can you elaborate why you think this information needs to be scrubbed?
Hmm, so far I could not reproduce this problem.
Please find a temporary suggestion for catching this type of problem and retrieving more information about it.
There are also many other places that could cause trouble when unforeseen data arrives. Also some java 8 date handling classes might be of advantage. I'll open a ticket about that. And also think further about this problem.
Maybe open a different ticket for the unrecognized line?
Trac: Summary: Include "bridge-distribution-request" lines in sanitized bridge descriptors to Fix NPE and include "bridge-distribution-request" lines in sanitized bridge descriptors Description: From CollecTor's logs:
2017-10-25 02:09:20,131 WARN o.t.c.b.SanitizedBridgesWriter:952 Unrecognized line 'bridge-distribution-request any'. Skipping.
We should investigate what the exact specification of this line is, decide whether it's harmless to keep, add it to CollecTor's bridge descriptor module, try it out locally, get it reviewed, put out a release, and deploy that.
to
For NPE log excerpt see comment:1.
From CollecTor's logs:
2017-10-25 02:09:20,131 WARN o.t.c.b.SanitizedBridgesWriter:952 Unrecognized line 'bridge-distribution-request any'. Skipping.
We should investigate what the exact specification of this line is, decide whether it's harmless to keep, add it to CollecTor's bridge descriptor module, try it out locally, get it reviewed, put out a release, and deploy that.
Looks like this issue is caused by a server descriptor produced by an alternate Tor implementation that identifies itself as "platform Pearl 0fd5756 on Linux". And that implementation uses a different order of elements in its descriptor.
I'll start working on a fix now. If you started working on a fix, too, and would rather continue with that, let's meet on the meeting pad.
Please review the fix in my task-23981 branch. It still lacks a change log entry, and with more time I'd have done another review run myself, so handle with care.
Note that I tried to keep the patch minimal in order to get it fixed ASAP. While working on this patch I considered taking a different approach where we'd include placeholders for parts that we cannot sanitize yet and fill them in after finishing to read the original descriptor. This could be a List<StringBuilder> to retain order of original lines and where we'd keep references to some StringBuilders that we need to sanitize after reading. Still something to think about. Let's do that after fixing this ugly bug.
Oh, and we still need to make a final decision what to do with those new lines. Okay to keep them unchanged?
Oh, and we still need to make a final decision what to do with those new lines. Okay to keep them unchanged?
I hope we choose "yes, ok to keep them unchanged". That way external researchers will be able to tie together usage patterns, and blocking patterns, to the distribution method that the bridge asked for.
(I am still sad that we don't record (and publish) what distribution mechanism bridgedb picked for a bridge. Most bridges probably won't set this line on their own, so seeing this line won't be a substitute for learning what bridgedb decided to do.)
Oh, and we still need to make a final decision what to do with those new lines. Okay to keep them unchanged?
I hope we choose "yes, ok to keep them unchanged". That way external researchers will be able to tie together usage patterns, and blocking patterns, to the distribution method that the bridge asked for.
That's the current plan, yes.
(I am still sad that we don't record (and publish) what distribution mechanism bridgedb picked for a bridge. Most bridges probably won't set this line on their own, so seeing this line won't be a substitute for learning what bridgedb decided to do.)
Good point. I'll put "Resume gathering bridge distribution statistics from BridgeDB" on the roadmap we're about to finish soon, though it will be on the long-term list for Q4/2018 or later. But at least we'll remember when we make the next roadmap or run out of other goals earlier than we thought (hahaha).