Opened 7 months ago

Closed 3 months ago

Last modified 2 months ago

#28614 closed defect (fixed)

Can't parse networkstatus consensus time

Reported by: Vort Owned by: nickm
Priority: High Milestone: Tor: 0.4.0.x-final
Component: Core Tor/Tor Version: Tor: 0.4.0.1-alpha
Severity: Normal Keywords: 040-must, 040-rc-must, regression, postfreeze-ok, tbb-needs, 040-backport
Cc: cypherpunks2 Actual Points:
Parent ID: Points:
Reviewer: ahf Sponsor: Sponsor8

Description

When I run tor.exe -f torrc, one of the first messages in log file is:
Nov 25 19:51:48.000 [warn] ISO time "2018-11-25 17:00:00\r" was unparseable
Nov 25 19:51:48.000 [warn] Unable to parse networkstatus consensus

Version tested: 3741f9e5 on Windows 7 SP1 x64.

Child Tickets

Attachments (4)

tor_cons.png (23.3 KB) - added by Vort 7 months ago.
Screenshot
tor_mmap.png (130.0 KB) - added by Vort 6 months ago.
Stack trace
0358.txt (2.9 KB) - added by Vort 3 months ago.
0403.txt (1.9 KB) - added by Vort 3 months ago.

Download all attachments as: .zip

Change History (40)

comment:1 Changed 7 months ago by nickm

Milestone: Tor: 0.4.0.x-final

Hm. Does this happen with older versions of Tor, or is it a regression with this one?

To me it sounds like the issue is with the \r at the end of the line -- like the consensus file was saved in a windows text format, but it's being interpreted as a unix-style text format.

comment:2 Changed 7 months ago by Vort

Never seen this with 0.3.4.9.

comment:3 Changed 7 months ago by gk

Version: Tor: 0.3.5.5-alpha

I was about to file this bug, too. This happens on Windows as well when testing in a Tor Browser context. I think this prevents Tor Browser starting up properly as I've seen numerous instances where Tor Browser "hangs" for 20 minutes after those messages. But that's "only" happening with a clean, new bundle on first start. Luckily, 0.3.5.5-alpha is fine, though.

comment:4 Changed 7 months ago by teor

Keywords: 040-rc-must regression added

comment:5 Changed 7 months ago by nickm

Is there a way to reproduce this easily? Is there a Tor version where it always happens, and one where it never happens?

I wonder if the mmap-related changes of #27244 could be responsible here.

Changed 7 months ago by Vort

Attachment: tor_cons.png added

Screenshot

comment:6 Changed 7 months ago by Vort

Reproduction is simple: just run tor.exe.

Changed 6 months ago by Vort

Attachment: tor_mmap.png added

Stack trace

comment:7 Changed 6 months ago by Vort

Yes, it may be related to mmap.
Here is additional info: tor_mmap.png.

comment:8 in reply to:  3 ; Changed 6 months ago by gk

Replying to gk:

I was about to file this bug, too. This happens on Windows as well when testing in a Tor Browser context. I think this prevents Tor Browser starting up properly as I've seen numerous instances where Tor Browser "hangs" for 20 minutes after those messages. But that's "only" happening with a clean, new bundle on first start. Luckily, 0.3.5.5-alpha is fine, though.

I think I am wrong. This just happens with testing Tor Browser 8.0.4 which ships 0.3.4.9. The result is that bootstrapping is delayed for several minutes (assuming that delay is actually caused by that parsing issue). I wonder, though, while this seems to be a new bug...

comment:9 in reply to:  8 Changed 6 months ago by gk

Replying to gk:

Replying to gk:

I was about to file this bug, too. This happens on Windows as well when testing in a Tor Browser context. I think this prevents Tor Browser starting up properly as I've seen numerous instances where Tor Browser "hangs" for 20 minutes after those messages. But that's "only" happening with a clean, new bundle on first start. Luckily, 0.3.5.5-alpha is fine, though.

I think I am wrong. This just happens with testing Tor Browser 8.0.4 which ships 0.3.4.9. The result is that bootstrapping is delayed for several minutes (assuming that delay is actually caused by that parsing issue). I wonder, though, while this seems to be a new bug...

Wait, scratch that. I think I took the wrong bundle to test. Sorry for the noise.

comment:10 Changed 6 months ago by gk

Note: #28405 seems highly related (but does have a different milestone and keywords).

comment:11 Changed 5 months ago by nickm

Keywords: postfreeze-ok added

Mark some tickets as postfreeze-ok, to indicate that I think they are okay to accept in 0.4.0 post-freeze. Does not indicate that they are all necessary to do postfreeze.

comment:12 Changed 5 months ago by gk

Keywords: tbb-needs added

comment:13 Changed 5 months ago by arma

Hint: in maint-0.3.5's src/feature/nodelist/networkstatus.c's networkstatus_read_cached_consensus_impl(), we call

  char *result = read_file_to_str(filename, RFTS_IGNORE_MISSING, NULL);

Note that we don't pass it the RFTS_BIN flag, which means tor_open_cloexec()'s open() ultimately uses O_TEXT rather than O_BINARY for its flags.

But 0.4.0's networkstatus_map_cached_consensus_impl()'s tor_mmap_file() doesn't have the same Windows notion of O_TEXT.

And on Linux, open() doesn't care if it's binary or text because they're the same thing, so Linux (and really everything other than Windows) continued to work when we shifted from text to binary in 0.4.0.

Of course, for extra fun notice that we seem to have a Windows-specific mmap implementation: see the #elif _WIN32 version of tor_mmap_file().

comment:14 Changed 5 months ago by arma

Or maybe I am looking at this bug the wrong way around: how did it come to have \r's in it on disk? I assume it's because we wrote it with Windows-style O_TEXT which added the \r's. So if we want it to not have \r's in it when we read it, we need to write it in O_BINARY mode too? And then there will still be the edge case where we try to load a legacy file which has \r's in it, and we should teach ourselves to handle that case better (e.g. discard it and fetch another right then)?

comment:15 Changed 5 months ago by nickm

So we need to do two things here:

  • always write a consensus in binary mode
  • on windows, when we encounter a consensus with CRLF in it, strip the CRs.

comment:16 Changed 5 months ago by nickm

Owner: set to nickm
Status: newaccepted

comment:17 Changed 5 months ago by nickm

Status: acceptedneeds_review

Proposed fix in bug28614_040, PR at https://github.com/torproject/tor/pull/671

comment:18 Changed 5 months ago by nickm

Priority: MediumHigh

comment:19 Changed 5 months ago by watt

- When reading a consensus fiel from disk, detect whether it
FILE

Could we avoid that and just ask users to clear the old file if it's wrong?

comment:20 Changed 5 months ago by ahf

Reviewer: ahf

comment:21 Changed 5 months ago by nickm

Could we avoid that and just ask users to clear the old file if it's wrong?

We could, but that would be inconvenience to the users, and it's best not to inconvenience users for a problem that we can fix.

Not everybody is comfortable navigating inside folders and deleting stuff.

comment:22 in reply to:  21 ; Changed 5 months ago by watt

Replying to nickm:

Could we avoid that and just ask users to clear the old file if it's wrong?

We could, but that would be inconvenience to the users, and it's best not to inconvenience users for a problem that we can fix.

Not everybody is comfortable navigating inside folders and deleting stuff.

No user intervention then. But adding more code to core tor just for transition to a new file format is not the best option. Will you remove it when tor is upgraded?

comment:23 Changed 5 months ago by gk

Cc: cypherpunks2 added

Closed #29302 as a duplicate

comment:24 in reply to:  22 ; Changed 5 months ago by nickm

Replying to watt:

Replying to nickm:

Could we avoid that and just ask users to clear the old file if it's wrong?

We could, but that would be inconvenience to the users, and it's best not to inconvenience users for a problem that we can fix.

Not everybody is comfortable navigating inside folders and deleting stuff.

No user intervention then. But adding more code to core tor just for transition to a new file format is not the best option. Will you remove it when tor is upgraded?

Probably; would you like to open a ticket for that? This code will be relevant until all earlier versions of Tor are obsolete, since people have any time between now and then to update. Since 0.3.5.x is supported till 2022, we shouldn't remove it before then.

comment:25 Changed 4 months ago by ahf

Status: needs_reviewmerge_ready

Tested this on Windows. Looks like it makes the 'ISO time "...\r" was unparseable' go away.

Code looks good.

comment:26 in reply to:  24 Changed 4 months ago by watt

Replying to nickm:

Replying to watt:

Replying to nickm:

Could we avoid that and just ask users to clear the old file if it's wrong?

We could, but that would be inconvenience to the users, and it's best not to inconvenience users for a problem that we can fix.

Not everybody is comfortable navigating inside folders and deleting stuff.

No user intervention then. But adding more code to core tor just for transition to a new file format is not the best option. Will you remove it when tor is upgraded?

Probably; would you like to open a ticket for that? This code will be relevant until all earlier versions of Tor are obsolete, since people have any time between now and then to update. Since 0.3.5.x is supported till 2022, we shouldn't remove it before then.

Usually such tickets are buried in the Tor: unspecified milestone. And it's almost impossible that somebody will remind us about it in 2022 :(
The main drawback is that tor should do additional parsing again and again at runtime instead of one time deletion during update.

comment:27 Changed 4 months ago by dgoulet

Resolution: fixed
Status: merge_readyclosed

Merged.

comment:28 Changed 3 months ago by Vort

The problem is half-fixed.
Fresh installation have no such messages anymore.
But updating from 0.3.5.8 is still broken - same message pops out.
Tested version: 0.4.0.3-alpha.

comment:29 in reply to:  28 ; Changed 3 months ago by teor

Keywords: 040-must added
Resolution: fixed
Status: closedreopened
Version: Tor: 0.4.0.1-alpha

We need to check for \r\n before we try to parse the consensus, because tor logs warnings when it fails. I left some comments on the pull request.

Replying to Vort:

The problem is half-fixed.
Fresh installation have no such messages anymore.
But updating from 0.3.5.8 is still broken - same message pops out.
Tested version: 0.4.0.3-alpha.

Yes, we should fix that message.
But does Tor successfully parse the consensus, after it logs the message?

comment:30 Changed 3 months ago by teor

Keywords: 040-backport added

comment:31 in reply to:  29 ; Changed 3 months ago by Vort

Replying to teor:

But does Tor successfully parse the consensus, after it logs the message?

I am not sure how to check it.
Log says Mar 28 08:23:58.000 [notice] Bootstrapped 100% (done): Done

BTW, clean start of 0.4.0.3-alpha brings another problem:
Mar 28 08:26:00.000 [warn] Failed to unlink C:\Users\Vort\AppData\Roaming\tor\unverified-microdesc-consensus: Permission denied
Maybe it is related.

comment:32 in reply to:  31 ; Changed 3 months ago by teor

Replying to Vort:

Replying to teor:

But does Tor successfully parse the consensus, after it logs the message?

I am not sure how to check it.
Log says Mar 28 08:23:58.000 [notice] Bootstrapped 100% (done): Done

Please send us a full copy of your logs.
Please delete any guard fingerprints or names.

We should be able to tell from the bootstrap progress and times if you needed a new consensus or not.

BTW, clean start of 0.4.0.3-alpha brings another problem:
Mar 28 08:26:00.000 [warn] Failed to unlink C:\Users\Vort\AppData\Roaming\tor\unverified-microdesc-consensus: Permission denied
Maybe it is related.

I don't think it is, but I opened #29930 for it.

Changed 3 months ago by Vort

Attachment: 0358.txt added

Changed 3 months ago by Vort

Attachment: 0403.txt added

comment:33 in reply to:  32 Changed 3 months ago by Vort

Replying to teor:

Please delete any guard fingerprints or names.

Nothing interesting there.
It's a clean start inside VM.

Please send us a full copy of your logs.

Test sequence:

  1. Manually remove C:\Users\Vort\AppData\Roaming\tor.
  2. Start 0.3.5.8. Close it: 0358.txt
  3. Start 0.4.0.3-alpha. Close it: 0403.txt

We should be able to tell from the bootstrap progress and times if you needed a new consensus or not.

My relay is fine. I was able to fix this problem by removing of some files in data directory.

comment:34 Changed 3 months ago by nickm

Looks like the main issue here is mainly logging -- we're logging the failures at 'warning', but we're only logging the message about retrying at 'info'.

Opening a child ticket for that...

comment:35 Changed 3 months ago by nickm

Resolution: fixed
Status: reopenedclosed

The issue was fixed, but the logs were misleading. Opened #30004 to improve the logging.

comment:36 Changed 2 months ago by teor

Sponsor: Sponsor8

Related to #27244.

Note: See TracTickets for help on using tickets.