Opened 12 years ago

Last modified 7 years ago

#593 closed defect (Fixed)

signature download for consensuses fails.

Reported by: weasel
Owned by: arma
Priority: Low
Milestone:
Component: Core Tor/Tor
Version: 0.2.0.17-alpha
Severity:
Keywords:
Cc: weasel, nickm, arma
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

To ensure that each directory authority has all the signatures from the
other authorities before it publishes a consensus, an authority X that is
missing sigs from at least one of them goes out and fetches the list of
all detached sigs from every other authority.

However, this doesn't work.

With some additional/louder logging I can see this at tor26:
Jan 24 15:57:31.143 [notice] Time to fetch any signatures that we're missing.
Jan 24 15:57:31.143 [notice] Fetching missing votes/signatures from all authorities (purpose 13)
Jan 24 15:57:31.143 [notice] Sending command to fetch detached sigs. base.state is 1
Jan 24 15:57:31.143 [notice] Asking 128.31.0.34:9031 for /tor/status-vote/next/consensus-signatures.z
Jan 24 15:57:31.143 [notice] Sending command to fetch detached sigs. base.state is 1
Jan 24 15:57:31.143 [notice] Asking 140.247.60.64 for /tor/status-vote/next/consensus-signatures.z
Jan 24 15:57:31.143 [notice] Sending command to fetch detached sigs. base.state is 1
Jan 24 15:57:31.143 [notice] Asking 216.224.124.114:9030 for /tor/status-vote/next/consensus-signatures.z
Jan 24 15:57:31.143 [notice] Sending command to fetch detached sigs. base.state is 1
Jan 24 15:57:31.143 [notice] Asking 88.198.7.215 for /tor/status-vote/next/consensus-signatures.z
Jan 24 15:57:31.143 [notice] Sending command to fetch detached sigs. base.state is 1
Jan 24 15:57:31.143 [notice] Asking 213.73.91.31 for /tor/status-vote/next/consensus-signatures.z
Jan 24 15:57:31.165 [notice] Dir connection to router 88.198.7.215:80 established.
Jan 24 15:57:31.165 [notice] client finished sending command (88.198.7.215).
Jan 24 15:57:31.174 [notice] Dir connection to router 213.73.91.31:80 established.
Jan 24 15:57:31.174 [notice] client finished sending command (213.73.91.31).
Jan 24 15:57:31.187 [notice] 'fetch' response not all here, but we're at eof. Closing.
Jan 24 15:57:31.187 [notice] conn to 88.198.7.215 reached eof, retval is -1
Jan 24 15:57:31.187 [notice] Giving up downloading detached signatures from '88.198.7.215'
Jan 24 15:57:31.206 [notice] 'fetch' response not all here, but we're at eof. Closing.
Jan 24 15:57:31.206 [notice] conn to 213.73.91.31 reached eof, retval is -1
Jan 24 15:57:31.206 [notice] Giving up downloading detached signatures from '213.73.91.31'
Jan 24 15:57:31.247 [notice] Dir connection to router 128.31.0.34:9031 established.
Jan 24 15:57:31.247 [notice] client finished sending command (128.31.0.34).
Jan 24 15:57:31.322 [notice] Dir connection to router 216.224.124.114:9030 established.
Jan 24 15:57:31.322 [notice] client finished sending command (216.224.124.114).
Jan 24 15:57:31.352 [notice] 'fetch' response not all here, but we're at eof. Closing.
Jan 24 15:57:31.352 [notice] conn to 128.31.0.34 reached eof, retval is -1
Jan 24 15:57:31.352 [notice] Giving up downloading detached signatures from '128.31.0.34'
Jan 24 15:57:31.521 [notice] 'fetch' response not all here, but we're at eof. Closing.
Jan 24 15:57:31.521 [notice] conn to 216.224.124.114 reached eof, retval is -1
Jan 24 15:57:31.521 [notice] Giving up downloading detached signatures from '216.224.124.114'
Jan 24 16:00:01.895 [notice] Time to publish the consensus and discard old votes

Note the "'fetch' response not all here, but we're at eof. Closing." log entry.

The problem appears to be that
a) we send bogus Content-Length headers for compressed stuff like
   next/consensus.z and next/consensus-signatures.z, and
b) we declare a download failed when we don't have as many bytes as the
   Content-Length says there should be.

Point (a) also confuses the hell out of wget.
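
To make (b) concrete, here is a minimal sketch of the kind of check a client can apply when the peer closes the connection. It is not Tor's code: `check_body_at_eof`, `enum fetch_status` and its values are hypothetical names made up for illustration. It only shows why a Content-Length that is larger than the body actually sent turns a complete response into a failed download.

```c
#include <stddef.h>

/* Hypothetical result codes for illustration; these are not Tor's own. */
enum fetch_status { FETCH_OK = 0, FETCH_TRUNCATED = -1 };

/* Decide what to do when the peer closes the connection.
 * advertised_len: value of the Content-Length header, or -1 if absent.
 * received_len:   number of body bytes actually read before EOF. */
enum fetch_status
check_body_at_eof(long advertised_len, size_t received_len)
{
  if (advertised_len < 0)
    return FETCH_OK;        /* no header: EOF marks the end of the body */
  if (received_len < (size_t)advertised_len)
    return FETCH_TRUNCATED; /* fewer bytes than promised: treat as failed */
  return FETCH_OK;
}
```

With a check like this, `check_body_at_eof(1000, 400)` reports a truncated fetch even if those 400 bytes were everything the server meant to send, which would match the "response not all here, but we're at eof" entries above.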

[Automatically added by flyspray2trac: Operating System: All]

Change History (6)

comment:1 Changed 12 years ago by arma

See r13268 for a potential fix to a). Let me know if it works for you.

Also, "deflated" has got to be the dumbest fucking word in the history
of compression rfcs. I vote we remove it from our code, or we're going
to have more bugs like this. "To uncompress something, you undeflate it."

As for b), should we leave it as-is so we uncover more bugs like this,
or should we make clients tolerate getting hung up on half-way through
a directory fetch and then try to parse it anyway, which might lead to
other complaints?

comment:2 Changed 12 years ago by weasel

It fixes parts. It still isn't fixed completely tho:
Jan 30 15:57:31.405 [notice] Time to fetch any signatures that we're missing.
..
Jan 30 15:57:31.406 [info] tor_gzip_uncompress(): possible truncated or corrupt zlib data
Jan 30 15:57:31.406 [info] connection_dir_client_reached_eof(): Unable to decompress HTTP body (server '127.1.0.6:9030').
Jan 30 15:57:31.406 [info] connection_dir_request_failed(): Giving up downloading detached signatures from '127.1.0.6'
Jan 30 15:57:31.406 [info] tor_gzip_uncompress(): possible truncated or corrupt zlib data
Jan 30 15:57:31.406 [info] connection_dir_client_reached_eof(): Unable to decompress HTTP body (server '127.1.0.5:9030').
Jan 30 15:57:31.406 [info] connection_dir_request_failed(): Giving up downloading detached signatures from '127.1.0.5'
Jan 30 15:57:31.406 [info] tor_gzip_uncompress(): possible truncated or corrupt zlib data
Jan 30 15:57:31.406 [info] connection_dir_client_reached_eof(): Unable to decompress HTTP body (server '127.1.0.4:9030').
Jan 30 15:57:31.406 [info] connection_dir_request_failed(): Giving up downloading detached signatures from '127.1.0.4'
Jan 30 15:57:31.406 [info] tor_gzip_uncompress(): possible truncated or corrupt zlib data
Jan 30 15:57:31.406 [info] connection_dir_client_reached_eof(): Unable to decompress HTTP body (server '127.1.0.3:9030').
Jan 30 15:57:31.406 [info] connection_dir_request_failed(): Giving up downloading detached signatures from '127.1.0.3'
Jan 30 15:57:31.406 [info] tor_gzip_uncompress(): possible truncated or corrupt zlib data
Jan 30 15:57:31.406 [info] connection_dir_client_reached_eof(): Unable to decompress HTTP body (server '127.1.0.2:9030').
Jan 30 15:57:31.406 [info] connection_dir_request_failed(): Giving up downloading detached signatures from '127.1.0.2'

comment:3 Changed 12 years ago by weasel

Fri 01:09:16 <weasel> aha!
Fri 01:10:15 <weasel> connection_write_to_buf_zlib("", 0, conn, 1); breaks stuff.
Fri 01:10:28 <weasel> if I replace it by connection_write_to_buf_zlib("foo", 3, conn, 1); I get something that can be decompressed without errors
Fri 01:16:25 <weasel> in _connection_write_to_buf_impl() if len is 0 we don't even care if zlib is done or not
Fri 01:16:29 <weasel> that doesn't seem right
Fri 01:20:59 <weasel> keys/all.z is likewise broken.
Fri 01:21:07 <weasel> --- connection.c (revision 13345)
Fri 01:21:07 <weasel> +++ connection.c (working copy)
Fri 01:21:07 <weasel> @@ -2250,7 +2250,7 @@
Fri 01:21:07 <weasel> /* XXXX This function really needs to return -1 on failure. */
Fri 01:21:07 <weasel> int r;
Fri 01:21:07 <weasel> size_t old_datalen;
Fri 01:21:07 <weasel> - if (!len)
Fri 01:21:07 <weasel> + if (!len && !(zlib < 0))
Fri 01:21:07 <weasel> return;
Fri 01:29:41 <weasel> I think that's it but now I sleep.
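
The effect of that early return can be shown with a toy model. This is a sketch only, not Tor's code and not real zlib: `struct toy_stream`, `toy_write`, `TOY_TRAILER` and the `apply_fix` switch are hypothetical. It assumes, going by the `zlib < 0` test in the patch, that a negative mode means "compress and finish the stream". A single trailer byte stands in for the end-of-stream data a real compressor emits when finishing.

```c
#include <stddef.h>
#include <string.h>

/* Toy stand-in for a compressed output stream. */
struct toy_stream {
  char   out[64];
  size_t out_len;
  int    finished;
};

#define TOY_TRAILER '$'

/* mode: 0 = plain write, 1 = compressed write, -1 = compressed write that
 * also finishes the stream (an assumption modelled on the patch above).
 * apply_fix: 0 = old early return, 1 = early return from the patch. */
void
toy_write(struct toy_stream *s, const char *data, size_t len, int mode,
          int apply_fix)
{
  int skip = apply_fix ? (!len && !(mode < 0)) : !len;
  if (skip)
    return;                 /* old code lands here for ("", 0, finish) */
  if (s->out_len + len + 1 > sizeof(s->out))
    return;                 /* toy model: silently drop on overflow */
  memcpy(s->out + s->out_len, data, len);
  s->out_len += len;
  if (mode < 0) {
    s->out[s->out_len++] = TOY_TRAILER; /* end-of-stream marker */
    s->finished = 1;
  }
}
```

Without the fix, writing some data and then calling `toy_write(&s, "", 0, -1, 0)` leaves `finished` at 0, so the reader never sees an end-of-stream marker. With `apply_fix` set to 1 the same call appends the trailer. A non-empty final write such as `("foo", 3)` finishes the stream either way, which is consistent with the observation in the log above.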

comment:4 Changed 12 years ago by nickm

Applied as r13347. Is it fixed now?

comment:5 Changed 12 years ago by weasel

flyspray2trac: bug closed.
it should be fixed. if not I'll reopen

comment:6 Changed 7 years ago by nickm

Component: Tor Relay → Tor