Opened 3 years ago

Closed 3 years ago

#21648 closed defect (implemented)

prop140: Caches generate diffs as appropriate

Reported by: nickm Owned by: nickm
Priority: Medium Milestone: Tor: 0.3.1.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: prop140 TorCoreTeam201704
Cc: ahf Actual Points: 5
Parent ID: #13339 Points: 2
Reviewer: Sponsor: Sponsor4

Description

Directory caches should pre-generate and pre-compress diffs as appropriate. It might be best to do this in a different thread or process. It might be best to pre-generate some and create others on demand.

Child Tickets

Change History (16)

comment:1 Changed 3 years ago by dgoulet

Keywords: prop140 added

comment:2 Changed 3 years ago by nickm

Owner: set to nickm
Status: newassigned

comment:3 Changed 3 years ago by nickm

Actual Points: 4
Keywords: TorCoreTeam201704 added

comment:4 Changed 3 years ago by nickm

The compression part of this is done in consdiffmgr_compress. Once I wire everything up to the rest of Tor, and test a little harder, I'll call this branch needs_review.

comment:5 Changed 3 years ago by nickm

Actual Points: 45
Status: assignedneeds_review

This is now ready for review. Please review the following separate branches:

  • consdiffmgr_compress
  • actually_compute_diffs

comment:7 Changed 3 years ago by ahf

Cc: ahf added

comment:8 Changed 3 years ago by ahf

Status: needs_reviewneeds_revision

Overall this looks good. I've added a couple of comments, but nothing serious.

comment:9 Changed 3 years ago by nickm

Status: needs_revisionneeds_review

Tried fixing most issues, though I think one of them might not be worth fixing. See gitlab discussion for more info

comment:10 Changed 3 years ago by ahf

Status: needs_reviewmerge_ready

This looks good. Let's get it in. I agree with 705b72b08612b1c2bc20e347e72e3cb2083367c8 not helping to make the code cleaner. Let's skip that.

comment:11 Changed 3 years ago by nickm

okay. I've resolved a merge conflict locally; than I tried running under Chutney to make sure it wouldn't explode.

It exploded. :) Currently trying to diagnose.

comment:12 Changed 3 years ago by nickm

13:03 < nickm> to review: I only crash with clang. And only when fragile

hardening is disabled.

13:05 < nickm> clang version 3.9.1 (tags/RELEASE_391/final)
13:06 < nickm> branch ticket21648

comment:13 Changed 3 years ago by nickm

Thread 2 "tor" received signal SIGSYS, Bad system call.
[Switching to Thread 0x7ffff4895700 (LWP 11482)]
0x00007ffff5fdee97 in mprotect () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install keyutils-libs-1.5.9-8.fc24.x86_64 krb5-libs-1.14.4-7.fc25.x86_64 libcom_err-1.43.3-1.fc25.x86_64 libevent-2.0.22-1.fc25.x86_64 libgcc-6.3.1-1.fc25.x86_64 libscrypt-1.21-2.fc24.x86_64 libseccomp-2.3.2-1.fc25.x86_64 libselinux-2.5-13.fc25.x86_64 libzstd-1.1.3-1.fc25.x86_64 openssl-libs-1.0.2k-1.fc25.x86_64 pcre-8.40-7.fc25.x86_64 xz-libs-5.2.2-2.fc24.x86_64 zlib-1.2.8-10.fc24.x86_64
(gdb) bt
#0  0x00007ffff5fdee97 in mprotect () from /lib64/libc.so.6
#1  0x00007ffff5f606aa in sysmalloc () from /lib64/libc.so.6
#2  0x00007ffff5f6163a in _int_malloc () from /lib64/libc.so.6
#3  0x00007ffff5f62f14 in malloc () from /lib64/libc.so.6
#4  0x00007ffff6b30527 in ZSTD_initCStream_advanced () from /lib64/libzstd.so.1
#5  0x00007ffff6b31048 in ZSTD_initCStream_usingDict ()
   from /lib64/libzstd.so.1
#6  0x00005555556cfccb in tor_zstd_compress_new (compress=<optimized out>, 
    method=<optimized out>, level=<optimized out>)
    at src/common/compress_zstd.c:201
#7  0x00005555556cee43 in tor_compress_new (compress=1, method=ZSTD_METHOD, 
    compression_level=BEST_COMPRESSION) at src/common/compress.c:451
#8  0x00005555556ce59f in tor_compress_impl (compress=<optimized out>, 
    out=<optimized out>, out_len=<optimized out>, 
    in=0x7ffff5fdee97 <mprotect+7> "H=\001\360\377\377s\001\303H\213\r\301\337+", in_len=1818, method=<optimized out>, compression_level=BEST_COMPRESSION, 
    complete_only=<optimized out>, protocol_warn_level=<optimized out>)
    at src/common/compress.c:98
#9  0x00005555556ce53e in tor_compress (out=0x1a0000, out_len=0x3, 
    in=0x7ffff5fdee97 <mprotect+7> "H=\001\360\377\377s\001\303H\213\r\301\337+", in_len=140737018593280, method=487424) at src/common/compress.c:213
#10 0x0000555555660053 in consensus_diff_worker_threadfn (
    state_=<optimized out>, work_=0x555555af97f0) at src/or/consdiffmgr.c:1225

comment:14 Changed 3 years ago by nickm

no, false alarm there. My "disable threading" logic was itself b0rken.

Last edited 3 years ago by nickm (previous) (diff)

comment:15 Changed 3 years ago by nickm

I think I figured it out.

This happens when we go to allocate something fairly big in a worker thread, AND the linux seccomp2 sandbox is turned on, AND we're using the standard glibc allocator and not some other thing that took over for the glibc allocator. We never ran into this before, since the current workers don't allocate much.

Right now we have a 1-MB limit on how much RAM we allow the seccomp2 sandbox to mprotect(PROT_READ|PRO_WRITE) at once. I guess that that's not good enough for the threads, and it's trying to mprotect() more.

To confirm, I'll dump the syscall arguments for the above backtrace:

Thread 2 "tor" received signal SIGSYS, Bad system call.
[Switching to Thread 0x7ffff447c700 (LWP 2851)]
0x00007ffff5fdee97 in mprotect () from /lib64/libc.so.6
Missing separate debuginfos, use: dnf debuginfo-install keyutils-libs-1.5.9-8.fc24.x86_64 krb5-libs-1.14.4-7.fc25.x86_64 libcom_err-1.43.3-1.fc25.x86_64 libevent-2.0.22-1.fc25.x86_64 libgcc-6.3.1-1.fc25.x86_64 libscrypt-1.21-2.fc24.x86_64 libseccomp-2.3.2-1.fc25.x86_64 libselinux-2.5-13.fc25.x86_64 libzstd-1.1.3-1.fc25.x86_64 openssl-libs-1.0.2k-1.fc25.x86_64 pcre-8.40-7.fc25.x86_64 xz-libs-5.2.2-2.fc24.x86_64 zlib-1.2.8-10.fc24.x86_64
(gdb) print $rdi
$1 = 140737155325952
(gdb) print $rsi
$2 = 8880128
(gdb) print $rdx
$3 = 3

That's mprotect(0x7fffec266000, 880128, PROT_READ|PROT_WRITE.

Short-term solution -- raise MALLOC_MP_LIM.

Longer-term solution -- make the seccomp2 sandbox wiser about interned strings. We'll want to do that anyway here.

comment:16 Changed 3 years ago by nickm

Resolution: implemented
Status: merge_readyclosed

okay, that did it: #22096

Merging at last.

Note: See TracTickets for help on using tickets.