Opened 14 months ago

Last modified 4 months ago

#27315 needs_information defect

Sandbox regression in 0.3.4.7-rc

Reported by: toralf
Owned by:
Priority: Medium
Milestone: Tor: 0.3.5.x-final
Component: Core Tor/Tor
Version: Tor: 0.3.4.7-rc
Severity: Normal
Keywords: regression?
Cc: danielpinto52@…
Actual Points:
Parent ID:
Points:
Reviewer: ahf
Sponsor:

Description

Despite the fix for ticket #25440, I still observe this on a stable hardened Gentoo Linux with glibc-2.27:
https://paste.pound-python.org/show/cohcouPCOOVdH0VnaOhm/

Child Tickets

Attachments (1)

info.log.gz (2.3 KB) - added by toralf 14 months ago.

Change History (19)

Changed 14 months ago by toralf

Attachment: info.log.gz added


comment:1 Changed 14 months ago by nickm

Component: - Select a component → Core Tor/Tor
Keywords: regression? added
Milestone: Tor: 0.3.4.x-final

comment:2 Changed 14 months ago by Jigsaw52

I talked to toralf on IRC and confirmed that this was caused by the fix for #25440. My fix installs both sandbox rules for openat: the one used before the #25440 fix and the one used after it. toralf tested it on his system and it worked; I tested it on mine (which was previously affected by #25440) and it worked there too.

Here is the branch for 0.3.4.7-rc: https://github.com/Jigsaw52/tor/commits/fix-27314

And here is the fix on the current master: https://github.com/Jigsaw52/tor/commits/fix-27314-master

comment:3 Changed 14 months ago by nickm

Status: new → needs_review

comment:4 Changed 14 months ago by asn

Reviewer: ahf

comment:5 Changed 14 months ago by ahf

Status: needs_review → needs_revision

I have some questions here:

  • Should we really add both rules, the one where AT_FDCWD is cast and the one where it isn't?

Additionally, the following needs to change before we can accept it:

  • We need a changes file explaining the issue.
  • I think the commit message needs to contain some more information about what the issue is here since it's quite subtle for someone just reading the commit on its own.

comment:6 Changed 14 months ago by Jigsaw52

The rule with the cast fixes #25440 on systems affected by it. The #25440 fix replaced the old version of the rule (without the cast) with the cast version. However, we have now discovered that on toralf's system (which was not affected by #25440), removing the old rule breaks tor.

In an attempt to identify why toralf's system behaves differently from mine (which was affected by #25440), I asked toralf to run a small program that checks the value of AT_FDCWD and the sizes of several relevant types, but all the results were identical to those on my system. Since I was unable to find a way to identify which systems need which rule, my solution was to enable both rules.

I have added the changes file and improved the commit message.

comment:7 Changed 13 months ago by toralf

Bad news: with 0.3.4.8 on a stable hardened Gentoo desktop I run into this again:

t44 tmp # cat info.log 
Sep 11 19:33:56.000 [notice] Tor 0.3.4.8 (git-da95b91355248ad8) opening new log file.
Sep 11 19:33:56.000 [warn] Your log may contain sensitive information - you're logging more than "notice". Don't log unless it serves an important reason. Overwrite the log afterwards.
Sep 11 19:33:56.000 [info] options_act_reversible(): Recomputed OOS thresholds: ConnLimit 1000, ConnLimit_ 29968, ConnLimit_high_thresh 29904, ConnLimit_low_thresh 22476
Sep 11 19:33:56.000 [info] tor_lockfile_lock(): Locking "/var/lib/tor/data/lock"
Sep 11 19:33:56.000 [info] or_state_load(): Loaded state from "/var/lib/tor/data/state"
Sep 11 19:33:56.000 [info] sampled_guards_update_from_consensus(): Not updating the sample guard set; we have no live consensus.
Sep 11 19:33:56.000 [info] sample_reachable_filtered_entry_guards(): Trying to sample a reachable guard: We know of 0 in the USABLE_FILTERED set.
Sep 11 19:33:56.000 [info] sample_reachable_filtered_entry_guards():   (That isn't enough. Trying to expand the sample.)
Sep 11 19:33:56.000 [info] entry_guards_expand_sample(): Not expanding the sample guard set; we have no live consensus.
Sep 11 19:33:56.000 [info] sample_reachable_filtered_entry_guards():   (After filters [b], we have 0 guards to consider.)
Sep 11 19:33:56.000 [info] circuit_build_times_parse_state(): Adding 145 timeouts.
Sep 11 19:33:56.000 [info] circuit_build_times_parse_state(): Loaded 1000/1000 values from 122 lines in circuit time histogram
Sep 11 19:33:56.000 [info] circuit_build_times_get_xm(): Xm mode #0: 225 135
Sep 11 19:33:56.000 [info] circuit_build_times_get_xm(): Xm mode #1: 275 113
Sep 11 19:33:56.000 [info] circuit_build_times_get_xm(): Xm mode #2: 225 135
Sep 11 19:33:56.000 [info] circuit_build_times_set_timeout(): Based on 1000 circuit times, it looks like we don't need to wait so long for circuits to finish. We will now assume a circuit is too slow to use after waiting 4 seconds.
Sep 11 19:33:56.000 [info] circuit_build_times_set_timeout(): Circuit timeout data: 4126.277304ms, 60000.000000ms, Xm: 239, a: 0.564979, r: 0.210000
Sep 11 19:33:56.000 [info] read_file_to_str(): Could not open "/var/lib/tor/data/router-stability": No such file or directory
Sep 11 19:33:56.000 [info] init_cookie_authentication(): Generated auth cookie file in '"/var/lib/tor/data/control_auth_cookie"'.
Sep 11 19:33:56.000 [info] scheduler_kist_set_full_mode(): Setting KIST scheduler with kernel support (KIST mode)
Sep 11 19:33:56.000 [info] cmux_ewma_set_options(): Enabled cell_ewma algorithm because of value in CircuitPriorityHalflifeMsec in consensus; scale factor is 0.793701 per 10 seconds
Sep 11 19:33:56.000 [notice] Parsing GEOIP IPv4 file /usr/share/tor/geoip.
Sep 11 19:33:56.000 [notice] Parsing GEOIP IPv6 file /usr/share/tor/geoip6.
Sep 11 19:33:56.000 [info] add_predicted_port(): New port prediction added. Will continue predictive circ building for 2494 more seconds.
Sep 11 19:33:56.000 [info] crypto_global_init(): NOT using OpenSSL engine support.
Sep 11 19:33:56.000 [info] evaluate_evp_for_aes(): This version of OpenSSL has a known-good EVP counter-mode implementation. Using it.
Sep 11 19:33:56.000 [notice] We were built to run on a 64-bit CPU, with OpenSSL 1.0.1 or later, but with a version of OpenSSL that apparently lacks accelerated support for the NIST P-224 and P-256 groups. Building openssl with such support (using the enable-ec_nistp_64_gcc_128 option when configuring it) would make ECDH much faster.
Sep 11 19:33:56.000 [notice] Bootstrapped 0%: Starting
Sep 11 19:33:56.000 [warn] Could not open "/var/lib/tor/data/cached-certs": Permission denied
Sep 11 19:33:56.000 [warn] Could not open "/var/lib/tor/data/cached-consensus": Permission denied
Sep 11 19:33:56.000 [warn] Could not open "/var/lib/tor/data/unverified-consensus": Permission denied
Sep 11 19:33:56.000 [warn] Could not open "/var/lib/tor/data/cached-microdesc-consensus": Permission denied
Sep 11 19:33:56.000 [warn] Could not open "/var/lib/tor/data/unverified-microdesc-consensus": Permission denied
Sep 11 19:33:56.000 [warn] Could not open "/var/lib/tor/data/cached-microdescs" for mmap(): Permission denied
Sep 11 19:33:56.000 [warn] Could not open "/var/lib/tor/data/cached-microdescs.new": Permission denied
Sep 11 19:33:56.000 [info] microdesc_cache_reload(): Reloaded microdescriptor cache. Found 0 descriptors.
Sep 11 19:33:56.000 [warn] Could not open "/var/lib/tor/data/cached-descriptors" for mmap(): Permission denied
Sep 11 19:33:56.000 [warn] Could not open "/var/lib/tor/data/cached-extrainfo" for mmap(): Permission denied
Sep 11 19:33:56.000 [notice] Starting with guard context "default"
Sep 11 19:33:56.000 [info] sampled_guards_update_from_consensus(): Not updating the sample guard set; we have no live consensus.
Sep 11 19:33:56.000 [info] sample_reachable_filtered_entry_guards(): Trying to sample a reachable guard: We know of 0 in the USABLE_FILTERED set.
Sep 11 19:33:56.000 [info] sample_reachable_filtered_entry_guards():   (That isn't enough. Trying to expand the sample.)
Sep 11 19:33:56.000 [info] entry_guards_expand_sample(): Not expanding the sample guard set; we have no live consensus.
Sep 11 19:33:56.000 [info] sample_reachable_filtered_entry_guards():   (After filters [b], we have 0 guards to consider.)
Sep 11 19:33:56.000 [info] I learned some more directory information, but not enough to build a circuit: We have no usable consensus.
Sep 11 19:33:57.000 [info] update_consensus_bootstrap_attempt_downloads(): Launching microdesc bootstrap mirror networkstatus consensus download.
Sep 11 19:33:57.000 [info] sample_reachable_filtered_entry_guards(): Trying to sample a reachable guard: We know of 0 in the USABLE_FILTERED set.
Sep 11 19:33:57.000 [info] sample_reachable_filtered_entry_guards():   (That isn't enough. Trying to expand the sample.)
Sep 11 19:33:57.000 [info] entry_guards_expand_sample(): Not expanding the sample guard set; we have no live consensus.
Sep 11 19:33:57.000 [info] sample_reachable_filtered_entry_guards():   (After filters [7], we have 0 guards to consider.)
Sep 11 19:33:57.000 [info] select_entry_guard_for_circuit(): Absolutely no sampled guards were available. Marking all guards for retry and starting from top again.
Sep 11 19:33:57.000 [info] directory_pick_generic_dirserver(): No router found for consensus network-status fetch; falling back to dirserver list.
Sep 11 19:33:57.000 [info] connection_ap_make_link(): Making internal direct tunnel to [scrubbed]:22 ...
Sep 11 19:33:57.000 [info] connection_ap_make_link(): ... application connection created and linked.
Sep 11 19:33:57.000 [info] directory_send_command(): Downloading consensus from 176.31.180.157:22 using /tor/status-vote/current/consensus-microdesc/0232AF+14C131+23D15D+27102B+49015F+D586D1+E8A9C4+ED03BB+EFCBE7.z

============================================================ T= 1536687237
(Sandbox) Caught a bad syscall attempt (syscall openat)
/usr/bin/tor(+0x191e3a)[0x55ec09e79e3a]
/lib64/libpthread.so.0(open64+0x5d)[0x7efc590023ad]
/lib64/libpthread.so.0(open64+0x5d)[0x7efc590023ad]
/usr/bin/tor(tor_open_cloexec+0x40)[0x55ec09e606a0]
/usr/bin/tor(start_writing_to_file+0x16a)[0x55ec09e7420a]
/usr/bin/tor(+0x18c2eb)[0x55ec09e742eb]
/usr/bin/tor(+0x18c438)[0x55ec09e74438]
/usr/bin/tor(or_state_save+0x151)[0x55ec09da0401]
/usr/bin/tor(+0x503dd)[0x55ec09d383dd]
/usr/bin/tor(+0x6bc71)[0x55ec09d53c71]
/usr/lib64/libevent-2.1.so.6(+0x226bd)[0x7efc59ecc6bd]
/usr/lib64/libevent-2.1.so.6(event_base_loop+0x4e7)[0x7efc59ecd377]
/usr/bin/tor(do_main_loop+0x17a)[0x55ec09d3c33a]
/usr/bin/tor(tor_run_main+0x11a5)[0x55ec09d3e8b5]
/usr/bin/tor(tor_main+0x3a)[0x55ec09d36bba]
/usr/bin/tor(main+0x19)[0x55ec09d36949]
/lib64/libc.so.6(__libc_start_main+0xfd)[0x7efc58c4005d]
/usr/bin/tor(_start+0x2a)[0x55ec09d3699a]

comment:8 Changed 13 months ago by teor

Keywords: 034-backport added
Status: needs_revision → needs_review

comment:9 Changed 13 months ago by ahf

@toralf, is this with the patch from comment #6?

comment:10 Changed 13 months ago by toralf

You mean commit 7eb6458b61499d9042214a34c2b790ed8a1b229f of https://github.com/Jigsaw52/tor.git ?
No, I didn't apply that.
I assumed that the patch had already been pulled into 0.3.4.8, but it seems it was not. Sorry for the noise; I'll patch it myself.

comment:11 Changed 13 months ago by toralf

FWIW: the patch applied cleanly against 0.3.4.8 and works on a stable hardened Gentoo.


comment:12 Changed 12 months ago by ahf

Status: needs_review → needs_information

We are a bit concerned about adding an additional rule for each openat call, because seccomp only gives us a limited number of rules to work with.

Do we think we are able to eliminate the additional rule so it's either one of the rules? How do we know which users are impacted by this?

comment:13 in reply to:  12 Changed 11 months ago by Jigsaw52

Replying to ahf:

Do we think we are able to eliminate the additional rule so it's either one of the rules? How do we know which users are impacted by this?

Ideally, each rule should only be applied on the systems that need it. The tricky part is figuring out why some systems need each rule. I have tried to identify some difference between toralf's system (which is affected) and mine (which isn't), but failed to do so. I could try further, but I need the collaboration of someone affected.

comment:14 Changed 10 months ago by Jigsaw52

I made a small program that uses the same code as tor to set up seccomp rules and then opens a file, and I get the same behavior as I do with tor (crash before the #25440 fix, success after). It also prints the pseudo-code for the rules generated by libseccomp.

The code is here: https://pastebin.com/zQz1sf22

Compile with: gcc test.c -o test -l seccomp
Run without arguments to use the rule after the #25440 fix: ./test
Run with any arguments to use the rule before the #25440 fix: ./test foo

If someone affected by this could run the program with and without arguments and post both outputs, that might help us find out what's different between our systems.

comment:15 Changed 10 months ago by Jigsaw52

Cc: danielpinto52@… added

comment:16 Changed 9 months ago by toralf

This is at my hardened Gentoo relay:

mr-fox ~ # gcc test.c -o test -l seccomp
test.c: In function ‘main’:
test.c:92:14: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 2 has type ‘int’ [-Wformat=]
   printf("%llu %llu\n", AT_FDCWD, (unsigned int)AT_FDCWD);
           ~~~^
           %u
test.c:92:19: warning: format ‘%llu’ expects argument of type ‘long long unsigned int’, but argument 3 has type ‘unsigned int’ [-Wformat=]
   printf("%llu %llu\n", AT_FDCWD, (unsigned int)AT_FDCWD);
                ~~~^
                %u
mr-fox ~ # ./test
Testing rule before fix.
#
# pseudo filter code start
#
# filter for arch x86_64 (3221225534)
if ($arch == 3221225534)
  # filter for syscall "fstat64" (-10010) [priority: 65535]
  if ($syscall == -10010)
    action ALLOW;
  # filter for syscall "exit_group" (231) [priority: 65535]
  if ($syscall == 231)
    action ALLOW;
  # filter for syscall "rt_sigreturn" (15) [priority: 65535]
  if ($syscall == 15)
    action ALLOW;
  # filter for syscall "fstat" (5) [priority: 65535]
  if ($syscall == 5)
    action ALLOW;
  # filter for syscall "write" (1) [priority: 65535]
  if ($syscall == 1)
    action ALLOW;
  # filter for syscall "read" (0) [priority: 65535]
  if ($syscall == 0)
    action ALLOW;
  # filter for syscall "openat" (257) [priority: 65531]
  if ($syscall == 257)
    if ($a0.hi32 == 4294967295)
      if ($a0.lo32 == 4294967196)
        if ($a1.hi32 == 22044)
          if ($a1.lo32 == 1279610616)
            action ALLOW;
  # default action
  action KILL;
# invalid architecture action
action KILL;
#
# pseudo filter code end
#
GNU libc version: 2.27
GNU libc release: stable
libseccomp 2.3.3
18446744073709551516 4294967196
4294967196 4294967196
Before openat
Bad system call
mr-fox ~ # ./test foo
Testing rule before fix.
#
# pseudo filter code start
#
# filter for arch x86_64 (3221225534)
if ($arch == 3221225534)
  # filter for syscall "fstat64" (-10010) [priority: 65535]
  if ($syscall == -10010)
    action ALLOW;
  # filter for syscall "exit_group" (231) [priority: 65535]
  if ($syscall == 231)
    action ALLOW;
  # filter for syscall "rt_sigreturn" (15) [priority: 65535]
  if ($syscall == 15)
    action ALLOW;
  # filter for syscall "fstat" (5) [priority: 65535]
  if ($syscall == 5)
    action ALLOW;
  # filter for syscall "write" (1) [priority: 65535]
  if ($syscall == 1)
    action ALLOW;
  # filter for syscall "read" (0) [priority: 65535]
  if ($syscall == 0)
    action ALLOW;
  # filter for syscall "openat" (257) [priority: 65531]
  if ($syscall == 257)
    if ($a0.hi32 == 4294967295)
      if ($a0.lo32 == 4294967196)
        if ($a1.hi32 == 22046)
          if ($a1.lo32 == 1081806584)
            action ALLOW;
  # default action
  action KILL;
# invalid architecture action
action KILL;
#
# pseudo filter code end
#
GNU libc version: 2.27
GNU libc release: stable
libseccomp 2.3.3
18446744073709551516 4294967196
4294967196 4294967196
Before openat
Bad system call

comment:17 Changed 4 months ago by nickm

Milestone: Tor: 0.3.4.x-final → Tor: 0.3.5.x-final

0.3.4 is EOL; if we're going to get the info we need for these, it will be under 0.3.5.

comment:18 Changed 4 months ago by nickm

Keywords: 034-backport removed

Removing 034-backport from all open tickets: 034 has reached EOL.
