Opened 5 months ago

Closed 5 months ago

#24633 closed defect (fixed)

to->pending->tqh_last is 0xFFFFFFFFFFFFFFFF。

Reported by: sx5486510 Owned by: nickm
Priority: Medium Milestone: Tor: 0.2.9.x-final
Component: Core Tor/Tor Version: Tor: 0.3.2.1-alpha
Severity: Normal Keywords: msvc 029-backport 030-backport 031-backport
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

I'm debuging tor. And it throw an error at function timeouts_sched,

The version is: release-0.3.2

this is the stack:

tor.exe!timeouts_sched(timeouts * T=0x0000000000517e90, timeout * to=0x0000000003307450, unsigned int64 expires=2123027) 行 355 C

tor.exe!timeouts_add(timeouts * T=0x0000000000517e90, timeout * to=0x0000000003307450, unsigned int64 timeout=6290) 行 394 C
tor.exe!timer_schedule(timeout * t=0x0000000003307450, const timeval * tv=0x000000000027dee8) 行 300 C
tor.exe!channelpadding_schedule_padding(channel_s * chan=0x00000000032f7c80, int in_ms=629) 行 478 C
tor.exe!channelpadding_decide_to_pad_channel(channel_s * chan=0x00000000032f7c80) 行 783 C
tor.exe!run_connection_housekeeping(int i=3,
int64 now=1513329560) 行 1132 C
tor.exe!run_scheduled_events(int64 now=1513329560) 行 1464 C
tor.exe!second_elapsed_callback(periodic_timer_t * timer=0x00000000042a5550, void * arg=0x0000000000000000) 行 2216 C
tor.exe!periodic_timer_cb(
int64 fd=-1, short what=1, void * arg=0x00000000042a5550) 行 187 C
tor.exe!event_persist_closure(event_base * base=0x00000000004e8fa0, event * ev=0x000000000427d200) 行 1532 C
tor.exe!event_process_active_single_queue(event_base * base=0x00000000004e8fa0, evcallback_list * activeq=0x00000000004e8eb0, int max_to_process=2147483647, const timeval * endtime=0x0000000000000000) 行 1591 C
tor.exe!event_process_active(event_base * base=0x00000000004e8fa0) 行 1689 C
tor.exe!event_base_loop(event_base * base=0x00000000004e8fa0, int flags=0) 行 1912 C
tor.exe!run_main_loop_once() 行 2631 C
tor.exe!run_main_loop_until_done() 行 2685 C
tor.exe!do_main_loop() 行 2599 C
tor.exe!tor_main(int argc=1, char * * argv=0x000000000046c850) 行 3780 C
tor.exe!main(int argc=1, char * * argv=0x000000000046c850) 行 34 C

1.Run tor
2.when it say Bootstrapped 100%: Done, disable network
3.enable network
It will crush.

timeout.c
#if !defined WHEEL_NUM
#define WHEEL_NUM 4
#endif

...

struct timeouts {

struct timeout_list wheel[WHEEL_NUM][WHEEL_LEN], expired;
...

}

In function timeouts_sched,
int wheel, slot;
...
wheel = timeout_wheel(rem);
...
to->pending = &T->wheel[wheel][slot];

if wheel >= WHEEL_NUM, it will crush.

Child Tickets

Change History (10)

comment:1 Changed 5 months ago by gk

Component: - Select a componentCore Tor/Tor

comment:2 Changed 5 months ago by nickm

Milestone: Tor: 0.3.2.x-final
Owner: set to nickm
Status: newaccepted

comment:3 Changed 5 months ago by nickm

Keywords: msvc 029-backport 030-backpot 031-backpot added

Have you checked what the actual value of wheel was in this case? For the call stack you're showing, rem should be 6290. Assuming WHEEL_BIT == 6 (the default) and WHEEL_NUM == 5 (set in src/common/timers.c), we have wheel == timeout_wheel(rem).

That function is defined as

static inline int timeout_wheel(timeout_t timeout) {
	/* must be called with timeout != 0, so fls input is nonzero */
	return (fls(MIN(timeout, TIMEOUT_MAX)) - 1) / WHEEL_BIT;
} /* timeout_wheel() */

so:

(MIN(6290,TIMEOUT_MAX) == 6290)
fls(6290) == (64 - clz64(6290)) == (64 - 51) == 13
(13 - 1) / WHEEL_BIT == 12 / 6 == 2

So if you're seeing wheel >= 5, there might be a bug here. Maybe in the implementation of clz64?

To test that -- if you build timeout_bitops.c as a standalone binary, with TEST_BITOPS defined, do the tests there fail? I suspect that this may be a MSVC-only bug if so. At first glance, it seems that the definitions of the various functions in timeout-bits.c for MSVC could be wrong. If so, that would seem to match up with what's reported upstream at https://github.com/wahern/timeout/issues/22 .

comment:4 Changed 5 months ago by cypherpunks

Keywords: 030-backport 031-backport added; 030-backpot 031-backpot removed

pot, hmm...

comment:5 Changed 5 months ago by sx5486510

Yes, i run the test, it faild at if (clz64(vv) != naive_clz(64, vv)), any things to resolve it?

Thanks

comment:6 Changed 5 months ago by sx5486510

I modify the function like this:
src/ext/timeouts/timeout-bitops.c

...

#define LT(n) n, n, n, n, n, n, n, n, n, n, n, n, n, n, n, n
static const char logTable[256] = {

0, 1, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 4, 4, 4, 4,
LT(5), LT(6), LT(6), LT(7), LT(7), LT(7), LT(7),
LT(8), LT(8), LT(8), LT(8), LT(8), LT(8), LT(8), LT(8)

};
#undef LT

...

static inline int clz32(unsigned long val)
{

unsigned int t;

if ((t = val >> 24)) {

return 8 - logTable[t];

}
else if ((t = val >> 16)) {

return 16 - logTable[t];

}
else if ((t = val >> 8)) {

return 24 - logTable[t];

}
else {

return 32 - logTable[val];

}
DWORD zeros = 0;
_BitScanReverse(&zeros, val);
return zeros;

}
...

static inline int clz64(uint64_t val)
{

unsigned int64 t;

if ((t = val >> 56)) {

return 8 - logTable[t];

}
else if ((t = val >> 48)) {

return 16 - logTable[t];

}
else if ((t = val >> 40)) {

return 24 - logTable[t];

}
else if ((t = val >> 32)) {

return 32 - logTable[t];

}
else if ((t = val >> 24)) {

return 40 - logTable[t];

}
else if ((t = val >> 16)) {

return 48 - logTable[t];

}
else if ((t = val >> 8)) {

return 56 - logTable[t];

}
else {

return 64 - logTable[val];

}

DWORD zeros = 0;
_BitScanReverse64(&zeros, val);
return zeros;

}
...

The code is copy from here: https://fossies.org/linux/Pike/src/bitvector.h

The test is pass, but i'm not sure the code will return correct result in any case.

comment:7 in reply to:  6 Changed 5 months ago by yawning

Replying to sx5486510:

The code is copy from here: https://fossies.org/linux/Pike/src/bitvector.h

https://fossies.org/linux/Pike/COPYING

A more appropriate fix is documented in the upstream ticket anyway...

Last edited 5 months ago by yawning (previous) (diff)

comment:8 Changed 5 months ago by nickm

Status: acceptedneeds_review

If you can test (running the bitops tests and running tor) using the patch here, that would be confirmation that the approach works:

diff --git a/src/ext/timeouts/timeout-bitops.c b/src/ext/timeouts/timeout-bitops.c
index a018f33b958500..45466f6cb313ca 100644
--- a/src/ext/timeouts/timeout-bitops.c
+++ b/src/ext/timeouts/timeout-bitops.c
@@ -40,7 +40,7 @@ static __inline int clz32(unsigned long val)
 {
        DWORD zeros = 0;
        _BitScanReverse(&zeros, val);
-       return zeros;
+       return 31 - zeros;
 }
 #ifdef _WIN64
 /* According to the documentation, these only exist on Win64. */
@@ -54,7 +54,7 @@ static __inline int clz64(uint64_t val)
 {
        DWORD zeros = 0;
        _BitScanReverse64(&zeros, val);
-       return zeros;
+       return 63 - zeros;
 }
 #else
 static __inline int ctz64(uint64_t val)

Also available in my public git repo in a branch called bug24633_029.

comment:9 in reply to:  8 Changed 5 months ago by sx5486510

It works! thx bro!

Replying to nickm:

If you can test (running the bitops tests and running tor) using the patch here, that would be confirmation that the approach works:

diff --git a/src/ext/timeouts/timeout-bitops.c b/src/ext/timeouts/timeout-bitops.c
index a018f33b958500..45466f6cb313ca 100644
--- a/src/ext/timeouts/timeout-bitops.c
+++ b/src/ext/timeouts/timeout-bitops.c
@@ -40,7 +40,7 @@ static __inline int clz32(unsigned long val)
 {
        DWORD zeros = 0;
        _BitScanReverse(&zeros, val);
-       return zeros;
+       return 31 - zeros;
 }
 #ifdef _WIN64
 /* According to the documentation, these only exist on Win64. */
@@ -54,7 +54,7 @@ static __inline int clz64(uint64_t val)
 {
        DWORD zeros = 0;
        _BitScanReverse64(&zeros, val);
-       return zeros;
+       return 63 - zeros;
 }
 #else
 static __inline int ctz64(uint64_t val)

Also available in my public git repo in a branch called bug24633_029.

comment:10 Changed 5 months ago by nickm

Milestone: Tor: 0.3.2.x-finalTor: 0.2.9.x-final
Resolution: fixed
Status: needs_reviewclosed

okay; merged my branch to 0.2.9 and forward.

Note: See TracTickets for help on using tickets.