Opened 12 months ago

Closed 9 months ago

#32472 closed defect (fixed)

buf_flush_to_tls: Non-fatal assertion !(flushlen > *buf_flushlen)

Reported by: dgoulet Owned by: nickm
Priority: Medium Milestone: Tor: 0.4.3.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: assert, tor-connection 043-must
Cc: Actual Points:
Parent ID: Points:
Reviewer: dgoulet Sponsor:

Description

Had this on my relay because I ran out of disk space due to debug.log being on.

Nov 05 12:39:01.801 [warn] tor_bug_occurred_(): Bug: src/lib/tls/buffers_tls.c:152: buf_flush_to_tls: Non-fatal assertion !(flushlen > *buf_flushlen) failed. (Future instances of this warning will be silenced.) (on Tor 0.4.3.0-alpha-dev 4413b98190d94b54)

So the code that exploded is:

  if (BUG(flushlen > *buf_flushlen)) {
    flushlen = *buf_flushlen;
  }

It was triggered by lack of disk space for sure.

The only thing I can see is if connection_handle_write_impl() was called with force = 1 which happens with connection_flush() which makes tor use the bucket limit instead of the outbuf "flushlen" and thus could lead to that assert().

Child Tickets

Change History (8)

comment:1 Changed 10 months ago by nickm

Keywords: 043-must added

comment:2 Changed 9 months ago by nickm

Owner: set to nickm
Status: newassigned

I wonder why this didn't give a stack trace. I'll try to figure this out, though it might be as simple as changing it to IF_BUG_ONCE if I can't figure it out.

comment:3 Changed 9 months ago by nickm

I've made a child ticket to do the IF_BUG_ONCE fix. Investigating this one more closely.

comment:4 in reply to:  description Changed 9 months ago by nickm

Replying to dgoulet:

The only thing I can see is if connection_handle_write_impl() was called with force = 1 which happens with connection_flush() which makes tor use the bucket limit instead of the outbuf "flushlen" and thus could lead to that assert().

I think it's force == 1 that makes us not use connection_bucket_write_limit(). So anyway, let's look more deeply at connection_bucket_write_limit, asking: when can it ever return a value that is greater than conn->outbuf_flushlen?

Side observation 1: we're being a bit messy with ssize_t versus size_t here. I wonder if that's an issue? I'm told this is a 64-bit host: if so, we can't possibly be overflowing SSIZE_MAX. But could we be underflowing 0?

The return value from connection_is_rate_limited() will come from connection_bucket_get_share(), called with its conn_bucket value no greater than outbuf_flushlen. So let's look at that function, and ask whether it can return something greater than conn_bucket?

I think that we maybe can:

  if (conn_bucket >= 0 && at_most > conn_bucket)
    at_most = conn_bucket;

  if (at_most < 0)
    return 0;

If conn_bucket is negative, then we will not clip at_most to be no more than conn_bucket. Of course, the real outbuf_flushlen is unsigned and can't be negative, but in principle it _could_ underflow!

comment:5 Changed 9 months ago by nickm

After a bunch of examination, however, I can't actually find a way for the underflow to happen. I wonder if it would make sense to shift this over to ssize_t everywhere, just in case.

comment:6 Changed 9 months ago by nickm

Status: assignedneeds_review

I'm thinking that the actual short-term fix for this should be only #33093 [making the warnings not repeat], and the correct long-term fix is to do #33097 [remove outbuf_flushlen entirely] in 0.4.4. With that in mind, I'm calling this ticket "needs_review" because its child is in needs_review, and that's all we should need to do to call this "fixed enough for 0.4.3."

comment:7 Changed 9 months ago by dgoulet

Reviewer: dgoulet

comment:8 Changed 9 months ago by dgoulet

Resolution: fixed
Status: needs_reviewclosed

#33093 was merged.

Note: See TracTickets for help on using tickets.