Opened 4 years ago

Closed 3 years ago

#18362 closed enhancement (implemented)

Tor could use a generic 'handle' implementation.

Reported by: nickm Owned by: nickm
Priority: High Milestone: Tor: 0.2.9.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: modularity, TorCoreTeam201604, tor-modularity
Cc: Actual Points:
Parent ID: Points: small
Reviewer: dgoulet Sponsor: SponsorS-can

Description

Frequently we want to have one object have a pointer to another, but we don't want to have the first object own the second. In these cases, we need to do one of the following ugly C dances:

  • We make sure that the pointed-to object never outlives the pointing object.
  • We make sure that when the pointed-to object is freed, the pointer to it is set to NULL.
  • Instead of using a pointer, we use some kind of unique identifier, and look up the pointed-to object in a hash table.

The first two options are error-prone, and the third is slower than regular pointer access.

Instead of these choices, we could use a 'handle' pattern to create a standard way to look up objects indirectly; we could use some of the tricks from a usual 'weak reference' implementation. Ideally, we could write the interface in such a way as to permit more than one possible implementation.

The branch weakref in my public repository has some janky progress towards a solution here.

Child Tickets

Change History (18)

comment:1 Changed 4 years ago by nickm

Owner: set to nickm
Status: newaccepted

comment:2 Changed 4 years ago by nickm

Type: defectenhancement

comment:3 in reply to:  description Changed 4 years ago by cypherpunks

Replying to nickm:

  • We make sure that the pointed-to object never outlives the pointing object.

I think you have this backwards, nickm.

comment:4 in reply to:  description Changed 4 years ago by cypherpunks

Replying to nickm:

the third is slower than regular pointer access.

If you have a thread safe implementation, that's not really the case. You are going to use several atomic operations for each access, so a hashtable access wouldn't add that much overhead (probably comparable to 1 atomic operation using an int/int64 key).

Your weakref implementation looks fundamentally broken. You need to convert the weak reference to a strong one before using the pointed to object. Otherwise, the object can get deleted in a multi-threaded context. You may want to look at C++ shared_ptr/weak_ptr.

comment:5 Changed 4 years ago by nickm

Summary: Tor could use a generic 'handle' / 'weakref' implementation.Tor could use a generic 'handle' implementation.

comment:6 Changed 4 years ago by nickm

Sponsor: SponsorSSponsorS-can

comment:7 Changed 4 years ago by nickm

The branch is now called handle, to emphasize that it's not the usual kind of weak reference.

comment:8 Changed 4 years ago by nickm

Keywords: TorCoreTeam201604 added

For March, I aim to have alpha code here. For april I aim to merge.

comment:9 Changed 4 years ago by nickm

Status: acceptedneeds_review

comment:10 Changed 4 years ago by dgoulet

Reviewer: dgoulet

comment:11 Changed 4 years ago by nickm

Keywords: tor-modularity added

comment:12 Changed 3 years ago by dgoulet

Status: needs_reviewneeds_revision

Some review comments:

  • In _handle_new, I don't think the two lines below need to be in the critical section. How I understand it, the mutex is used to protect the head object (handle_head) but the following doesn't do memory access on it.
    name ## _handle_t *new_ref = tor_malloc_zero(sizeof(*new_ref));
    new_ref->head = head;
  • Probably all refcount increment/decrement could be done atomically which would remove the locking in the _handle_new function (small improvement).

General comment. I think we could achieve this without locking (only atomic op). I find it a bit weird to use locking in something that in a multi-thread context would not be very useful since it doesn't give any guarantee on the validity of object pointer. In that context, even if _handle_get sends me back a non NULL pointer, I have no clue if that pointer is usable or not just after.

comment:13 in reply to:  12 Changed 3 years ago by cypherpunks

Replying to dgoulet:

Some review comments:

  • In _handle_new, I don't think the two lines below need to be in the critical section. How I understand it, the mutex is used to protect the head object (handle_head) but the following doesn't do memory access on it.
    name ## _handle_t *new_ref = tor_malloc_zero(sizeof(*new_ref));
    new_ref->head = head;

You overlooked the more obvious blunder: The locking is too lax, the creation of the head is not exclusive and, as a result, simultaneous calls to *_new() can allocate multiple heads.
The key aspect the implementer should realise is that the object and the head are *both* shared resources, so access to them should be exclusive.

And now you realise that the lock cannot be (only, or at all, it depends) inside the head, for you need to hold the lock before creating the head!

More comments later.

comment:14 Changed 3 years ago by nickm

Makes sense to drop the MT support entirely until we have a case that needs it. Do we have one pending?

comment:15 Changed 3 years ago by cypherpunks

So, this implementation is totally broken for multithreaded use (I thought this was the main use-case). I found several other synchonisation errors, but seeing how the whole thing would probably need rework, it doesn't seem like you'd be particularly interested, are you?

The only other thing I could add is that _new() and _clear() are missing a tor_assert(object) at the start. And that maybe you should be more clear in the documentation about the critical importance of calling _clear before destroying the object.

comment:16 Changed 3 years ago by nickm

Status: needs_revisionneeds_review

Removed busted MT support; addressed other comments.

comment:17 Changed 3 years ago by dgoulet

Status: needs_reviewmerge_ready

We can basically use this as a "delete flag" for an object in a single thread environment instead of updating every reference to an object once it gets freed. I don't have any use case from the top of my head that could use that but I'm sure they exist and will come. And heck, if this becomes thread-safe one day, extra bonus :).

Before you merge, nitpick: please change this at the end: #endif /* TOR_HANDLE_H */

lgtm;

comment:18 Changed 3 years ago by nickm

Resolution: implemented
Status: merge_readyclosed

fixed; squashed; merged. Thanks for the reviews

Note: See TracTickets for help on using tickets.