Opened 7 months ago

Closed 6 months ago

#24337 closed enhancement (implemented)

Every _free() function should be a macro that sets the corresponding pointer to NULL.

Reported by: nickm Owned by: nickm
Priority: Medium Milestone: Tor: 0.3.3.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: review-group-26, review-group-27
Cc: Actual Points:
Parent ID: Points:
Reviewer: dgoulet Sponsor: Sponsor8-can

Description

Our tor_free() macros has saved us a bunch of trouble in the past by ensuring that once we've freed a pointer, that pointer gets set to NULL.

Wouldn't it be cool if all of our free() calls did this? It should make our code generally less error-prone.

As we work on #23847, I bet we will find it extremely helpful.

Child Tickets

Change History (24)

comment:1 Changed 7 months ago by nickm

Status: assignedneeds_review

In my branch macro_free, I've started doing this. I got through all of src/common/*.h, and then decided I'd take a break and ask for comment before going any further.

comment:2 Changed 7 months ago by nickm

(On IRC, catalyst suggested that we should globally upcase these macros and any other magic macros that don't behave like C functions. I agree; let's do that in another ticket.)

comment:3 Changed 7 months ago by nickm

Sponsor: Sponsor8-can

My macro_free branch is now ready for review.

It only hits the free functions defined in .h files.

I expect that when we merge this, we'll need to regenerate the patches that make the macros uppercase: I've included the perl scripts to do so in the commit messaages.

comment:4 Changed 7 months ago by nickm

Note to the reviewer: you are not expected to review the enormous machine-generated commits! Instead I suggest that you re-run the perl scripts to verify that I really did generate those commits automatically, and then review the substitutions in those scripts to make sure that they look reasonable. This should save you a lot of time.

comment:5 Changed 7 months ago by nickm

Keywords: review-group-26 added

Creating review-group-26.

comment:6 Changed 7 months ago by asn

Hey, are we really sure we want to do this tor_free() -> TOR_FREE() substitution? It looks kind of alien.

I wonder if the benefits outweight the ugliness here. I can probably get used to it given a bit of time, but I'm wondering if this is really necessary at this point. I haven't heard of many (if any) people get confused by this tor_free is macro, tor_free_() is function situation, and perhaps we could better resolve this confusion with a function doc on top of tor_free().

Furthermore, this seems like a pretty substantial readability/code-style change that has nothing to do with the original ticket purpose. Also, we have more macros that are lowercase (e.g. circuit_mark_for_close(), connection_mark_for_close(), get_primary_or_port(), etc.), do we really plan to turn these all to capitals?

$ grep -ER "#define [a-z]+" src/*/* | wc -l 
510

Do other people find this awkward, or do people like this code-style change?
What are the benefits and what are the drawbacks?

Last edited 7 months ago by asn (previous) (diff)

comment:7 Changed 7 months ago by dgoulet

I'm going to add my opinion to this.

There are cases (and maybe not standardize) where lowercase makes sense. For me, it does when the macro acts as a "function" that kind of could be a static inline but can't because we need to handle local variables (tor_free() for instance). And the lower case macro don't need to have pre-processor tricks applied to it (like doing arithmetic in it).

I think there is a separation of semantic between a lower and upper case macro and Tor kind of follows it as "acting as a function" vs constant or inline testing a value (ex: ERRNO_IS_EAGAIN()).

The case of tor_free() for me is a clear Tor wrapper around libc functions to make them safer and we have a many of those but they are functions. And I would prefer that it stays lower case for the semantic here linking that to our other wrappers and as tor_free_() is a function. Furthermore, there is a "if(predict_likely)" in there so one more argument in my opinion to treat it as a "function" that does have conditions and is a wrapper over the libc free().

Finally, this change will hurt the git blame for no gain imho. We'll actually bring confusion and more syntax split with the current code base as asn pointed out our other lower case macros (approx_time() is a big one.)

comment:8 Changed 7 months ago by nickm

So, I had originally left all these macros in lowercase, until someone (Taylor, I think?) asked me "so, what problems have we run into with tor_free() so far?" And the biggest problems I could think of is that people are confused because tor_free() does not actually behave like a function.

In C, if you call foo(ptr), and foo is a function, then the value of ptr cannot change. But tor_free(ptr) does change the value of ptr to NULL, which is something that tor_free() could not do if it were a function. New developers on Tor often don't expect this to happen.

That said, it is usually a pretty short learning curve, so it might be okay to leave everything as it is.

comment:9 Changed 7 months ago by dgoulet

Right. The "change the local value" is the part that is making it stand out as a "function". But having it upper case won't help much of making new developers realize it "could do" something like that imo... It might maybe motivate them to go look at the macro?...

TBH, I personally don't see that as a "problem" for new developers, as long as we document it properly. But also, every tor_* function, any developer should look at them when they use them because we do wrappers for safety and/or compatibility reasons which imo MUST be known to any developers.

We could help new developers by adding a note to our HACKING/ documentation on "if you use tor_*, please go read them" kind of statement.

comment:10 Changed 7 months ago by Hello71

I support uppercase for all macros. Even for long-time contributors that know tor_free is a macro, it increases cognitive overhead unnecessarily. In the absence of strong overriding factors, like "C does not support this", the code should be as easy to read as possible, with syntactic sugar that is as transparent as possible, without having to keep some extra thing in mind. A small thing, I agree, but death by a thousand cuts.

tl;dr even oldies can use a reminder

comment:11 Changed 7 months ago by nickm

Status: needs_reviewneeds_revision

After discussion at today's meeting, we're leaning towards having this be in lowercase, given various factors, and in the interest of avoiding a permanent bikeshed debate.

comment:12 Changed 7 months ago by nickm

Status: needs_revisionneeds_review

I've made a new branch based on the old one. It removes the uppercasing commits. It also adds a new pair of commits that change tor_free() and object_free() so that they can withstand usage like tor_free(ptr++);.

The branch is now macro_free_v2. The only new commits are d6549f0d1d68e09d260968af95016218ae16072d and 534e6094fc5336785105deeaacd487e9e79ebbab (and the revised changes file).

comment:13 Changed 7 months ago by nickm

Keywords: review-group-27 added

comment:14 Changed 7 months ago by dgoulet

I do like this as a safety measure. And code looks reasonable.

Questions and comments:

First thing. Can't we pass the typename instead of "some part of the type name" and then append the _t for FREE_AND_NULL()? I got confused when I saw this in hs_service.h because "hs_service" is not a thing at all in the code:

#define hs_service_free(s) FREE_AND_NULL(hs_service, (s))

... which also would probably allow to remove the UNMATCHED version?

On a more general note, what I'm concerned about is the added complexity that any free function of any object will kind of require in the future to follow our "code standards". Probably not unreasonable to ask for this but we are talking about an interesting requirement for future code to match this template:

void a_free_(a)
#define a_free(a) FREE_AND_NULL(<type of a>, (a))

Which basically creates two public symbols for the same job, one "safe" and one "unsafe" (for the definition of safe being source ptr to NULL).

I'm personally fine adapting here especially if this whole exercise becomes our practice and help mitigate use-after-free in some ways but we should just take a moment and realize the future of our free functions :).

comment:15 Changed 7 months ago by nickm

FWIW, we currently have 87 FREE_AND_NULL() cases, and 18 FREE_AND_NULL_UNMATCHED() cases. Another option might be to rename (many of) the remaining 18 types that have a free function that doesn't match their type name. :)

I do agree that this is more complexity for making a foo_free() function -- but I do think it is something we want to require.

comment:16 Changed 6 months ago by nickm

I've updated the branch so that there is only FREE_AND_NULL(), and everything uses it. Back to you, dgoulet. :)

comment:17 Changed 6 months ago by nickm

(and FREE_AND_NULL now does what FREE_AND_NULL_UNMATCHED) used to do.

comment:18 Changed 6 months ago by dgoulet

Status: needs_reviewneeds_revision

I like it! Although, did you do that by hand or with a script? I'm asking because I see some indentation issue when it goes to 3 lines which should be not that complicated to fix with a script?

+#define service_intro_point_free(ip)                            \
+  FREE_AND_NULL(hs_service_intro_point_t,             \
+                          service_intro_point_free_, (ip))

Ideally, we would like to have them lined up:

+#define service_intro_point_free(ip)              \
+  FREE_AND_NULL(hs_service_intro_point_t,         \
+                service_intro_point_free_, (ip))

Also, if we could have a note in our CodingStandards.md about this template, it would be grand!

Finally, correct me if I'm wrong, but this also *only* applies to non-static (public) free functions so if we merge this, we'll still have a series of static functions that don't do this "free and null" dance. Can we think of something to fix that? I mean, seems that we can only enforce this template to our public functions but nothing within a C code file unless we make all those symbols public and I'm not too keen about that :S.

comment:19 Changed 6 months ago by dgoulet

Reviewer: dgoulet

comment:20 Changed 6 months ago by nickm

I did the line-splitting by hand. I'm not super excited to go through and reindent all the lines -- though I'd run a script to do it, if we can have one that isn't super-disruptive.

I'd like to apply this to static free function as well, but that will take a little longer. Shall I give it a try?

comment:21 Changed 6 months ago by nickm

Status: needs_revisionneeds_review

Okay; I've made the requested changes. :)

comment:22 Changed 6 months ago by dgoulet

Status: needs_reviewmerge_ready

Ok I guess we are good to go. Compiles and make check passes. However, it is based on a head commit for which test-network-all is broken so I can't really test that part.

Last thing, let us make a note at the next weekly meeting to "announce" to everyone that this is a thing now because creating free() functions just became a bit more complex than just tor_free() :).

Extra idea for helping us but doesn't block this branch getting merged: What about we add an helper macro to declare/implement free functions which would use a standard naming convention (and we could have one that takes a free_fn for special cases:

#define DECL_FREE(name, typename)                 \
  void name##_(typename *__obj);                  \
  static inline void                              \
  name(typename *__obj) {                         \
    FREE_AND_NULL(typename, name##_, (__obj));     \
  }

As an example with hs_service_free():

+ DECL_FREE(hs_service_free, hs_service_t)
- void hs_service_free_(hs_service_t *service);
- #define hs_service_free(s) FREE_AND_NULL(hs_service_t, hs_service_free_, (s))

It makes the hs_service_free() inline, then declares the hs_service_free_() which needs to be implemented in a .c file for which we could use IMPL_FREE(hs_service_free, hs_service_t) kind of macro for the signature.

The thing I like about this is that it completely hides the fact that a second symbol with a trailing underscore exists and thus forces us to always use and consider hs_service_free instead of hs_service_free_. We could go one step further and use something like void __name## which would make the symbol not even in the "namespace" (or prefix it in its own namespace like "tor_free*").

Anyway, we don't have to change the whole patch again but I think it would 1) help and simplify the code base especially helping developers creating free function by using the DECL/IMPL macros and 2) hide the "private but not really private" real free function so we never make the mistake of using it directly.

comment:23 Changed 6 months ago by nickm

I don't think that macro example would work -- putting FREE_AND_NULL in an inline function would make it have no effect on the varfiable the functio is called on -- only obj would get set to NULL.

What you'd need here is some way do say

#define DECL_FREE(name, typename)                 \
  void name##_(typename *__obj);                  \
  #define name(obj) FREE_AND_NULL(typename, name##_, (obj))

but you can't do that in C.

Squashed and merged. Some conflicts resolved in "5ee0cccd49e57fad8c81".

comment:24 Changed 6 months ago by nickm

Resolution: implemented
Status: merge_readyclosed
Note: See TracTickets for help on using tickets.