Opened 3 years ago

Closed 10 months ago

#13827 closed defect (implemented)

Cell handling code duplication in channel.c

Reported by: rl1987 Owned by: pingl
Priority: Medium Milestone: Tor: 0.3.0.x-final
Component: Core Tor/Tor Version:
Severity: Normal Keywords: refactoring, easy
Cc: Actual Points:
Parent ID: Points: 0.1
Reviewer: dgoulet, nickm Sponsor:

Description

In 336c856e52d211aad6b40d29986264f3277a1327 :

  • channel_get_var_cell_handler() and channel_get_cell_handler() looks very similar
  • channel_write_cell(), channel_write_packed_cell() and channel_write_var_cell() is mostly duplicated code.
  • so are channel_queue_cell() and channel_queue_var_cell()

We should refactor to reduce the code duplication.

Child Tickets

Attachments (4)

test_channel.c.patch (7.0 KB) - added by pingl 11 months ago.
test_channel.c patch
relay.c.patch (832 bytes) - added by pingl 11 months ago.
relay.c patch
channel.h.patch (508 bytes) - added by pingl 11 months ago.
channel.h patch
channel.c.patch (4.4 KB) - added by pingl 11 months ago.
channel.c patch

Download all attachments as: .zip

Change History (24)

comment:1 Changed 3 years ago by rl1987

Owner: set to rl1987
Status: newaccepted

comment:2 Changed 3 years ago by balubenith

These functions as you have mentioned have same code so we need to modify it .
I would like to know what will be the solution of this bug as i am new to bug solving

comment:3 Changed 3 years ago by nickm

Probably the best way to handle this would be to figure out how and whether cell_t, var_cell_t, and packed_cell_t can (or should) be merged into a single type, and/or to look at dataflow to see if some of those functions can be rewritten to use the others instead.

One wrinkle here is that the *cell_t types need to be very efficient; every piece of data that Tor transmits goes through one of them.

As for eliminating duplicate code in general: I think it's a higher priority to do it for larger functions, and proportionally less valuable to do it for smaller functions.

comment:4 Changed 19 months ago by nickm

Points: small/medium
Severity: Normal

comment:5 Changed 11 months ago by pingl

channel_write_cell()channel_write_packed_cell() and channel_write_var_cell() are combined into the channel_write() function that takes as a 3rd argument the cell type. That's defined in channel.h as enum with 3 possible values: CELL, PACKED_CELL, VAR_CELL.

comment:6 Changed 11 months ago by nickm

Milestone: Tor: unspecifiedTor: 0.2.9.x-final
Status: acceptedneeds_review

comment:7 Changed 11 months ago by nickm

Keywords: review-group-9 added

comment:8 Changed 11 months ago by nickm

Owner: changed from rl1987 to pingl
Status: needs_reviewassigned

comment:9 Changed 11 months ago by nickm

Status: assignedneeds_review

comment:10 Changed 11 months ago by asn

Hey pingl, you might consider uploading your patches on a git repo somewhere, and then post a link to the branch here. As an example, here is another ticket where this is done:
https://trac.torproject.org/projects/tor/ticket/19487#comment:3

This will make review slightly easier and it's a better way to contribute code to Tor.

Thanks :)

comment:11 in reply to:  10 Changed 11 months ago by pingl

Sorry, here it is the link for the git tree :)

https://github.com/aigna/tor/tree/defect13827

Last edited 11 months ago by pingl (previous) (diff)

comment:12 Changed 11 months ago by dgoulet

Points: small/medium0.1
Reviewer: dgoulet
Status: needs_reviewneeds_revision

(I've pull your patch into bug13827_029_01 if anyone wants to see it from git.tpo in my repo.)

DG1: This worries me:

-  q.u.var.var_cell = var_cell;
+  q.u.fixed.cell = cell;

In theory, that could work since it's a union and all cell points there but kind of recipe for disaster and bad semantic. What you could do is take a reference on the right cell member of the union while in the switch case and then assign it after.

DG2: Can't you use CELL_QUEUE_* as the cell type?

DG3: Few things. I would rename ctype to cell_type. The switch case MUST have a default branch that you could do a BUG() on and bail. Finally, no need for extra space between the case and the end of the function.

Thanks for this!

comment:13 in reply to:  12 Changed 11 months ago by pingl

Please check the new patches. I think you are right about DG1 and also there is an issue about the fact that the dereferencing is not available outside the switch scope. That means that cell would remain of type void*. For these reasons I've moved the (still a bit duplicated) code into the switch cases. I'll add the default case (I wasn't sure how to manage it) and change the ctype.

Thanks for the comments, I am new to tor.

Also this is the link for the git commit https://github.com/aigna/tor/tree/defect13827/

Replying to dgoulet:

(I've pull your patch into bug13827_029_01 if anyone wants to see it from git.tpo in my repo.)

DG1: This worries me:

-  q.u.var.var_cell = var_cell;
+  q.u.fixed.cell = cell;

In theory, that could work since it's a union and all cell points there but kind of recipe for disaster and bad semantic. What you could do is take a reference on the right cell member of the union while in the switch case and then assign it after.

DG2: Can't you use CELL_QUEUE_* as the cell type?

DG3: Few things. I would rename ctype to cell_type. The switch case MUST have a default branch that you could do a BUG() on and bail. Finally, no need for extra space between the case and the end of the function.

Thanks for this!

Last edited 11 months ago by pingl (previous) (diff)

Changed 11 months ago by pingl

Attachment: test_channel.c.patch added

test_channel.c patch

Changed 11 months ago by pingl

Attachment: relay.c.patch added

relay.c patch

comment:14 Changed 11 months ago by nickm

Status: needs_revisionneeds_review

comment:15 Changed 11 months ago by nickm

Milestone: Tor: 0.2.9.x-finalTor: 0.3.0.x-final
Status: needs_reviewneeds_revision

More fundamentally, I think my issue with the channel.c patch is that it doesn't actually reduce the code duplication very much. There are still 3 CHANNEL_IS_CLOSING() checks that are more or less the same, 6 log statements that are more or less the same, etc. And there's a loss of type safety -- we should be scared every time we use void *.

Here's what I'd suggest: Have an underlying function that all 3 functions call. It could look like this:

static int
channel_write_cell_generic_(channel_t *chan, const char *celltype, void *cellptr, cell_queue_entry_t *q)
{
   tor_assert...
   if (CHANNEL_IS_CLOSING(chan)) { 
     log...
     return;
   }
   log...
   channel_write_cell_queue_entry(chan, q);  
}

void
channel_write_var_cell(channel_t *chan, var_cell_t *var_cell)
{
   cell_queue_entry_t q;
   q.type = CELL_QUEUE_VAR,
   q.u.var_cell = var_cell;
   channel_write_cell_generic(chan, "var_cell_t", var_cell, &q);
}
...

comment:16 Changed 11 months ago by pingl

Ok, that makes sense. Just fix that and I'll commit the new version.

Last edited 11 months ago by pingl (previous) (diff)

Changed 11 months ago by pingl

Attachment: channel.h.patch added

channel.h patch

Changed 11 months ago by pingl

Attachment: channel.c.patch added

channel.c patch

comment:17 Changed 11 months ago by pingl

Obviously relay.c and test_channel.c don't need any patch now.

comment:18 Changed 11 months ago by nickm

Status: needs_revisionneeds_review

comment:19 Changed 11 months ago by nickm

Keywords: review-group-9 removed
Reviewer: dgouletdgoulet, nickm
Status: needs_reviewmerge_ready

Nice! This can go in once 0.3.0 is forked.

comment:20 Changed 10 months ago by nickm

Resolution: implemented
Status: merge_readyclosed

Merged!

Note: See TracTickets for help on using tickets.