Opened 11 months ago

Last modified 10 months ago

#28684 new defect

Make sbws easy to understand and maintain

Reported by: juga Owned by:
Priority: Medium Milestone: sbws: unspecified
Component: Core Tor/sbws Version:
Severity: Normal Keywords:
Cc: juga, teor, irl Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

There was not an initial design and we are still adding functionality. Some of the new functionality require modifying several parts of the code and it's becoming harder to maintain and understand.
This ticket is to identify the changes needed (refactors, etc.).

Child Tickets

TicketStatusOwnerSummaryComponent
#27362new(sub-)packages outside of core (cli) should not need to know about confs and argsCore Tor/sbws
#28282assignedjugaRefactor bandwidth file generation codeCore Tor/sbws
#28718newSimplify configuration optionsCore Tor/sbws
#28737closedRedesign sbws torrc option configurationCore Tor/sbws
#29046closedjugaRemove unused testnetsCore Tor/sbws
#29047assignedjugaImprove code style following PEP8 and PEP257Core Tor/sbws
#29048assignedjugaRemove unused codeCore Tor/sbws
#29057assignedjugaAdapt bandwidth file classes to be compatible with stem (descriptors, etc) documentsCore Tor/sbws
#29717newRefactor Relay and RelayList to be able to initialize them without Tor's controllerCore Tor/sbws
#29718newInclude a refactor planCore Tor/sbws
#29721newRefactor RelayListCore Tor/sbws
#29726newRename constants, variables, classes, methods, functionsCore Tor/sbws
#29755newConsider to rename the Bandwidth file KeyValuesCore Tor/sbws

Change History (19)

comment:2 Changed 11 months ago by teor

These are some good ideas. But they might take a long time to do.

Do you want to do them after sbws 1.1?

If we just want to change variable names, we can do the changes in any version.
If we want to change the keys in the data file, we should write code that reads the old keys, or do the changes in sbws 2.0.
If we want to change the keys in the bandwidth file, we could do the changes in sbws 2.0 and bandwidth file format 2.0.

comment:3 Changed 11 months ago by teor

But no-one is parsing the bandwidth file yet, so if we want to make major changes, now would be a good time.

comment:4 in reply to:  3 ; Changed 11 months ago by juga

Replying to teor:

But no-one is parsing the bandwidth file yet, so if we want to make major changes, now would be a good time.

We can create a child ticket for changing the keys in the bandwidth file, do it after 1.0 and upgrade the other milestones to 2.X.

comment:5 in reply to:  4 ; Changed 11 months ago by teor

Cc: irl added

Replying to juga:

Replying to teor:

But no-one is parsing the bandwidth file yet, so if we want to make major changes, now would be a good time.

We can create a child ticket for changing the keys in the bandwidth file, do it after 1.0 and upgrade the other milestones to 2.X.

We could change the keys in the bandwidth file, or we could leave them as they are, and document the abbreviations.
I am not sure if changing the keys is worth it. We would also have to change the spec.
Edit: and fix any bugs that we create in the change

juga, do you think that changing the keys will make sbws a lot easier to maintain?
If it won't, or if it will take a lot of time, we should just focus on fixing bugs and deploying sbws.

irl, how much work have you done on parsing the bandwidth file?
Is now a good time to change some of the existing keys?
Edit: complete question

Last edited 11 months ago by teor (previous) (diff)

comment:6 Changed 11 months ago by irl

I have not yet done any work on parsing the bandwidth file, and it's unlikely to happen this week.

For any files that are archived we should have parsers, so that means tjr's archive of torflow files and then whatever sbws files look like when we start archiving them.

comment:7 Changed 11 months ago by irl

If the file format is changing, and refactoring things is something you have time for, adding a bandwidth list formatter and parser to stem would allow it to be used in sbws and then also in bushel. There won't be any inter-op testing so we're more likely to miss bugs in spec compliance but there would be reduced code maintenance.

comment:8 in reply to:  5 Changed 11 months ago by juga

Replying to teor:

Replying to juga:

Replying to teor:

But no-one is parsing the bandwidth file yet, so if we want to make major changes, now would be a good time.

We can create a child ticket for changing the keys in the bandwidth file, do it after 1.0 and upgrade the other milestones to 2.X.

We could change the keys in the bandwidth file, or we could leave them as they are, and document the abbreviations.
I am not sure if changing the keys is worth it. We would also have to change the spec.
Edit: and fix any bugs that we create in the change

juga, do you think that changing the keys will make sbws a lot easier to maintain?
If it won't, or if it will take a lot of time, we should just focus on fixing bugs and deploying sbws.

I agree that just changing the name of the keys doesn't affect much on the maintainability of sbws and i agree will take some time, however as you said, if we don't do it now we'll have those keys for longer time. I don't know...

comment:9 in reply to:  7 Changed 11 months ago by juga

Replying to irl:

If the file format is changing, and refactoring things is something you have time for, adding a bandwidth list formatter and parser to stem would allow it to be used in sbws and then also in bushel. There won't be any inter-op testing so we're more likely to miss bugs in spec compliance but there would be reduced code maintenance.

This sounds a good idea. I think that implementing the bandwidth file parsing in stem is just mostly copy-pasting what we implemented in sbws (without the scaling part and parsing results, it's not too complex nor too much code0.
So, maybe instead of renaming the keys, as soon as there's time (in 1/2 weeks), i could try this. Later we can rename the keys there.

comment:10 in reply to:  6 Changed 11 months ago by teor

Replying to irl:

I have not yet done any work on parsing the bandwidth file, and it's unlikely to happen this week.

For any files that are archived we should have parsers, so that means tjr's archive of torflow files and then whatever sbws files look like when we start archiving them.

Ok, if we've already archived some torflow and sbws flies, then we shouldn't change the keys.

comment:11 Changed 11 months ago by irl

We can't just change the keys if we have old files that we'd want to parse. It would have to be a new version of the format and we'd need at least parsers (if not also formatters) for both.

When I say "archvied" I mean files that are eventually going to end up in CollecTor. Currently this is only tjr's files unless someone is archiving sbws bandwidth lists and can convince me that they are useful to archive (I am not aware of anyone doing this).

comment:12 in reply to:  11 Changed 11 months ago by teor

Replying to irl:

We can't just change the keys if we have old files that we'd want to parse. It would have to be a new version of the format and we'd need at least parsers (if not also formatters) for both.

When I say "archvied" I mean files that are eventually going to end up in CollecTor. Currently this is only tjr's files unless someone is archiving sbws bandwidth lists and can convince me that they are useful to archive (I am not aware of anyone doing this).

How is tjr archiving bandwidth files?
From which authorities?

If tjr is archiving bandwidth files from longclaw, then those files are generated by sbws.

comment:14 Changed 11 months ago by teor

It is not sbws. sbws has headers.

comment:15 Changed 10 months ago by juga

Now that sbws milestone 1.0 is mostly done, i could work on this before starting with 1.1 milestone.
Should i then create a ticket to implement this in stem?.

irl, which should be the output of the parser?, json?.

If json, would it be something like?.

{
  "timestamp": int,
  # other keyvalues in header
  # bandwidth lines
  {
    # one bandwidth line
    {
      "bw": int,
      "nodeid": string,
      # other line's keyvalues
    }
  }
}

comment:16 Changed 10 months ago by irl

The output of the parser for stem should be a Python object. Take a look at the ServerDescriptor class for an example:

https://stem.torproject.org/api/descriptor/server_descriptor.html#stem.descriptor.server_descriptor.RelayDescriptor

comment:17 in reply to:  16 Changed 10 months ago by juga

Replying to irl:

The output of the parser for stem should be a Python object. Take a look at the ServerDescriptor class for an example:

https://stem.torproject.org/api/descriptor/server_descriptor.html#stem.descriptor.server_descriptor.RelayDescriptor

Oh, then it's even easier, since it's already an object (https://github.com/torproject/sbws/blob/master/sbws/lib/v3bwfile.py#L511).
The things to do then:

  1. Maybe solve first #28282 to eliminate the aggregation logic from the classes
  2. To do not depend on a stem version that is not released, duplicate the code both in sbws and stem, until there's a new stem release

comment:18 Changed 10 months ago by irl

It would be a good idea to speak to atagar about what he would want to see for the parser/formatter. Ideally it would work in the same way as the existing ones which might need some functions renamed or with different arguments. Those bugs should be new tickets too, in the stem component.

comment:19 Changed 10 months ago by juga

i've created #29056 for this, we can now discuss this there.

Note: See TracTickets for help on using tickets.