Opened 6 years ago

Closed 5 years ago

#9975 closed enhancement (fixed)

use argparse rather than getopt

Reported by: infinity0 Owned by: infinity0
Priority: Low Milestone:
Component: Archived/Flashproxy Version:
Severity: Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Depends on #6810

We ought to move to argparse rather than hard-code our own. This would make it easier to inheritance options between modules, for example something like

# in flashproxy.util

def add_module_opts(parser):
  parser.add_argument("--unsafe_logging", etc)
  old_parse = parser.parse_args
  parser.parse_args = lambda **kwargs: _parse_module_opts(old_parse(**kwargs))

def _parse_module_opts(parsed_opts): # a Namespace object
  if parsed_opts.unsafe_logging:
    # do something
  return parsed_opts

Then in the main script:

def main(argv):
  parser = argparse.ArgumentParser()
  parser.add_argument(app-specific)
  flashproxy.util.add_module_opts(parser)
  ...
  opts = parser.parse_args()
  ...

Child Tickets

Change History (19)

comment:1 Changed 6 years ago by dcf

argparse can generate --help output automatically, right? For me that's the killer feature. (If we can coax it into writing part of our man pages too, that's double killer.)

I don't get the purpose of the _parse_module_opts trick in the description. What concretely happens in # do something? Couldn't you just call a function that just calls add_argument for each common option?

FYI, the class options in most of the programs isn't meant to stand for "command-line options"; it is really just a container for a lot of global variables (most of which happen to be manipulatable through command-line options). On first appraisal, I would like to continue using the command-line parser as a data source for filling in the options table, but I would not like to couple internal options so tightly to the command-line interface as to replace the global options table with a global Namespace object. On the other hand, maybe it doesn't matter much either way.

argparse is used by obfsproxy so we already figured out its bundle packaging issues.

comment:2 Changed 6 years ago by infinity0

In argparse, there are two stages - you configure your parser (add_argument), then you do the parsing (parse_args then process the Namespace object that comes out of it).

Some options really belong to utility modules - for example --unsafe-logging is only used by the logging code and safe_str. We can move this code into the common library, and also put the configuration there (add_module_opts), as well as processing of the Namespace object (_parse_module_opts).

It works essentially as you want - the internal options class (or whatever other representation we choose) could remain, and we'd just process the Namespace object to populate it. So for the # do something example, this might be flashproxy_util_options.unsafe_logging = parsed_opts.unsafe_logging.

comment:3 in reply to:  2 Changed 6 years ago by dcf

Replying to infinity0:

In argparse, there are two stages - you configure your parser (add_argument), then you do the parsing (parse_args then process the Namespace object that comes out of it).

Some options really belong to utility modules - for example --unsafe-logging is only used by the logging code and safe_str. We can move this code into the common library, and also put the configuration there (add_module_opts), as well as processing of the Namespace object (_parse_module_opts).

It works essentially as you want - the internal options class (or whatever other representation we choose) could remain, and we'd just process the Namespace object to populate it. So for the # do something example, this might be flashproxy_util_options.unsafe_logging = parsed_opts.unsafe_logging.

That sounds good. That --unsafe-logging is a good example.

comment:4 Changed 6 years ago by infinity0

Owner: changed from dcf to infinity0
Status: newassigned

comment:5 Changed 6 years ago by infinity0

Working on this right now, things seem promising:

$ git diff master --stat | tail -n1
 8 files changed, 241 insertions(+), 518 deletions(-)

Have done the reg methods so far, will do the client tomorrow.

https://github.com/infinity0/flashproxy/compare/master...bug9975

comment:6 Changed 6 years ago by infinity0

Code pretty much complete and pushed to the above link, I will write some additional docstrings next.

comment:7 Changed 6 years ago by atagar

Just spotted this ticket. For what it's worth both stem and arm use getopt using the following pattern...

https://gitweb.torproject.org/arm.git/blob/HEAD:/arm/starter.py#l101

Personally I like it quite a bit, but that said go with whatever you like best. :)

comment:8 in reply to:  7 Changed 6 years ago by dcf

Status: assignedneeds_review

Replying to atagar:

Just spotted this ticket. For what it's worth both stem and arm use getopt using the following pattern...

https://gitweb.torproject.org/arm.git/blob/HEAD:/arm/starter.py#l101

Personally I like it quite a bit, but that said go with whatever you like best. :)

Do you have a way to automatically generate --help output for the options? Automatic --help output is the big thing for me.

I'm no fan of argparse--I think it makes a lot of wrong design decisions. It's telling that the very first example in its documentation shows how to sum a list of numbers, instead of a realistic use case. This issue is particularly baffling. But I think argparse is okay as long as we stay away from the cleverer parts and use it as a straightforward option parser. flashproxy has a bunch of helper programs, with many common options shared between them, and it was starting to become a burden to update the parser and --help output in all of them at the same time.

comment:9 Changed 6 years ago by atagar

Do you have a way to automatically generate --help output for the options? Automatic --help output is the big thing for me.

I consider automatic help to be a minus rather than a plus (I want full control over the help output).

This issue is particularly baffling.

True, positional arguments is one thing it definitely does not handle well. Stem and arm don't need that, but if you do then you might need to fall back on checking the argv for those (ick).

comment:10 in reply to:  9 Changed 6 years ago by infinity0

Replying to atagar:

This issue is particularly baffling.

True, positional arguments is one thing it definitely does not handle well. Stem and arm don't need that, but if you do then you might need to fall back on checking the argv for those (ick).

If we really need this, it can be done with a separate optional vs positional parser. Call parse_known_args on the optional parser, then pass unknown_args to the positional parser. Not sure why this isn't mentioned as a workaround on that ticket. (Some additional fudging would be needed to make the help text work correctly though.)

Another thing which is a bit easier on argparse is taking additional options from a module, so that different programs can share the same options. This reduces quite a lot of code duplication. It is probably possible on getopt but the methods I can think of are all very messy.

comment:11 Changed 6 years ago by dcf

What's in https://github.com/infinity0/flashproxy/tree/bug9975 seems to have quite a lot of refactoring and other changes on top of a migration to argparse. I understand that argparse enables these changes, but could we have a smaller branch that only migrates to argparse? I mean even without splitting out common options. That's something I would love to merge right away.

When it comes to parsing common options in submodules, that's something I would like to think about a bit more.

comment:12 Changed 6 years ago by infinity0

I've re-done these series of commits here, now grouped by intent/topic rather than by file

https://github.com/infinity0/flashproxy/compare/master...bug9975

  • The first 5445680 is a minor deduplication that was better placed at the start.
  • The next two 95c4a88 c665081 do the actual migration to argparse.
  • The next few 913fa0c 116582c 817a00e 8b1c987 do refactorings within files, mostly making global variables more local in scope.
  • The final 31913a4 does a deduplication over all the files, mostly deleting common code.

comment:13 Changed 6 years ago by infinity0

Pushed another commit to fix a bug I introduced in flashproxy-reg-url.

comment:14 Changed 6 years ago by dcf

Thank you for rewriting your branch as you have done. There is good stuff here, though I still have reservations.

5445680b, 8b1c987a, and ea156f70 are great, go ahead.

95c4a889 and c6650817 are okay. I'm fine with those being merged any time. I hate that it will permanently break some command lines that currently work--I just know I'll be bitten by this fact in the future, and will curse myself for having allowed it. But I'm willing to try and ignore that feeling if you feel strongly for argparse.

I strongly disagree with the localization of global variables in 913fa0cc 116582c9 817a00e9. Options are global state. They should be represented by global variables. Is there a technical reason for localizing the variables, having to do with argparse? Is it to avoid copying values from ns to options?

I've read the code of add_module_opts many times and I still don't understand what it's doing. Is there any way to rewrite it so that it's more clear what is going on? I don't see why the code has do anything other than call parser.add_argument a bunch of times.

+    old_parse = parser.parse_args
+    def parse_args(namespace):
+        options.transport = namespace.transport
+        options.facilitator_pubkey = namespace.facilitator_pubkey
+        return namespace
+    parser.parse_args = lambda *a, **kw: parse_args(old_parse(*a, **kw))

I think you will agree that ignore_pubkey is ugly. It's contrary to the purpose of this branch, which is greater orthogonality and fewer special cases. Do you have a second or third proposed alternate design?

+def add_module_opts(parser, ignore_pubkey=False):
 ...
+    parser.add_argument("--facilitator-pubkey", metavar="FILENAME",
+        help=(_OPTION_IGNORED if ignore_pubkey else "encrypt registrations to "
+        "the given PEM-formatted public key file (default built-in)."))

I hate to be so negative. I've tried to get over my misgivings about this branch but I can't quite shake them.

comment:15 Changed 6 years ago by infinity0

There is no technical reason for localising the variables, so we can drop those if you feel strongly about it. However, it's generally good to restrict the scope of variables as much as possible. In this case, you could think of those not as "global options", but options that control the instantiation of the program.

If you had a function prog(a,b) that called other smaller functions, you wouldn't necessarily want all of those smaller functions to see a,b and be able to mutate them. edit: commonly, you want the smaller functions to see them but not mutate them, so you pass a,b into the smaller functions, similar to what I do in those commits:

def f(a): # do stuff with a
def g(b): # do stuff with b
def prog(a,b):
  f(a)
  g(b)
  # do more stuff

This means that f,g have fewer free variables, which makes them more reusable.

For add_module_opts: argparse only creates and populates a local namespace object (containing parsed options across all modules), and doesn't touch global state. So we need to copy the state into the module's global options object. Also, some of these options (e.g. address_family) are complex enough that argparse's own parse_args can only handle it incompletely; we need to finish off the job with extra logic. If we don't do all of this, then consumers of our module need to do it themselves, duplicating code.

I'll think about a better way for doing ignore_pubkey...

Last edited 6 years ago by infinity0 (previous) (diff)

comment:16 Changed 6 years ago by infinity0

Here's another try. I've rebased everything on top of current master, since it's been a long time since the original fork:

https://github.com/infinity0/flashproxy/compare/master...bug9975_rebase

The bottom 4 commits 2d1fd22d...d3dbe189 correspond to the ones you approved above. The next commit e6b07f1d is a minor fix on top. (In the previous branch, this fix was incorporated into a different commit.) I will straight-up merge these 5 commits to master in a few days, but I thought I'd give you some time in case you want to have a look at it again.

The next commit badb39f9 corresponds to 31913a47 in the previous branch, and is possibly controversial. IMO it's a significant refactoring gain - 177 ins, 338 del. I've avoided localising the global options variables in flashproxy-client, as you suggested. However, for the reg-methods, it turns out that this localising is redundant - the refactoring already pushes these into the flashproxy.{reg,keys,utils} modules. The end-result is that (in each reg-method program) we have a global options object initialised after cmdline-parsing, instead of a global options class initialised after program load. (I've also avoided pushing everything into main()s, to make the diff a bit shorter.)

I have a few other ways for doing ignore_pubkey. Part of the difficulty is that inherently the http reg-method is an exception - we're not doing an encrypted registration. Here are my preferences:

  1. Treat reg-http as an even more special anomaly, and just verbosely copy the other flag-parsing code from flashproxy.reg. This would make the shared code nicer, its interface nicer for future reg-method authors, but it would make reg-http look worse.
  2. Keep ignore_pubkey in the code, but instead of ignoring the flag, just disable it altogether. This makes things nicer for the user, but it means flashproxy-client can't unconditionally pass --facilitator-pubkey to the reg-methods.
  3. Take ignore_pubkey out of the code, and have *every other* reg-method have to explicitly parse --facilitator-pubkey.

Actually, I like (1) the more I think about it, so I will probably go and implement that. Perhaps I'll even keep the ignore-pubkey behaviour (inside reg-http only), since it lets flashproxy-client unconditionally pass the flag to any reg-method.

comment:17 Changed 5 years ago by infinity0

I've implemented (1) and will merge this soon, like in the next hour.

The localisation of some existing global variables will be necessary in order to do #11574 (e.g. options.transport and options.facilitatory_pubkey) but I will do them in future commits.

comment:18 Changed 5 years ago by infinity0

Status: needs_reviewaccepted

comment:19 Changed 5 years ago by infinity0

Resolution: fixed
Status: acceptedclosed
Note: See TracTickets for help on using tickets.