Cleanup of pyptlib internals

added component::circumvention/pluggable transport owner::asn priority::medium pyptlib resolution::fixed status::closed type::defect labels

Discussed this with Ximin on IRC.

First part of review:

Change the fromEnv() business to happen in __init__() or in init().
Update the sphinx docs with the new API.
Some tests for the new API. It would be awesome to have tests for closures like validate_transports() too.
validate_sever_bindaddr -> validate_server_bindaddr
_declareSupports expects a list according to its docstring: type(transports) should never be str. I like keeping the dynamic type madness to the minimum :)
You really don't want to do self.reportMethodError(t, 'unsupported transport'). We are supposed to ignore transports we don't recognize. If you send a single method error, the whole transport proxy will get destroyed by Tor.

Dealt with the above plus updated docs and added more tests.

Compare link above should show the new commits.

Only "tests for closures like validate_transports() too" is not done. I am effectively testing them, as part of the test for the overall fromEnv parser, but I can pull them out into their own tests if you want.

Comments after first code review:

Why is fromEnv() a classmethod that returns an instance of the class? Why don't you do that stuff in the __init__ of the class? I'd prefer if it was so, if there is no real reason for fromEnv to be a classmethod.
env_id is unused.
Why do you call the function getServedTransports? Why Served? Same with getServedBindAddresses. Served by whom and where? I think I would prefer getTransports or something.
Can you make it clearer that getServedBindAddresses returns a dict? Also, how are people supposed to get the address of the ExtendedORPort, or the state location with the new API? Can this be mentioned in the sphinx docs?
You didn't update the docs for the reportMethodSuccess API change.
In EnvError, is the following correct? Doesn't that return a boolean?

    def __str__(self):
        return self.message or self.cause.message

Please do the :raises: doc dance in exception-raising places like CheckClientMode and env_has_k.
Have you used pylint (or any other static analyzer) on the new code? Did it find anything fix worthy (unused vars, unused imports, etc.)

Apart from the above, we need to write patches for obfsproxy and sshproxy, and then we are ready to merge :)

Replying to asn:

Comments after first code review:

Why is fromEnv() a classmethod that returns an instance of the class? Why don't you do that stuff in the __init__ of the class? I'd prefer if it was so, if there is no real reason for fromEnv to be a classmethod.

The abstract reason is to separate parsing vs initialisation which is good software engineering practise. This is a common idiom, I am not doing anything weird here.

A concrete reason is to provide an entry point that bypasses all the heavyweight env-var parsing. This allows us to create an empty default config in a simple way, rather than having to worry about reverse-engineering the correct env-var values that, when parsed, turn into an empty default config. This shortcut is actually done in test_core.py, as well as any potential future code that might want to initialise a Config() without going through env-vars.

If you don't consider this to be a "real reason" - why not? Having a separate init is only about 20 more lines total; if I remove this then I would have to add even more lines to the tests, including duplicating a lot of the code in test_core.py into server- and client-specific versions. I'll make this change if you insist, but I maintain that it adds complexity even if it looks shorter.

env_id is unused.

Done, removed.

Why do you call the function getServedTransports? Why Served? Same with getServedBindAddresses. Served by whom and where? I think I would prefer getTransports or something.

This is (to quote the new docstring) "transports that this plugin can serve", as opposed to transports requested by Tor in the TOR_PT_*_TRANSPORTS environment variable (which is a superset of it). I can make the name change if you still prefer it, though.

Can you make it clearer that getServedBindAddresses returns a dict? Also, how are people supposed to get the address of the ExtendedORPort, or the state location with the new API? Can this be mentioned in the sphinx docs?

Done former, already did latter.

You didn't update the docs for the reportMethodSuccess API change.

Done.

In EnvError, is the following correct? Doesn't that return a boolean? {{{ def str(self): return self.message or self.cause.message }}}

a or b is short for a if bool(a) else b

Please do the :raises: doc dance in exception-raising places like CheckClientMode and env_has_k.

Done.

Have you used pylint (or any other static analyzer) on the new code? Did it find anything fix worthy (unused vars, unused imports, etc.)

Done.

Apart from the above, we need to write patches for obfsproxy and sshproxy, and then we are ready to merge :) I'll prepare those patches, but they should only be merged after this is already deployed.

Replying to asn:

Apart from the above, we need to write patches for obfsproxy and sshproxy, and then we are ready to merge :)

https://github.com/infinity0/sshproxy/compare/pyptlib-api https://github.com/infinity0/obfsproxy/compare/pyptlib-api

Replying to asn:

Why do you call the function getServedTransports? Why Served? Same with getServedBindAddresses. Served by whom and where? I think I would prefer getTransports or something.

OK, I've done the rename and also removed the old API methods that expose the "transports that Tor wants", on the basis that users are unlikely to want this information independently of the also-existing "transports that we support (and that Tor wants)".

Replying to infinity0:

Replying to asn:

Apart from the above, we need to write patches for obfsproxy and sshproxy, and then we are ready to merge :)

https://github.com/infinity0/sshproxy/compare/pyptlib-api https://github.com/infinity0/obfsproxy/compare/pyptlib-api

Hm. The obfsproxy modifications are not complete. There are still remnants of managed_info in managed/server.py and I get NameErrorL global name 'managed_info' when I try to run it.

BTW, can you please test the new obfsproxy with the new pyptlib?

Oh, I just noticed that the tests in test/test_server.py and in test/test_client.py were not rewritten for the new API, and are still using the deprecated functions! Naughty!

We definitely need to rewrite (and enrich) these tests to use the new API before deploying this branch. Even though those tests suck, they test that pyptlib throws the proper exceptions, and returns correct results -- should be good to verify that for the new API.

If you don't have time to do it, I will do it before merging.

Thanks!

Replying to asn:

Oh, I just noticed that the tests in test/test_server.py and in test/test_client.py were not rewritten for the new API, and are still using the deprecated functions! Naughty!

We definitely need to rewrite (and enrich) these tests to use the new API before deploying this branch. Even though those tests suck, they test that pyptlib throws the proper exceptions, and returns correct results -- should be good to verify that for the new API.

If you don't have time to do it, I will do it before merging.

Thanks!

Ugh. Scratch that! The tests were updated fine. I was looking (and barking) up the wrong (git) tree...

Merged! Thank you very much!

Trac:
Resolution: N/A to fixed
Status: new to closed

closed

moved to tpo/anti-censorship/pluggable-transports/trac#9485 (closed)

Cleanup of pyptlib internals

Child items ...

Activity