Opened 11 months ago

Last modified 5 months ago

#32534 needs_review defect

settle on one canonical jtorctl

Reported by: eighthave Owned by: tbb-team
Priority: High Milestone:
Component: Applications/Tor Browser Version:
Severity: Normal Keywords: Android, tbb-mobile, TorBrowserTeam202004R, gitlab-tb-jtorctl
Cc: n8fr8, gk, akwizgran, sisbell Actual Points:
Parent ID: Points:
Reviewer: sysrqb Sponsor:

Description

@n8fr8 recently switched Orbot to use the Briar jtorctl. So if Tor Browser and Thali also use that, then we have one canonical jtorctl again. With the new native Android TorService, it'll work best if jtorctl is embedded in the service, providing an API for consumers.

Child Tickets

TicketStatusOwnerSummaryComponent
#32574closedhiroset up mirror repo on GitHub torproject/jtorctlInternal Services/Service - github tpo

Change History (27)

comment:1 Changed 11 months ago by sysrqb

Keywords: TorBrowserTeam202001 added

This can come out of the discussions we have around which libraries we use for Tor Browser on Android.

comment:2 Changed 11 months ago by eighthave

@sisbell mention that Tor Browser uses net.freehaven.tor.control:jtorctl:0.2, then makes a custom subclass of TorControlConnection. Here's the tricky part: the TorService needs a way to reliably broadcast when Tor is started, meaning apps can use it to make network connections. The only way I've seen to do that is using the ControlPort via a TorControlConnection instance. That means that TorService needs to use its own TorControlConnection instance. jtorctl only supports a single TorControlConnection instance. Then all the bits that the Android apps need come via Android methods, like broadcast Intents.

From what I've seen in Orbot, this is doable. I haven't seen Tor Browser's custom TorControlConnection. But this would be doable if Tor Browser's custom TorControlConnection was upstreamed into a new jtorctl release. As far as I've seen, jtorctl should be the Java representation of all the commands available on the ControlPort. So making jtorctl canonical would be the way to implement that.

comment:3 in reply to:  2 Changed 11 months ago by gk

Replying to eighthave:

jtorctl should be the Java representation of all the commands available on the ControlPort. So making jtorctl canonical would be the way to implement that.

+1

comment:4 Changed 11 months ago by sisbell

I assume we would be taking this as the canonical project

https://github.com/guardianproject/jtorctl

If so I can submit a PR for some minor changes. The extensions I made are simple things like:

    public void takeownership() throws IOException {
        sendAndWaitForResponse("TAKEOWNERSHIP\r\n", null);
    }

    public void resetOwningControllerProcess() throws IOException {
        resetConf(Collections.singleton("__OwningControllerProcess"));
    }

    public void reloadConf() throws IOException {
        signal("HUP");
    }

comment:5 Changed 11 months ago by eighthave

@n8fr8 just switched Orbot to Briar's fork, which I think already includes that code snippet. It is implementation 'org.briarproject:jtorctl:0.3' which I believe comes from https://github.com/akwizgran/jtorctl. So https://github.com/guardianproject/jtorctl should be considered a defunct dev fork, as far as I can tell. How about reviving https://git.torproject.org/jtorctl.git as the canonical repo?

Then I was thinking it could make sense to have jtorctl versions match the tor versions in terms of compatibility. For example, if tor 0.4.3.0 adds the a new SUPERFOO command, then the jtorctl version that implements it would be called org.torproject:jtorctl:0.4.3.0

Last edited 11 months ago by eighthave (previous) (diff)

comment:6 Changed 11 months ago by eighthave

I guess we should also have an automatic naming convention for translating ControlPort command names to Java method names. The existing standard is to use camelCase for ControlPort commands with an _ in them, and when there are clear words. Then camelCase for methods that are not just direct calls for ControlPort commands. For example:

  • takeOwnership() -> TAKEOWNERSHIP
  • setEvents() -> SETEVENTS
  • addOnion() -> ADD_ONION
  • dropGuards() -> DROPGUARDS
  • resetConf() -> RESETCONF
  • hsFetch() -> HSFETCH
  • hsForget() -> HSFORGET
  • sendAndWaitForResponse()

So @sisbell's proposal looks good to me, except takeownership() should be takeOwnership().

I can put together this update for the things that I'm proposing, as long as Tor Project is willing to use it and make the releases to Maven Central.

comment:7 Changed 11 months ago by gk

eighthave: nobody responded so far, and I am sorry about that. +1 to what you said in comment:5 and comment:6. Please move forward with that. We'll meanwhile think about how to integrate the jtorctl release builds and Maven Central pushes into our build setup.

Last edited 11 months ago by gk (previous) (diff)

comment:8 Changed 11 months ago by eighthave

I'm digging into this now, one thing that surprised me is that the constants in TorControlCommands seem to be entirely unused. The only thing that I could find in that file are the constants added by Briar for parsing the onion info, which does not come from Tor:

    public static final String HS_ADDRESS = "onionAddress";
    public static final String HS_PRIVKEY = "onionPrivKey";

comment:9 Changed 11 months ago by sisbell

Cc: sisbell added

comment:10 in reply to:  6 Changed 11 months ago by sisbell

Replying to eighthave:

I guess we should also have an automatic naming convention for translating ControlPort command names to Java method names. The existing standard is to use camelCase for ControlPort commands with an _ in them, and when there are clear words. Then camelCase for methods that are not just direct calls for ControlPort commands. For example:

  • takeOwnership() -> TAKEOWNERSHIP
  • setEvents() -> SETEVENTS
  • addOnion() -> ADD_ONION
  • dropGuards() -> DROPGUARDS
  • resetConf() -> RESETCONF
  • hsFetch() -> HSFETCH
  • hsForget() -> HSFORGET
  • sendAndWaitForResponse()

So @sisbell's proposal looks good to me, except takeownership() should be takeOwnership().

I can put together this update for the things that I'm proposing, as long as Tor Project is willing to use it and make the releases to Maven Central.

This makes sense on the naming conventions. There are a lot of possible commands. We can probably just start with those we need for Orbot/tor-android-service. Any additional ones from the community can be opened as tickets. If we have the ability to control the code. I'd suggest a base classes which just the connection stuff and then drop all the tor specific commands to a subclass, just to make the code a little more readable.

comment:11 Changed 11 months ago by eighthave

The list of public commands isn't so long, and they're mostly implemented. I was thinking we should have them all implemented so that the library only needs to change if the commands change. That said, the list of conf and info options is long, but those are reachable using GETINFO/GETCONF/etc/, so I don't think we need more of an API there.

So something like this should probably stay out of the library:

    public void resetOwningControllerProcess() throws IOException {
        resetConf(Collections.singleton("__OwningControllerProcess"));
    }

Here's the public list of commands:

  • ADD_ONION
  • ATTACHSTREAM
  • AUTHCHALLENGE
  • AUTHENTICATE
  • CLOSECIRCUIT
  • CLOSESTREAM
  • DEL_ONION
  • DROPGUARDS
  • DROPOWNERSHIP
  • EXTENDCIRCUIT
  • GETCONF
  • GETINFO
  • HSFETCH
  • HSPOST
  • LOADCONF
  • MAPADDRESS
  • POSTDESCRIPTOR
  • PROTOCOLINFO
  • REDIRECTSTREAM
  • RESETCONF
  • RESOLVE
  • SAVECONF
  • SETCIRCUITPURPOSE
  • SETCONF
  • SETEVENTS
  • SETROUTERPURPOSE
  • SIGNAL
  • TAKEOWNERSHIP
  • USEFEATURE
Last edited 11 months ago by eighthave (previous) (diff)

comment:12 Changed 11 months ago by sysrqb

This all sounds great to me! We probably won't have time for working on this within the next month, but I think we can pick this up at the beginning of the year. In the mean time, we can begin thinking about the API that should be exposed for this (mentioned in comment:2). That discussion is probably best had on the tbb-dev@ mailing list.

comment:13 Changed 11 months ago by akwizgran

All sounds good to me too. Thanks for moving this forward!

By the way, the HSFORGET command was implemented in a Tor patch that we no longer use, so that method could be dropped from jtorctl.

comment:14 Changed 11 months ago by eighthave

Ok, my merge+modernize work is pretty much done. I'll be using this to complete TorService. Here's my fork:

It includes:

Still needs:

  • a complete test suite, look at stem library for examples
  • some crasher bugs in some events, for example EVENT_HS_DESC_CONTENT
  • util/helper methods for parsing the data structures from events

comment:15 Changed 11 months ago by eighthave

For future work: there are tools for generating Java parsers based on the ABNF (Augmented Backus-Naur Form) grammar that the control-spec uses. Since it seems there are crasher bugs based on parsing events, it seems work trying one of these generated parsers using the control-spec. It claims to be RFC2234 ABNF, which might not be the same as the newer RFC5234. Here are two parser generators that I found that would be worth trying:

comment:16 Changed 11 months ago by sisbell

Overall looks good. A few clarifications, questions.

  1. From an API perspective, the distinction between rawEventListeners and the EventHandlers isn't obvious. Why is one called handler and the other is called listener? What is the different? Should we use consistent terms?
  2. Better to use ArrayList.isEmpty rather than ArrayList.size() == 0 (TorControlConnection:221)
  3. What is the package name we are going with (net.freehaven? or org.torproject?)
  4. We do have clear SignalType enums, Does it make sense to use these for type safety?

comment:17 in reply to:  15 Changed 11 months ago by sisbell

ANTLR would be the standard one for Java.  The performance isn't great for a lot of small, high-speed events due to the startup time of the parser (this may have improved since I was using it a couple of years ago). So that's one area of caution.

https://www.antlr.org/

Replying to eighthave:

For future work: there are tools for generating Java parsers based on the ABNF (Augmented Backus-Naur Form) grammar that the control-spec uses. Since it seems there are crasher bugs based on parsing events, it seems work trying one of these generated parsers using the control-spec. It claims to be RFC2234 ABNF, which might not be the same as the newer RFC5234. Here are two parser generators that I found that would be worth trying:

comment:18 in reply to:  16 Changed 11 months ago by eighthave

Replying to sisbell:

Overall looks good. A few clarifications, questions.

  1. From an API perspective, the distinction between rawEventListeners and the EventHandlers isn't obvious. Why is one called handler and the other is called listener? What is the different? Should we use consistent terms?

I think the EventHandler API should be deprecated since it only allows a single instance to be registered, and it introduces its own abstraction layer which needs to be separately maintained, and diverges from the control-spec docs. I chose the word "Listener" since as far as I know, that's the common name for the Java pattern where you can register multiple instances to listen for events.

  1. Better to use ArrayList.isEmpty rather than ArrayList.size() == 0 (TorControlConnection:221)

yes, thanks for that.

  1. What is the package name we are going with (net.freehaven? or org.torproject?)

I like the idea of switching to org.torproject to mark it as official, but I would be OK if people wanted to keep it as net.freehaven.

  1. We do have clear SignalType enums, Does it make sense to use these for type safety?

I'm not sure which enums you're referring to, but I agree it does make sense to sync any relevant data structures from Tor source code to jtorctl. I think jtorctl should reflect the Tor source code as directly as possible, to keep it maintainable and related to the dos.

https://www.antlr.org/

Sounds good to me, I've never worked with parser generators, so I'm just guessing here.

comment:19 Changed 10 months ago by eighthave

as a step towards making this officially org.torproject:jtorctl:0.4.2.5, I made a release of info.guardianproject:jtorctl:0.4 based on my tested fork of jtorctl, which in turn is based on the org.briarproject:jtorctl:0.3 release. We'll be using that in Orbot-mini, a new app based on Orbot. It builds reproducibly with either Maven or Gradle, and it is close to generating the same files with both Maven and Gradle. I also published a .buildinfo file used to rebuild the artifacts on Maven Central:
https://repo1.maven.org/maven2/info/guardianproject/jtorctl/0.4/

For any GitLab forks, it will build and deploy to that fork's Maven repostory, for example:
https://gitlab.com/eighthave/jtorctl/-/jobs/383905747

Here's the Maven repo overview:
https://gitlab.com/eighthave/jtorctl/-/packages

And this repo can be used by adding this to build.gradle:

allprojects {
    repositories {
        maven { url 'file:///usr/share/maven-repo' }
        google()
        jcenter()
        maven { url 'https://gitlab.com/api/v4/projects/15548368/packages/maven'}
    }
}

comment:20 Changed 9 months ago by sysrqb

Keywords: TorBrowserTeam202002 added; TorBrowserTeam202001 removed
Priority: MediumHigh

comment:21 Changed 9 months ago by akwizgran

Commenting here because the GitLab repo seems to have issues turned off.

It looks like there might be a potential NPE (not caused by your changes @eighthave) in TorControlConnection#readReply(). The return value of input.readLine() at line 147 is checked for null. But when reading data lines at line 170 we don't check for null. It seems to me that we could get an NPE at line 173 if the end of the stream is reached while reading a multi-line reply. But I thought I'd better check before opening an MR as I don't understand the code or the control protocol very well.

https://gitlab.com/eighthave/jtorctl/blob/master/src/net/freehaven/tor/control/TorControlConnection.java#L143

Last edited 9 months ago by akwizgran (previous) (diff)

comment:22 Changed 9 months ago by eighthave

Looks like you found something, seems like a null guard would be appropriate.

The code is not so complicated, but it has calcified, since everyone has been working off of quick forks and adding new logic in their own layers. By putting them all together and putting a little effort into switching to a unified project, we can have a solid jtorctl with a nice test suite. Plus we can have one jtorctl provides the whole Tor control API. Before this work, only an small subset of the API was implemented, and it included stuff that had been removed from Tor.

comment:23 Changed 8 months ago by sisbell

Keywords: TorBrowserTeam202002R added; TorBrowserTeam202002 removed
Status: newneeds_review

Added project to pull and build jtorctl

https://github.com/sisbell/tor-browser-build/commits/bug-32534

comment:24 Changed 8 months ago by pili

Keywords: TorBrowserTeam202003R added; TorBrowserTeam202002R removed

We are no longer in February moving reviews

comment:25 Changed 8 months ago by sysrqb

Reviewer: sysrqb

comment:26 Changed 7 months ago by pili

Keywords: TorBrowserTeam202004R added; TorBrowserTeam202003R removed

We are no longer in March

comment:27 Changed 5 months ago by sysrqb

Keywords: gitlab-tb-jtorctl added; jtorctl removed
Note: See TracTickets for help on using tickets.