Opened 8 months ago

Closed 5 months ago

#33428 closed enhancement (implemented)

Make chutney check for relay microdescriptors before verifying

Reported by: teor Owned by: teor
Priority: Medium Milestone:
Component: Core Tor/Chutney Version:
Severity: Normal Keywords: ipv6, prop311, outreachy-ipv6, network-team-roadmap-2020Q1
Cc: teor, anuradha1904, nickm, dgoulet Actual Points: 0.6
Parent ID: #33050 Points: 1
Reviewer: nickm Sponsor: Sponsor55-can

Description

In #33232, we make chutney check for relay and bridge descriptors, and relays in the consensus, before verifying. But we don't check for relay microdescriptors.

The easiest way to check for relay microdescriptors, is using each relay's ed25519 key. In #30642, we will add an ed25519-identity file to each relay's data directory. Chutney can read the contents of this file, and then search for it in cached-microdescriptors and cached-microdescriptors.new.

Child Tickets

Change History (31)

comment:1 Changed 8 months ago by anuradha1904

Hey, I am anuradha, an outreachy applicant, would like to work on the following issue. Can someone give me context on how to start with the following issue? Thank you.

comment:2 Changed 8 months ago by nickm

Sure! The context here is that Chutney (https://github.com/torproject/chutney) is a program that runs a small Tor network on your desktop. Part of its functionality is to see whether the network has successfully bootstrapped or not. In #33232 (specifically #33379), there's work to see whether relays have the information they need in order to use the network. But that work doesn't cover the kind of information called a "microdescriptor". Microdescriptors are more fully documented in dir-spec.txt in the tor specifications directory. As part of this ticket, you'd want to extend the work of #33379 to check for microdescriptors as well.

Teor, is there more that anuradha1904 should know here? Would it make sense to share a link to your existing #33379 work-in-progress, so they have something to base theirs on?

comment:3 in reply to:  description Changed 8 months ago by teor

Hi anuradha1904, this ticket is one of the stretch goals for the Tor IPv6 Outreachy project. You're welcome to start working on it, but we're not sure how difficult it will be. Please stay in touch, so we can help you out if you get stuck.

A lot of the work on the IPv6 project is in the tor binary, which is written in C. Do you know how to program in C ?

If you don't know how to program in C, that's ok. There is plenty of Python work to do in Chutney. I have added some more chutney tasks to the Outreachy project description, and the outreachy-ipv6 keyword.

I'm working on the branch for #33379 right now. Once it passes the tests, I'll post a link to the pull request here.

Nick, is reading a file containing the ed25519 key is a good design for this feature?

For the RSA key, we use tor --list-fingerprint. We could add a feature to tor so it accepts a key type argument:

  • tor --list-fingerprint rsa
  • tor --list-fingerprint ed25519

If we want to read a file, we could just read keys/ed25519_master_id_public_key, so we don't have to depend on #30642.

What do you think?

comment:4 Changed 8 months ago by teor

Here is my work-in-progress branch for #33379. It still has some bugs. It works for about half the chutney networks in tor's make test-network-all:

comment:5 Changed 8 months ago by anuradha1904

Hey teor, what do you think is appropriate issue for me to start from a beginner perspective, and I do know how to code in C but am more comfortable with python, so would like to start my work in python. Thank you.

comment:6 in reply to:  5 ; Changed 8 months ago by teor

Replying to anuradha1904:

Hey teor, what do you think is appropriate issue for me to start from a beginner perspective, and I do know how to code in C but am more comfortable with python, so would like to start my work in python. Thank you.

I have looked at all the current outreachy-ipv6 issues, and this is the best Python issue. Here are some hints to get you started:

We don't want to wait for the new ed25519-identity file in #30642, so here's how you can read the binary data from the key file, and encode it in base64:

$ chutney/tools/test-network.sh
# the last 32 bytes of the key file is the key data
# we want to remove the newline and the trailing "=" from the base64 output
$ cat chutney/net/nodes/000a/keys/ed25519_master_id_public_key | tail -c 32 | base64 | head -c -2
Hqp2E6Pi0BAWmip+2XEowl/CgjGesgfC0jNipORQ37A
$ grep "Hqp2E6Pi0BAWmip+2XEowl/CgjGesgfC0jNipORQ37A" chutney/net/nodes/*/cached-microdescs*
net/nodes/000a/cached-microdescs.new:id ed25519 Hqp2E6Pi0BAWmip+2XEowl/CgjGesgfC0jNipORQ37A
net/nodes/001a/cached-microdescs.new:id ed25519 Hqp2E6Pi0BAWmip+2XEowl/CgjGesgfC0jNipORQ37A
...

The key data is random, so the exact values will be different in every chutney network. You'll need to take the script cat chutney/net/nodes/000a/keys/ed25519_master_id_public_key | tail -c 32 | base64 | head -c -2, and translate it into Python.

You can start by writing a function called _setEd25519Id(), which should be called at the end of _genRouterKey():
https://github.com/torproject/chutney/blob/master/lib/chutney/TorNet.py#L741

The function should:

  • read keys/ed25519_master_id_public_key from datadir, if that file exists
  • extract the key data
  • convert it to base64
  • discard the trailing "="
  • set ed25519-id in the Environ to that value

You can check you have the right value, by printing it out, then searching for it in the cached-microdescs and cached-microdescs.new files in the datadir.

After you've written _setEd25519Id(), we can move on to the rest of this task.

I hope that's enough to get you started!

comment:7 Changed 8 months ago by anuradha1904

Sure, got the idea, will start with it. Thank you very much.

comment:8 Changed 8 months ago by teor

Parent ID: #33232#33050

comment:9 Changed 8 months ago by anuradha1904

For extracting the ed25519 public key, I am removing the first 32 byte from the binary and saving the remaining file like this:-
with open("/home/anuradha/chutney/net/nodes/000a/keys/ed25519_master_id_public_key", 'r', errors = 'ignore') as f:

f.seek(32)
rest = f.read()
print(rest)

I am currently adding the hardcoded path for testing, will chnage it in the original file.
When I am printing the rest of the binary file, i do get the key i need, however when I try to convert it to base64, i get the error "ValueError: string argument should contain only ASCII characters", i tried searching but could not get the satified answer. How should i go with the problem? Kindly guide me.

comment:10 Changed 8 months ago by teor

Hi anuradha1904,

It's hard to give advice without knowing the details of what you are doing.

Maybe you're trying to decode rather than encode?
Maybe you're passing a string, but should be passing bytes?

Please show us the exact code you are running, the command you are using to run the code, your python version, and the exact output you're getting.

comment:11 Changed 8 months ago by anuradha1904

This is the code that I wrote:

with open("/home/anuradha/chutney/net/nodes/000a/keys/ed25519_master_id_public_key", 'r', errors = 'ignore') as f:
   f.seek(32)
   rest = f.read()
   print(rest)
   EncodedZip = base64.b64encode(rest)
   print(EncodedZip)

I am able to print the key that is required but I am not able to perform any encoding on it, it gives the error "ValueError: string argument should contain only ASCII characters".

Edit: fix code formatting

Last edited 8 months ago by teor (previous) (diff)

comment:12 in reply to:  11 Changed 8 months ago by teor

Replying to anuradha1904:

This is the code that I wrote:

with open("/home/anuradha/chutney/net/nodes/000a/keys/ed25519_master_id_public_key", 'r', errors = 'ignore') as f:
   f.seek(32)
   rest = f.read()
   print(rest)
   EncodedZip = base64.b64encode(rest)
   print(EncodedZip)

I am able to print the key that is required but I am not able to perform any encoding on it, it gives the error "ValueError: string argument should contain only ASCII characters".

When I search for "python b64encode", I see:
https://duckduckgo.com/html?q=python%20b64encode

Here is part of the answer: b64encode expects 'bytes', not 'string':
https://stackoverflow.com/questions/8908287/why-do-i-need-b-to-encode-a-string-with-base64/41437531#41437531

Here is an explanation of the difference between 'bytes' and 'string':
https://stackoverflow.com/questions/6224052/what-is-the-difference-between-a-string-and-a-byte-string

Here is the base64.b64encode reference documentation:
https://docs.python.org/3.5/library/base64.html#base64.b64encode

You didn't say if you were running python 2 or python 3. But if you're getting that error, you're probably running python 3.

Here is some code that should work for python 3:

with open("/home/anuradha/chutney/net/nodes/000a/keys/ed25519_master_id_public_key", 'rb') as f:
   ...

You might also want to delete this line, because it could cause errors:

   print(rest)
Last edited 8 months ago by teor (previous) (diff)

comment:13 in reply to:  6 ; Changed 7 months ago by anuradha1904

Replying to teor: Hey teor, all thanks to you, I am done with majority of the task of the function, can you please guide me "set ed25519-id in the Environ to that value" step. Thank you.

Replying to anuradha1904:

Hey teor, what do you think is appropriate issue for me to start from a beginner perspective, and I do know how to code in C but am more comfortable with python, so would like to start my work in python. Thank you.

I have looked at all the current outreachy-ipv6 issues, and this is the best Python issue. Here are some hints to get you started:

We don't want to wait for the new ed25519-identity file in #30642, so here's how you can read the binary data from the key file, and encode it in base64:

$ chutney/tools/test-network.sh
# the last 32 bytes of the key file is the key data
# we want to remove the newline and the trailing "=" from the base64 output
$ cat chutney/net/nodes/000a/keys/ed25519_master_id_public_key | tail -c 32 | base64 | head -c -2
Hqp2E6Pi0BAWmip+2XEowl/CgjGesgfC0jNipORQ37A
$ grep "Hqp2E6Pi0BAWmip+2XEowl/CgjGesgfC0jNipORQ37A" chutney/net/nodes/*/cached-microdescs*
net/nodes/000a/cached-microdescs.new:id ed25519 Hqp2E6Pi0BAWmip+2XEowl/CgjGesgfC0jNipORQ37A
net/nodes/001a/cached-microdescs.new:id ed25519 Hqp2E6Pi0BAWmip+2XEowl/CgjGesgfC0jNipORQ37A
...

The key data is random, so the exact values will be different in every chutney network. You'll need to take the script cat chutney/net/nodes/000a/keys/ed25519_master_id_public_key | tail -c 32 | base64 | head -c -2, and translate it into Python.

You can start by writing a function called _setEd25519Id(), which should be called at the end of _genRouterKey():
https://github.com/torproject/chutney/blob/master/lib/chutney/TorNet.py#L741

The function should:

  • read keys/ed25519_master_id_public_key from datadir, if that file exists
  • extract the key data
  • convert it to base64
  • discard the trailing "="
  • set ed25519-id in the Environ to that value

You can check you have the right value, by printing it out, then searching for it in the cached-microdescs and cached-microdescs.new files in the datadir.

After you've written _setEd25519Id(), we can move on to the rest of this task.

I hope that's enough to get you started!

comment:14 in reply to:  13 Changed 7 months ago by teor

Replying to anuradha1904:

Replying to teor: Hey teor, all thanks to you, I am done with majority of the task of the function, can you please guide me "set ed25519-id in the Environ to that value" step. Thank you.

Hi, I'd like to look at the code you've written so far. Lots of developers get reviews on incomplete code, particularly when they are working on a new project.

Please submit a pull request (PR) on the chutney GitHub, containing the code you have written so far:
https://github.com/torproject/chutney

When you submit your PR, GitHub will run chutney's continuous integration (CI). Let us know if CI fails, and we will help you fix it.

(Sometimes CI fails because of the server or other programs, not your code.)

Replying to anuradha1904:
You can start by writing a function called _setEd25519Id(), which should be called at the end of _genRouterKey():
https://github.com/torproject/chutney/blob/master/lib/chutney/TorNet.py#L741

The function should:

  • read keys/ed25519_master_id_public_key from datadir, if that file exists
  • extract the key data
  • convert it to base64
  • discard the trailing "="

Your PR should do these things.

  • set ed25519-id in the Environ to that value

You can set ed25519-id in a similar way to the code in _genRouterKey(), which sets fingerprint:

self._env['fingerprint'] = fingerprint

https://github.com/torproject/chutney/blob/9ffa2857fb4c6e75156159233263ac5898fc410c/lib/chutney/TorNet.py#L744

You can check you have the right value, by printing it out, then searching for it in the cached-microdescs and cached-microdescs.new files in the datadir.

Did you do this check?

After you've written _setEd25519Id(), we can move on to the rest of this task.

comment:15 Changed 7 months ago by anuradha1904

Hey teor, I have made the PR, kindly check https://github.com/torproject/chutney/pull/60

comment:16 Changed 7 months ago by anuradha1904

Cc: anuradha1904 added
Status: newneeds_review

comment:17 Changed 7 months ago by teor

Owner: set to anuradha1904
Status: needs_reviewassigned

Thanks for this pull request, I made some comments on GitHub.

My biggest question is:

Have you tested this function?
How do you know it works?

comment:18 Changed 7 months ago by teor

Status: assignedneeds_revision

comment:19 Changed 7 months ago by teor

Reviewer: teor
Status: needs_revisionneeds_review

Hi, the new PR is at:

It looks good, but I am waiting for CI to finish, so I know the code actually works.

In the meantime, I made a few code style comments. We try to use the same code style in different functions, so people can read our code easily.

comment:20 Changed 7 months ago by teor

Actual Points: 0.3
Status: needs_reviewneeds_revision

Hi, this is looking really good. We're almost there.

Just 2 more minor changes to go, and then we will be finished the first step.

I have opened #33675 for the second step of this task.

comment:21 Changed 7 months ago by teor

I have opened #33676 for the third step of this task.

comment:22 Changed 7 months ago by teor

Looks like Travis CI is having some issues. A job failed, I restarted it.

comment:23 Changed 7 months ago by teor

Oh, that's not a Tor pull request, it's on your repository. So I can't restart it.

comment:24 Changed 7 months ago by teor

Before you push new changes to your pull request, your chutney code should pass:

  • "make test-network-all" on tor master

If you get stuck, the getting started instructions have more details:

But be careful, some of the commands are different.

comment:25 Changed 7 months ago by gaba

Keywords: network-team-roadmap-2020Q1 added

Add all the tickets from sponsor 55 that are done and being worked on to the keyword #network-team-roadmap-2020Q1 so I can look at them in the wiki page...

comment:26 Changed 5 months ago by teor

Actual Points: 0.30.5
Owner: changed from anuradha1904 to teor
Status: needs_revisionassigned

I'm going to finish off this ticket, because we'd like to have this feature, and I might not be around to do mentoring in future.

comment:27 Changed 5 months ago by teor

Reviewer: teor
Status: assignedneeds_review

See my PR, which is based on anuradha1904's PR:

comment:28 Changed 5 months ago by teor

Cc: nickm dgoulet added

You might also want to check the CI on the underlying branch in:

(This code can be unstable due to timing issues. So if it passes both CI runs, then it's pretty good.)

comment:29 Changed 5 months ago by teor

Actual Points: 0.50.6

0.3.5 still needs a bit of time before verifying, so I reverted that part of the branch.

I originally thought these timing issues were due to microdescriptors. But there's clearly something more subtle that we fixed in 0.4.0 and later.

comment:30 Changed 5 months ago by asn

Reviewer: nickm

comment:31 Changed 5 months ago by nickm

Resolution: implemented
Status: needs_reviewclosed

This looks okay to me; I'll merge it to master.

Note: See TracTickets for help on using tickets.