Opened 2 months ago

Closed 2 months ago

#30125 closed enhancement (implemented)

Port server's log sanitization to client, broker, and proxy-go

Reported by: dcf
Owned by: cohosh
Priority: Medium
Milestone:
Component: Circumvention/Snowflake
Version:
Severity: Normal
Keywords:
Cc: dcf, arlolra, cohosh
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor: Sponsor19

Description

#21304 added a log sanitizer to the server (bridge) code that searches for IP addresses in logs and elides them. We noted in comment:17:ticket:21304 that the other components--client, broker, and proxy-go--can benefit from the same log sanitization.

comment:18:ticket:21304 suggests a way to do it: move the logScrubber code into a new top-level subdirectory and safelog package, and have the other programs import git.torproject.org/pluggable-transports/snowflake.git/safelog.

Change History (12)

comment:1 Changed 2 months ago by cohosh

Owner: set to cohosh
Sponsor: Sponsor19
Status: new → assigned

comment:2 Changed 2 months ago by cohosh

Status: assigned → needs_review

See https://github.com/cohosh/snowflake/compare/ticket30125

I put the safelog package inside the top-level directory common/ because I think we will have more factored out code later, specifically for the websocket pieces that the proxy-go instances and the clients will use to talk to the broker.

comment:3 Changed 2 months ago by dcf

Should it also be used in client?

comment:4 Changed 2 months ago by dcf

The refactoring looks good. I have a few ideas about deployment to save us some trouble later. My main goal is that there should be a clean break between the old unsanitized logs and the new sanitized logs, so that we don't later have to trawl through a log file and figure out where the change happened. This is because I'd like us to extract what we need from the old logs and then delete them.

For the bridge, those logs are being rotated and not saved long-term, so we don't need to do anything special.

For the broker, it will be something like this:

sv stop snowflake-broker
cd /var/log/snowflake-broker
tar cf unsanitized.tar *.s current.20190322.xz current
shred -n 1 -v -u *.s current.20190322.xz current
# install the new /usr/local/bin/broker
sv start snowflake-broker

For proxy-go, it will be similar, except that there are several /home/snowflake-proxy/*.log.d log directories. Also /home/snowflake-proxy/snowflake-proxy-*.log{,.xz} are unsanitized logs from before we started using runit log directories (happened in #28390).

For the client, we'll need a Tor Browser ticket to pick up the upgrade. A sample ticket and patch that can serve as a template is #26795. I know you are interested in the reproducible build and this would be a good introduction to rbm if you haven't used it yet. Basically, you just need to edit projects/snowflake/config and update git_hash, then run make testbuild to make sure it still builds, then open a ticket in the Applications/Tor Browser component.

comment:5 in reply to: 3 Changed 2 months ago by cohosh

Replying to dcf:

> Should it also be used in client?

Probably a good idea, yeah. Here's that addition: https://github.com/cohosh/snowflake/compare/ticket30125

comment:6 in reply to: 4 Changed 2 months ago by cohosh

Replying to dcf:

> The refactoring looks good. I have a few ideas about deployment to save us some trouble later. My main goal is that there should be a clean break between the old unsanitized logs and the new sanitized logs, so that we don't later have to trawl through a log file and figure out where the change happened. This is because I'd like us to extract what we need from the old logs and then delete them.

Thanks! This looks reasonable to me. Do you have something in mind for extracting useful data from the unsanitized logs? I suppose we could write a separate scrubber to sanitize them retroactively.

> For the bridge, those logs are being rotated and not saved long-term, so we don't need to do anything special.

> For the broker, it will be something like this:
> [...]
> For proxy-go, it will be similar, except that there are several /home/snowflake-proxy/*.log.d log directories. Also /home/snowflake-proxy/snowflake-proxy-*.log{,.xz} are unsanitized logs from before we started using runit log directories (happened in #28390).

I've noticed that there are a lot of old logs from different proxy-go instances. I'll set up the tarball to keep the directory structure, but I guess my question is the same as above about what we're planning on using these logs for.

> For the client, we'll need a Tor Browser ticket to pick up the upgrade. A sample ticket and patch that can serve as a template is #26795. I know you are interested in the reproducible build and this would be a good introduction to rbm if you haven't used it yet. Basically, you just need to edit projects/snowflake/config and update git_hash, then run make testbuild to make sure it still builds, then open a ticket in the Applications/Tor Browser component.

Cool! I also wanted to ask you about thoughts you have about when to make snowflake client releases. I'm assuming it's just whenever there are changes we think are important to have people start using. But I also don't want to overwhelm the applications team.

comment:7 in reply to: 6 Changed 2 months ago by dcf

Status: needs_review → merge_ready

Replying to cohosh:

Replying to dcf:

>> The refactoring looks good. I have a few ideas about deployment to save us some trouble later. My main goal is that there should be a clean break between the old unsanitized logs and the new sanitized logs, so that we don't later have to trawl through a log file and figure out where the change happened. This is because I'd like us to extract what we need from the old logs and then delete them.

> Thanks! This looks reasonable to me. Do you have something in mind for extracting useful data from the unsanitized logs? I suppose we could write a separate scrubber to sanitize them retroactively.

For myself, I just want to make graphs showing the number of client and proxy requests per second, like those from flash proxy.

I'm fine with keeping/publishing a scrubbed version of the old logs, or e.g. CSV files derived from them. I don't think we should keep the originals indefinitely. I'll add this topic to the agenda for the next check-in meeting.

>> For proxy-go, it will be similar, except that there are several /home/snowflake-proxy/*.log.d log directories. Also /home/snowflake-proxy/snowflake-proxy-*.log{,.xz} are unsanitized logs from before we started using runit log directories (happened in #28390).

> I've noticed that there are a lot of old logs from different proxy-go instances. I'll set up the tarball to keep the directory structure, but I guess my question is the same as above about what we're planning on using these logs for.

Yeah, the log structure changed in the past in order to allow compression and rotation, because we ran out of disk space using single files :/ (That was #28390.) For me personally, I don't have any use in mind for the old proxy-go logs and would be fine with just deleting them.

>> For the client, we'll need a Tor Browser ticket to pick up the upgrade. A sample ticket and patch that can serve as a template is #26795. I know you are interested in the reproducible build and this would be a good introduction to rbm if you haven't used it yet. Basically, you just need to edit projects/snowflake/config and update git_hash, then run make testbuild to make sure it still builds, then open a ticket in the Applications/Tor Browser component.

> Cool! I also wanted to ask you about thoughts you have about when to make snowflake client releases. I'm assuming it's just whenever there are changes we think are important to have people start using. But I also don't want to overwhelm the applications team.

Yes, so far it's whenever there's a change we want people to start using. Doing it once per alpha release is not too much. As long as you test that the build works across all platforms (that's what make testbuild does), it's not so much trouble for the Tor Browser devs--it's when something breaks the build and they have to start backing out changes that it's cumbersome. IMO it's justified in this case and also a good excuse to file a first Tor Browser ticket.

comment:8 Changed 2 months ago by cohosh

Merged to master and deployed.

I've done the log compression as recommended above.

  • For the broker, we now have unsanitized logs in /var/log/snowflake-broker/unsanitized.tar
  • For the snowflake server, unsanitized logs are in /var/log/tor/unsanitized.tar
  • For the proxy-go instances, we have the old unsanitized logs in /home/proxy-go/snowflake-proxy/unsanitized_old.tar and then in each of the my-instance.log.d subdirectories we have an unsanitized.tar. I'm going to move discussion about deleting these to another ticket, along with what to do now that we no longer need to restart these proxy-go instances.

comment:9 Changed 2 months ago by cohosh

Noting here that, as described in #30205, we have to restart the sv logs as well.

comment:10 Changed 2 months ago by cohosh

Just commenting that I rolled back the broker and proxy-go versions to deal with #30205 (this rollback was probably unnecessary). I will resume the update to the log-scrubbing versions tomorrow.

comment:11 Changed 2 months ago by cohosh

Okay, I undid the rollback, and as of 10:50 on 17 April 2019 this code is deployed for the server, the proxy-go instances, and the broker.

comment:12 Changed 2 months ago by cohosh

Resolution: implemented
Status: merge_ready → closed

> For the client, we'll need a Tor Browser ticket to pick up the upgrade. A sample ticket and patch that can serve as a template is #26795. I know you are interested in the reproducible build and this would be a good introduction to rbm if you haven't used it yet. Basically, you just need to edit projects/snowflake/config and update git_hash, then run make testbuild to make sure it still builds, then open a ticket in the Applications/Tor Browser component.

Opened #30241 and attached a patch. I had to add some additional lines to projects/snowflake/build because of some changes we made in factoring out libraries since the last update.
