Opened 2 years ago

Closed 2 years ago

#22112 closed enhancement (implemented)

Replace torperf.csv with onionperf.csv

Reported by: karsten Owned by: metrics-team
Priority: Medium Milestone:
Component: Metrics/Website Version:
Severity: Normal Keywords:
Cc: Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

We recently switched from Torperf to OnionPerf for measuring Tor performance. But we're only using half of OnionPerf's measurements, namely those that exit the Tor network to the public internet and fetch files from ourselves. We should also include measurements made to the onion server that OnionPerf sets up and runs locally.

The torperf.csv data format does not support this second data set in a forward-compatible way, and it's about time to rewrite the code that is part of the legacy data-processing module that produces this file anyway.

I wrote an onionperf module that produces an onionperf.csv file quite similar to torperf.csv but with an additional column for public vs. onion server. I went ahead and pushed that code to master, but I marked the data format as beta, mentioning that it may change or disappear without prior notice.

Next steps are:

  • Get this code reviewed and possibly improved. Remove the beta warning, and add a deprecation notice to torperf.csv. Don't remove the code for producing torperf.csv just yet.
  • One month or so later, remove the code for updating torperf.csv and mention on stats.html that the file does not get updated anymore. Don't remove it just yet.
  • Another month or so later, remove the torperf.csv file and data format specification.

Child Tickets

Change History (6)

comment:1 Changed 2 years ago by karsten

Status: newneeds_review

Please review the patch linked from the description above.

comment:2 Changed 2 years ago by iwakeh

Status: needs_reviewneeds_revision

Committed to master means 'productive', but the current onionperf.csv links on metrics.tp.o lead to 404; is that intended?

init-onionperf.sql: The last "group by" on the second to the last line contains a 3, which seems to refer to source, but source in this select is set to ''. What is intended here?

Why are there columns in the measurements table that are not used later on (mostly the unrecognized key lines))? Shouldn't these be omitted?
Here it seems as if only 'endpointremote' is used to determine the content of column 'server'. If just that is needed it ought to be filled by the java code initially. But, I might have overlooked something?

In some other module we used constants for column-names in Java. As the db redesign will be future work, is this intentionally left to be improved then?

comment:3 in reply to:  2 ; Changed 2 years ago by karsten

Status: needs_revisionneeds_review

Replying to iwakeh:

Committed to master means 'productive', but the current onionperf.csv links on metrics.tp.o lead to 404; is that intended?

Ah, no, that's an oversight. Fixed! (Just not yet deployed.)

init-onionperf.sql: The last "group by" on the second to the last line contains a 3, which seems to refer to source, but source in this select is set to ''. What is intended here?

The intention was to aggregate over all sources and to include '' in every resulting row, so that there's the same number and type of columns as in the SELECT above. That third column, '' AS source, is not an aggregate function, so I included it in the GROUP BY. But I just tried out to simply drop that column from the GROUP BY, which worked just fine. Changed.

Why are there columns in the measurements table that are not used later on (mostly the unrecognized key lines))? Shouldn't these be omitted?
Here it seems as if only 'endpointremote' is used to determine the content of column 'server'. If just that is needed it ought to be filled by the java code initially. But, I might have overlooked something?

I tried to make the schema as complete as possible, so that we can write different views in the future more easily. In fact, I only left out path information and build times, because I wanted to avoid making the schema too complex while not using the data at all. But I could see how we're adding those parts, too. Regarding your question why we included unrecognized keys, they will all be recognized in the next metrics-lib version, in which case we'll update this code.

In some other module we used constants for column-names in Java. As the db redesign will be future work, is this intentionally left to be improved then?

Well, I'm not a big fan of how we used string constants there, because they give a false sense of code robustness by protecting against typos, but they cannot be used in prepared statements and also don't provide any type safety. And at least during development I ran much more often into the latter two issues than into typos. But maybe you're right and string constants are better than nothing. Changed.

Please find my task-22112 branch with the fixes. What do you think? Ready to remove the beta label and call this done? :)

comment:4 in reply to:  3 Changed 2 years ago by iwakeh

Status: needs_reviewmerge_ready

Replying to karsten:

[snip]

Well, I'm not a big fan of how we used string constants there, because they give a false sense of code robustness by protecting against typos, but they cannot be used in prepared statements and also don't provide any type safety. And at least during development I ran much more often into the latter two issues than into typos. But maybe you're right and string constants are better than nothing. Changed.

Thanks! It is mostly the 'better than nothing' reason and also more readability (using the same constant is clearly saying these strings mean the same). Of course all the improvements for db access will need their own little project.

Please find my task-22112 branch with the fixes. What do you think? Ready to remove the beta label and call this done? :)

Yep, ready for non-beta.

comment:5 Changed 2 years ago by karsten

Cool! Just one more thing: do you feel strongly about calling the new data file torperf2.csv rather than onionperf.csv? If not, I'd rename it. Sorry for mixing tickets here, but that question is what remains before we can call this stable. Thanks!

comment:6 Changed 2 years ago by karsten

Resolution: implemented
Status: merge_readyclosed

Renamed the data file to torperf-1.1.csv as suggested on this related ticket. Also removed the beta warnings. Pushed to master and deployed. Closing. Yay!

Note: See TracTickets for help on using tickets.