Opened 16 months ago

Last modified 9 months ago

#31071 reopened enhancement

Add a notice if we're missing data for a lookup

Reported by: karsten Owned by: metrics-team
Priority: Medium Milestone:
Component: Metrics/ExoneraTor Version:
Severity: Normal Keywords:
Cc: metrics-team Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

Turns out the the exit scanner had an issue between April 25 and 29, 2019. If somebody looks up their exit IP address during that time, they won't be listed in the results. I know of one case where this is now potentially an issue.

Let's think about adding a notice if we're missing data for part of a lookup period, including exit lists and maybe also consensuses. This is different from having no data at all, it's about missing some data only.

First step will be to refine the (already quite complex) query to return whether we have sufficient or insufficient data, possibly but not necessarily with exact timestamps of available data.

Second step will be to include the notice in the website, first in English and then in translated languages.

Third step will be to release and deploy all this.

I'll work on this, but I'm putting it into needs_review to discuss the idea first.

Child Tickets

Attachments (1)

task-31071-note.png (213.8 KB) - added by karsten 13 months ago.

Download all attachments as: .zip

Change History (19)

comment:1 Changed 16 months ago by karsten

Cc: metrics-team added
Reviewer: irl
Status: assignedneeds_review

comment:2 Changed 16 months ago by karsten

And let's add a step zero: look through the archives when this situation has happened before and post all those time intervals to tor-relays@ with an explanation how this affects ExoneraTor results. I'll start with this now.

comment:3 Changed 16 months ago by irl

How do we plan to detect incomplete data? Perhaps the simplest option is to have a table that keeps track of events, and return those events along with the result instead of making the query more complex. i.e. do two SQL queries, one for the data and a second one for any events that might give context.

When the importer fails to find a new exit list, it can just add an entry to the events table. If there are not enough addresses in it, or no address was recently checked, it could do the same.

Understanding how often this has happened would be a good start, maybe we don't need to have the importer do this if it's not happening that often, we can just detect it and manually add rows to the table.

comment:4 in reply to:  3 Changed 16 months ago by karsten

Replying to irl:

Understanding how often this has happened would be a good start, maybe we don't need to have the importer do this if it's not happening that often, we can just detect it and manually add rows to the table.

Here are the gaps of 4 hours or more that I found in existing data:

Gap of  19 hours between 2011-09-10T00:05 and 2011-09-10T19:28:46.
Gap of  10 hours between 2011-09-10T22:30:22 and 2011-09-11T09:27:05.
Gap of  23 hours between 2011-12-21T01:21:19 and 2011-12-22T00:23:36.
Gap of   4 hours between 2012-01-10T03:16:24 and 2012-01-10T07:20:03.
Gap of 111 hours between 2012-02-07T02:33:16 and 2012-02-11T18:07:47.
Gap of  12 hours between 2012-11-09T05:06:32 and 2012-11-09T17:44:08.
Gap of   6 hours between 2013-03-07T06:14:23 and 2013-03-07T13:03:51.
Gap of  26 hours between 2013-03-07T15:16:50 and 2013-03-08T17:17:55.
Gap of 156 hours between 2013-03-14T09:29:25 and 2013-03-20T22:03:41.
Gap of   6 hours between 2013-08-08T20:07:23 and 2013-08-09T02:57:34.
Gap of   7 hours between 2013-09-29T01:01:11 and 2013-09-29T08:04:56.
Gap of  12 hours between 2013-10-05T14:13:06 and 2013-10-06T03:11:26.
Gap of  11 hours between 2013-11-03T15:06:38 and 2013-11-04T02:31:19.
Gap of   4 hours between 2013-12-24T08:33:57 and 2013-12-24T13:04:42.
Gap of   7 hours between 2014-01-21T10:38:42 and 2014-01-21T18:14:20.
Gap of  19 hours between 2015-10-09T14:13:37 and 2015-10-10T09:57:26.
Gap of   8 hours between 2016-09-18T03:16:31 and 2016-09-18T12:11:46.
Gap of  14 hours between 2017-11-19T17:12:04 and 2017-11-20T07:54:34.
Gap of   9 hours between 2018-01-21T22:12:34 and 2018-01-22T08:10:15.
Gap of   6 hours between 2018-01-26T16:11:36 and 2018-01-26T23:02:35.
Gap of   5 hours between 2018-01-27T04:21:42 and 2018-01-27T09:28:47.
Gap of   5 hours between 2018-01-27T14:53:18 and 2018-01-27T20:24:21.
Gap of  18 hours between 2018-02-02T18:27:54 and 2018-02-03T13:06:09.
Gap of   9 hours between 2018-02-25T00:16:21 and 2018-02-25T10:12:04.
Gap of   5 hours between 2018-03-03T15:54:17 and 2018-03-03T21:09:07.
Gap of   5 hours between 2018-09-24T15:11:07 and 2018-09-24T21:09:27.
Gap of   9 hours between 2018-12-30T23:22:06 and 2018-12-31T08:57:55.
Gap of  13 hours between 2019-01-12T18:30:24 and 2019-01-13T08:18:33.
Gap of 122 hours between 2019-04-25T13:13:19 and 2019-04-30T15:40:21.
Gap of  21 hours between 2019-05-25T19:04:43 and 2019-05-26T16:09:23.
Gap of   9 hours between 2019-06-21T21:14:21 and 2019-06-22T07:06:06.

Note that a 4 hour downtime wouldn't be an issue for ExoneraTor. It considers a previously scanned exit IP address valid for 24 hours. We would probably be looking for gaps 18 hours or longer.

comment:5 Changed 16 months ago by karsten

Owner: changed from karsten to metrics-team
Status: needs_reviewassigned

I'm not working on this ticket at the moment. Re-assigning to metrics-team.

comment:6 Changed 15 months ago by karsten

I started looking into implementing this enhancement.

First thing is that we'll have to define criteria for putting out a warning that we might be lacking exit scanner information. I think that experiencing an 18+ hour gap of exit scan results would be reason enough to emit a warning. This would apply to the requested day minus 2 days and plus 1 day.

What we could do is extend search_by_date_address24() by another UNION to include a query like this:

SELECT DATE(scanned) AS date, NULL AS fingerprint_base64,
      DATE_TRUNC('hour', scanned) AS scanned, NULL AS exitaddress,
      NULL AS validafter, NULL AS nickname, NULL AS exit, NULL AS oraddress
  FROM exitlistentry_exitaddress
  WHERE DATE(scanned)
  BETWEEN '2013-03-12' AND '2013-03-15' -- requested date in this example: '2013-03-14'
  GROUP BY 1, 3;

The response would contain up to 4 * 24 = 96 rows for each hour of exit scan results during the four days around the requested date. We could then check for gaps that are 18 hours or longer and print out a warning if we found something.

This query is really fast, because it uses the exitlistentry_exitaddress_date_scanned_fingerprint_id index. I don't expect major performance drawbacks from adding this query to search_by_date_address24().

Next steps:

  • Provide a new schema file src/main/sql/exonerator3.sql with the updated search_by_date_address24 function.
  • Update QueryServlet.java to parse the new query reponse by checking whether there were exit scanner results for all four days in question.
  • Extend QueryResponse.java to include this information.
  • Update ExoneraTorServlet.java to display this information.
  • Prepare the new warning message for translation.

Still leaving this assigned to metrics-team, as I'm not planning to implement this in the next couple days.

Feedback appreciated!

Changed 13 months ago by karsten

Attachment: task-31071-note.png added

comment:7 Changed 13 months ago by karsten

Status: assignedneeds_review

I finished a first implementation of this feature. It's pretty much as described in my previous comment, with the following exceptions:

  • I extended the view to missing statuses. That means if the directory authorities do not produce a consensus for an extended time, we'll put out a warning, too. This case is much less likely than a failing exit scanner, but it seemed related enough to include this case here.
  • I did not restrict the warning to cases when we're missing consecutive 18 hours of statuses or exit lists, but for cases when we're missing any 18 hours in the 3 or 4 days we're looking at. This was slightly easier to implement, but it's probably also slightly more robust, because it also addresses cases where the scanner dies, comes back shortly, and dies again.
  • I updated src/main/sql/exonerator2.sql rather than creating a new file src/main/sql/exonerator3.sql, because the changes did not affect any tables or views, just a single function. Let's be honest, the whole update is going to be a manual process anyway, and one needs to know exactly what one is doing. That's why I kept this simple.

Please take a look at the attached screenshot with examples how this new warning would look like. It's the "However, the database..." part. We could of course make this more visible and clearer. Maybe that's a UX team mission, though.


Regarding code, please take a look at commit b0165cb in my task-31071 branch.

comment:8 Changed 13 months ago by irl

Status: needs_reviewmerge_ready

Looks good to me.

comment:9 Changed 13 months ago by karsten

Cool! I asked a few folks to take a look at this ticket regarding UX and related aspects. Let's give it another few days or so before merging this ticket. And once it's merged, let's wait with the release until relevant translations exist.

comment:10 Changed 13 months ago by gk

Looks good from my PoV.

comment:11 Changed 13 months ago by emmapeel

Regarding localization, transifex is updating automatically when the file https://gitweb.torproject.org/exonerator.git/tree/src/main/resources/ExoneraTor.properties gets updated.

Are you planning to add a new localization file, or the strings are going to be there?

There are several languages to add also, maybe we can do a round as part of the upgrade?

comment:12 Changed 13 months ago by karsten

Thanks for looking, gk!

emmapeel, the plan is to add a new line to that existing file (or several lines if needed). In theory, this new line should be included in translations automatically. And yes, we can talk about adding more languages as part of this upgrade. That would happen after merging this change and waiting for a week or two for translations to come in.

comment:13 in reply to:  7 Changed 13 months ago by antonela

Replying to karsten:

This looks good for me.

comment:14 Changed 13 months ago by karsten

Great! Merged to master.

emmapeel, is the updated ExoneraTor.properties file automatically being updated in the translation portal, or is there a manual step involved? It would be great to have translations of the new line "summary.missingdata=However, the database is missing several hours of data for this specific request, so that this result must be interpreted carefully." before releasing and deploying this change.

comment:15 Changed 12 months ago by karsten

Resolution: fixed
Status: merge_readyclosed

I'm about to put out a new release with this change. Four weeks after merging this commit to master we only have 1 translation of that new line (French) out of 4 translated languages (German, French, Romanian, Swedish). However, the French translation has so many small changes like different apostrophes or whitespace characters that I'm having a hard time detecting actual textual changes. And apparently the German translation has been reset to the English original in a recent commit. This doesn't work so well. I decided to leave translations unchanged for this release which means that this new line will not be translated. We should discuss our translation policy at some point, but for the moment this shouldn't block the release. Closing, as this change is going to be released and deployed in a bit.

comment:16 Changed 12 months ago by karsten

Resolution: fixed
Status: closedreopened

More bad news: Turns out that worst case query time goes up from 3 to 58 seconds with the new query, which is inacceptable. We'll have to analyze and improve the query on the production database before merging this change. I just reverted this change and will put out a release without it now.

comment:17 Changed 10 months ago by karsten

We'll also have to make sure that this notice shows up when searching for IPv6 addresses. As of now, exit lists do not contain IPv6 addresses at all, so we're always missing exit list data when looking up an IPv6 address. But this will change at some point, and then we'll have to do the same checks as for the IPv4 case. Closed #26051 as near duplicate that covered the IPv6 case.

comment:18 Changed 9 months ago by irl

Reviewer: irl

This is not currently in need of review.

Note: See TracTickets for help on using tickets.