Opened 4 years ago

Closed 4 years ago

#17701 closed defect (not a bug)

ExitList descriptor chokes on getUnrecognizedLines()

Reported by: tomlurge Owned by: karsten
Priority: Medium Milestone:
Component: Metrics/Library Version:
Severity: Normal Keywords:
Cc: tomlurge Actual Points:
Parent ID: Points:
Reviewer: Sponsor:

Description

ConvertToJson [0] is a converter from raw CollecTor data to JSON, based on metrics-lib. Converting Exit-List descriptors (@type Tordnsel) works fine ([0], line 1221) until a check for unrecognized attributes ([0], line 109) is added. With the check enabled te converter outputs an error report like the following for each descriptor of type tordnsel:

Unrecognized lines in /data/in/tordnsel/2015-09-01-08-02-03:

[@type tordnsel 1.0]

It doesn't produce a JSON result though (which it should even if it encounters an unrecognized line).

[0] https://github.com/tomlurge/mteam/blob/master/src/mteam/ConvertToJson.java

Child Tickets

Attachments (3)

0004-tordnsel-patch-for-task-17701.patch (5.2 KB) - added by iwakeh 4 years ago.
patch to applied on top of the patches for the torperf ticket
0001-changes-for-17701.patch (2.4 KB) - added by iwakeh 4 years ago.
part one
0002-test-class-for-17701.patch (3.1 KB) - added by iwakeh 4 years ago.
part two

Download all attachments as: .zip

Change History (23)

comment:1 Changed 4 years ago by tomlurge

Cc: tomlurge added

comment:2 Changed 4 years ago by iwakeh

Component: Metricsmetrics-lib
Owner: set to karsten

Changed 4 years ago by iwakeh

patch to applied on top of the patches for the torperf ticket

comment:3 Changed 4 years ago by iwakeh

patch added.
please test and review.

comment:4 Changed 4 years ago by tomlurge

Dry run fails.

git -c diff.mnemonicprefix=false -c core.quotepath=false -c credential.helper=sourcetree apply -v --check --reject -p 1 /Users/tl/Downloads/0004-tordnsel-patch-for-task-17701.patch
fatal: new file test/org/torproject/descriptor/impl/ExitListImplTest.java depends on old contents
Completed with errors, see above

Tried to apply the patch anyway (since the error seems to only affect tests) but that failed too.

Last edited 4 years ago by tomlurge (previous) (diff)

Changed 4 years ago by iwakeh

part one

Changed 4 years ago by iwakeh

part two

comment:5 Changed 4 years ago by iwakeh

used the same sources and generated two new patch files.
one with the changes and one with the new test class.
(I don't know what caused the above problem.)

Last edited 4 years ago by iwakeh (previous) (diff)

comment:6 Changed 4 years ago by tomlurge

Applying the 2 patches seperately worked.
The converter doesn't throw errors anymore but doesn't output any descriptor data either (except the first line that's hardcoded at the top - verbose, compress and /data/in/path), no matter if the check for unrecognized lines is on or off.

Last edited 4 years ago by tomlurge (previous) (diff)

comment:7 Changed 4 years ago by iwakeh

Well, the test returns the five descriptors that it received.
I noticed that the your testdata file is not correctly named.
I should be named 2015-09-01-00-02-02 as those files on collector.torproject.org

comment:8 Changed 4 years ago by tomlurge

All tordnsel descriptors are named like that on my box.
I threw a tordnsel descriptor together with descriptors of all other types in one directory and converted them in one pass to rule out typos and the like but same result: they all pass, no errors or warnings, but also no sign of a tordnsel-descriptor in the otherwise complete JSON result set.

comment:9 Changed 4 years ago by iwakeh

there is a dublicate exitnode entry in your data (line 2790):

ExitAddress 89.248.172.45 2015-08-31 01:09:00
ExitAddress 89.248.169.36 2015-08-31 19:04:24

that causes the problem.

Maybe check the descriptorFile.getException() in the Converter?

Last edited 4 years ago by iwakeh (previous) (diff)

comment:10 Changed 4 years ago by tomlurge

Which data are you referring to? My sample data is not on Github and there's no file with 2790 lines that I'm aware of.
Can you tell me how to run the tests that your patch includes? Maybe that can give a hint on what's going wrong but I've never run test before.
Also I could use a hand in how to call the getException() method if you have a minute.

comment:11 Changed 4 years ago by tomlurge

I added the getException() method in ConvertToJson (line 103 ff) like this:

//  tordnsel
if (descriptor instanceof ExitList) {
  jsonDescriptor = JsonExitList.convert((ExitList) descriptor);
  System.out.print(descriptorFile.getException());
}

but got neither exceptions nor JSON.

comment:12 Changed 4 years ago by tomlurge

Changed error handler to descriptorFile.getException().printStackTrace(); but to no avail.
Just started my first foray into using a real debugger: what I can see and hopefully describe correctly is that for every other type of descriptors the code walks through every step of the if (descriptor instanceof <someDescriptorType>) switching logic _but_ for tordnsel descriptors it jumps from for (Descriptor descriptor : descriptorFile.getDescriptors()) (line 75) straight to the line line befor bw.close. Just like that... That's "strange".

comment:13 in reply to:  11 Changed 4 years ago by iwakeh

Replying to tomlurge:

I added the getException() method in ConvertToJson (line 103 ff) like this:

//  tordnsel
if (descriptor instanceof ExitList) {
  jsonDescriptor = JsonExitList.convert((ExitList) descriptor);
  System.out.print(descriptorFile.getException());
}

Rather to System.err if there is an exception:

 if (descriptor instanceof ExitList) {
   jsonDescriptor = JsonExitList.convert((ExitList) descriptor);
  if(null != descriptorFile.getException()){
    System.err.print(descriptorFile.getException());
  }
 }

You're above

descriptorFile.getException().printStackTrace();

will cause a NullPointerException when getException() returns null.

Last edited 4 years ago by iwakeh (previous) (diff)

comment:14 in reply to:  10 ; Changed 4 years ago by iwakeh

Replying to tomlurge:
Could you add your data to github?
After removing the double exitnode line the converter produced the wanted json
with your old data on github.

...
Can you tell me how to run the tests that your patch includes? Maybe that can give a hint on what's going wrong but I've never run test before.

You build metrics-lib by

ant jar

and the following runs the test suites

ant test

or everything combined

ant clean compile test jar

PS:
This is the data I was using:
https://github.com/tomlurge/mteam/blob/master/docs/rawDataExamples/tordnsel.txt

Last edited 4 years ago by iwakeh (previous) (diff)

comment:15 in reply to:  14 ; Changed 4 years ago by tomlurge

Replying to iwakeh:

Replying to tomlurge:
Could you add your data to github?

Here you go: https://github.com/tomlurge/mteam/tree/master/data/in

After removing the double exitnode line the converter produced the wanted json
with your old data on github.

Same here! See below...

PS:
This is the data I was using:
https://github.com/tomlurge/mteam/blob/master/docs/rawDataExamples/tordnsel.txt

Ah! I had forgotten that completely... As you can see in the /data/in directory, that is now in the repo, I have one directory with a few descriptors per each type and another directory, named 'singles', with one descriptot per type. The Tordnsel descriptor in the 'singles' directory is the same as in /docs/rawDataExamples.
Indeed every Tordnsel descriptot contained an entry "ExitAddress 89.248.169.36 2015-08-31 19:04:24". In all but one this entry was empty. In one descriptor - the one in the "singles" directory - the preceding entry was empty.

I have now - on my local copy - deleted these empty entries. Now the descriptor in "singles" passes as well as 6 of the 9 descriptors in directory "tordnsel". I assume the remaining 3 have more empty entries.

Thanks for spotting this!

But still this is a bug, right? The converter is not supposed to just give up silently on a whole descriptor file just because it encounters an empty entry. Btw: it's not a "double" entry as you say. It's an empty entry, followed by the next, non-empty, entry. So it's 2 ExitAddress lines following each other immediatly, but with different content.

Last edited 4 years ago by tomlurge (previous) (diff)

comment:16 in reply to:  15 Changed 4 years ago by iwakeh

Replying to tomlurge:

But still this is a bug, right? The converter is not supposed to just give up silently on a whole descriptor file just because it encounters an empty entry.

No, it seems to be a feature. The descriptor doesn't give up silently, because there is the
getException() method. The Converter ought to use it in order to find out about problems.
That's why it's there.
And, that processing ends here was a decision made by Karsten long ago, so he probably could
tell about the reasoning behind it.

Ceterum censeo ;-) these findings are even more reason to add tests for the Converter now
and define expected behavior.

Btw: it's not a "double" entry as you say. It's an empty entry, followed by the next, non-empty, entry. So it's 2 ExitAddress lines following each other immediatly, but with different content.

This is not my naming:
The metrics-lib code expects one ExitAddress and then the three other data lines,
but it encounters two ExitNode entries. Thus, the metrics-lib code 'thinks' this a duplicate ExitNode entry in one descriptor.

comment:17 Changed 4 years ago by iwakeh

The multiple ExitAddress lines will be dealt with in #17821

comment:18 Changed 4 years ago by karsten

Status: newneeds_information

Now that #17821 is resolved, are there any remaining issues in this ticket that we need to fix?

comment:19 Changed 4 years ago by tomlurge

ConvertToJson works fine with current metrics-lib. No remaining issues that I'm aware of. Thanks again to iwakeh for his help and patience with me!

Last edited 4 years ago by tomlurge (previous) (diff)

comment:20 Changed 4 years ago by karsten

Resolution: not a bug
Status: needs_informationclosed

Perfect! Resolving this ticket (as "not a bug", because the bug was not in metrics-lib, AFAIUI). Please re-open if issues in metrics-lib remain that were described above.

Note: See TracTickets for help on using tickets.