ExitList descriptor chokes on getUnrecognizedLines()

added component::metrics/library owner::karsten priority::medium reporter::tomlurge resolution::not a bug severity::normal status::closed type::defect labels

Trac:
Username: tomlurge
Cc: N/A to tomlurge

Trac:
Owner: N/A to karsten
Component: Metrics to metrics-lib

Trac:
0004-tordnsel-patch-for-task-17701.patch

patch to applied on top of the patches for the torperf ticket

patch added. please test and review.

Dry run fails.

git -c diff.mnemonicprefix=false -c core.quotepath=false -c credential.helper=sourcetree apply -v --check --reject -p 1 /Users/tl/Downloads/0004-tordnsel-patch-for-task-17701.patch fatal: new file test/org/torproject/descriptor/impl/ExitListImplTest.java depends on old contents Completed with errors, see above

Tried to apply the patch anyway (since the error seems to only affect tests) but that failed too.

Trac:
Username: tomlurge

Trac:
0001-changes-for-17701.patch

part one

Trac:
0002-test-class-for-17701.patch

part two

used the same sources and generated two new patch files. one with the changes and one with the new test class. (I don't know what caused the above problem.)

Applying the 2 patches seperately worked. The converter doesn't throw errors anymore but doesn't output any descriptor data either (except the first line that's hardcoded at the top - verbose, compress and /data/in/path), no matter if the check for unrecognized lines is on or off.

Trac:
Username: tomlurge

Well, the test returns the five descriptors that it received. I noticed that the your testdata file is not correctly named. I should be named 2015-09-01-00-02-02 as those files on collector.torproject.org

All tordnsel descriptors are named like that on my box. I threw a tordnsel descriptor together with descriptors of all other types in one directory and converted them in one pass to rule out typos and the like but same result: they all pass, no errors or warnings, but also no sign of a tordnsel-descriptor in the otherwise complete JSON result set.

Trac:
Username: tomlurge

there is a dublicate exitnode entry in your data (line 2790):

ExitAddress 89.248.172.45 2015-08-31 01:09:00
ExitAddress 89.248.169.36 2015-08-31 19:04:24

that causes the problem.

Maybe check the descriptorFile.getException() in the Converter?

Which data are you referring to? My sample data is not on Github and there's no file with 2790 lines that I'm aware of. Can you tell me how to run the tests that your patch includes? Maybe that can give a hint on what's going wrong but I've never run test before. Also I could use a hand in how to call the getException() method if you have a minute.

Trac:
Username: tomlurge

I added the getException() method in ConvertToJson (line 103 ff) like this:

//  tordnsel
if (descriptor instanceof ExitList) {
  jsonDescriptor = JsonExitList.convert((ExitList) descriptor);
  System.out.print(descriptorFile.getException());
}

but got neither exceptions nor JSON.

Trac:
Username: tomlurge

Changed error handler to descriptorFile.getException().printStackTrace(); but to no avail. Just started my first foray into using a real debugger: what I can see and hopefully describe correctly is that for every other type of descriptors the code walks through every step of the if (descriptor instanceof <someDescriptorType>) switching logic but for tordnsel descriptors it jumps from for (Descriptor descriptor : descriptorFile.getDescriptors()) (line 75) straight to the line line befor bw.close. Just like that... That's "strange".

Trac:
Username: tomlurge

Replying to tomlurge:

I added the getException() method in ConvertToJson (line 103 ff) like this:

//  tordnsel
if (descriptor instanceof ExitList) {
  jsonDescriptor = JsonExitList.convert((ExitList) descriptor);
  System.out.print(descriptorFile.getException());
}
}}}

Rather to System.err if there is an exception: {{{ if (descriptor instanceof ExitList) { jsonDescriptor = JsonExitList.convert((ExitList) descriptor); if(null != descriptorFile.getException()){ System.err.print(descriptorFile.getException()); } }


You're above

descriptorFile.getException().printStackTrace();

will cause a NullPointerException when getException() returns null.

Replying to tomlurge: Could you add your data to github? After removing the double exitnode line the converter produced the wanted json with your old data on github.

... Can you tell me how to run the tests that your patch includes? Maybe that can give a hint on what's going wrong but I've never run test before.

You build metrics-lib by

ant jar

and the following runs the test suites

ant test

or everything combined

ant clean compile test jar

PS: This is the data I was using: https://github.com/tomlurge/mteam/blob/master/docs/rawDataExamples/tordnsel.txt

Replying to iwakeh:

Replying to tomlurge: Could you add your data to github?

Here you go: https://github.com/tomlurge/mteam/tree/master/data/in

After removing the double exitnode line the converter produced the wanted json with your old data on github.

Same here! See below...

PS: This is the data I was using: https://github.com/tomlurge/mteam/blob/master/docs/rawDataExamples/tordnsel.txt

Ah! I had forgotten that completely... As you can see in the /data/in directory, that is now in the repo, I have one directory with a few descriptors per each type and another directory, named 'singles', with one descriptot per type. The Tordnsel descriptor in the 'singles' directory is the same as in /docs/rawDataExamples. Indeed every Tordnsel descriptot contained an entry "ExitAddress 89.248.169.36 2015-08-31 19:04:24". In all but one this entry was empty. In one descriptor - the one in the "singles" directory - the preceding entry was empty.

I have now - on my local copy - deleted these empty entries. Now the descriptor in "singles" passes as well as 6 of the 9 descriptors in directory "tordnsel". I assume the remaining 3 have more empty entries.

Thanks for spotting this!

But still this is a bug, right? The converter is not supposed to just give up silently on a whole descriptor file just because it encounters an empty entry. Btw: it's not a "double" entry as you say. It's an empty entry, followed by the next, non-empty, entry. So it's 2 ExitAddress lines following each other immediatly, but with different content.

Trac:
Username: tomlurge

Replying to tomlurge:

But still this is a bug, right? The converter is not supposed to just give up silently on a whole descriptor file just because it encounters an empty entry.

No, it seems to be a feature. The descriptor doesn't give up silently, because there is the getException() method. The Converter ought to use it in order to find out about problems. That's why it's there. And, that processing ends here was a decision made by Karsten long ago, so he probably could tell about the reasoning behind it.

Ceterum censeo ;-) these findings are even more reason to add tests for the Converter now and define expected behavior.

Btw: it's not a "double" entry as you say. It's an empty entry, followed by the next, non-empty, entry. So it's 2 ExitAddress lines following each other immediatly, but with different content. This is not my naming: The metrics-lib code expects one ExitAddress and then the three other data lines, but it encounters two ExitNode entries. Thus, the metrics-lib code 'thinks' this a duplicate ExitNode entry in one descriptor.

The multiple ExitAddress lines will be dealt with in #17821 (moved)

Now that #17821 (moved) is resolved, are there any remaining issues in this ticket that we need to fix?

Trac:
Status: new to needs_information

ConvertToJson works fine with current metrics-lib. No remaining issues that I'm aware of. Thanks again to iwakeh for his help and patience with me!

Trac:
Username: tomlurge

Perfect! Resolving this ticket (as "not a bug", because the bug was not in metrics-lib, AFAIUI). Please re-open if issues in metrics-lib remain that were described above.

Trac:
Resolution: N/A to not a bug
Status: needs_information to closed

closed

mentioned in issue #17821 (moved)

ExitList descriptor chokes on getUnrecognizedLines()

Child items 0

Activity