Opened 7 years ago

Closed 7 years ago

Last modified 7 years ago

#6445 closed enhancement (fixed)

Version spec revisions

Reported by: atagar
Owned by:
Priority: Low
Milestone:
Component: Core Tor/Tor
Version:
Severity:
Keywords: tor-relay
Cc:
Actual Points:
Parent ID:
Points:
Reviewer:
Sponsor:

Description

Adding an EXTRA_INFO attribute to tor versions

Tor versions often contain information about the SCM commit they came from, for instance...
0.2.3.16-alpha-dev (git-8be6058d8f31e578)

This isn't part of the spec, which in turn choked stem when I tried to parse those versions. Adding this in and better defining a couple other points...

  • The STATUS_TAG should only contain non-whitespace. Otherwise... well, just about *anything* could be a 'valid' status tag.
  • The spec says that status tags should be "compared lexically". The ASCII value of 'Z' is greater than 'A' so I guess this means that they're sorted in a reverse alphabetical order. This seems weird, but clarifying it.
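For illustration, here is a rough Python sketch of parsing a version string with an optional status tag and a trailing parenthesized EXTRA_INFO. The regular expression and the `parse_version` helper are assumptions made up for this example; they are neither the spec's grammar nor stem's actual parser.

```python
import re

# Hypothetical pattern for illustration only; the authoritative grammar
# is whatever the version spec defines.
VERSION_RE = re.compile(
    r"^(\d+)\.(\d+)\.(\d+)(?:\.(\d+))?"  # major.minor.micro[.patch]
    r"(?:-(\S+))?"                       # optional status tag, no whitespace
    r"(?: \((.*)\))?$"                   # optional extra info in parentheses
)


def parse_version(text):
    """Split a version string into (numbers, status_tag, extra_info)."""
    match = VERSION_RE.match(text)
    if match is None:
        raise ValueError("not a recognized version: %r" % text)
    major, minor, micro, patch, status, extra = match.groups()
    # A missing patch level is treated as 0 in this sketch.
    numbers = tuple(int(n) for n in (major, minor, micro, patch or 0))
    return numbers, status, extra


print(parse_version("0.2.3.16-alpha-dev (git-8be6058d8f31e578)"))
```

Because the status tag is matched with `\S+`, a tag containing whitespace is rejected rather than swallowing the rest of the string.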

Change History (11)

comment:1 Changed 7 years ago by atagar

Status: new → needs_review

comment:2 Changed 7 years ago by atagar

The spec says that status tags should be "compared lexically".

As Roger pointed out, I'm confused on this bit.

23:01 < armadev> atagar: since when is Z less than A in ascii?
11:31 < atagar> armadev: Sorry about that, I was thinking about sorting the numeric values.
11:34 < armadev> atagar: i think you have the definition of 'lexically' backwards.
11:38 < atagar> Yea, I was confusing myself. Looked it up, said 'oh, it means alphabetically? that has ambiguity, what about capitals or numbers?', then reread the spec where it said to compare by the ASCII byte value and decided that it meant sorting by that numerically.
11:38 < atagar> Which is the right conclusion, but backwards.

Personally I still think "we compare them lexically as ASCII byte strings" is confusing. My preference would be for us to honor what we said earlier in the spec that "The STATUS_TAG is purely informational" and not have it impact the comparison, but happy to have it clarified whatever way works best.

comment:3 Changed 7 years ago by nickm

I'm not sure I get it. When I look at the code, it seems to do a proper (that is, forward) lexical comparison. Here's a snippet I added to test_dir_versions to verify this:

  test_eq(1, tor_version_as_new_as("Tor 0.2.1.1-rc",
                                   "Tor 0.2.1.1-alpha"));
  test_eq(0, tor_version_as_new_as("Tor 0.2.1.1-rc",
                                   "Tor 0.2.1.1-zeta"));

It seems to pass. What makes you think that the comparison is happening reverse-lexically?

(BTW, lexically DOES NOT mean alphabetically. Alphabetical comparison is approximately what strcasecmp does. Lexical comparison is what strcmp does.)
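To make the distinction concrete, here is a small Python sketch. The sample tags are made up for the example. Python's default string sort compares code points, which for ASCII text matches strcmp's ordering; sorting on the lowercased value is only a rough stand-in for strcasecmp.

```python
# Made-up tags for illustration; mixed case shows where the two orders differ.
tags = ["alpha", "Zeta", "beta"]

# Lexical order: compares character values, so 'Z' (90) sorts before 'a' (97).
lexical = sorted(tags)

# Roughly alphabetical order: case is ignored when comparing.
alphabetical = sorted(tags, key=str.lower)

print(lexical)       # ['Zeta', 'alpha', 'beta']
print(alphabetical)  # ['alpha', 'beta', 'Zeta']
```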

comment:4 Changed 7 years ago by atagar

I'm not sure I get it. When I look at the code, it seems to do a proper (that is, forward) lexical comparison.

Ack! Disregard the lexical part. I wasn't sure what "lexical comparison" meant, and I'm still not entirely sure after several attempts to look it up (the closest I've found so far is the term's use in the Perl documentation). If it means inverse ordinal order then great. I'm far more interested in the other bits of this spec change.

comment:5 Changed 7 years ago by nickm

But it DOESN'T mean inverse ordinal order. What makes you think it does?

Let me try again: "strcmp(a,b) is < 0 if and only if a lexically precedes b." Does that explain what lexical order is?

comment:6 Changed 7 years ago by atagar

But it DOESN'T mean inverse ordinal order. What makes you think it does?

Because the best description I've found for lexical sorting seems to say so...
http://www.acrobatfaq.com/atbref5/index/ObjectsConcepts/Codingconventions/Sorting-lexicalandnumeri.html

"In fact such sorts look at the underlying ASCII/Unicode character number and sort from lowest to highest for each character, in turn of a word or string of characters."

Let me try again: "strcmp(a,b) is < 0 if and only if a lexically precedes b." Does that explain what lexical order is?

So lexical order is "whatever strcmp() does"? That's frustrating. Ok, I'll look around for documentation on strcmp() then.

comment:7 Changed 7 years ago by nickm

Argh. Sorry; it didn't occur to me you didn't know strcmp.

Let me try again.

Suppose you have sequences of elements. Suppose that those elements themselves have an ordering defined on them.

To lexically compare two sequences, consider their elements pairwise, starting with the first element of each, then the second, then the third, and so on.

If you find a pair of elements that are not equal, then the sequence containing the earlier element is lexically earlier. If you run out of elements in one sequence but not in the other, then the one in which you ran out of elements first is earlier. Otherwise, the sequences are identical.

This is what Python does with two byte sequences when you compare them with cmp, <, and so on. Basically, it is the same as alphabetical order, except that properly speaking ASCII isn't an alphabet.
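The three rules above could be sketched in Python roughly as follows. The `lexical_cmp` name and its -1/0/1 return convention are assumptions chosen for this example, not anything taken from tor or the spec.

```python
def lexical_cmp(a, b):
    """Return -1, 0 or 1 as sequence a is lexically before, equal to, or after b."""
    # Walk both sequences pairwise; the first unequal pair decides the order.
    for x, y in zip(a, b):
        if x != y:
            return -1 if x < y else 1
    # All shared positions are equal, so the sequence that ran out first is earlier.
    return (len(a) > len(b)) - (len(a) < len(b))


print(lexical_cmp(b"rc", b"alpha"))    # 1: 'r' comes after 'a'
print(lexical_cmp(b"alpha", b"beta"))  # -1: 'a' comes before 'b'
print(lexical_cmp(b"rc", b"rc1"))      # -1: "rc" runs out first
```

The result has the same sign as what strcmp would return for the same ASCII strings.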

comment:8 Changed 7 years ago by atagar

If you find a pair of elements that are not equal, then the sequence containing the earlier element is lexically earlier.

Ahh, gotcha. Maybe I'm misunderstanding what "inverse ordinal" order means since I thought that's what this is. Ie...

>>> def is_greater_than(a, b):
...   for i in range(len(a)):
...     if i >= len(b):
...       return True  # b ran out of characters first, so a is later
...     elif ord(a[i]) < ord(b[i]):
...       return False
...     elif ord(a[i]) > ord(b[i]):
...       return True
...   return False  # a is equal to b, or a is a prefix of b

>>> is_greater_than("rc", "alpha")
True

>>> is_greater_than("alpha", "beta")
False

>>> is_greater_than("ho hum", "Ho hum")
True

(code just scratched together, didn't give much thought to the empty comparisons)

Thanks for the clarification. :)

comment:9 Changed 7 years ago by nickm

Resolution: fixed
Status: needs_review → closed

Removed the incorrect definition; merged the rest.

comment:10 Changed 7 years ago by nickm

Keywords: tor-relay added

comment:11 Changed 7 years ago by nickm

Component: Tor Relay → Tor