Opened 4 months ago

Closed 12 days ago

#28940 closed enhancement (implemented)

Add support for LOG to goptlib

Reported by: dcf Owned by: dcf
Priority: Medium Milestone:
Component: Obfuscation/Pluggable transport Version:
Severity: Normal Keywords: goptlib
Cc: ahf Actual Points:
Parent ID: Points: 0.3
Reviewer: Sponsor: Sponsor19

Description

see:

  • #28179 (code changes)
  • #28181 (pt-spec changes) [doesn't seem to be committed yet?]

ahf made a branch here:
https://github.com/ahf/goptlib/commits/features/logging

Child Tickets

Attachments (2)

0001-Bug-28940-add-support-for-LOG.patch (6.3 KB) - added by dcf 2 months ago.
Bug-28940-add-support-for-LOG.2.patch (6.3 KB) - added by dcf 7 weeks ago.

Download all attachments as: .zip

Change History (15)

comment:1 Changed 4 months ago by dcf

Owner: changed from asn to dcf
Status: newassigned

comment:2 Changed 4 months ago by dcf

Status: assignedneeds_revision

Thanks for making this branch, ahf.

First off: I'm really confused about the format of the LOG message. The discussion at #21879 and #28181 doesn't seem to have been updated after decisions got made on IRC. In addition to LOG, there's also STATUS? It would be nice if the format were in a spec, or at least in textual form in a ticket somewhere. Especially with regard to what byte values are allowed and which must be escaped, and how.

Here's some quick review.

  • https://github.com/ahf/goptlib/commit/2d4cadf91f7da4fb43d8c1e85bb3c851fb660b0c (removing advice on how to spy on the stdout output stream) is inappropriate. The LOG feature does not replace this feature. While debugging, you might want to see for example CMETHOD line that won't just be logged verbatim by tor or whatever the controlling program is. And we won't be able to rely on the LOG feature for a long time yet.
  • I do not want goptlib to expose a log.Logger interface. That is a high-level application matter, not something for a low-level library like goptlib. If an application needs a log.Logger-compatible interface, it can write its own wrapper. I feel that there should be exactly one new exported function:
    func Log(severity int, message string)
    
    plus appropriate constants for severities. This is a tiny new feature; it doesn't need a new source file.
  • Don't use names like kvline_escape_value. Use camel-case names as https://golang.org/doc/effective_go.html#mixed-caps
  • I don't want a full-fledged key–value encoding library if we don't need it. I would rather have Log defined as something like
    line("LOG", "SEVERITY=" + encodeValue(severity), "MESSAGE=" + encodeValue(message))
    
    In the encodeValue helper function, don't do a check like kvline_value_needs_escape. Just escape everything always.
  • No need to call bytes.Buffer.Grow. It will grow automatically as you write to it.
  • In the test code, please also test:
    • Values containing '\0' (should panic?)
    • Values containing bytes >= '\x80' (e.g. UTF-8) (should panic?)
    • Values containing '='
    • Empty string

Unfortunately, per https://spec.torproject.org/pt-spec Section 3.3, we are limited to US-ASCII minus '\x00' and '\n' in all PT output lines, so we need to test what happens when that is violated, even if the KV quoted string output allows other values.

comment:3 Changed 4 months ago by dcf

Please also add examples of using the new feature to dummy-client and dummy-server, and to the long block comment at the top of pt.go. (Just one or two examples is fine, doesn't have to be comprehensive.)

comment:4 Changed 4 months ago by ahf

Sponsor: Sponsor19

Neat! Thanks for the early review. I'll try to revise the patches and link a branch here once I've fixed the things you have asked for.

It will most likely be in early January though!

comment:5 Changed 4 months ago by ahf

Just to address the spec concerns: Yes, they are not committed yet and I'm they are still open for discussion to some extend I think. The last week here we had some other bugs unrelated to all of this that had higher priority, but David and I will return to this in the new year.

comment:6 Changed 3 months ago by ahf

Points: 0.3

comment:7 Changed 3 months ago by ahf

Parent ID: #28180

Remove parent.

comment:8 Changed 3 months ago by ahf

Let's do this as an extension and keep goptlib for the low level details. goptlibext could be the name.

comment:9 in reply to:  8 Changed 2 months ago by dcf

Status: needs_revisionneeds_review

Replying to ahf:

Let's do this as an extension and keep goptlib for the low level details. goptlibext could be the name.

I definitely want a basic Log function in goptlib itself. A higher-level interface, using log.Logger for example, is a good candidate for a separate library that builds on goptlib, I agree.

Here's an implementation of Log along the lines of what I was thinking. The full patch includes more documentation and tests.

// Severity levels for the Log function.
const (
	LogLevelError   = "error"
	LogLevelWarning = "warning"
	LogLevelNotice  = "notice"
	LogLevelInfo    = "info"
	LogLevelDebug   = "debug"
)

// Encode a string according to the CString rules of section 2.1.1 in
// control-spec.txt.
//
// We additionally need to ensure that whatever we return passes argIsSafe,
// because strings encoded by this function are printed verbatim by Log.
func encodeCString(s string) string {
	result := bytes.NewBuffer([]byte{})
	result.WriteByte('"')
	for _, c := range []byte(s) {
		if c == 32 || c == 33 || (35 <= c && c <= 91) || (93 <= c && c <= 126) {
			result.WriteByte(c)
		} else {
			fmt.Fprintf(result, "\\%03o", c)
		}
	}
	result.WriteByte('"')
	return result.String()
}

// Emit a LOG message with the given severity (one of LogLevelError,
// LogLevelWarning, LogLevelNotice, LogLevelInfo, or LogLevelDebug).
func Log(severity, message string) {
	// "<Message> contains the log message which can be a String or CString..."
	// encodeCString always makes the string safe to emit; i.e., it
	// satisfies argIsSafe.
	line("LOG", "SEVERITY="+encodeCString(severity), "MESSAGE="+encodeCString(message))
}

comment:10 Changed 2 months ago by ahf

Status: needs_reviewmerge_ready

Let's get your LOG support into goptlib. It looks good.

I lifted out the logger and the STATUS API into a goptlibext package at https://github.com/ahf/goptlibext

Let's move it over to Gitlab once the instance is up and running this or next week.

comment:11 in reply to:  10 Changed 2 months ago by dcf

Replying to ahf:

I lifted out the logger and the STATUS API into a goptlibext package at https://github.com/ahf/goptlibext

Would it help if I exposed the pt.line function? It's what's used internally to provide all the line-oriented output for other functions like pt.Cmethod and pt.ProxyDone. Or is writing to pt.Stdout good enough for your purposes?

Changed 7 weeks ago by dcf

comment:12 Changed 7 weeks ago by dcf

Status: merge_readyneeds_review

attachment:Bug-28940-add-support-for-LOG.2.patch has some minor changes. The first is that I changed "log level" to "log severity" to match the terminology of the spec.

The second change is to better conform with the letter of the spec regarding the value of SEVERITY=. But the way the spec is worded may be an oversight, so I want to check. Formerly I defined the log severity constants as plain strings. This was safe because the function escaped the severity string, so no matter what garbage the caller provided, it would not violate the global "any US-ASCII character but NUL or NL".

But while the spec specifies quoting for MESSAGE=, it does not actually say that SEVERITY= may be quoted. That is, while this certainly conforms:

LOG SEVERITY=debug MESSAGE="hello world"

this may not conform:

LOG SEVERITY="debug" MESSAGE="hello world"

But if we don't quote the severity string, then we need to prevent callers from providing garbage values for it. I took some inspiration from ahf's patch. I made the log severity constants be instances of an unexported struct type; i.e., it's not possible to create additional instances of the style from outside the package. In short it looks like this:

// Unexported type to prevent external callers from inventing new severities.
type logSeverity struct {
	string
}

// Severity levels for the Log function.
var (
	LogSeverityError   = logSeverity{"error"}
	LogSeverityWarning = logSeverity{"warning"}
	LogSeverityNotice  = logSeverity{"notice"}
	LogSeverityInfo    = logSeverity{"info"}
	LogSeverityDebug   = logSeverity{"debug"}
)

And the body of Log changed from

line("LOG", "SEVERITY="+encodeCString(severity), "MESSAGE="+encodeCString(message))

to

line("LOG", "SEVERITY="+severity.string, "MESSAGE="+encodeCString(message))

So the question is, is this strict enumeration necessary, or is the spec meant to allow quoting of SEVERITY=? It's allowed by tor; but should we specify it? As it stands, an implementation would be totally justified in parsing a LOG line with a regex, something like: ^LOG SEVERITY=(error|warning|notice|info|debug) MESSAGE=(\w+|"(\\[0-9]{1,3}|\\n|\\t|\\r|\\"|\\\\|[^\\])*")$.

comment:13 Changed 12 days ago by dcf

Resolution: implemented
Status: needs_reviewclosed
Note: See TracTickets for help on using tickets.