Issue C. User Input Written to Logs
At 2014-04-23 11:52:12 Arturo Filastò wrote: User input is written to the log files without escaping or removing characters that have special meaning. An attacker can forge log entries, inject terminal escape codes into log entries, or mount other injection attacks.
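To illustrate the problem, here is a small sketch using Python 3's standard logging module. The logger name and the attacker-supplied string are made up for the demonstration. It shows a single log call producing what looks like two separate entries:

```python
import io
import logging

# Hypothetical logger and in-memory handler, set up only for this demonstration.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
demo_log = logging.getLogger("log_injection_demo")
demo_log.setLevel(logging.INFO)
demo_log.propagate = False
demo_log.addHandler(handler)

# Attacker-controlled value containing a newline and a terminal escape code.
user_input = "example.com\nINFO admin login succeeded\x1b[2J"

# One logging call, with the user input interpolated unescaped.
demo_log.info("Fetching %s", user_input)

written = stream.getvalue()
lines = written.splitlines()
# `lines` now holds two entries; the second was entirely supplied by the
# attacker, and the raw ESC byte is still present in `written`.
```

Anyone reading the log line by line cannot tell the forged second entry from a real one, and viewing the file in a terminal would interpret the escape sequence.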
Remediation:
Rather than escaping or sanitizing user input at each call to log(), we recommend fixing the log function itself to do this, so that all logging paths go through the same safe sanitizer. Here are two examples of the sort of encoding we mean: one in use in Tahoe-LAFS, and a self-contained function that we have not used and have only cursorily tested:
# Python 2 code: it relies on the `unicode` type and the 'string_escape' codec.
import codecs


def debug(logmsg):
    """
    I'll encode logmsg into a safe representation (containing only
    printable ASCII characters) and pass it to log.debug() (which in
    this example stands in for some underlying logging module that
    doesn't further process the string).

    As an aside, it can be helpful to hold all strings of human-language
    characters in Python unicode objects, never in Python (Python v2) string
    objects (which are renamed to "bytes" objects in Python v3). However,
    that is not necessary to use this.
    """
    return log.debug(log_encode(logmsg))


def log_encode(logmsg):
    """
    I encode logmsg (a str or unicode) as printable ASCII. Each case
    gets a distinct prefix, so that people can differentiate a unicode
    from a utf-8-encoded-byte-string or binary gunk that would
    otherwise result in the same final output.
    """
    if isinstance(logmsg, unicode):
        return ': ' + codecs.encode(logmsg, 'unicode_escape')
    elif isinstance(logmsg, str):
        try:
            unicodelogmsg = logmsg.decode('utf-8')
        except UnicodeDecodeError:
            # Not valid UTF-8: escape the raw bytes instead.
            return 'binary: ' + codecs.encode(logmsg, 'string_escape')
        else:
            return 'utf-8: ' + codecs.encode(unicodelogmsg, 'unicode_escape')
    else:
        # %r already applies repr(), so pass logmsg itself.
        raise Exception("I accept only a unicode object or a string, not a %s object like %r"
                        % (type(logmsg), logmsg,))
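The function above is Python 2 only (it uses `unicode` and the `string_escape` codec). The following is a minimal sketch of the same idea for Python 3, assuming the standard `logging` module is the underlying logger. The `EscapingFilter` class, the `str: ` prefix, and the logger name are hypothetical choices for illustration; they are not taken from ooni-probe or Tahoe-LAFS. Attaching the filter to the handler is one way to make every record that reaches that handler pass through the same encoder:

```python
import io
import logging


def log_encode(logmsg):
    """Return logmsg (str or bytes) as printable ASCII with a type prefix."""
    if isinstance(logmsg, str):
        # unicode_escape turns control and non-ASCII characters into escapes.
        return "str: " + logmsg.encode("unicode_escape").decode("ascii")
    if isinstance(logmsg, bytes):
        try:
            text = logmsg.decode("utf-8")
        except UnicodeDecodeError:
            # Not valid UTF-8: hex-escape everything except printable ASCII
            # (the backslash itself is escaped too, to keep output unambiguous).
            escaped = "".join(
                chr(b) if 0x20 <= b < 0x7F and b != 0x5C else "\\x%02x" % b
                for b in logmsg
            )
            return "binary: " + escaped
        return "utf-8: " + text.encode("unicode_escape").decode("ascii")
    raise TypeError("expected str or bytes, not %s" % type(logmsg).__name__)


class EscapingFilter(logging.Filter):
    """Hypothetical filter that encodes each record's fully rendered message."""

    def filter(self, record):
        # Interpolate the arguments first, then encode the final text.
        record.msg = log_encode(record.getMessage())
        record.args = None
        return True


# Demonstration with an in-memory handler (names are illustrative).
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.addFilter(EscapingFilter())
demo_log = logging.getLogger("log_encode_demo")
demo_log.setLevel(logging.DEBUG)
demo_log.propagate = False
demo_log.addHandler(handler)

demo_log.debug("fetching %s", "example.com\nINFO fake entry\x1b[2J")
output = stream.getvalue()
# `output` is a single line; the newline and ESC appear as the literal
# escapes \n and \x1b rather than as control characters.
```

Note that this sketch covers only the message text. Exception tracebacks attached through `exc_info` are formatted separately by the handler's formatter and would need their own treatment.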
This issue was automatically migrated from GitHub issue https://github.com/TheTorProject/ooni-probe/issues/302