We did a preliminary code review with Nick and Yawning. Here are some comments:
The max_len check in encoder.py:decode() is a bit weird. IIUC it's the
only check that controls the size of the input string, and the
max_len check does:
{{{
insufficient = (len(covertext) < self._max_len)
if insufficient:
raise DecodeFailureError(
"Covertext is shorter than self._max_len, can't decode.")
}}}
This confuses me, because I'm used to checks that abort if
the input string is larger than the maximum length, whereas your
check aborts if the input string is smaller than the maximum
length. Why is that?
The name self._max_len is an unfortunate misnomer. It refers to the maximum length of the strings that we precompute for being able to (un)rank strings in our language.
In fact, for efficiency purposes we (un)rank only strings that are exactly self._max_len characters long. That is, for a regular expression R and k = self._max_len, all strings we put on the wire are of the form L(R)_k [concat] .*, where L(R)_k is the set of all strings in L(R) of length exactly k. So insufficient, when true, means that we know our covertext is not long enough to unrank, so we should bail.
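For concreteness, here is a minimal Python sketch of the check under that interpretation. The class, the splitting of the wire format into a fixed-length slice plus a trailing .* part, and the method shape are my own illustration, not the actual fteproxy encoder:

```python
class DecodeFailureError(Exception):
    pass


class Decoder(object):
    """Illustrative sketch of the fixed-slice decode check, not fteproxy code."""

    def __init__(self, max_len):
        # _max_len is the fixed slice length k: every string we (un)rank
        # is drawn from L(R)_k, the length-k slice of the language.
        self._max_len = max_len

    def decode(self, covertext):
        # Anything shorter than the fixed slice cannot contain a full
        # rankable string, so we bail early -- hence the "<" comparison.
        if len(covertext) < self._max_len:
            raise DecodeFailureError(
                "Covertext is shorter than self._max_len, can't decode.")
        fixed_slice = covertext[:self._max_len]   # the L(R)_k part
        trailer = covertext[self._max_len:]       # the ".*" suffix
        return fixed_slice, trailer
```

So the check is a lower bound, not an upper bound: input longer than the slice is fine (the excess is the .* trailer), input shorter than the slice is undecodable.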
rank_unrank.cc:DFA::rank() is a bit sketchy. It's not entirely
obvious to me why the _T array has enough space to accommodate any
value of 'n' (which is the size of the input string). I don't really
understand _buildTable() which builds the _T array, and it's also
undocumented (what is _T even?).
This, combined with the previous comment, might get ugly. But I might be misreading the code.
Our precomputation table _T is defined as follows: _T[q][i], where q is a state in our DFA and i is a non-negative integer, is the number of unique accepting paths of length i from state q in our minimized DFA.
If it helps, I can put an explicit paragraph that explains this in the comments of the header file.
Also, it is worth noting that at fte/encoder.py:133 we ensure that we never input more than self._max_len bytes into rank(). However, it is certainly a good idea for us to exercise defense in depth and check at all levels.
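The table as described admits a short dynamic-programming sketch. The dict-based DFA encoding, the function names, and the sorted-alphabet assumption below are my own illustration; rank_unrank.cc's actual representation differs:

```python
def build_table(transitions, accepting, max_len):
    """T[q][i] = number of distinct accepting paths of length i from state q.

    transitions: dict mapping (state, symbol) -> next state
    accepting:   set of accepting (final) states
    """
    states = ({q for (q, _s) in transitions}
              | set(transitions.values()) | set(accepting))
    T = {q: [0] * (max_len + 1) for q in states}
    for q in states:
        # A length-0 path is accepting iff we are already in a final state.
        T[q][0] = 1 if q in accepting else 0
    for i in range(1, max_len + 1):
        for (q, _sym), r in transitions.items():
            T[q][i] += T[r][i - 1]
    return T


def rank(word, transitions, T, start, alphabet):
    """Rank `word` among the length-len(word) strings the DFA accepts.

    alphabet must be sorted; the rank is the number of accepted strings
    of the same length that are lexicographically smaller than `word`.
    """
    n = len(word)
    c = 0
    q = start
    for i, sym in enumerate(word):
        # Count every accepted string that branches off with a smaller symbol.
        for a in alphabet:
            if a >= sym:
                break
            if (q, a) in transitions:
                c += T[transitions[(q, a)]][n - i - 1]
        q = transitions[(q, sym)]  # KeyError here means word is not in L(R)
    return c
```

This also shows why rank() only needs _T entries up to the input length n: the innermost lookup is always _T[state][n - i - 1] for i in [0, n).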
Yawning notes that 'std::string attFstFromRegex' is not exception safe.
Well spotted, I will fix this.
Nick notes that it might be better if we blindly replaced all operator[] uses
with .at()s, which do bounds checking. If at() is too slow for us,
parts of the code that are on the critical path should get analyzed
and optimized accordingly.
Parts of the code that can't use at() (if any) should use asserts to make
sure that no out-of-bound read/write ever happens.
I will try replacing all [] operators with .at(), and will report on the performance impact.
It seems to me that some of the TODOs left on the rank_unrank.cc
file might be security related.
I agree.
BTW, I mainly looked at the rank() function since IIUC it's the one that
receives network data. What other functions should we pay attention
to?
DFA::rank() in rank_unrank.cc and DFA_rank() in cDFA.cc are the two to look at.
Also note that our review was mainly looking for security issues. We
didn't look for correctness.
Changed the name of max_len to fixed_slice to better represent the actual meaning of the variable.
Added documentation to fte/encoder.py:RegexEncoderObject to explain the significance of fixed_slice.
Added documentation to fte/rank_unrank.h to explain the significance of _T and buildTable.
Changed all uses of [] to .at().
Resolved all TODOs.
Added verification on input to unrank to ensure the integer is within the expected range.
Added verification on input to rank to ensure the string is the exact size we expect.
Added verification on output of unrank/rank to ensure we properly traversed the DFA and are in a final state.
Added a try/catch around the call to re2 in attFstFromRegex.
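The verification items above admit a rough sketch. The helper names and the FatalRankingError exception below are illustrative placeholders, not the actual fteproxy API; a valid input to unrank is an integer in [0, N), where N is the number of strings of the fixed slice length in the language (i.e. _T[start][k] in the notation above):

```python
class FatalRankingError(Exception):
    """Illustrative: raised when a (un)rank input is out of range,
    which indicates the system state is seriously wrong."""


def check_unrank_input(c, num_words):
    # A valid rank lies in [0, num_words), where num_words counts the
    # strings of the fixed slice length in the language.
    if not (0 <= c < num_words):
        raise FatalRankingError("rank %d not in [0, %d)" % (c, num_words))


def check_rank_input(word, fixed_slice):
    # rank() only accepts strings of exactly the fixed slice length.
    if len(word) != fixed_slice:
        raise FatalRankingError(
            "input length %d != fixed_slice %d" % (len(word), fixed_slice))
```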
The changes above introduce the following question: If the inputs to (un)rank are out of range, then that indicates something has gone seriously wrong in the system. That is, we shouldn't ever invoke unrank on an integer that we don't know how to unrank. What's the "proper" PT way to handle this fatal error condition?
I don't think you can do anything except close the connection and write to a log file somewhere. It's the same as if a MAC failed to verify, or a WebSocket message were too long.
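That policy can be sketched as a thin connection-layer wrapper. Everything below (the wrapper, the logger name, the stand-in exception) is an assumption for illustration, not fteproxy code:

```python
import logging

log = logging.getLogger("fte")


class DecodeFailureError(Exception):
    """Stand-in for the decode error raised by the encoder
    (see the snippet quoted earlier in this ticket)."""


def handle_incoming(conn, decoder, bufsize=4096):
    # Hypothetical handler: any decode failure on attacker-controlled
    # input is fatal for this connection, exactly like a MAC that fails
    # to verify -- log it and drop the connection.
    data = conn.recv(bufsize)
    try:
        return decoder.decode(data)
    except DecodeFailureError as e:
        log.warning("fatal decode error, dropping connection: %s", e)
        conn.close()
        return None
```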
Nice. Code looks more defensive now. I'll try to find some time to re-review the code.
BTW, just to make sure I got this right, rank() is the only function that accepts attacker-controlled network data. unrank() only handles "trusted" input, right?
Is there a standard paradigm for logging from PTs?
Your understanding of rank() is roughly correct from the C++ side. However, it is worth pointing out that rank has two levels: the Python wrapper in fte/cDFA.cc and the actual rank algorithm in fte/rank_unrank.cc.
In addition, in Python, the entry point for attacker-controlled code is decode in fte/encoder.py. This calls fte.dfa.rank as well as fte.encrypter.decrypt.
Not really, there is no standard logging paradigm for PTs yet. But there should be one.
Obfsproxy, when used in managed mode, only allows logging to a file; I think this is the easiest and "cleanest" solution for now. (This happens because its stdout channel is used for the managed-proxy configuration protocol, and its stderr is unused (#9957 (moved)).) For example, you do:
{{{
ClientTransportPlugin obfs2 exec /usr/local/bin/obfsproxy --managed --log-file=/home/user/logq
}}}
Apart from #9957 (moved) (which will certainly be helpful), we might be able to design a better logging solution for PTs by using the Extended ORPort or the TransportControlPort (unimplemented; see 196). This has not been particularly needed so far.
Could you rebase your branch on top of, say, tag tbb-3.5.2-build5? At the moment your branch's history is totally disjoint from both the 3.6-beta branch and the main tor-browser-bundle branch. b0571a5f looks like you made an unversioned copy of the 3.6-beta branch and added all the files, which loses all the common history.
What you should do instead is check out a copy of the mainline tor-browser-bundle, make your changes in the fte branch, and make that the branch that you push to github. If things work out right, git log | tail in your branch should go all the way back to the first commit. gitk fte master should show the fte branch splitting off the master branch.
bd789345 looks like a mistaken merge with fteproxy. Even though you later undid the merge, it will be a lot easier to review the branch if it doesn't have all the fteproxy commits in it.
Repairing the history might be a bit tricky. Since there are only about a dozen commits that aren't due to the fteproxy merge, I think I'd handle it with a git format-patch on each of the changes in the old repository, then git am in a new repository. To be specific:
Repeat format-patch and am for all the commits you want to keep. It looks like this command in tor-browser-bundle-old will give you just the non-fteproxy commits you want (e.g., ec01e4d85a0b75dd3abbf38bf5ad79ba59479b72 is in there):
{{{
git log --first-parent --reverse
}}}
Trac: Resolution: implemented to N/A; Status: closed to reopened
Beware that in going to 3.5.2, you have to delete (or rename) your gitian-builder/inputs/tor-browser directory and let Gitian re-fetch it, because of an upstream Mozilla repository change.
I get matches on the Windows packages, and mismatches on the others. I'm guessing it's because you didn't rebuild the others, which is fine.
Let's talk tomorrow about what to do to get the patch merged. I haven't looked very hard at the changes yet. We should make sure the Gitian bunch know that this is happening.
What feedback have you gotten on the bundles at fteproxy.org?
Yeah, mismatches are because I only rebuilt the Windows binaries.
I have had feedback on the bundles on fteproxy.org, as well as on the fteproxy software. That's what has resulted in issues raised on github, and in subsequent releases.
I froze functionality at 0.2.0 (3 months ago) and all releases since have focused on improving security, stability, performance, and cross-platform compatibility.
I have not yet received feedback on the TBB 3.5.x + fteproxy bundles.
Ok, I went ahead and merged this. Note that we will need to add default bridges and some prefs to Tor Launcher (for #10418 (moved)) to make these available as a transport during initial PT config.
Trac: Resolution: N/A to fixed; Status: reopened to closed