Change History (19)

Searching by and listing base64-encoded relay fingerprints is doable. I'm not entirely sure how + and / should be escaped in the search parameter, but I'll ask Onionoo client developers to comment on this. The listing part is simply a new field.

What do you mean by "extrainfo-hash"? Would you really want to learn about the extra-info descriptor digest here? We don't even include the server descriptor digest. What would you do with the extra-info descriptor digest?

The search parameter supports searching by (beginning of) base64-encoded relay fingerprint. This needs input from Onionoo client developers for escaping + and /. Technically, this constitutes a major protocol change that requires at least one month notice before deployment to give client developers the chance to implement filters if their searches are not supposed to return matching base64-encoded fingerprints.

(This is actually a suggestion not to change anything in Onionoo:) Regarding listing the base64-encoded fingerprint, I'm considering not adding a field for that but instead asking client developers to re-encode the hex fingerprint if they want to display the base64-encoded one. Let's save bandwidth by not transmitting redundant data. Unless binary encoding is something that clients cannot, or shouldn't be required to, do?
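As a sketch of the client-side re-encoding suggested here (the class and method names are mine, not part of Onionoo): decode the 40-character hex fingerprint to its 20 raw bytes, then base64-encode without padding, which yields the 27-character form used in consensuses.

```java
import java.util.Base64;

public class FingerprintEncoder {

  /* Re-encode a 40-character hex relay fingerprint as the
   * 27-character base64 form used in consensuses, dropping the
   * trailing '=' padding. This is the client-side conversion
   * suggested above; no extra Onionoo field is needed. */
  public static String hexToBase64(String hexFingerprint) {
    byte[] bytes = new byte[hexFingerprint.length() / 2];
    for (int i = 0; i < bytes.length; i++) {
      /* Parse each pair of hex digits into one byte. */
      bytes[i] = (byte) Integer.parseInt(
          hexFingerprint.substring(2 * i, 2 * i + 2), 16);
    }
    return Base64.getEncoder().withoutPadding().encodeToString(bytes);
  }
}
```

The reverse direction (base64 back to hex) is just as mechanical, which is the argument for not shipping both encodings over the wire.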

details documents contain new fields server_digest and extra_digest with hex representations of server descriptor and extra-info descriptor digests. This is a minor protocol change that I can implement in the next day or two and deploy by the end of the week if we agree it's a good idea. Atlas and friends can then display the new fields if they wish to. They could even include links to the descriptors as served by gabelmoo, if you want to.

details documents contain a new field micro_digest with the microdescriptor digest encoded in base64 with trailing =s omitted. This requires parsing microdesc consensuses which we're currently not doing, so we should run some performance tests before deploying this on the server. Don't expect this feature this month, I'd say.

The search parameter supports searching by (beginning of) base64-encoded relay fingerprint. This needs input from Onionoo client developers for escaping + and /. Technically, this constitutes a major protocol change that requires at least one month notice before deployment to give client developers the chance to implement filters if their searches are not supposed to return matching base64-encoded fingerprints.

If a relay had a nickname identical to another relay's base64-encoded fingerprint, Onionoo would return both relays, right?

(This is actually a suggestion not to change anything in Onionoo:) Regarding listing the base64-encoded fingerprint, I'm considering not adding a field for that but instead asking client developers to re-encode the hex fingerprint if they want to display the base64-encoded one. Let's save bandwidth by not transmitting redundant data. Unless binary encoding is something that clients cannot, or shouldn't be required to, do?

That sounds like a good option to me but would it work with the upcoming JavaScript-free version of Globe?

The search parameter supports searching by (beginning of) base64-encoded relay fingerprint. This needs input from Onionoo client developers for escaping + and /. Technically, this constitutes a major protocol change that requires at least one month notice before deployment to give client developers the chance to implement filters if their searches are not supposed to return matching base64-encoded fingerprints.

If a relay had a nickname identical to another relay's base64-encoded fingerprint, Onionoo would return both relays, right?

It would, yes.

(This is actually a suggestion not to change anything in Onionoo:) Regarding listing the base64-encoded fingerprint, I'm considering not adding a field for that but instead asking client developers to re-encode the hex fingerprint if they want to display the base64-encoded one. Let's save bandwidth by not transmitting redundant data. Unless binary encoding is something that clients cannot, or shouldn't be required to, do?

That sounds like a good option to me but would it work with the upcoming JavaScript-free version of Globe?

Why wouldn't it work? The JavaScript-free version of Globe would simply put together details pages on the Globe server, rather than letting the user's browser do that.

I implemented searching for base64-encoded fingerprints in my task-13135 branch. That branch passes unit tests, but I didn't test it in the Vagrant testing instance yet. We'll also have to announce this protocol change one month prior to deploying it. Please review.

The branch looks fine. (No problems when running it locally, but I had to start with fresh data.)

In SummaryDocument (catch-statements around lines 52 and 65)
I'd rather log the DecoderException instead of totally ignoring it.
The DigestUtils or Base64 class could malfunction, or some other problem could happen. In such a case, there would be a hint in the log.

And the version in ResponseBuilder needs to be increased before the release.

The branch looks fine. (No problems when running it locally, but I had to start with fresh data.)

In theory, starting with fresh data shouldn't have been necessary. Redeploying the .war or maybe even restarting Tomcat should be sufficient. Hmm.

In SummaryDocument (catch-statements around lines 52 and 65)
I'd rather log the DecoderException instead of totally ignoring it.
The DigestUtils or Base64 class could malfunction, or some other problem could happen. In such a case, there would be a hint in the log.

True, we should log those things.
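A minimal sketch of what that logging catch could look like. To keep the example self-contained it uses the JDK's Base64 decoder, which signals bad input with IllegalArgumentException, rather than commons-codec's Base64/DecoderException mentioned in the review; the class name is hypothetical.

```java
import java.util.Base64;
import java.util.logging.Level;
import java.util.logging.Logger;

public class DigestDecoder {

  private static final Logger log =
      Logger.getLogger(DigestDecoder.class.getName());

  /* Decode a base64 digest, logging failures instead of silently
   * swallowing them, per the review comment above. If the decoder
   * ever malfunctions or the input is corrupt, there will be a
   * hint in the log. Returns null on failure. */
  public static byte[] decodeBase64Digest(String digest) {
    try {
      return Base64.getDecoder().decode(digest);
    } catch (IllegalArgumentException e) {
      /* Previously this would have been ignored; now it leaves
       * a trace for the operator. */
      log.log(Level.WARNING, "Could not decode digest: " + digest, e);
      return null;
    }
  }
}
```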

And the version in ResponseBuilder needs to be increased before the release.

True.

The first time for using 'next_major_version_scheduled' :-)

Indeed.

However, there's still a problem with my branch: Tomcat interprets + in parameter values as space. When I fetch the following URL either with curl or Firefox, the servlet sees the following search terms.

Maybe embedded Jetty doesn't do that? Or maybe there's a fix to tell Tomcat not to interpret + as space? Or maybe we should tell Onionoo clients to escape/substitute the + character in base64 searches (ugh)?

The branch looks fine. (No problems when running it locally, but I had to start with fresh data.)

In theory, starting with fresh data shouldn't have been necessary. Redeploying the .war or maybe even restarting Tomcat should be sufficient. Hmm.

I had to start with fresh data, b/c it was erased during an extended 'clean-up' ;-)

However, there's still a problem with my branch: Tomcat interprets + in parameter values as space. When I fetch the following URL either with curl or Firefox, the servlet sees the following search terms.

Maybe embedded Jetty doesn't do that? Or maybe there's a fix to tell Tomcat not to interpret + as
space? Or maybe we should tell Onionoo clients to escape/substitute the + character in base64
searches (ugh)?

Oh, I missed that.
Well, getQueryString() from HttpServlet will return search=S+1drLUSi/Sq+vGxcQlIjauTLDg.
So, in case of a search, rather parse the search parameters manually?
This would then also work for all other containers, as HttpServlet is part of the API.
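A sketch of such manual parsing (class and method names are mine): start from the raw string that getQueryString() returns, split it ourselves, and decode percent-escapes while leaving a literal '+' intact, instead of relying on the container's getParameter(), which turns '+' into a space.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.HashMap;
import java.util.Map;

public class QueryParser {

  /* Parse a raw query string ourselves so that '+' in base64
   * search terms survives. Percent-escapes like %2F are still
   * decoded. */
  public static Map<String, String> parse(String queryString) {
    Map<String, String> params = new HashMap<>();
    if (queryString == null) {
      return params;
    }
    for (String pair : queryString.split("&")) {
      int eq = pair.indexOf('=');
      if (eq < 0) {
        continue;
      }
      String key = pair.substring(0, eq);
      String value = pair.substring(eq + 1);
      try {
        /* Protect literal '+' before percent-decoding, because
         * URLDecoder would otherwise turn it into a space. */
        value = URLDecoder.decode(value.replace("+", "%2B"), "UTF-8");
      } catch (UnsupportedEncodingException e) {
        /* UTF-8 is always available; this cannot happen. */
      }
      params.put(key, value);
    }
    return params;
  }
}
```

With this, parse("search=S+1drLUSi/Sq+vGxcQlIjauTLDg") keeps the '+' characters in the search term, regardless of which servlet container is used.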

I updated the behavior of getQueryString() in the unit tests and added tests based on yours. (I'm trying to limit unit tests to a single test case only, which is why I rewrote your test methods.) Feel free to suggest more tweaks.

I'm also going to set next_major_version_scheduled to November 15, 2014 and announce the next protocol version on onionoo-announce@ today.

Sebastian, while re-reading the ticket to see what's left to do, I figured that you didn't only ask for the various descriptor digests to be listed, but you also want to search for them, right? That's also doable, but slightly harder to implement.

I also realized that you don't just want the digests of the most recent descriptors that are known to Onionoo (which might be whatever metrics-db fetches from gabelmoo and moria1 at :05 of the hour, not what's contained in the latest consensus or microdesc consensus), but a set of digests of descriptors published in the last X (6? 12? 24? 48?) hours.

So, before I go build something, I'd like to hear what exact use cases you have in mind. Can you say which of the following use cases would be useful to have, and for whom and in which situation? (I guess I'm asking for user stories here...)

Search for (partial) base64-encoded fingerprint (already implemented, will be deployed on November 15)

Display base64-encoded fingerprint (should be implemented in Atlas/Globe by converting the hex fingerprint to base64)

Search for hex-encoded server descriptor digest

Search for base64-encoded server descriptor digest

Display hex-encoded or base64-encoded (whichever is more useful) digests of all server descriptors published in last X hours

Search for hex-encoded extra-info descriptor digest

Search for base64-encoded extra-info descriptor digest

Display hex-encoded or base64-encoded (whichever is more useful) digests of all extra-info descriptors published in last X hours

Search for hex-encoded microdescriptor digest

Search for base64-encoded microdescriptor digest

Display hex-encoded or base64-encoded (whichever is more useful) digests of all microdescriptors generated from descriptors published in last X hours

One big use case for the partial searches is spotting sybils that cheat the hidden service ring.

I requested the base64 format because I want to plug whatever I see in a consensus in Tor into the tool to learn more about it. The same goes for the reverse direction: by having them displayed on the website, I can easily search for them on disk. I'm not sure I ever asked for hex-encoded digests, only fingerprints, I believe.

One big use case for the partial searches is spotting sybils that cheat the hidden service ring.

Partial fingerprint searches are already supported for hex-encoded fingerprints and will be supported for base64-encoded fingerprints, too. Sounds like we're doing good here.

I requested the base64 format because I want to plug whatever I see in a consensus in Tor into the tool to learn more about it. The same goes for the reverse direction: by having them displayed on the website, I can easily search for them on disk. I'm not sure I ever asked for hex-encoded digests, only fingerprints, I believe.

The search part will be supported by Onionoo soon, and the display part is something you'll have to request from Atlas/Globe people in separate tickets. That's for relay fingerprints.

Does that mean descriptor digests are not relevant for your use cases? That includes server descriptors, extra-info descriptors, and microdescriptors. There was some discussion of these above, but if fingerprints are all you care about, I'm happy not to include a feature that wouldn't be used anyway.

Support various fingerprint formats →
Support base64-encoded fingerprints in `search` parameter

Sebastian and I discussed this more on IRC. The conclusion is in new tickets #13422, #13424, and #13425. I'd say let's turn this ticket into the ticket that adds base64-encoded fingerprints to the search parameter and discuss the remaining ideas in the other tickets. Leaving this ticket open until the branch is merged and deployed, which will be a month from now.