Posted by michael on Tuesday March 20, 2001 @09:21AM
from the snooping-for-fun-and-profit dept.

HerrHair had the first reader submission of this, but it took a few days to look into it. If you use Earthlink's customized browser/email/chat/kitchen sink application, which Earthlink recommends for all of its new customers, you are sending an extra HTTP header called HTTP_ELNSB50 with every HTTP request (every download of a file or image), and the data for this header is a lengthy alphanumeric string, which readers took to be a unique ID of some sort. This does not appear to be the case.

Steve Gibson was apparently the first one to look into this browser serial number. I'm a little hesitant to link to that page, since its contents have changed dramatically twice in the last 24 hours. Gibson initially had a page claiming it was a privacy-invading unique ID. He changed it to include a disclaimer in a large red box, and has now changed it again to display the information Earthlink provided about the serial number. Earthlink provided much the same information to Slashdot after our query.

The header information sent is similar to the codes below. Depending on how logging is set up on a given webserver, they may or may not be logged, but enough server logs are accessible across the net that typing ELNSB50 into any search engine will find examples. (ELNSB50, by the way, apparently stands for "Earthlink Sandbox 5.0".)

Even a cursory examination should show that these numbers don't have enough uniqueness to be globally unique IDs. Microsoft's GUID had 128 bits; a good hash function might have 160 bits; those serial numbers, culled from widely scattered machines, aren't unique enough.

This is what Earthlink sent us about the codes:

Field              Bits   Description
reserved             14   future growth
monitorDepth          8   monitor bit depth
browserFontSize       3   browser font -- small to large
connectionSpeed       3   One of 4 categories
connectionType        4   Modem, high speed, etc.
monitorHorz          16   horizontal area
monitorVert          16   max vertical area
browserViewHorz      16   views horizontal area
browserViewVert      16   views vertical area
popID                32   numerical POP ID
sandboxVersion       32   what version of the sandbox sent this?

Most items should be self-explanatory. ConnectionSpeed has four possible values: slow dialup (<56K), fast dialup (56K), slow broadband, and fast broadband. The POP ID refers to which of Earthlink's Points of Presence you are dialed up to - which bank of modems you called. The rest should be clear. If you assume the codes are a number in hexadecimal, and the above are the number of bits dedicated to each piece of information, they appear to agree well. This table differs slightly from Steve Gibson's version. The differences appear to be minor and reconcilable - Earthlink doesn't seem to like the use of the word "Sandbox" in external publications, but it's their own term for their software and it seems quite appropriate: a closed environment which has all the toys you need and which you don't want to, or are not able to, escape from. (A screenshot of Earthlink's Sandbox is available.)
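The "agree well" claim can be checked mechanically. What follows is a minimal sketch, assuming the header is one big hexadecimal number and that the fields are packed most-significant-bit first in the order Earthlink listed them; both are assumptions, not something Earthlink confirmed. The sample value is one a reader quotes elsewhere in this discussion. The listed widths add up to 160 bits while a 48-character hex string holds 192, so the sketch reports the trailing bits as unaccounted for rather than guessing at them.

```python
# Field names and bit widths as listed in Earthlink's table.
# ASSUMPTION: fields are packed MSB-first, in table order.
FIELDS = [
    ("reserved", 14),
    ("monitorDepth", 8),
    ("browserFontSize", 3),
    ("connectionSpeed", 3),
    ("connectionType", 4),
    ("monitorHorz", 16),
    ("monitorVert", 16),
    ("browserViewHorz", 16),
    ("browserViewVert", 16),
    ("popID", 32),
    ("sandboxVersion", 32),
]

# Sample header value quoted by a reader in the comments.
SAMPLE = "000041100320025802940113000000000502000800000000"


def decode_elnsb50(hex_string):
    """Split a hex header value into the fields from the table."""
    total_bits = len(hex_string) * 4      # 4 bits per hex digit
    value = int(hex_string, 16)
    decoded = {}
    consumed = 0
    for name, width in FIELDS:
        shift = total_bits - consumed - width
        decoded[name] = (value >> shift) & ((1 << width) - 1)
        consumed += width
    # Bits at the end of the string that the table does not describe.
    decoded["unaccounted_bits"] = total_bits - consumed
    return decoded


if __name__ == "__main__":
    for field, number in decode_elnsb50(SAMPLE).items():
        print(f"{field:18} {number}")
```

Under those assumptions the sample decodes to a 16-bit display at 800 by 600 with a 660 by 275 browser view, which is at least plausible for a real machine.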

While I was looking into this, I also noted (Ethereal strikes again) that Earthlink's Sandbox sends a good chunk of data back to Earthlink's servers upon initial installation - this data is PGP-encrypted, or at least it is preceded by a header indicating that it is. This data is sent whether or not the user is signing up for a new account or just re-installing the software on an old machine. There is no easy way to determine what information is being sent back without performing a comprehensive disassembly of the software. As of press time, Earthlink has not provided any information about what is being sent to Earthlink's servers when their software is installed.

So, there you have it. Is Earthlink's code a unique ID? Apparently not. Does it reveal more information about you when you are browsing the web than is revealed by any other web browser? Yes. Can you turn it off? No, but you could use another browser. Will 99% of Earthlink's users ever know about it? No.

Earthlink's Sandbox sends a good chunk of data back to Earthlink's servers upon initial installation... There is no easy way to determine what information is being sent back without performing a comprehensive disassembly of the software.

Never mind the HTTP header; the installation transmission seems to be the real issue. It's not the first time I've seen software do it, and I'm getting sick of it. I don't want my software to "phone home" every time it's installed or run. That's when I jumped on the open source/free software bandwagon. I won't run ANYTHING without the source code available. Granted, I will not always CHECK the source code, but at least I can.

There is a W3C standard called CC/PP for telling web servers all about your browser (so that it can send you some useful content rather than just telling you 'this page is best viewed in 800x600 in lots of colours'). This seems to be doing much the same, albeit in a non-standard way.
Then again, everyone is ignoring CC/PP.

Timely article. W3C just advanced a working draft [w3.org] of CC/PP to Last Call.

It stands for Composite Capabilities/Preferences Profiles. It's a language that your browser could use to describe its capabilities and your preferences, e.g. "32-bit display, 800x600 browser window, PPC hardware, no applets."

The idea is, of course you want the server to know what you've got, so it doesn't send you useless content. Like it or not, your browser will be having deeper conversations with servers, pretty soon.

...On the other hand, this language (CC/PP) looks too complicated to use.

I'm a web developer. If I'm on the server, I want to deliver content to the browser and let the browser format it appropriately, taking into account resolution, window size, color depth, user colorblindness, and so on. Heaven knows I don't want to write an IF statement for every possible pipe size.

There just needs to be a way for me to write "you've got a choice-- low-bandwidth or high-bandwidth media; 8-bit or 32-bit images" using tags in my HTML, and the user's browser should decide what to do with that information. Often it can just pick the best alternative for that client. If not, it can always just render two links and let the user choose.

That could have been 4 different people visiting 4 different sites. The IDs are not unique to a person. Have you read a single word of the article or a single comment in the thread or are you just karma whoring with a perceived security problem?

OK, I work for Earthlink - tech support specifically (actually in one of the legacy Mindspring centers, so I may have talked to at least one of you at some point in time:), so, here's my whole take on this stuff:

The headers: Well honestly, I had heard about them maybe 48 hours before the story hit /. (figured it would have been sooner, honestly). Looking at what info it provides, yeah they're pretty harmless. Of course, I know many of you wouldn't want this info sent out, so send emails to feedback@earthlink.net with your concerns. YES, they get read (eventually!). Hell, I wouldn't be surprised if they add a switch in the next software release now. And yes, we do care about our users and their privacy. And we really hate spam too:)

Sky Dayton and Scientology: OK, Sky Dayton is a member of Scientology, and he's our chairman. Now, does that mean the whole bloody company is committed to said belief system? Of course not. From what I know of Scientology, we sure don't run our company based on their ideals. Check:
http://www.earthlink.net/about/mission.html [earthlink.net] if you wanna know what kind of beliefs we use in day-to-day business (and we may not be doing our best in all those categories, but damned if we're not gonna fix that). And our people are really cool - Carter Calle, Mike McQuary, our TS managers - they rule.

As others have stated, it's not really a unique ID; your connecting IP is giving away more information than this. But why do they need all this data?

The only explanation I can see is that Earthlink has poor web page design (not the browser, their internal web pages!) and so needs to know 1) what speed you can handle, so as to adjust A/V content to suit your connection; 2) what your screen layout is, probably so it can use fixed-width tables in the HTML layout; and 3) where you are located in the country (via the POP bank info). None of these is even necessary if you follow the HTML 4 spec, with effective use of the OBJECT tag, relative table sizes, and the standard HTTP headers and/or cookies, respectively.

In other words, their customized browser appears to be covering up for lame web page designers.

The sad thing is, the law actually goes the other way and protects THEM from YOUR possible DECRYPTING of the information.

They invade your computer, grab some personal information and encrypt it, then send it back to their servers (without your knowledge). You find out about this, and find a way to decrypt it. You find out they've taken a LOT more than anyone would want them to, so you publish your findings. They don't like this (it's bad press) so they sue you under the terms of the DMCA (the material was "protected" by encryption, and decrypting it for any reason is illegal...)

It's bad enough you can do it with JavaScript. What they should really do is not design resolution/color depth/etc specific HTML pages. The W3C should have opposed those additions to Javascript (and Javascript in its current form in general) as well as these stupid HTTP headers.

Sigh. I'm a geezer at 20.

This, BTW, is why "the browser is the platform" is perhaps not the best idea. CSS2 widgets encourage browsers to be uniform, but it's browser diversity that encourages people to not code for a specific platform. HTML just wasn't designed to do this faux bitmap garbage that passes for a website nowadays, nor should it have.

Take a look at www.microchip.com [microchip.com]. On every page they serve, they have an unobtrusive link called "Page Options" at the top where you can choose what page you want to get: text only, graphics or Java frame. As it turns out, I use all three versions from my university ethernet connection, depending on whether I want the heavy-duty search in Java (like an MSFT help box with search, index, etc.), I just want to browse (I'll use graphics) or I really need something fast (text-only). It's not polite to NOT give these choices to the user!

It works great! I don't know how much more it costs them to do this, but it definitely makes for happy customers. Each version is based off a different root directory on the server and all three are probably generated automatically without the web designer having to think twice.

As far as having something else to do, generally it's looking at one or two other active Netscape windows.

Of course, the odds of such a law happening are slim; the odds of a well-crafted law passing are about zero.[emphasis added]

This, unfortunately, is EXACTLY why many of us disagree with your subject line...though your suggestion of a good privacy law is not unreasonable, the fact is, US Govt. Inc. has been showing itself more likely to do harm rather than good when it interferes. After all, the DMCA (for example) and the Indecent Communications Act [Yes, I called it that on purpose:-) ] were both, arguably, intended to protect artists and/or children...but only served to attempt serious harm to the rights of US internet users (and, indirectly, internet users elsewhere in the world). The CDA was fortunately swiftly slammed. I can only hope the DMCA is next, but fear there's too much money pouring in through lobbyists to fix it completely.

Require ISP's to send copies of personal data to a new Federal Office of Internet Privacy Protection to be checked for violations [and, "of course", to help track down child pornographers and/or tax evaders. Hey, this IS congress we're talking about...]

I disagree about as strongly as it's possible to disagree. Content negotiation is a Good Thing(tm).

Here's an example: when I go to a web site, I expect (hope?) that the content of the site will be rendered in English. For large web sites with a multi-lingual user base, that's not always a safe assumption. Fortunately, content negotiation makes that possible.

Apache makes on-the-fly decisions about what content to send based on this [apache.org].
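For anyone who hasn't seen it spelled out, here is a rough sketch of language negotiation using the standard Accept-Language request header. It is not Apache's algorithm, just a simplified illustration: the function names and the fallback rule (try the exact tag, then its primary subtag, then a default) are my own assumptions.

```python
def parse_accept_language(header):
    """Return (tag, q) pairs from an Accept-Language value, best first."""
    prefs = []
    for part in header.split(","):
        pieces = part.strip().split(";")
        tag = pieces[0].strip().lower()
        if not tag:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in pieces[1:]:
            name, _, val = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(val)
                except ValueError:
                    q = 0.0  # treat a malformed weight as "not wanted"
        prefs.append((tag, q))
    # sorted() is stable, so ties keep the order the browser sent.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def choose_language(header, available, default="en"):
    """Pick the best available language for this request."""
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        if tag == "*":
            return default
        if tag in available:
            return tag
        primary = tag.split("-")[0]  # e.g. "fr-ca" falls back to "fr"
        if primary in available:
            return primary
    return default


if __name__ == "__main__":
    print(choose_language("sv;q=0.5, fr-CA, en;q=0.8", ["en", "fr", "sv"]))
```

The point is that the browser states a ranked preference and the server makes a best guess from what it actually has; nothing stops the site from also offering links to the other versions.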

Does that mean that webmasters need to be careful about how they set up their sites if they're using this technique? Sure. But it also opens up a wide range of options.

1. It's more expensive to design 2 sets of pages. That money should be spent on more content.

Speaking on behalf of webmasters everywhere: thanks for telling me how to spend my money. Allow me to suggest that doing two versions of the same image - one at a high bit-depth, and another at a lower quality - isn't too much of a strain on my budget.

2. Sometimes people with slow modems don't mind waiting - maybe they let your site load in the background while they do something else. It's not polite to make these choices for your users.

Content negotiation doesn't have to be like making the choice for the user. Instead, it can work as a reasonable best-guess. Besides which, I've seen plenty of sites which simply assume high bandwidth (or pathetic bandwidth) and make all the design decisions based on that information. In what way is that giving the user a choice, other than to vote with his feet?

Yeah, exactly. Content negotiation is a good thing - when you do it the right way. I couldn't tell you why Google has made the decision to use IP address rather than the Accept-Language header to determine what language to serve up files, but obviously, it has a pretty stupid result.

I'll tell you what I'd really like to see (now that it's just occurred to me) - a "Reject-MIME" heading. That way, if I get sick and tired of watching some hack's Flash movies, I could tell the server not to send 'em to me. Or a "Max-Content-Length" heading, so sites wouldn't shoot 5 meg files at me without asking.
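To make the idea concrete, here is a sketch of what a server-side check might look like. Both headers are purely hypothetical (they are this comment's wish list, not part of HTTP), and the function name and header syntax (a comma-separated list of MIME types, a byte count) are assumptions for illustration.

```python
def server_may_send(request_headers, mime_type, content_length):
    """Decide whether a response respects the client's stated limits.

    request_headers is a plain dict; the two header names below are
    HYPOTHETICAL and do not exist in any HTTP specification.
    """
    rejected = {
        m.strip().lower()
        for m in request_headers.get("Reject-MIME", "").split(",")
        if m.strip()
    }
    if mime_type.lower() in rejected:
        return False

    limit = request_headers.get("Max-Content-Length")
    if limit is not None:
        try:
            if content_length > int(limit):
                return False
        except ValueError:
            pass  # ignore a malformed limit rather than refuse everything
    return True


if __name__ == "__main__":
    headers = {
        "Reject-MIME": "application/x-shockwave-flash",
        "Max-Content-Length": "1000000",
    }
    print(server_may_send(headers, "text/html", 20000))
    print(server_may_send(headers, "application/x-shockwave-flash", 20000))
    print(server_may_send(headers, "image/jpeg", 5 * 1024 * 1024))
```

A server that got a False here could send a small placeholder or a link instead of the full file.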

I haven't tried them recently, but when I tried last year the service was quite good. And two years ago they were willing to research what I would need to do to set up a Linux connection (basically add some DNS lookup #'s).

Stupid cost cutting can happen anywhere, but perhaps you just hit a peak time.

You seem to be assuming that the user will be browsing with the screen maximized, which I never do. When I hit a site that insists upon that, I immediately leave it, even when, as occasionally happens, it means killing the browser.

Don't design for the maximum possible screen size. That is very bad manners.

I think they're referring not to just the number of bits but to the amount of variation (or lack thereof) between different headers for that number of bits. Sure you've got 192 bits, but they don't change enough between different users' browsers to be usably unique. Compare that to MS GUIDs, which vary drastically from one system to another.

Actually I don't think the Earthlink header reveals anything too unpleasant. In any browser that has Javascript active, any Web page out there can pull out the same information. The only thing they can't get is the POP ID, and they can infer that from the IP address you're using if they want to. I don't like that they're sending the info without saying they are, but the info itself isn't particularly distressing. Maybe we need something like P3P but working the other way, telling you what information your browser is going to send and making sure that matches your preferences before sending it?

The sad thing is, the law actually goes the other way and protects THEM from YOUR possible DECRYPTING of the information.

Don't misstate the DMCA, which is bad enough as it is. If a Technical Protection Measure that is an effective access control (where "effective" doesn't mean that it works well or is hard to crack) protects a copyrighted work, then you may not circumvent it without authorization.

Earthlink would have a very hard time demonstrating that the information they send is copyrightable because it is just a set of facts about your machine. Therefore, the encryption is not a section 1201 TPM. Furthermore, Fair Use is an affirmative defence.

I mean fine, I'm willing to believe Earthlink here, but your suggestion that it's not long enough to be a GUID seems specious. If you look at the numbers we can clearly see that each character can be at least 0-d, which implies that it is probably either an 8-bit character or a 4-bit character (i.e. hexadecimal). So, you say:

Microsoft's GUID had 128 bits; a good hash function might have 160 bits;

Well, if each character in that string was a 4 bit number, then you are talking 4 bits in 48 places which means it is at least a 192 bit number. So, your logic seems somewhat faulty.
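The arithmetic is easy to check. A quick sketch: the 48-character length comes from the header values people have posted, the field widths are the ones in Earthlink's table, and the variable names are just mine.

```python
HEADER_CHARS = 48  # length of the posted header values

# Widths from Earthlink's table, reserved through sandboxVersion.
TABLE_WIDTHS = [14, 8, 3, 3, 4, 16, 16, 16, 16, 32, 32]

bits_if_hex_digits = HEADER_CHARS * 4   # each character is one hex digit
bits_if_full_bytes = HEADER_CHARS * 8   # each character is an 8-bit value
bits_in_table = sum(TABLE_WIDTHS)

if __name__ == "__main__":
    print("as hex digits:", bits_if_hex_digits)
    print("as 8-bit chars:", bits_if_full_bytes)
    print("described by Earthlink's table:", bits_in_table)
    print("GUID:", 128, " 160-bit hash:", 160)
```

So on raw length alone the string has room for a GUID or a 160-bit hash; length by itself doesn't settle whether the value is unique.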

...with targeted ads. One of the most desired features from current advertisers is the ability to target ads based on the users location. Doing this by IP is very spotty, the POPID would solve that problem fairly safely.

I understand your position. However, as another web designer, I would love to at least have easy access to your preferences. Typically the browser settings would be a good indication of the user preferences. Possibly a better solution would be a "preferences" header. This way each user could set up things like "preferred size", "preferred resolution", "preferred font and size". These could be transmitted to the server and utilized appropriately.

And this is where I always wonder about web designers, including Earthlink. On the one hand, I could understand how some of this could be important if we were talking about sending full-fledged web apps to the user. On the other hand, it appears that what most web designers really want is the ability to send me content that would be far better off rendered as a pdf file. There are exceptions, but most of those are better handled using CSS (and we know how popular *that* sensible solution is). I mean, I know what my preferred fonts and sizes are. I set them up in my browser, and 98% of everybody who *doesn't* try to give me some kind of special web experience and just sends me html ends up giving me something I'm happy to look at. Again, I really wouldn't mind too much if designers at least used CSS consistently, since I can arrange things there so that nothing too horrible happens.

But that leaves all the rest of you, and I guess we'll just have to wait until you either learn or lose your jobs.

Google Europe went online a couple of weeks ago, and as far as anyone can work out, they're using IP address based redirects to send users to their country local site. The upshot has been that there were lots of problems in the first few days as the database was being sorted out. I wasn't getting redirected to Google.FR or anything but for a while my searches were getting results in Swedish first.

Oh yeah? Check out HKCU\Software\Human\BodyParts\Boobs\Parameters, and you'll see a DWORD value for it. If you don't, you probably need to fix your registry, because several MS applications will crash if they can't find it.

It's probably rightfully considered an HTTP header indicating that what follows is HTML. HTML is only considered in the payload of the transmission, and that occurs in the HTTP header before you get to the payload. Otherwise, it would make little sense to have text/plain as a Content-type, since you can transmit that over HTTP with no HTML coming in at all. Content-type: text/html just indicates that what's about to come over the wire is in HTML form.

They just don't want to get sued by France (as Yahoo did) if you, or other users, look up sites containing Certain Illegal (in France) Information. Try doing a Google search (from the redirect) on that info. Bet it won't allow it.

"Yes, imagine. Imagine if web designers weren't obsessed with style over content, with special effects over usability, with animated intros over usefulness, with exactly positioned layout over standards that are easily accessible by the visually impaired or degrade well for old browsers."

I think you will find most good web designers do care about these things...It's the marketing droids that want the shiny spinning stuff and the locked layouts

Seems like the quota-hunters moderate everything down which is remotely critical of their cult... Parent is on-topic: this is the very ISP the article talks about, and was a direct response to the question asked in parent! The article was about a header which could have been used for snooping, and the "cult" would gladly engage in these kinds of activities.

Sites can silently collect information sent in HTTP headers (or use a script to look through logs afterward). With JavaScript, you can usually see what information the site looks up by looking at the page source. Thus there is some loss of privacy in including this information in the headers.
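To show how little effort the server side takes, here is a sketch that pulls one header out of a raw HTTP request. The on-the-wire name "ELNSB50" is an assumption on my part (the article's HTTP_ELNSB50 looks like the CGI-style name a server would expose it under), and the request text is made up apart from the header value, which is one posted elsewhere in this thread.

```python
def extract_header(raw_request, name):
    """Return the value of one header from a raw HTTP request, or None."""
    lines = raw_request.split("\r\n")
    for line in lines[1:]:          # skip the request line
        if line == "":              # blank line ends the header block
            break
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == name.lower():
            return value.strip()
    return None


# Illustrative request; host and path are placeholders.
EXAMPLE_REQUEST = (
    "GET /index.html HTTP/1.0\r\n"
    "Host: www.example.com\r\n"
    "ELNSB50: 000041100320025802940113000000000502000800000000\r\n"
    "\r\n"
)

if __name__ == "__main__":
    value = extract_header(EXAMPLE_REQUEST, "ELNSB50")
    if value is not None:
        print("would be logged:", value)
```

The visitor sees nothing of this; unlike a script in the page, there is no source to view.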

Macintouch [macintouch.com] shows that doing a web search on 'ELNSB50' provides more info than simply codified attributes of your client connection. Clicking on results from Google displays "Web Browser Agent/Platform Statistics" which can be used to determine which websites a person visits.

At random, I chose the browser ID of "000041100320025802940113000000000502000800000000" and searched on that. I found that browser had visited four specific sites.

I don't want my tracks to be available to everyone. I understand that my perusals are logged in my company's system since that's my net connection, but these aforementioned actions are available publicly. That's not a good thing.

I saw an Earthlink commercial on TV the other night. It went on and on about all of the shady things people do to strip away privacy on the internet. Then it stated that Earthlink would never do those things.

Granted, this stuff is not actually tracking anyone, but it does carry more information than is at all necessary. (Not that any is really necessary.)

Of course, given the history net companies have with privacy, it really is not surprising.

I don't know if the EarthLink browser can be set to run through a local proxy, but if it can, then Proxomitron [spywaresucks.org] can prevent the extra HTTP header from being sent at all. I just started using it, and it works wonderfully. Plus the paranoid among us can open the HTTP log window and watch what's being sent out and received, for that warm-and-fuzzy reassuring feeling.
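The filtering step itself is simple. This is only a sketch of the general idea, not how any particular proxy does it; the blocked header name "ELNSB50" is an assumption about what goes over the wire, and the function name is made up.

```python
# Header names to drop, compared case-insensitively.
# ASSUMPTION: the extra header travels under the name "ELNSB50".
BLOCKED_HEADERS = {"elnsb50"}


def strip_headers(headers, blocked=BLOCKED_HEADERS):
    """Return the outgoing (name, value) pairs minus any blocked ones."""
    return [(name, value) for name, value in headers
            if name.lower() not in blocked]


if __name__ == "__main__":
    outgoing = [
        ("Host", "www.example.com"),
        ("User-Agent", "ExampleBrowser/1.0"),
        ("ELNSB50", "000041100320025802940113000000000502000800000000"),
    ]
    for name, value in strip_headers(outgoing):
        print(f"{name}: {value}")
```

A local proxy would apply something like this to every request before forwarding it, so the remote site never sees the header.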

I think that screen size falls under "function" and not "form". People with small screens need information (regardless of what it is) presented in a long tall format so they only have to scroll down, not side to side. People with huge screens need information presented in a short wide format so they don't have to scroll at all.

Remember when most sites had a "text only" link? Maybe if the browsers make it easy to identify text-only users then that kind of duality can come back. Right now I think web designers don't want to have to present the text-only question before jumping to the content. But that's laziness more than anything.

Imagine never having to answer stupid questions like "flash or html?" "800x600 or 1024x768?"

It's possible that based on the connection speed, you could default modem users to the HTML site and broadband customers to the flash site (of course, with links to the opposite choice). You could also arrange the tables so people with smaller screen sizes are scrolling left to right and people with large screen sizes aren't forced to scroll down a website that fits into the first three inches of their screen.

I do think there is something else they should flag...system color scheme. I use a darker scheme where my text is white and my workspace is black. On many websites with hardcoded white background I can't read a thing. I usually end up having to disable them. It would be nice if a website could ask my browser what my default text color is and send out the appropriate background.

I understand your position. However, as another web designer, I would love to at least have easy access to your preferences. Typically the browser settings would be a good indication of the user preferences. Possibly a better solution would be a "preferences" header. This way each user could set up things like "preferred size", "preferred resolution", "preferred font and size". These could be transmitted to the server and utilized appropriately.

And frankly, as someone who has done tech support, I KNOW that sometimes the experts do have to do the thinking for the end user...

You can get most of the info through javascript then return it to a cgi for logging purposes.

In my experience, Javascript is usually a bad solution to whatever the problem is (others would disagree strongly). With this particular issue you have the problem of passing the information to the server (a page load), keeping up with the information while the user navigates (session management/cookies/user tracking of some sort), and the generally not-quite-completely-compatible nature of Javascript (you have to write scripts to check for and behave differently for just about every browser and browser version).

Sure, there are plenty of prewritten scripts to do just that. But you still have to worry about the possibility that the user's browser does not support Javascript or that it is disabled. You therefore STILL have to have a default "blind guess" (as opposed to a "Javascript guess") version.

The HTML headers would not remove the need for a "blind guess" version, but it would solve all the other problems. If it existed, the web designer could count on it and utilize it easily.

On the one hand, I could understand how some of this could be important if we were talking about sending full-fledged web apps to the user. On the other hand, it appears that what most web designers really want is the ability to send me content that would be far better off rendered as a pdf file.

That's right. It would be better visually many times as a PDF... Or an easily resizable Flash... or, or, or. Right now though, the best thing we have to work with for display is HTML.

CSS is still in the "maybe one day it could be really useful" stage, but it is mostly broken in different ways on different browsers. PDF isn't interactive. Flash is about 90% supported display-wise, but tools for interactive use (such as the PHP-Ming combo) are still maturing and you still have to be concerned with the other 10% of users.

Don't get me wrong. I think pixel perfect HTML is much more trouble than it's worth. However, it's generally marketing that makes the look-and-feel decisions, not us measly nerdy web masters. A HTML header that would give me as a designer a couple more tools to work with would be extremely welcome.

Personally, I'd like to see HTML completely scrapped in favor of something that works well. It's being used for things it was never intended for. I picture it as this huge pile of scrap stuck together with bubble gum and kite string. Like Microsoft, massive chunks of gizmos tacked on from every direction that somehow still manages to (mostly) work. HTML is not the best tool for most jobs, but it's the most common and compatible.

Speaking as "another web designer" myself that's also a tech, you're the kind that have given us all a bad name and screwed up the web.

That's going too far.

What you're looking for is more ways to push style over substance, and I'm asking you to reconsider that position.

What I'm looking for is a better way to manage display. Let's face it, most websites are little more than interactive ads. Sites like this are the exception, not the rule. What you suggest works fine for a content driven site, not for a corporate site where the marketing department is extremely concerned about presentation.

Yes, everyone has different preferences, so how about giving them content they can use regardless of those choices instead of trying to manage the myriad of different user preference combinations that might want to see your pages?

That's precisely what I would like to do. With a little more information on what their preferences are, I can easily generate pages that give them what they want in a way they prefer. Want just the basics? No problem. Want fancy animated graphics? No problem. Want it converted to PDF and emailed to you? I can even do that. But if I don't know what you want, I have to make trade-offs to serve the lowest common denominator.

You don't have to do jack to give the user what they want given the preferences they have chosen.

Screen size is a matter of "form". A "short fat screen" has a different form factor than a "tall skinny screen", right? A properly designed web page is not constrained to any one resolution or window size. CSS has provisions for layout boxes defined as a %-age of the parent element and for floating elements. If I resize my browser window, the web page should reflow into the available content area, not be locked to a particular presentation.

Do you really want to build a site 4 times to accommodate 4 different ways a user might access it? What happens if a 5th method is developed — do you retrofit all your existing sites? No! Build the site correctly and you only have to do it once!

Remember when most sites had a "text only" link? Maybe if the browsers make it easy to identify text-only users then that kind of duality can come back.

There never was a duality, except when lazy web designers were involved. Web content is primarily textual. If you have inline images or other media, you're expected to provide ALT text and similar fallback mechanisms. Graceful degradation [anybrowser.org] and device independence [w3.org] are the key, but the concept seems to have flown right over the heads of an entire generation of dee-zyne-ers.

Imagine never having to answer stupid questions like "flash or html?" "800x600 or 1024x768?"

Imagine sending your content in a universally accessible fashion, rather than a proprietary format that requires a plugin. Imagine designing a site correctly so that it automatically fits any size browser with no extra work or finagling on your part.

It's possible that based on the connection speed, you could default modem users to the HTML site and broadband customers to the flash site (of course, with links to the opposite choice).

If you recognize here that people want a choice, why don't you recognize their choices (system preferences) in other areas as well?

You could also arrange the tables so people with smaller screen sizes are scrolling left to right and people with large screen sizes aren't forced to scroll down a website that fits into the first three inches of their screen.

See above. A good design accommodates variable screen sizes without the need for "detection scripts" and such. You don't need to know the user's screen size.

I do think there is something else they should flag...system color scheme. I use a darker scheme where my text is white and my workspace is black. On many websites with hardcoded white background I can't read a thing. I usually end up having to disable them. It would be nice if a website could ask my browser what my default text color is and send out the appropriate background.

Similar functionality exists in CSS. If the site uses your system colors [w3.org] it will behave as you describe.

The big companies will always be ahead of crusade sites like Slashdot. Even though we will eventually find out what is going on, it is always after some form of privacy trampling has taken place.

There needs to be a law on the books that prevents the transmission of any information without the user's express consent. I'm not talking about the "If you install this software, you agree to these terms" type of consent, but the "we are sending the following information to our central database: connection speed, monitor type,..." with a OK/Cancel popup. This becomes important when you start sending things like "We are sending the following to the Microsoft database: Your hard drive's serial number, your mother board's serial number, your up-to-date billing statement ensuring you have paid for this week's use of Windows XP,..."

Of course, the odds of such a law happening are slim; the odds of a well-crafted law passing are about zero. We need some Slashdotters in Congress, I guess...

Even a cursory examination should show that these numbers don't have enough uniqueness to be globally unique IDs. Microsoft's GUID had 128 bits; a good hash function might have 160 bits; those serial numbers, culled from widely scattered machines, aren't unique enough.

It's beside the point, but exactly how many bits do you think are in there?

It looks like you have 48 characters after the colons. That's more than enough bytes to encode the bits you say you need to be a unique ID. If each pair of characters is a hex representation of an 8-bit number, then you have a 192-bit space.

It is interesting to note that Earthlink is owned by Scientology. This story is especially timely. There is an unofficial investigation underway into the rumour that Scientology maintains a separate set of email servers on Earthlink's network that mirror all email traffic sent to or from its subscribers, the purpose of which is to scan for Scientology-related (read: disparaging) messages.

The rumour is that the server farm is at an offsite location, which only Scientology has access to. The explanation given to the employees of Earthlink about this offsite facility is that it is an "offsite backup" location.

Imagine never having to answer stupid questions like "flash or html?" "800x600 or 1024x768?"

Yes, imagine. Imagine if web designers weren't obsessed with style over content, with special effects over usability, with animated intros over usefulness, with exactly positioned layout over standards that are easily accessible by the visually impaired or degrade well for old browsers.

There is a DMCA exception for protecting privacy. You can go around it to find out if anything private is being sent and disable it. In addition, the DMCA only protects restrictions placed by the copyright owner, and allows a copyright owner to allow circumvention. You are the copyright owner of your own personal info. Third, the DMCA is unconstitutional.

I am not a lawyer; get real legal advice if you need it. Or just hide your tracks real well. ;)

> Like opera, where you can select which one, "Identify as MSIE 5":) Cool

I used to play those kinds of tricks (mostly by using Junkbuster), then realised that I was doing myself a disservice by inflating IE's stats. If everybody masquerades as IE, then webmasters will be right to build IE-only pages, since IE is the only thing they will see in their logs.

At this point, the User-Agent: rewrite will stop working, because the sites will really be using proprietary IE functionality that will not even exist in Opera. And you will be forced to use IE.

> That really sux. Please do us both a favour - mail them and tell them to at least provide a 'in english, please'-link that can be bookmarked.

They already provide a bookmarkable link that works for me. It is <http://www.google.com/intl/en>. This one is not redirected, and is in English.

But the big issue, for me, is that they may start to use different databases for different audiences, in which case I may not be able to access google.com content from France, only an English version of google.fr.

For a couple of weeks now, my home page, which is www.google.com, has been displayed in French. More precisely, www.google.com sends me a redirect to www.google.fr. My browser is set to request only English documents, so I suspected they base the redirect on the IP address.
A quick direct connection shows it:

Please don't. If I want to download a high quality link on a 56k modem, it is my business. If I want only the lowres from my DSL line, it is my business too.

Web designers should stop trying to think for their users, like Google, which insists that I get the French version of the page.

Of course, you're going to tell me that you would provide a link to the other version of the site, but the truth is that you wouldn't.

Try browsing ati.com with Mozilla. Isn't that nice, a 'Web Designer' that makes decisions for its users? (The site sort-of works with Mac OS X Server OmniWeb, or lynx, so it is just because they are lazy assholes.)

If such headers were common, it would only take a couple of years until:

1/ Users have only one link, and the server chooses what content is best for them.
2/ Users with browsers that don't give the info are redirected to a please-use-the-latest-IE page.

Hmm, he actually said, "even a cursory examination should show that these numbers don't have enough uniqueness to be globally unique IDs." While he did go on to count bits immediately after that, he mentioned that since these numbers were taken from widely scattered machines, they should be a lot more different. The issue wasn't really the length -- it was that the numbers were almost identical for a bunch of different machines. They aren't too short -- they're too similar.
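To make the "too similar, not too short" point concrete, here is a minimal Python sketch that counts how many digit positions differ between two header values. The two values below are made up purely for illustration (they are not real ELNSB50 data), and `differing_positions` is a hypothetical helper name.

```python
def differing_positions(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


# Made-up 48-digit values for illustration only; not real header data.
header_a = "0e0803030410" * 4
header_b = header_a[:-2] + "ff"

print(differing_positions(header_a, header_b), "of", len(header_a), "digits differ")
```

For comparison, two independently chosen random 48-digit hex strings would be expected to differ in roughly 45 of 48 positions, since any given pair of digits matches only 1 time in 16. Values that differ in just a handful of positions look like packed settings fields, not random IDs.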

Yes, it's an invasion of privacy. Is it malicious? Probably not. Will it help Earthlink monitor their service, make it more efficient, and potentially more usable (display depth, etc.)? Yes. While I think it's crummy of Earthlink to keep quiet about this, it's no big deal. The average user is going to end up with better service or potentially lower prices because of more efficient use of Earthlink's resources. The average AOLer doesn't think about privacy the way Slashdotters do; witness Smartmouth [smartmouth.com], an online service which references the database of Stop&Shop, a grocery store, to provide calorie and fat content info on all your groceries. 99 44/100 % of users will think this is a Good Thing.

It would be cool because the designer could make a more intelligent default choice for the user... lots of artery-clogging graphics, or few artery-clogging graphics?

Then again, considering how shitty 99% of web design is, maybe it's better that designers code their pages on the assumption that users have 28.8 modems. I'm freaking tired of graphic design overload and NO content.

Putting your bandwidth in the HTTP request would only be good if...
1. Users could override what goes in the header... for example I have DSL but I hate graphic overload so I'd probably self-identify as a 14.4 modem user:)
2. Users had the power to switch to the low- or high-bandwidth site.
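A rough Python sketch of those two conditions: the bandwidth hint only picks a default, and an explicit user choice always wins. The function name `pick_site` and the hint values are assumptions made up for this example; nothing here is how the Earthlink header is actually consumed.

```python
from typing import Optional


def pick_site(speed_hint: Optional[str], user_choice: Optional[str] = None) -> str:
    """Return "low" or "high"; an explicit user choice overrides the hint."""
    if user_choice in ("low", "high"):
        return user_choice  # the user switched sites by hand
    # No explicit choice: fall back to the (hypothetical) hint sent by the browser.
    return "high" if speed_hint == "broadband" else "low"


print(pick_site("broadband"))         # high
print(pick_site("broadband", "low"))  # low
print(pick_site(None))                # low
```

Browsers that send no hint at all land on the low-bandwidth site, which is the safer default.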

Well, I'm glad you're going to sit down and reverse-engineer every piece of software you buy or download, before installing. I suppose you're also going to inspect all the farms that grow your food, the plant that makes your car, etc.

I think the original poster came pretty close to hitting it squarely on the head. Note that the suggestion was not that government ban the sending of information. It was proposed that government mandate the revelation of information-mining, so that you can vote with your wallet... intelligently.
It's as easy to say "The fewer laws the better" as it is to call for a cradle-to-grave state. Both are failures and abdication of voter responsibility. What we need is the right laws, and their number will fall somewhere between zero and infinity. Though it was said 150 years ago, it's still true:

The legitimate object of government is to do for a community of people whatever they need to have done, but cannot do at all, or cannot so well do, for themselves in their separate and individual capacities.

I'm an Earthlink user, and it isn't required that you install the Sandbox software. You just have to be able to set up a Dial-up Networking connection in Windows, which, even for slightly novice users, isn't particularly difficult between the Dial-up Networking wizard and Earthlink's instructions. My fiancée uses the Sandbox stuff. The only thing I see that she gets from using it is a prettier display while the modem is dialing up.

As for the supposed unique serial number turning out not to be one, I'm not surprised. Earthlink did stand up to the FBI when it came to installing Carnivore.

BigCat79

First, about the popID in the HTTP header, I hate to tell you this, but I happen to know that my Earthlink IP address is "nicely" masked via my geographic POP location, e.g. cust1.citystate.etc.etc. So Earthlink, in masking my IP numerics, uses the city I dial up from.

Secondly, as long as they don't make me use their in-house software as a condition of using their service, I don't care what they develop. I like Earthlink because they do actively support Linux/PPP connections with very little hassle. I understand that these folks are having support issues, especially since they just ate a number of the remaining clueless lusers from MindSpring and OneMain.com. Oh, and another thing, that Sandbox screenshot is old. Member start pages (that blue page) were changed in Jan/Feb.

Third, has anyone stopped to think that perhaps the PGP-encrypted data sent during install might be a new subscriber's CC number and other personally identifying information? Wouldn't that make sense?

You just need to know where to draw the line. This is well short of the line, folks

I agree, but where exactly is that line? And more importantly, is a company going to tell me when they have an itch to cross it? Almost certainly not, which is why we need to nip this kind of behavior in the bud.

I had this same problem when dealing with an "application" that insisted on sending information about my computer out.

What I ended up doing was having a registry monitoring program called regmon monitor all registry access; then I loaded up the program and stopped monitoring the registry... I found that they wanted to send a LOT of VERY personal info out.

No real disassembly is needed... load up regmon or filemon (file access monitoring program) and note what it looks at... betcha you would be surprised...

Well I have a solution, put the light content as the default and if they want more give them a link. The light can always be the default, with links to "Click here, let us set a cookie that says you like big bandwidth hogging content from our site forever". Then on the default light content site, check the cookie and redirect to heavy content. Am I missing something?
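The cookie approach described above could look something like this in Python. It is only a sketch: the cookie name `content_pref` and the function `choose_version` are invented for the example, and a real site would do this inside whatever server framework it already uses.

```python
from http.cookies import SimpleCookie


def choose_version(cookie_header: str) -> str:
    """Return "heavy" only if the visitor previously opted in via a cookie."""
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    pref = cookie.get("content_pref")  # hypothetical cookie name
    if pref is not None and pref.value == "heavy":
        return "heavy"
    return "light"  # the default for everyone who never asked for more


print(choose_version(""))                    # light
print(choose_version("content_pref=heavy"))  # heavy
```

The point of the design is that the server never guesses: the heavy version is served only to visitors who clicked the opt-in link and accepted the cookie.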

Could this be a reaction to that brain dead French court ruling on Yahoo? Maybe Google is filtering out Nazi related stuff if the reverse DNS indicates you're in France? Of course that wouldn't excuse them from only providing a French language page.

actually what /.ers won't believe is that it is irrelevant what the data says. if you are an earthlink member, then they already have all your personal info, you gave it to them when you signed up. people are getting way too paranoid for their own good. since it only occurs at installation, the only possible data it could send would be necessary hardware and login data.

Even a cursory examination should show that these numbers don't have enough uniqueness to be globally unique IDs. Microsoft's GUID had 128 bits; a good hash function might have 160 bits; those serial numbers, culled from widely scattered machines, aren't unique enough.

There are 48 (presumably) hex digits there. Each hex digit represents 4 bits. So the number is a 192 bit value.
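The arithmetic is easy to check in Python. The 48-digit string below is a placeholder, not an actual header value, and `bits_in_hex_string` is just a helper name picked for this sketch.

```python
def bits_in_hex_string(hex_digits: str) -> int:
    """Return how many bits a string of hex digits can encode (4 per digit)."""
    int(hex_digits, 16)  # raises ValueError if the string is not valid hex
    return len(hex_digits) * 4


placeholder = "0" * 48  # stand-in for the 48 digits after the colons
print(bits_in_hex_string(placeholder))        # 192
print(bits_in_hex_string(placeholder) > 128)  # True
```

So length alone would be more than enough for a 128-bit GUID; whether the values actually vary enough from machine to machine is a separate question.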

What you can find out with a little use of your brain:
- connection speed (not hardware.. but true speed)
- font size (not sure about this.. signed scripts should make it possible)
- POP ID - well, they provide your service, so they surely know about it
- sandbox version - if you don't use it, they can't find it out.

What they in fact do is pool their incoming information into one channel. That's much easier than collecting, analyzing and joining all the logs from their different dialups and proxies.

So it's not really a bad thing they do. Just a little bit naughty. No more evil than banner- and counterhosts detecting your resolution and stuff..

I think Earthlink should have published the spec in advance, if for no other reason than to protect their shareholders from privacy scares. Earthlink has invested millions in its 'serf at AOL' campaign. They need to protect their pro-geek branding.

Another reason for publishing is so people can make use of the tag.

2. Standards Approach

Speaking as one of the original designers of HTTP, the tag as specified sucks. It is fixed field after fixed field, no extensibility. I think that the idea is fine, but the syntax chosen is not.

First off a non-standard header should have an X- prefix.

Secondly, the scheme does not work for text to voice displays, or for that matter very high definition displays (>100dpi) that are on the horizon. It would be handy to be able to give the monitor size and also the gamma. These are all real needs for real people today, and will be mainstream in a couple of years.

Now there have been folk who have created similar schemes from time to time; none has taken off due to apathy at Netscape and Mr Softy. But that is no real excuse for Earthlink. If they don't like the schemes on offer they might at least state why.

Very interesting point, and a good reason for this type of technology to be used. But I, like many /.'ers, really dislike/distrust the idea that you never find out about what Earthlink or other companies are sending back to their servers until after someone digs through their code. I've only ever seen a few programs that actually explain up front what security issues are involved in using their software, AND how to protect yourself.

I wanted to try the free Earthlink service about a year ago, and when I installed it, it automatically installed their IE 5.0 browser over top of what I already had. I was pissed! Their install program never asked me if I wanted to do that. To this day, that old computer of mine has the crappy Earthlink browser installed. I never use it, but I also haven't figured out a way to get rid of it other than a complete reworking of the registry (not a good idea!) to make sure I've eradicated the Earthlink crap.

Yes, imagine. Imagine if web designers weren't obsessed with style over content, with special effects over usability, with animated intros over usefulness, with exactly positioned layout over standards that are easily accessible by the visually impaired or degrade well for old browsers.

I think you will find most good web designers do care about these things...It's the marketing droids that want the shiny spinning stuff and the locked layouts

As one of the lead web developers for a large and successful e-commerce site (who will remain unnamed because I'd like to keep my job) I can attest to this fact. The typical concept-to-implementation for a project starts out with our designers having created low-bandwidth, user-friendly, but still good looking designs, our developers having coded it browser-inspecific, and the database people having given us good structure on the back-end.

Then the marketers and upper-management get their hands on it. "Can you change this feature?" "Can we add flash to that page?" "Can we get that in cornflower blue?"

Not to mention if we ever present multiple design concepts, they never want all of one, they want bits and pieces from all of them, resulting in a frankenstein monster that is not only hell to write, but hell to maintain and hell to use.

It doesn't help that our designers are constantly looking for ways to stretch themselves (you'd get bored doing GIF and JPG banners all day, too) and jump at any opportunity for huge flash projects.

The end result is a meeting room with three or four developers' voices about feasibility, usability, and scalability lost amid a sea of excited voices about what a fancy, exciting site it's going to be once we implement all the new features.

Most developers really do have good intentions, but we're not given the freedom to implement any of them.

And, because we have a team of top-notch developers, we actually are capable of building the frankenstein monsters they want, and when we succeed in building it, they only want more just like it. Our earlier protests are forgotten, many marketers grumbling quietly that we must just be lazy and not want to do more complicated web design. In the end, to their minds, we're just their trained monkeys.

I hate to play devil's advocate here, but to be honest I rather like this idea. The information isn't any more identifiable than, say, an IP address. One big benefit comes if other browsers begin to include this type of information: PHP could use it to choose the "best" version of a webpage, video stream, etc. to send you. I know I personally get annoyed when a webpage is designed for a much higher resolution than I have set. Similarly, inexperienced internet users shouldn't be allowed to attempt to stream 1 Mb/s of video through a 56K modem. Sure, it'll look like crap and it's all the end-user's fault, but marketing people will tell you that if the end-user screws up you can lose customers because of it (they can go elsewhere, you can't).

I use Earthlink and had been aware of this for a while, but had been unable to find any solid information regarding the extra header.

I have an Earthlink connection; it's the best I can do because of my location. Anyway, I had written an HTTP proxy Perl script, simply for my own educational purposes. You can imagine my surprise when I noticed this extra header! I could not find a reference to HTTP_ELNSB50 in any of the RFCs or manuals I consulted, and I noticed that it never changed.

I did in fact email Earthlink about this, because I feared it might be an invasive identifier. I am disappointed, though, to report that even after repeated emails, I received no answer to my queries. I do not begrudge Earthlink this, but I do not think it is the best customer service. I nearly canceled my account when I could not discover what this mysterious header was.

Suffice to say, though, I am very grateful to Slashdot for answering my questions!