Apparently, HealthCare.gov isn't just having a few backend problems. A software quality researcher studying the besieged online health insurance exchange has discovered a number of issues that could expose the personally identifiable information of applicants to third parties and leave that information vulnerable to attacks by hackers.

Those problems may be due in part to the long-delayed security testing of the entire integrated exchange system, which was put off while last-minute development work was done to ready the site for launch. Recently published internal government documents indicate that the site was given only provisional security approval before launch because a substantial amount of testing had not been completed just days before the site's October 1 launch date.

The problems uncovered by researcher Ben Simo hint at how slapdash some of the coding done to integrate the site was. He found personally identifiable information embedded both in Web addresses sent to reset user passwords and in data being sent to third-party sites not directly involved in the health insurance certification process. HealthCare.gov's website also pushes personal data having nothing to do with site functionality back to browsers. While that data is sent over an encrypted connection, it could be vulnerable to exploits targeting HealthCare.gov users.

Security alert

A government memorandum signed off on by CMS Administrator Marilyn Tavenner allowed HealthCare.gov to launch without final security checks.

On September 27, just three days before HealthCare.gov's legislatively mandated go-live date, Centers for Medicare and Medicaid Services (CMS) Administrator Marilyn Tavenner signed off on a request to buy more time for the Security Control Assessment (SCA) of the site. CMS Deputy CIO Henry Chao and Consortium Administrator for Medicare Health Plan Operations James Kerr submitted the request, noting:

From a security perspective, the aspects of the system that were not tested due to the ongoing development exposed a level of uncertainty that can be deemed as a high risk… Although throughout the three rounds of SCA testing all of the security controls have been tested on different versions of the system, the security contractor has not been able to test all of the security controls in one complete version of the system.

The proposal approved by Tavenner outlined a set of steps CMS would take to cover its assets during the first year of HealthCare.gov's operation, including the use of continuous monitoring tools to perform daily and weekly assessments of activity and weekly testing of all the "border devices" between the Internet and the overall system, "including Internet facing Web servers." The plan also included a full SCA evaluation on the HealthCare.gov "stable environment" within 90 days of the site's launch. CMS' CIO Tony Trenkle, Enterprise Information Security Group Director Teresa Fryer, and Chief Operating Officer Michelle Snyder all signed off on the recommendations, acknowledging the elevated security risk.

One of the mitigating factors cited in the plan was that the whole HealthCare.gov infrastructure would be migrated once the site went live into CMS' Virtual Data Center, a set of geographically dispersed data centers operated by a collection of eight contractors. The Virtual Data Center facilities have all already passed security certification, and migration of the site from the temporary capacity provided by Verizon Terremark during development was supposed to begin this month.

In retrospect, the decision to move HealthCare.gov into the distributed data center might have come a little bit too late—Terremark's single data center hosting the HealthCare.gov site has suffered two outages since the launch.

Privacy problems

Whatever security assessments were done on the code that shipped October 1, analysis by Ben Simo shows it didn't include checks against the site's own privacy policy.

First, Simo found holes in the site's password reset function. While a bug that revealed the e-mail address and password reset code for a user through a Web debug tool was repaired, another flaw in the reset process remains: the username and reset code are sent in clear text in e-mail as part of the link users are asked to click on to perform the reset. Also, the password-reset code is apparently permanent—meaning that if it is compromised, someone could use the code and username over and over to attempt to hijack a user's account.
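The two remaining reset flaws described above (a link that carries the username, and a code that never expires) are usually avoided with random, single-use, time-limited tokens. The following is a minimal sketch of that general pattern, not HealthCare.gov's actual code: the class name `ResetTokenStore`, the 15-minute lifetime, and the in-memory dictionary are all assumptions made for illustration.

```python
import hashlib
import secrets
import time

RESET_TTL_SECONDS = 15 * 60  # assumed lifetime; not taken from the article


class ResetTokenStore:
    """Hypothetical in-memory store for single-use, expiring reset tokens."""

    def __init__(self, ttl=RESET_TTL_SECONDS, clock=time.time):
        self._ttl = ttl
        self._clock = clock
        self._pending = {}  # SHA-256 of token -> (username, expiry timestamp)

    @staticmethod
    def _digest(token):
        # Only a hash of the token is kept server-side.
        return hashlib.sha256(token.encode("utf-8")).hexdigest()

    def issue(self, username):
        # The token is random, so the e-mailed link need not carry the username.
        token = secrets.token_urlsafe(32)
        self._pending[self._digest(token)] = (username, self._clock() + self._ttl)
        return token

    def redeem(self, token):
        # pop() removes the entry, so a token works at most once.
        entry = self._pending.pop(self._digest(token), None)
        if entry is None:
            return None
        username, expires_at = entry
        if self._clock() > expires_at:
            return None
        return username
```

Under these assumptions, an intercepted link is useless after its first use or after the lifetime elapses, whichever comes first.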

The latest vulnerability unearthed by Simo is in a pile of data passed by the site to the user's browser and in the information sent to analytics sites used to track the site's performance. That data, Simo discovered, includes the user's username and password reset code.

HealthCare.gov sends data to analytics providers such as Google's DoubleClick and Pingdom. As Simo reviewed the Web requests being made as part of his movement through the HealthCare.gov site, he found requests sent to these two providers that included his visit to the password reset page—and all of the user data that was generated by the page. That runs counter to the privacy policy on HealthCare.gov, which states that no personally identifiable information will be collected by site analytics tools. This is the same sort of behavior that the Federal Trade Commission has fined social networks such as Facebook and MySpace for in the past.
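One common way to keep page data such as usernames and reset codes out of third-party requests is to strip the page URL down to an allow-list of harmless parameters before it is reported anywhere. The sketch below illustrates that idea only; the function name `scrub_url_for_analytics`, the `SAFE_PARAMS` set, and the example URL are hypothetical and do not reflect how the site or its analytics providers actually work.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical allow-list: only these query parameters may leave the site.
SAFE_PARAMS = frozenset({"lang", "step"})


def scrub_url_for_analytics(url, safe_params=SAFE_PARAMS):
    """Return the URL with every non-allow-listed parameter and the fragment removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in safe_params
    ]
    # An empty fragment is passed so nothing after '#' is reported either.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


# Example with a made-up reset URL:
print(scrub_url_for_analytics(
    "https://example.gov/reset?username=jdoe&code=abc123&lang=en#top"
))
```

An allow-list is used rather than a block-list so that a newly added parameter is withheld by default instead of leaked by default.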

Even more information gets pushed back to the browser as a user moves through the site. Simo found a JSON data structure being pushed back to his browser that included most of the personal information for his account, including various unique user IDs and his name, address, date of birth, phone number, and e-mail address, plus a field for his social security number if he had provided it—along with the password reset code.

While all this data is transmitted over a secure connection and isn't stored as a cookie or in some other relatively permanent form, Simo told Ars, "it's increasing the harm if an account can be compromised." The information isn't used by the site once an account is created, so it shouldn't be sent back to the users each time they log in, he said. "I get the impression they weren't thinking about security as they designed these pieces of the site."
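Simo's point that unused account data should not be returned on every login corresponds to a familiar practice: build the browser-bound payload from an explicit list of fields the page needs. The following is a rough sketch of that practice under assumed names; `SESSION_FIELDS`, `session_payload`, and the account field names are invented for illustration and are not the site's real data model.

```python
import json

# Hypothetical: the only account fields the logged-in pages actually need.
SESSION_FIELDS = ("display_name",)


def session_payload(account, fields=SESSION_FIELDS):
    """Serialize only the allow-listed fields of an account record."""
    return json.dumps(
        {key: account[key] for key in fields if key in account},
        sort_keys=True,
    )


# Made-up account record for demonstration.
account = {
    "display_name": "J. Doe",
    "date_of_birth": "1970-01-01",
    "ssn": "000-00-0000",
    "reset_code": "abc123",
}
print(session_payload(account))
```

With this approach, adding a sensitive column to the account record does not change what reaches the browser unless someone deliberately adds it to the field list.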

Depending on the analytics vendor, that's actually against the TOS. For Google Analytics and Omniture, including a username/email/other PII (Personally Identifiable Information) would be grounds for discontinuation of service.

By happenstance, I just filed a FOIA request today asking for any documentation on web analytics on healthcare.gov, so in a few weeks we'll find out if anyone knew of this exact bug prior to launch. I filed the FOIA because I wanted to know how accurate the statements about registration numbers being unavailable were; PPTs/other reports based on analytics data can settle that argument rather permanently.

There are many federal sites that use Google Analytics or similar, seemingly unaware that it's all available under FOIA. Interestingly, NSA.gov runs Google Analytics. I think there should be some sort of federal framework around analytics that mandates public quarterly reports on website utilization.

Well, as mentioned in an earlier article, "...the whole thing was dependent on data provided by Experian", so right there you know there are going to be data breaches and leaks of personal information.

Can't wait until the Russian Cyber-Mafia gets hold of our credit and medical info... or maybe it will be the Syrian Electronic Army...

Not to defend this catastrophe too much, but in the interest of fairness, I wonder how many of these incidents are more due to the intense scrutiny that everybody and their dog are subjecting this system to. How many large-scale systems projects in the private sector suffer from massive security holes that just don't get noticed until something happens? You wouldn't believe how many companies still store passwords in plain text...

I welcome the scrutiny, as the site asks for heaps of personal data. It should be one of the most secure websites in the history of websites.

How many private sector projects cost hundreds of millions of dollars just to start? How many private sector projects involve as much personal information as this site does? How many private sector projects have backdoors right into the IRS and your credit history?

I welcome the scrutiny, as the site asks for heaps of personal data. It should be one of the most secure websites in the history of websites.

Agreed. Admittedly, I am a bit of a tech 'lay person', but I cannot think of any other data storage location, besides the NSA, that conglomerates so much personal information. Ease and convenience of use are way down the list of priorities; security should be priorities 1, 2 and even 3.

Most large public sites don't pass along plain-text user data in analytics requests. Period.

Every issue/problem that I've seen reported about HealthCare.gov is a mitigatable and/or solved issue. I really don't comprehend how in 2013, and with good funding, our government is incapable of deploying what should be a pretty straightforward web application, meaning they're not inventing any new, untested technology.

I feel like they just plucked some random firm off the internet who told them they would build the site to maximize SEO.

Who gets sued and imprisoned when the inevitable security breach happens? This is just too tempting a target for hackers: massive amounts of personal information about millions of people. I have not heard of any breaches yet. Ironically, the massive problems the site has been having may have made it effectively secure. The data may be so garbled as to be unusable by the hackers.

For what it's worth, I've been through this process with a large federal financial system. The whole thing is a joke. It's basically: "list every piece of 'software' you're using for your system". Oh, you're using Java version X? Great, that is on our approved software list. You say you're using... "Javascript"? Is that "remotely deployable code"? Okay, now set up your servers. We're going to run a few scripts that check for known vulnerabilities on your open ports. Okay! Great, you're accredited!

They basically didn't care about anything that wasn't pre-packaged. There was no code audit, there was no logical evaluation of attack vectors, it was just a checklist and automated scans.

Why the fuck does Healthcare.gov need to interact with DoubleClick? It's not like the government can legally make a profit from the ads they display.

It's time to open both the frontend and backend to the open source community. I sure hope the government owns the rights to all the code; otherwise we are in trouble if whoever does own it decides they want more money to maintain the code.

I can't quite imagine how we would have any of these problems with a single payer system.

We seem to be going through a lot of pain and bullshit just to prop up the for-profit health insurance industry.

A single payer system still requires the government to run a countrywide system. If they are mismanaging the security now I would be very worried about the ability to handle security in any large scale system.

We already have at least 2 of those specifically related to health care (Medicare and the VA) and while they have their flaws, there seems to be little reason to keep reinventing this particular wheel. This one's even more complex because it's entangled with a bunch of extra data about incomes, tax deductions, state monopolies, and a variety of private insurance companies, none of which would be needed on this end of it if everyone was just automatically enrolled in the same "plan" nationally. All the financial/deduction details could be handled in tax returns, like they are now.

What I don't understand is why the administration didn't accept the offers from MS and Amazon when they said they would provide technical assistance. I say the contractor working on the service isn't good enough for the job. They got the contract because of insider connections. Here is the political price the Obama administration is paying...after a complete Tea Party shutdown debacle. And you begin to see the Democrats employing the same Ted Cruz self-serving rhetoric about how well the ACA is working because too many people are signing up and causing the system to crash. If they knew the program would be this popular (and why wouldn't it be, since you are required BY LAW to sign up or else be penalized in your tax refunds), they would have hired a more qualified company to work on the service.

If the factoid about 500 million lines is true, it represents more code than all Windows Operating Systems, Linux, Debian, and Facebook combined.

Considering that the Zumwalt has only 50 million lines of code, I suspect there are not 500 million lines of code in this project, unless you count all the code in Java Enterprise and all the mainframe code this thing touches.

Sean, do you know who originally reported the 500 million lines number?

I share your skepticism on the 500 MLOC number.

But it's still telling that anyone would make an estimate reaching into the tens of millions.

"I get the impression they weren't thinking about security as they designed these pieces of the site."

The PC way of saying "pieces of shit"

I can't wait for more crashes, meltdowns, exposure of private information, etc. to hit the news. This whole health care thing is one big long-running joke. It's like watching the gov put band-aids on cracks appearing in the Hoover Dam; it's only a matter of time before it breaks completely.

I can't quite imagine how we would have any of these problems with a single payer system.

We seem to be going through a lot of pain and bullshit just to prop up the for-profit health insurance industry.

Yeah..because the same guys who can't launch a website could then manage your healthcare...

What could go wrong?

Not a lot, really. The beauty of universal health care is that there's not really much for the government to do, unless they're also operating the hospitals (which isn't even the case in most systems, except the UK). It's as simple as "You have a valid health card? Okay, you get whatever the doctor orders." There's no decision to be made at all. That's why Canada spends about 3% of every health care dollar on administrative overhead, while the US spends about 33%.

"HealthCare.gov sends data to analytics providers such as Google's DoubleClick and Pingdom." - nothing prepared me for this! How can the government possibly justify tying their web site to third-party for-profit tracking platforms? Citizens are required by law to do this health care thing. This is as bad as Medicare forcing elderly people to use IE. (Which was true a few years back when I signed my mother up for Medicare. I don't know if they ever fixed it. I was baffled by why the site wasn't working, then I started debugging with Firefox and found the site used IE-only extensions and wouldn't run with any other browser.) I think that, by law, a public service like HealthCare.gov should not be using third-party tracking services. This is a worst-case scenario.

Not a lot, really. The beauty of universal health care is that there's not really much for the government to do, unless they're also operating the hospitals (which isn't even the case in most systems, except the UK). It's as simple as "You have a valid health card? Okay, you get whatever the doctor orders." There's no decision to be made at all. That's why Canada spends about 3% of every health care dollar on administrative overhead, while the US spends about 33%.

That's a very myopic viewpoint that focuses on just the dollars you see while ignoring hidden costs like lost productivity while waiting for treatment. Also, overall administrative cost percentage is hardly a metric of success. When, in some cases, 40 cents on every dollar of taxes paid goes to health care and the program is still going broke, maybe it's time to rethink the program. Additionally, Canada may be graduating a record number of doctors but it's still not enough to meet the shortage. That's little consolation if your hospital is closed. Going from 2 doctors to 6 doctors may be a 200% increase and sounds real good (which is the general argument we see when defenders want to point at overall doctor graduations in Canada)...but if you really need 15 doctors, suddenly that 200% increase doesn't look as attractive. Damn statistics!

Quote:

The beauty of universal health care is that there's not really much for the government to do...

Sorry, when the government thinks it can spend your money better than you can, there's always lots for them to do.

In the case of IT security, more specifically cryptography, public scrutiny is part of the strength and validation process of encryption. Algorithms are released to the public, and experts and amateurs alike try to find holes; if an algorithm holds up to the scrutiny, then we can bet it is pretty safe. Any software or website is the same. The more eyes on it the better.

"Well that one other country had this specific (and sometimes not even directly related) problem with their national health care program, therefore it's impossible."

But if you ask around in those countries to find out whether they'd like to switch to the privatized, fractured insurance model the US is operating under, you're likely to get kicked in the crotch. So while none of them are perfect, the people living with them are quite sure they'd like to keep them.

A single-payer system is simpler. The reports of serious security issues with healthcare.gov are concerning because apparently they basically ignored security in design and testing.

The South Carolina Department of Revenue was recently hacked, and personal data was stored in plain text files per IRS data security requirements.

Based purely on those screenshots, I don't see the problem. Query strings are encrypted over SSL, and as you can see that's an HTTPS request. UPDATE: never mind, I didn't read the "3rd party" part. Yikes! Down vote me into oblivion!

But it's still telling that anyone would make an estimate reaching into the tens of millions.

Yes, but *what* does it tell you? It could be that someone with contact with the code knows it's that large, or it could just be that they believe it. It could also be telling us that someone has no sense of scale, so when they estimated they were inadvertently off by several orders of magnitude. Or it could be telling us that whoever leaked that figure had a political ax to grind and just made it up. Who knows?

Sean Gallagher / Sean is Ars Technica's IT Editor. A former Navy officer, systems administrator, and network systems integrator with 20 years of IT journalism experience, he lives and works in Baltimore, Maryland.