Politicians Start To Push For Autonomous Vehicle Data To Be Protected By Copyright Or Database Rights

from the battle-for-the-internet-of-things dept

Autonomous vehicles are much in the news these days, and seem poised to enter the mainstream soon. One of their key aspects is that they are digital systems -- essentially, computers with wheels. As such they gather and generate huge amounts of data as they move around and interact with their surroundings. This kind of data is increasingly valuable, so an important question poses itself: what should happen to all that information from autonomous vehicles?

The issue came up recently in a meeting of the European Parliament's legal affairs committee, which was drawing up a document to summarize its views on autonomous driving in the EU (pdf). It's an area now being explored by the EU with a view to bringing in relevant regulations where they are needed. Topics under consideration include civil liability, data protection, and who gets access to the data produced by autonomous vehicles. On that topic, the Swedish Greens MEP Max Andersson suggested the following amendment (pdf) to the committee's proposed text:

Notes that data generated during autonomous transport are automatically generated and are by nature not creative, thus making copyright protection or the right on databases inapplicable.

Pretty inoffensive stuff, you might think. But not for the center-right EPP politicians present. They demanded a vote on Andersson's amendment, and then proceeded to block its inclusion in the committee's final report.

This is a classic example of the copyright ratchet in action: copyright only ever gets longer, stronger and broader. Here a signal is being sent that copyright or a database right should be extended to apply not just to works created by people, but also to the data streams generated by autonomous vehicles. Given their political leanings, it is highly unlikely that the EPP politicians believe that data belongs to the owner of the vehicle. They presumably think that the manufacturer retains rights to it, even after the vehicle has left the factory and been sold.

That's bad enough, but there's a bigger threat here. Autonomous vehicles are just part of a much larger wave of connected digital devices that generate huge quantities of data, what is generally called the Internet of Things. The next major front in the copyright wars -- the next upward move of the copyright ratchet -- will be over what happens to all that data, and who, if anyone, owns it.

Re:

It's not too simple for them to understand. It's just that it would deny them what they want most, all your data.

Another possibility would be mandating that the data belongs to the owner of the system that collected it, but that would simply result in never being able to buy an autonomous vehicle, only lease one. A possible fix to that would be mandating that data collected during the period of any lease belongs to the lessee, not the owner, and that no contract may require a transfer of that data. It would also be nice if the law mandated that the owner of the data shall be able to access said data without needing outside help, and that the owner of said data may erase any or all data except for possibly a log entry indicating the date and time of the data erasure.

While we're dreaming, why don't we ask for a justice system instead of a legal system, a government that's fair to everyone, not just those in power, and world peace? They're all about equally likely.

Re: Re:

If you mandate that the lawyers who will be defense and prosecution of a case are up in the air right up until a coin toss at the start of a trial, you would find the legal system balancing out much more evenly. Right now the prosecutor has giant stacks of laws to wield to force defendants to plead out and never go to court, but that isn't bound by law, it is just how it has been done. If everyone demanded their day in court, the entire system would collapse within a year since no one could actually have a speedy trial without expanding the court systems a thousand fold.

Re:

How about deleting the data after a few minutes unless the sensors detect an accident? Would that be too simple a solution?

Yes: the data may be important to the functioning of the system. It's uploaded to the manufacturers' databases and sent down to other cars, to improve their driving. In particular, this is how they'll be collecting/updating detailed LIDAR for even the smallest roads. When a pothole opens up, the other cars (from the same manufacturer) will know about it.

Re: Re:

Also, the data would doubtless be used both to investigate crashes (what went wrong may not be a couple of minutes beforehand) - and thus fix bugs in their software - and to defend themselves in court (since, inevitably, they will be sued for every crash whether their software is at fault or not). I don't know about anyone else, but I'd personally prefer the developers creating these things have access to everything they need to know what causes crashes.

People are vastly oversimplifying things if they think that the only reason people need the data is to spy on you. Sensible retention limits, etc., are important but it goes both ways - just as it's good that corporations and government not have unlimited data on you, them having zero data might be just as dangerous.

Re: Re: Re:

People are vastly oversimplifying things if they think that the only reason people need the data is to spy on you.

Yes, there are reasons for collecting data that do benefit the consumer. The problem is that so many companies have abused data collection so badly that they have completely poisoned the well. The default assumption of most people is that data collection is done for the benefit of the company, regardless of the cost to the consumer. It's going to be a very, very long time before the corporate culture is trusted enough for this assumption to change.

Re: Re: Re: Re:

"The problem is that so many companies have abused data collection so badly that they have completely poisoned the well"

True, but just because banks have abused personal financial information, that doesn't mean you refuse what they require to do a good job under normal business conditions.

"The default assumption of most people is that data collection is done for the benefit of the company, regardless of the cost to the consumer."

About which they're correct, about every industry. But, the reaction to that is not to cripple the ability to do business at all, or to ensure that the customer is placed at even graver risk than they would be with the data collection.

personal data

Re: personal data

GDPR may be a good point, but the issue here is how much you allow. It's good to be suspicious of the surveillance uses, but you won't magically make everything better by preventing other uses of the data.

Re: Re: Re:

I don't know about anyone else, but I'd personally prefer the developers creating these things have access to everything they need to know what causes crashes.

The "preserve only after a crash" method described above handles that case reasonably well. It's largely how aircraft black-boxes work. (Some people are pushing for more retention there, e.g. cockpit voice recordings for a whole flight not just the last hours. But it's relatively rare for the current limits to stand in the way of investigations.)
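For anyone wondering what "preserve only after a crash" could look like in practice, here is a minimal sketch, not how any manufacturer or flight recorder actually does it. It assumes a rolling time window: samples older than the window are discarded as new ones arrive, and a trigger (crash sensor, e-stop, whatever) copies the current window into lasting storage. The class name `RollingRecorder`, the `window_seconds` parameter and the 180-second window are all made up for illustration.

```python
from collections import deque


class RollingRecorder:
    """Hypothetical recorder: keeps only the most recent window of samples."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._buffer = deque()   # (timestamp, sample) pairs, oldest first
        self.preserved = []      # snapshots kept after a trigger

    def record(self, timestamp, sample):
        """Store a sample and drop anything older than the window."""
        self._buffer.append((timestamp, sample))
        cutoff = timestamp - self.window_seconds
        while self._buffer and self._buffer[0][0] < cutoff:
            self._buffer.popleft()

    def preserve(self):
        """Called on a trigger (assumed: crash sensor or e-stop)."""
        snapshot = list(self._buffer)
        self.preserved.append(snapshot)
        return snapshot


# Usage: one sample per minute for ten minutes, three-minute window.
recorder = RollingRecorder(window_seconds=180)
for t in range(0, 601, 60):
    recorder.record(t, {"speed_kmh": 50})
kept = recorder.preserve()
kept_timestamps = [ts for ts, _ in kept]
```

The whole argument in this thread is really about two knobs: how long the window is, and what counts as a trigger.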

Re: Re: Re: Re:

"The "preserve only after a crash" method described above handles that case reasonably well"

Does it? What if the lawsuit is about an incident that the plaintiff claims was caused by your car's bad driving, but the vehicle was not involved in a collision itself? I can think of a lot of incidents that would not be triggered by that criterion, and thus necessary data is lost.

Again, it's not a bad idea to put sensible limits on data, but setting an arbitrary limit that you assume will allow companies to fix problems is asking for trouble. I know that I have come across numerous problems that have been made far more difficult to fix because logs from before a previous date were unavailable, and I'm sure the servers I run are less complex and are less physically dangerous than any automated car.

"It's largely how aircraft black-boxes work"

There's a massive difference in the data required and the situations where they would be investigated, though.

Re: Re: Re: Re: Re:

If a software issue caused an accident, it should be visible in the seconds before the accident. Also, if a problem is identified, it can be looked for in data arriving from other vehicles. If you can build the log search you can filter live data and only keep what is relevant for as long as it is relevant. Keeping the data for long periods, especially of people's travels, is just building a database for law enforcement to trawl and use for such purposes as identifying club members, church members etc.

Also, I assume there will at least be an emergency stop button, which passengers can use if they notice the vehicle behaving in a dangerous fashion, and obviously trip a preservation of recent data.

Re: Re: Re: Re: Re: Re:

If a software issue caused an accident, it should be visible in the seconds before the accident.

The interesting part of Paul's message was that the autonomous car could cause an accident without being a victim of it. In that case it might be completely unaware that anything happened or anything should be preserved. Still, I'm reluctant to suggest the vehicles store all kinds of private data just in case it could be useful.

The e-stop could be a good compromise. One might want a "send bug report but keep driving" option too.

Re: Re: Re: Re: Re: Re: Re:

The interesting part of Paul's message was that the autonomous car could cause an accident without being a victim of it.

Assuming that there is evidence of that from the vehicle or vehicles involved in the accident, the vehicle data is not required for assigning blame, and the software developers know what to look for in current data streams.

Further, once given a hint as to a cause of an accident, it is time to take to the test track to recreate the problem, where there is little risk to human life.

Re: Re: Re: Re: Re: Re:

"Also, if a problem is identified, it can be looked for in data arriving from other vehicles"

So, if you have a crash, you'd better hope it's on a crowded freeway and not a quiet back street?

"Also, I assume there will at least be an emergency stop button, which passengers can use if they notice the vehicle behaving in a dangerous fashion"

We live in the real world, sadly. If we can't get some people to be that alert, not text, etc., while solely in control of a vehicle, do you honestly think you can depend on them in a vehicle where they will not be required to do anything for the majority of their time?

Re: Re: Re: Re: Re: Re: Re:

Assume any vehicle involved in an accident keeps enough data to analyze the cause of a crash, and that should be present in the last several minutes leading up to the accident. Included in that data will be what it sensed about other vehicles including any video footage. (Also, it would be reasonable to trigger the preservation of data from nearby vehicles.) If that data shows another vehicle to be at fault, like cutting in too soon etc., then the software developers know what to look for in the data streams of the type of vehicle that caused the accident.

If companies keep vehicle data for long periods, the police will use either the third-party doctrine or overbroad warrants to go on fishing expeditions. Do you really want claims of a robbery or assault to become an excuse for the police to get information on who was near a political activist's house on a given afternoon or evening? That car data will not only show how you traveled, but how long your vehicle was parked in a given area.

Which is worse for society, a few unsolved accidents, or a total surveillance state?

Re: Re: Re: Re: Re: Re: Re: Re:

OK, I think I see the disconnect here. You're talking about what happened physically to cause the crash. I'm talking about what happened to cause the software to misbehave in a way that allowed the crash to happen - which can be caused by things that occurred long before the visible symptoms became apparent.

Getting data from immediately before the crash might be useful to determine that X was what happened, but it's useless in determining WHY X happened, if the reason is that a faulty sensor or specific unusual set of inputs triggered a memory leak that meant that the software responded much more slowly than normal to a certain warning trigger. You often need access to longer log records to make a valid examination of some issues. Note, I'm not talking months or years here, I'm only pointing out that a few moments may not be enough to fix what actually caused the crash before the next one is triggered by the same bug.

"Do you really want claims of a robbery or assault to become an excuse for the police to get information on who was near a political activist's house on a given afternoon or evening"

No, but the problem there is the system that allows them to go on fishing expeditions, not the fact that this data means there might be more fish to catch.

"That car data will not only show how you traveled, but how long your vehicle was parked in a given area."

So, the same as your mobile phone's data shows right now? Yes, you can turn it off, but most people don't.

"Which is worse for society, a few unsolved accidents, or a total surveillance state?"

If you believe that false dichotomy, you may wish to read up on the actual situation.

Re: Re: Re: Re: Re: Re: Re: Re: Re:

Hmm, safety critical real time system, if it does not detect failures in critical processes and resources in seconds it has been so badly implemented that it is not fit for use. When a system is running with millisecond deadlines on tasks, it should take less than a second for it to detect a software or sensor issue.

Similarly, any software design flaws in dealing with driving a vehicle will be visible in the data leading up to an accident, as the data the software should have reacted to will be in the sensor data, as will failure to sense an obstruction that caused an accident.

Re: Re: Re: Re: Re: Re: Re: Re: Re: Re:

You seem to be operating under the delusion that these vehicles will be operating under perfect conditions, especially relating to the condition of the vehicle over time. What they *should* be doing and what they will actually be doing are two different things. It needs to be known that certain things are happening in the wild, and like it or not there's no such thing as a perfect piece of software - there are always risks, but these can be mitigated by developers who have knowledge that such edge cases are possible.

That's not to say that a device that's not fit for purpose will be on the roads the majority of time. It's just that real world application will always introduce issues not discovered during testing, and it's better for everybody that a few minutes of additional real data is available to those tasked with fixing them, than them having their job being made difficult or impossible because someone's paranoid that they will be compromised in some way if minutes rather than seconds are retained.

As I've been saying - it's very much worth having limits to try and restrict what bad actors in industry and government have access to - but this should not be at the expense of the work that actually needs to be done to ensure that the software does not cause more issues than it should.

Re: Re: Re: Re: Re: Re:

"Any limit is arbitrary past what is needed for the immediate operation for the vehicle"

Rubbish. Plus, you're saying that any limit other than instant deletion is arbitrary?

This is why this will be a long conversation. Some people just won't accept a real conversation, they have to go for wild hyperbole.

"A hit and run case could be years old before a connection to a vehicle is made"

Well, there's certainly an argument that it won't be any longer, once every car is connected. Plus, the point here is not to keep data for crime investigation, it's to use it for troubleshooting technical issues. Stop confusing the two arguments, my point is purely that by limiting access too severely because you're paranoid of what *might* happen, you're guaranteeing that something bad *will* happen because the problems causing the issues are not allowed to be fixed.

Nobody's saying that people need years' worth of logs to investigate bugs. I'm just saying that going "3 seconds before impact is all that's needed" is equally dumb.

Re: Re:

You do not need the data from individual cars to be kept once you have located the pothole. Similarly, Lidar-based images of the road and its surroundings can be built and updated from vehicle data and then the vehicle data discarded. That is, other than accident data, there is no need to keep data identified to a vehicle for more than a few minutes.

Keeping everything indefinitely, or for prolonged periods of time, is a means of giving corporations and governments power over people.
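As a rough illustration of "locate the pothole, then discard the vehicle data", here is a small sketch. It is only an assumption about how such aggregation might work: reports arrive as (vehicle ID, location) pairs, a location counts as confirmed once enough distinct vehicles have reported it, and only the location and a count are kept. The function name `aggregate_pothole_reports`, the `min_vehicles` threshold and the sample IDs are hypothetical.

```python
from collections import defaultdict


def aggregate_pothole_reports(reports, min_vehicles=2):
    """Return {location: distinct_vehicle_count} for confirmed locations.

    `reports` is an iterable of (vehicle_id, location) pairs. Vehicle IDs
    are used only to count distinct reporters and are not in the result.
    """
    vehicles_by_location = defaultdict(set)
    for vehicle_id, location in reports:
        vehicles_by_location[location].add(vehicle_id)
    return {
        location: len(vehicles)
        for location, vehicles in vehicles_by_location.items()
        if len(vehicles) >= min_vehicles
    }


# Usage: car-a reports the same spot twice, so it is counted once.
reports = [
    ("car-a", (10, 20)),
    ("car-b", (10, 20)),
    ("car-a", (10, 20)),
    ("car-c", (30, 40)),
]
confirmed = aggregate_pothole_reports(reports)
```

Note that the raw reports still have to exist until a location is confirmed, so on a quiet road the identified data may need to be held for longer than a few minutes.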

Re: Re: Re:

That is, other than accident data, there is no need to keep data identified to a vehicle for more than a few minutes.

It might take more than a few minutes to get confirmation on low-traffic roads. Companies may want to maintain the source record until then, so they can block sources feeding false data (intentionally or not). Hypothetically, they could use existing pseudonymity protocols to do that without knowing who is driving where, even temporarily, but I see zero chance of them doing so.

Re: Lots of social value in the data

It is impossible to anonymize data like that. Numerous studies have shown that with enough information, individuals can easily be identified. This is especially true with cars where you will know the exact location of their home and work addresses.

ALL car data should be opt-in only for transmission outside of the vehicle AND even after transmission should be completely owned by the car owner. Their data can be revoked at a moment's notice with requirements that the data holder erases the data within XX days.

This data should also be illegal to be used by insurance or liability companies to modify or deny claims.

Re: Re: Lots of social value in the data

That ship has sailed. Here in the UK some insurers offer plans based on trackers in your car. The minute you deviate from the plan, they know and jack up your insurance costs. One family I know was effectively curfewed; if they drove the car after 10pm too often they'd get a call or email advising that continuing to do so would raise their premium.

Re: Re: Re: Lots of social value in the data

Those "good driver" sensor deals are a scam. They KNOW there's no way to avoid "bad" driving 100% of the time. You may need to swerve to avoid debris or faults in the surface. You may need to speed to keep up with the flow of traffic (which is safer for everybody than doing the speed limit in traffic going faster). Etc, etc. Getting one of those sensors means you WILL end up paying more for insurance.

Re: Re: Re: Re: Lots of social value in the data

Getting one of those sensors means you WILL end up paying more for insurance.

No, you're missing the other half of the scam: they raise everyone's rates, then offer a "discount" (i.e., what you were paying before) if you agree to tracking. As we've seen with mobile phones, most people will accept tracking.

Re: Lots of social value in the data

If I or my machine which I purchased collects data, the data should be under my control. Disregarding automated vehicles, what data is a car manufacturer entitled to from my vehicle? I would like control of that data, including what data is being collected and sold from MY vehicle when I bring it in for repair or maintenance. I don't want to read through 50 pages of legalese to try to figure it out either.

As to autonomous vehicles, if the car manufacturer has to collect data from an autonomous car it is no longer an autonomous vehicle.

Data should not be copyrighted; it should only be protected as a trade secret.

So let's see here, SomeEvilCorp owns the copyright on data gathered by spying upon Joe Blow ... now Joe tells his friends about how totally awesome his last trip was, detailing the things he did and saw ... only to be sued for copyright infringement of his own life.

How about if I tell my car to drive on roads in the shape of a picture? Is that not creative? Automatic output by a machine is still controlled by the user, much like pictures created in MSPaint are still created by the user.

Re:

Interesting question. Looking solely at the data, it is just factual spatial information that is collected at certain intervals. Taken as a whole, or as a collection, is the data alone recognizable as creative output or does the data have to be transformed in order to identify the creation? Unless the data was presented in a novel way, e.g., as ASCII art, it would not appear to be very pretty or creative. I think it is unlikely that the data alone, i.e. a series of numbers, would be recognized as being creative until it is transformed into a map/picture or even sound. That transformation would include additional creative choices, such as the colors used, size, etc. The data is part of the creation, but the data alone is not a creation; it is a small piece of the creative process.

Who exactly changes a flat tire on an autonomous vehicle? After it causes a wreck, how do you exchange information with one of these fucking nightmare vehicles? Who do you sue? The owner, the manufacturer, or the code writer?