from the beep-boop-beep dept

I don't want to waste any space with a long introduction, other than to say it's always incredibly frustrating when artists come up with inventive new ways to produce artwork, only to have those efforts met with stupid intellectual property issues. Experimentation is key to the artistic world and we've begun to see how artists are incorporating technology into what they produce. This should be exciting, but all too often that excitement is plagued by legal issues.

A case in point is Canadian artist Adam Basanta, who has come up with a bonkers and very cool method for both producing machine-generated art and then validating that art for human consumption by comparing it to real-world artwork made by us lowly apes. Let's start with his setup.

Broadly, Basanta’s machine has two stages: creation and validation.

Creation happens with a hardware setup that Basanta likens to a Rube Goldberg machine: two computer scanners tipped on their sides and pointed face to face, endlessly scanning each other, and the results – influenced by shifts in the room’s lighting, randomized settings and an automatically moving mouse – are interpreted by a computer and turned into colourful abstract pictures.

The second stage is validation. Another computer running a custom-built program automatically checks each image against an online database of real art made by human hands. If the machine-made image is similar to one that has been human-made, the computer dubs it a success and keeps it; if there is no match, the image is deleted forever.

If that doesn't get your heart beating a little faster, you simply don't care about art. This setup is, at the very least, incredibly interesting, and Basanta's method for validating whether the art produced by the machines is good enough for human consumption or not kicks the interest level into overdrive. His setup generates something like a thousand images a day, with a tiny fraction of that being deemed worthy of retention. The whole thing was good enough to warrant an art exhibit in Canada and Basanta has featured many of the images on his website as well.

And that's where the trouble started. Artist Amel Chamandy has alleged that Basanta violated her copyright on a piece she created called "A World Without Trees", as well as the trademark rights she has on her own name. Both claims stem from one of the pieces Basanta's machine setup used to validate its own artwork against and the naming convention it used to denote the new pieces it created.

In June, someone – it’s not clear if it was Chamandy herself or someone who works with her – did a Google search for her name and the name of a 2009 wall installation she made called Your World Without Paper.

The first result in the Google search, according to documents filed in court, was Chamandy’s website. But the second and third results pointed to Basanta’s website, because his machine had named one of its own pictures after one of hers. The offending image, some magenta lines on a field of indigo, is called: 85.81%_match: Amel Chamandy “Your World Without Paper”, 2009.
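Basanta hasn't published his code, so any reconstruction is guesswork, but the whole pipeline -- score each candidate against a corpus of human-made work, keep it only above some threshold, and name the keeper after its nearest match -- fits in a few lines. Here's a minimal sketch, assuming a perceptual-hash similarity measure (the `imagehash` library), which is almost certainly not his actual metric:

```python
# Toy version of the validation stage. The threshold, the similarity
# formula, and the use of perceptual hashes are all assumptions for
# illustration -- Basanta's actual matching method isn't public.
from pathlib import Path

import imagehash  # pip install imagehash pillow
from PIL import Image

THRESHOLD = 0.85  # minimum similarity score required to keep an image


def similarity(img_a: Path, img_b: Path) -> float:
    """Similarity in [0, 1] from the Hamming distance of 64-bit perceptual hashes."""
    hash_a = imagehash.phash(Image.open(img_a))
    hash_b = imagehash.phash(Image.open(img_b))
    return 1.0 - (hash_a - hash_b) / 64.0  # phash defaults to 8x8 = 64 bits


def validate(candidate: Path, corpus: dict[str, Path]) -> str | None:
    """Return a '<pct>_match: <title>' name if any human-made work scores
    above THRESHOLD; None means the candidate gets deleted forever."""
    best_title, best_score = None, 0.0
    for title, art_path in corpus.items():
        score = similarity(candidate, art_path)
        if score > best_score:
            best_title, best_score = title, score
    if best_score >= THRESHOLD:
        return f"{best_score:.2%}_match: {best_title}"  # e.g. "85.81%_match: ..."
    return None
```

The point of the sketch is just this: an "85.81% match" is a statement about distance under whatever metric the machine uses, not about what a human eye would call similar -- which matters for the copyright discussion below.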

The trademark claim rests solely on the name of the file including Chamandy's full name. It's a silly argument for trademark infringement as the whole point of including the name is to weigh the new art piece against her specific work, which necessarily involves anyone viewing these pieces being informed that they are not the work of the original author. The whole purpose of the validation process is to show what differentiation remains between the new piece and the human-made example. That's not trademark infringement. It's not really even close.

As for the copyright portion of this, it's important that you not be fooled by the percentage the machine setup notes in the validation process. You might think that an 85% match would mean the two images are very similar and would share a ton of features that would link the two in the viewer's mind. That's not even close to being the case, as you can see just how different the two images are below.

If that looks like copyright infringement to you, you need your head examined. Indeed, the entire setup here is defined by the fact that this is a totally independent creation -- and the "validation" process only serves to highlight that there is no copying. The idea that independent creation is a defense against copyright infringement goes back ages, and this is quite obviously an independent creation. The only reason the other artwork is mentioned at all is the literal coincidence that the computer judged the two images similar, which is what leads to the name being mentioned. Judge Learned Hand famously wrote:

... if by some magic a man who had never known it were to compose anew Keats's Ode on a Grecian Urn, he would be an "author," and, if he copyrighted it, others might not copy that poem, though they might of course copy Keats's...

This is a case where "some magic" took place, and one artist "composed anew" something that a computer (but no human eye) judged to have a decent level of similarity to another's work.

Were her name and the name of her work never mentioned on Basanta's site, she simply never would have noticed. Nor would anyone else. Ever. And, yet, because Basanta's entire project centers around pointing out the kind of quality his machine setup can produce in artwork by comparing it to real-world creations made by humans, suddenly Basanta is mired in intellectual property claims.

And that's what sucks more than anything. One artist suing another, on incredibly specious grounds, is a betrayal of how art is created in the first place. If anything, Basanta was crediting Chamandy and pointing people toward her wider works by doing things the way he did. And this is the thanks he gets, because copyright.

from the lessons-learned dept

One of the reasons why I'm so adamant about the negative impacts on free speech from making internet platforms liable for the speech of their users, or even just by pushing for greater and greater content moderation by those platforms, is that this is not a theoretical situation where we have no idea how things will play out. We have reams and reams of evidence, just in the US alone (and plenty outside of the US) by looking at the copyright world. Remember, while CDA 230 makes platforms immune from a civil lawsuit regarding content posted by users or for any moderation choices regarding that content, it exempts intellectual property law (and criminal law). On the copyright side, we have a different regime: DMCA 512. Whereas CDA 230 creates a broad immunity, DMCA 512 creates a narrower "safe harbor" where sites need to meet a list of criteria in order to be eligible for the safe harbor, and those criteria keep getting litigated over and over again. Thus, we have quite a clear parallel universe to look at concerning what happens when you make a platform liable for speech -- even if you include conditions and safe harbors.

And it's not good.

Corynne McSherry, EFF's legal director, has put together an excellent list of five lessons on content moderation from the copyright wars that should be required reading for anyone looking to upset the apple cart of CDA 230. As someone who lives and breathes this stuff every day, I find it quite incredible how frequently we hear from people who haven't looked at what happened with copyright, and who seem to think that the DMCA's regime is a perfect example of what we should replace CDA 230 with. But... that's only because they have no idea what a complete and total clusterfuck the copyright/DMCA world has been and remains. Let's dig into the lessons:

1. Mistakes will be made—lots of them

The DMCA’s takedown system offers huge incentives to service providers that take down content when they get a notice of infringement. Given the incentives of the DMCA safe harbors, service providers will usually respond to a DMCA takedown notice by quickly removing the challenged content. Thus, by simply sending an email or filling out a web form, a copyright owner (or, for that matter, anyone who wishes to remove speech for whatever reason) can take content offline.

Many takedowns target clearly infringing content. But there is ample evidence that rightsholders and others abuse this power on a regular basis—either deliberately or because they have not bothered to learn enough about copyright law to determine whether the content they object to is unlawful. At EFF, we’ve been documenting improper takedowns for many years, and highlight particularly egregious ones in our Takedown Hall of Shame.

There are all different kinds of "mistakes" embedded here as well. One kind is the truly accidental mistake, which happens quite frequently when you're dealing with billions of pieces of content. Even if those mistakes only happen 0.01% of the time, you still end up with a ton of content taken down that should remain up. That's a problem.
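The arithmetic is worth spelling out, using the 0.01% rate from the paragraph above and a round billion items as an illustrative stand-in for actual platform volume:

```python
error_rate = 0.0001      # 0.01%: an optimistically low mistake rate
items = 1_000_000_000    # a billion pieces of content (illustrative volume)

print(f"{error_rate * items:,.0f} legitimate items wrongly taken down")
# -> 100,000 legitimate items wrongly taken down
```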

But the larger issue, as Corynne points out, is abuse. If you give someone a tool to take down content, guess what they do with it? They take down content with it. They often don't care -- even a little bit -- that they're using the DMCA for non-copyright purposes. They just want the content to come down for whatever reasons, and the DMCA provides the tool. Imagine that, but on steroids, in other contexts.

2. Robots aren’t the answer

Rightsholders and platforms looking to police infringement at scale often place their hopes in automated processes. Unfortunately, such processes regularly backfire.

For example, YouTube’s Content ID system works by having people upload their content into a database maintained by YouTube. New uploads are compared to what’s in the database and when the algorithm detects a match, copyright holders are informed. They can then make a claim, forcing it to be taken down, or they can simply opt to make money from ads put on the video.

But the system fails regularly. In 2015, for example, Sebastien Tomczak uploaded a ten-hour video of white noise. A few years later, as a result of YouTube’s Content ID system, a series of copyright claims were made against Tomczak’s video. Five different claims were filed on sound that Tomczak created himself. Although the claimants didn’t force Tomczak’s video to be taken down they all opted to monetize it instead. In other words, ads on the ten-hour video could generate revenue for those claiming copyright on the static.

Again, this has been an incredibly popular claim -- most frequently from people who are totally ignorant either of how technology works (the "well, you nerds can nerd harder" kind of approach) or of what a complete clusterfuck existing solutions have been. ContentID is absolutely terrible, with a terrible error rate -- and it is, by far, the best solution out there.

And that's just in the narrow category of copyright filtering. If you expect algorithms to be able to detect much more amorphous concepts like bullying, terrorist content, hate speech and more... you're asking for much higher error rates... and a lot of unnecessary censorship.
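Content ID's matcher is proprietary, so nobody outside YouTube can reproduce it, but a toy fingerprint shows why the white-noise story above is a structural failure rather than a fluke. Assume, purely for illustration, a fingerprint built from coarse per-band spectral energy: any two independently generated noise clips have nearly identical flat spectra, so they "match" with near-perfect confidence.

```python
import numpy as np


def band_energy_fingerprint(signal: np.ndarray, n_bands: int = 32) -> np.ndarray:
    """Coarse audio fingerprint: mean spectral energy in n_bands frequency
    bands, normalized to a unit vector so the result is volume-invariant."""
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    energies = np.array([band.mean() for band in np.array_split(spectrum, n_bands)])
    return energies / np.linalg.norm(energies)


rng = np.random.default_rng(0)
clip_a = rng.normal(size=16000 * 10)  # ten seconds of white noise at 16 kHz
clip_b = rng.normal(size=16000 * 10)  # a *different*, independent ten seconds

score = band_energy_fingerprint(clip_a) @ band_energy_fingerprint(clip_b)
print(f"cosine similarity: {score:.4f}")  # ~0.999: two independent clips "match"
```

Real systems use far smarter features than this toy, but the underlying problem is the same: any fingerprint compact enough to search at scale will collide on low-information content like static.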

3. Takedown regimes need transparency and a plan for restoration

With the above in mind, every proposal and process for takedown should include a corollary plan for restoration. Here, too, copyright law and practice can be instructive. The DMCA has a counternotice provision, which allows a user who has been improperly accused of infringement to challenge the takedown; if the sender doesn't go to court, the platform can restore the content without fear of liability. But the counternotice process is pretty flawed: it can be intimidating and confusing, it does little good where the content in question will be stale in two weeks, and platforms are often even slower to restore challenged material. One additional problem with counternotices, particularly in the early days of the DMCA, was that users struggled to discover who was complaining, and the precise nature of the complaint.

The number of requests, who is making them, and how absurd they can get has been highlighted in company transparency reports. Transparency reports can both highlight extreme instances of abuse—such as in Automattic’s Hall of Shame—or share aggregate numbers. The former is a reminder that there is no ceiling to how rightsholders can abuse the DMCA. The latter shows trends useful for policymaking. For example, Twitter’s latest report shows a 38 percent uptick in takedowns since the last report and that 154,106 accounts have been affected by takedown notices. It’s valuable data to have to evaluate the effect of the DMCA, data we also need to see what effects “community standards” would have.

Equally important is transparency about specific takedown demands, so users who are hit with those takedowns can understand who is complaining, and about what. For example, a remix artist might include multiple clips in a single video, believing they are protected fair uses. Knowing the nature of the complaint can help her revisit her fair use analysis, and decide whether to fight back.

Another point on this is that, incredibly, those who support greater content policing are frequently very much against greater transparency. In the copyright space, the legacy companies have often hit back against transparency and complained about things like the Lumen Database and the fact that Google publishes details of takedown demands in its transparency reports -- with bogus claims about how such transparency only helps infringers. I fully expect that if we gut CDA 230, we'll see similarly silly arguments against transparency.

4. Abuse should lead to real consequences

Congress knew that Section 512’s powerful incentives could result in lawful material being censored from the Internet without prior judicial scrutiny. To inhibit abuse, Congress made sure that the DMCA included a series of checks and balances, including Section 512(f), which gives users the ability to hold rightsholders accountable if they send a DMCA notice in bad faith.

In practice, however, Section 512(f) has not done nearly enough to curb abuse. Part of the problem is that the Ninth Circuit Court of Appeals has suggested that the person whose speech was taken down must prove to a jury the subjective belief of the censor—a standard that will be all but impossible for most to meet, particularly if they lack the deep pockets necessary to litigate the question. As one federal judge noted, the Ninth Circuit’s “construction eviscerates § 512(f) and leaves it toothless against frivolous takedown notices.” For example, some rightsholders unreasonably believe that virtually all uses of copyrighted works must be licensed. If they are going to wield copyright law like a sword, they should at least be required to understand the weapon.

This gets back to the very first point. One of the reasons why the DMCA gets so widely abused any time anyone wants to take down anything is because there's basically zero penalty for filing a bogus report. There is merely the theoretical penalty found in 512(f), which the courts have effectively killed off in most (hopefully not all) cases. But, again, it's even worse than it seems. Even if there were a better version of 512(f) in place, how many people really want to go through the process of suing someone just because they took down their cat videos or whatever? In many cases, the risk is exceptionally low because even if there were a possible punishment, no one would even pursue it. So if we're going to set up a system that can be abused -- as any of these systems can be -- there should be at least some weight and substance behind dealing with abusive attempts to censor content.

5. Speech regulators will never be satisfied with voluntary efforts

Platforms may think that if they "voluntarily" embrace the role of speech police, governments and private groups will back off and they can escape regulation. As Professor Margot Kaminski observed in connection with the last major effort to push through new copyright enforcement mechanisms, voluntary efforts never satisfy would-be speech censors:

Over the past two decades, the United States has established one of the harshest systems of copyright enforcement in the world. Our domestic copyright law has become broader (it covers more topics), deeper (it lasts for a longer time), and more severe (the punishments for infringement have been getting worse).

… We guarantee large monetary awards against infringers, with no showing of actual harm. We effectively require websites to cooperate with rights-holders to take down material, without requiring proof that it's infringing in court. And our criminal copyright law has such a low threshold that it criminalizes the behavior of most people online, instead of targeting infringement on a true commercial scale.

In addition, as noted, the large platforms adopted a number of mechanisms to make it easier for rightsholders to go after allegedly infringing activities. But none of these legal policing mechanisms have stopped major content holders from complaining, vociferously, that they need new ways to force Silicon Valley to be copyright police. Instead, so-called "voluntary efforts" end up serving as a basis for regulation. Witness, for example, the battle to require companies to adopt filtering technologies across the board in the EU, free speech concerns be damned.

Over and over we've tried to point this out to the platforms that have bent over backwards to please Hollywood. It's never, ever enough. Even when the platforms basically do exactly what was asked of them, those who wish to censor the networks always, always, always ask for more. That's exactly what will happen if we gut CDA 230.

Again, it's worrisome that the debates over changing intermediary liability rules don't seem to even acknowledge the vast amount of empirical evidence we have in the copyright context. And that's maddening, given that the copyright world shows just how broken the system can become.

from the what's-in-a-name dept

Maybe someday AI will be sophisticated, nuanced, and accurate enough to help us with platform content moderation, but that day isn't today.

Today it prevents an awful lot of perfectly normal and presumably TOS-abiding people from even signing up for platforms. A recent tweet from someone unable to sign up to use an app because it didn't like her name, as well as many, many, MANY replies from people who've had similar experiences, drove this point home:

You're right in there with Alan Cumming, the actor, whose name was autocensored by the late City of Heroes MMO's official forums. (The COH forums also auto-nixed Dick Grayson, which was... amusing... on a forum where superheroes got discussed a lot.)

This dynamic is what's known as the Scunthorpe Problem. Scunthorpe is a town in the UK whose residents have had an appallingly difficult time using the Internet due to a naughty word being contained within the town name.

The Scunthorpe problem is the blocking of e-mails, forum posts or search results by a spam filter or search engine because their text contains a string of letters that are shared with another (usually obscene) word. While computers can easily identify strings of text within a document, broad blocking rules may result in false positives, causing innocent phrases to be blocked.

The problem was named after an incident in 1996 in which AOL's profanity filter prevented residents of the town of Scunthorpe, North Lincolnshire, England from creating accounts with AOL, because the town's name contains the substring cunt. Years later, Google's opt-in SafeSearch filters apparently made the same mistake, preventing residents from searching for local businesses that included Scunthorpe in their names.

(A related dynamic, the Clbuttic Problem, creates issues of its own when, instead of outright blocking, software automatically replaces the allegedly naughty words with ostensibly less-naughty words instead. People attempting to discuss such non-prurient topics as Buttbuttin's Creed and the Lincoln Buttbuttination find this sort of officious editing particularly unhelpful…)
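Both dynamics fall out of a few lines of naive string matching. A minimal reproduction (the blocklist here is illustrative, not any real filter's list, but it's not far from how early filters actually behaved):

```python
import re

# Illustrative blocklist only -- not taken from any actual product.
BLOCKLIST = ["cunt", "cum", "dick", "ass"]


def naive_block(text: str) -> bool:
    """The Scunthorpe problem: flag any substring hit, context be damned."""
    lowered = text.lower()
    return any(word in lowered for word in BLOCKLIST)


def naive_replace(text: str) -> str:
    """The Clbuttic problem: rewrite the 'naughty' substring instead of blocking."""
    return re.sub("ass", "butt", text, flags=re.IGNORECASE)


for name in ["Scunthorpe", "Alan Cumming", "Dick Grayson", "Jane Smith"]:
    print(f"{name}: {'BLOCKED' if naive_block(name) else 'ok'}")

print(naive_replace("Assassin's Creed"))  # -> "buttbuttin's Creed"
```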

While examples of these dynamics can be amusing, each is also quite chilling to speech, and to speakers wishing to speak.

With the last name ‘Dicks’, I have to remind people to check their spam folder more often than a Nigerian prince.

It's not something we should be demanding more of, but every time people call for "AI" as a solution to online content challenges, these are the censorship problems the call invites.

A big part of the problem is that calls for "AI" tend to treat it like some magical incantation, as if just adding it will solve all our problems. But in the end, AI is just software. Software can be very good at doing certain things, like finding patterns, including patterns in words (and people's names…). But it's not good at necessarily knowing what to make of those patterns.

More sophisticated software may be better at understanding context, or even sometimes learning context, but there are still limits to what we can expect from these tools. They are at best imperfect reflections of the imperfect humans who created them, and it's a mistake to forget that they have not yet replicated, or replaced, human judgment, which itself is often imperfect.

Which is not to say that there is no role for software to help in content moderation. The things that software is good at can make it an important tool to help support human decision-making about online content, especially at scale. But it is a mistake to expect software to supplant human decision-making. Because, as we see from these accruing examples, when we over-rely on them, it ends up being real humans that we hurt.

Had this on a website for the kids, the kids demanded to know why, our last name is ‘Clithero’ Interesting conversation. 😳

from the locking-up-life dept

The world is slowly but surely marching towards newer and better forms of artificial intelligence, with some of the world's most prominent technology companies and governments heavily investing in it. While limited or specialist AI is the current focus of many of these companies, building what is essentially single-trick intelligent systems to address limited problems and tasks, the real prize at the end of this rainbow is an artificial general intelligence. When an AGI could be achieved is still squarely up in the air, but many believe this to be a question of when, not if, such an intelligence is created. Surrounding that are questions of ethics that largely center on whether an AGI would be truly sentient and conscious, and what that would imply about our obligations to such a mechanical being.

Andrei Iancu, director of the U.S. Patent and Trademark Office (USPTO), says that the courts have strayed on the issue of patent eligibility, including signaling he thought algorithms using artificial intelligence were patentable as a general proposition.

That came in a USPTO oversight hearing Wednesday (April 18) before a generally supportive Senate Judiciary Committee panel.

Both Iancu and the legislators were in agreement that more clarity was needed in the area of computer-related patents, and that PTO needed to provide more precedential opinions when issuing patents so it was not trying to reinvent the wheel each time and to better guide courts.

On some level, even without considering the kind of AI and AGI once thought the stuff of science fiction, the general question of patenting algorithms is absurd. Algorithms, after all, are essentially a manipulated form of math, far different from true technological expression or physical invention. They are a way to make equations for various functions, including, potentially, equations that would both govern AI and allow AI to learn and evolve in a way not so governed. However ingenious they might be, they are no more an invention than the process human cells use to pass along DNA became an invention once human beings discovered it. It's far more discovery than invention, if it's invention at all. Man is now trying to organize mathematics in such a way as to create intelligence, but that is not the same as inventing the math.

Yet both the USPTO and some in government seem to discard this question for arguments based on mere economic practicality.

Sen. Kamala Harris drilled down on those Supreme Court patent eligibility decisions -- Alice and Mayo, among them -- in which the court suggested algorithms used in artificial intelligence (AI) might be patentable. She suggested that such a finding would provide incentive for inventors to pursue the kind of AI applications being used in important medical research.

Iancu said that generally speaking, algorithms were human made and the result of human ingenuity rather than the mathematical representations of the discoveries of laws of nature -- E=MC2 for example -- which were not patentable. Algorithms are not set from time immemorial or "absolutes," he said. They depend on human choices, which he said differs from E=MC2 or the Pythagorean theorem, or from a "pattern" being discovered in nature.

Again, this seems to be a misunderstanding of what an algorithm is. The organization and ordering of a series of math equations is not human invention. It is most certainly human ingenuity, but so was the understanding of the Bernoulli principle, which likewise didn't result in a patent on the math that makes airplanes fly. Allowing companies and researchers to lock up the mathematical concepts for artificial intelligence, whatever the expected incentivizing benefits, is pretty clearly beyond the original purpose and scope of patent law.

But let's say the USPTO and other governments ignore that argument. Keep in mind that algorithms that govern the behavior of AI are mirrors of the intelligent processes occurring in human brains. They are that which will make up the "I" for an AI, essentially making it what it is. Once we reach the level of AGI, it's reasonable to consider those algorithms to be the equivalent of the brain function and, by some arguments, consciousness of a mechanical or digital being. Were the USPTO to have its way, that consciousness would be patentable. For those who believe we might one day be the creators of some form of digital life or consciousness, that entire concept is absurd, or at least terribly unethical.

Such cavalier conversations about patenting the math behind potentially true AGI deserve far more thought than a blanket assertion that such algorithms are generally patentable.

from the calm-down,-guys dept

You may recall the years we've spent over the ridiculous monkey selfie story, concerning whether or not there was a copyright in a selfie taken by a monkey (there is not) and if there is (again, there is not) whether it's owned by the monkey (absolutely not) or the camera owner (still no). But one of the points that we raised was to remind people that not every bit of culture needs to be locked up under copyright. It's perfectly fine to have new works enter the public domain. So much of the confusion over the whole monkey selfie thing is that so many people have this weird belief that every new piece of content simply must have a copyright. Indeed, during the PETA legal arguments in trying to claim the copyright on behalf of the monkey, they basically took it as given that a copyright existed, and felt the only fight was over who got to hold it: the camera owner or the monkey.

As we mentioned a few times throughout that ordeal, it really appeared that PETA's lawyers at the hotshot (and formerly respectable) law firm of Irell & Manella had taken on the case to establish some credibility on the issue of non-human-generated works and copyright. There isn't likely to be a rush of animal selfies (though there just was a pretty damn awesome penguin selfie -- no one tell PETA), but there are going to be a whole bunch of questions in the very, very near future concerning copyright and works generated by artificial intelligence. If you look, there are already many, many law review articles, papers, think pieces and such on whether or not AI-generated works deserve copyright, and some of these go back decades (shout out to Pam Samuelson's prescient 1985 paper: Allocating Ownership Rights in Computer-Generated Works).

But now many of these questions are becoming reality, and some lawyers are freaking out. Case in point: an article in Lexology recently by two Australian lawyers, John Hannebery and Lachlan Sadler, in which they seem quite disturbed about the copyright questions related to the new Clips camera from Google. In case you haven't heard about it (and I'll confess this article was the first I'd found out about it), Clips is a tiny camera that you "clip" somewhere while action is happening and it uses AI to try to take a bunch of good pictures. Sounds interesting enough, if it actually works.

But, as these lawyers note, it's not clear there's any copyright for users of the device, and there almost certainly isn't in Australia where they practice:

Under the Australian Copyright Act, subject to certain exceptions, copyright in an artistic work is owned by the author, which, in relation to a photograph, is "the person who took the photograph". Therefore, as simple as that, the owner of a Clip (or similar product) which takes photos by AI will not own copyright under Australian law, as they are not the person who "took" the photos.

Unfortunately for robots everywhere however, neither will the AI. As you might have noticed in the above quote, it is the person who took the photo who owns the copyright. While "person" is not defined in the Copyright Act, it is defined in the Acts Interpretation Act (which governs the interpretation of legislation), which provides that it includes an individual, body politic, or body corporate but not, by implication, a machine.

Therefore, the answer is that, under Australia law, no-one will own copyright in photos taken by AI. The photos simply will not be protected by copyright in Australia, as they do not have an "author" within the meaning of the Copyright Act. The Australian Federal Court reached a similar conclusion when it ruled that information sheets arranged by a computer program did not attract copyright protection.

A similar analysis almost certainly applies to the US and a bunch of other countries (including Spain and Germany) where the law is pretty clear that non-humans don't get copyright. As that and other articles note, there are some countries (including New Zealand, India, Hong Kong and the UK) which have specifically updated their copyright laws to include a new form of copyright for computer generated works (it varies, but basically giving the copyright to whichever person was most involved in the process -- which opens up a whole different can of worms).

But what struck me about the article by Hannebery and Sadler is that they don't even stop to consider why we might not want every new work to be covered by copyright. It's not even up for discussion in their piece. They just insist that the lack of copyright must be a problem and demand that Australia amend its copyright laws to fix it, without ever bothering to explain why it's a problem:

Refusing to afford computer-generated works copyright protection is likely to become more and more problematic, as artificial intelligence develops at a mind-boggling rate and we start seeing artistic works (like paintings, music, and even novels) created by machines.

Eventually, Australian lawmakers will have to address this issue. This may mean adopting an approach similar to that of the UK and New Zealand, whereby copyright ownership is granted to (most likely) the creator/owner of the computer program which authored the work. The alternate approach of granting copyright ownership to computer programs would of course be radical, but is certainly not outside the realm of possibility as technology continues to develop.

Notice how the lack of copyright is declared to be "problematic," and the only debate, it appears, is between whether the owner of the system should get the copyright, or the programmer of the AI.

But that's silly. As we wrote all those years ago, not everything needs copyright. Indeed, even for most of the modern world, we didn't automatically copyright all works of creation until relatively recently. In my case, here in the US, it was still in my lifetime that we assumed most works were in the public domain and only granted copyright to the small percentage that decided to register.

It's just in the past couple of decades -- often driven by special interests who have built entire industries on sucking up copyrights and restricting competition with them -- that we've reached a world where the idea of content without copyright is somehow "problematic." But it's not problematic and it shouldn't be, and we should get past the brainwashing of the legacy copyright players, and recognize that not everything needs copyright, and AI-generated works most certainly do not.

In that article we wrote years back, there's a quote from Sherwin Siy explaining why it's unfortunate that the meaning of the public domain has changed so drastically in just the past few decades:

This is the definition of the public domain—things that are not protected by copyright. We’re used to thinking of the public domain as consisting of things that were in copyright and then aged out of it after a length of time, but that’s just a part of it. There’s also works created by the federal government, and things that simply can’t be protected—like ideas, methods of operation, or discoveries.

But, because legacy copyright interests have so thoroughly drilled into so many people's heads that everything must be covered by copyright, and everything must be owned, and everything must be locked down, some people seem unwilling to even consider that the world might not fall apart if some content is never under copyright. As we've seen in lots of areas where that's the case, those industries often thrive and grow more rapidly than those encumbered with legacy protections in the form of copyright.

Hopefully, as more and more AI-generated content exists, we resist the urge to lump it all under an outdated 18th century concept that simply isn't needed to create "incentives" for a computer to generate new works.

from the everyone-error dept

In the wake of a Tempe, Arizona woman being struck and killed by an Uber autonomous vehicle, there has been a flurry of information coming out about the incident. Despite that death being one of eleven pedestrian deaths in the Phoenix area that week alone, and the only one involving an AV, the headlines were far closer to the "Killer Car Kills Woman" sort than they should have been. Shortly after the crash, the Tempe Police Chief went on the record suggesting that the victim had at least some culpability in the incident, having walked outside of the designated crosswalk, and that the entire thing would have been difficult for either human or AI to avoid.

Strangely, now that the video from Uber's onboard cameras has been released, the Tempe police are trying to walk that back and suggest that reports of the Police Chief's comments were taken out of context. That is likely because the video footage shows that claims that the victim "darted out" in front of the car are completely incorrect.

Contrary to earlier reports from Tempe’s police chief that Herzberg “abruptly” darted out in front of the car, the video shows her positioned in the middle of the road lane before the crash.

Based on the exterior video clip, Herzberg comes into view—walking a bicycle across the two-lane road—at least two seconds before the collision.

Analysis from Bryan Walker Smith, a professor at the University of South Carolina who has studied autonomous vehicle technology, indicates that this likely represents a failure of the AV's detection systems, and that there may indeed have been enough time for the collision to be avoided if everything had worked properly.

Walker Smith pointed out that Uber’s LIDAR and radar equipment “absolutely” should’ve detected Herzberg on the road “and classified her as something other than a stationary object.”

“If I pay close attention, I notice the victim about 2 seconds before the video stops,” he said. “This is similar to the average reaction time for a driver. That means an alert driver may have at least attempted to swerve or brake.”
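The reports don't state the car's speed, so treat the following as back-of-the-envelope illustration only: assuming roughly 40 mph, a one-second reaction time, and hard braking, two seconds of warning doesn't guarantee a full stop, but it sheds a lot of speed.

```python
# Back-of-the-envelope kinematics. The 40 mph speed, 1 s reaction time and
# 7 m/s^2 deceleration are assumptions for illustration, not facts from
# the investigation.
v0 = 40 * 0.447    # ~40 mph in m/s
warning = 2.0      # seconds of visibility before impact (from the quote above)
reaction = 1.0     # rough average driver reaction time, seconds
decel = 7.0        # hard braking on dry asphalt, m/s^2

available = v0 * warning                            # road available (~35.8 m)
stop_needed = v0 * reaction + v0**2 / (2 * decel)   # react, then brake to zero

braking_dist = available - v0 * reaction            # road left after reacting
v_impact = max(0.0, v0**2 - 2 * decel * braking_dist) ** 0.5

print(f"distance available: {available:.1f} m, full stop needs: {stop_needed:.1f} m")
print(f"speed at impact with braking: {v_impact / 0.447:.0f} mph (vs 40 mph)")
# -> a full stop falls ~5 m short, but braking cuts impact speed roughly in half
```

Which is consistent with Smith's point: an alert driver probably couldn't have stopped entirely, but "may have at least attempted to swerve or brake."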

The problem, of course, is that AVs are in part attractive because drivers far too often are not alert. They are texting, playing with their phones, fiddling with the radio, or looking around absently. We are human, after all, and we fail to remain attentive with stunning regularity.

So predictable is this failure, in fact, that it shouldn't surprise you all that much that the video apparently shows the safety operator behind the wheel of this particular Uber vehicle to have been distracted by any number of things.

A safety operator was behind the wheel, something customary in most self-driving car tests conducted on public roads, in the event the autonomous tech fails. Prior to the crash, footage shows the driver—identified as 44-year-old Rafaela Vasquez—repeatedly glancing downward, and is seen looking away from the road right before the car strikes Herzberg.

So the machine might have failed. The human behind the wheel might have failed. The pedestrian may have been outside the crosswalk. These situations are as messy and complicated as we should all expect them to be. Even if the LIDAR system did not operate as expected, the human driver that critics of AVs want behind the wheel instead was there, and that didn't prevent the unfortunate death of this woman.

So, do we have our first pedestrian death by AV? Kinda? Maybe?

Should this one incident turn us completely off to AVs in general? Hell no.

from the don't-be-evil,-but-AI-first? dept

The present global verve about artificial intelligence (AI) and machine learning technologies has resonated in China as much as anywhere on earth. With the State Council’s issuance of the "New Generation Artificial Intelligence Development Plan" on July 20 [2017], China's government set out an ambitious roadmap including targets through 2030. Meanwhile, in China's leading cities, flashy conferences on AI have become commonplace. It seems every mid-sized tech company wants to show off its self-driving car efforts, while numerous financial tech start-ups tout an AI-driven approach. Chatbot startups clog investors' date books, and Shanghai metro ads pitch AI-taught English language learning.

That's from a detailed analysis of China's new AI strategy document, produced by New America, which includes a full translation of the development plan. Part of AI's hotness is driven by all the usual Internet giants piling in with lots of money to attract the best researchers from around the world. One of the companies that is betting on AI in a big way is Google. Here's what Sundar Pichai wrote in his 2016 Founders' Letter:

Looking to the future, the next big step will be for the very concept of the "device" to fade away. Over time, the computer itself -- whatever its form factor -- will be an intelligent assistant helping you through your day. We will move from mobile first to an AI first world.

Google acted on that strategy in December 2017, announcing the opening of the Google AI China Center in Beijing. From the announcement:

This Center joins other AI research groups we have all over the world, including in New York, Toronto, London and Zurich, all contributing towards the same goal of finding ways to make AI work better for everyone.

Focused on basic AI research, the Center will consist of a team of AI researchers in Beijing, supported by Google China's strong engineering teams.

So far, so obvious. But an interesting article on the Macro Polo site points out that there's a problem with AI research in China. It flows from the continuing roll-out of intrusive surveillance technologies there, as Techdirt has discussed in numerous posts. The issue is this:

Many, though not all, of these new surveillance technologies are powered by AI. Recent advances in AI have given computers superhuman pattern-recognition skills: the ability to spot correlations within oceans of digital data, and make predictions based on those correlations. It's a highly versatile skill that can be put to use diagnosing diseases, driving cars, predicting consumer behavior, or recognizing the face of a dissident captured by a city's omnipresent surveillance cameras. The Chinese government is going for all of the above, making AI core to its mission of upgrading the economy, broadening access to public goods, and maintaining political control.

As the Macro Polo article notes, Google is unlikely to allow any of its AI products or technologies to be sold directly to the authorities for surveillance purposes. But there are plenty of other ways in which advances in AI produced at Google's new lab could end up making life for Chinese dissidents, and for ordinary citizens in Xinjiang and Tibet, much, much worse. For example, the fierce competition for AI experts is likely to see Google's Beijing engineers headhunted by local Chinese companies, where knowledge can and will flow unimpeded to government departments. Although arguably Chinese researchers elsewhere -- in the US or Europe, for example -- might also return home, taking their expertise with them, there's no doubt that the barriers to doing so are higher in that case.

So does that mean that Google is wrong to open up a lab in Beijing, when it could simply have expanded its existing AI teams elsewhere? Is this another step toward re-entering China after it shut down operations there in 2010 over the authorities' insistence that it should censor its search results -- which, to its credit, Google refused to do? "AI first" is all very well, but where does "Don't be evil" fit into that?

from the perhaps-AI-can-help-us-deal-with-AI dept

Most people don't understand the nuances of artificial intelligence (AI), but at some level they comprehend that it'll be big, transformative and cause disruptions across multiple sectors. And even if AI proliferation won't lead to a robot uprising, Americans are worried about how AI and automation will affect their livelihoods.

Recognizing this anxiety, our policymakers have increasingly turned their attention to the subject. In the 115th Congress, there have already been more mentions of “artificial intelligence” in proposed legislation and in the Congressional Record than ever before.

While not everyone agrees on how we should approach AI regulation, one approach that has gained considerable interest is augmenting the federal government's expertise and capacity to tackle the issue. In particular, Sen. Brian Schatz has called for a new commission on AI; and Sen. Maria Cantwell has introduced legislation setting up a new committee within the Department of Commerce to study and report on the policy implications of AI.

This latter bill, the “FUTURE of Artificial Intelligence Act” (S. 2217/H.R. 4625), sets forth a bipartisan proposal that seems to be gaining some traction. While the bill's sponsors should be commended for taking a moderate approach in the face of growing populist anxiety, it's not clear that the proposed advisory committee would be particularly effective at all it sets out to do.

One problem with the bill is how it sets the definition of AI as a regulatory subject. For most of us, it's hard to articulate precisely what we mean when we talk about AI. The term “AI” can describe a sophisticated program like Apple's Siri, but it can also refer to Microsoft's Clippy, or pretty much any kind of computer software.

It turns out that AI is a difficult thing to define, even for experts. Some even argue that it's a meaningless buzzword. While this is a fine debate to have in the academy, prematurely enshrining a definition in a statute -- as this bill does -- is likely to be the basis for future policy (indeed, another recent bill offers a totally different definition). Down the road, this could lead to confusion and misapplication of AI regulations. This provision also seems unnecessary, since the committee is empowered to change the definition for its own use.

The committee's stated goals are also overly ambitious. In the course of a year and a half, it would set out to “study and assess” over a dozen different technical issues, from economic investment, to worker displacement, to privacy, to government use and adoption of AI (although, notably, not defense or cyber issues). These are all important issues. However, the expertise required to adequately deal with these subjects is likely beyond the capabilities of the committee's 19 voting members, which include only five academics. While the committee could theoretically choose to focus on a narrower set of topics in its final report, this structure is fundamentally not geared towards producing the sort of deep analysis that would advance the debate.

Instead of trying to address every AI-related policy issue with one entity, a better approach might be to build separate, specialized advisory committees based in different agencies. For instance, the Department of Justice might have a committee on using AI for risk assessment, the General Services Administration might have a committee on using AI to streamline government services and IT infrastructure, and the Department of Labor might have a committee on worker displacement caused by AI and automation or on using AI in employment decisions. While this approach risks some duplicative work, it would also be much more likely to produce deep, focused analysis relevant to specific areas of oversight.

Of course, even the best public advisory committees have limitations, including politicization, resource constraints and compliance with the Federal Advisory Committee Act. However, not all advisory bodies have to be within (or funded by) government. Outside research groups, policy forums and advisory committees exist within the private sector and can operate beyond the limitations of government bureaucracy while still effectively informing policymakers. Particularly for those issues not directly tied to government use of AI, academic centers, philanthropies and other groups could step in to fill this gap without any need for new public expenditures or enabling legislation.

If Sen. Cantwell's advisory committee-focused proposal lacks robustness, Sen. Schatz's call for creating a new “independent federal commission” with a mission to “ensure that AI is adopted in the best interests of the public” could go beyond the bounds of political possibility. To his credit, Sen. Schatz identifies real challenges with government use of AI, such as those posed by criminal justice applications, and in coordinating between different agencies. These are real issues that warrant thoughtful solutions. Nonetheless, the creation of a new agency for AI is likely to run into a great deal of pushback from industry groups and the political right (like similar proposals in the past), making it a difficult proposal to move forward.

Beyond creating a new commission or advisory committees, the challenge of federal expertise in AI could also be substantially addressed by reviving Congress' Office of Technology Assessment (which I discuss in a recent paper with Kevin Kosar). Reviving OTA has a number of advantages: OTA ran effectively for years and still exists in statute, it isn't a regulatory body, it is structurally bipartisan and it would have the capacity to produce deep-dive analysis in a technology-neutral manner. Indeed, there's good reason to strengthen the First Branch first, since Congress is ultimately responsible for making the legal frameworks governing AI as well as overseeing government usage.

Lawmakers are right to characterize AI as a big deal. Indeed, there are trillions of dollars in potential economic benefits at stake. While the instincts to build expertise and understanding first make for a commendable approach, policymakers will need to do it the right way -- across multiple facets of government -- to successfully shape the future of AI without hindering its transformative potential.

from the no-social-graph-required dept

Techdirt has been exploring the important questions raised by so-called "fake news" for some time. A new player in the field of news aggregation brings with it some novel issues. It's called TopBuzz, and it comes from the Chinese company Toutiao, whose rapid rise is placing it alongside the country's more familiar "BAT" Internet giants -- Baidu, Alibaba and Tencent. It's currently expanding its portfolio in the West: recently it bought the popular social video app Musical.ly for about $800 million:

Toutiao aggregates news and videos from hundreds of media outlets and has become one of the world's largest news services in the span of five years. Its parent company [Bytedance] was valued at more than $20 billion, according to a person familiar with the matter, on par with Elon Musk's SpaceX. Started by Zhang Yiming, it's on track to pull in about $2.5 billion in revenue this year, largely from advertising.

Toutiao, one of the flagship products of Bytedance, may be the largest app you’ve never heard of -- it's like every news feed you read, YouTube, and TechMeme in one. Over 120M people in China use it each day. Yet what's most interesting about Toutiao isn't that people consume such varied content all in one place... it's how Toutiao serves it up. Without any explicit user inputs, social graph, or product purchase history to rely on, Toutiao offers a personalized, high quality-content feed for each user that is powered by machine and deep learning algorithms.
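Bytedance's actual models are proprietary and far more sophisticated, but the basic shape of implicit-feedback personalization is easy to sketch: embed each article as a vector, nudge a user profile toward whatever the user lingers on, and rank the feed by affinity. Everything below -- the embeddings, the dwell-time signal, the learning rate -- is an illustrative assumption:

```python
import numpy as np

rng = np.random.default_rng(1)
N_FEATURES = 16

# Hypothetical article embeddings (in practice: output of a content model).
articles = {
    "celebrity gossip": rng.normal(size=N_FEATURES),
    "local sports": rng.normal(size=N_FEATURES),
    "election shocker": rng.normal(size=N_FEATURES),
    "tech ipo analysis": rng.normal(size=N_FEATURES),
}

user_profile = np.zeros(N_FEATURES)  # no sign-up survey, no social graph


def record_read(title: str, dwell_seconds: float, lr: float = 0.2) -> None:
    """Implicit feedback: pull the profile toward items the user lingers on."""
    global user_profile
    weight = min(dwell_seconds / 30.0, 1.0)  # cap the dwell-time signal
    user_profile += lr * weight * (articles[title] - user_profile)


def ranked_feed() -> list[str]:
    """Rank every article by dot-product affinity with the learned profile."""
    return sorted(articles, key=lambda t: -(articles[t] @ user_profile))


record_read("election shocker", dwell_seconds=45)  # read the whole thing
record_read("celebrity gossip", dwell_seconds=2)   # bounced immediately
print(ranked_feed())  # "election shocker"-like items float to the top
```

Note what the loop optimizes for: dwell time, not accuracy. That is the seed of the problem described next.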

However, as people are coming to appreciate, over-dependence on algorithmic personalization can lead to a rapid proliferation of "fake news" stories. A post about TopBuzz on the Technode site suggests this could be a problem for the Chinese service:

What's been my experience? Well, simply put, it's been a consistent and reliable multi-course meal of just about every variety of fake news.

The post goes on to list some of the choice stories that TopBuzz's AI thought were worth serving up:

Roy Moore Sweeps Alabama Election to Win Senate Seat

Yoko Ono: "I Had An Affair With Hillary Clinton in the '70s"

John McCain's Legacy is DEMOLISHED Overnight As Alarming Scandals Leak

The post notes that Bytedance is aware of the problem of blatantly false stories in its feeds, and the company claims to be using both its artificial intelligence tools as well as user reports to weed them out. It says that "when the system identifies any fake content that has been posted on its platform, it will notify all who have read it that they had read something fake." But:

this is far from my experience with TopBuzz. Although I receive news that is verifiably fake on a near-daily basis, often in the form of push notifications, I have never once received a notification from the app informing me that Roy Moore is in fact not the new junior senator from Alabama, or that Hillary Clinton was actually not Yoko Ono's sidepiece when she was married to John Lennon.

The use of highly-automated systems, running on server farms in China, represents new challenges beyond those encountered so far with Facebook and similar social media, where context and curation are being used to an increasing degree to mitigate the potential harm of algorithmic newsfeeds. The fact that a service like TopBuzz is provided by systems outside the control of the US or other Western jurisdictions poses additional problems. As deep-pocketed Chinese Internet companies seek to expand outside their home markets, bringing with them their own approaches and legal frameworks, we can expect these kinds of issues to become increasingly thorny. We are also likely to see those same services begin to wrestle with some of the same problems currently being tackled in the West.

from the beware-the-innovations-you-kill dept

We've written a few times about the GDPR -- the EU's General Data Protection Regulation -- which was approved two years ago and is set to go into force on May 25th of this year. There are many things in there that are good to see -- in large part improving transparency around what some companies do with all your data, and giving end users some more control over that data. Indeed, we're curious to see how the inevitable lawsuits play out and whether they will lead companies to be more considerate in how they handle data.

However, we've also noted, repeatedly, our concerns about the wider impact of the GDPR, which appears to go way too far in some areas, in which decisions were made that may have made sense in a vacuum, but where they could have massive unintended consequences. We've already discussed how the GDPR's codification of the "Right to be Forgotten" is likely to lead to mass censorship in the EU (and possibly around the globe). That fear remains.

But, it's also becoming clear that some potentially useful innovation may not be able to work under the GDPR. A recent NY Times article that details how various big tech companies are preparing for the GDPR has a throwaway paragraph in the middle that highlights an example of this potential overreach. Specifically, Facebook is using AI to try to detect when someone may be planning to harm themselves... but it won't launch that feature in the EU out of a fear that it would breach the GDPR as it pertains to "medical" information. Really.

Last November, for instance, the company unveiled a program that uses artificial intelligence to monitor Facebook users for signs of self-harm. But it did not open the program to users in Europe, where the company would have had to ask people for permission to access sensitive health data, including about their mental state.

Now... you can argue that this is actually a good thing. Maybe we don't want a company like Facebook delving into our mental states. You can probably make a strong case for that. But... there's also something to the idea of preventing someone who may harm or kill themselves from doing so. And that's something that feels like it was not considered much by the drafters of the GDPR. How do you balance these kinds of questions, where there are certain innovations that most people probably want, and which could be incredibly helpful (indeed, potentially saving lives), but which don't fit with how the GDPR is designed to "protect" data privacy? Is data protection in this context more important than the life of someone who is suicidal? These are not easy calls, but it's not clear at all that the drafters of the GDPR even took these tradeoff questions into consideration -- and that should worry those of us who are excited about potential innovations to improve our lives, and who worry about what may never see the light of day because of these rules.

That's not to say that companies should be free to do whatever they want. There are, obviously, LOTS of reasons to be concerned and worried about just how much data some large companies are collecting on everyone. But it frequently feels like people are acting as if any data collection is bad, and thus needs to be blocked or stopped, without taking the time to recognize just what kinds of innovations we may lose.