s3e29: Tiny Brains Mewling In All The Things

by danhon

0.0 Station Ident

3:48pm on Thursday October 20, 2016 at the XOXO Outpost, listening to The Promise, by When in Rome. Thanks to everyone who wrote back after yesterday’s episode – all the replies, even the tiny short ones, mean a lot. One of the big things I’ve been… struggling with? Thinking about? …is the way in which this newsletter really started out as a way for me to get stream-of-consciousness thoughts out and to record, well, the things that had caught my attention. I continually protested right from the start that I wasn’t writing it for *you*, I was writing it for *me*, and I’d do things like turn off subscribe/unsubscribe notifications to make sure that the little (ok, large) external-validation part of me didn’t get stimulated into a frenzy.

At some point, it became harder to deal with the whole it’s-not-you-it’s-me aspect of this whole writing exercise. It’s not like I’m *not* aware that this newsletter has 2,102 subscribers (which is… a lot, I think? But not *that* many? Like, it’s a solid B, maybe B- in the Leagues Of Newsletters, even if I did get a nice mug from Tinyletter once. I mean, I’m no Caitlin Dewey, Laura Olin or Deb Chachra (you should subscribe to their newsletters[0, 1, 2])). It’s not like my worth as a person is directly correlated with the number of newsletter subscribers I have. No, that’s just a bad Black Mirror spec script. No, this is the age-old problem of having developed an audience (for certain sizes of audience) and then trying to figure out: am I doing this for them? Do I have to make them happy? What is it they want from me? Or staying true to the original organizing principle of: fuck you guys, I’m just writing whatever’s in my head.

If you’ll allow me this indulgence, part of the issue with that, in the unreasonable part of my brain, is that there’s a (I suspect) atypical distribution of “influential people” and, from my point of view, People Who Should Know Better who subscribe to and (thanks to the modern Internet-surveillance-tracking-complex) read these newsletter episodes. Why?! What could possibly be interesting enough for them to want to read this unorganized stream of consciousness? (I can tell already that I’ve lost at least three readers who thought they were going to get invaluable insight on What Brands Should Do Next About Convolutional Neural Networks.) No, my therapist (this is America, after all) said to me. If you’re really just writing about whatever’s in your head, Dan, maybe your head is inherently interesting enough for people to want to pay attention to you, *whatever* you’re interested in.

Let’s pause there at time index 1, and enhance the joke and commentary so we kill it dead. I said “AIs don’t kill people. People kill people.” because I was making a comparison with what (some) people say about guns. The argument, of course, is that guns, as inanimate objects without agency, do not possess the capability to kill *on their own* and thus require the operation – the intent and operation – of a human being, a person, a people, so that they can fulfill their purpose. People (using guns) kill people. Guns are mere tools.

This is 2016, so any hack can make themselves look smart by taking something someone has said about anything in history and substituting artificial intelligence. Look: AIs don’t kill people. People kill people. See! Someone should give me a column in Fast Company.

What does “AIs don’t kill people. People kill people” even *mean*?! I mean, I guess it means that some people might think that AIs are just tools and that any intent comes from the humans, persons, people designing and training the AI and that, well, if you’ve got an autonomous drone that goes and decides to fire a Hellfire missile at something and someone dies, well, it’s not the autonomous drone with AI that killed someone, it’s the people who designed the damn thing and decided to give it permission to fly in the sky and to autonomously make fire-control decisions.

But! Martin actually takes my quip seriously! Now I’m stuck in a hole where someone’s properly thinking about what I said and what *he* says *does* actually illuminate something about our present situation in 2016 and is far more useful than someone just taking a well-known phrase about one kind of technology and replacing that one kind of technology with another kind of technology.

No, Martin says: “AIs don’t kill people. Training data kills people.”

A-ha.

Right now, our successful applications of deep learning and artificial intelligence mainly use supervised learning. We put together a training set for them and annotate it so that we know what the data is supposed to *mean* (look, how are you going to train an AI to recognize dogs if you haven’t labelled all the dog pictures?). The smarter amongst you will have figured out then that your bound and set of training data is necessarily, well, bounded and setted. There’s stuff *outside* of the training data. How do AIs react to situations that aren’t covered inside the training set? Part of the way that humans deal with this (in a *really* rough way, and COUGH CITATION NEEDED) is that we’re pretty (CITATION NEEDED) good at Bayesian probabilities and developing priors, so that when something unpredicted happens we can make up a new rule based on the little prior information we have and quickly tweak that rule (YES THIS MEANS YOU GREG BORENSTEIN, PLEASE SEND A REPLY THAT ACTUALLY EXPLAINS THIS PROPERLY).
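To make the "stuff outside the training data" point concrete, here is a toy sketch. It is not how a deep learning system works internally: it is a nearest-centroid classifier over made-up 2D points, and the names `train`, `classify` and `max_distance` are invented for illustration. What it shows is that a model trained only on "dog" and "cat" examples will still hand back one of those two labels for an input that looks like neither, unless you explicitly give it a way to say "unknown".

```python
import math


def centroid(points):
    """Average position of a list of (x, y) points."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def train(labelled):
    """labelled maps each label to its example points; returns label -> centroid."""
    return {label: centroid(points) for label, points in labelled.items()}


def classify(model, point, max_distance=None):
    """Return the label of the nearest centroid.

    With max_distance=None the model always answers with a known label,
    however far the input is from anything it was trained on. Passing a
    max_distance (a hypothetical cut-off) lets it answer "unknown" instead.
    """
    label, distance = min(
        ((label, math.dist(point, centre)) for label, centre in model.items()),
        key=lambda pair: pair[1],
    )
    if max_distance is not None and distance > max_distance:
        return "unknown"
    return label


# Made-up training set: two tight clusters and nothing else.
training_set = {
    "dog": [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1)],
    "cat": [(4.0, 4.0), (4.2, 3.9), (3.8, 4.1)],
}
model = train(training_set)

if __name__ == "__main__":
    print(classify(model, (1.1, 1.0)))                       # close to the dog examples
    print(classify(model, (50.0, -30.0)))                    # nothing like the training data
    print(classify(model, (50.0, -30.0), max_distance=2.0))  # allowed to give up
```

The second call is the interesting one: the input is nowhere near either cluster, but the model has no concept of "none of the above", so it picks a label anyway.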

Anyway. This is, in a way, part of the explanation for yesterday’s twitter joke about a future car not being able to see black people because the training set accidentally didn’t include people of colour[2].

Surprise: this thread isn’t actually about AI, the ethics of training set data or anything like that. This thread is *actually* about what happened next, when I asked Martin for pre-emptive permission to use his reply somewhere. That somewhere turns out to be here.

First, some stuff out of the way.

Yes, Ted Nelson. Yes, Tim Berners-Lee. Yes, Xanadu[3].

I said: “A button on social media screens that says “ask permission to use this media elsewhere” and sends a DM request”[4]

To which the expected replies are, of course, “interns” (thank you, Martin), and so-on. But I want to describe in more than 140 characters what I was thinking of *in terms of a thought experiment*.

Scenario 1: every single photo on Flickr, alongside its license, and depending upon the license offered with the media, has a link that allows you to request permission to use the image from its creator. Little Bobby Tables is putting together his presentation on Doors In Rome for his product marketing manager and finds a whole bunch of nice photos of Doors in Rome and one in particular really strikes his fancy. It’s not CC-licensed, and he wants to use it in his presentation. In regular world, he’d just copy and paste that shit, because copyright is, shall we say, a bit laissez-faire in certain respects in 2016. In my I’m the Product Owner Of Everything World, Bobby can:

– click to request usage of the image and optionally say what he wants to use it for
– Harold, the creator of the image, can approve the usage and set a license price (Flickr also has some suggestions based on the kind of image)
– Bobby gets a legal copy of the image to use, with a license and proof that he’s paid for it
– Harold gets like… a hundredth of a bitcoin/imaginary currency/20 cents that accrue in an account until Flickr/Yahoo/Verizon feel like paying out, a bit like Amazon Associates fees

Okay, fine. But what if that was *everywhere*? Or in the minimally-viable-number-of-places? What infrastructure would you need to support a mechanism that:

– on display of media, provides a way to request a license from the copyright owner/licensor
– allows the copyright owner to respond and set a fee
– collects that fee
– delivers a licensed copy of the original media to the licensee
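The four steps above can be sketched as a tiny state machine. This is a minimal sketch under invented assumptions, not a description of any real platform: `LicensingService`, `LicenseRequest` and `Status` are hypothetical names, fees are counted in cents of an unspecified currency, payment is just a number added to an in-memory balance, and "delivery" is a dictionary standing in for a licensed copy plus proof of payment.

```python
import uuid
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Status(Enum):
    REQUESTED = "requested"   # licensee has asked for a license
    PRICED = "priced"         # licensor has responded with a fee
    PAID = "paid"             # fee has been collected
    DELIVERED = "delivered"   # licensed copy has been handed over


@dataclass
class LicenseRequest:
    media_id: str
    licensee: str
    licensor: str
    intended_use: str = ""
    fee_cents: Optional[int] = None
    receipt_id: Optional[str] = None
    status: Status = Status.REQUESTED


class LicensingService:
    """Hypothetical in-memory service; every name here is an assumption."""

    def __init__(self) -> None:
        # Fees accrue per licensor until some later payout.
        self.balances: Dict[str, int] = {}

    @staticmethod
    def _expect(req: LicenseRequest, status: Status) -> None:
        if req.status is not status:
            raise ValueError(f"expected {status.value}, got {req.status.value}")

    def request(self, media_id: str, licensee: str, licensor: str,
                intended_use: str = "") -> LicenseRequest:
        """Step 1: request a license from the copyright owner."""
        return LicenseRequest(media_id, licensee, licensor, intended_use)

    def set_fee(self, req: LicenseRequest, fee_cents: int) -> None:
        """Step 2: the copyright owner responds and sets a fee."""
        self._expect(req, Status.REQUESTED)
        if fee_cents < 0:
            raise ValueError("fee cannot be negative")
        req.fee_cents = fee_cents
        req.status = Status.PRICED

    def collect(self, req: LicenseRequest) -> None:
        """Step 3: collect the fee and credit the licensor's balance."""
        self._expect(req, Status.PRICED)
        self.balances[req.licensor] = self.balances.get(req.licensor, 0) + req.fee_cents
        req.receipt_id = uuid.uuid4().hex
        req.status = Status.PAID

    def deliver(self, req: LicenseRequest) -> dict:
        """Step 4: hand over the licensed copy with proof of payment."""
        self._expect(req, Status.PAID)
        req.status = Status.DELIVERED
        return {
            "media_id": req.media_id,
            "licensee": req.licensee,
            "fee_cents": req.fee_cents,
            "receipt_id": req.receipt_id,
        }


if __name__ == "__main__":
    service = LicensingService()
    req = service.request("doors-in-rome-42", "bobby", "harold", "internal presentation")
    service.set_fee(req, 20)
    service.collect(req)
    print(service.deliver(req))
    print(service.balances)
```

Each method refuses to run out of order, so a copy can't be delivered before the fee is collected. Everything hard (identity, actual payments, actual file delivery, disputes) is exactly the infrastructure this sketch leaves out.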

(you can tell I used to be an IP lawyer, a long long time ago)

OK, common objections: this breaks copy-and-paste and, with it, one of the fundamental freedoms of the Internet. To which I say: look, this doesn’t *replace* copy-and-paste, it’s a thought experiment as to what would need to be done (and I didn’t say it was achievable or practical!) to implement an iTunes-like easier-to-pay-than-search-Napster-Google-Images for *legally* purchasing licensed media to use in whatever bullshit PowerPoint knowledge workers need to put together to impress their boss this week.

OK, so some annoying things. You want to make this as fast as possible, because Bobby doesn’t *actually* want to wait for Harold to receive the request and for Harold to accede to the request. You want to make it so that it’s worth Harold’s while to just say, you know what, automatically say yes up to a certain amount of bitcoin/imaginary currency/anything-but-pounds-sterling. I mean, otherwise it’d be like if you tried to buy a piece of music you heard on Soundcloud and you’d have to WAIT for someone to say OKAY FINE GIVE ME MONEY FOR THAT WEIRD TRACK I MADE.
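One way to read "automatically say yes up to a certain amount" is as a standing rule Harold sets once. The sketch below assumes that reading: any request whose fee is at or below a ceiling he has chosen goes through instantly, and anything above it (or any request when he has set no ceiling) waits for him. The function name `route_request` and both parameters are hypothetical.

```python
from typing import Optional


def route_request(fee_cents: int, auto_approve_ceiling_cents: Optional[int]) -> str:
    """Decide whether a license request needs Harold's attention.

    fee_cents: the fee attached to the request (for example a suggested price).
    auto_approve_ceiling_cents: the largest fee Harold is happy to approve
        without looking, or None if he wants to review everything himself.
    Returns "approved" or "needs_review".
    """
    if fee_cents < 0:
        raise ValueError("fee cannot be negative")
    if auto_approve_ceiling_cents is not None and fee_cents <= auto_approve_ceiling_cents:
        return "approved"
    return "needs_review"


if __name__ == "__main__":
    print(route_request(20, 100))    # small fee, under the ceiling
    print(route_request(500, 100))   # bigger deal, Harold wants a look
    print(route_request(20, None))   # no standing rule, so Bobby waits
```

The design choice is that the slow path still exists; the standing rule just makes sure the common, low-stakes case never has to take it.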

Anyway, this doesn’t really go anywhere because if I think about this more then I’m doing amateur product management and design as a hobby and what the fuck is that anyway. I’m supposed to be getting ready to go out for a nice family dinner.

Anyway *anyway*, a few final associations generated by my association generator:

– have any of the photo licensing houses hired a bunch of computer vision/deep learning engineers and what products are they developing what do you mean they’re not developing any hey here’s a pitch you should take a look at…
– “I’m a social justice deep learning designer.” “Oh? What’s that, then?” “It means I make sure that training datasets for artificial intelligences don’t inadvertently perpetuate systemic discrimination or discriminate in new and exciting socially undesirable and inequitable ways.” [doesn’t exist, should exist?]
– Institutional Review Boards for Deep Learning Datasets
– hell, Institutional Review Boards for Deep Learning