Except that Google employs thousands of homeworkers quality-checking search results to reduce the impact of Search Engine Optimisation on its advertising business. Search is only ‘fully automated’ when it suits Google’s interests to keep it that way.

Google’s response in full context, missing from that article: “One thing I think the SEO community is missing is that this program has nothing to do with SEO or rankings. What this program does is help Google refine their algorithm. For example, the Side-by-Side tasks show the results as they are next to the results with the new algorithm change in them. Google doesn’t hire these raters to rate the web; they hire them to rate how they are doing in matching users queries with the best source of information.”

Sure, but they ‘refine’ their (secret) algorithm about 500 times a year, i.e. more than once a day on average. Since we don’t know what these ‘refinements’ consist of, it is impossible to say whether or not they are ad hoc adjustments to deal with particular SEO abuses. Google’s weaselly statement doesn’t rule this out.
In any event, the real lesson is that Google cannot pretend they do not have human eyeballs vetting their search results. If they devoted even 150 people, not 1500, to monitoring pirate sites, they could easily deal with the ‘whack-a-mole’ problem. The basic search algorithm can detect minor name variations like ‘Newzbin2’ instead of ‘Newzbin’, and an eyeball can do the rest. To win at whack-a-mole you just need more whackers than moles, and there are really not that many seriously large-scale pirate sites – just a few hundred, judging by Google’s Transparency Report. If each mole-whacker was allocated ten sites to monitor, 150 people could have a massive impact. If Google don’t do it, it is not because they can’t but because they don’t want to.
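To make the "minor name variations" point concrete, here is a rough sketch of how a near-duplicate site name could be flagged for a human to review. It is not anything Google is known to run: the `looks_like_variant` helper, the 0.85 threshold and the sample names other than Newzbin/Newzbin2 are assumptions for illustration, and the matching uses Python's standard `difflib`.

```python
from difflib import SequenceMatcher

def looks_like_variant(candidate: str, known: str, threshold: float = 0.85) -> bool:
    """Return True when `candidate` is a close textual match for `known`.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    ratio = SequenceMatcher(None, candidate.lower(), known.lower()).ratio()
    return ratio >= threshold

# Hypothetical watch list and newly seen names.
known_sites = ["Newzbin"]
new_names = ["Newzbin2", "Newsweek"]

# Names that a human reviewer ("an eyeball") would then check by hand.
flagged = [
    name for name in new_names
    if any(looks_like_variant(name, known) for known in known_sites)
]

# The staffing arithmetic from the comment: 150 people at ten sites each.
sites_covered = 150 * 10
```

With these inputs only `Newzbin2` is flagged, and 150 reviewers at ten sites each would cover 1,500 sites, comfortably more than the "few hundred" large sites the comment estimates.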

If Google can do it via human eyeballs, so can content creators. Except content creators can do it better, because they know what content they don’t want on the Internet, and what content they don’t mind people sharing.

“content creators can do it better, because they know what content they don’t want on the Internet”
Boohoo, the Evil Content Creators want to censor the Internet. 🙂
Get out of here, we just don’t want pirates to steal our property. Simple…

It is very important to understand that Google have absolute control over their engine.
They successfully filter child porn, illegal distribution of drugs and weapons, assassins for hire, counterfeit money and stolen credit card numbers.
Google’s business model is to maximize the revenue they generate from the interval between a and b, where a is the moment they upload the criminal material, and b is the moment when they are forced to take it down.

Thank you for the refreshing dose of reality in this ridiculous debate over whether Google has the capability to control or change its own search algorithms.
Google does not change the algorithm because there is no financial incentive to do so, nor is there any legal obligation to do so.
Anyone who suggests that Google cannot do anything about the search process is an unfrozen caveman drinking koolaid while in an on-line public relations circle jerk.

Ever set up an internet filter? You can block a lot of stuff, for a while. But users with enough time on their hands will get around it. Same goes for Google searches. If you have enough time on your hands, you can use Google to find credit card info. They block a lot of stuff, but they don’t and can’t block everything.

Yes, filtering everything is impossible. But Google could reduce or eliminate the suggestions that lead users to sites hosting illegal content. IMO, if users have to dig through more than three pages of links to locate an illegal download, then at least 75% of casual downloaders will give up. Yes, there will always be some users who want to dig and hoard everything, but there are some simple steps that can be taken to stop making it easier to find illegal content. Right now it is too easy to find infringing files with Google, YouTube, Yahoo and Bing, not to mention all of the other metacrawlers.

“If users have to dig through more than three pages of links to locate an illegal download then at least 75% of casual downloaders will give up.”
Which is why Google make sure we can find them on page 1.
And then we have autocomplete where Google deliberately encourages children and other innocent people to commit crimes that can have severe consequences.

“Severe consequences to whom?”
Depends on where you live…
When Google tricks an innocent Japanese teenager into downloading an illegal file, s/he faces 2 years in jail and a fine of $25,000.
Please note that I’m not in favour of that sort of punishment. Not even close. I think that moderate fines of $100-200 per infringement are the way to go. And so does a majority of the population.
I’m just saying that everything is changing right now. And Google has to adjust.
Willingly or otherwise.

The only engine that matters is Google. But it goes without saying that all other deliberate DMCA abusers have to be stopped as well.
It’s also important to emphasize that the problem is Google’s search engine — along with a few other initiatives, such as their book scanning project.
Nobody wants a witch hunt.
Google still hold a few very good cards. Gmail’s nice. YouTube could be legitimate any day.

“Maybe it’s time for the music industry to stop complaining about Google’s ‘nefarious’ business strategy”
On the contrary, it’s time for the Piracy Industry to die.
The world loses billions of dollars and hundreds of thousands of jobs because of IP theft every year.
People don’t want that anymore.

“people don’t actually give a shit”
Yes, they do!
Wake up, man — this is not the 90’s or 00’s anymore.
Google hasn’t noticed yet, but the financial crisis changed everything. People don’t want to lose billions of dollars and hundreds of thousands of jobs to the piracy industry anymore.
A clear majority of the population even want to punish illegal downloaders now: http://dmnrocks.wpengine.com/permalink/2013/20130118blocking
Anti-piracy initiatives are popping up all over the globe right now.
Get over it…

The argument that “If Google can stop child porn from showing up, they can stop illegally shared copyrighted material from showing up” is totally bogus.
Google whitewashes terms like “child porn.” If you search for “child porn,” you only get results from a list of approved sources… gov’t sites, news sites, wikipedia, about.com, etc. So if you somehow had a legitimate website about child porn, it wouldn’t get listed.
Searches like “mp3” aren’t inherently illegal, & are much too broad to apply this type of solution… far too much legitimate content would get deindexed to justify it. Basically, what it would entail is a list of “approved music sources,” which would be stuff like mtv.com, rollingstone.com, atlanticrecords.com, stereogum.com, etc, & if you started a new music blog, for instance, you would somehow have to get it added to this list of approved sources in order to get indexed on Google. Google is not interested in filtering its results this way; that would be even worse than what SOPA proposed.
What’s also worth mentioning is that a lot of these results don’t actually host illegal content. The first result, for instance, mp3skull, crawls sites like blogs & displays a list of links to posted mp3s out of their context. The 2nd result on mp3skull links to a file hosted on fuelfriendsblog.com, a rather popular blog with a pretty substantial audience of folks who love the type of cheesy faux-americana pop crap that the Lumineers peddle, & is leveraged on the regular by publicists with acts to promote. And another links to their daytrotter performance. The irony here is that these sort of links were essential in promoting a band like the Lumineers early on, in part responsible for their success, & they only become a liability after success is achieved.
This could be partly solved if music bloggers would adjust their htaccess files to disable the hotlinking of mp3s, though I’d wager that most bloggers don’t have the technical skills to accomplish this, or they use blogging services that don’t give them access to disable hotlinking.
This is a much more intricate issue than anyone here is really acknowledging.
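For anyone wondering what "disabling hotlinking" boils down to, here is a rough sketch of the decision such a rule makes, written in Python rather than as an actual htaccess rule. The `allow_mp3_request` function and the `myblog.example` / `indexer.example` hosts are made up for illustration. It assumes the common convention of letting requests with no Referer header through, and the Referer header can be missing or spoofed, so this only discourages casual hotlinking.

```python
from typing import Optional
from urllib.parse import urlparse

def allow_mp3_request(path: str, referer: Optional[str], own_host: str) -> bool:
    """Decide whether to serve a file, based on the Referer header.

    Non-mp3 files are always served. mp3 files are served only when the
    request has no Referer (direct visit) or comes from the blog itself.
    """
    if not path.lower().endswith(".mp3"):
        return True
    if not referer:
        # Assumption: allow empty Referer, as many hotlink rules do.
        return True
    host = (urlparse(referer).hostname or "").lower()
    own = own_host.lower()
    return host == own or host.endswith("." + own)

# A reader clicking the link inside the blog post gets the file.
from_own_post = allow_mp3_request(
    "/audio/song.mp3", "http://myblog.example/2012/post", "myblog.example"
)
# A click coming from a third-party index page is refused.
from_indexer = allow_mp3_request(
    "/audio/song.mp3", "http://indexer.example/list", "myblog.example"
)
```

The blog's own readers still get the file, while a link lifted out of context onto an indexing site stops working.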

Hehe, respect — your parody is improving. 🙂
But have you considered the price?
Soon, you’ll be turned into an anti-pirate and I won’t have to post here anymore. 🙂
Feel the dark side coming closer, do you?

“jw, please, you are on the wrong website”
On the contrary, our brave pro-pirates have a lot to learn and this is a good place to start.
I, for one, will be very happy to help them understand how much harm mainstream piracy does to society and why it is very easy to stop most of it without jeopardizing any of the freedom and diversity that is so essential to us all.
“This isn’t TorrentFreak :)”
Nah, that’s for pirates & pedophiles.

Be that as it may, no one bothered to actually click the links & see where the content is coming from.
If a publisher sends a blog an mp3, & then another site indexes that mp3, is the indexing site illegal? Should it be removed from Google?
That’s a relevant question. The anti-technologists aren’t bothering to take a serious lay of the land, the popular “solution” is more like a quick glance judge/jury/executioner steamrolling.

These responses are so depressing. You shouldn’t be giddy over these issues, you should be able to discuss them without resorting to mischaracterizations or sarcasm. If I didn’t know better, I’d think some of these responses were written by children.
The chances of getting a virus or malware from an mp3 site using Chrome on a mac are almost 0. Practically unheard of. Either way, that doesn’t give you an excuse to not know what the hell you’re talking about.

Word. There’s just not enough marketshare to make writing mac viruses worthwhile. They’re out there, but you’re not going to get them from mp3 sites.
Every now & again I end up on a fishy page & it downloads an exe to my computer & I’m like “lol!”
Lack of understanding of technology breeds fear of technology. That’s why a serious discussion on this topic can’t be had here.

That’s assuming the publisher delivered the MP3 to the initial site. This is not the case when it comes to take-down notices. In most cases of these sites, the blogs are individuals uploading their content. Then these other sites find the MP3s and “index” to the blogs in order to keep themselves harmless against infringement suits by claiming it was user generated – meanwhile they rake in the advertisement dough by generating views by promoting free downloads. The music industry isn’t interested in Joe Schmo’s blog, but wants to take down the websites that drive ad traffic by indexing Joe Schmo’s blog. And they’re the 1st result because Google services the ads to their page. Follow the money. Piracy is a multi-billion dollar industry with Google at the top tier of profits.

Do yourself a favor & do a search for “lumineers mp3.” (Preferably in a Chrome Incognito window, so your results aren’t colored by your search habits.) And tell me what ads show up on that results page. I’ll go ahead & tell you, it’s going to be an ad or two for local cosmetic dentistry.
Then click on the top search results, start with mp3skull.com. Who’s serving those ads? I’ll tell you, Yield Manager & Ad Reactor. Amazon is a legitimate result. Hulkshare is a cloud storage site, and any illegally posted content there is a violation of its terms of service, so that page should be removed by Hulkshare, but it does have google ads. Freemp3x is next, ads by Ad Reactor. Next is soundowl.com, advertising by advertising.com. Next is popstache.com, no advertising. Then a direct link to an isohunt.com torrent, which is strange… no advertising. Then mp3olimp.com, no advertising. And then a torrent.cd link, which appears to sell direct advertising.
Tell me again about how Google is making all of the profits from piracy. And tell me again about how you followed the money, & where it took you to. Because I see Yahoo! (yield manager) & AOL (advertising.com) implicated, but the only google ad is on a cloud storage site. And the google ads on the search results page are obviously not designed to profit off of piracy.
I’ll let you in on a little secret. Best Buy never made money off of CD sales. Music was a loss leader… they sold for less than what Best Buy was paying for them. The recorded music industry is tiny compared to the sales of computers, televisions, refrigerators, car stereos, etc. That’s how Best Buy made its profits. And that’s how Google makes its profits, off of legitimate sales of products that cost more than $.99 a pop. Not off of mp3skull or anything like that. If you think Google is making billions off of music piracy, & if you think that’s the backbone of the company, you’re absolutely clueless.
Do you think Apple would’ve launched iTunes if it didn’t have the iPod to sell? No way. Do you think Microsoft would’ve launched the zune store without the zune? Or xbox music without the xbox? Your understanding of how these technology companies are making their money is warped.
Get some perspective, man.

“Do you think Apple would’ve launched iTunes if it didn’t have the iPod to sell?”
Eventually, yes.
See, Apple is the exact antithesis of Google:
Apple create where Google steal & copy, and they are genuinely interested in music and musicians.
That’s why they invented the ecosystem that so many of us use all the way from creation to distribution, while Google created an equally powerful but destructive ecosystem designed to abuse right holders and ultimately abolish the concept of copyright.
No company has harmed artists more than Google, and no company has empowered artists more than Apple.
The only thing Apple need to fix — and I know this is off topic, but I can’t help it — is their censorship:
Do we, for instance, need them to blank out the title of Naomi Wolf’s Vagina: A New Biography?

This comment gives far too much weight to the downside of search blocking as against its benefits. I suspect there is, if only subconsciously, a false analogy being made with the criminal justice system. It is often said that it is better to let a hundred guilty men go free than to jail (or hang) one innocent one. I doubt that in reality any justice system is quite that scrupulous, but let us accept it as a noble ideal. Does it follow that we should tolerate a hundred illegal search results just to let one legal one go unhindered?
Not at all. The consequence of blocking a search result is just a minor inconvenience, which would only be temporary if there is an appeal procedure. No-one is being hanged. Nor is any legal activity actually being stopped. For example, if an artist wishes to make a track available for free download from their own website, and this is inadvertently blocked from search results, it would still be available and could still be publicised in various ways.
But even if we did apply the ‘100 to 1’ principle, I suspect the balance would still be heavily in favor of blocking. If you search for any moderately popular artist’s name, followed by ‘download’, you will get literally millions of results, of which at most a few thousand can be legitimate, and these mainly on a handful of sites like iTunes which could be ‘whitelisted’.

“The consequence of blocking a search result is just a minor inconvenience”
It’s not even an inconvenience. We’re talking about obvious illegal material here, nobody’s going to miss that.
No legitimate music/movie/software/literature will be removed from the market so no censorship will be involved at any point, and I’d like to ask our brave pro-pirates to remember one little detail they often ‘forget’:
Nobody does more to protect freedom of speech and to fight censorship than artists. Because freedom of speech enables us to do what we do, and any brand of censorship prevents us from it. That was true a hundred years ago, and it is true today.
While pirates never complain about regimes that remove legitimate material from the market and persecute content creators…

“I’d love to hear exactly how all infringing content can be removed without touching any legitimate content.”
No, you don’t. But let me repeat the facts for the rest of you:
Google have absolute control over their search engine.
They already successfully filter child porn, illegal distribution of drugs and weapons, assassins for hire, counterfeit money and stolen credit card numbers.
We just ask them to do to mainstream piracy what they already do to mainstream child porn (which is almost a contradiction in terms today because of these very efforts).
And that’s what’s going to happen if they want to survive.
If you want to see the future of their search engine — or the future of the next ‘Google’, if this one fails — look at YouTube: When they clean up its autocomplete function (which can be done within an hour) the service will be legitimate.
Nobody has ever expected them to eradicate 100% of all illegal material. We wouldn’t want to live in a world where that can be done.
We just expect them to stop their deliberate DMCA abuse.

Do you wanna know how YouTube works? Have you ever uploaded a video to YouTube? Here’s some insight.
First of all, you can’t upload just any video file to YouTube. There are codec licensing issues involved, technology issues, etc. And it has to be under a certain size. Let’s say you decide to upload a 50mb quicktime file. Let’s say that took 6 minutes. Then YouTube takes that file & converts it to flv (flash) format at multiple resolutions/qualities (and probably does a host of other things I’m not aware of). From there, YouTube has a standardized format that it can compare to copyrighted data in its database, because the format is conducive to that sort of thing. It automatically compares the video (file length, waveform data, etc) against its list, & if it doesn’t determine the file to be copyrighted material, it passes through. I dunno what the average length of time this takes is, I’m going to guess & say 30 minutes later your video is viewable on youtube.com.
The way that Google’s search spiders work, they’re only scraping essential information, no multimedia files, no javascript, no css files, etc. Google’s results are as exhaustive as they are because of the efficiency of its spiders. Do you really think that Google could operate analyzing every multimedia file on every page of the internet in the same way that YouTube, which is a strictly controlled environment, does? Does that REALLY make sense to you? Understanding that there are all sorts of file enclosures that YouTube doesn’t allow, & codecs that it doesn’t allow or is forbidden to allow due to licensing, understanding some spiders would be stuck converting and analyzing multi-gig videos, rather than skimming through 40kb html files… this would break Google.
Your understanding of how Google works versus how YouTube works is incomplete at best, outrageously ignorant at worst.

Like I said, perhaps you really believe your own words. But that doesn’t change the facts:
Google have absolute control over their search engine.
They show you what they want to show you — and block the rest.

Wow.
You really don’t know anything about search engine optimization, referral weights, performance analysis… any of the technology that goes into producing Google’s search results. You’re just ignorant of the whole thing, & you’re convinced that it’s all a conspiracy.
Tell me again about how the government is behind 9/11…

I think the DMCA was written to protect the internet, not to protect any particular company or type of company. If the law favors content over technology or vice versa, the consumer loses out. But there’s also an implied requirement that content creators keep pace with technology. And certain companies on the content creation side hate that.
I think that the WSJ article is overblown, in a “Well I guess it’s time we revisit this old hat” type of way. I mean, the article can pretty much be summed up by the part that’s like “Yeah, ContentID works awesome, but some companies forget to use it.” lol. I mean that could come straight out of the Onion. But it’s in the WSJ. They’re just trying to stir up some controversy.
The truth of the matter is that the film industry continues to grow, & hasn’t experienced the pains that the music industry has because it’s partnered with and kept pace with technology companies, & YouTube is a large part of that, & the WSJ fails to provide that context. And no one should have any sympathy for content companies that forget to use ContentID, even if it’s Disney.

“companies on the content creation side hate that”
Content creating companies don’t deal with ‘hate’.
Content creators are concerned about Google’s deliberate DMCA abuse and Google’s infamous Creepy Line philosophy:
“There is what I call the Creepy Line, and the Google policy about a lot of these things is to get right up to the Creepy Line, but not cross it.”
Eric Schmidt
The creepy part about the Creepy Line philosophy is that Google, and no one else, determine what and where it is from case to case — and whether or not they cross it.

Ok, “protect the rights of” was a bad choice of words. I disagree that the DMCA was neutrally written; there were and still are huge sums of lobbying money from both sides that believe the same.
So how about “Do you think that the safe harbor provision in DMCA section 512 should be changed in such a way that would shift more of the responsibility and burden for policing content from the copyright holder to the internet tech industry?”
I believe this is the only question in this debate and it can be answered by informed individuals with either a yes or no.
Finally (the comment thread starts getting tiny), yes, the WSJ like all journalism must sell product. Creating controversy by editorial choice is the preferred method.

I think YouTube is a great example of a responsible distributor of content, & a responsible steward of the DMCA. By & large, the content creators that make use of ContentID are happy with it & consider it effective. And the content creators make a lot of money off of YouTube.
I think that there ought to be a third party system like ContentID that can be leveraged by any website that allows for the posting of material that will be made publicly available that could potentially violate copyright. This should be done at the point of upload, because that’s where the infraction occurs. It shouldn’t be up to Google to, after the fact, figure out if certain content is copyright kosher, just because a lot of people happen to go to Google. Google should be as neutral as possible, & enforcement of copyright law should happen at the point of the infraction, & should be between the user & the content host.
And obviously violating fair use rights is a concern, but even a conservative implementation of this sort of system would drastically reduce the amount of copyrighted material which exists online against rights holders’ wishes.
There’s good & bad with the DMCA, I think that it initially favored rights holders, but didn’t foresee the scope of the internet 15 years down the line. Which it couldn’t possibly have. And some technology companies have abused this. But I don’t really think that Google is one of them.
I think if there’s a centralized database of copyrighted material that is properly administered, the DMCA still offers a reasonable method of dealing with what slips through the cracks.

That’s not actually how Google’s search results work. A search for “Metallica download” produces “14 million results,” but the search results are only 72 pages deep, so you’re actually getting ~700 results. A few hundred of those are legitimate, a few hundred aren’t, & the remaining don’t actually have any content on them, they either send you into a maze looking for what you want, displaying ads the whole time, or they just send you straight to something like “PC Cleaner” which downloads an executable right to your hard drive. (Stay away from those, Windows users.)
So your 1:100,000 ratio is way off. In reality, it’s going to vary between 1:2 & 1:10, you’d literally have to go through all of the results to get a proper ratio. But consider this… a search for “Metallica download” is going to include results for “Metallica Download Festival,” “Download Metallica Font,” “Guitar Hero Metallica Downloadable Content,” etc. It’s not necessarily about the NUMBER of results removed, but the fact that these could be key legitimate results for other legitimate searches.
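The arithmetic behind the "~700 results" figure can be checked quickly. This is only a back-of-envelope sketch: the 72 pages and the "14 million results" headline come from the search described above, while ten results per page is an assumption about the default results layout.

```python
# Figures quoted in the comment.
claimed_results = 14_000_000
pages_available = 72

# Assumption: the default of ten results per page.
results_per_page = 10

# How many results a user can actually page through.
reachable_results = pages_available * results_per_page

# Share of the headline count that is actually reachable.
reachable_share = reachable_results / claimed_results
reachable_percent = round(reachable_share * 100, 4)
```

That gives 720 reachable results, roughly 0.005% of the headline number, which is why a ratio worked out from the "14 million" figure says little about what a user actually sees.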
Also, consider this… Fun. is probably one of the most downloaded bands around. A search for “fun download” produces results for “free fun fonts,” “random fun files,” “Fly for Fun download,” & “Fun games free download” all on the first page of results.
Furthermore, a file like “allison.mp3” could just be a girl named Allison sharing her poetry online. Should that be unilaterally banned from Google searches just because it appears to be an Elvis Costello song?
When you really start to think about the cumulative effects of this type of whitewashing, it’s ABSOLUTELY IMPOSSIBLE to suggest that it could be effective without totally breaking google.
Another thing people fail to realize is the difference between Google & YouTube. There is a list of file types that google crawls… html, php, jsp, pdf, etc. These are pages. And to some degree images. It ignores css files, javascript files, linked multimedia files, etc. Effectively, they reduce their crawls to just a percent or two of relevant files, relative to the actual amount of data out there. Google works so well because it’s so efficient. Because of the nature of YouTube, being a video site & all, it’s a media file that it’s indexing, which can therefore be analyzed… the length of the song, the waveform image, etc. And the system is built to analyze that stuff, & that’s how content can be effectively blocked. For Google to crawl all files linked to the pages it crawls, or even just add audio & video files to their crawls, & to then analyze those files… that’s a mindblowing task. In a controlled environment where you determine the format of the file, when you can specify what codecs are allowed and how large a file can be & the maximum dimensions of a file uploaded, etc. it’s possible. But the amount of content on the whole internet & the variety of content (file formats, codecs within those formats, etc) is many, many, many, many times more than what’s on YouTube, even if the files on YouTube are the ones getting watched so often. And the effort it would take to do that would be many, many, many times more than the effort Google exerts to index the web currently. It’s tantamount to telling airlines that they must do an exhaustive background check before they can sell a flight ticket, only that’s understating it.
I’m no programmer, but I know enough to know that arguments like “If Google can keep child porn off its search results…” or “If Google can do it for YouTube…” or “It would only remove 1 legitimate page for every 1,000 illegitimate pages” are all totally bogus. This type of thing can’t effectively be automated, not without scaling indexing efforts by a 2- or 3- digit multiple, & that effort, & the inefficiency of it, could very well break Google.
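To picture the kind of filtering described here, this is a toy sketch of a crawler deciding which URLs to fetch by file extension. It is not Google's crawler or its real rules: the `should_crawl` helper, the `blog.example` URLs and the exact extension set are assumptions, built from the file types named above (html, php, jsp, pdf) plus `.htm` and extensionless paths.

```python
import posixpath
from urllib.parse import urlparse

# Page-like types named in the comment, plus ".htm" and "" (no extension,
# e.g. "/" or "/about"), which are assumptions added for this sketch.
CRAWLED_EXTENSIONS = {"", ".html", ".htm", ".php", ".jsp", ".pdf"}

def should_crawl(url: str) -> bool:
    """Return True when the URL's path looks like a page rather than an asset."""
    path = urlparse(url).path
    extension = posixpath.splitext(path)[1].lower()
    return extension in CRAWLED_EXTENSIONS

# Hypothetical links found on one blog page.
links = [
    "http://blog.example/",
    "http://blog.example/2012/post.html",
    "http://blog.example/style.css",
    "http://blog.example/audio/allison.mp3",
]

to_fetch = [url for url in links if should_crawl(url)]
```

Under these assumptions only the two page URLs are fetched. The mp3 and the stylesheet are never downloaded, so a crawler built this way has nothing to run waveform analysis on.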
Here’s another hypothetical. Say I run a music blog, & 12 months ago I get an e-mail from a publicist. “The Lumineers are playing in your town 4 weeks from now, would you mind posting this mp3 & mentioning the show? I can also send you 2 tickets to give away to your readers, if you’d like.” So you run the story. A year later the Lumineers blow up & this link becomes a liability. So you request to Google that all the results to “Lumineers mp3” are removed. This removes the page of your blog from Google’s index, removing the Lumineers content but also the 9 or so other bands you posted at that time. And as you update, that Lumineers post gets knocked back to the next page, which is then blacklisted by google, along with the previous page still blacklisted, which no longer contains any mention of the Lumineers. Over time, unless those pages are re-crawled by Google, large swaths of your content are deindexed, effectively knocking all of your content from 12 months ago through that Lumineers post off of Google. And, depending on how good you are at guessing what bands will become popular, or how well you cooperate with publicists, this could happen many times over, & it only takes a few of these holes to effectively delist your whole site.
Google learns what people want from their searches by monitoring which results are most linked to & which results are most clicked on. So when you search for “Fun. download” the top results are going to be the band Fun., but it wasn’t like that before they were popular. During their first album cycle, you had to search “fun nate ruess” or “fun aim and ignite” to hope to get any results for the band. So it may appear that Google can tell what’s an mp3 or what’s artist-related content & what’s not, but the truth is that it’s only displaying the results that users are most likely to click on. And that’s simply not a criterion for whitewashing results. Google’s search algorithm is based on people’s behavior, not on identifying what type of content is on a given page. And so the idea that a switch can just be flipped is absurd. That’s just, plainly & simply, not how things work.
At least that’s my understanding of all that.

“I’m no programmer”
And it shows. Perhaps you really believe what you’re saying.
But this is not about what Google can do. Believe me, the sky is the limit.
This is entirely about what they want to do.

This is a retarded argument, but you really should look up the statistics of how many macs get infected with viruses, versus pcs. And what it takes to get a virus on a mac. And about what safeguards are built into Mountain Lion.

They should be issuing a cease and desist to Blue Moon Brewery instead. They totally ripped off Ho Hey in a “licensed reproduction.” Absolute garbage. Makes me HATE on Blue Moon even harder than I do now.

The Blue Moon commercial is licensed. They just re-recorded it so they only had to license the composition & not the sound recording.
I dunno what’s to hate about a brewery. You might wanna relax a little.