Content Theft and WordPress

I and many WordPress “representatives”, along with the developers and staff of WordPress and Automattic, are getting more and more complaints and requests for help dealing with content theft issues. We all need to clear this up and spread the word about how this works in relation to WordPress.

In order to go after a copyright violation of your content, you have to contact the site owner first. When that fails, you ask the web host for assistance, in accordance with the DMCA, to contact the site owner and deal with the copyright violation. Herein lies the problem.

As Mark Jaquith recently stated, there is a lot of name confusion between a WordPress blog and a WordPress.com blog. Since a web host in the United States is required to assist with copyright infringement issues when contacted, when is WordPress the host and when is it not? After all, it’s all WordPress, right?

Not.
A blog “Powered by WordPress” is not hosted by WordPress, Automattic, or any of its affiliates. WordPress and its parent company, Automattic, have no responsibility for the site or its content, as they only provide the free program that runs the blog. Going after WordPress in this instance is like going after Microsoft for copyright violation of content printed from Word. WordPress, like Microsoft, is not responsible for the end use of its products.

A blog on WordPress.com, with the footer statement “Get a free blog on WordPress.com” or “Blog at WordPress.com”, is hosted by the free blog hosting service owned and operated by Automattic. Because its servers and business are based within the United States, the host comes under the Digital Millennium Copyright Act (DMCA) and must assist copyright holders in removing infringing content when requested, just as is required of search engines and other web hosts.

I know how frustrating the battle is against content thieves. The abuse of feed scrapers using our content on their ad-filled splogs is growing daily. Many believed only top bloggers were targets. Now, their “little” blogs are being scraped and their content used without their permission – leaving them feeling victimized and helpless as web hosts, especially blog hosting services, offer no simple and easy solution to reporting, or force the victim to jump through a lot of hoops in order to report.

The issue at hand, though, is blaming WordPress for the evil doing of content thieves. Here are some clues to figure out if you should contact WordPress.com regarding content theft:

Contact WordPress.com

If the blog’s URL is name.wordpress.com.

If the blog’s footer says “Blog at WordPress.com” or the equivalent.

If the WHOIS report for the site lists WordPress.com and/or Automattic as the web host.

Do not contact WordPress or WordPress.com

If the blog’s URL does not include wordpress.com.

If the WHOIS report does not list WordPress.com or Automattic as the web host.

If the footer says “Powered by WordPress”.
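The two clues that can be read straight off the page, the URL and the footer text, can be sketched as a quick check. This is only a rough illustration: the function name `is_wordpress_com_hosted` is made up for this example, the footer phrases are the ones quoted above, and the WHOIS clue is left out because it needs a lookup against an outside service.

```python
from urllib.parse import urlparse


def is_wordpress_com_hosted(url: str, footer_text: str = "") -> bool:
    """Rough check using the URL and footer clues only (no WHOIS lookup)."""
    # urlparse only finds the host when a scheme is present, so add one if missing.
    if "://" not in url:
        url = "http://" + url
    host = (urlparse(url).hostname or "").lower()

    # Clue 1: the address is wordpress.com itself or a name.wordpress.com subdomain.
    if host == "wordpress.com" or host.endswith(".wordpress.com"):
        return True

    # Clue 2: footer wording used on blogs hosted at WordPress.com.
    footer = footer_text.lower()
    if "blog at wordpress.com" in footer or "get a free blog on wordpress.com" in footer:
        return True

    # "Powered by WordPress" on its own means self-hosted software, not WordPress.com.
    return False
```

A result of False only means these two clues did not match; the WHOIS report is still the way to find out who the web host actually is.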

Here is the WordPress.com policy for handling content theft.

For WordPress.com users copying other WordPress.com bloggers, copyright holders are asked to provide links to the original and copied post. The violator is contacted and asked to either create an excerpt of the copyright violated content with appropriate credit links, or to remove the content. If they fail to comply, they get a warning notice on their blogs. If it continues, WordPress.com reserves the right to remove the copyright violating content or take more aggressive action for failure to comply.

Bloggers reporting copyright violations by WordPress.com bloggers are instructed to use the DMCA reporting process. Violators receive a warning with a seven-day deadline. If they fail to comply, their blogs will be shut down. Again, copyright infringement must be proven in accordance with the DMCA. WordPress.com support recommends you leave a comment on the violator’s blog first, then contact support through the DMCA reporting process with the link to the original content, the link to the copied content, and a link to your comment asking the violator to remove the infringing content.

Do not report “He is copying from my blog” or “She is linking to my pictures” in your complaint. Be very specific and to the point.

This post at http://example.wordpress.com/x-y-z is a copy of my blog post at http://example2.wordpress.com/a-b-c.

OR

The picture in the post at http://example.wordpress.com/x-y-z is linking to my pictures on my blog and specifically from this post: http://example2.wordpress.com/a-b-c.

They don’t need to hear the whole story and nothing but the story. They just need the facts. The faster they get the information, the faster they can respond.
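To keep a report down to just the facts, the pieces support asks for (the copied post, the original post, and optionally your earlier comment) can be dropped into a fixed sentence. The sketch below is only an illustration: `format_report` is a hypothetical helper, not part of any WordPress.com tool or form, and the wording is adapted from the examples above.

```python
def format_report(copied_url: str, original_url: str, comment_url: str = "") -> str:
    """Build a short, factual report: the copy, the original, and any prior request."""
    lines = [f"The post at {copied_url} is a copy of my post at {original_url}."]
    if comment_url:
        # Optional: the comment you left asking the blogger to remove the content.
        lines.append(f"I asked the blogger to remove it in this comment: {comment_url}")
    return "\n".join(lines)


print(format_report(
    "http://example.wordpress.com/x-y-z",
    "http://example2.wordpress.com/a-b-c",
))
```

The point of the fixed wording is that whoever reads the report gets both links in the first sentence and nothing else to wade through.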

Remember, if the blogger is using a Fair Use equivalent, and not your blog’s whole content, they may be within their rights to use your content. WordPress.com admits that while Fair Use is pretty hard to define, clear copying of a whole post meets the requirements of a copyright violation.

According to WordPress.com support, the record for compliance with a copyright violation notice is two minutes. However, this is not the norm. Terms of Service reports and copyright violations are handled throughout the work day as fast as possible. The delays come from non-responsive bloggers: either those who report a violation without enough information, or offending bloggers whose response WordPress.com is waiting on. Both are out of the hands of WordPress.com.

What You Need to Know About Reporting Copyright Violations

Sidebar Feeds: WordPress and WordPress.com blogs have the capability to add feeds to their sidebars through feed Widgets and Plugins. The content that arrives is usually limited to post titles and/or a few words in an excerpt. Accordingly, these are considered by WordPress.com to be within Copyright Fair Use and cannot be stopped. If you do not like the blog which is using your feed post titles or excerpts, you have to take it up directly with them.

Copyright Policy Doesn’t Match Copyright Violation Request: Many people don’t understand how copyright works and head to Creative Commons, get a badge, and slap it on their site with a “that’s done” pride. Most of the Creative Commons licenses allow use of your content in one form or the other, from totally free-to-use to can’t use at all. If you are reporting a copyright violation and your Creative Commons badge and license grant permissions that conflict with your request to stop the copyright violation, you are stuck. Consider updating and changing your Creative Commons license and copyright policy to be very clear about what you will and will not allow for usage of your blog’s content.

Understand Fair Use: The copyright law is vague concerning what amounts to Fair Use of content. Is it only a line or two in a quote, or several paragraphs? I have a policy of 10% or 400 word limit, but there is no clear definition of what constitutes an abuse of Fair Use except that the usage hurts the market value for the original author.
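As an illustration only, a personal “10% or 400 words” excerpt policy like mine could be checked as below. This is a sketch of one blogger’s house rule, not a legal test for Fair Use. It assumes the stricter of the two limits applies and that words are counted by splitting on whitespace; `within_excerpt_policy` is a made-up name.

```python
def within_excerpt_policy(excerpt: str, original: str,
                          max_fraction: float = 0.10, max_words: int = 400) -> bool:
    """Return True if the excerpt fits a personal '10% or 400 words' policy."""
    excerpt_words = len(excerpt.split())
    original_words = len(original.split())
    # Assumption: whichever limit is smaller wins (10% of the post, or 400 words).
    limit = min(max_words, int(original_words * max_fraction))
    return excerpt_words <= limit
```

Under these assumptions, a 1,000-word post allows an excerpt of up to 100 words, and the 400-word cap only applies once a post runs past 4,000 words.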

For some, any usage of the content for commercial purposes, Fair Use or not, is a violation of their copyright policy. What is the definition of commercial? It used to mean content used for the purpose of generating income. Does that also mean a blog with ads?

WordPress.com support reports that it’s really difficult to enforce copyright policies when there are those who have no problem using other people’s content within Fair Use but scream when their content is used accordingly. Consistency and fairness in practice help.

Copyright is about playing nice and fair. There is a lot of trust involved with a lot of loose laws to protect the copyright owner. Set an example for others in how you protect the rights of other bloggers’ content, and do your best to make sure your content is also protected.

40 Comments

Thank you for such an informative post, Lorelle. One little mistake, though: it is not correct to say that Creative Commons licenses allow use of your content in one form or the other, from totally free-to-use to can’t use at all. All Creative Commons licenses allow the copying of the content, as long as attribution is given.

Some licenses don’t allow derivative works, others don’t allow the use for commercial purposes, and others don’t allow both. However, even the most restrictive CC license allows people to copy the content that was licensed that way.

In fact, the most restrictive license, the Attribution-Noncommercial-No Derivative Works 3.0, states that “You are free: to Share — to copy, distribute, display, and perform the work, under the following conditions: Attribution. You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work). Noncommercial. You may not use this work for commercial purposes. No Derivative Works. You may not alter, transform, or build upon this work”.

What if the blog is a Blogspot (Google) blog? Would we contact Google directly? Does anyone know of a contact form or other way to contact Google about copyrighted material? Most of the splogs I see are Blogspot ones, to be honest.

My blog does not have much information to be copied, but my blog’s images are sometimes hotlinked without my permission. I once left a comment on a site asking them to remove an image they had stolen. They were very smart: they cropped my blog’s name off the photo and made it smaller. But I knew it was my property at first sight.

“Theft” is not possible for intellectual content. Internet content is something which cannot be stolen. No one will ever be able to stop inter-use of Internet content. Believing that one can do that, is a failure to understand the Internet—it’s an adolescent belief.

vkeong – the best way to stop hot linking is to go through your web hosting provider and have that ability turned off. For one photography client I did that because of the same thing.

weathervane – While there is a very fine line against internet content theft, it still is something that can be stolen. While the use of internet content is often called ‘fair use’, it is plagiarism if the content is used without consent or credit to the originator. Many web hosts have pretty strict policies against this.

When you contact certain web hosts, they’ll just say, right to your face, “I don’t give a damn”, and not do shit about it. Dreamhost is one of those companies (most of the scraping that has been done to my blog is from sites hosted by Dreamhost). They flat-out told me that they won’t do anything unless a lawyer contacts them o.O

Great article about how to deal with Copyright infringements. It is a tough line to walk, of what is and isn’t ‘fair use’ but only by people protecting their intellectual property, can a dent be made in this.

On the note about hotlinking, use your htaccess file to prevent hotlinking, and if you have cPanel there is a feature in it that will generate the necessary code for your htaccess.

Lorelle: Great article, I don’t have much to add as I think it covered everything pretty nicely.

I do want to extend an open invitation though, if you’re having trouble with a non-wordpress.com site and want someone to help track down and tell you who to get in touch with, send me an email and I’ll gladly have a look at it.

It will likely be Wednesday before I’m able to get back in touch with you, but I will do what I can to help!

I really enjoyed reading your article… From my perspective, I see many blogs out there that seem to be made of materials scraped from others. While I understand the concept of a meme, I really think that using content without referencing it back to the original author is despicable.

Yes, many Google Blogger/Blogspot sites are scrapers and spammers, which is why Google needs to make it easier to track those sites down and stop them, and easier for copyright owners to report violations. Blogger has a terrible, and justified, reputation for “bad” blogs, which is so sad. It doesn’t have to be that way.

Still, while we are passing out blame, don’t forget that the majority of scrapers are self-hosted, and the biggest bastards are trying to pass themselves off as “aggregators” – indiscriminate aggregators.

You make a good point but, hmm, let’s see, I just spent two minutes this morning commenting on four blogs violating my content. An hour later, I’ve heard from two who apologized and removed the full content from their blogs, and I just checked and the third has removed the content, too. Checking took me less than a minute. Not bad for a day’s work. 😀

It’s a myth that it takes a lot of time to stop a content thief. That’s why they keep getting away with it.

It does take time, however, to write about content theft on your blog and help to teach others – the more we teach and educate everyone that being a copycat isn’t nice, the less we may have to spend any time battling this issue.

Good advice all round, I’ve started following a wikipedia style of attribution… Images and such content are listed with a fair-use rationale and list the source as accurately as possible.

It seems like good practice to ensure that the used content is limited (be that a lower quality version of an image or a limited quote of a fuller article, so as to encourage a return to the original author/creator) and that you justify your use of the content with a link to your source.

vkeong: You can scupper hot-linking by editing your .htaccess, some web searches should give you advice on possible solutions.

@ Lorelle: I was wondering if you could elaborate a little on what you see as the difference between a site that aggregates content (and helps authors by sending readers back to the original blog posts) and the scraper site. I’ve seen some sites that made me wonder, “Is this legit, or is this capitalizing off the work of others in a bad way?”

Good question, and one that can sometimes, though rarely, be in the eye of the beholder and copyright owner.

If your copyright policy is clear that it does not allow for commercial use of your content nor full post content use, only Fair Use, then it could be that any blog or site using ads with your full post content is violating your copyright. That’s pretty clear, scraper or not.

To answer your question more specifically, a “good” aggregator is typically one that specializes in specific information and one that does not use full post content but only Fair Use excerpts. It respects your copyrights, whether or not they ask. Techmeme is a good example.

A “bad” (very bad) example is one that is ad keyword based and uses your content, by Fair Use or otherwise, as its sole source of content without discrimination. You’ve probably seen your posts show up as title links, excerpts, or full posts on sites for drugs, sex, and totally unrelated subjects.

It’s kind of a “you know them when you see them” but it boils down to what your copyright allows and whether or not you have a clear enough case for copyright violation. If their usage is within your copyright policy, then there isn’t much you can do about it.

The first paragraph of usage can be considered acceptable under copyright’s Fair Use. Not much you can do about it unless your copyright policy clearly states that you will not allow ANY usage for any reason. For me, since I do not allow commercial use of my content nor use as a substitute for original content, I can ask them to remove my blog from the free scraper. They may or may not comply, and I may or may not have enough clout to take them to court with that kind of policy, but copyright law is changing due to these blatant abuses, and one day, I might be able to nail their buns to the wall for such usage. (…she smiles and dreams. 😀 )

I really don’t care. What a waste of precious time and energy. I have felt no discernable impact from scrapers. Sometimes what they do is so funny, I even point it out. Welcome to the internet, people. Quit acting like it’s still a Gutenberg press world. You have better things to do.

Hello, this is diesel28. gaararug on WordPress.com has been copying from my site continually. I have warned him twice. At first it was only my pictures, but now he sometimes copies my whole posts. He copies the post and just pastes it with no credit to my site :(

I assume that you have reported this to the WordPress support team. It is their job to respond to such allegations to protect your content, especially within WordPress.com. They are strong advocates of copyright protection.

It is also your responsibility to teach your readers and yourself about how copyright works, especially about Fair Use, so everyone can learn and blog in peace.

The more time that is spent dissecting, analyzing, and critiquing a design by the wrong kinds of people the worse that design gets. The same trend applies to the number of people involved in the design process.
Thx a lot for information!

Hi Lorelle,
I’m new to blogging, new to WordPress, (new to you :))…
I published a new plugin that’s quickly gaining popularity.
I’m trying to understand how to think about articles referring to my plugin.

Should I be pleased that they wrote a couple of original sentences?
Should I be upset that 90% of the “article” is simply copied from my readme file?
Should I give up before I even begin, accepting that I need to choose my battles wisely?

And, just to make things more interesting: I got a pingback from a posting… in Persian! Not even Google Translate knows how to decipher it. What do you do when you have absolutely no clue what “they” are saying about your blog posting? Do you accept such pingbacks or block ’em?!!

First, the second post is in Arabic not Persian – which is Arabic. Well, a derivative. It’s hard to read as much of it is translated phonetically rather than precisely, but you get the idea that they are excited about the Plugin. BloggingPro is not a plagiarist, either.

When you publish a WordPress Plugin under the GPL license, the description of the Plugin must be copied, at least in part, to help describe the Plugin to the readers. If you wrote your Plugin description well, the clarity of your words to describe it is the best choice to help promote it. It’s like quoting a press release. You want your Plugin promoted, don’t you? If not, then why put the information into the WordPress Directory?

Think this through and be honored anyone paid any attention to your Plugin.

This is vastly different from plagiarism or content theft. Your announcement in the WordPress Directory DEMANDS it be republished, and it’s covered by the GPL anyway.

Before ANY blogger can participate at WORDPRESS, they should be required to read about content theft and sign they have read it. That WORDPRESS is not liable is silly. The whole idea of blogging is to encourage content theft. The majority of bloggers have nothing original to say. Most can’t even write. WORDPRESS is laying the trap, but saying someone else put their foot in it. The day will come when someone with money will challenge the blog hosts and their whole scheme of theft will unravel.

I’m confused by your comment. WordPress is not responsible for content theft. Those who use WordPress products are liable for whatever they do with their blogs, not WordPress. WordPress.com has a very strict policy on content theft, as should every blogger and website creator and owner in the world, letting others know how they want their content used instead of others doing what they will without permission. When someone signs up for a WordPress.com account, the terms of service state clearly that copyright violators will be warned and then their sites shut down if they fail to comply.

But WordPress does not encourage content theft. People who didn’t learn in school that it is against the rules to copy others are the ones at fault. Copyright is the law. WordPress.com free hosted blogs have a staff that works hard to protect the rights of copyright owners, as the law requires. I’m proud that they are.

As for bloggers who blog without original content, I would like to see these sites shut down. There is aggregation and republishing of content WITH PERMISSION and then copyright violation. The latter must end on the web.

[…] be a little different depending on which one you own. Learn more about wordpress content theft at Lorelle’s blog. […]

[…] Content Theft And How Report It- Lorelle Van Fossen has written a detailed article that explains how you can report content theft on your WordPress.com or WordPress.org blog. This article contains a slew of great information and I’ll be looking forward to seeing what Jonathan Bailey has to add in his speech about this subject at WordCamp Dallas. […]

[…] touches on the problems of blog scraping in her post WordPress.com Blogs Feeds Scraped and Content Theft and WordPress is a brand new post worth checking out. First, you noticed that I didn’t say “if” […]

[…] blogger directly, and/or another blogger. For instance, if a WordPress.com bloggers is found to be violating copyrights by plagiarizing content or violating the WordPress.com Terms of Service with advertising, they are often warned several […]

[…] blog administrator, owner, and contributors, not the software or blogging platform. For example, report self-hosted WordPress blogs to the proper authorities, not WordPress, as they are not responsible for the content. However, WordPress.com is responsible for blogs […]