NB The following is a community-generated list of websites that republish Stack Exchange content without attributing it properly. It is no longer being maintained, because the procedure for reporting such sites has changed; see the duplicate for more information.

There are a number of license-violating clones of Stack Exchange sites popping up that use Stack Exchange's CC-wiki data without following our Creative Commons attribution terms. Those terms are linked at the bottom of every Stack Exchange webpage, and are also included as a .txt file in every data dump we produce.

The option to block a site appears when you click a search result and then navigate back to the search results page. Click the "Block" link next to that result to block all pages within the site's entire domain.

This question exists because it has historical significance, but it is not considered a good, on-topic question for this site, so please do not use it as evidence that you can ask similar questions here. This question and its answers are frozen and cannot be changed. More info: help center.

@JeffAtwood I would plant one or more poison-pill questions that only those APIs serve, so that you could automatically re-scrape via Google and easily find infringing sites. Cartographers do much the same on physical maps, adding fake locations or markers to spot copies easily. The thieves would never know what to look for, and you could easily make this an automated process.
– Jarrod Roberson, Jul 14 '12 at 6:04
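
The poison-pill idea in the comment above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the `zq` prefix, and the use of an HMAC-derived marker are all hypothetical, and nothing here reflects how Stack Exchange actually monitors scrapers.

```python
import hashlib
import hmac


def canary_token(secret: bytes, post_id: int, length: int = 12) -> str:
    """Derive a stable, hard-to-guess marker for one decoy post.

    Hypothetical helper: the marker is the first `length` hex digits of an
    HMAC-SHA256 over the post ID, with an arbitrary "zq" prefix.
    """
    digest = hmac.new(secret, str(post_id).encode("utf-8"), hashlib.sha256).hexdigest()
    return "zq" + digest[:length]


def find_canaries(page_text: str, tokens: dict[str, int]) -> list[int]:
    """Return the IDs of decoy posts whose marker appears in page_text."""
    lowered = page_text.lower()  # markers are lowercase, so compare case-insensitively
    return sorted(post_id for token, post_id in tokens.items() if token in lowered)
```

A page fetched from a suspected clone would be passed to `find_canaries` together with a mapping built from `canary_token`; any returned ID suggests that page copied the corresponding decoy post.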

@Jeff is this handled at some level? Can we know which sites were reported and closed?
– Shadow Wizard, Aug 8 '12 at 10:58

@ShaWizDowArd I don't have a handy list of sites that have been dealt with, but we do monitor this thread and contact offenders. This post isn't just a black hole. :)
– Anna Lear♦, Aug 31 '12 at 22:25

Is there a reason the original list of offenders was split into countless posts over multiple pages? Giving it its own topic, sure, but splitting into individual answers just makes searching for existing entries a PITA and invites duplicates.
– Daniel Beck, May 27 '13 at 8:49

purports to be Stack Overflow (contact page). Ironically, it even offers SE as a login.

Site is pretty egregious, direct copy, user profiles and everything. Only site with 100% user profile copies that I have found. Even uses the favicon.

At first the site wouldn't let me view (said I had reached my maximum or some message along those lines), and I had to delete the cookie on my machine from them in order to access the page. Once I realized it was a copy I stopped browsing because of the malicious nature of the owner.

Copy: http://www.advancesharp.com/Questions/6463/beating-jquery-addiction-which-jquery-methods-are-easily-translated-into-pure-javascript, which the owner of the site takes credit for... methinks Felix Kling will not be impressed.

Really, after you find a bogus Question/Answer combo just click on the 'user' who posted it. All other content from said user is scraped from SO - and there are MANY of these users all with 2/3 questions / answers - I can't see it being an elaborate way for someone on that site to gain 'rep'

They show you a question, and to get the answer, you need to view an offer. You don't need to click on it, there is a "skip" button, but they are hiding other sites behind their offers, so they get money for sending you away from their site.

Them: http://www.solutionoferror.com/html/sublime-text-2-autocompleting-html-escaped-characters-starting-with-amper-40267.asp SO: stackoverflow.com/questions/11870427/… The site rates high on my malware-o-meter. It asked for a FB login (when you click whitespace) and to install Flash (when you go back).
– KatieK, Jun 14 '13 at 20:52

The scraper also purports to give a link "View original page at" (no further text), but this link goes nowhere. The scraper is presumably availing him/herself of a great deal of Stack Exchange content and acting as a clearinghouse for it.

This scraper site is heavily ad-supported.

Does state at the bottom that the content is licensed under cc-wiki and enumerates some (not all) SE sites involved, but does not link to the original question or the users.

A search for the term "Judaism" on their main site turns up over 600 hits on the subject, many, many of which (in a cursory search) are directly copied from Mi.Yodeya; a search for an active username turns up hundreds of hits, including questions, answers, and comments.

In addition, a search for "English Language & Usage" without the quotes turns up hundreds of hits (many from that SE site); a search for it with the quotes turns up 48; and searches for names with which I am familiar from other SE sites turn up content from those SE users.

The attribution is not only flimsy but actually incorrect. There's a disclaimer at the bottom of the main page (but over at Mi.Yodeya we believe it to be violating the license terms)

The content is from serverfault.com, superuser.com and stackoverflow.com,
and is licensed under cc-wiki. Any advice please contact us.

Confirmed, and I also see this site scraping content from Physics, e.g. this (original). Who knows how many other network sites they are taking from (without any attribution whatsoever).
– David Z, Jul 31 '13 at 1:42

Funny: the copies faithfully reproduce inter-post links back to the originating Stack Exchange site (see for instance my answer on http://www.techques.com/question/5-31514/Why-is-the-%28free%29-neutron-lifetime-so-long? which links back to physics.stackexchange.com/a/31526/520). The person who wrote the scraper is too much of a poser to even get that right. ::sigh:: You just can't get good villains these days.
– dmckee, Jul 31 '13 at 6:29

copies the content of SE sites without any attribution. For example, http://www.happyforlove.com/questions/c_444314/prove-that-in-an-obtuse-triangle-the-orthocentre-is-the-excenter-of-the-orthic-t is a copy of this one on Math SE.

Scratch that, they're scraping from multiple sites now. I see at least a few questions from Photo.SE, and Sharepoint.SE.
– fbueckert, Jun 10 '13 at 14:23

One of their latest scrapes is completely blatant: http://qandasys.info/how-to-do-organize-papers-urls-and-other-tcs-related-resources/ No attempt at all to reroute URLs.
– fbueckert, Jun 12 '13 at 14:37

Example: http://www.iasptk.com/ubuntuwp/tag/11-04/ contains content from many questions and answers.
At the time of this writing, the first question on that page is the AU question Iptables proxy host on port.

It goes as far as using the list of site tags and site name as headings/links, but does not indicate Stack Exchange as the source of the content. The four requirements from here are not followed.

I googled my name "Rocket Hazmat" (keep the quotes), and this page was the 7th result. Not just the question and answers, they are also copying user names AND avatars with no links to the source! (Their scraper is pretty crappy, because their page shows that I posted 3 answers. I didn't post any, I just happened to have been the last person to edit 3 of the answers!)
– Rocket Hazmat, Jun 13 '13 at 14:08

http://qnundrum.com/answer.php?q=19297 is a copy of a question from AskDifferent. It links to AskDifferent's author page and then gives no answer, just a button "View Full Answer on Apple StackExchange."

No attribution to us.

Note that its Contact page says

If a question of yours was indexed by Qnundrum that you don't want indexed, click here to fill out a take down request form. Submitted questions for takedown that are approved will be removed within 48 hours.

The entire site looks like it was copied, but the real clue was http://google-seo-help.blogspot.com/2010/10/how-can-i-fetch-more-than-1000-google.html, taken verbatim from SO: http://stackoverflow.com/a/3794141/453277

It's quite funny, because in the answer I use my own name repeatedly as sample data. The post is attributed to "Peter".

I can find no indication anywhere of the content's origin. It's not a big site, but it's a blatant ripoff.

Examples

On Gaming and on LoveVideoGame: www.lovevideogame.com/q/answers-is-it-possible-to-use-a-different-folder-than-minecraft-100106.html

lovevideogame.com/q/answers-starcraft-1-disc-2-isnt-continuing-the-installation-126684.html is a flagrant copy of Starcraft 1 disc 2 isn't continuing the installation. No recognition whatsoever that this is Stack Exchange material. They even stole the profile pictures along with the usernames.

Examples

http://www.blogosfera.co.uk/2013/08/implement-iqueryable-wrapper-to-translate-result-objects/ Note the lack of links to the original source (here). However, if you go here: http://www.blogosfera.co.uk/category/net-2/, you should be able to find the same post, and the title will take you to the original source (click the date at the bottom to get to the individual post on the system).

codeblow.com

Does not visually indicate origin, does not link to the original question, does not show any author names, and therefore cannot hyperlink back to the user profile.
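
The four checks named above (visible origin, link to the original question, author names, links to author profiles) lend themselves to a rough automated test. The following is a minimal sketch, not an official tool: `check_attribution` and its parameters are hypothetical, exact URL matching is a simplifying assumption, and text inside script or style tags is counted as visible text.

```python
from html.parser import HTMLParser


class _LinkAndTextCollector(HTMLParser):
    """Collect every <a href> value and every text node from a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

    def handle_data(self, data):
        self.text_parts.append(data)


def check_attribution(html, source_name, question_url, authors):
    """Report which of the four attribution checks a page passes.

    `authors` maps each author's display name to their profile URL.
    URLs must match exactly (an assumption made to keep the sketch short).
    """
    collector = _LinkAndTextCollector()
    collector.feed(html)
    text = " ".join(collector.text_parts).lower()
    hrefs = set(collector.hrefs)
    return {
        "names_source": source_name.lower() in text,
        "links_question": question_url in hrefs,
        "shows_authors": all(name.lower() in text for name in authors),
        "links_profiles": all(url in hrefs for url in authors.values()),
    }
```

A page like the one described here would come back with all four values `False`; a partially attributed copy would show which checks still fail.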

Says "Code Blow provides to you all the programmers knowledge of all programmers that took time to write answers to every question that you can find here. Feel free to ready and learn from all these informations." - They acknowledge that it took time, but feel that it's OK to steal it? What gives?

www.gp958.com is a verbatim copy of the majority of content from Android Enthusiasts right down to the site's title tag, and linking images from stack.imgur.com. The copyright line at the bottom of each page says: