NB The following is a community-generated list of websites that republish Stack Exchange content without attributing it properly. It is no longer being maintained, because the procedure for reporting such sites has changed; see the duplicate for more information.

There are a number of license-violating clones of Stack Exchange sites popping up that use Stack Exchange's CC-wiki data without following our Creative Commons attribution terms. Those terms are linked at the bottom of every Stack Exchange webpage, and are also included as a .txt file in every data dump we produce.

The option to block a site appears when you click a search result and then navigate back to the search results page. Click the "Block" link next to that result to block all pages within the site's entire domain.

This question exists because it has historical significance, but it is not considered a good, on-topic question for this site, so please do not use it as evidence that you can ask similar questions here. This question and its answers are frozen and cannot be changed. More info: help center.

@JeffAtwood I would add poison-pill questions that are served only through those APIs, so that you could automatically re-scrape via Google and easily find infringing sites. Cartographers do much the same on physical maps, adding fake locations or markers to easily spot copies. The thieves would never know what to look for, and you could easily make this an automated process.
– Jarrod Roberson, Jul 14 '12 at 6:04

@Jeff is this handled at some level? Can we know which sites were reported and closed?
– Shadow Wizard, Aug 8 '12 at 10:58

@ShaWizDowArd I don't have a handy list of sites that have been dealt with, but we do monitor this thread and contact offenders. This post isn't just a black hole. :)
– Anna Lear♦, Aug 31 '12 at 22:25

Is there a reason the original list of offenders was split into countless posts over multiple pages? Giving it its own topic, sure, but splitting into individual answers just makes searching for existing entries a PITA and invites duplicates.
– Daniel Beck, May 27 '13 at 8:49

without (1) visually indicating that the content is from Stack Overflow, Meta Stack Overflow, Server Fault, or Super User [etc.] in some way; (2) hyperlinking directly to the original question on the source site; or (4) hyperlinking each author name directly back to their user profile page on the source site.

For (2), they do link back ("Read More Details"), but the link goes through tinyurl and is marked nofollow, so it's not a direct link.
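As a rough illustration of that check, here is a minimal Python sketch that tests whether a page contains at least one link pointing straight at a source site without `rel="nofollow"`. The host list and the names `SOURCE_HOSTS`, `LinkCollector`, and `has_direct_followed_link` are made up for this example; it does not follow shortener redirects, so a tinyurl link simply counts as "not direct".

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed set of source hosts for this sketch; extend as needed.
SOURCE_HOSTS = {
    "stackoverflow.com",
    "meta.stackoverflow.com",
    "serverfault.com",
    "superuser.com",
}


class LinkCollector(HTMLParser):
    """Collects (href, rel tokens) for every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href:
            rel = (attr_map.get("rel") or "").lower().split()
            self.links.append((href, rel))


def has_direct_followed_link(html):
    """Return True if some link points straight at a source host
    and is not marked rel="nofollow"."""
    collector = LinkCollector()
    collector.feed(html)
    for href, rel in collector.links:
        host = (urlparse(href).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        if host in SOURCE_HOSTS and "nofollow" not in rel:
            return True
    return False
```

A page whose only link back is `<a rel="nofollow" href="http://tinyurl.com/...">Read More Details</a>` would fail this check on both counts.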

They have this in the footer, but it doesn't meet SE attribution requirements:

Except for third party materials and otherwise stated, this content is made available under a Creative Commons Attribution-NonCommercial-ShareAlike 2.0 Licence. The original Author/Publisher name is mentioned above.

yulebiao.com

Site has malware / rogue JS issues so be wary, example link: http://www.yulebiao.com/questions/15571503/multiple-tables-for-addresses-serving-multiple-purposes

It looks like someone wrote a connector to import our dumps right into Question2Answer, but it doesn't handle the attribution requirements. Additionally, the site might be compromised - so be wary when visiting.

I actually came across it because of a spamming user who posted a link to their site twice in this question: Support offline data display in iOS app (10k only). The link pointed to their copy of that question.

Checking out their home page today, they seem to be picking up new content pretty quickly, probably using the API. They also have Disqus comments active and seem to get a bit of activity there, so they are apparently getting hits from Google.

Here is a description from the site, a pretty murky looking, badly designed FAQ page:

IFMDb is searchable database of messages which are legally posted on the internet forums by internet community, for internet community

To illustrate the cc-wiki breach, take for example http://android.ifmdb.com/IsEBDg6p0-set-a-new-views-margins-programatically.html. This page does not break attribution requirement 1, since there is a reference to StackOverflow (albeit not hyperlinked). What it does breach is requirement 2: the link to the "full question" goes through a proxy page between the clone site and StackOverflow, e.g. http://android.ifmdb.com/fullpost-IsEBDg6p0.html, which has a nofollow and doesn't link back directly.

Furthermore, I believe requirement 3 is breached as well, as only "Unknown" shows up as the author. To round it off, requirement 4 is not respected either, as there is no reference to the StackOverflow account that originally asked the question – original question is here