Host- and Domain-Level Web Graphs Nov/Dec/Jan 2017-2018
We are pleased to announce a new release of host-level and domain-level web graphs based on the published crawls of November, December 2017 and January 2018. These graphs, along with ranked lists of hosts and domains, follow the prior web graph releases (Feb/Mar/Apr 2017, May/Jun/Jul 2017 and Aug/Sep/Oct 2017). Additional information about data formats, the processing pipeline, our objectives, and credits can be found in the preceding announcements.
Please note that the first released version (released 2018-02-08, withdrawn 2018-02-21) contained only links from the January 2018 crawl; see the notice on the Common Crawl user group. A fix was provided on 2018-02-28, with graphs and rankings containing all links, hosts, and domains across all three crawls. We also provide the erroneously released graphs and rankings from the January 2018 crawl.
What’s new?
Here is a summary of notable aspects and changes of this web graph release:
- a bug has been fixed that prevented protocol-relative links pointing to a different host (`//www.example.com/index.html`) from being added as edges of the host-/domain-level webgraphs
- the domain graph now contains the number of hosts per domain as an additional column in the vertices and rankings files
- the naming scheme has changed – the release name is now part of the file name
- webgraph offset files are not released any more; they can be created by running

      java it.unimi.dsi.webgraph.BVGraph -O -L cc-main-2017-18-nov-dec-jan-host
      java it.unimi.dsi.webgraph.BVGraph -O -L cc-main-2017-18-nov-dec-jan-domain
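To illustrate the first item in the list above: a link that starts with `//` inherits only the scheme of the page it appears on, so it can point to a different host and should produce an edge. The following is a minimal sketch using Python's standard library, not the code used to build the graphs; the page URL and the helper name `link_target_host` are made up for the example.

```python
from urllib.parse import urljoin, urlsplit


def link_target_host(page_url, href):
    """Resolve href against the page URL and return the host it points to."""
    return urlsplit(urljoin(page_url, href)).hostname


# Hypothetical page on which the links are found.
page = "https://blog.example.org/post/1"

# A path-relative link stays on the same host.
same_host = link_target_host(page, "/about.html")

# A protocol-relative link keeps the scheme but switches the host.
other_host = link_target_host(page, "//www.example.com/index.html")

print(same_host, other_host)
```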
Host-level graph
The graph consists of 2.75 billion nodes and 8.6 billion edges. The graph includes dangling nodes, i.e. hosts that have not been crawled but are pointed to by a link on a crawled page. There are 2.67 billion dangling nodes (97%), and the largest strongly connected component contains only 65 million nodes (2.3%). The host names are reversed and a leading `www.` is stripped: `www.subdomain.example.com` becomes `com.example.subdomain`.
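The reversal rule can be sketched in a few lines of Python. This is an illustration of the naming convention described above, not the code used in the processing pipeline; the function names `reverse_host` and `unreverse_host` are chosen for this example.

```python
def reverse_host(host):
    """Strip a leading 'www.' and reverse the order of the dot-separated labels."""
    if host.startswith("www."):
        host = host[len("www."):]
    return ".".join(reversed(host.split(".")))


def unreverse_host(rev_host):
    """Turn a reversed name back into the usual order (the 'www.' is not restored)."""
    return ".".join(reversed(rev_host.split(".")))


print(reverse_host("www.subdomain.example.com"))
print(unreverse_host("com.example.subdomain"))
```

Note that the mapping is not invertible: once `www.` is stripped, `www.example.com` and `example.com` share the same reversed name.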
You can download the graph and the ranks of all 2.75 billion hosts from AWS S3 under the path `s3://commoncrawl/projects/hyperlinkgraph/cc-main-2017-18-nov-dec-jan/host/`. Alternatively, you can use `https://data.commoncrawl.org/projects/hyperlinkgraph/cc-main-2017-18-nov-dec-jan/host/` as a prefix to access the files from anywhere.
The following files and formats are provided:
Download files of the Common Crawl Nov/Dec/Jan 2017-18 host-level webgraph
Size | File | Description |
---|---|---|
15.9 GB | cc-main-2017-18-nov-dec-jan-host-vertices.paths.gz | nodes 〈id, rev host〉, paths of 28 vertices files |
40.0 GB | cc-main-2017-18-nov-dec-jan-host-edges.paths.gz | edges 〈from_id, to_id〉, paths of 28 edges files |
16.4 GB | cc-main-2017-18-nov-dec-jan-host.graph | graph in BVGraph format |
2 kB | cc-main-2017-18-nov-dec-jan-host.properties | |
24.2 GB | cc-main-2017-18-nov-dec-jan-host-t.graph | transpose of the graph (outlinks inverted to inlinks) |
2 kB | cc-main-2017-18-nov-dec-jan-host-t.properties | |
1 kB | cc-main-2017-18-nov-dec-jan-host.stats | WebGraph statistics |
38.1 GB | cc-main-2017-18-nov-dec-jan-host-ranks.txt.gz | harmonic centrality and pagerank |
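The vertices and edges files can be joined to turn numeric edge IDs back into host names. The sketch below assumes that each line of a vertices file holds `id` and reversed host separated by a tab, and that each line of an edges file holds two whitespace-separated IDs; the separators are an assumption, and the sample data is made up so that the example runs without downloading anything.

```python
import gzip
import io


def read_vertices(stream):
    """Yield (id, reversed_host) pairs from lines of the form 'id<TAB>rev_host'."""
    for line in stream:
        node_id, rev_host = line.rstrip("\n").split("\t", 1)
        yield int(node_id), rev_host


def read_edges(stream):
    """Yield (from_id, to_id) pairs from lines holding two integer IDs."""
    for line in stream:
        from_id, to_id = line.split()
        yield int(from_id), int(to_id)


# Made-up sample data, gzip-compressed in memory to mimic the .gz files.
sample_vertices = gzip.compress(b"0\tcom.example\n1\tcom.example.subdomain\n")
sample_edges = gzip.compress(b"0\t1\n1\t0\n")

with gzip.open(io.BytesIO(sample_vertices), "rt", encoding="utf-8") as f:
    id_to_host = dict(read_vertices(f))

with gzip.open(io.BytesIO(sample_edges), "rt", encoding="utf-8") as f:
    named_edges = [(id_to_host[a], id_to_host[b]) for a, b in read_edges(f)]

print(named_edges)
```

For the real host-level files, holding all 2.75 billion names in a Python dictionary is unlikely to fit in memory; the dictionary is used here only to keep the sketch short.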
Domain-level graph
The domain graph was built by aggregating the host graph on the level of pay-level domains (PLDs). The extraction of PLDs is based on the public suffix list from publicsuffix.org. Only “ICANN” domains are accepted; “private” domains are not accepted (cf. section “divisions” in the documentation on publicsuffix.org). For example, `foo.blogspot.com` and `commoncrawl.s3.amazonaws.com` are not accepted as pay-level domains; they are aggregated into the domains `blogspot.com` and `amazonaws.com`, respectively, and stored in the reversed forms `com.blogspot` and `com.amazonaws`.
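The aggregation rule can be sketched as follows. This is a simplified illustration, not the pipeline's implementation: `ICANN_SUFFIXES` is a tiny hand-picked stand-in for the ICANN section of the public suffix list, the function name `reversed_pld` is hypothetical, and the wildcard and exception rules of the real list are ignored.

```python
# Toy stand-in for the ICANN section of the public suffix list (assumption).
# Private-section entries such as blogspot.com are deliberately absent.
ICANN_SUFFIXES = {"com", "org", "uk", "co.uk"}


def reversed_pld(host, suffixes=ICANN_SUFFIXES):
    """Return the pay-level domain of host in reversed form, or None if there is none."""
    labels = host.lower().split(".")
    # Scanning from the left finds the longest matching public suffix first.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in suffixes:
            if i == 0:
                return None  # the host is itself a public suffix
            # The PLD is the public suffix plus one label to its left.
            return ".".join(reversed(labels[i - 1:]))
    return None  # no known suffix matched


print(reversed_pld("foo.blogspot.com"))
print(reversed_pld("www.bbc.co.uk"))
```

Because `blogspot.com` is not in the ICANN-only set, `foo.blogspot.com` falls back to the suffix `com` and is aggregated under `com.blogspot`.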
The domain-level graph has 94 million nodes and 1.44 billion edges. 56 million nodes (59%) are dangling nodes; the largest strongly connected component covers 33 million nodes (35%).
All files related to the domain graph are available on AWS S3 under `s3://commoncrawl/projects/hyperlinkgraph/cc-main-2017-18-nov-dec-jan/domain/` or via `https://data.commoncrawl.org/projects/hyperlinkgraph/cc-main-2017-18-nov-dec-jan/domain/`.
Download files of the Common Crawl Nov/Dec/Jan 2017-18 domain-level webgraph
Size | File | Description |
---|---|---|
0.67 GB | cc-main-2017-18-nov-dec-jan-domain-vertices.txt.gz | nodes 〈id, rev domain, num hosts〉 |
5.7 GB | cc-main-2017-18-nov-dec-jan-domain-edges.txt.gz | edges 〈from_id, to_id〉 |
3.1 GB | cc-main-2017-18-nov-dec-jan-domain.graph | graph in BVGraph format |
2 kB | cc-main-2017-18-nov-dec-jan-domain.properties | |
3.3 GB | cc-main-2017-18-nov-dec-jan-domain-t.graph | transpose of the graph |
2 kB | cc-main-2017-18-nov-dec-jan-domain-t.properties | |
1 kB | cc-main-2017-18-nov-dec-jan-domain.stats | WebGraph statistics |
2.0 GB | cc-main-2017-18-nov-dec-jan-domain-ranks.txt.gz | harmonic centrality and pagerank |
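A line of the domain ranks file could be parsed along the following lines. The column order is an assumption taken from the top-1000 table in this announcement (harmonic centrality rank and value, PageRank rank and value, reversed name), with the number of hosts assumed to follow as an optional trailing column; check the actual file before relying on it. The second sample line is made up.

```python
from typing import NamedTuple, Optional


class RankRow(NamedTuple):
    hc_rank: int
    hc_value: float
    pr_rank: int
    pr_value: float
    rev_name: str
    n_hosts: Optional[int]  # assumed trailing column, None if absent


def parse_rank_line(line):
    """Parse one whitespace-separated line of a ranks file (assumed column order)."""
    fields = line.split()
    n_hosts = int(fields[5]) if len(fields) > 5 else None
    return RankRow(int(fields[0]), float(fields[1]),
                   int(fields[2]), float(fields[3]), fields[4], n_hosts)


# Values copied from the top-1000 table (row for com.google).
row = parse_rank_line("3\t23718256\t3\t0.009278\tcom.google")

# Made-up line showing the assumed extra column for the number of hosts.
row_with_hosts = parse_rank_line("1\t10.0\t2\t0.5\tcom.example\t3")

print(row.rev_name, row.hc_rank, row.pr_rank, row_with_hosts.n_hosts)
```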
Below you’ll find the top 1000 domains ranked by Harmonic Centrality or PageRank. The full list of all 94 million domains is available for download.
Top 1000 domains ranked by harmonic centrality (Nov/Dec/Jan 2017-2018)
harmonic centrality rank | harmonic centrality value | PageRank rank | PageRank value | reversed domain name |
---|---|---|---|---|
1 | 26073210 | 2 | 0.013220 | com.facebook |
2 | 25501832 | 1 | 0.016444 | com.googleapis |
3 | 23718256 | 3 | 0.009278 | com.google |
4 | 23371534 | 4 | 0.008406 | com.twitter |
5 | 22832192 | 5 | 0.007823 | com.youtube |
6 | 21653376 | 6 | 0.006112 | org.w |
7 | 20324636 | 7 | 0.004710 | org.gmpg |
8 | 20045928 | 8 | 0.003501 | com.instagram |
9 | 19837996 | 10 | 0.002871 | com.linkedin |
10 | 19439618 | 12 | 0.002753 | org.wordpress |
11 | 19334234 | 14 | 0.002070 | com.wordpress |
12 | 19214522 | 17 | 0.001665 | com.pinterest |
13 | 19145770 | 27 | 0.001242 | org.wikipedia |
14 | 19121822 | 23 | 0.001462 | com.gravatar |
15 | 18842296 | 33 | 0.000966 | com.blogspot |
16 | 18810990 | 11 | 0.002837 | com.bootstrapcdn |
17 | 18718320 | 19 | 0.001594 | com.apple |
18 | 18626224 | 26 | 0.001255 | com.vimeo |
19 | 18434062 | 15 | 0.001863 | com.adobe |
20 | 18419880 | 44 | 0.000691 | be.youtu |
21 | 18397832 | 34 | 0.000964 | com.amazon |
22 | 18350614 | 13 | 0.002084 | com.macromedia |
23 | 18323552 | 29 | 0.001015 | com.microsoft |
24 | 18321908 | 41 | 0.000757 | gl.goo |
25 | 18302296 | 31 | 0.001009 | com.flickr |
26 | 18270630 | 46 | 0.000657 | com.tumblr |
27 | 18183288 | 59 | 0.000540 | com.yahoo |
28 | 18136014 | 20 | 0.001531 | net.doubleclick |
29 | 18074436 | 70 | 0.000464 | ly.bit |
30 | 18072284 | 32 | 0.000988 | com.amazonaws |
31 | 18039506 | 18 | 0.001618 | com.googletagmanager |
32 | 17994916 | 35 | 0.000913 | com.paypal |
33 | 17957448 | 78 | 0.000417 | eu.europa |
34 | 17950818 | 25 | 0.001280 | com.cloudflare |
35 | 17880136 | 87 | 0.000397 | com.weebly |
36 | 17863816 | 30 | 0.001012 | com.github |
37 | 17859140 | 81 | 0.000412 | org.mozilla |
38 | 17838500 | 40 | 0.000769 | net.cloudfront |
39 | 17830430 | 95 | 0.000348 | co.t |
40 | 17794416 | 80 | 0.000414 | org.creativecommons |
41 | 17773226 | 102 | 0.000289 | com.googleusercontent |
42 | 17757566 | 57 | 0.000562 | org.w3 |
43 | 17751372 | 39 | 0.000782 | io.github |
44 | 17703562 | 97 | 0.000340 | com.soundcloud |
45 | 17674626 | 118 | 0.000226 | com.blogger |
46 | 17673486 | 138 | 0.000182 | net.slideshare |
47 | 17666384 | 108 | 0.000265 | com.reddit |
48 | 17650506 | 51 | 0.000617 | com.bing |
49 | 17622678 | 147 | 0.000171 | com.myspace |
50 | 17614686 | 65 | 0.000474 | com.medium |
51 | 17600302 | 117 | 0.000233 | org.archive |
52 | 17597652 | 136 | 0.000187 | com.imgur |
53 | 17581558 | 66 | 0.000474 | com.list-manage |
54 | 17545184 | 37 | 0.000804 | org.apache |
55 | 17499074 | 155 | 0.000154 | com.imdb |
56 | 17493316 | 240 | 0.000097 | com.about |
57 | 17491778 | 28 | 0.001104 | com.gstatic |
58 | 17471556 | 169 | 0.000144 | com.wsj |
59 | 17464136 | 126 | 0.000218 | com.jimdo |
60 | 17462240 | 234 | 0.000101 | com.livejournal |
61 | 17450286 | 47 | 0.000649 | com.wp |
62 | 17447836 | 129 | 0.000206 | com.issuu |
63 | 17445244 | 130 | 0.000204 | com.android |
64 | 17443518 | 122 | 0.000222 | com.yelp |
65 | 17419300 | 43 | 0.000721 | com.statcounter |
66 | 17406774 | 50 | 0.000626 | me.wp |
67 | 17392892 | 179 | 0.000138 | com.oracle |
68 | 17372570 | 162 | 0.000148 | com.digg |
69 | 17368632 | 231 | 0.000102 | me.about |
70 | 17367318 | 299 | 0.000078 | com.scribd |
71 | 17361946 | 255 | 0.000091 | org.python |
72 | 17359688 | 127 | 0.000210 | uk.co.google |
73 | 17357006 | 61 | 0.000525 | com.cnn |
74 | 17342014 | 124 | 0.000220 | com.nytimes |
75 | 17339968 | 319 | 0.000073 | com.quora |
76 | 17329620 | 249 | 0.000092 | com.ted |
77 | 17321450 | 153 | 0.000161 | com.spotify |
78 | 17301998 | 148 | 0.000168 | com.wixsite |
79 | 17300500 | 233 | 0.000101 | com.dailymotion |
80 | 17297908 | 208 | 0.000118 | com.staticflickr |
81 | 17288954 | 390 | 0.000062 | org.chromium |
82 | 17276404 | 106 | 0.000273 | com.ytimg |
83 | 17269890 | 259 | 0.000089 | com.webs |
84 | 17265306 | 145 | 0.000177 | org.ietf |
85 | 17255342 | 222 | 0.000109 | com.mozilla |
86 | 17243666 | 187 | 0.000133 | net.behance |
87 | 17243048 | 191 | 0.000130 | com.disqus |
88 | 17242476 | 273 | 0.000085 | com.mysql |
89 | 17240042 | 158 | 0.000152 | com.stumbleupon |
90 | 17236410 | 268 | 0.000085 | com.foursquare |
91 | 17231272 | 314 | 0.000075 | gov.loc |
92 | 17213010 | 151 | 0.000164 | org.gnu |
93 | 17210118 | 146 | 0.000171 | com.tripadvisor |
94 | 17203374 | 361 | 0.000066 | org.nodejs |
95 | 17201882 | 378 | 0.000064 | com.storify |
96 | 17178790 | 156 | 0.000153 | com.forbes |
97 | 17177956 | 60 | 0.000527 | com.huffingtonpost |
98 | 17168464 | 133 | 0.000196 | com.dropbox |
99 | 17164012 | 199 | 0.000125 | com.typepad |
100 | 17156522 | 241 | 0.000097 | com.example |
101 | 17150188 | 166 | 0.000146 | uk.co.bbc |
102 | 17148528 | 479 | 0.000051 | edu.virginia |
103 | 17142618 | 89 | 0.000384 | com.paypalobjects |
104 | 17140226 | 48 | 0.000645 | net.fbcdn |
105 | 17130568 | 403 | 0.000060 | com.pixabay |
106 | 17126316 | 383 | 0.000063 | ca.blogspot |
107 | 17118492 | 200 | 0.000124 | org.wikimedia |
108 | 17116358 | 297 | 0.000079 | com.githubusercontent |
109 | 17115676 | 363 | 0.000066 | com.sun |
110 | 17111592 | 36 | 0.000863 | com.squarespace |
111 | 17106572 | 292 | 0.000079 | com.goodreads |
112 | 17105500 | 56 | 0.000574 | com.fb |
113 | 17103768 | 459 | 0.000053 | kr.flic |
114 | 17094226 | 431 | 0.000057 | org.ampproject |
115 | 17086396 | 530 | 0.000048 | edu.gatech |
116 | 17086356 | 180 | 0.000137 | com.theguardian |
117 | 17085768 | 96 | 0.000344 | com.wix |
118 | 17083032 | 518 | 0.000049 | it.scoop |
119 | 17081382 | 427 | 0.000057 | org.sciencemag |
120 | 17072106 | 139 | 0.000182 | net.sourceforge |
121 | 17062258 | 515 | 0.000049 | com.nike |
122 | 17056708 | 400 | 0.000060 | org.eclipse |
123 | 17054770 | 432 | 0.000056 | co.g |
124 | 17052438 | 269 | 0.000085 | com.tinyurl |
125 | 17052256 | 62 | 0.000509 | net.akamaihd |
126 | 17047904 | 437 | 0.000055 | org.kernel |
127 | 17045616 | 68 | 0.000467 | com.mashable |
128 | 17045460 | 489 | 0.000051 | au.com.blogspot |
129 | 17042294 | 64 | 0.000480 | org.schema |
130 | 17041062 | 620 | 0.000043 | com.discogs |
131 | 17038234 | 141 | 0.000181 | com.youtube-nocookie |
132 | 17037262 | 370 | 0.000065 | com.npmjs |
133 | 17034618 | 298 | 0.000079 | com.symantec |
134 | 17023588 | 196 | 0.000126 | com.live |
135 | 17018612 | 328 | 0.000072 | com.googlecode |
136 | 17016620 | 396 | 0.000061 | com.git-scm |
137 | 17012130 | 394 | 0.000061 | com.500px |
138 | 17011310 | 198 | 0.000126 | edu.stanford |
139 | 17010782 | 416 | 0.000058 | com.unity3d |
140 | 17010532 | 686 | 0.000042 | com.wikidot |
141 | 16992494 | 334 | 0.000071 | com.alexa |
142 | 16983610 | 447 | 0.000054 | com.sap |
143 | 16978146 | 250 | 0.000092 | com.businessinsider |
144 | 16976726 | 272 | 0.000085 | com.cnet |
145 | 16976366 | 372 | 0.000064 | com.getpocket |
146 | 16971698 | 252 | 0.000092 | com.go |
147 | 16964456 | 232 | 0.000101 | com.washingtonpost |
148 | 16957034 | 567 | 0.000046 | com.chrome |
149 | 16955976 | 9 | 0.003080 | com.godaddy |
150 | 16954152 | 140 | 0.000182 | com.sharethis |
151 | 16953542 | 211 | 0.000115 | com.ebay |
152 | 16949556 | 506 | 0.000050 | edu.berkeley |
153 | 16948812 | 377 | 0.000064 | au.gov.nsw |
154 | 16942594 | 289 | 0.000080 | com.msn |
155 | 16938334 | 333 | 0.000072 | com.time |
156 | 16937752 | 213 | 0.000114 | com.nbcnews |
157 | 16934364 | 75 | 0.000429 | edu.utexas |
158 | 16930038 | 532 | 0.000048 | com.jetbrains |
159 | 16927712 | 317 | 0.000074 | edu.harvard |
160 | 16924770 | 545 | 0.000047 | ms.1drv |
161 | 16917850 | 189 | 0.000130 | com.etsy |
162 | 16914838 | 176 | 0.000140 | gov.nih |
163 | 16911752 | 664 | 0.000043 | com.klout |
164 | 16905058 | 327 | 0.000072 | edu.mit |
165 | 16903928 | 316 | 0.000074 | com.reuters |
166 | 16898946 | 235 | 0.000098 | com.mapquest |
167 | 16898876 | 318 | 0.000074 | com.wired |
168 | 16893364 | 570 | 0.000046 | com.crunchbase |
169 | 16893270 | 401 | 0.000060 | gov.nasa |
170 | 16890130 | 722 | 0.000040 | com.4shared |
171 | 16885770 | 281 | 0.000082 | io.codepen |
172 | 16882882 | 295 | 0.000079 | com.photobucket |
173 | 16875932 | 257 | 0.000090 | com.udacity |
174 | 16865692 | 309 | 0.000076 | com.aol |
175 | 16858168 | 408 | 0.000059 | com.cnbc |
176 | 16853816 | 293 | 0.000079 | com.tripod |
177 | 16848676 | 517 | 0.000049 | org.aarp |
178 | 16847720 | 563 | 0.000046 | edu.utah |
179 | 16846928 | 342 | 0.000070 | org.npr |
180 | 16844128 | 746 | 0.000039 | com.diigo |
181 | 16842074 | 303 | 0.000077 | com.meetup |
182 | 16840924 | 120 | 0.000223 | com.mailchimp |
183 | 16840096 | 367 | 0.000065 | com.gmail |
184 | 16835606 | 24 | 0.001310 | ru.yandex |
185 | 16834612 | 425 | 0.000057 | com.appspot |
186 | 16833556 | 287 | 0.000080 | com.ibm |
187 | 16827030 | 338 | 0.000071 | gov.ca |
188 | 16826202 | 242 | 0.000095 | com.surveymonkey |
189 | 16825532 | 276 | 0.000083 | com.usatoday |
190 | 16824988 | 778 | 0.000038 | com.googledrive |
191 | 16822846 | 749 | 0.000039 | com.naturalnews |
192 | 16819990 | 764 | 0.000038 | io.soup |
193 | 16815880 | 340 | 0.000070 | uk.co.telegraph |
194 | 16814236 | 163 | 0.000148 | com.eventbrite |
195 | 16813884 | 206 | 0.000119 | com.opera |
196 | 16813306 | 676 | 0.000043 | com.zappos |
197 | 16811868 | 88 | 0.000394 | com.jquery |
198 | 16811796 | 692 | 0.000042 | com.wholefoodsmarket |
199 | 16809508 | 535 | 0.000048 | com.createspace |
200 | 16809072 | 322 | 0.000073 | com.images-amazon |
201 | 16807592 | 304 | 0.000077 | com.bloomberg |
202 | 16796502 | 193 | 0.000128 | com.twimg |
203 | 16793364 | 414 | 0.000058 | com.kickstarter |
204 | 16792730 | 103 | 0.000285 | com.addthis |
205 | 16791844 | 251 | 0.000092 | com.techcrunch |
206 | 16791074 | 804 | 0.000037 | edu.washington |
207 | 16790852 | 689 | 0.000042 | com.abebooks |
208 | 16790694 | 294 | 0.000079 | com.googlesyndication |
209 | 16790246 | 511 | 0.000049 | edu.cornell |
210 | 16785260 | 529 | 0.000048 | com.buzzfeed |
211 | 16783130 | 412 | 0.000059 | org.un |
212 | 16781132 | 263 | 0.000087 | com.stackoverflow |
213 | 16780958 | 149 | 0.000166 | com.feedburner |
214 | 16779250 | 608 | 0.000044 | com.theverge |
215 | 16775130 | 796 | 0.000037 | com.pearltrees |
216 | 16774700 | 67 | 0.000473 | com.vk |
217 | 16774586 | 375 | 0.000064 | com.latimes |
218 | 16765579 | 699 | 0.000042 | com.sublimetext |
219 | 16760696 | 498 | 0.000050 | org.rubyonrails |
220 | 16755911 | 170 | 0.000142 | com.zendesk |
221 | 16754800 | 880 | 0.000035 | com.fotolog |
222 | 16754091 | 69 | 0.000466 | me.fb |
223 | 16751300 | 577 | 0.000045 | com.audible |
224 | 16750615 | 549 | 0.000047 | org.pbs |
225 | 16749314 | 536 | 0.000048 | com.deviantart |
226 | 16747765 | 410 | 0.000059 | com.wiley |
227 | 16746660 | 307 | 0.000077 | org.acm |
228 | 16745326 | 862 | 0.000036 | tl.page |
229 | 16744572 | 212 | 0.000114 | com.ssl-images-amazon |
230 | 16743890 | 824 | 0.000037 | com.instapaper |
231 | 16742662 | 741 | 0.000039 | com.kinja |
232 | 16742008 | 110 | 0.000253 | com.shopify |
233 | 16740811 | 767 | 0.000038 | com.newyorker |
234 | 16740369 | 503 | 0.000050 | com.yellowpages |
235 | 16736172 | 203 | 0.000122 | org.drupal |
236 | 16734978 | 758 | 0.000039 | com.xda-developers |
237 | 16732311 | 921 | 0.000035 | com.adsoftheworld |
238 | 16731895 | 221 | 0.000110 | org.mediawiki |
239 | 16731137 | 279 | 0.000083 | fr.free |
240 | 16730080 | 805 | 0.000037 | co.ello |
241 | 16729515 | 444 | 0.000054 | com.theatlantic |
242 | 16725251 | 409 | 0.000059 | uk.co.dailymail |
243 | 16721365 | 1189 | 0.000031 | edu.columbia |
244 | 16720295 | 388 | 0.000062 | com.bbc |
245 | 16720112 | 45 | 0.000661 | com.yimg |
246 | 16718905 | 451 | 0.000054 | com.wikihow |
247 | 16718697 | 236 | 0.000098 | net.php |
248 | 16714787 | 589 | 0.000044 | com.citysearch |
249 | 16701081 | 811 | 0.000037 | com.jigsy |
250 | 16699551 | 684 | 0.000043 | com.vice |
251 | 16693416 | 992 | 0.000034 | ly.ow |
252 | 16692056 | 534 | 0.000048 | com.exacttarget |
253 | 16685527 | 261 | 0.000089 | com.salesforce |
254 | 16682819 | 539 | 0.000047 | com.cbsnews |
255 | 16678018 | 502 | 0.000050 | com.zdnet |
256 | 16676526 | 397 | 0.000061 | gov.whitehouse |
257 | 16675419 | 582 | 0.000045 | com.ft |
258 | 16669694 | 105 | 0.000280 | de.google |
259 | 16667239 | 1190 | 0.000031 | edu.yale |
260 | 16661478 | 1213 | 0.000031 | edu.ucla |
261 | 16657707 | 606 | 0.000044 | uk.co.guardian |
262 | 16655324 | 685 | 0.000043 | com.googleblog |
263 | 16654076 | 734 | 0.000040 | com.nationalgeographic |
264 | 16651951 | 92 | 0.000369 | com.qq |
265 | 16649666 | 1166 | 0.000032 | edu.psu |
266 | 16649394 | 399 | 0.000060 | uk.co.blogspot |
267 | 16648730 | 766 | 0.000038 | com.foxnews |
268 | 16648321 | 644 | 0.000043 | org.virtualbox |
269 | 16647850 | 523 | 0.000048 | org.maven |
270 | 16647058 | 77 | 0.000418 | com.people |
271 | 16646731 | 216 | 0.000113 | uk.co.amazon |
272 | 16645964 | 258 | 0.000089 | com.hp |
273 | 16642738 | 550 | 0.000047 | com.cisco |
274 | 16640084 | 777 | 0.000038 | com.economist |
275 | 16639283 | 321 | 0.000073 | gov.cdc |
276 | 16634710 | 590 | 0.000044 | com.bandsintown |
277 | 16634368 | 1326 | 0.000027 | com.indiegogo |
278 | 16630754 | 1187 | 0.000031 | com.gizmodo |
279 | 16630079 | 218 | 0.000112 | com.windowsphone |
280 | 16628634 | 584 | 0.000045 | org.hbr |
281 | 16628196 | 919 | 0.000035 | com.authorstream |
282 | 16627672 | 439 | 0.000055 | edu.cmu |
283 | 16624396 | 851 | 0.000036 | com.timeanddate |
284 | 16621468 | 1186 | 0.000031 | com.evernote |
285 | 16620476 | 578 | 0.000045 | com.dropboxusercontent |
286 | 16619394 | 1160 | 0.000033 | com.sciencedaily |
287 | 16616793 | 687 | 0.000042 | com.wikia |
288 | 16615286 | 224 | 0.000108 | com.bandcamp |
289 | 16613293 | 395 | 0.000061 | org.whatbrowser |
290 | 16612143 | 256 | 0.000090 | io.atom |
291 | 16612009 | 1259 | 0.000029 | in.blogspot |
292 | 16610174 | 714 | 0.000040 | com.dpreview |
293 | 16610098 | 280 | 0.000083 | com.smugmug |
294 | 16609456 | 171 | 0.000142 | com.weibo |
295 | 16605754 | 528 | 0.000048 | com.theknot |
296 | 16604115 | 751 | 0.000039 | com.merchantcircle |
297 | 16599614 | 871 | 0.000035 | us.imageshack |
298 | 16598567 | 882 | 0.000035 | com.slate |
299 | 16598491 | 197 | 0.000126 | com.blogblog |
300 | 16596575 | 715 | 0.000040 | org.imagemagick |
301 | 16594197 | 1197 | 0.000031 | org.arxiv |
302 | 16591680 | 476 | 0.000051 | com.squareup |
303 | 16591661 | 369 | 0.000065 | com.skype |
304 | 16588342 | 1428 | 0.000023 | edu.ucsd |
305 | 16586521 | 1297 | 0.000028 | com.ning |
306 | 16582959 | 575 | 0.000046 | com.tinypic |
307 | 16582704 | 493 | 0.000050 | com.giphy |
308 | 16582423 | 696 | 0.000042 | com.box |
309 | 16582058 | 311 | 0.000076 | com.nypost |
310 | 16576626 | 1454 | 0.000023 | com.posterous |
311 | 16576158 | 688 | 0.000042 | com.bookdepository |
312 | 16576073 | 885 | 0.000035 | com.brandyourself |
313 | 16575175 | 1249 | 0.000029 | edu.upenn |
314 | 16573309 | 1155 | 0.000033 | org.eff |
315 | 16572621 | 478 | 0.000051 | org.postgresql |
316 | 16571814 | 677 | 0.000043 | de.blogspot |
317 | 16568213 | 407 | 0.000059 | com.angieslist |
318 | 16564953 | 787 | 0.000038 | com.samsung |
319 | 16563339 | 843 | 0.000036 | com.comixology |
320 | 16561663 | 1408 | 0.000024 | edu.wisc |
321 | 16560984 | 1161 | 0.000032 | gov.census |
322 | 16559941 | 747 | 0.000039 | com.shutterstock |
323 | 16559463 | 1323 | 0.000027 | uk.ac.cam |
324 | 16558927 | 1171 | 0.000032 | gov.nist |
325 | 16558858 | 543 | 0.000047 | com.geocities |
326 | 16558841 | 168 | 0.000144 | com.xing |
327 | 16558455 | 422 | 0.000057 | com.oreilly |
328 | 16558027 | 1459 | 0.000023 | edu.purdue |
329 | 16556658 | 716 | 0.000040 | com.nature |
330 | 16556180 | 1397 | 0.000024 | com.hotmail |
331 | 16555028 | 802 | 0.000037 | com.uk |
332 | 16554351 | 996 | 0.000034 | com.livestream |
333 | 16553202 | 920 | 0.000035 | com.arstechnica |
334 | 16552006 | 337 | 0.000071 | com.prnewswire |
335 | 16548858 | 284 | 0.000081 | ca.google |
336 | 16546727 | 705 | 0.000041 | org.vim |
337 | 16545866 | 220 | 0.000111 | com.getclicky |
338 | 16543548 | 415 | 0.000058 | int.who |
339 | 16541425 | 1436 | 0.000023 | edu.princeton |
340 | 16538268 | 569 | 0.000046 | com.entrepreneur |
341 | 16538222 | 382 | 0.000063 | com.sxsw |
342 | 16538110 | 1499 | 0.000022 | com.angelfire |
343 | 16537923 | 1229 | 0.000030 | edu.umich |
344 | 16537889 | 426 | 0.000057 | com.springer |
345 | 16533600 | 779 | 0.000038 | com.bravesites |
346 | 16533096 | 1038 | 0.000033 | org.unesco |
347 | 16531594 | 1351 | 0.000026 | uk.ac.ox |
348 | 16531456 | 484 | 0.000051 | com.office |
349 | 16529055 | 1260 | 0.000029 | org.iso |
350 | 16528766 | 1330 | 0.000027 | com.pcworld |
351 | 16527778 | 860 | 0.000036 | com.unsplash |
352 | 16527375 | 755 | 0.000039 | com.blackberry |
353 | 16526680 | 210 | 0.000117 | de.amazon |
354 | 16525790 | 781 | 0.000038 | gov.state |
355 | 16523581 | 449 | 0.000054 | com.fortune |
356 | 16522217 | 703 | 0.000041 | org.aclweb |
357 | 16522178 | 786 | 0.000038 | net.vnexpress |
358 | 16522049 | 354 | 0.000068 | com.booking |
359 | 16521727 | 879 | 0.000035 | com.dynamics |
360 | 16521106 | 1020 | 0.000034 | com.weather |
361 | 16520245 | 1003 | 0.000034 | com.communitywalk |
362 | 16519672 | 763 | 0.000039 | com.vagrantup |
363 | 16516137 | 159 | 0.000152 | com.constantcontact |
364 | 16514887 | 708 | 0.000041 | jobs.amazon |
365 | 16514721 | 1039 | 0.000033 | com.indiatimes |
366 | 16512625 | 775 | 0.000038 | com.cbslocal |
367 | 16512276 | 1200 | 0.000031 | com.lifehacker |
368 | 16511972 | 1432 | 0.000023 | com.vox |
369 | 16509793 | 270 | 0.000085 | it.placehold |
370 | 16508711 | 565 | 0.000046 | com.newsweek |
371 | 16508203 | 1609 | 0.000020 | net.comcast |
372 | 16505501 | 209 | 0.000118 | org.joomla |
373 | 16505442 | 448 | 0.000054 | com.force |
374 | 16505148 | 1299 | 0.000028 | com.politico |
375 | 16502701 | 1310 | 0.000028 | org.altervista |
376 | 16500671 | 588 | 0.000044 | com.venturebeat |
377 | 16498765 | 278 | 0.000083 | gov.ftc |
378 | 16497198 | 756 | 0.000039 | com.java |
379 | 16497000 | 1264 | 0.000029 | co.vine |
380 | 16493364 | 1068 | 0.000033 | com.ubuntu |
381 | 16493303 | 1463 | 0.000023 | com.thinkwithgoogle |
382 | 16488408 | 446 | 0.000054 | com.businesswire |
383 | 16488348 | 253 | 0.000091 | to.amzn |
384 | 16488175 | 1343 | 0.000026 | fm.last |
385 | 16487061 | 868 | 0.000035 | hu.elte |
386 | 16486012 | 1203 | 0.000031 | com.gofundme |
387 | 16485698 | 1168 | 0.000032 | ca.cbc |
388 | 16484940 | 1071 | 0.000033 | gov.senate |
389 | 16482707 | 1590 | 0.000020 | edu.uchicago |
390 | 16482550 | 679 | 0.000043 | com.googlesource |
391 | 16481201 | 713 | 0.000040 | org.sqlite |
392 | 16473573 | 1335 | 0.000026 | com.airbnb |
393 | 16471045 | 680 | 0.000043 | gov.noaa |
394 | 16470456 | 719 | 0.000040 | com.manta |
395 | 16470297 | 142 | 0.000180 | org.bbb |
396 | 16466733 | 1237 | 0.000029 | com.searchengineland |
397 | 16465690 | 2103 | 0.000014 | com.twitpic |
398 | 16465265 | 1406 | 0.000024 | edu.umn |
399 | 16464820 | 884 | 0.000035 | com.googlelabs |
400 | 16464294 | 1169 | 0.000032 | com.engadget |
401 | 16464089 | 1399 | 0.000024 | uk.co.theregister |
402 | 16463266 | 519 | 0.000049 | com.inc |
403 | 16463082 | 79 | 0.000414 | com.bleacherreport |
404 | 16461353 | 264 | 0.000086 | es.google |
405 | 16461168 | 1324 | 0.000027 | com.dell |
406 | 16459804 | 1650 | 0.000019 | com.blogs |
407 | 16459344 | 1236 | 0.000030 | com.stackexchange |
408 | 16458810 | 1676 | 0.000019 | edu.usc |
409 | 16458357 | 1482 | 0.000022 | com.mtv |
410 | 16456299 | 527 | 0.000048 | org.sonatype |
411 | 16456115 | 1720 | 0.000018 | mp.j |
412 | 16456086 | 1316 | 0.000027 | com.variety |
413 | 16455553 | 740 | 0.000039 | org.gnupg |
414 | 16454441 | 2510 | 0.000011 | edu.unl |
415 | 16453361 | 1332 | 0.000027 | org.ieee |
416 | 16452222 | 1554 | 0.000021 | edu.northwestern |
417 | 16451197 | 1184 | 0.000031 | com.americanexpress |
418 | 16450122 | 456 | 0.000053 | com.snapchat |
419 | 16450065 | 219 | 0.000111 | fr.google |
420 | 16448305 | 1307 | 0.000028 | com.discovery |
421 | 16447926 | 1257 | 0.000029 | com.businessweek |
422 | 16447711 | 1219 | 0.000030 | com.netflix |
423 | 16445870 | 1599 | 0.000020 | edu.jhu |
424 | 16445859 | 769 | 0.000038 | com.jsbin |
425 | 16445370 | 128 | 0.000209 | com.googleadservices |
426 | 16445100 | 735 | 0.000040 | com.intel |
427 | 16444823 | 566 | 0.000046 | com.delicious |
428 | 16444541 | 1152 | 0.000033 | com.pinimg |
429 | 16443282 | 474 | 0.000052 | com.nwsource |
430 | 16442373 | 1156 | 0.000033 | tv.ustream |
431 | 16439601 | 165 | 0.000147 | it.google |
432 | 16439513 | 723 | 0.000040 | br.com.uol |
433 | 16439408 | 521 | 0.000048 | com.herokuapp |
434 | 16438062 | 312 | 0.000075 | com.bitly |
435 | 16432114 | 184 | 0.000134 | com.eepurl |
436 | 16431325 | 1620 | 0.000019 | com.examiner |
437 | 16431244 | 358 | 0.000067 | com.bizjournals |
438 | 16430338 | 855 | 0.000036 | com.souq |
439 | 16429560 | 1174 | 0.000032 | au.net.abc |
440 | 16429209 | 1192 | 0.000031 | fr.blogspot |
441 | 16428874 | 1806 | 0.000016 | edu.rutgers |
442 | 16428650 | 858 | 0.000036 | ca.pinterest |
443 | 16428386 | 1630 | 0.000019 | com.udemy |
444 | 16426324 | 1680 | 0.000018 | uk.co.thesun |
445 | 16425975 | 1429 | 0.000023 | com.prezi |
446 | 16422871 | 1693 | 0.000018 | com.speakerdeck |
447 | 16421467 | 1290 | 0.000028 | com.mlb |
448 | 16421465 | 782 | 0.000038 | com.mysanantonio |
449 | 16421194 | 1211 | 0.000031 | com.chicagotribune |
450 | 16420605 | 720 | 0.000040 | com.shopbop |
451 | 16418476 | 1575 | 0.000020 | it.blogspot |
452 | 16418348 | 290 | 0.000080 | com.hubspot |
453 | 16416163 | 1899 | 0.000015 | edu.msu |
454 | 16415995 | 282 | 0.000082 | com.fc2 |
455 | 16415711 | 697 | 0.000042 | com.moz |
456 | 16414248 | 784 | 0.000038 | com.boxofficemojo |
457 | 16413782 | 727 | 0.000040 | io.getmdl |
458 | 16410305 | 76 | 0.000421 | me.m |
459 | 16409061 | 1274 | 0.000028 | gov.fbi |
460 | 16406432 | 1966 | 0.000015 | ch.ethz |
461 | 16406244 | 262 | 0.000088 | com.dribbble |
462 | 16404420 | 194 | 0.000126 | jp.co.yahoo |
463 | 16402794 | 1491 | 0.000022 | com.trello |
464 | 16401561 | 1026 | 0.000034 | com.slack |
465 | 16401275 | 1325 | 0.000027 | net.researchgate |
466 | 16400689 | 332 | 0.000072 | edu.nyu |
467 | 16398770 | 137 | 0.000185 | com.google-analytics |
468 | 16398291 | 555 | 0.000047 | com.wunderground |
469 | 16398105 | 429 | 0.000057 | com.naver |
470 | 16397960 | 1824 | 0.000016 | com.tutsplus |
471 | 16396587 | 2171 | 0.000013 | com.googlepages |
472 | 16396128 | 1594 | 0.000020 | edu.academia |
473 | 16395104 | 413 | 0.000059 | com.bigcartel |
474 | 16394348 | 810 | 0.000037 | it.binged |
475 | 16393983 | 1380 | 0.000025 | org.khanacademy |
476 | 16393285 | 1194 | 0.000031 | com.reverbnation |
477 | 16393035 | 1587 | 0.000020 | com.mac |
478 | 16392781 | 1472 | 0.000022 | com.target |
479 | 16392485 | 2085 | 0.000014 | edu.asu |
480 | 16391796 | 275 | 0.000084 | com.wufoo |
481 | 16391219 | 2036 | 0.000014 | edu.arizona |
482 | 16390459 | 700 | 0.000041 | uk.co.independent |
483 | 16389602 | 1519 | 0.000022 | com.pexels |
484 | 16389509 | 1412 | 0.000024 | com.over-blog |
485 | 16388260 | 466 | 0.000052 | com.adweek |
486 | 16387362 | 260 | 0.000089 | com.myshopify |
487 | 16387344 | 1395 | 0.000024 | com.bostonglobe |
488 | 16387145 | 1572 | 0.000020 | com.zazzle |
489 | 16387134 | 1361 | 0.000025 | com.libsyn |
490 | 16386010 | 418 | 0.000058 | com.fastcompany |
491 | 16385497 | 580 | 0.000045 | gov.ed |
492 | 16385161 | 119 | 0.000223 | com.baidu |
493 | 16385107 | 612 | 0.000044 | cn.com.sina |
494 | 16384455 | 591 | 0.000044 | gov.fda |
495 | 16384415 | 728 | 0.000040 | es.com.blogspot |
496 | 16384099 | 1040 | 0.000033 | gov.nps |
497 | 16383586 | 1646 | 0.000019 | com.vanityfair |
498 | 16382978 | 887 | 0.000035 | ws.snack |
499 | 16382297 | 1046 | 0.000033 | com.marketwatch |
500 | 16381800 | 1888 | 0.000016 | com.yolasite |
501 | 16381324 | 1558 | 0.000021 | com.nba |
502 | 16379962 | 109 | 0.000261 | org.networkadvertising |
503 | 16379082 | 1153 | 0.000033 | gov.house |
504 | 16376893 | 898 | 0.000035 | com.sfgate |
505 | 16373505 | 2289 | 0.000012 | edu.caltech |
506 | 16372783 | 465 | 0.000053 | com.w3schools |
507 | 16372566 | 123 | 0.000221 | jp.co.google |
508 | 16372041 | 2035 | 0.000014 | com.instructables |
509 | 16369468 | 1821 | 0.000016 | com.msnbc |
510 | 16368539 | 1470 | 0.000022 | com.scientificamerican |
511 | 16368503 | 1543 | 0.000021 | com.ehow |
512 | 16366625 | 1984 | 0.000015 | uk.ac.ucl |
513 | 16366040 | 690 | 0.000042 | org.bitbucket |
514 | 16365725 | 2162 | 0.000013 | ca.ualberta |
515 | 16364806 | 464 | 0.000053 | net.openid |
516 | 16364617 | 768 | 0.000038 | org.gradle |
517 | 16364203 | 1772 | 0.000017 | org.aclu |
518 | 16363540 | 1474 | 0.000022 | com.elpais |
519 | 16363484 | 724 | 0.000040 | com.yarnpkg |
520 | 16363077 | 2742 | 0.000010 | com.hubpages |
521 | 16362675 | 633 | 0.000043 | com.cargocollective |
522 | 16361964 | 1640 | 0.000019 | com.mercurynews |
523 | 16359187 | 1177 | 0.000032 | com.steampowered |
524 | 16358974 | 1845 | 0.000016 | edu.ufl |
525 | 16358831 | 1235 | 0.000030 | org.change |
526 | 16358719 | 583 | 0.000045 | gov.usda |
527 | 16358083 | 828 | 0.000036 | com.warriorplus |
528 | 16357302 | 1244 | 0.000029 | com.thenextweb |
529 | 16357275 | 1386 | 0.000024 | de.spiegel |
530 | 16357064 | 1158 | 0.000033 | com.proofpoint |
531 | 16356965 | 809 | 0.000037 | com.whitepages |
532 | 16353352 | 1188 | 0.000031 | gov.fcc |
533 | 16352815 | 1795 | 0.000017 | com.nfl |
534 | 16352068 | 1345 | 0.000026 | com.globo |
535 | 16351967 | 3065 | 0.000009 | com.answers |
536 | 16351579 | 706 | 0.000041 | org.jenkins-ci |
537 | 16351057 | 1557 | 0.000021 | com.billboard |
538 | 16350457 | 776 | 0.000038 | ly.snip |
539 | 16349376 | 1172 | 0.000032 | com.ggpht |
540 | 16349324 | 1780 | 0.000017 | org.ap |
541 | 16349172 | 1901 | 0.000015 | edu.indiana |
542 | 16349066 | 1445 | 0.000023 | com.nokia |
543 | 16348888 | 1943 | 0.000015 | com.ign |
544 | 16348185 | 1926 | 0.000015 | com.ikea |
545 | 16348047 | 1706 | 0.000018 | edu.umd |
546 | 16348024 | 52 | 0.000596 | com.messenger |
547 | 16347814 | 757 | 0.000039 | com.msdn |
548 | 16345860 | 1801 | 0.000017 | org.weforum |
549 | 16345787 | 526 | 0.000048 | org.doi |
550 | 16345233 | 202 | 0.000122 | jp.ameblo |
551 | 16344577 | 891 | 0.000035 | com.woot |
552 | 16344447 | 1283 | 0.000028 | com.patreon |
553 | 16343957 | 1538 | 0.000021 | br.com.blogspot |
554 | 16343414 | 132 | 0.000199 | ru.mail |
555 | 16343276 | 2104 | 0.000014 | com.oxforddictionaries |
556 | 16342590 | 744 | 0.000039 | com.photoshelter |
557 | 16341919 | 1322 | 0.000027 | gov.uspto |
558 | 16341309 | 1652 | 0.000019 | fr.lemonde |
559 | 16340939 | 1591 | 0.000020 | com.rollingstone |
560 | 16340630 | 1763 | 0.000017 | uk.co.metro |
561 | 16340043 | 602 | 0.000044 | com.sciencedirect |
562 | 16339730 | 3779 | 0.000007 | mx.unam |
563 | 16339436 | 944 | 0.000035 | com.hotfrog |
564 | 16338592 | 2127 | 0.000014 | com.fiverr |
565 | 16337795 | 173 | 0.000141 | jp.ne.hatena |
566 | 16337443 | 1840 | 0.000016 | com.aliexpress |
567 | 16336373 | 3072 | 0.000009 | com.123rf |
568 | 16336076 | 386 | 0.000063 | au.com.google |
569 | 16334801 | 1238 | 0.000029 | com.prweb |
570 | 16334152 | 1835 | 0.000016 | br.com.abril |
571 | 16332875 | 1487 | 0.000022 | com.pcmag |
572 | 16332192 | 873 | 0.000035 | ly.plot |
573 | 16331709 | 3250 | 0.000008 | com.blog |
574 | 16331549 | 391 | 0.000061 | us.icio |
575 | 16331199 | 837 | 0.000036 | com.folkd |
576 | 16331176 | 2316 | 0.000012 | org.kiva |
577 | 16330752 | 2396 | 0.000012 | edu.brown |
578 | 16330624 | 1478 | 0.000022 | com.qz |
579 | 16330490 | 1180 | 0.000032 | com.psychologytoday |
580 | 16329689 | 2088 | 0.000014 | com.newscientist |
581 | 16329114 | 1577 | 0.000020 | com.playstation |
582 | 16326425 | 1401 | 0.000024 | edu.si |
583 | 16324979 | 846 | 0.000036 | io.material |
584 | 16324000 | 1072 | 0.000033 | gov.usa |
585 | 16322896 | 1608 | 0.000020 | com.hulu |
586 | 16321672 | 1341 | 0.000026 | com.cafepress |
587 | 16321195 | 1986 | 0.000015 | ca.utoronto |
588 | 16321003 | 1597 | 0.000020 | com.econsultancy |
589 | 16320939 | 815 | 0.000037 | gov.copyright |
590 | 16320038 | 440 | 0.000055 | gov.irs |
591 | 16318690 | 3008 | 0.000009 | cc.co |
592 | 16318681 | 1834 | 0.000016 | com.canva |
593 | 16317432 | 2792 | 0.000010 | pt.sapo |
594 | 16315297 | 1705 | 0.000018 | com.colourlovers |
595 | 16314881 | 994 | 0.000034 | com.hotukdeals |
596 | 16314271 | 830 | 0.000036 | com.getskeleton |
597 | 16312515 | 1495 | 0.000022 | com.nymag |
598 | 16312187 | 406 | 0.000059 | com.barnesandnoble |
599 | 16311552 | 1258 | 0.000029 | org.worldbank |
600 | 16310245 | 2006 | 0.000014 | com.bestbuy |
601 | 16310108 | 1783 | 0.000017 | com.nhl |
602 | 16308879 | 2013 | 0.000014 | edu.uci |
603 | 16308831 | 1398 | 0.000024 | com.boston |
604 | 16308814 | 878 | 0.000035 | com.insiderpages |
605 | 16307539 | 2856 | 0.000010 | edu.tufts |
606 | 16307217 | 365 | 0.000066 | nl.google |
607 | 16306423 | 826 | 0.000037 | gov.hhs |
608 | 16306066 | 2229 | 0.000013 | edu.osu |
609 | 16306024 | 1716 | 0.000018 | edu.duke |
610 | 16304996 | 1226 | 0.000030 | com.hootsuite |
611 | 16304703 | 247 | 0.000093 | jp.co.amazon |
612 | 16302217 | 1534 | 0.000021 | gov.nyc |
613 | 16301587 | 1855 | 0.000016 | com.fifa |
614 | 16301259 | 1642 | 0.000019 | com.withgoogle |
615 | 16300727 | 424 | 0.000057 | com.clicky |
616 | 16298229 | 524 | 0.000048 | com.whatsapp |
617 | 16297862 | 704 | 0.000041 | com.redbubble |
618 | 16297676 | 2934 | 0.000009 | com.friendfeed |
619 | 16297637 | 1954 | 0.000015 | com.gawker |
620 | 16296981 | 1333 | 0.000027 | org.oecd |
621 | 16296568 | 2082 | 0.000014 | nl.xs4all |
622 | 16296383 | 1857 | 0.000016 | com.pastebin |
623 | 16295427 | 938 | 0.000035 | com.tiki-toki |
624 | 16294836 | 2809 | 0.000010 | edu.uic |
625 | 16294750 | 1281 | 0.000028 | com.istockphoto |
626 | 16294356 | 1605 | 0.000020 | com.hyatt |
627 | 16294322 | 2059 | 0.000014 | edu.tamu |
628 | 16293017 | 2164 | 0.000013 | edu.ncsu |
629 | 16292389 | 1385 | 0.000024 | com.com |
630 | 16292103 | 946 | 0.000034 | jp.ac.kobe-u |
631 | 16291741 | 906 | 0.000035 | com.quantcast |
632 | 16291635 | 1931 | 0.000015 | nl.blogspot |
633 | 16291618 | 803 | 0.000037 | com.webmd |
634 | 16291353 | 2823 | 0.000010 | com.wolfram |
635 | 16291336 | 729 | 0.000040 | ca.amazon |
636 | 16290941 | 455 | 0.000053 | net.launchpad |
637 | 16290095 | 2170 | 0.000013 | com.wikispaces |
638 | 16289600 | 1304 | 0.000028 | com.walmart |
639 | 16288978 | 3081 | 0.000009 | edu.colostate |
640 | 16288094 | 520 | 0.000048 | in.co.google |
641 | 16286417 | 1207 | 0.000031 | com.redhat |
642 | 16286409 | 1574 | 0.000020 | com.merriam-webster |
643 | 16286142 | 1730 | 0.000018 | int.wipo |
644 | 16284744 | 1196 | 0.000031 | com.adage |
645 | 16284151 | 1224 | 0.000030 | com.ups |
646 | 16283988 | 844 | 0.000036 | com.newsbank |
647 | 16283941 | 3078 | 0.000009 | com.squidoo |
648 | 16283791 | 1337 | 0.000026 | gov.dot |
649 | 16283705 | 1677 | 0.000018 | com.me |
650 | 16283271 | 1444 | 0.000023 | com.mediafire |
651 | 16283190 | 2122 | 0.000014 | ca.ubc |
652 | 16282261 | 2632 | 0.000011 | ca.uwaterloo |
653 | 16281888 | 1600 | 0.000020 | edu.unc |
654 | 16281405 | 2002 | 0.000015 | org.kde |
655 | 16280999 | 2109 | 0.000014 | org.gimp |
656 | 16280611 | 477 | 0.000051 | com.pingdom |
657 | 16279262 | 2517 | 0.000011 | gd.is |
658 | 16279233 | 2713 | 0.000010 | edu.hawaii |
659 | 16278204 | 2076 | 0.000014 | com.aljazeera |
660 | 16277880 | 1525 | 0.000021 | com.xbox |
661 | 16276593 | 1611 | 0.000020 | com.freewebs |
662 | 16276064 | 2253 | 0.000013 | com.britannica |
663 | 16275159 | 1686 | 0.000018 | uk.co.mirror |
664 | 16274962 | 2261 | 0.000013 | uk.co.timesonline |
665 | 16274102 | 2065 | 0.000014 | au.com.news |
666 | 16273788 | 1532 | 0.000021 | com.xkcd |
667 | 16273554 | 1198 | 0.000031 | com.feedly |
668 | 16273045 | 2931 | 0.000009 | com.laughingsquid |
669 | 16272291 | 1507 | 0.000022 | gov.wa |
670 | 16272286 | 1842 | 0.000016 | tv.periscope |
671 | 16272122 | 1460 | 0.000023 | com.mixcloud |
672 | 16270257 | 2919 | 0.000010 | com.codecademy |
673 | 16269871 | 2003 | 0.000015 | edu.illinois |
674 | 16269399 | 1679 | 0.000018 | uk.co.huffingtonpost |
675 | 16269107 | 387 | 0.000062 | net.themeforest |
676 | 16269031 | 1654 | 0.000019 | uk.co.ebay |
677 | 16268894 | 379 | 0.000063 | com.ea |
678 | 16268536 | 998 | 0.000034 | com.att |
679 | 16268423 | 1666 | 0.000019 | net.daum |
680 | 16267963 | 2613 | 0.000011 | ca.mcgill |
681 | 16265041 | 659 | 0.000043 | com.houzz |
682 | 16264648 | 1514 | 0.000022 | com.intuit |
683 | 16264335 | 573 | 0.000046 | fr.amazon |
684 | 16262384 | 2087 | 0.000014 | com.softpedia |
685 | 16261968 | 1872 | 0.000016 | com.autodesk |
686 | 16261899 | 207 | 0.000119 | org.icann |
687 | 16261856 | 1812 | 0.000016 | com.deadline |
688 | 16261306 | 2708 | 0.000010 | edu.vanderbilt |
689 | 16261208 | 1643 | 0.000019 | com.foxbusiness |
690 | 16260540 | 1598 | 0.000020 | gov.uscourts |
691 | 16259038 | 380 | 0.000063 | com.heroku |
692 | 16258429 | 1471 | 0.000022 | com.gumroad |
693 | 16257667 | 2215 | 0.000013 | com.flipboard |
694 | 16256425 | 1578 | 0.000020 | com.us |
695 | 16256212 | 1634 | 0.000019 | de.welt |
696 | 16255761 | 1193 | 0.000031 | com.deloitte |
697 | 16254736 | 2276 | 0.000012 | com.yfrog |
698 | 16254687 | 1596 | 0.000020 | org.owasp |
699 | 16254424 | 2729 | 0.000010 | com.lynda |
700 | 16254139 | 2046 | 0.000014 | org.coursera |
701 | 16253423 | 942 | 0.000035 | com.cdbaby |
702 | 16252259 | 1303 | 0.000028 | com.sagepub |
703 | 16252243 | 1585 | 0.000020 | com.vmware |
704 | 16252225 | 2045 | 0.000014 | net.earthlink |
705 | 16251417 | 711 | 0.000041 | com.usnews |
706 | 16251315 | 1383 | 0.000025 | org.unicef |
707 | 16251165 | 3714 | 0.000007 | com.space |
708 | 16250745 | 2121 | 0.000014 | com.vogue |
709 | 16249713 | 423 | 0.000057 | com.cracked |
710 | 16249436 | 1844 | 0.000016 | com.domain |
711 | 16249262 | 505 | 0.000050 | net.yahoo |
712 | 16248921 | 248 | 0.000092 | com.nielsen |
713 | 16247818 | 1066 | 0.000033 | site.tenerifeforum |
714 | 16247744 | 2129 | 0.000014 | com.theonion |
715 | 16247457 | 732 | 0.000040 | com.atlassian |
716 | 16246881 | 726 | 0.000040 | com.sharefile |
717 | 16245304 | 821 | 0.000037 | org.osgeo |
718 | 16244753 | 2329 | 0.000012 | com.searchenginejournal |
719 | 16244591 | 1541 | 0.000021 | com.searchenginewatch |
720 | 16243862 | 1669 | 0.000019 | com.windows |
721 | 16243809 | 2503 | 0.000011 | org.greenpeace |
722 | 16242894 | 1061 | 0.000033 | org.bravenewvoices |
723 | 16242695 | 2314 | 0.000012 | edu.wustl |
724 | 16242478 | 2015 | 0.000014 | uk.ac.lse |
725 | 16242148 | 958 | 0.000034 | com.2findlocal |
726 | 16241958 | 1929 | 0.000015 | edu.ucdavis |
727 | 16238994 | 2808 | 0.000010 | edu.uoregon |
728 | 16238622 | 772 | 0.000038 | org.openweathermap |
729 | 16238445 | 1479 | 0.000022 | com.kissmetrics |
730 | 16237769 | 2095 | 0.000014 | net.jsfiddle |
731 | 16237405 | 1629 | 0.000019 | com.chron |
732 | 16237225 | 1900 | 0.000015 | gov.usaid |
733 | 16237011 | 1404 | 0.000024 | com.steamcommunity |
734 | 16236194 | 941 | 0.000035 | com.ripple |
735 | 16234398 | 1773 | 0.000017 | org.craigslist |
736 | 16234380 | 1768 | 0.000017 | com.howstuffworks |
737 | 16233822 | 788 | 0.000038 | com.hilton |
738 | 16233735 | 1382 | 0.000025 | com.alibaba |
739 | 16233361 | 2582 | 0.000011 | edu.uga |
740 | 16232973 | 2725 | 0.000010 | edu.pitt |
741 | 16232849 | 1663 | 0.000019 | com.yoast |
742 | 16232226 | 2847 | 0.000010 | com.rottentomatoes |
743 | 16232203 | 239 | 0.000097 | org.purl |
744 | 16230013 | 1416 | 0.000024 | org.plos |
745 | 16229506 | 1849 | 0.000016 | com.espn |
746 | 16228657 | 2078 | 0.000014 | com.gamespot |
747 | 16226657 | 4149 | 0.000007 | ca.yorku |
748 | 16226145 | 2437 | 0.000012 | gov.cia |
749 | 16224968 | 385 | 0.000063 | com.youku |
750 | 16224769 | 1683 | 0.000018 | com.csmonitor |
751 | 16224511 | 890 | 0.000035 | tv.twitch |
752 | 16224300 | 3502 | 0.000008 | com.secondlife |
753 | 16223843 | 1481 | 0.000022 | com.hollywoodreporter |
754 | 16222670 | 1668 | 0.000019 | net.battle |
755 | 16221357 | 1870 | 0.000016 | com.irishtimes |
756 | 16221082 | 927 | 0.000035 | com.bizcommunity |
757 | 16220820 | 2328 | 0.000012 | edu.vt |
758 | 16220679 | 2375 | 0.000012 | com.technet |
759 | 16220669 | 806 | 0.000037 | uk.co.currys |
760 | 16219206 | 3157 | 0.000009 | com.avast |
761 | 16217463 | 1485 | 0.000022 | org.fao |
762 | 16217257 | 2055 | 0.000014 | com.twilio |
763 | 16215764 | 717 | 0.000040 | com.netdna-cdn |
764 | 16215730 | 2844 | 0.000010 | com.popsci |
765 | 16215299 | 2205 | 0.000013 | com.podbean |
766 | 16214805 | 1256 | 0.000029 | org.redcross |
767 | 16213945 | 2813 | 0.000010 | org.kqed |
768 | 16213937 | 1453 | 0.000023 | us.tx.state |
769 | 16213886 | 420 | 0.000058 | br.com.google |
770 | 16212530 | 1785 | 0.000017 | mil.navy |
771 | 16211785 | 2010 | 0.000014 | com.netvibes |
772 | 16211715 | 3255 | 0.000008 | edu.iastate |
773 | 16209341 | 1631 | 0.000019 | com.animoto |
774 | 16209268 | 2916 | 0.000010 | int.esa |
775 | 16209261 | 2214 | 0.000013 | com.makezine |
776 | 16208037 | 2024 | 0.000014 | edu.ucsf |
777 | 16208029 | 3576 | 0.000008 | uk.ac.manchester |
778 | 16206623 | 1904 | 0.000015 | com.foxsports |
779 | 16206024 | 1792 | 0.000017 | com.blogtalkradio |
780 | 16205145 | 1336 | 0.000026 | com.docker |
781 | 16204975 | 1632 | 0.000019 | mil.army |
782 | 16204618 | 2335 | 0.000012 | com.lonelyplanet |
783 | 16204345 | 1440 | 0.000023 | jp.blogspot |
784 | 16203706 | 3517 | 0.000008 | edu.wsu |
785 | 16203462 | 1754 | 0.000017 | co.angel |
786 | 16202931 | 702 | 0.000041 | com.technorati |
787 | 16202713 | 1621 | 0.000019 | com.today |
788 | 16202473 | 228 | 0.000104 | com.elegantthemes |
789 | 16201395 | 1553 | 0.000021 | com.fedex |
790 | 16201336 | 1827 | 0.000016 | com.macworld |
791 | 16200855 | 1689 | 0.000018 | ru.spb |
792 | 16200440 | 3491 | 0.000008 | org.eu |
793 | 16199798 | 3940 | 0.000007 | edu.byu |
794 | 16199604 | 1980 | 0.000015 | com.topsy |
795 | 16199518 | 1308 | 0.000028 | gov.energy |
796 | 16199425 | 1932 | 0.000015 | edu.umass |
797 | 16198010 | 1782 | 0.000017 | org.cancer |
798 | 16197684 | 827 | 0.000037 | com.themonitor |
799 | 16197204 | 1539 | 0.000021 | gov.congress |
800 | 16197173 | 453 | 0.000054 | com.zenfolio |
801 | 16195131 | 326 | 0.000073 | com.newrelic |
802 | 16193637 | 981 | 0.000034 | com.scribblemaps |
803 | 16193445 | 1327 | 0.000027 | com.webnode |
804 | 16193351 | 1309 | 0.000028 | com.zoho |
805 | 16192916 | 1403 | 0.000024 | com.techrepublic |
806 | 16192600 | 469 | 0.000052 | jp.ne.sakura |
807 | 16191941 | 1475 | 0.000022 | com.html5rocks |
808 | 16191512 | 752 | 0.000039 | gov.sec |
809 | 16191011 | 246 | 0.000093 | me.line |
810 | 16190396 | 560 | 0.000046 | gov.export |
811 | 16190321 | 2421 | 0.000012 | com.redbull |
812 | 16190245 | 1289 | 0.000028 | de.bund |
813 | 16190148 | 1273 | 0.000028 | com.formstack |
814 | 16189401 | 1727 | 0.000018 | org.pewresearch |
815 | 16187055 | 2504 | 0.000011 | org.documentcloud |
816 | 16186153 | 2206 | 0.000013 | com.denverpost |
817 | 16185105 | 1791 | 0.000017 | com.freepik |
818 | 16184501 | 1164 | 0.000032 | gov.justice |
819 | 16184479 | 83 | 0.000405 | com.shareaholic |
820 | 16184211 | 857 | 0.000036 | org.bouncycastle |
821 | 16184134 | 181 | 0.000137 | info.aboutads |
822 | 16183073 | 1006 | 0.000034 | com.weddingbee |
823 | 16182519 | 22 | 0.001469 | com.wixstatic |
824 | 16181153 | 1822 | 0.000016 | com.sky |
825 | 16180898 | 3587 | 0.000008 | edu.syr |
826 | 16180704 | 442 | 0.000055 | com.teamviewer |
827 | 16180502 | 1741 | 0.000017 | edu.cuny |
828 | 16179437 | 1492 | 0.000022 | de.heise |
829 | 16179398 | 2290 | 0.000012 | com.refinery29 |
830 | 16179105 | 1373 | 0.000025 | com.gigaom |
831 | 16179051 | 7653 | 0.000004 | nr.co |
832 | 16178943 | 3066 | 0.000009 | com.seekingalpha |
833 | 16178795 | 509 | 0.000049 | com.informit |
834 | 16178394 | 2169 | 0.000013 | com.pbworks |
835 | 16176779 | 2857 | 0.000010 | com.threadless |
836 | 16176523 | 1009 | 0.000034 | com.spoke |
837 | 16176244 | 1690 | 0.000018 | com.salon |
838 | 16175314 | 864 | 0.000036 | com.tractorsupply |
839 | 16175036 | 436 | 0.000055 | ru.vkontakte |
840 | 16173419 | 7092 | 0.000004 | com.xanga |
841 | 16173061 | 834 | 0.000036 | com.withoutabox |
842 | 16172167 | 3431 | 0.000008 | edu.rochester |
843 | 16172004 | 1928 | 0.000015 | google.blog |
844 | 16171996 | 3039 | 0.000009 | cc.tiny |
845 | 16171862 | 2338 | 0.000012 | com.sony |
846 | 16171745 | 495 | 0.000050 | com.mapbox |
847 | 16171579 | 2216 | 0.000013 | edu.uiuc |
848 | 16171435 | 1369 | 0.000025 | com.justgiving |
849 | 16171018 | 973 | 0.000034 | com.quandl |
850 | 16170907 | 3044 | 0.000009 | edu.oregonstate |
851 | 16170779 | 3088 | 0.000009 | edu.rice |
852 | 16170257 | 989 | 0.000034 | com.citysquares |
853 | 16169303 | 1522 | 0.000021 | com.accenture |
854 | 16169045 | 1717 | 0.000018 | gov.weather |
855 | 16168264 | 2578 | 0.000011 | ch.cern |
856 | 16167787 | 2365 | 0.000012 | com.nbcsports |
857 | 16167665 | 3458 | 0.000008 | tt.db |
858 | 16167457 | 1311 | 0.000027 | gov.ny |
859 | 16166904 | 3764 | 0.000007 | com.panoramio |
860 | 16166471 | 398 | 0.000061 | com.list-manage1 |
861 | 16165716 | 3499 | 0.000008 | edu.fsu |
862 | 16165602 | 1672 | 0.000019 | com.indeed |
863 | 16164824 | 1670 | 0.000019 | org.gnome |
864 | 16164267 | 2306 | 0.000012 | com.motherjones |
865 | 16164109 | 3286 | 0.000008 | com.techsmith |
866 | 16164023 | 2021 | 0.000014 | de.bild |
867 | 16163786 | 987 | 0.000034 | com.zwire |
868 | 16163768 | 936 | 0.000035 | org.gwtproject |
869 | 16163441 | 1593 | 0.000020 | uk.co.thetimes |
870 | 16161745 | 1183 | 0.000031 | com.hostgator |
871 | 16161683 | 2247 | 0.000013 | com.shutterfly |
872 | 16161224 | 7557 | 0.000004 | com.weheartit |
873 | 16161078 | 1037 | 0.000033 | com.lacartes |
874 | 16159885 | 2120 | 0.000014 | me.flavors |
875 | 16159494 | 1869 | 0.000016 | com.digitaltrends |
876 | 16158877 | 2518 | 0.000011 | com.lego |
877 | 16158867 | 4685 | 0.000006 | com.skyrock |
878 | 16158248 | 1455 | 0.000023 | com.ssrn |
879 | 16156591 | 774 | 0.000038 | ru.google |
880 | 16156380 | 1661 | 0.000019 | ru.narod |
881 | 16155319 | 2711 | 0.000010 | au.edu.anu |
882 | 16155135 | 2907 | 0.000010 | net.nocookie |
883 | 16154395 | 1682 | 0.000018 | com.infoworld |
884 | 16153736 | 1777 | 0.000017 | com.starbucks |
885 | 16152817 | 1018 | 0.000034 | com.live5news |
886 | 16151932 | 3984 | 0.000007 | to.gplus |
887 | 16151470 | 4044 | 0.000007 | org.nypl |
888 | 16151454 | 2106 | 0.000014 | com.trendmicro |
889 | 16150935 | 1616 | 0.000019 | com.codeplex |
890 | 16150786 | 1649 | 0.000019 | com.gettyimages |
891 | 16150162 | 516 | 0.000049 | com.typeform |
892 | 16149380 | 1814 | 0.000016 | com.amzn |
893 | 16149212 | 1733 | 0.000018 | com.upwork |
894 | 16148959 | 2374 | 0.000012 | com.hatenablog |
895 | 16148537 | 1275 | 0.000028 | uk.co.eventbrite |
896 | 16148311 | 2952 | 0.000009 | ly.cl |
897 | 16148267 | 979 | 0.000034 | au.com.yelp |
898 | 16147669 | 1221 | 0.000030 | com.linksynergy |
899 | 16147623 | 2450 | 0.000012 | tv.blip |
900 | 16147593 | 966 | 0.000034 | com.strawberryperl |
901 | 16146692 | 2516 | 0.000011 | com.ezinearticles |
902 | 16146250 | 5746 | 0.000005 | com.minus |
903 | 16146123 | 1280 | 0.000028 | gov.archives |
904 | 16146002 | 991 | 0.000034 | net.brownbook |
905 | 16145895 | 2041 | 0.000014 | org.c-span |
906 | 16145833 | 4399 | 0.000006 | com.treehugger |
907 | 16145696 | 1496 | 0.000022 | se.google |
908 | 16145590 | 1413 | 0.000024 | com.smashingmagazine |
909 | 16145205 | 3120 | 0.000009 | com.askmen |
910 | 16144599 | 2359 | 0.000012 | com.rt |
911 | 16144362 | 1340 | 0.000026 | gov.sba |
912 | 16143368 | 2191 | 0.000013 | com.madmimi |
913 | 16143289 | 3201 | 0.000009 | com.voanews |
914 | 16142193 | 1031 | 0.000034 | edu.alamo |
915 | 16141278 | 1293 | 0.000028 | be.google |
916 | 16141249 | 98 | 0.000323 | org.nginx |
917 | 16140255 | 2790 | 0.000010 | com.asus |
918 | 16139778 | 1699 | 0.000018 | com.techradar |
919 | 16139702 | 2009 | 0.000014 | com.allthingsd |
920 | 16139074 | 2150 | 0.000013 | com.mentalfloss |
921 | 16138955 | 4009 | 0.000007 | net.minecraft |
922 | 16137702 | 4417 | 0.000006 | com.pbase |
923 | 16136223 | 1659 | 0.000019 | com.bloglovin |
924 | 16136014 | 1523 | 0.000021 | com.forrester |
925 | 16135924 | 929 | 0.000035 | com.sacurrent |
926 | 16135556 | 1182 | 0.000032 | com.strikingly |
927 | 16135377 | 1781 | 0.000017 | org.openoffice |
928 | 16134817 | 1054 | 0.000033 | com.garmin |
929 | 16134754 | 1157 | 0.000033 | org.postimg |
930 | 16134565 | 2475 | 0.000011 | com.eonline |
931 | 16134180 | 1595 | 0.000020 | com.lulu |
932 | 16131936 | 1809 | 0.000016 | com.ibtimes |
933 | 16131778 | 924 | 0.000035 | com.fabric |
934 | 16131713 | 1655 | 0.000019 | com.zillow |
935 | 16131623 | 990 | 0.000034 | com.shareasale |
936 | 16131491 | 2161 | 0.000013 | com.history |
937 | 16131332 | 1542 | 0.000021 | com.mcafee |
938 | 16131031 | 5442 | 0.000005 | com.archdaily |
939 | 16130791 | 324 | 0.000073 | com.cloudinary |
940 | 16130604 | 3700 | 0.000007 | com.thingiverse |
941 | 16130416 | 3633 | 0.000008 | com.starwars |
942 | 16130039 | 3149 | 0.000009 | com.pitchfork |
943 | 16130007 | 3528 | 0.000008 | com.gyazo |
944 | 16129708 | 1861 | 0.000016 | ca.huffingtonpost |
945 | 16129039 | 355 | 0.000068 | com.monster |
946 | 16128947 | 4034 | 0.000007 | com.tistory |
947 | 16128783 | 4079 | 0.000007 | edu.utk |
948 | 16128549 | 3858 | 0.000007 | com.lmgtfy |
949 | 16128496 | 1064 | 0.000033 | mp.mailchi |
950 | 16127860 | 1724 | 0.000018 | com.ssllabs |
951 | 16127481 | 1247 | 0.000029 | org.moodle |
952 | 16126306 | 1017 | 0.000034 | org.simile-widgets |
953 | 16126142 | 2231 | 0.000013 | com.invisionapp |
954 | 16126015 | 2105 | 0.000014 | com.real |
955 | 16125289 | 3640 | 0.000007 | edu.buffalo |
956 | 16124973 | 3342 | 0.000008 | com.indiewire |
957 | 16124959 | 283 | 0.000082 | org.debian |
958 | 16124811 | 2030 | 0.000014 | com.ew |
959 | 16124811 | 1531 | 0.000021 | com.uber |
960 | 16124747 | 5051 | 0.000006 | edu.gsu |
961 | 16124457 | 836 | 0.000036 | com.list-manage2 |
962 | 16124380 | 1364 | 0.000025 | net.java |
963 | 16123933 | 1167 | 0.000032 | com.tandfonline |
964 | 16123911 | 486 | 0.000051 | com.taobao |
965 | 16123603 | 1660 | 0.000019 | com.bmj |
966 | 16123240 | 3420 | 0.000008 | org.lifehack |
967 | 16122808 | 2302 | 0.000012 | com.canalblog |
968 | 16122597 | 2141 | 0.000013 | edu.ucsc |
969 | 16122368 | 980 | 0.000034 | org.tpr |
970 | 16122358 | 2781 | 0.000010 | nl.utwente |
971 | 16121608 | 1941 | 0.000015 | com.getresponse |
972 | 16121577 | 2631 | 0.000011 | com.dallasnews |
973 | 16120998 | 2237 | 0.000013 | edu.colorado |
974 | 16118560 | 1638 | 0.000019 | com.ecwid |
975 | 16118476 | 1287 | 0.000028 | es.amazon |
976 | 16118471 | 1022 | 0.000034 | com.ibegin |
977 | 16118135 | 1637 | 0.000019 | com.deezer |
978 | 16117989 | 1394 | 0.000024 | jp.ne.goo |
979 | 16117772 | 1971 | 0.000015 | jp.ne.biglobe |
980 | 16117756 | 2130 | 0.000014 | edu.bu |
981 | 16117709 | 214 | 0.000114 | com.homestead |
982 | 16117477 | 931 | 0.000035 | com.chamberofcommerce |
983 | 16116130 | 5892 | 0.000005 | ie.tcd |
984 | 16115772 | 4085 | 0.000007 | edu.uconn |
985 | 16114731 | 3590 | 0.000008 | edu.usf |
986 | 16114702 | 1526 | 0.000021 | com.warnerbros |
987 | 16114348 | 4777 | 0.000006 | ca.ucalgary |
988 | 16113857 | 2014 | 0.000014 | hk.com.google |
989 | 16113786 | 178 | 0.000139 | com.parallels |
990 | 16113467 | 1841 | 0.000016 | com.getfirebug |
991 | 16113219 | 1530 | 0.000021 | com.waze |
992 | 16113141 | 3372 | 0.000008 | ru.org |
993 | 16112949 | 3183 | 0.000009 | com.polyvore |
994 | 16112624 | 2473 | 0.000011 | com.campaignmonitor |
995 | 16112555 | 1684 | 0.000018 | com.thehill |
996 | 16112407 | 985 | 0.000034 | com.showmelocal |
997 | 16112353 | 1321 | 0.000027 | gov.usgs |
998 | 16111937 | 1908 | 0.000015 | jp.or.nhk |
999 | 16111757 | 5851 | 0.000005 | com.rapidshare |
1000 | 16111647 | 3040 | 0.000009 | com.expedia |
Graphs of January 2018 Crawl
We erroneously released webgraphs and rankings of a single monthly crawl (January 2018) instead of a quarterly release covering three crawls. To ensure reproducibility, we have preserved the erroneous release.
The host-level graph consists of 775 million nodes and 2.7 billion edges. The graph includes dangling nodes, i.e. hosts that have not been crawled yet but are pointed to from a link on a crawled page. There are 719 million dangling nodes (93%).
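The dangling-node share quoted above can be approximated from an edge list. The following is a minimal sketch, assuming a dangling node is simply a node without outgoing edges (which covers uncrawled hosts, but would also count a crawled host that links nowhere else). The function name `dangling_share` and the toy data are hypothetical and not part of the release tooling.

```python
def dangling_share(num_nodes, edges):
    """Return (count, fraction) of nodes that never appear as an edge source."""
    sources = {src for src, _ in edges}   # nodes with at least one out-link
    dangling = num_nodes - len(sources)
    return dangling, dangling / num_nodes


# Toy graph with nodes 0..4: only nodes 0 and 1 have out-links.
edges = [(0, 1), (0, 2), (1, 3), (1, 4)]
count, fraction = dangling_share(5, edges)
print(count, fraction)
```

For the full graphs the set of source ids will not fit comfortably in memory on every machine, so a streaming or bitmap-based variant may be preferable.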
Download files of the Common Crawl Jan 2018 host-level webgraph
Size | File | Description |
---|---|---|
4.84 GB | cc-main-2018-jan-host-vertices.txt.gz | nodes 〈id, rev host〉 |
10.21 GB | cc-main-2018-jan-host-edges.txt.gz | edges 〈from_id, to_id〉 |
4.90 GB | cc-main-2018-jan-host.graph | graph in BVGraph format |
2 kB | cc-main-2018-jan-host.properties | |
5.94 GB | cc-main-2018-jan-host-t.graph | transpose of the graph (outlinks mapped to inlinks) |
2 kB | cc-main-2018-jan-host-t.properties | |
1 kB | cc-main-2018-jan-host.stats | WebGraph statistics |
10.79 GB | cc-main-2018-jan-host-ranks.txt.gz | harmonic centrality and pagerank |
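
Since the vertices files store reversed host names, a small helper is handy when joining them with other data. The sketch below assumes that each vertices line holds the two fields 〈id, rev host〉 separated by a tab (the separator is an assumption, please check it against the actual file). The helper names `unreverse_host` and `parse_vertex_line` are hypothetical.

```python
def unreverse_host(rev_host):
    """Turn 'com.example.subdomain' back into 'subdomain.example.com'."""
    return ".".join(reversed(rev_host.split(".")))


def parse_vertex_line(line):
    """Split one vertices line into (id, reversed host); tab-separated is assumed."""
    node_id, rev_host = line.rstrip("\n").split("\t")
    return int(node_id), rev_host


node_id, rev_host = parse_vertex_line("42\tcom.example.subdomain\n")
print(node_id, unreverse_host(rev_host))
```

Note that a leading `www.` is stripped before the names are reversed, so the original host name cannot always be restored exactly.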
The domain-level graph has 70 million nodes and 835 million edges. Of these nodes, 42 million (60%) are dangling nodes, and the largest strongly connected component covers 22 million (31%) of the nodes.
Download files of the Common Crawl Jan 2018 domain-level webgraph
Size | File | Description |
---|---|---|
0.49 GB | cc-main-2018-jan-domain-vertices.txt.gz | nodes 〈id, rev domain, num hosts〉 |
3.30 GB | cc-main-2018-jan-domain-edges.txt.gz | edges 〈from_id, to_id〉 |
1.80 GB | cc-main-2018-jan-domain.graph | graph in BVGraph format |
2 kB | cc-main-2018-jan-domain.properties | |
1.89 GB | cc-main-2018-jan-domain-t.graph | transpose of the graph |
2 kB | cc-main-2018-jan-domain-t.properties | |
1 kB | cc-main-2018-jan-domain.stats | WebGraph statistics |
1.46 GB | cc-main-2018-jan-domain-ranks.txt.gz | harmonic centrality and pagerank |
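
The ranks files can be processed line by line. The sketch below is only an illustration: it assumes tab-separated columns in the order shown in the ranking table earlier in this post (harmonic centrality position and value, PageRank position and value, reversed name), treats lines starting with `#` as headers, and keeps any further columns (such as a host count) untouched. The names `parse_rank_line` and `read_ranks` are hypothetical; the two sample rows are copied from that table.

```python
import io


def parse_rank_line(line):
    """Parse one ranks line; the column order is an assumption (see above)."""
    fields = line.rstrip("\n").split("\t")
    hc_pos, hc_val, pr_pos, pr_val, rev_name = fields[:5]
    return {
        "hc_pos": int(hc_pos),
        "hc_val": float(hc_val),
        "pr_pos": int(pr_pos),
        "pr_val": float(pr_val),
        "rev_name": rev_name,
        "extra": fields[5:],  # any additional columns, left unparsed
    }


def read_ranks(stream):
    """Yield parsed rows, skipping blank lines and '#' header lines."""
    for line in stream:
        if line.startswith("#") or not line.strip():
            continue
        yield parse_rank_line(line)


# In-memory stand-in for an opened ranks file.
sample = io.StringIO(
    "515\t16364806\t464\t0.000053\tnet.openid\n"
    "516\t16364617\t768\t0.000038\torg.gradle\n"
)
rows = list(read_ranks(sample))
best = min(rows, key=lambda r: r["pr_pos"])  # best PageRank position
print(best["rev_name"])
```

To read a downloaded file instead of the sample, the stream could be opened with `gzip.open(path, "rt")` from the Python standard library.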
Credits
Thanks to the authors of the WebGraph framework, whose software made the computation of graph properties and ranks possible.
We hope the data will be useful for any kind of research on ranking, graph analysis, link spam detection, etc. Let us know about your results via Common Crawl’s Google Group!