Another two videos

– Can you tell us a little bit about Google datacenters?
– Should all datacenters on the same Class C block be roughly the same?

In the middle of that session, I talked about the frustration that modern data center watchers will encounter these days (because there are often slightly different things at different places) and I mentioned a slide from Boston Pubcon. Here’s the slide I was talking about:

Can you imagine trying to monitor that, especially when the same IP address can query different data centers for different people? It wouldn’t be my preferred hobby.

– Is it possible to search just for home pages?
– News Flash: you can use strong and em instead of bold (b) and italics (i)!
– Will we ever see kitty posts again?
– What are Google SSD, Google RS2, and all those other things Tony Ruscoe found?
– Does Google rank blog sites any differently than regular websites?
– Does Google treat .gov.pl links with the same weight as regular .gov links?
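On the strong/em item above, the semantic elements are drop-in replacements for the presentational ones; a minimal sketch (default rendering is the same in mainstream browsers):

```html
<!-- Presentational markup -->
<p>This is <b>bold</b> and this is <i>italic</i>.</p>

<!-- Semantic equivalents: same look, clearer meaning for machines -->
<p>This is <strong>bold</strong> and this is <em>italic</em>.</p>
```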

And I have to say, I’m kinda tired of talking to a video camera. So I’ll probably take a break from videos for at least a few days. 🙂

I have a question about locations within a webpage. I’ve been told that content high up on my page is deemed more important than content further down, because people are more likely to see content above the fold.

Does Google distinguish between content that is visually high on a page and content that is high within the code of my page?
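As a sketch of why the two can differ: with CSS, content that comes early in the HTML source can be rendered lower on the screen, and vice versa, so "high in the code" and "visually high on the page" are not the same thing. The markup below is a hypothetical illustration:

```html
<!-- First in the source, but floated right and visually secondary -->
<div style="float: right; width: 200px;">
  <p>Sidebar: ads, navigation, boilerplate…</p>
</div>

<!-- Later in the source, yet the first thing a visitor actually sees -->
<div>
  <h1>Main article headline</h1>
  <p>The content the author presumably wants weighted most.</p>
</div>
```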

Living in Oz, I can’t get to your upcoming sessions, but I’d like more info on the duplicate site situation if possible, especially related to mirror sites which happen to be on a different domain. You seemed to be saying it was OK but Google would choose which one they preferred (without penalty??). I’m confused about this!

And, by the way, you class a dinosaur as a ’90s-gen person? I’ll go get my walking stick, having started in the ’70s on IBM 370 stuff… arrgghh…

When talking about duplicate content you mentioned multilingual websites. Do you have any specific advice for people making sites which have sections for different language versions of the same content?

Some people use sub-directories, some sub-domains and some go to the expense of registering domain names with country specific TLDs. Does Google recognise these as different sites or parts of the same one?

It’s common to have links on every page which take you, either to the home page of a translated site or to the same page in an alternative language. Can you offer any guidelines on how to mark those up so that Google understands them and treats them appropriately?
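One hedged suggestion for the markup question above: HTML 4 already defines hreflang (and lang) attributes on anchors, which at least declare the target language explicitly; whether any particular engine actually uses them is exactly the kind of thing Matt would have to confirm. The URLs below are hypothetical:

```html
<!-- Link to the French translation of the current page -->
<a href="http://www.example.com/fr/page.html" hreflang="fr" lang="fr">Français</a>

<!-- Link to the Spanish home page -->
<a href="http://es.example.com/" hreflang="es" lang="es">Español</a>
```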

I have a question relating to session 2, “Some SEO Myths”. You talk about having multiple sites on a single IP address, and mentioned that 5 or 10 would be okay, but the 2,000 sites from your example is obviously going to run into trouble.

1. That’s a pretty broad range – how would you go with 100? Or 200?

2. Splitting them up over multiple IP addresses is worthwhile if you have a lot of sites, right? Is it important to use different C-blocks too?

We host a lot of sites for our clients, many of whom cross-link. I’d hate to think that having all sites on a single dedicated machine would have our clients take a hit to their rankings…

Matt, several people noticed they lost TBPR on a good number of their inside pages. Is Google starting to ignore certain types of links when calculating PR? I figured whether a link is a vote or a part of a linktrade, it would still pass PR under “normal” circumstances. Or is the TBPR showing something that was in place months ago?

Awwwwwww but I thought you were improving greatly! I was gonna nominate you for some sort of Oscar or Emmy or something. Ah well.

I don’t know if this is the right place to make suggestions about Google Video, but is it possible to shorten the length of the slider bar and put the elapsed/total times beside the slider bar instead of above them? When the video gets close to the end (i.e. when the slider itself gets to the elapsed/total times part), it gets somewhat hard to read because the slider is passing over those areas.

Either that, or maybe lose the volume control? Most people will have this on their speakers anyway.

I just wanted to say that these videos have been great. They are both entertaining and informative.

I have a question:
I know you’ve already discussed this, but I thought hearing about it in a video would really make it clear.

When it comes to the link: operator:

I have noticed that link:www.domain.com shows different links than link:www.domain.com/specific-page.html, so I assume that the link: operator works on specific URLs and not on the domain as a whole. Would that be right?

Also, what is the purpose of only showing some links? Is the accurate link reporting really that secretive that you can only show a sample, or is it just something that is broken and isn’t really that high of a priority for Google?

I actually have a duplicate content question that’s been nagging me for a short while…

I’ve got a birth site, which contains birth stories. Women being the sharing creatures that they are, they tend to share them with multiple sites. This results in a decent number of “my” content items being duplicated on various other sites. Obviously you can’t give away any secrets, but is there a way for me to know what’s generally safe with regards to that situation? Do I need to fear penalization or, worse yet, getting blacklisted???

In relation to the redirect question you answered in your video, how does Google handle 302 redirects vs. 301 redirects? I have several domains that are minor typos of my main domain that are being forwarded by the domain registrar to get people who mistype my domain into the right spot. Unfortunately domain registrars only use 302 redirects. Could this cause any problems with Google seeing duplicate content? Also how important is it to use 301 vs. 302 redirects when redirecting within a site?
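For what it’s worth, anyone who can point the typo domains at a server they control can issue a true 301 instead of the registrar’s 302. A minimal Apache sketch (this assumes mod_alias and a vhost or .htaccess for the typo domain; example.com stands in for the real main domain):

```apache
# Permanently redirect every path on the typo domain to the main domain
Redirect 301 / http://www.example.com/
```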

Matt, thanks for the nicely detailed answer to my duplicate content question. Very helpful. But you didn’t say how I can challenge Maestro Brin to a Table Tennis game at the Google party. As the hippest international company, I think Google’s got to get its Table Tennis act in gear to avoid more Valleywag criticism.

I wish they would… armed with that data, and each site’s PageRank, and Google’s cache of each site, and the full number of links etc., it’d be a fun contest to try to reverse engineer a search algorithm that matches Google’s for a specific query.

You figure for one term that has, say, 100,000 results or fewer… you could do this on the average webserver.

If they published the true links, we’d have almost every piece of information that the algorithm looks at freely available to us.

Part of me thinks that’s the reason they don’t show it… part of me wishes I had enough free time to try this out.

I have a question regarding subdomains. Let’s suppose I have a website about the NBA or NFL, and I want to provide some of my visitors with a personal blog under the main domain (e.g. mydomain.com). Should I set up all the blogs as subdomains (like matt.mydomain.com)? My concern is about how Google treats subdomains and domains. What if one of those personal blogs is used for spamming, or if it’s banned by Google? Will it affect the main domain also? Is a subdomain seen as a totally new domain, different from the main domain?
If the main domain is not linking to the penalized subdomain, can it also be banned as a consequence of the spamming in the subdomain?

There are many examples like geocities.com, blogger.com, or many other forums, where users can post lots of outbound links and content which could affect the main domain. Some of them use subdomains and others use folders. And they haven’t been penalized by Google. Everyone knows there are a lot of spam pages in Geocities, for example. But I have also read about many domains banned because of spamming subdomains…

If an IP gets blacklisted because of a subdomain, I think it will also affect the main domain…

So, what can you tell us about how Google manages these situations?

If you have already talked about it, please let me know where I can find your answers. Thank you very much.

The whole world is focused on backlinks and PageRank and it is obvious that they are important. But what about outbound links in my pages? What does Google do with those? In other words, what are the effects of outbound links in a page on the rankings in Google?

I have a question regarding blogs and duplicate content. If the monthly archive pages or category pages have the same content, will it be counted as duplicate? Would bloggers be better off not using monthly archives at all and sticking with the main page plus categories? I am about to delete my monthly archives for this reason, but don’t want to do it if it isn’t necessary!

Hi Matt
I would like to ask an ‘off-topic’ question that I can never get a straight answer for, or find one that appears to be remotely correct:

On a project we are building: There will be translations in other languages that will include Arabic, Spanish, French and Russian. The point of it is to duplicate the English content so that foreign visitors can read it etc.
Does Google consider this to be duplicate content? Will it spider it, realise that it’s (let’s say) French and the same content, and as a result weaken the page for having duplicate content?
It’s been a burning question for a while, and for anyone involved with building large sites that call for foreign-language translations it’s quite a valid one.

There was a lot of talk a few years ago about the Dublin Core Initiative, which standardized metatags for semantic use and data exchange. Does Google look at and use the DC metatags in the indexing process? Are the DC metatags used in the scoring process at all by Google? If so, is there any limitation as to top-level domain types, i.e., .gov or .edu?

I just stumbled upon your videos and website tonight. I have to say what you are doing is awesome. I understand you are not doing this as a Google employee, however, to me, it is nice to see a face of someone from Google that has an apparent interest in addressing the concerns and questions of the webmasters.

As for topics in your next sessions, could you consider addressing one or more of the following:
1) Does Google prefer a domain site over a “Rewrite” site?
2) My website was pummeled on June 27th. Then, upon correcting the site, my results were higher; however, on July 27th I ended up in the same boat. It is apparent I am not doing something correctly. Is there a way to find out if I was penalized, or if there’s something specific I am doing incorrectly?
3) If your site is penalized, how do you correct this? Does the penalty hold or is it removed over time like a speeding ticket?
4) Lastly, are there conferences that you recommend for webmasters to attend that address SEO concepts and strategies in some detail?

Thanks for your personal time, and I definitely appreciate it!
Michael
Charles Town WV

None of the data centers show our Kew Gardens website (kw; Kew Gardens) on page one anymore. This is a wonderful, rich, non-commercial multimedia site which was on page one for over a year. Now it’s down around #800 or so. What happened?

The same thing happened to our Explore St Paul’s Cathedral website. I cannot figure out what we’ve done, other than to provide a rich, in-depth website.


Hi Matt,
I have a question about indexing and duplicate content. Our friends over at Yahoo have indexed some of our clients’ pages in their SERPs with the Google AdWords tracking script. The ONLY way to get to a page with this tracking script is to go through an AdWords ad. So the URL in Yahoo looks something like this: http://www.yourdomain.com?source=googleadwords.

The crew at AdWords has assured me this is not affecting my impressions, CPC or clicks in general, but my tracking is VERY dicey right now because of these indexed pages.

I’ve never seen Google do anything similar – do you strip out query strings in SERPs?

Are there duplicate content issues with this? I’ve had one person suggest some 404 pages and noindex robots on that specific URL, but I don’t want to block the AdWords spider.
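On the “noindex without blocking AdWords” worry: one pattern people suggest (a hypothetical sketch, and wildcard support in Yahoo’s crawler should be verified before relying on it) is a robots.txt record scoped to Yahoo’s Slurp and the tracking parameter, rather than 404s, so that Google’s AdWords checker never even reads the rule:

```
# This record applies only to Yahoo's crawler; other bots ignore it
User-agent: Slurp
Disallow: /*?source=googleadwords
```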

And we’re really frustrated since the 27 June “update” or whatever G’s calling it (or not calling it).

Our Explore the Taj Mahal site went from #1 (for years) down now to #8, and traffic is off by 60%. It is also a very rich site, with Flash, many exquisite 360-degree panos (e.g. from the Roof of the Taj), a “5-Star – WOW!” rating from the Sunday Times (the only virtual tour they gave five stars to), with separate HTML pages of all the assets, plus downloadable MS Word files (with pictures) for use by schools. All-in-all, a truly wonderful site.

And I’ll be darned if I can figure out why some really poor and un-linked sites are now above us… I’d sure appreciate some feedback or ideas… (you can email me via any of our virtual tour websites)

Hi Matt. You have been nice to us with the meta noodp. Can we get some sort of control over the extra links that Google is attaching to number-one SERP results? Sometimes links that are not appropriate are being added.

It would be great to have a rel=toplist style tag so that we could ask Google to use one link in preference to another for those number-one SERP results.

Just because “add a link” happens to be popular for my site, should not mean that it gets shown on the Google listings.

For software companies that have download links added, it would be great to have a little control over the links/packages that are shown.

>>“Google doesn’t treat .edu and .gov links differently, those pages just usually have a lot of PageRank.”

>>Hah! No way I’m believing that!

How hard it is to kill a myth.

Certain ideas about how Google works have reached the status of holy writ among Google-watchers, yet they are based on no solid evidence at all. Another one in the same cluster is the idea that Google gives an extra boost to sites listed in the ODP. Google spokesmen have denied this over and over again.

With the advent of TrustRank at least as a technical possibility, people see a mechanism into which these preconceived ideas will fit. “Aha! We knew it all along. Google is giving an extra boost to .gov, .edu and the ODP.” Now Wikipedia seems to have joined the list as well.

It may seem to make sense that these sites would be ‘trusted’ by Google. Yet the TrustRank paper makes it pretty clear that it doesn’t operate in that way. The ODP and Wikipedia do not fit the criteria used for seed sites. Nor were seed sites selected on the basis of domain. So there might well be some .gov and .edu sites included, but not all of them.

Google staff are the only ones who really know the algorithm. But I don’t believe that Matt would lie to us. Why should he?

Just like to say how much I’ve enjoyed your video content.
BUT – one thing that annoys me is that your links don’t open in a ‘new window’. I click one of your video links (or normal links) and I’ve lost your site! I have to keep remembering to SHIFT+CLICK – I would guess this annoys others as well? My recommendation would be to have all your external links open in a new window…
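For reference, the change being asked for is just the target attribute on each outbound link (worth noting that target is absent from XHTML 1.0 Strict, and forcing new windows is a debated usability choice); the URL here is just an example:

```html
<!-- Opens in a new window/tab instead of replacing the current page -->
<a href="http://video.google.com/" target="_blank">my latest video</a>
```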

While I’m here 🙂

One of my sites has many supplemental pages – these pages contain links to PDF versions of the same content for visitors to print. Is it possible that I’m being hit with a duplicate content penalty?

I am a website marketer, my friends all think I’m cool.
I sell stuff on the Internet, where my web rankings rule.
One day in May they dropped from sight, were no place to be found.
Google took my website ranks and sent them underground.

If you smell something funny when you’re on the Internet,
You think it’s body odor, loss of rankings make you sweat,
But it really smells much worse than that, it’s worse than doggie poo,
You weren’t watching where you step, it’s Google on your shoe.

First week of May two thousand six, there was a Google crash.
Their content filter went bizarre, bad websites made a dash
Right into the index where they’d never been before.
Good content they should have crawled was totally ignored.

Google says they do no harm, but didn’t say a word
That anything was wrong or that this even had occurred.
Of course it’s good for AdWords, but they say they never do
Anything organic that would make you need them, true?

The moral to this story is that Google’s not your friend.
Friends will tell you when they’re wrong, be truthful to the end.
They will not betray your trust and will not make you blue.
Which is why you should be careful, don’t get Google on your shoe.

So, if you smell something funny when you’re on the Internet,
You know your SEO is good, you’ve nothing to regret,
And all your pretty pages simply disappear from view,
It wasn’t anything you did, it’s Google on your shoe.

I’ve been toying with Google Accessible and I must say I am impressed. It is nice to know that the labs have been working on technologies that can remove some of the fluff in the results pages. I noticed a few other interesting effects too that made me curious to ask:

Does this mean that the day when a web page is rewarded ranking based on the merit of its own content is coming near, or can we expect to see top rankings awarded mainly by external forces for some time to come?

Does anybody know the date and time PBS will air the documentary video of the making of the Matt Cutts videos? Behind the scenes shows exactly what Matt is focusing on when it appears that he is looking directly into the camera, while one listens to the background music of Pink Floyd’s The Wall…

2) The idea that a government link is worth more is asinine, simply because government sites, like every other site, are maintained by individuals and groups of people. And as someone who has worked on a site owned by a partly government-funded agency, I can verify that the process of link acquisition is almost identical to that of a private website. The only difference is that the links have to go through a typical government bureaucratic process.

Quality isn’t necessarily at the forefront either: it may be a cross-promotion between two government agencies, or it may be a link that is provided by a funder, or some other weird twist and turn along the way.

3) Would a link from a .GOV website owned by a dictatorial government that relies on propaganda be more valuable than a link from a private webmaster, all other aspects of the site remaining equal?

Protecting the algorithm is certainly essential for Google. Matt makes no secret of the fact that there are things he can’t tell us. But spreading disinformation is unnecessary and would be foolish, when silence serves the purpose.

Besides that, can you explain a bit about how Google changes SERPs based on geography? For example, a website I was looking at ranked 35 when I looked at the SERPs, while the (Danish) owner saw it at 36, and when he checked via a US proxy, he saw 28.

The importance of the question is to help webmasters marketing to a foreign audience actually reach that audience. Local search is great, but most people still go to Google.com (or .ca) and type “cityname + keyword.” And why not, when it works most of the time?
Hoping you answer this in a future video, here are some of the markets it affects:

travel industry
entertainment industry (i.e. Here in Montreal, we have the Jazz festival. People looking for info will probably type “montreal jazz festival”…)
relationship/dating industry
anything else that is location-specific. Lots of people, in other words.

Another thing that’s been bugging me for a while are two questions that have come up from some “Google-digging.” I read most of the paper Sergey Brin and Larry Page wrote at Stanford, and it seemed to suggest that information on capitalization is retrieved. Which raises the question:

Should My Meta Tag And My Content Have Its Keywords Looking Like The Annoying Adwords Ads Selling Made For Adsense Turnkey Sites And Duplicate Content Garbage That Probably Doesn’t Have Good Grammar But Does Run-On Sentences Have? I.e. lots of caps?

Oh, and for my fellow comment readers, here’s a funny quote from another Google scholar paper. Discussing how to achieve high PageRank, the paper states: “At worst, you can have manipulation in the form of buying advertisements (links) on important sites. But, this seems well under control since it costs money.” So well under control… when Text-Link-Ads advertises through AdWords… Oh, the irony.

There are ways of handling public relations which do not involve outright lies. Outright lies are a foolish risk. If a company, government or other body is caught in a lie, damage to its reputation can be massive. Trust may never be reestablished. That might not matter to some fly-by-night operator who will pop up under another name, but reputation and goodwill are vital to those who are in operation for the long haul.

Outright lies about Google’s algorithm would be just plain stupid. They are totally unnecessary. Matt simply does not respond to queries that pertain to Google’s ‘secret sauce’, or he explains that he cannot go further into a particular topic. We have seen that over and over.

Yet you believe that in this instance Matt has changed his pattern of behaviour and is telling an outright lie for no good reason. If the question about .edu had been too sensitive, Matt could simply have ignored it. Instead he took the opportunity to correct a popular myth.

You make a good point, but I guess where I differ is that I just don’t think Google would be concerned with bending the truth on this issue if it will help them continue to heavily trust backlinks. Can you see the headlines: “Google lies to SEO specialists about search algorithm weights applied to EDU and GOV links; credibility crumbles, stocks plummet.” An extreme example, I know, but you get what I’m thinking.

@Adam

Scoring links from these sites is a good ranking heuristic, a rule of thumb; it’s not perfect, but it’s a lot better than nothing. Last time I checked, business decisions aren’t meant to be logical, they’re meant to be profitable. Sometimes you get both, sometimes not.

“Are you so productive whenever Mrs Cutts is out of town?”
I think that’s the secret, Harith. 🙂
—
Of course this may mean that Mrs Cutts will start receiving free weekend passes to spas etc. courtesy of half the SEOs on the interweb 😀

Thanks for sharing these videos with us. The information given in these videos is really great. But buffering is taking too much time, maybe due to the large size of the videos. Buffering this video on a slow internet connection is really tough. If possible, please make an optimized version for low-bandwidth users as well.

What I really want to know is: why does a DC have, say, 4/5 pages of a website indexed on one day, and on another day (mostly weekends) the same DC has 10/11 pages indexed? Are those fluctuations in the results due to the DC updates, or are those differences of some other nature? Also, if I see half of the DCs with 4/5 pages of a website indexed and the other half with 10/11 pages indexed, should I expect the higher-indexed DCs to drop to a lower page count, or the DCs with a lower number to rise to a higher one over the next few days?

Thanks for the video!!
I have a question: will the geographic location and language of my site have any impact on getting better search engine rankings? Also, should I use language translation to have my site in multiple languages in that case?