Webmaster Blog – http://blogs.bing.com/webmaster
Official Blog for Bing Webmaster Tools & Experiences

Understanding and Testing: Implicit Local Queries
Wed, 11 Feb 2015 – http://blogs.bing.com/webmaster/2015/02/11/understanding-and-testing-implicit-local-queries/

For this post, we spent some time with Wei Wang, a Senior Program Manager on the Mobile Relevance team here at Bing. We asked him to explain some of the basic challenges the team faces when answering “local” queries on mobile, and how Bing fills in the blanks to determine the “best” result for a local-intent query. – Duane

In web search, many queries are associated with a location even when no location is specified explicitly in the query. For example, a user in Seattle who issues the query local news expects, in most cases, to get results from Seattle. We call these queries Implicit Local queries.

For mobile users, Implicit Local is even more important. Given that typing on a small keyboard in a mobile setting is difficult, it’s more convenient to omit the explicit location in a query, especially if the engine can understand this naturally. On the other hand, mobile users have higher expectations that they will get results close to their current location, because they expect the search engine to know where they are if they have allowed their phone access to their location information. You can’t ensure every mobile user’s settings are correct (though we’ll share info on proper settings in a bit to help with your own testing), but you can help make sure we understand your site’s local area focus.

Bing is smart enough to provide search results close to the user’s location for Implicit Local queries. If a mobile user types in the query local news, for example, we understand they mean the local news most applicable to them. Below you can see examples of the same query term for two different locations, where Bing solves for ‘location’ and returns the appropriate results: the top URL is www.king5.com for users from “Seattle, WA”, while it’s www.wtsp.com for users from “Tampa, FL”.

It’s important for Bing to understand whether a query is associated with a location or not. Getting this right matters: if we miss providing local results for an Implicit Local query, that user will not be satisfied. Amplifying the dissatisfaction, they’ll have to issue the query again, appending a location to get a better result. In this next example for a person in Boston, MA, you can see the relevant results for tv guide local listings.

Another challenge for Implicit Local queries is to figure out how to provide the closest relevant result. Here is an example. When a query backpage is issued from “Fort Mitchell, KY”, Bing returns local results for “Cincinnati, OH”. While this might seem like a mistake at first, if we look at it in more detail, we can see that although Cincinnati is in a different state, it is the closest big city around Fort Mitchell. Since there is no local backpage site for Fort Mitchell, the local site for Cincinnati is the best one.
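Bing’s actual resolution pipeline isn’t public, but the “closest big city” fallback described above can be sketched as a simple nearest-neighbor lookup. Everything here is illustrative: the city table, its coordinates, and the function names are invented for the example.

```python
from math import radians, sin, cos, asin, sqrt

# Hypothetical lookup table: big cities that have a local site,
# with approximate latitude/longitude. Illustrative only.
CITIES_WITH_LOCAL_SITE = {
    "Cincinnati, OH": (39.10, -84.51),
    "Tampa, FL": (27.95, -82.46),
    "Seattle, WA": (47.61, -122.33),
}

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in miles."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 3959 * 2 * asin(sqrt(a))

def closest_city_with_site(user_lat, user_lon):
    """Fall back to the nearest big city that actually has a local site."""
    return min(
        CITIES_WITH_LOCAL_SITE,
        key=lambda city: haversine_miles(user_lat, user_lon, *CITIES_WITH_LOCAL_SITE[city]),
    )

# A searcher in Fort Mitchell, KY (approx. 39.05 N, 84.56 W) has no local
# site of their own; the nearest candidate city is just across the river.
print(closest_city_with_site(39.05, -84.56))  # → Cincinnati, OH
```

A production system would of course weigh far more than raw distance (city size, site availability, query history), but the shape of the fallback is the same.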

Implicit Local is very important for mobile users. At Bing, we’ve made great progress in serving results, such as providing local results for more queries and improving the relevance of local results step by step. Businesses should always look for ways to more clearly understand their “local” market. In the example above, “backpage” is a common phrase in that geographic region, but in other areas it may be referenced differently, such as the tried and true “classifieds”. Be sure you’re optimizing your site for the correct, relevant phrases and queries for the region you’re geographically linked to – the area your mobile queries will likely come from.

Earlier we mentioned the importance of testing your site. This is important not only across devices, but also within a geographic region, if local queries are your target. Chances are good that within any business there will be a smartphone in more than one pocket. Even better is if you can test across multiple devices, as the browsers on iPhones, Android phones and Windows Phones are each unique. Use this as an opportunity to test things out on your own.

It’s probably worth it to try your testing from the road, too. A bit of a local road trip could prove fun, and educational, as you’ll get a very real sense of how the average searcher in the region will encounter your site in the results. You’ll understand how your site loads, how long it takes to load, how it functions, and if you check your results in search, you’ll see how you rank and what a visitor will see when they click through to your site.

We’ve prepped some quick tips to help you get the settings sorted across three popular Operating Systems. In your phone settings, if you turn location “On” and allow Bing to access your location, it will help to provide a better local search experience for you. Below are examples about how to do it for Windows Phone, iPhone and Android. (left to right)

Because we seek to answer queries as directly and accurately as possible, webmasters should take advantage of every option to help an engine understand their location. Inside Webmaster Tools, the Geo-Targeting option can help with this task at a higher level (think country-level targeting here), and using the language tag in your metadata can also be a useful clue. Other useful clues might come from the actual page content, your About Us page, title tags, and so on. It’s surprising how many businesses assume everyone will know they’re local to a community, and yet post no location information on their website.

Getting a bit more technical, you can use Schema markup to annotate your address as well. This is, obviously, a preferred solution, though it will require some work to implement the markup on your website. It helps us clearly understand the location, and can help when we need to return mobile results for local queries.
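As a minimal sketch (not the only valid form; schema.org also supports microdata and RDFa), a local business address might be annotated in JSON-LD like this. The business details are made up:

```html
<script type="application/ld+json">
{
  "@context": "http://schema.org",
  "@type": "LocalBusiness",
  "name": "Example Bakery",
  "url": "http://www.example.com/",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "123 Main Street",
    "addressLocality": "Seattle",
    "addressRegion": "WA",
    "postalCode": "98101",
    "addressCountry": "US"
  },
  "telephone": "+1-206-555-0100"
}
</script>
```

Placing a block like this in the page `<head>` (or body) gives crawlers an unambiguous, machine-readable statement of where the business is.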

With mobile’s growth still continuing upwards, mobile testing remains highly important. Even using just the phones of those in your office, with the settings managed appropriately, you can get a clear indication of what your customers will see.

Investing in the mobile experience is critical today, as consumers shift more and more to mobile for their needs. Don’t miss the opportunity to align with their needs. We are. We recently discussed just this topic in a blog post about mobile relevance and ranking techniques, in fact. And while most consumers won’t really understand the value behind this next tidbit, the Webmaster Blog has a recent post where you can Meet our Mobile Bots to understand when it’s us visiting.

We are on a journey to improve the mobile experience for searchers. Stay tuned for more on this topic!

Wei Wang –
Senior Program Manager, Mobile Relevance

]]>http://blogs.bing.com/webmaster/2015/02/11/understanding-and-testing-implicit-local-queries/feed/0Recommended Reading: The Role of Content Quality in Bing Rankinghttp://blogs.bing.com/webmaster/2014/12/08/recommended-role-of-content-quality-in-ranking/
http://blogs.bing.com/webmaster/2014/12/08/recommended-role-of-content-quality-in-ranking/#commentsMon, 08 Dec 2014 21:00:13 +0000http://blogs.bing.com/webmaster/?p=9423You’ve heard us talk about quality quite a bit on this blog and we can all agree that creating quality content is the only sustainable strategy in attracting and retaining visitors. Naturally, as a search engine we always aim to connect our searchers with the best content out there, so content quality plays an important part in our algorithms. Which brings me to the “recommended reading” part:

My colleague Michael Basilyan from the Bing Content Quality Team just published an excellent post on the Role of Content Quality in Bing Ranking on the Search Quality Blog. In this post Michael delves into the “Three Pillars of Content Quality” (Authority, Utility, and Presentation) and provides some practical examples of how these pillars are applied in Bing ranking.

Needless to say, this is essential stuff for webmasters and SEOs. Enjoy!

Vincent Wehren – Senior Program Manager – Bing Webmaster Experiences

]]>http://blogs.bing.com/webmaster/2014/12/08/recommended-role-of-content-quality-in-ranking/feed/0How Bing and Your Mobile Device Became Friendshttp://blogs.bing.com/webmaster/2014/11/20/bing-and-mobile-friends/
http://blogs.bing.com/webmaster/2014/11/20/bing-and-mobile-friends/#commentsThu, 20 Nov 2014 23:38:09 +0000http://blogs.bing.com/webmaster/?p=9223As we discussed in Meet our Mobile Bots recently, Bing probes websites using device-specific crawlers to understand if they provide a good experience on different devices and platforms and to inform our mobile ranking algorithms. Today we are joined by Mir Rosenberg from the Mobile Relevance team to add some color to the subject by discussing a recent mobile ranking update that resulted from this effort. Enjoy! — Vincent

Traditionally, Bing didn’t rely heavily on specific device and platform signals to provide web results. You would get similar results on your PC, Mac, or smartphone for most of your searches.

However, we live in a mobile-first, cloud-first world and we need to think about our users’ search experience on mobile devices differently. As a result, we’ve been really intensifying how we look at web results across these mobile devices. We have a long and exciting journey ahead of us, but as a very first step in this long-term investment, we started probing web pages for “mobile friendliness” and ranking web pages accordingly on our users’ mobile phones.

Why is Mobile Relevance Important?

Most of you search from your mobile device more frequently than a year ago; some of you search almost exclusively from your phones. What’s more, comScore expects the number of mobile web users to surpass desktop users for the first time this year.

You likely already know this from your own server logs or analytics package: the number of people visiting sites from mobile searches has been growing, too. So we want them to be happy on Bing (and, by extension, any Bing-powered search) — not just on the PC or Mac but also on their phones.

So when they are using their mobile device, Bing should just know, and Bing should respond differently when it comes to web results. Sounds easy, right?

Mobile Challenges

There are several interesting challenges for mobile relevance when compared to “traditional” relevance. For instance:

It is easy to type URLs on PCs and Macs, but it’s more cumbersome on phones

Some sites have mobile-incompatible content. For example, a non-mobile friendly search result may send you to a page with fonts or buttons so small that you can barely use it without zooming or pinching — if at all

Some pages that work fine on a PC or Mac can be useless on some mobile devices; think Flash-only pages on iOS

In some cases, the “normal” URL redirects to a mobile version, which not only wastes users’ time but also consumes bandwidth on their data plans

All of these user challenges and more were used to inform how we rank pages on mobile devices. For a subset of queries, we made a number of changes that steer users away from non-device-friendly results and toward results that work well on their device of choice.

These are a few examples of our journey towards increased awareness of mobile friendliness for the pages we rank.

In this example, we know which pages are mobile-friendly, so with the new update we automatically rank them higher, whereas previously the searcher would have had a much bigger chance of landing on a non-mobile-friendly page, or possibly had to wait for a redirect to a mobile-friendly one.

Mobile Ranking Techniques

As always, there are many ranking factors at play – and mobile ranking has its fair share of Secret Squirrel stuff – but here are some of the things we do to improve mobile relevance:

We identify and classify mobile and device-friendly web pages and websites

We analyze web documents from a mobile point-of-view by looking at:

Content compatibility

Content readability

Mobile functionality (to weed out “junk”, that is, pages that 404 on mobile, are Flash-only, and so on)

We return more mobile-friendly URLs to the mobile SERP

We rank the results based on all of the above

What Site Owners Can Do

Not all sites follow the same mobile site or content strategies, and this creates challenges for all search engines. Ideally, there shouldn’t be a difference between the “mobile-friendly” URL and the “desktop” URL: the site would automatically adjust to the device – content, layout, and all. That’s why we continue to recommend responsive design over separate mobile (m.*) sites: it ensures a great experience for users on all devices and avoids compatibility, readability, and functionality issues. Also, make sure to heed the recommendations in Meet our Mobile Bots and allow our crawlers access to all necessary resources.
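As a small illustration of the responsive approach, one URL can serve every device and adapt its layout with a viewport declaration and CSS media queries. The class name and breakpoint below are arbitrary:

```html
<!-- One URL serves every device; the layout adapts client-side. -->
<meta name="viewport" content="width=device-width, initial-scale=1">
<style>
  .products { display: flex; }     /* side-by-side on wide screens */
  @media (max-width: 600px) {
    .products { display: block; }  /* stack into one column on phones */
  }
</style>
```

Because the desktop and phone experiences live at the same URL, there is no redirect to wait for and no ranking signal split across duplicate pages.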

What’s Next?

The recent update marks the beginning of our journey towards increased mobile relevance and is now improving a small but steadily growing percentage of our mobile queries. There’s lots more to come. So look out for more news about this topic soon!

Mir Rosenberg – Principal Program Manager

Mobile Relevance Team

]]>http://blogs.bing.com/webmaster/2014/11/20/bing-and-mobile-friends/feed/0Meet our Mobile Botshttp://blogs.bing.com/webmaster/2014/11/03/meet-our-mobile-bots/
http://blogs.bing.com/webmaster/2014/11/03/meet-our-mobile-bots/#commentsMon, 03 Nov 2014 21:00:20 +0000http://blogs.bing.com/webmaster/?p=9053In today’s post we are joined by Lee Xiong from the Bing Crawl team. Lee is going to discuss some new developments on the crawl front pertaining to mobile SEO. Enjoy! – Vincent

We can all agree that mobile is the future. Actually, we can’t really say “mobile is the future” anymore. Mobile is the present. Mobile is now. With that in mind, it’s time to take a fresh look at some essential things from the ground up. Specifically, let’s re-examine all the work you’ve been doing to get crawled, selected, indexed, and ranked. This time though, let’s look at it through a mobile lens.

Mobile Crawl to Inform our Rankers

We’ve blogged about the importance of mobile before, but it’s never too late for a reminder. And what we want to discuss today is mainly how Bing is actively checking your website for “mobile compatibility” — an important aspect of how we view your site when we serve our results on Bing or Bing-powered search across mobile devices.

The Power of One

Our original recommendation to webmasters still applies when it comes to mobile: avoid duplication, and prevent the bifurcation of your ranking power that comes with separate m.-URLs for mobile. Instead, move to a responsive design that adapts to the device, and benefit from the maximum SEO power instilled in a single URL. This continues to be the way forward for future-looking sites.

Probing the Web for Mobile Friendliness

At the same time, we are cognizant of the fact that many sites still use different URLs for their mobile phone or smartphone customers, or have varying levels of user experience depending on the type of device. So, as true advocates for our users, we are very interested in understanding how your content “renders” on these devices and whether it makes for a good user experience. To that end, we have started to probe websites with a number of new crawlers, with the aim of giving us the best representation of what our users can expect from your website when viewed on their favorite device.

Introducing our Bingbot Mobile User Agents

As you may know from our help topic Which Crawlers Does Bing Use, we have a number of crawlers to perform our common crawl duties. To understand how your site behaves specifically for our mobile searchers, we have added a couple of new crawler variants that identify themselves with user agent strings mimicking some of the most common mobile device types.

In all of these cases, user agent strings containing “BingPreview” refer to crawlers that are capable of “rendering” the page, just like a user’s browser would. It is therefore paramount that you allow our crawlers not only to find the core content of the URLs themselves, but also to access the resources needed to load each page, including any CSS, script, and image files.
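In robots.txt terms, that means not disallowing the directories that hold those page resources. A sketch, with illustrative paths:

```
User-agent: *
Disallow: /admin/

# Don't do this — blocking CSS/JS prevents rendering crawlers
# (the "BingPreview" variants) from seeing the page as a user would:
# Disallow: /css/
# Disallow: /js/

# Instead, make sure the resource directories stay crawlable:
Allow: /css/
Allow: /js/
```

If your styles and scripts are served from a separate host or CDN, check that host’s robots.txt too.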

Conclusion

It’s no secret that mobile is pervasive and that performing well on mobile devices is becoming more and more important for websites to succeed. To safeguard the best mobile search experience for your users, move towards a responsive model that adapts to your user’s device at the same URL. At the same time, to safeguard the best mobile search experience for our mobile Bing users, we do probe the web with crawlers that emulate the most popular user devices (and which identify themselves as such!) to inform our rankers. To that end, make sure our mobile bots can crawl your site freely and that you are not blocking essential parts of your site (such as JavaScript or CSS files).

http://blogs.bing.com/webmaster/2014/10/23/double-sign-up-credit/

For a limited time, we’re opening up our wallets to new webmasters: receive $100* in advertising credit for verifying your site with Bing Webmaster Tools and opening a new Bing Ads account.

Two weeks ago I told you how to get $100* in free advertising credit by becoming a new Bing Ads user if you had a Bing Webmaster Tools account. As it turned out, many of you seized this opportunity to claim your ad credit, just in time for the holiday season. And for those of you who didn’t act yet: simply log into your webmaster tools account today and click the red banner at the bottom of your webmaster navigation menu to see how to redeem your coupon.

The Goodness Continues

But the goodness continues. And this time it is good news for prospective Bing Webmaster Tools users. How come? Well, elated by the fact that so many of you took advantage of our holiday offer, my friends from the Bing Ads team have not only authorized me to continue the campaign for our existing Bing Webmaster Tools users, they are also opening up their wallets a little more: for a limited time, we are doubling the advertising credit we give new Bing Webmaster users for signing up with us and Bing Ads, from $50 to $100*. That’s right: one hundred dollars.

So, if you haven’t had a chance to join Bing Webmaster Tools yet, sign up with us today and not only unlock all of the goodness that Bing Webmaster Tools offers to help you with your SEO (such as Index Explorer or SEO Analyzer), but also receive $100* in advertising credit to use on the Yahoo! Bing Network once you’ve gone the few extra steps to sign up with Bing Ads.

Building Authority & Setting Expectations
Fri, 17 Oct 2014 – http://blogs.bing.com/webmaster/2014/10/17/building-authority-setting-expectations/

Authority is defined by the Oxford Dictionary as “a person or organization having power or control in a particular, typically political or administrative, sphere”. For our purposes, we also understand that “authority” conveys a sense of trust and influence. Searchers generally want authoritative sources to engage with. They want to be able to trust the sources they visit.

Naturally, this means being an authority ranks pretty high on the “must do” lists for most businesses. But becoming an authority, as noteworthy a goal as it is, isn’t an easy task.

Some folks simply start referring to themselves as a “thought leader”, “an expert” or an “authority” on a given topic, with the hope it catches on and others start referencing them the same way. The trouble with this approach is that eventually, you’ll fail. Someone will ask you a question you don’t know the answer to, one you would know if you were an actual authority, and the jig, as they say, will be up. The fall from grace will be hard, the landing brutal.

In reality, the proper approach to being an authority starts with work. Lots of plain old boring work to learn the topic so in-depth it becomes second nature. You find yourself talking about it over dinner, with friends after work and mumbling about it in your sleep. You steer conversations in its direction. You start to selectively scan for references to anything related to your topic at parties and join conversations to share your thinking, knowledge and learnings. Well, you get the idea. It’s bound to become a part of who you are.

You could also inherit authority as you bring a well-respected offline business onto the Internet, but retaining it will still take a lot of work. Those visiting your website will expect the same level of service, access, information, and integrity as displayed in your brick-and-mortar locations. And while you might have competition a few miles away in your offline community, online your competitors are a simple click away – a much tougher bar to clear.

And there is no let up. No rest, no breaks. Building yourself into an authority and maintaining that position takes constant care and feeding. You need to think through every step carefully: sharing on Facebook? Keep personal biases out of the conversation (unless that IS your target market, to the exclusion of others). About to tweet something? Be careful the wording doesn’t offend. Could your few words be misinterpreted? Was that winky smile inclusive or exclusive from your customer’s POV? There’s so much detail to manage that you might think it simply isn’t worth it.

But it is very much worth it.

Being an authority is something search engines look for. Yes, you still have to pass a number of trust hurdles, but the bottom line remains: more authority and more trust usually sees higher rankings. Thus exposure, traffic and gold coins follow! OK, maybe not so much on the gold coins. There are a lot of unshareable details that go into the behind the scenes work the algorithm does, obviously, but simplified it’s pretty clear. People trust authorities. Authorities are easier for engines to trust when ranking. Therefore being an authority is a good idea.

Though it should be noted that your expectations need to be realistic. First, pretty much anyone can step up and become an authority on a topic they choose. So tomorrow might see you outranked by someone with an authoritative edge. This happens every day in life, and so it goes in search, too. By far, though, the biggest issue around authority building is that people vastly underestimate the amount of time and work it takes. We’re not talking about getting an article published. We’re talking getting dozens published. We’re not talking about getting mentions on blogs, or a scattering of interviews, we’re talking winning peer-chosen awards and being a go-to resource reporters turn to for their stories. We’re not talking just blasting out articles via social media, we’re talking taking the time to engage in conversations, answer questions and solve problems via social media. Treating that person on Twitter as if they were physically right in front of you. You must overcome the technology bias to understand that social media is a form of in-person communication.

You can approach this effort directly, with a detailed plan to build your credibility and authority over time, and you can also take a more passive approach. Just do what you do best, blog for example, and let those in your chosen field call out your excellence in their own time. Both perfectly valid approaches.

Both can lead to what you really want – authority. The right to stand among your peers and be acknowledged as one of the best. That little edge that helps the search algorithm choose you over another site, which sees you sitting on top of the stack as THE resource on the topic.

http://blogs.bing.com/webmaster/2014/10/10/get-100-dollars-bing-ads-credit/

Since our June 2012 re-launch of Bing Webmaster Tools, many webmasters, site owners, and search marketers like yourself have joined our program and use the tools to stay up to date on how their website is performing in Bing and Bing-powered search results.

Millions of content publishers share their Sitemaps, Parameters to Ignore, and Crawl Preferences with us, allowing their websites to shine on Bing.com, Yahoo!, and other places powered by Bing – all of these things help us more successfully connect their content to searchers worldwide.

As such, Bing Webmaster Tools is certainly an essential set of tools to master to be successful in Bing and Bing-powered search. However, as users flock to the web for their holiday shopping over the next couple of months, you may need some additional options to accelerate your website’s opportunity to get in front of the holiday crowds.

A $100 USD Opportunity this Holiday Season

Usually we talk mainly about Search Engine Optimization (SEO) and focus on the “free” or “organic” traffic side of things. We don’t usually talk much about the “paid” aspects of Search Engine Marketing (SEM) on the Webmaster Tools blog. However, with the help of the Bing Ads team — who are great friends of the Bing Webmaster Tools program — we are able to talk about the latter a bit more today by presenting you with a unique opportunity:

To help you get started with your holiday season ad campaign and attract more holiday shoppers, the Ads team has authorized the Bing Webmaster Team to provide all users who are not yet part of our Bing Ads program with $100 in free advertising credit* to spend on the Yahoo! Bing advertising network this holiday season when signing up for a new Bing Ads account.

What do you have to do to get the $100?

If you are a Bing Webmaster Tools user that has not yet signed up with Bing Ads, check your inbox where you get your usual Bing Webmaster notifications for our special offer email containing your coupon code, the instructions on how to get into the Bing Ads program, and the exact offer details pertaining to your country**.

On top of that, we will start reminding you of this opportunity and your coupon code inside Webmaster Tools itself. Simply look for the little banner at the bottom of the Webmaster Tools navigation menu when you’ve logged into one of your sites. It will look something like this:

A click on this banner will reveal your coupon code and provide you with instructions and the exact offer details. You will also find a copy of the offer in your Message Center once you log into Webmaster Tools. This comes in handy in case you either missed the invite email or in case you never ticked the right communication preferences box in your Webmaster Profile.

Well, there you have it: $100 in free advertising credit just to sign up with Bing Ads. Go grab it while you can!

Stay up to date with what else is new in Bing Webmaster Tools land and follow me on Twitter: @vincentwehren.

MGC Spam Filtering
Wed, 08 Oct 2014 – http://blogs.bing.com/webmaster/2014/10/08/mgc-spam-filtering/

In today’s edition of the Bing Index Quality blog we will delve into one particular spamming technique: MGC (short for “machine-generated content”). We will discuss what it is, why and how spammers employ it, and introduce a specific update we shipped a few months ago aimed at detecting and filtering out pages using this technique.

What is MGC, and why and how do spammers employ it?

As we mentioned in the Web Spam Filtering overview blog from August 27, an important element of a spammer’s arsenal is the ability to mass-produce pages at little cost. This is an essential step in enabling the spammer to maximize their web presence and exposure to search users. Whatever black-hat SEO technique they plan to leverage, the logic is simple: why not apply it to thousands of pages instead of just one, and have all of those thousands vie for a good SERP position? This also enables them to maximize their target area, perhaps by targeting different keywords on different pages.

Here is a relevant paragraph from our earlier blog that describes some of the techniques spammers use to achieve this: “There are a number of approaches spammers utilize to quickly and cheaply generate a large number of webpages, including a) copying other’s content (either entirely or with minor tweaks), b) using programs to automatically generate page content, c) using external APIs to populate their pages with non-unique content. Our technology attempts to detect these and similar mechanisms directly. To amplify this, we also develop creative clustering algorithms (using things like page layout, ads, domain names and WhoIS-type information) that in a way act as force-multipliers to help identify large clusters of these mass produced pages/ sites.”

As you probably figured out, the concept described in b) above is in fact what we refer to in this blog post as MGC. The concept is fairly intuitive and easy to grasp. Let’s review some of the key distinguishing characteristics of this technique to reinforce the concept:

Complexity can range from basic to very sophisticated (e.g. random character generation vs. using the latest language modeling programs)

Often paired with keyword-insertion black-hat SEO

Upon close examination, the content is gibberish, providing zero user value

Now let’s take a look at a few examples of pages that we’d consider MGC:

Here is a ‘beautiful’ example of MGC that illustrates just about every one of the points mentioned above. It includes tons of keyword stuffing (e.g. ‘michael kors bags’, ‘michael kors outlets’), content that appears to be copied from multiple sources and joined together, zero thought given to presentation, and content that doesn’t make much sense and is incoherent (if you need convincing, just read through the circled paragraph).

Here is another poster-child example. Clearly the spammer is hoping to optimize for ‘nude celebrity’-type queries (which are quite plentiful, as you can imagine). The content is not only gibberish but also not pertinent to the topic. Sentence punctuation and word capitalization are busted throughout the page.

In this example, the page author doesn’t even try to make the content appear legitimate or intended for human consumption. The content is completely incoherent (the first line tells you all you need to know), just about every sentence is grammatically (and logically) incorrect, and images and text are intertwined, throwing readability right out the window.

Why care?

While the impact of this technique is not particularly huge (we’ll talk more about this below), we care about it because a) it provides absolutely no value to the user, and b) it masks itself so that it’s not immediately obvious the content is garbage. Having come to an MGC page, the user typically needs to spend (read: waste) some amount of time reading the content before realizing that it’s nonsense (the 1st example above illustrates this point particularly well).

How do we combat it?

As in previous posts, I will not go into too much detail, since I have no desire to make spammers’ lives any easier; instead I’ll describe the gist of the algorithm and share some of the signals we look at that suggest possible use of the MGC technique. At a high level, we look at various aspects of the content that give away its automated nature. If enough evidence is found, the page becomes an MGC candidate. MGC pages typically have poor grammar, misused punctuation, invalid use or formatting of proper names, improper capitalization, and so on. Content incoherence (i.e. one sentence not making sense next to its neighbor) is another strong giveaway. For a human, spotting MGC is often easy and fairly obvious because language isn’t used correctly, word sequences seem unnatural, and grammar is all over the place. In short, our technology aims to duplicate what comes so easily and naturally to a human reader.

Naturally, we need to be very careful when labeling pages or sites as MGC, just like with any other detection technology, out of concern for generating false positives. Just because a page has grammatical errors, or uses language that wouldn’t necessarily earn it an A+ in Ms. Johnson’s literature class, doesn’t make it MGC. Certain types of pages are particularly susceptible to being misclassified as MGC (e.g. content written by non-native speakers or children, or non-standard content that falls outside normal language models, like technical manuals and academic papers). To mitigate this, we look not just for evidence of the spamming technique (MGC in this case), but also for other supporting signals that often accompany MGC pages (e.g. keyword stuffing, poor content quality, low page popularity, lack of content uniqueness, etc.), and only with the aid of this corroboration do we make the final determination.
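The combined-evidence rule above can be sketched as requiring corroboration before a final label is applied. The signal names, the candidate threshold, and the requirement of two supporting signals are assumptions made for this illustration only:

```python
def final_mgc_verdict(mgc_signal_count, supporting_signals):
    """Label a page MGC only when content signals are corroborated.

    mgc_signal_count: number of MGC-style content giveaways detected.
    supporting_signals: independent evidence, e.g. a subset of
        {"keyword_stuffing", "low_popularity", "duplicate_content",
         "poor_quality"} (hypothetical names).
    """
    is_candidate = mgc_signal_count >= 2      # enough content evidence
    corroborated = len(supporting_signals) >= 2  # enough independent evidence
    return is_candidate and corroborated
```

This is why a clumsily written but honest page (content signals, no corroboration) survives, while spam that stuffs keywords on top of gibberish does not.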

What has been the impact on the end user & the SEO community?

Users: This update impacted relatively few queries, only ~0.05% (on average ~1 in 10 results was filtered out per impacted query.)

Blame The Meta Keyword Tag
http://blogs.bing.com/webmaster/2014/10/03/blame-the-meta-keyword-tag/
Fri, 03 Oct 2014 20:12:59 +0000

I blame the meta keywords tag. That little so-and-so started all this. Well, the “tag” and a few crafty humans, really. The idea was pretty simple: if the keyword appeared in the meta keywords tag, the page was relevant to the topic. And, on the surface, this was a solid idea — something useful to the systems, yet invisible to the average person. Alas, with a simple right-click, the beginnings of modern-day search optimization were born.

“What if I added more words? What if I repeated the word or phrase?” And so began the quest to modify pages to satisfy a search algorithm. In truth there were many more things at work than just the humble meta keyword tag, but you get the idea.

Today, it’s pretty clear the meta keyword tag is dead in terms of SEO value. Sure, it might have value for contextual ad systems, or serve as a signal to bots plying the web looking for topics to target, but as far as search goes, that tag flatlined as a ranking booster years ago.

Between then and now, however, an entire $20B industry took root and bloomed. By far, most people in this space were legit, but enough weren’t that it still casts a pall over much work associated with SEO today. Inside companies, where groups fight for resources and budget, pitching SEO work is sometimes still tricky business.

We know SEO works when thoughtfully and accurately applied. But those who don’t speak the language are quick to point to other options: social media, paid search, paid social, email, and so on. All are ways to move the needle faster than SEO can promise.

And you know what? They’re right. SEO is the turtle in the race. It’s the unsexy concrete foundation under the building. (I guess where they see “unsexy”, we see ‘exciting’, ‘complex’, ‘engineered’ and ‘fundamental’ elements.) Bottom line, though, is that without a firm base of optimization applied, you’re leaving value on the table. You’re letting the competition get ahead. You’re committing corporate treason.

By covering the base work SEO focuses on, and by tackling the tricky, technical, advanced work, you set the whole business on a much more secure footing. And that’s gotta make you feel good about the work you do, right?

Well, it should, but let’s pause for a moment.

Because the reality today is that no business is successful due to a single element. Product alone rarely makes it happen. Marketing alone doesn’t work. Slick PR won’t save a sick product. Search rankings won’t solve fundamental product problems. Social media has the power to make, or break, you. Essentially, the winning program today requires dedicated investment across all areas.

And that’s the reality check for SEOs. Despite the growth of the industry, despite the number of conferences and events, and despite the more mainstream spotlight moving its way towards this work, SEO is finding its intended niche. It’s a marketing tactic and marketing plays a supporting role in a company. There’s nothing wrong with this.

Today’s SEO is a far cry from the place things started. Yet the goals remain the same: increase traffic, increase revenue. Engines are getting smarter, signals sought to determine ranking are shifting and consumer behavior is changing the landscape on both sides. So will SEO remain an important investment point for businesses?

Yes. It’s the foundation of the house, after all. But a foundation alone does not make a house a home. Everything else you invest in, what you build on top of the foundation, and the people involved accomplish that. It could be said SEO started “for the people”: a webmaster wanting to gain personally. Today, it truly needs to be “about the people”, your customers, or it’ll end up another failure point.

Duane Forrester
Sr. Product Manager
Bing

Extrapolating Malware Detection with Rollup
http://blogs.bing.com/webmaster/2014/09/24/extrapolating-malware-detection-with-rollup/
Wed, 24 Sep 2014 18:57:22 +0000

Protecting Bing users from malware is a top priority for the Index Quality team. To that end, we analyze every signal available to us to determine not only whether a page is infected, but also whether it runs a high risk of infection at a future date. One of the key elements of this analysis is discovering clues about potential vulnerabilities in the ‘container’ hosting the page that could be exploited by malware distributors to spread their malware to other URLs under that container. In this edition of the Bing Index Quality blog, my colleague David Felstead, of our Anti-Malware team, provides an overview of the technique we use to address this and the improvements we recently rolled out to increase its coverage and precision.

Igor Rondel, Principal Development Manager, Bing Index Quality

Unfortunately for webmasters and searchers alike, hacked websites are a very real danger on the web. For a web searcher, visiting a website that has been compromised presents a very real risk of their computer being infected with malware. For a webmaster, the discovery of the root cause of the hack, the cleanup of the compromised code, and finally the damage to reputation and brand can be nightmarish. Bing is scouring the web twenty-four hours a day, seven days a week to discover hacked websites and malware distributors, to better protect our searchers and to keep webmasters who have had the good sense to sign up with Bing Webmaster Tools informed.

One challenge the Bing anti-malware team faces is striking a balance between detection completeness and accuracy, and a major facet of this challenge is understanding when to “roll up” our malware detection, that is, when to consider an entire segment of a site, or the site itself, malicious. At Bing, the nomenclature we use for a collection of URLs at the path, host, or domain level is a “container”, and this is the basic unit we use for rollup. If a container is rolled up, then every URL under that container is considered malware: a rollup on the host “foo.example.com” will cause every URL on that host to be marked as malicious, whereas a rollup under “example.com/malware” will cause all URLs under the path “/malware” and all its sub-paths to be marked as malicious, but not the homepage or other paths. The concept of rollup is fairly well established when thinking about a site’s reputation, be it for malware detection, adult classification, or spam discovery; it is the idea that if more than N% of the URLs in a particular container fall into one specific category, then the likelihood of the remaining URLs in that container being in the same category is increased. In the case of malware, we use it as a proxy to determine how deep the compromise on the site actually runs: is it a few isolated pages, or is the entire website under the control of a malware distributor?
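The container matching described above — a host-level rollup flagging every URL on the host, a path-level rollup flagging everything under that prefix — can be sketched as a simple prefix check. This is a minimal illustration of the matching rule, not Bing’s implementation; the container string format here is an assumption:

```python
from urllib.parse import urlparse

def url_in_container(url, container):
    """True if `url` falls under `container`.

    A container is either a bare host ("foo.example.com") or a host plus
    path prefix ("example.com/malware"), per the examples in the post.
    """
    parsed = urlparse(url)
    if "/" in container:
        host, _, path_prefix = container.partition("/")
        # Path-level rollup: the path and all its sub-paths, but not the
        # homepage or sibling paths
        return parsed.hostname == host and parsed.path.startswith("/" + path_prefix)
    # Host-level rollup: every URL on the host
    return parsed.hostname == container

def is_rolled_up_malware(url, rolled_up_containers):
    """A URL is treated as malware if any rolled-up container covers it."""
    return any(url_in_container(url, c) for c in rolled_up_containers)
```

A production version would also need to handle domain-level containers, subdomain matching, and URL normalization, but the unit of judgment — the container, not the individual URL — is the key idea.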

Recently, our team spent some time re-evaluating and improving rollup for malware detection, specifically the conditions governing when and where a rollup judgment is applied. The balance we need to strike here is to avoid over-triggering the warning when the compromise appears to be localized or already cleaned up. To determine where to roll up (e.g. at the path, host, or domain level), and whether a rollup is warranted at all, we look at many features of a site:

The number of malicious URLs found in each container vs. how many were scanned;

The overall scan coverage of the container;

The frequency at which the malicious URLs were discovered;

The types of infections found;

The size of the container in URLs vs. the size of the site;

The amount of traffic being sent to the container vs. the amount of traffic being sent to the site;

The “depth” from the root of the site of the malicious URLs (e.g. malware on the homepage is much more problematic than malware on a single page deep within the site);

The popularity of the site;

…and many more
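One simple way to combine a feature list like the one above is a weighted score per container. The features chosen, their weights, and the threshold below are all invented for this sketch; the post deliberately does not disclose the real evaluation logic:

```python
# Hypothetical weights over normalized [0, 1] container features
FEATURE_WEIGHTS = {
    "infected_fraction": 0.35,    # malicious URLs found vs. URLs scanned
    "scan_coverage": 0.10,        # how much of the container was scanned
    "detection_frequency": 0.15,  # how often malicious URLs were discovered
    "infection_severity": 0.15,   # types of infections found
    "traffic_share": 0.10,        # traffic to container vs. whole site
    "shallowness": 0.10,          # 1.0 near the root, lower for deep pages
    "site_popularity": 0.05,
}

def rollup_score(features):
    """Weighted sum of container features; missing features count as 0."""
    return sum(FEATURE_WEIGHTS[name] * features.get(name, 0.0)
               for name in FEATURE_WEIGHTS)

def should_consider_rollup(features, threshold=0.5):
    """Flag the container as a rollup candidate above the threshold."""
    return rollup_score(features) >= threshold
```

Note how this mirrors the post’s intuition: malware on the homepage (high “shallowness”) or across a large fraction of scanned URLs pushes a container toward rollup much faster than a single deep, isolated detection.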

Using this set of features, each container on the site is evaluated to determine if rollup should be applied. By intuition, one might think “well, if you found malware anywhere on this site, shouldn’t the entire site be marked as risky?”, and that is indeed a valid argument. However, we need to take into account that compromises occur in a variety of ways, and by their nature are often extremely transient. Even the most secure, trusted sites may occasionally have malware detected on them not as the result of webmaster carelessness or misconfiguration (what we traditionally consider “being hacked”), but from malicious ads being distributed through third-party ad networks; not an uncommon experience.

In the case of ad network compromise, infections tend to be transient and short-lived, often occurring only once and perhaps never being shown to a real person; here, a rollup of a site or container would be unwarranted. However, if the infection is persistent (i.e. observed several times), widespread (across many URLs on a site), or recurrent (cleaned up, then reoccurring), then rollup is likely the best way to protect the users of the site.
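The transient-versus-persistent distinction above reduces to a decision rule like the following. The field names and thresholds are assumptions for illustration — the post does not specify how many observations count as “persistent” or what fraction counts as “widespread”:

```python
from dataclasses import dataclass

@dataclass
class ContainerHistory:
    detections: int      # how many times malware was observed over time
    infected_urls: int   # distinct malicious URLs found in the container
    scanned_urls: int    # URLs scanned in the container
    reinfections: int    # times the container was cleaned, then infected again

def should_roll_up(h: ContainerHistory) -> bool:
    """Roll up when the infection is persistent, widespread, or recurrent."""
    persistent = h.detections >= 3
    widespread = h.scanned_urls > 0 and h.infected_urls / h.scanned_urls > 0.5
    recurrent = h.reinfections >= 1
    return persistent or widespread or recurrent
```

Under this rule, a single malicious ad impression on one page of a large site (one detection, one URL, no reinfection) never triggers rollup, while a site that keeps getting re-compromised does — matching the behavior the post describes.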

Since we made the improvements to our rollup algorithm, we have observed the following changes, which we feel indicate a much higher level of protection for our customers:

Rollup coverage on URLs in the Bing crawled index increased by 2x

60% more high-risk malware URLs flagged with rollup on Bing SERPs

Approximately 0.015% of Bing query traffic affected, that is ~1 in every 7000 queries

From a webmaster perspective, Bing reports rollup and infection information via Bing Webmaster Tools, so if you’re a webmaster and have not signed up, what are you waiting for?

As always, we are constantly observing and re-evaluating our data, telemetry, techniques, and technologies, not to mention the state of the malware ecosystem on the web, to provide the best and most secure search experience to Bing users. The web is a dynamic and ever-changing place, even more so when it comes to illegal activity such as malware distribution. As such, we never have the luxury of “resting on our laurels”, so check back regularly for more updates and information about what we do here at Bing; we have plenty to share.