Flash SEO Primer

Dynamic websites have advantages over their static counterparts. They don’t
need to reload pages, they offer rich client-side functionality, and they can
use powerful graphics. But the biggest drawback often negates any advantage:
many portions of dynamic sites are not SEO-friendly, if they are crawlable at all.

That’s about to change. Google will soon index Flash sites and, as time goes
on and Google improves its capabilities, others will follow suit. As such, you
should get familiar with some general guidelines for implementing SEO practices
on Flash-based dynamic websites. Knowledge of Flash programming is not
required, but some of the concepts outlined below will be easier to grasp
with it.

DEEP LINKS
The main difference between static and dynamic websites is that static sites are
organized into pages. And as static pages dominate the current landscape, most
Internet practices are built around pages: they are referenced by search engines,
tracked in statistics, and stored in browser history. So, in order to compete,
dynamic sites must mimic page organization. This is the practice of building
deep links.

The core concept of deep links is the entry point. Each virtual page should
have a real URL and, when loaded from this URL, the dynamic website should show
that virtual page. For example, if http://yoursite.com displays the main page of
the site, http://yoursite.com/news could load the same site but show the “news”
page after the initial loading. From the main page you could still navigate to
the news, and the URL would remain “http://yoursite.com”, since Flash does not
require reloading the page. Here, deep links are used for external reference only.

It is recommended to use deep links together with browser history integration,
so your virtual page structure remains the same. The integration is achieved by
using #-based links, which do not require reloading but are stored in browser
history, allowing the “back” button to function correctly. With this addition,
the example above changes slightly: if you navigate to http://yoursite.com and
then go to “news,” the URL will change to http://yoursite.com/#news without
reloading the page.
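The handling of the two link forms can be sketched in browser-side JavaScript. This is a minimal illustration, not the article’s own code; the page names and URL layout are assumptions:

```javascript
// Resolve a virtual page name from either deep-link form:
//   http://yoursite.com/news   (path-based, for external references)
//   http://yoursite.com/#news  (#-based, for in-site navigation)
function virtualPageFromUrl(url) {
  var hash = url.indexOf('#');
  if (hash !== -1) {
    return url.substring(hash + 1) || 'main';
  }
  var path = url.replace(/^https?:\/\/[^\/]+\/?/, '');
  return path || 'main';
}
```

In the browser, the site would call this on load and on every hash change, e.g. `window.onhashchange = function () { show(virtualPageFromUrl(location.href)); };`, where `show` stands for the site’s own page-switching routine.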

The reason why it is recommended to have both the http://yoursite.com/news and
http://yoursite.com/#news variants of your URL is that the #-based part of a URL
is often removed by tools and search engines, so the first variant guarantees
link survival.

To further increase usability, it is also recommended to update the context
menus for all such links, so the virtual pages can be opened in a new browser
window. This will provide maximum comfort for website users, especially
conservative ones.

The deep links created this way should be exposed to all external services. For
example, a standard sitemap should contain deep link references. Another
specific example is Google Analytics and similar tools.

Google Analytics has APIs for dynamic websites, which can record any client-side
action, such as opening a new page. This way you can record your deep links as
users navigate through the system. But remember to record links in full format
(http://yoursite.com/news): Google Analytics is one of the tools that removes
the #-based URL section.
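As a sketch, with the classic ga.js tracker of the time a virtual pageview would be recorded via `pageTracker._trackPageview(...)`; the helper below is hypothetical, not part of the Google API, and only builds the full-format path to pass in:

```javascript
// Hypothetical helper: build the full-format tracking path for a
// virtual page, avoiding the #-based form that Google Analytics strips.
function trackablePath(page) {
  return '/' + page; // e.g. 'news' -> '/news'
}

// With the classic ga.js tracker this would then be called as:
//   pageTracker._trackPageview(trackablePath('news'));
```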

SEARCHABLE CONTENT
Deep links alone are not enough; the content behind them must also be readable
to crawlers. The first part of this problem is content readability. Usually,
data is loaded into the .swf file from XML like this:
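For illustration (the element and attribute names are hypothetical, not from a real site), the XML might take one of two forms:

```
<!-- Variant 1: the content link is a full URL -->
<item link="http://yoursite.com/news.xml?good">Good news</item>

<!-- Variant 2: the link is encoded in a site-specific notation -->
<item ref="news:good">Good news</item>
```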

When the XML refers to content with a full URL, it is quite easy for search
engines to find the link; when it uses a custom notation, extracting the link is
a tricky task. However, both variants have their problems.

A better solution is to use correct XHTML. For programmers it doesn’t matter
what they parse, and for search engines it will be much easier to understand. So
the previous example would look like this:
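A hypothetical XHTML rendering of the same data (the markup is illustrative):

```
<ul class="news">
  <li><a href="http://yoursite.com/news.xml?good">Good news</a></li>
</ul>
```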

The second part of the problem is correct links. In the example above, the
website gives internal content links to the search engine. Instead of going to
the deep link http://yoursite.com/news/good, the search engine will store the
link to the content itself. This will result in users visiting strange,
unformatted HTML pages instead of your beautiful dynamic website.

To fix this, you need to expose external links only and reconstruct the internal
links from them. If the site’s structure is strict, the conversion is easy, as
this code shows:
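A sketch of such a reconstruction in JavaScript, assuming a strict two-level scheme where the deep link http://yoursite.com/news/good maps to the data URL http://yoursite.com/news.xml?good (the mapping follows this article’s example; the function itself is illustrative):

```javascript
// Convert an external deep link into the internal data URL.
// Assumes http://yoursite.com/<section>/<id> maps to
// http://yoursite.com/<section>.xml?<id>.
function dataUrlFromDeepLink(deepLink) {
  var match = deepLink.match(/^(https?:\/\/[^\/]+)\/([^\/]+)\/([^\/]+)$/);
  if (!match) {
    return deepLink; // not a two-level deep link; leave it unchanged
  }
  return match[1] + '/' + match[2] + '.xml?' + match[3];
}
```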

Here the content is being read from http://yoursite.com/news.xml?good. It’s a
simple conversion and a great help to the search engines.

The third part of the problem is content parsing. For the deep link
http://yoursite.com/news/good, for example, the search engine should parse the
news text. But this doesn’t happen. When the page is serving real users, the
workflow is the following:

The deep link loads the common .swf file.

This file reads and parses the URL.

From parsing its URL the file generates the link, http://yoursite.com/news.xml?good
and reads its content.

The content is shown to the user.

These steps are too complex for a simple spider. It reads the .swf file but sees
no text and no links inside.

A great solution for all three problems is RSS/Atom. It has a “link” field,
where you can place the correct links, and a “description” field, where the
content can be posted. So if your content is reflected in RSS/Atom, it will be
parsed by search engines correctly. It is recommended to keep the description
field brief; if your descriptions are too long (whole articles or even books),
they will not be searchable this way. In addition, with both feeds and readable
content, relevance will be higher.
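A feed item carrying both the deep link and a brief description might look like this (RSS 2.0 layout; the text is illustrative):

```
<item>
  <title>Good news</title>
  <link>http://yoursite.com/news/good</link>
  <description>A brief, search-friendly summary of the article.</description>
</item>
```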

The key property of searchable content is that it rarely changes. After an
article is added, it is stored in one location and contains fixed content. This
guarantees good search engine ratings, but it also opens an interesting
possibility: for SEO reasons, .swf files can be generated dynamically.

There is an open source Adobe Flex SDK which allows you (among other things) to
build Flex projects dynamically, for example from a Java application. So, when
you add an article, the site could generate a specific version of the main
website’s .swf file containing the article’s text. In this instance, the .swf
loaded at the http://yoursite.com/news/good deep link will contain the news text
and will be easily parsed.
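A sketch of the generation step (the MXML template and file layout are hypothetical; the actual compilation to .swf would be done with the Flex SDK’s mxmlc compiler, which can also be driven from Java):

```javascript
// Hypothetical: produce per-article MXML source that embeds the
// article text, ready to be compiled into news/good.swf by mxmlc.
function articleMxml(title, body) {
  return '<?xml version="1.0" encoding="utf-8"?>\n' +
    '<mx:Application xmlns:mx="http://www.adobe.com/2006/mxml">\n' +
    '  <mx:Text>\n' +
    '    <mx:text>' + title + '. ' + body + '</mx:text>\n' +
    '  </mx:Text>\n' +
    '</mx:Application>\n';
}
```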

The disadvantage of this solution is the large amount of disk space required for
the .swf files. To reduce it, move all shared content to a single place. This is
especially important for graphics, which take up the most disk space.

ADDITIONAL TIPS
To make your links searchable, make them as straightforward as possible.
Programmers tend to build flexible solutions in which pages are addressed by
combinations of parameters, like this:
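For example, contrast a straightforward link with a flexible, parameter-driven one (both URL schemes are illustrative):

```
Straightforward:  http://yoursite.com/news/good
Flexible:         http://yoursite.com/#view=list&section=12&item=7
```

The first form names the page directly; the second is harder for crawlers to enumerate and for users to share.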

According to Web standards, crawlers should identify themselves in the
“User-Agent” field of an HTTP request. This allows dynamic websites to recognize
spiders and serve adapted static content to them. The disadvantage of this
technique is that it is convenient for cheating, so search engines can be
expected to mimic regular users periodically. A website caught serving different
content to regular users and crawlers could be marked as a cheater, depending on
the search engine’s policy. Use this technique at your own risk.
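A minimal sketch of the detection itself (the crawler tokens are common examples from the era; real lists are longer and change over time):

```javascript
// Guess whether a request comes from a crawler by inspecting the
// User-Agent header. The token list is illustrative, not exhaustive.
function isCrawler(userAgent) {
  var tokens = ['googlebot', 'yahoo! slurp', 'msnbot'];
  var ua = (userAgent || '').toLowerCase();
  for (var i = 0; i < tokens.length; i++) {
    if (ua.indexOf(tokens[i]) !== -1) {
      return true;
    }
  }
  return false;
}
```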

With proper practices, dynamic Flash websites can be made almost as SEO-friendly
as static sites. But it is still recommended to have a static version of your
website, for the following reasons:

Some users are very conservative.

Not all browser features can be emulated (causing dissatisfaction among these
conservative users).

Not all search engines parse .swf files, and those that do may differ in their
results.

The .swf parsing has low SEO-cheat protection. Search engines put plenty
of effort into detecting invisible content in HTML in order to give it low
ratings. As long as parsers are unable to find invisible elements in .swf
files, they could rate the whole Flash site as more suspicious than a static
HTML site.

About the Author: Andrey Gavrilov has worked for over 10 years as a programmer,
project manager and analyst in the IT industry. Read his blog at
www.freelanceconsult.biz.