
Searchable Dynamic Content With AJAX Crawling

Google Search likes simple, easy-to-crawl websites. You like dynamic websites that show off your work and that really pop. But search engines can’t run your JavaScript. That cool AJAX routine that loads your content is hurting your SEO.

Google’s robots parse HTML with ease; they can pull apart Word documents, PDFs and even images from the far corners of your website. But as far as they’re concerned, AJAX content is invisible.

The Problem With AJAX

AJAX has revolutionized the Web, but it has also hidden its content. If you have a Twitter account, try viewing the source of your profile page. There are no tweets there — just code! Almost everything on a Twitter page is built dynamically through JavaScript, and the crawlers can’t see any of it. That’s why Google developed AJAX crawling.

Because Google can't get dynamic content from your HTML, you will need to provide it in another way. But there are two big problems: Google won't run your JavaScript, and it doesn't trust you.

Google indexes the entire Web, but it doesn't run JavaScript. Modern websites are little applications that run in the browser, but running those applications while indexing is just too slow for Google, and for everyone else.

The trust problem is trickier. Every website wants to come out first in search results; your website competes with everyone else’s for the top position. Google can’t just give you an API to return your content because some websites use dirty tricks like cloaking to try to rank higher. Search engines can’t trust that you’ll do the right thing.

Google needs a way to let you serve AJAX content to browsers while serving simple HTML to crawlers. In other words, you need the same content in multiple formats.

Two URLs For The Same Content

Let’s start with a simple example. I’m part of an open-source project called Spiffy UI. It’s a Google Web Toolkit (GWT) framework for REST and rapid development. We wanted to show off our framework, so we made SpiffyUI.org using GWT.

GWT is a dynamic framework that puts all of our content in JavaScript. Our index.html file is little more than the bootstrap script for the compiled module (GWT names that script <module>.nocache.js; the path below is illustrative):

   <script type="text/javascript" language="javascript"
       src="spiffyui/spiffyui.nocache.js"></script>

Everything is added to the page with JavaScript, and we control our content with hash tags (I’ll explain why a little later). Every time you move to another page in our application, you get a new hash tag. Click the “CSS” link, and in most browsers the URL in the address bar will end with a hash fragment like #css.

We’ve since fixed these URLs up with HTML5. I’ll show you how later in this article.

This simple hash works well for our application and makes it bookmarkable, but it isn’t crawlable. Google doesn’t know what a hash tag means or how to get the content from it, but it does provide an alternate method for a website to return content. So, we let Google know that our hash is really JavaScript code instead of just an anchor on the page by adding an exclamation point (a “bang”) after the hash, like this: #!css

This hash bang is the secret sauce in the whole AJAX-crawling scheme. When Google sees those two characters together, it knows that more content is hidden behind JavaScript, and it gives us a chance to return the full content by making a second request to a special URL in which the #! is replaced with ?_escaped_fragment_=.

Using a URL parameter instead of a hash tag is important, because parameters are sent to the server, whereas hash tags are available only to the browser.

That new URL lets us return the same content in HTML format when Google’s crawler requests it. Confused? Let’s look at how it works, step by step.
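
To make the mapping concrete, here is a small sketch (not from the Spiffy UI codebase; the function name is ours) of the rewrite the crawler performs: take everything after the #!, percent-encode it, and append it as the _escaped_fragment_ parameter.

```javascript
// Sketch of the crawler-side rewrite: "#!css" becomes
// "?_escaped_fragment_=css". Illustrative only.
function toEscapedFragment(url) {
  var parts = url.split('#!');
  if (parts.length < 2) {
    return url; // no hash bang, nothing to rewrite
  }
  // If the URL already has query parameters, append with "&".
  var separator = parts[0].indexOf('?') === -1 ? '?' : '&';
  return parts[0] + separator + '_escaped_fragment_=' +
      encodeURIComponent(parts[1]);
}
```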

Snippets Of HTML

The whole page is rendered in JavaScript. We needed to get that content into HTML so that it would be accessible to Google. The first step was to separate our content into snippets of HTML.

Google still thinks of a website as a set of pages, so we needed to serve our content that way. This was pretty easy with our application, because we have a set of pages, and each one is a separate logical section. The first step was to make the pages bookmarkable.

Bookmarking

Most of the time, JavaScript just changes something within the page: when you click that button or pop up that panel, the URL of the page does not change. That’s fine for simple pages, but when you’re serving content through JavaScript, you want to give users unique URLs so that they can bookmark certain areas of your application.

JavaScript applications can change the URL of the current page, so they usually support bookmarking via the addition of hash tags. Hash tags work better than any other URL mechanism because they’re not sent to the server; they’re the only part of the URL that can be changed without having to refresh the page.

The hash tag is essentially a value that makes sense in the context of your application. Choose a tag that is logical for the area of your application that it represents, and add it after the hash, like this: #css

When a user accesses this URL again, we use JavaScript to read the hash tag and send the user to the page that contains the CSS.
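
In code, the read-the-hash step can be as simple as this sketch. The showPage function is a hypothetical helper standing in for your application’s own navigation:

```javascript
// Pure helper: extract the page ID from a hash fragment.
function pageIdFromHash(hash) {
  return hash.replace(/^#!?/, ''); // "#css" or "#!css" -> "css"
}

// Wire it up in the browser (guarded so the helper stays testable).
if (typeof window !== 'undefined') {
  window.onhashchange = function () {
    showPage(pageIdFromHash(window.location.hash)); // hypothetical showPage()
  };
}
```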

You can choose anything you want for your hash tag, but try to keep it readable, because users will be looking at it. We use hash tags like css, rest and security.

Because you can name the hash tag anything you want, adding the extra bang for Google is easy. Just slide it between the hash and the tag, like this: #!css

You can manage all of your hash tags manually, but most JavaScript history frameworks will do it for you. All of the plug-ins that support HTML4 use hash tags, and many of them have options for making URLs bookmarkable. We use History.js by Ben Lupton. It’s easy to use, it’s open source, and it has excellent support for HTML5 history integration. We’ll talk more about that shortly.

Serving Up Snippets

The hash tag makes an application bookmarkable, and the bang makes it crawlable. Now Google can ask for the escaped-fragment version of each of our URLs.

When the crawler accesses one of these ugly URLs, we need to return simple HTML. We can’t handle that in JavaScript, because the crawler doesn’t run JavaScript. So, it all has to come from the server.
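
However you build the server side, the first step is just to notice the crawler’s parameter. A framework-neutral sketch, where query stands in for your parsed query string:

```javascript
// Return true when a parsed query-string map contains the
// _escaped_fragment_ parameter that Google's crawler sends.
function isCrawlerRequest(query) {
  return Object.prototype.hasOwnProperty.call(query, '_escaped_fragment_');
}
```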

You can implement your server in PHP, Ruby or any other language, as long as it delivers HTML. SpiffyUI.org is a Java application, so we deliver our content with a Java servlet.

The escaped fragment tells us what to serve, and the servlet gives us a place to serve it from. Now we need the actual content.

Getting the content to serve is tricky. Most applications mix the content in with the code, and we didn’t want to parse the readable text out of the JavaScript. Luckily, Spiffy UI has an HTML-templating mechanism. The templates are embedded in the JavaScript but also included on the server. When the escaped fragment asks for the ID css, we just have to serve CSSPanel.html.
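
Our real look-up lives in a Java servlet, but the idea fits in a few lines. Here is a JavaScript sketch; only CSSPanel.html is from the article, the other file names are hypothetical:

```javascript
// Map escaped-fragment IDs to the HTML snippet files on the server.
// File names other than CSSPanel.html are hypothetical.
var SNIPPETS = {
  css: 'CSSPanel.html',
  rest: 'RestPanel.html',
  security: 'SecurityPanel.html'
};

function snippetFor(fragmentId) {
  // Fall back to the home page content for unknown IDs.
  return SNIPPETS.hasOwnProperty(fragmentId) ? SNIPPETS[fragmentId]
                                             : 'index.html';
}
```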

The template without any styling looks very plain, but Google just needs the content. Users see our page with all of the styles and dynamic features; Google gets only the unstyled version.
You can see all of the source code for our servlet. It is mostly just a look-up table that takes an ID and serves the associated content from somewhere on our server. The same class also handles the generation of our site map.

Tying It All Together With A Site Map

Our site map tells the crawler what’s available in our application. Every website should have a site map; AJAX crawling doesn’t work without one.

Site maps are simple XML documents that list the URLs in an application. They can also include data about the priority and update frequency of the app’s pages. Normal entries for site maps look like this:
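
Reconstructed from the standard sitemap protocol, with example.com as a placeholder domain:

```xml
<url>
  <loc>http://www.example.com/</loc>
  <changefreq>weekly</changefreq>
  <priority>0.8</priority>
</url>
```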


Our AJAX-crawlable entries look like this:
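
A reconstructed example; under the scheme, you list the hash-bang URL itself in the site map (again, the domain is a placeholder):

```xml
<url>
  <loc>http://www.example.com/#!css</loc>
  <changefreq>weekly</changefreq>
  <priority>0.8</priority>
</url>
```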


The hash bang tells Google that this is an escaped fragment, and the rest works like any other page. You can mix and match AJAX URLs and regular URLs, and you can use only one site map for everything.

You could write your site map by hand, but there are tools that will save you a lot of time. The key is to format the site map well and submit it to Google Webmaster Tools.

Google Webmaster Tools

Google Webmaster Tools gives you the chance to tell Google about your website. Log in with your Google ID, or create a new account, and then verify your website.


Once you’ve verified, you can submit your site map and then Google will start indexing your URLs.

And then you wait. This part is maddening. It took about two weeks for SpiffyUI.org to show up properly in Google Search. I posted to the help forums half a dozen times, thinking it was broken.

There’s no easy way to make sure everything is working, but there are a few tools to help you see what’s going on. The best one is Fetch as Googlebot, which shows you exactly what Google sees when it crawls your website. You can access it in your dashboard in Google Webmaster Tools under “Diagnostics.”


Enter a hash bang URL from your website, and click “Fetch.” Google will tell you whether the fetch has succeeded and, if it has, will show you the content it sees.


If Fetch as Googlebot works as expected, then you’re returning the escaped URLs correctly. But you should check a few more things:

  • Validate your site map.
  • Manually try the URLs in your site map. Make sure to try both the hash-bang and the escaped versions.
  • Check the Google results for your website by searching with a site: query for your domain.

Making Pretty URLs With HTML5

Twitter leaves the hash bang visible in its URLs, like this: http://twitter.com/#!/ZackGrossbart

This works well for AJAX crawling, but again, it’s slightly ugly. You can make your URLs prettier by integrating HTML5 history.

Spiffy UI uses HTML5 history integration to turn a hash-bang URL ending in #!css into a prettier one that drops the hash bang entirely.

HTML5 history makes this possible. The hash tag is the only part of the URL that you can change in HTML4; if you change anything else, the entire page reloads. HTML5 history changes the entire URL without refreshing the page, so we can make the URL look any way we want.

This nicer URL works in our application, but we still list the hash-bang version on our site map. And when browsers access the hash-bang URL, we change it to the nicer one with a little JavaScript.
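
That upgrade step can be sketched like this; prettyPathFromHash is our illustrative helper, and the pushState call only runs in browsers that support HTML5 history:

```javascript
// Pure helper: "#!css" -> "/css"; anything else -> null.
function prettyPathFromHash(hash) {
  return hash.indexOf('#!') === 0 ? '/' + hash.substring(2) : null;
}

// In a supporting browser, rewrite the address bar without reloading.
if (typeof window !== 'undefined' &&
    window.history && window.history.pushState) {
  var prettyPath = prettyPathFromHash(window.location.hash);
  if (prettyPath !== null) {
    window.history.pushState(null, '', prettyPath);
  }
}
```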

Cloaking

Earlier, I mentioned cloaking. It is the practice of trying to boost a website’s ranking in search results by showing one set of pages to Google and another to regular browsers. Google doesn’t like cloaking and may remove offending websites from its search index.

AJAX-crawling applications always show different results to Google than to regular browsers, but it isn’t cloaking if the HTML snippets contain the same content that the user would see in the browser. The real mystery is how Google can tell whether a website is cloaking or not; crawlers can’t compare content programmatically because they don’t run JavaScript. It’s all part of Google’s Googley power.

Regardless of how it’s detected, cloaking is a bad idea. You might not get caught, but if you do, you’ll be removed from the search index.

Hash Bang Is A Little Ugly, But It Works

I’m an engineer, and my first response to this scheme is “Yuck!” It just feels wrong; we’re warping the purpose of URLs and relying on magic strings. But I understand where Google is coming from; the problem is extremely difficult. Search engines need to get useful information from inherently untrustworthy sources: us.

Hash bangs shouldn’t replace every URL on the Web. Some websites have had serious problems with hash-bang URLs because they rely on JavaScript to serve content. Simple pages don’t need hash bangs, but AJAX pages do. The URLs do look a bit ugly, but you can fix that with HTML5.

Further Reading

We’ve covered a lot in this article. Supporting AJAX crawling means that you need to change both your client’s code and your server’s code.

Thanks to Kristen Riley for help with some of the images in this article.



Zack Grossbart is an engineer, designer, and author. He's an Architecting Engineer and Human Factors Specialist at Micro Focus where he focuses on enterprise identity and security. Zack began loading DOS from a floppy disk when he was five years old. He's been getting paid to code since he was 15 and started his first software company when he was 16. Zack lives in Cambridge, Massachusetts with his wife and daughter.

  1. 1

    Not a single mention of the History API? Really?

    • 2

      Vitaly Friedman

      September 27, 2011 3:23 pm

      I think it would make a great follow-up article, don’t you think, Chris? ;-)

    • 3

      We’re using history.js to interact with the History API. I’d love to see a follow-up article about working with the browser history.

  2. 4

    I would imagine that cloaking detection would have to come from file/page size comparisons. If one page is 10kb to one Googlebot, but is 100kb to another Googlebot, then there is an obvious difference in content.

    Just my theory.

  3. 5

    I didn’t read the whole post, so maybe I’m off topic.

    Why not leave all URLs as they are and rewrite them on the server when they’re used? I’m using this on one site and have had no problems. On the server side, if you use hashes, it goes into partial rendering or just data retrieval, so you render via JavaScript; without hashes, it renders as usual. Wouldn’t that be much prettier?

    • 6

      Michael Yagudaev

      May 20, 2012 10:50 am

      Yes, I think you are right, Luke. I don’t like the “ugly URLs” that the Google AJAX API uses. A much better approach would be to use the same URL for both things and use the HTML5 history API to avoid that ugly hash in the URL.

      Things should be seamless to the user and the search engine.

  4. 7

    Great article Zach, and very timely too. We have faced this situation many times and wondered how best to address honest SEO without running the risk of facing a ban from the same search engines we’re attempting to accommodate.

  5. 8

    TL;DR? Bottom line: this is ugly, incompatible, user-unfriendly and doesn’t make use of the capabilities of the web.

    Isn’t this a step backwards? It makes use of JavaScript for all content, which renders the page useless for browsers without JavaScript or with it disabled. I suppose it doesn’t work that well with Opera Mobile and other proxy browsers, but I haven’t got one here, so I can’t be definite about it.

    I think the web community should take a step away from using the hash for something other than what it’s for, and instead feed the website in a non-JS version and just catch (proper) URLs with JavaScript to make the site dynamic (“ajaxy”). By proper URLs I mean URLs that actually link to where they lead. Without proper links, it’s impossible to open a link in a new tab.

    So this is clearly (in my eyes) a step backwards. Also, /css is prettier than #!css, right? It takes less time to type and fewer characters too. And having the slash as a section divider is a common and user-known pattern. Why add a seemingly (and practically) unneeded character?

  6. 9

    Very nice and informative article. Thank you.

  7. 10

    Good article, except you kept mentioning that Google can’t run JavaScript. This is incorrect: Google can run JavaScript, which we have verified through our own testing.

    We have found that Google doesn’t like AJAX, but normal JavaScript may be fine.

    • 11

      Thanks for the correction Marc. I haven’t been able to get Google to run any GWT generated JavaScript and haven’t had much luck with simple JavaScript either.

      What types of normal JavaScript have you been able to get Google to run? How did you test it?


      • 12

        You can run WebKit from the CLI, for sure. There are articles on how to do good headless JS stuff.

        If anyone can run ANY JS code, why can’t Google?

        The question is how Google can handle onClick events attached by AJAX. I bet hardly at all, but if AJAX changes the content so that the new DOM contains proper hrefs, there is no reason big G can’t read it.

        The question is: do they want to?

        If everyone started using AJAX content, it would mean many millions of dollars more to run Google’s bots, as JS can be very power-consuming, so they don’t advertise it.

        • 13

          JavaScript can be very OO, even to the point of being Java-like, but you do have to implement it yourself. Honestly, though, you don’t always need to be completely OO. I much prefer Python and JavaScript to Java, because I feel more productive in general. I don’t need to write an excessive amount of text to do the exact same things, and I can have unit testing and all of that great stuff the same. Even Sun is adding dynamic concepts to Java with closures, and by bringing dynamic languages to the JVM. Rhino, Jython, JRuby: they are all officially supported and being developed by Sun in some way.

  8. 14

    Benjamin Lupton

    September 27, 2011 3:16 pm

    Great and thorough article Zack :-)

    Though I’m a bit confused about why you only recommend using the HTML5 History API for clean URLs. With the HTML5 History API you don’t need hash bangs at all, and it’s a much simpler solution altogether, as it requires no server-side work (where the hash bang does).

    There is a nice article detailing graceful AJAX with the HTML5 History API here:

    Would love to know your thoughts on this Zack, always willing to learn :-)

    • 15

      Hi Ben,

      GitHub does a really good job of integrating HTML5 history without needing the hashbang. They serve each page individually as static content, but they also load them dynamically when you’re clicking around. That’s an awesome way to get clean URLs, but it doesn’t work for every site.

      Your article talks about loading each page, but that doesn’t help for pages which load snippets and other smaller chunks of content through AJAX, like Twitter. This is another place in web design where you have to pick the solution that’s right for your site.


      • 16


        While I understand Zack’s point about the need for the hash bang (for AJAX “lazy loading,” to my understanding), I’m wondering why Twitter is no longer using the hash bang in its URLs. Any insight, Zack?


  9. 17

    Well, IMHO you make simple things sound complicated. I kind of had a problem with fetching content for a site with AJAX submenus, but it was all sorted out with a site tree page. And if you really need the hash sign, use .htaccess.

  10. 18

    I’ve read many articles about crawling ajax content, but this one just covered it all with ease ;) Thanks!

  11. 19

    Michaël van Oosten

    September 28, 2011 6:37 am

    Thanks for sharing, Zack; a very nice read. Nice to see the results when you Google the site; works like a charm! Good job.

  12. 20

    Sweet. I work at a company that uses GWT internally, where SEO hasn’t been important, but we’re about to develop some pages for public consumption.

    Added bonus is I’ll be checking out Spiffy for our use, too.

  13. 21

    Excellent. AJAX is great for websites but as mentioned it comes with its disadvantages. The main thing when building websites to rank in search engines is ensuring they are correct!

  14. 22

    Hi, I really liked this article. But why can’t I share it with my friends using Like?



  16. 24

    Andreas Brekken

    September 11, 2012 12:02 am

    A little late to the party here, but I’ve created a service whereby your website serves a browser-rendered HTML snapshot of itself in accordance with Google’s AJAX crawling specification. If anyone is interested in testing this on their website, drop me an e-mail.

