In this article, I will talk mostly about Vue.js, since it is the framework I’ve used most, and with which I have direct experiences in terms of indexing by the search engines on major projects, but I can assume that most of what I will cover is valid for other frameworks, too.
Replacing jQuery With Vue.js
Did you know that you can incorporate Vue into your project the same way that you would incorporate jQuery — with no build step necessary? Read a related article →
Some Background On The Problem
How Indexing Works
For your website to be indexed by Google, it needs to be crawled by Googlebot (an automated indexing software that visits your website and saves the contents of pages to its index) following links within each page. Googlebot also looks for special Sitemap XML files in websites to find pages that might not be linked correctly from your public site and to receive extra information on how often the pages in the website change and when they have last changed.
A Little Bit Of History
“Serving Googlebot different content than a normal user would see is considered cloaking, and would be against our Webmaster Guidelines.”
Google (with its AJAX crawling scheme) also guaranteed that you would avoid penalties due to the fact that in this case you were serving different content to Googlebot and to the user. However, since 2015, Google has deprecated that practice with an official blog post that told website managers the following:
How Does Google Actually Index Pages Created With Front-End Frameworks?
In order to see what Google actually indexes in websites that have been created with a front-end framework, I built a little experiment. It does not cover all use cases, but it is at least a means to find out more about Google’s behavior. I built a small website with Vue.js and had different parts of text rendered differently.
The website’s contents are taken from the description of the book Infinite Jest by David Foster Wallace in the Infinite Jest Wiki (thanks guys!). There are a couple of introductory texts for the whole book, and a list of characters with their individual biography:
- Some text in the static HTML, outside of the Vue.js main container;
- Some text is rendered immediately by Vue.js because it is contained in variables which are already present in the application’s code: they are defined in the component’s
- #Some text is rendered by Vue.js from the
dataobject, but with a delay of 300ms;
- The character bios come from a set of rest APIs, which I’ve built on purpose using Sandbox. Since I was assuming that Google would execute the website’s code and stop after some time to take a snapshot of the current state of the page, I set each web service to respond with an incremental delay, the first with 0ms, the second with 300ms, the third with 600ms and so on up to 2700ms.
Each character bio is shortened and contains a link to a sub-page, which is available only through Vue.js (URLs are generated by Vue.js using the history API), but not server-side (if you call the URL of the page directly, you get no response from the server), to check if those got indexed too. I assumed that these would not get indexed, since they are not proper links which render server-side, and there’s no way that Google can direct users to those links directly. But I just wanted to check.
I published this little test site to my Github Pages and requested indexing — take a look.
The results of the experiment (concerning the homepage) are the following:
- The contents which are already in the static HTML content get indexed by Google (which is rather obvious);
- The contents which are generated by Vue in real-time always get indexed by Google;
- The contents which are generated by Vue, but rendered after 300ms get indexed as well;
- The contents which come from the web service, with some delay, might get indexed, but not always. I’ve checked Google’s indexing of the page in different moments, and the content which was inserted last (after a couple of seconds) sometimes got indexed, sometimes it didn’t. The content that gets rendered pretty quickly does get indexed most of the time, even if it comes from an asynchronous call to an external web service. This depends on Google having a render budget for each page and site, which depends on its internal algorithms, and it might vary wildly depending on the ranking of your site and the current state of Googlebot’s rendering queue. So you cannot rely on content coming from external web services to get indexed;
- The subpages (as they are not accessible as a direct link) do not get indexed as expected.
What does this experiment tell us? Basically, that Google does index dynamically generated content, even if comes from an external web service, but it is not guaranteed that content will be indexed if it “arrives too late”. I have had similar experiences with other real, production websites besides this experiment.
Okay, so the content gets indexed, but what this experiment doesn’t tell us is: will the content be ranked competitively? Will Google prefer a website with static content to a dynamically-generated website? This is not an easy question to answer.
From my experience, I can tell that dynamically-generated content can rank in the top positions of the SERPS. I’ve worked on the website for a new model of a major car company, launching a new website with a new third-level domain. The site was fully generated with Vue.js — with very little content in the static HTML besides
<title> tags and
The site started ranking for minor searches in the first few days after publication, and the text snippets in the SERPs reported words coming directly from the dynamic content.
Within three months it was ranking first for most searches related to that car model — which was relatively easy since it was hosted on an official domain belonging to the car’s manufacturer, and the domain was heavily linked from reputable websites.
But given the fact that we had had to face strong opposition from the SEO company that was in charge of the project, I think that the result was still remarkable.
Due to the tight deadlines and lack of time given for the project, we were going to publish the site without pre-rendering.
What Google does not index is heavily-animated text. The site of one of the companies I work with, Rabbit Hole Consulting, contains lots of text animations, which are performed while the user scrolls, and require the text to be split into several chunks across different tags.
The main texts in the website’s home page are not meant for search engine indexing since they are not optimized for SEO. They are not made of tech-speak and do not use keywords: they are only meant to accompany the user on a conceptual journey about the company. The text gets inserted dynamically when the user enters the various sections of the home page.
None of the texts in these sections of the website gets indexed by Google. In order to get Google to show something meaningful in the SERPs, we added some static text in the footer below the contact form, and this content does show as part of the page’s content in SERPs.
The text in the footer gets indexed and shown in SERPs, even though it is not immediately visible to the users unless they scroll to the bottom of the page and click on the “Questions” button to open the contact form. This confirms my opinion that content does get indexed even if it is not shown immediately to the user, as long as it is rendered soon to the HTML — as opposed to being rendered on-demand or after a long delay.
What About Pre-Rendering?
So, why all the fuss about pre-rendering — be it done server-side or at project compilation time? Is it really necessary? Although some frameworks, like Nuxt, make it much easier to perform, it is still no picnic, so the choice whether to set it up or not is not a light one.
I think it is not compulsory. It is certainly a requirement if a lot of the content you want to get indexed by Google comes from external web service and is not immediately available at rendering time, and might — in some unfortunate cases — not be available at all due to, for example, web service downtime. If during Googlebot’s visits some of your content arrives too slowly, then it might not be indexed. If Googlebot indexes your page exactly at a moment in which you are performing maintenance on your web services, it might not index any dynamic content at all.
Furthermore, I have no proof of ranking differences between static content and dynamically-generated content. That might require another experiment. I think that it is very likely that, if content comes from external web service and does not load immediately, it might impact on Google’s perception of your site’s performance, which is a very important factor for ranking.
Recommended reading: How Mobile Web Design Affects Local Search (And What To Do About It)
Google has recently announced that it is now running the latest version of Chromium (74, at the time of writing) in Googlebot, and that the version will be updated regularly. The fact that Google was running Chromium 41 might have had big implications for sites which decided to disregard compatibility with IE11 and other old browsers.
You can see a comparison of Chromium 41 and Chromium 74’s support for features here, however, if your site was already polyfilling missing features to stay compatible with older browsers, there should have been no problem.
Always use polyfills since you never know which browser misses support for features that you think are commonplace. For example, Safari did not support a major and very useful new feature like IntersectionObserver until version 12.1, which came out in March 2019.
Other Search Engines
The other search engines do not work as well as Google with dynamic content. Bing does not seem to index dynamic content at all, nor do DuckDuckGo or Baidu. Probably those search engines lack the resources and computing power that Google has in spades.
Note: To get more information on other search engines’ rendering capabilities, you can check this article by Bartosz Góralewicz. It is a bit old, but according to my experience, it is still valid.
Remember that your site will be visited by other bots as well. The most important examples are Twitter, Facebook, and other social media bots that need to fetch meta information about your pages in order to show a preview of your page when it is linked by their users. These bots will not index dynamic content, and will only show the meta information that they find in the static HTML. This leads us to the next consideration.
title tag and meta description/information.
The conclusions I’ve come to while researching this article are the following:
- If you only target Google, it is not mandatory to use pre-rendering to have your site fully indexed, however:
- You should not rely on third-party web services for content that needs to be indexed, especially if they don’t reply quickly.
- The content you insert into your HTML immediately via Vue.js rendering does get indexed, but you shouldn’t use animated text or text that gets inserted in the DOM after user actions like scrolling, etc.
- If your site has multiple pages, you still need to have some logic to create pages that, while relying on the same front-end rendering system as the home page, can be indexed by Google as individual URLs.
- If you need to have different description and preview images for social media between different pages, you will need to address this too, either server-side or by compiling static pages for each URL.
- If you need your site to perform on search engines other than Google, you will definitely need pre-rendering of some sort.