The Ultimate Guide To JavaScript SEO (2020 Edition) - Onely Blog


What is JavaScript SEO?

JavaScript brings interactivity and dynamism to the web that’s built on the static foundation of HTML markup. This guide covers everything you need to know to make sure using JavaScript doesn’t impair your rankings or your User Experience.

JavaScript SEO is a branch of technical SEO that makes JS-powered websites:

1. easy for search engines to fully crawl, render, and index,
2. accessible to users with outdated browsers,
3. consistent in their metadata and internal linking,
4. fast to load despite having to parse and execute JavaScript code.

JavaScript is extremely popular. Based on my research, as much as 80% of the popular eCommerce stores in the USA use JavaScript for generating main content or links to similar products.
However, despite that popularity, many JavaScript websites underperform in Google because they don’t do JavaScript SEO properly.

In this article, I will guide you through why this happens and how to fix it. You’ll learn:

how Google and other search engines deal with JavaScript
how to check if your website has a problem with JavaScript
the best practices of JavaScript SEO
the most common problems with JavaScript that SEOs overlook

I will provide tons of additional tips and recommendations, too. We have a lot to cover, so get yourself a cup of coffee (or two) and let’s get started.
chapter 1

In 2020, there’s no doubt: JavaScript is the future of the web.

Of course, HTML and CSS are the foundation. But virtually every modern web developer is expected to code in JavaScript too.

But what can JavaScript do, exactly? And how can you
check which elements of your website are using it? Read on
and you’ll find out.
JavaScript is an extremely popular programming language.
It’s used by developers to make websites interactive.

JavaScript has the unique ability to dynamically update the content of a page.

For instance, it’s used by Forex and CFD trading platforms to continually update the exchange rates in real-time.

Now, imagine a website like Forex.com without JavaScript.

Without JavaScript, users would have to manually refresh the website to see the current exchange rates. JavaScript simply makes their lives much easier.

In other words, you can build a website using only HTML and
CSS, but JavaScript is what makes it dynamic and
interactive.
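
To make this concrete, here’s a minimal sketch of that kind of dynamic update. The /api/rates endpoint and the "eurusd" element are hypothetical placeholders; the point is simply that JavaScript fetches fresh data and rewrites part of the page without a reload:

    // Minimal sketch: fetch fresh data and update part of the page in place.
    // The /api/rates endpoint and the "eurusd" element are hypothetical.
    async function refreshRates() {
      const response = await fetch('/api/rates');
      const rates = await response.json(); // e.g. { eurusd: 1.0842 }
      document.getElementById('eurusd').textContent = rates.eurusd;
    }

    // Refresh the quote every five seconds without a full page reload.
    setInterval(refreshRates, 5000);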

HTML defines the actual content of a page (body/frame of a car).
CSS defines the look of the page (colors, style).
JavaScript adds interactivity to the page. It can easily control and alter HTML (engine + wheel + gas pedals).
Which website elements are commonly generated by JavaScript?

The types of content that are commonly generated by JavaScript fall into six basic categories:
Pagination
Internal links
Top products
Reviews
Comments
Main content (rarely)

How do I know if my website is using JavaScript?

1. Use WWJD

To make it easy for you to check if your website relies on JavaScript, we created WWJD – What Would JavaScript Do – which is just one of our FREE tools.

Simply go to WWJD and type the URL of your website into the console.
Then look at the screenshots that the tool generates and compare the two versions of your page – the one with JavaScript enabled and the one with JavaScript disabled.

2. Use a browser plugin

Using our tool is not the only way to check your website’s
JavaScript dependency. You can also use a browser plugin
like Quick JavaScript Switcher on Chrome, or JavaScript
Switch on Firefox.

When you use the plugin, the page you’re currently on will
be reloaded with JavaScript disabled.
If some of the elements on the page disappear, it means that
they were generated by JavaScript.

An important tip: if you decide to use a browser plugin instead of WWJD, make sure you also take a look at the page source and the DOM (Document Object Model) and pay attention to your canonical tags and links. It often happens that JavaScript doesn’t change much visually, so you wouldn’t even notice that it’s there. However, it can change your metadata under the hood, which can potentially lead to serious issues.
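
For a quick check of that metadata, you can run a snippet like this in the DevTools console and compare the output with what “view page source” shows. This is only a rough sketch; adapt the selectors to your own site:

    // Read key metadata from the rendered DOM (after JavaScript has run).
    const canonical = document.querySelector('link[rel="canonical"]');
    const robots = document.querySelector('meta[name="robots"]');
    console.log('canonical:', canonical ? canonical.href : 'none');
    console.log('meta robots:', robots ? robots.content : 'none');
    console.log('links in the DOM:', document.querySelectorAll('a[href]').length);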

Important: “view source” is not enough when auditing JS websites
You may hear that investigating what’s inside the source
code of your web pages is one of the most important things
in an SEO audit. However, with JavaScript in the picture, it
gets more complicated.

HTML is a file containing the raw information the browser uses to build the page. It contains markup representing paragraphs, images, and links, as well as references to JS and CSS files.

You can see the initial HTML of your page by simply right-
clicking -> View page source.

However, by viewing the page source you will not see any of the dynamic content updated by JavaScript.

With JavaScript websites, you should look at the DOM instead. You can do it by right-clicking -> Inspect element.
Here’s the difference between the initial HTML and the DOM in a nutshell: the initial HTML is what the server sends before any scripts run, while the DOM is the live, rendered version of the page after JavaScript has executed.
Note: If Google can’t fully render your page, it can still
index just the initial HTML (which doesn’t contain
dynamically updated content).
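
If you prefer a programmatic check, here’s a minimal Node.js sketch that downloads the initial HTML and the rendered DOM for the same URL so you can compare them. It assumes Node 18+ (for the built-in fetch) and the puppeteer package; the URL is a placeholder:

    // Compare the initial HTML (what "view page source" shows) with the
    // rendered DOM (what "Inspect element" shows) for the same URL.
    const puppeteer = require('puppeteer');

    (async () => {
      const url = 'https://example.com/';

      // Raw HTML as sent by the server, before any JavaScript runs.
      const initialHtml = await (await fetch(url)).text();

      // DOM serialized after headless Chrome has executed the page's JavaScript.
      const browser = await puppeteer.launch();
      const page = await browser.newPage();
      await page.goto(url, { waitUntil: 'networkidle0' });
      const renderedHtml = await page.content();
      await browser.close();

      console.log('initial HTML length: ', initialHtml.length);
      console.log('rendered DOM length: ', renderedHtml.length);
    })();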

Now that you’re sure what elements of your page depend on JavaScript, it’s time to find out if Google can properly deal with your JavaScript content.

chapter 2

JavaScript makes the web truly dynamic and interactive, and that’s something that users love.

But what about Google and other search engines? Can they
easily deal with JavaScript, or is it more of a love-hate
relationship?

As a leading technical SEO agency, we’re constantly doing research to look for Google’s strengths and weaknesses.

And when it comes to JavaScript, it’s sort of a mixed bag…

Indexing JavaScript content by Google is never guaranteed.

We recently investigated multiple websites that are using JavaScript. It turned out that on average, their JavaScript content was not indexed by Google in 25% of the cases.

That’s one out of four times.

Some of the tested websites had significant portions of their JavaScript content missing from Google’s index, while others did very well. In other words, Google can index JavaScript content on some websites much better than others. This means that these issues are self-induced and can be avoided. Keep reading to learn how.

It’s also important to know that indexing content isn’t guaranteed even in the case of HTML websites. JavaScript simply adds more complexity, as there are a couple more things that could go wrong.

Why Google (and other search engines) may have difficulties with JavaScript

I. The complexity of JavaScript crawling

In the case of crawling traditional HTML websites, everything is easy and straightforward, and the whole process is lightning fast:

1. Googlebot downloads an HTML file.
2. Googlebot extracts the links from the source code and can visit them simultaneously.
3. Googlebot downloads the CSS files.
4. Googlebot sends all the downloaded resources to Google’s Indexer (Caffeine).
5. The indexer (Caffeine) indexes the page.

For Google, things get complicated when it comes to crawling a JavaScript-based website:

1. Googlebot downloads an HTML file.
2. Googlebot finds no links in the source code, as they are only injected after executing JavaScript.
3. Googlebot downloads the CSS and JS files.
4. Googlebot has to use the Google Web Rendering Service (a part of the Caffeine Indexer) to parse, compile, and execute JavaScript.
5. WRS fetches the data from external APIs, from the database, etc.
6. The indexer can index the content.
7. Google can discover new links and add them to Googlebot’s crawling queue. In the case of an HTML website, this happens in the second step.

There are many things that can go wrong with rendering and indexing JavaScript along the way. As you can see, the whole process is much more complicated with JavaScript involved. The following things should be taken into account:

Parsing, compiling, and running JavaScript files is very time-consuming – both for users and Google. Think of your users! I bet anywhere between 20-50% of your website’s users view it on their mobile phone. Do you know how long it takes to parse 1 MB of JavaScript on a mobile device? According to Sam Saccone from Google: a Samsung Galaxy S7 can do it in ~850ms and a Nexus 5 in ~1700ms. After parsing, JavaScript has to be compiled and executed, which takes additional time. Every second counts.
In the case of a JavaScript-rich website, Google usually can’t index the content until the website is fully rendered.
Rendering is not the only process that is slower. The same applies to discovering new links. On JavaScript-rich websites, it’s common that Google cannot discover any links on a page before the page is rendered.
The number of pages Googlebot wants to & can crawl is called the crawl budget. Unfortunately, it’s limited, which is important for medium to large websites in particular. If you want to know more about the crawl budget, I advise you to read the Ultimate Guide to the Crawler Budget Optimization by Artur Bowsza, Onely’s SEO Specialist. Also, I recommend reading Barry Adams’ article “JavaScript and SEO: The Difference Between Crawling and Indexing” (the JavaScript = Inefficiency and Good SEO is Efficiency sections, in particular, are must-reads for every SEO who deals with JavaScript).

II. Googlebot Doesn’t Act Like a Real Browser

It’s time to go deeper into the topic of the Web Rendering Service.

As you may know, Googlebot is based on the newest version of Chrome. That means it uses a current version of the browser for rendering pages. But it’s not exactly the same as your browser.

Googlebot visits web pages just like a user would when using a browser. However, Googlebot is not a typical Chrome browser.

Googlebot declines user permission requests (e.g., it will deny video auto-play requests).
Cookies, local storage, and session storage are cleared across page loads. If your content relies on cookies or other stored data, Google won’t pick it up (see the sketch after this list).
Browsers always download all the resources – Googlebot may choose not to.
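
As promised, here’s a minimal sketch of the storage pitfall from the list above. The element id and storage key are hypothetical; the point is that content depending on previously stored data never appears for Googlebot, because it starts every page load with empty storage:

    // Content that depends on data saved during an earlier visit.
    const savedCity = localStorage.getItem('selectedCity');

    if (savedCity) {
      // Returning users see city-specific offers...
      document.getElementById('offers').textContent = 'Offers for ' + savedCity;
    } else {
      // ...but Googlebot always starts with empty storage, so it only
      // ever sees this placeholder.
      document.getElementById('offers').textContent = 'Please select a city';
    }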

When you surf the internet, your browser (Chrome, Firefox, Opera, whatever) downloads all the resources (such as images, scripts, and stylesheets) that a website consists of and puts them all together for you.

Googlebot, however, acts differently from your browser: its purpose is to crawl the entire internet and grab valuable resources.

The World Wide Web is huge, though, so Google optimizes its crawlers for performance. This is why Googlebot sometimes doesn’t load all the resources from the server. Not only that, Googlebot doesn’t even visit all the pages that it encounters.

Google’s algorithms try to detect if a given resource is necessary from a rendering point of view. If it isn’t, it may not be fetched by Googlebot. Google warns webmasters about this in the official documentation:

Googlebot and its Web Rendering Service (WRS) component continuously analyze and identify resources that don’t contribute essential page content and may not fetch such resources.

Because Googlebot doesn’t act like a real browser, Google may not pick up some of your JavaScript files. The reason might be that its algorithms decided they’re not necessary from a rendering point of view, or simply performance issues (e.g., it took too long to execute a script).

Additionally, as confirmed by Martin Splitt, a Webmaster Trends Analyst at Google, Google might decide that a page doesn’t change much after rendering (after executing JS), so it won’t render it in the future.

Also, rendering JavaScript by Google is still delayed (however, it is much better than in 2017-2018, when we commonly had to wait weeks till Google rendered JavaScript).

If your content requires Google to click, scroll, or perform any other action in order for it to appear, it won’t be indexed.

Last but not least: Google’s renderer has timeouts. If it takes too long to execute your script, Google may simply skip it.

chapter 3

You now know that JavaScript makes Google’s job a little more complicated.

And because of that, there are additional steps that you should take to make your JavaScript website do well in Google.

JavaScript SEO may seem intimidating at first, but don’t worry! This chapter will help you diagnose potential problems on your website and get the basics right.
There are three factors at play here:

1) crawlability (Google should be able to crawl your website with a proper structure and discover all the valuable resources);
2) renderability (Google should be able to render your website);
3) crawl budget (how much time it will take Google to crawl and render your website).

Rendering JavaScript can affect your crawl budget and delay Google’s indexing of your pages.

Is your JavaScript content search engine-friendly?

Here’s a checklist that you can use to check if Google and other search engines are able to index your JavaScript content.

I. Check if Google can technically render your website.

As a developer, website owner, or SEO, you should always make sure that Google can technically render your JavaScript content. It simply isn’t enough to open Chrome and see if the page looks OK.

Instead, use the Live Test in Google’s URL Inspection Tool, available through Search Console. It allows you to see a screenshot of exactly how Googlebot would render the JavaScript content on your page.
Inspect the screenshot and ask yourself the following
questions:

Is the main content visible?
Can Google access areas like similar articles and products?
Can Google see other crucial elements of your page?

If you want to dive deeper, you can also take a look at the
HTML tab within the generated report.
Here, you can see the DOM – the rendered code, which
represents the state of your page after rendering.

What if Google cannot render your page properly?

It may happen that Google renders your page in an unexpected way.

In cases like this, there can be a significant difference between how the page looks to the user and how Google renders it.

There are a few possible reasons for that:

Google encountered timeouts while rendering.
Some errors occurred while rendering.
You blocked crucial JavaScript files from Googlebot.
By clicking on the More info tab, you can easily check if any
JavaScript errors occurred while Google was trying to render
your content.

Important note: making sure Google can properly render your website is a necessity.

However, it doesn’t guarantee your content will be indexed, which brings us to the second point.

II. Check if your content is indexed in Google.

There are two ways of checking if your JavaScript content is really indexed in Google:
1. Using the “site” command – the quickest method.
2. Checking Google Search Console – the most accurate method.

As a quick note: you shouldn’t rely on checking Google Cache as a way of ensuring Google indexed your JavaScript content. Even though many SEOs still use it, it’s a bad idea to rely on Google Cache. You can find out more by reading “Why Google Cache Lies to You and What To Do” by Maria Cieślak, Onely’s Head of SEO.

The “Site” Command

In 2020, one of the best options for checking if your content is indexed by Google is the “site” command. You can do it in two simple steps.

1. Check if the page itself is in Google’s index.

First, you have to ensure that the URL itself is in Google’s index. To do that, just type “site:URL” in Google (where URL is the address of the page you want to check).
Now that you know the URL is in fact in Google’s database, you can:

2. Check if Google really indexed your JavaScript content.

It’s very easy, too. Knowing which fragments of your page depend on JavaScript (after using our tool, WWJD), just copy a text fragment from your page and type the following command in Google: site:{your website} “{fragment}”.
If a snippet with your fragment shows up, that means your content is indexed in Google.
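
For example, if a (hypothetical) store generated its shipping notice with JavaScript, the query could look like this:

    site:example-store.com "free shipping on orders over $50"

If Google returns the page with that phrase in the snippet, the JavaScript-generated fragment made it into the index.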

I encourage you to run the “site” command across various types of JS-generated content.

My personal recommendation: perform a “site:” query with a fragment in incognito mode.

Google Search Console

A more precise, albeit slower, method of checking if your content is indexed by Google is using Google Search Console.

Type the URL in question into the URL Inspection Tool.

Then click View crawled page. This will show you the code of
your page that is indexed in Google.

Just use Ctrl+F to check if the crucial fragments of your JavaScript-generated content are there.

I recommend repeating this process for a random sample of URLs to see if Google properly indexed your content. Don’t stop at just one page; check a reasonable number of pages.

What if Google doesn’t index my JavaScript content?

As I mentioned before, the problems that some websites have with getting their JavaScript content indexed are largely self-induced. If you happen to struggle with that problem, you should find out why it might be happening in the first place.

There are multiple reasons why your JavaScript content might not have been picked up by Google. To name a few:

Google encounters timeouts. Are you sure you aren’t forcing Googlebot and users to wait many seconds until they are able to see the content?
Google had rendering issues. Did you check the URL Inspection tool to see if Google can render the page?
Google decided to skip some resources (e.g., JavaScript files).
Google decided the content is of low quality.
It may also happen that Google indexes JavaScript content with a delay.
Google simply wasn’t able to discover the page. Are you sure it’s accessible via the sitemap and the internal linking structure?

chapter 4

There are several ways of serving your web pages to both users and search engines.

And understanding them is crucial when we are talking about SEO, not only in the context of JavaScript.

What’s right for your website: Client-side rendering (CSR), Server-side rendering (SSR), or perhaps something more complex? In this chapter, we’ll make sure you know which solution suits your needs.
As we discuss whether Google can crawl, render, and index JavaScript, we need to address two very important concepts: Server-side rendering and Client-side rendering. It’s necessary for every SEO who deals with JavaScript to understand them.

In the traditional approach (server-side rendering), a browser or Googlebot receives an HTML file that completely describes the page. The content copy is already there. Usually, search engines do not have any issues with server-side rendered JavaScript content.
The increasingly popular client-side rendering approach is
a little different and search engines sometimes struggle with
it. With this approach, it’s pretty common that a browser or
Googlebot gets a blank HTML page (with little to no content
copy) in the initial load. Then the magic happens: JavaScript
asynchronously downloads the content copy from the server
and updates your screen.
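
Here’s a rough sketch of what client-side rendering looks like in practice. The initial HTML contains little more than an empty container and a script tag; all of the visible content only appears after a script like the one below runs in the browser (the /api/article endpoint and element id are placeholders):

    // The server sends roughly: <div id="app"></div><script src="/bundle.js"></script>
    // Everything the user (or Googlebot) actually reads is injected by this script.
    async function renderApp() {
      const response = await fetch('/api/article');
      const article = await response.json();

      document.getElementById('app').innerHTML =
        '<h1>' + article.title + '</h1><p>' + article.body + '</p>';
    }

    document.addEventListener('DOMContentLoaded', renderApp);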

Here’s a baking analogy that captures the difference:

Client-side rendering is like handing over a recipe: Google gets the cake recipe and still has to bake the cake itself.
Server-side rendering – Google gets the cake ready to consume. No need for baking.

On the web, you’ll see a mix of these two approaches.

1. Server-Side Rendering

When, for some reason, Google cannot index your JavaScript content, one of the solutions is to implement server-side rendering. Websites like Netflix, Marvel, Staples, Nike, Hulu, Expedia, Argos, and Booking.com take advantage of server-side rendering.

There is one problem, though: a lot of developers struggle with implementing server-side rendering (however, the situation is getting better and better!).

There are some tools that can make implementing SSR faster:

Framework – SSR solution
React – Next.js, Gatsby
Angular – Angular Universal
Vue.js – Nuxt.js

My tip for developers: if you want your website to be server-side rendered, you should avoid using functions that operate directly on the DOM. Let me quote Wassim Chegham, a Google Developer Expert: “One of THE MOST IMPORTANT best practices I’d recommend following is: Never touch the DOM.”
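
To illustrate, here’s a minimal server-side rendering sketch using Next.js, one of the React solutions from the table above. The getProduct() helper is a stand-in for a real data source; the key point is that the HTML sent to the browser (and to Googlebot) already contains the product name and description:

    // pages/product.js — a minimal Next.js page rendered on the server.
    async function getProduct() {
      // Stand-in for a real database or API call.
      return { name: 'Example product', description: 'Rendered on the server.' };
    }

    export async function getServerSideProps() {
      const product = await getProduct();
      return { props: { product } }; // passed to the component at request time
    }

    export default function Product({ product }) {
      return (
        <main>
          <h1>{product.name}</h1>
          <p>{product.description}</p>
        </main>
      );
    }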
2. Dynamic rendering

Another viable solution is called dynamic rendering.

In this approach, you serve users the fully-featured JavaScript website, while your server sends Googlebot (and/or other bots) a static, pre-rendered version of your website. Dynamic rendering is an approach officially supported by Google.

You can use these tools/services to implement dynamic rendering on your website:

Prerender.io
Puppeteer
Rendertron

Google also provides a handy guide explaining how to successfully implement dynamic rendering.
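
To give you an idea of what an implementation can look like, here’s a rough Express-based sketch of dynamic rendering backed by a self-hosted Rendertron instance. The bot list, the Rendertron URL, and the domain are assumptions to adapt for your setup; it also relies on Node 18+ for the built-in fetch:

    const express = require('express');
    const app = express();

    const RENDERTRON_URL = 'http://localhost:3000/render/'; // your Rendertron instance
    const BOT_AGENTS = /googlebot|bingbot|twitterbot|facebookexternalhit|linkedinbot/i;

    app.use(async (req, res, next) => {
      // Regular users get the normal client-side rendered app.
      if (!BOT_AGENTS.test(req.headers['user-agent'] || '')) {
        return next();
      }
      // Bots get a static snapshot rendered by headless Chrome.
      const target = 'https://example.com' + req.originalUrl;
      const snapshot = await (await fetch(RENDERTRON_URL + encodeURIComponent(target))).text();
      res.send(snapshot);
    });

    app.use(express.static('public')); // the JavaScript app for everyone else
    app.listen(8080);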

As of 2020, Google recommends using dynamic rendering in two cases:

1. For indexable JS-generated content that changes rapidly.
2. For content that uses JS features that aren’t supported by crawlers.

Can Twitter, Facebook, and other social media deal with JavaScript?
Unfortunately, social media sites don’t process the
JavaScript their crawlers find on websites.

To reiterate: the crawlers of social media sites like Facebook, Twitter, or LinkedIn don’t run JavaScript.

What does that mean for you?

You must include Twitter Card and Facebook Open Graph markup in the initial HTML. Otherwise, when people share your content on social media, it won’t be properly displayed.

Not convinced?

Let’s see how links to Angular.io and Vue.js look when you
share them on Twitter:
Angular.io is the website of the second most popular JavaScript framework. Unfortunately for them, Twitter doesn’t render JavaScript, and therefore it can’t pick up the Twitter Card markup that is generated by JavaScript.

Would you click on that link? Probably not.

Now contrast that with a link to Vue.js – the Twitter card looks much better with the custom image and an informative description!
Takeaway: If you care about traffic from social media,
make sure that you place the Twitter card and Facebook
Open Graph markup in the initial HTML!
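
As a sketch of what that looks like in practice, here’s a minimal Express route that puts the social markup straight into the HTML it sends, before any JavaScript runs. The route, URLs, and copy are placeholders:

    const express = require('express');
    const app = express();

    app.get('/blog/javascript-seo', (req, res) => {
      // The social tags are part of the initial HTML response, so Twitter,
      // Facebook, and LinkedIn can read them without executing JavaScript.
      res.send(`<!DOCTYPE html>
    <html>
      <head>
        <title>The Ultimate Guide to JavaScript SEO</title>
        <meta property="og:title" content="The Ultimate Guide to JavaScript SEO">
        <meta property="og:description" content="How search engines deal with JavaScript.">
        <meta property="og:image" content="https://example.com/cover.png">
        <meta name="twitter:card" content="summary_large_image">
      </head>
      <body>
        <div id="app"></div>
        <script src="/bundle.js"></script>
      </body>
    </html>`);
    });

    app.listen(8080);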

chapter 5

Common Pitfalls with JS Websites


At this point, you should have a decent understanding of
how JavaScript is processed by Google and other search
engines.

With these foundations in place, you are set for success.
There are just a couple of caveats to JavaScript SEO that are hard to spot, and I want to help you avoid them.

Now that we’ve covered most of the basics, let’s look at some of the common mistakes that SEOs and webmasters make when optimizing their JavaScript-based websites.

BLOCKING JS AND CSS FILES FOR GOOGLEBOT

Since Googlebot is able to crawl and render JavaScript content, there is no reason to block it from accessing any internal or external resources required for rendering.

IMPLEMENT PAGINATION CORRECTLY

Many popular websites use pagination as a way of fragmenting large amounts of content. Unfortunately, it’s very common that these websites only allow Googlebot to visit the first page of pagination.

As a result, Google isn’t able to easily discover large numbers of valuable URLs.

For example, on eCommerce websites with paginated category pages, Googlebot would only be able to reach the first 20-30 products in each category.

As a consequence, Googlebot most likely cannot access all the product pages.

How does this happen?

Many websites implement pagination improperly, without using proper <a href> links. Instead, they use pagination that depends on a user action – a click.

In other words, Googlebot would have to click a button (e.g., “View more items”) to get to the subsequent pages.

Unfortunately, Googlebot doesn’t scroll or click buttons. The only way to let Google see the second page of pagination is to use proper <a href> links.
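
Here’s a minimal sketch of pagination that Googlebot can follow even though it’s generated by JavaScript: the links are real <a href> elements pointing at crawlable URLs, not buttons with click handlers. The container id and URL scheme are placeholders:

    function renderPagination(currentPage, totalPages) {
      const container = document.getElementById('pagination');

      for (let page = 1; page <= totalPages; page++) {
        const link = document.createElement('a');
        link.href = '/category?page=' + page; // crawlable URL in the href
        link.textContent = page;
        if (page === currentPage) link.setAttribute('aria-current', 'page');
        container.appendChild(link);
      }
    }

    // A "View more items" button with only a click handler and no href
    // would not be followed, because Googlebot doesn't click or scroll.
    renderPagination(1, 10);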

If you’re still not sure whether Google can pick up your links, this was covered in a slide at the Google I/O conference in 2018.
Important note:

Having links hidden under link rel=”next” doesn’t help either. Google announced in March 2019 that they no longer use this markup.

To sum up, ALWAYS USE PROPER LINKS!

USING HASHES IN URLs

It’s still common for JavaScript websites to generate URLs with a hash. There is a real danger that such URLs may not be crawled by Googlebot:

Bad URL: example.com/#/crisis-center/
Bad URL: example.com#crisis-center
Good URL: example.com/crisis-center/

You may think that a single additional character in the URL can’t do you any harm. On the contrary, it can be very damaging.

For us, if we see the kind of a hash there, then that means the rest there is probably irrelevant. For the most part, we will drop that when we try to index the content (…). When you want to make that content actually visible in search, it’s important that you use the more static-looking URLs.
That’s why you need to remember to always make sure your
URL doesn’t look like this: example.com/resource#dsfsd

Angular 1 uses hash-based URLs by default, so be careful if your website is built using that framework! You can fix it by configuring $locationProvider (here is a tutorial on how to do that!). Fortunately, the newer versions of the Angular framework use Google-friendly URLs by default.
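
For reference, the fix in Angular 1.x boils down to enabling HTML5 mode, roughly like this (assuming your app module is called "app"; it also requires a <base href="/"> tag and server-side routing that returns the app for deep links):

    // Switch AngularJS from hash-based URLs (example.com/#/crisis-center/)
    // to clean URLs (example.com/crisis-center/).
    angular.module('app').config(['$locationProvider', function ($locationProvider) {
      $locationProvider.html5Mode(true);
    }]);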

chapter 6

FAQ - BONUS CHAPTER!


As we wind down here, you probably have a few questions.

And that’s great!

I put together the most common questions readers have asked me about JavaScript SEO over the years, including questions about the two waves of indexing, PWAs, serving Googlebot a pre-rendered version of your website, the future of JavaScript SEO itself, and more!

But there is one small catch…


Want to read CHAPTER 6: FAQ?

Then sign up and get instant access.


Takeaways
As we’ve reached the end of the article, I want to take a
moment and address a problem that could affect even the
best SEOs.
It’s important to remember that JavaScript SEO is done on
top of traditional SEO, and it’s impossible to be successful at
the former without taking care of the latter.

Sometimes when you encounter an SEO problem, your first instinct might be that it’s related to JavaScript when, in fact, it’s a traditional SEO issue.
Apply the principles of SEO well and be very careful
before you start blaming JavaScript. Being able to
quickly diagnose the source of the problem will save you
lots of time.

Here are some other takeaways:

Google’s rendering service is based on the most recent version of Chrome. However, it has many limitations (it may not fetch all the resources, and some features are disabled). Google’s algorithms try to detect if a resource is necessary from a rendering point of view. If not, it probably won’t be fetched by Googlebot.
Usually, it’s not enough to analyze just the page source (HTML) of your website. Instead, you should analyze the DOM (right-click -> Inspect element).
You can use tools like Onely’s WWJD (What Would
JavaScript Do) or browser add-ons to check which
elements are generated by JavaScript.
When you have a JavaScript website and care about
traffic from social media, check what your social shares
look like (whether they have pictures and custom
descriptions).
You shouldn’t use Google cache to check how Google
is indexing your JavaScript content. It only tells you how
YOUR browser interpreted the HTML collected by
Googlebot. It’s totally unrelated to how Google
rendered and indexed your content.
JavaScript is very error-prone. A single error in your
JavaScript code can cause Google to be unable to
render your page.
Use the URL Inspection tool often. It will tell you if
Google was able to properly render and index your
JavaScript content.
To make sure Google is able to index your JavaScript, you have to take web performance into consideration. Do it for the sake of your users and for getting indexed by search engines.
If Google can’t render your page, it can pick up the raw
HTML for indexing. This can break your Single-Page
Application (SPA), because Google may index a blank
page with no actual content.
