Putting SEO First with Gatsby

I was drawn to Gatsby out of my love for React and my desire to serve fast, performant, responsive applications. I started learning React just as the library started embracing ES6 classes, and just as create-react-app was taking off. It was a challenge at first to learn something as simple as proxying to get the front end running concurrently with a Node/Express API, and getting from that point to server-side rendering took still more research and effort. It has been fun learning with and from the larger coding community. I could continue to enumerate the technical issues I have encountered and solved, and the others I still have to learn, but note something important about my journey so far: it has been a technical-first journey, focused on knowledge of and skill in the various languages and libraries. Necessary, yes, but perhaps not primary. The last things on my mind have been SEO, accessibility, and security.

Setting An SEO-First Mindset

I’ll leave accessibility and security for future posts, but what has impressed me most in my dive into the Gatsby ecosystem is how easy it is to configure search engine optimization. In fact, you can create an architecture for your site or app that is SEO-driven long before developing your UI. I want to walk you through what I have learned so far about optimizing a Gatsby site for SEO right from the start.

Before We Begin

You need to have some familiarity with Gatsby. To learn how to install the gatsby-cli and create a new Gatsby project from one of the many Gatsby Starters, please consider completing the Gatsby tutorial.

Otherwise, pick a starter from the SEO category and open its SEO component, or just use gatsby-starter-default by running the following from the command line:
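If you are starting fresh, the default starter can be cloned and run like this (the project name my-seo-site is just a placeholder):

```shell
gatsby new my-seo-site https://github.com/gatsbyjs/gatsby-starter-default
cd my-seo-site
gatsby develop
```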

The SEO component depends on react-helmet; if your starter does not come with SEO initialized, be sure to add it. We also want to add some other features: a sitemap, Google Analytics, and an RSS feed for syndication. Finally, we need to create a robots.txt file to manage how search engines crawl the site. I’ve split up the commands below for spacing purposes, but they can all run as one yarn add command:
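Assuming yarn, the install looks like this (all of these are published Gatsby plugins):

```shell
yarn add gatsby-plugin-react-helmet react-helmet
yarn add gatsby-plugin-sitemap gatsby-plugin-feed
yarn add gatsby-plugin-google-analytics gatsby-plugin-robots-txt
```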

Setting up gatsby-config.js

This file, gatsby-config.js, is going to do the heavy lifting of importing all your vital SEO-related content into GraphQL or creating necessary files directly (like robots.txt).

Site Metadata

Metadata is what it sounds like: data that extends across the entirety of your site. It can be used anywhere, but it will come in most handy when configuring your SEO component and Google Structured Data.

Initialize your metadata as an object literal with key/value pairs that can be converted into <meta> tags or passed to sitemaps, footers, or wherever else you might need global site data:

With siteMetadata set up, you can now query this data for use within your plugin initialization, even within the same gatsby-config.js file. I’ve organized my siteMetadata so that I can make the following query:
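Assuming your siteMetadata includes a title, description, siteUrl, and nested author and organization objects (adjust the fields to match your own config), the query looks something like:

```graphql
query {
  site {
    siteMetadata {
      title
      description
      siteUrl
      author {
        name
        email
      }
      organization {
        name
        url
        logo
      }
    }
  }
}
```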

Plugin Setup

We are going to focus on four plugins: one each for the sitemap, the RSS feed, the robots.txt file, and Google Analytics (for tracking the SEO success of your site). We’ll initialize the plugins immediately following siteMetadata.

The sitemap plugin can be used simply or with options. Generally, you want to include anything and everything with high quality content, and exclude duplicate content, thin content, or pages behind authentication. Gatsby provides a detailed walkthrough for setting up your sitemap.
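As a sketch, the plugin entry in gatsby-config.js might look like the following; note that the exclusion option name has varied across plugin versions (exclude in older releases, excludes in newer ones), so check the version you installed:

```javascript
// gatsby-config.js, inside the plugins array
{
  resolve: 'gatsby-plugin-sitemap',
  options: {
    // keep duplicate, thin, or gated pages out of the sitemap
    exclude: ['/404', '/404.html', '/admin/*', '/thanks'],
  },
},
```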

An RSS feed helps with the syndication of your site’s content (say, cross-posting a blog entry to another, better-established blog), and it helps communicate changes on your site to search engines. This plugin allows you to create any number of different feeds, utilizing the power of GraphQL to query your data. Some of what follows depends on how you structure your pages and posts in gatsby-node.js. The feed will walk through each of your Markdown pages and generate an XML RSS feed. Gatsby provides a detailed walkthrough for adding an RSS feed.
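Below is a condensed version of the feed setup from the Gatsby docs; it assumes your posts live in Markdown with a slug added under fields in gatsby-node.js, and the sort syntax shown is for Gatsby v2, so adjust for your version:

```javascript
// gatsby-config.js, inside the plugins array
{
  resolve: 'gatsby-plugin-feed',
  options: {
    query: `
      {
        site {
          siteMetadata { title description siteUrl site_url: siteUrl }
        }
      }
    `,
    feeds: [
      {
        serialize: ({ query: { site, allMarkdownRemark } }) =>
          allMarkdownRemark.edges.map(edge => ({
            ...edge.node.frontmatter,
            description: edge.node.excerpt,
            url: site.siteMetadata.siteUrl + edge.node.fields.slug,
            guid: site.siteMetadata.siteUrl + edge.node.fields.slug,
            custom_elements: [{ 'content:encoded': edge.node.html }],
          })),
        query: `
          {
            allMarkdownRemark(
              sort: { order: DESC, fields: [frontmatter___date] }
            ) {
              edges {
                node {
                  excerpt
                  html
                  fields { slug }
                  frontmatter { title date }
                }
              }
            }
          }
        `,
        output: '/rss.xml',
        title: 'RSS Feed',
      },
    ],
  },
},
```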

Google Analytics will help you track how users find and interact with your site, so you can manage and update it over time for better SEO results. Verify your site with Google and store your analytics key here:
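A sketch of the last two plugin entries; the tracking ID placeholder and the URLs below must be replaced with your own values:

```javascript
// gatsby-config.js, inside the plugins array
{
  resolve: 'gatsby-plugin-google-analytics',
  options: {
    trackingId: 'UA-XXXXXXXXX-X', // your verified Google Analytics ID
    respectDNT: true, // honor Do Not Track
  },
},
{
  resolve: 'gatsby-plugin-robots-txt',
  options: {
    host: 'https://www.example.com',
    sitemap: 'https://www.example.com/sitemap.xml',
    policy: [{ userAgent: '*', allow: '/' }],
  },
},
```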

Optimizing the SEO Component

The secret sauce behind the SEO Component is the well-known react-helmet package. Every page and every template imports the SEO Component, so you can pass specific information to adjust the metadata for each public-facing page.

Use your imagination—what can you pass to this component to create the perfect SEO enabled page? Is the page a landing page, a blog post, a media gallery, a video, a professional profile, a product? There are 614 types of schemas listed on https://schema.org. You can pass specific schema related information to the SEO component.
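For instance, a blog post template might render the component like this (the prop names are illustrative and should match whatever your SEO component accepts):

```jsx
<SEO
  title={post.frontmatter.title}
  description={post.excerpt}
  image={post.frontmatter.cover}
  pathname={post.fields.slug}
  isBlogPost
  datePublished={post.frontmatter.date}
  author={post.frontmatter.author}
/>
```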

If any of these values are missing from what you pass to the component, a StaticQuery fills in the gaps with siteMetadata. From this data, react-helmet creates all the <meta> tags, including Open Graph and Twitter Card tags, and passes relevant data to another component that returns Google Structured Data.
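The fallback logic amounts to letting page-level props win and filling any gaps from siteMetadata; here is a minimal sketch (the function and field names are mine, not from any library):

```javascript
// Page-level props win; siteMetadata (fetched via StaticQuery
// in the real component) fills in whatever is missing.
function withDefaults(props, siteMetadata) {
  return {
    title: props.title || siteMetadata.title,
    description: props.description || siteMetadata.description,
    image: props.image || siteMetadata.image,
    url: props.pathname
      ? siteMetadata.siteUrl + props.pathname
      : siteMetadata.siteUrl,
  };
}

const meta = withDefaults(
  { title: 'My Post', pathname: '/my-post' },
  {
    title: 'My Site',
    description: 'Site-wide description',
    image: '/default-og.png',
    siteUrl: 'https://www.example.com',
  }
);
// meta.title keeps the page value; meta.description falls back to the site default
```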

As I have tinkered with the SEO Component, here are the features I added:

Additional fields passed to the component to distinguish between types of pages (such as blog posts), to handle authors of content other than the main site author, and to handle changes in date modified for pages and posts. More could be added in the future.

Adding a component to handle Google Structured Data using schema.org categories. I came across an example of such a component while reading through various documentation and articles; I can’t recall where I saw the link first, but I borrowed and adapted from Jason Lengstorf, making two small adaptations to his excellent work.

Configuring the SchemaOrg Component

Import and call the SchemaOrg Component from within the SEO Component, place it just after the closing tag of the Helmet Component, and pass the following properties:
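In Jason Lengstorf’s version the props look roughly like this (check the component you grabbed, since the exact list may differ):

```jsx
<SchemaOrg
  isBlogPost={isBlogPost}
  url={url}
  title={title}
  image={image}
  description={description}
  datePublished={datePublished}
  dateModified={dateModified}
  canonicalUrl={siteUrl}
  author={author}
  organization={organization}
  defaultTitle={title}
/>
```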

I won’t copy and paste the entire SchemaOrg Component here. Grab it from the link above, and give Jason Lengstorf some credit in your code. Below are the few additions I made:

I added the author’s email to the schema. This will come from siteMetadata for most pages and from the frontmatter for blog posts. It supports multiple authors for your site, and each page can reflect its author uniquely if you so choose.

I updated the organization logo from a simple URI to an ImageObject type. While the String URI is acceptable to the Organization type, Google has specific expectations and was throwing an error until I changed it to an ImageObject.
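The change amounts to the following (the URLs are illustrative):

```javascript
// Before: a bare URI, valid per schema.org's Organization type,
// but Google's testing tool flagged it
// logo: 'https://www.example.com/logo.png',

// After: an explicit ImageObject
const organization = {
  '@type': 'Organization',
  name: 'Example Co',
  url: 'https://www.example.com',
  logo: {
    '@type': 'ImageObject',
    url: 'https://www.example.com/logo.png',
  },
};
```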

I added dateModified alongside datePublished to reflect changes made to the page after initial publication for the BlogPosting type.
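In the BlogPosting schema, that means emitting both fields (the dates are illustrative):

```javascript
// Fragment of the BlogPosting structured data with both dates
const blogPosting = {
  '@context': 'http://schema.org',
  '@type': 'BlogPosting',
  headline: 'My Post',
  datePublished: '2019-06-01', // set once at publication
  dateModified: '2019-07-15', // updated whenever the post is revised
};
```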

When you have more complexity within your site, you can pass flags to this component to return differing types of schema as needed. But no matter what you do, do not put your site into production without first running the script generated by the component through the Google Structured Data Testing Tool.

Concluding Thoughts

When I configured my site according to the plan outlined above, not only did I get image-rich, descriptive social sharing cards, I also got perfect SEO scores when running a Lighthouse audit on my site:

You’ll also see that I scored 100% for accessibility on my site. This, too, is easy to achieve with Gatsby, and I’ll share what I learned on that topic in a future post.