The Ultimate Guide to Crawlability
If you are familiar with Search Engine Optimisation (SEO) already, you’ll probably have heard the terms ‘crawlability’ and ‘indexability’, but what exactly do they mean? In this guide, I’ll take you through all you need to know about Google’s crawlers and what they do. Most importantly, I’ll go through how you can improve your site’s crawlability and indexability to increase your visibility in search results, helping you reach your audience faster.
The team of digital marketing experts here at Embryo take pride in sharing our knowledge and helping to educate those who seek our guidance. If you’d like to learn more about the different areas of digital marketing and how we could potentially help your business, get in touch at 0161 327 2635 or email info@embryo.com.
What Is Crawlability? And Why Is It Important?
Crawlability in SEO terms is exactly what it sounds like – the ability or ease with which a search engine can ‘crawl’ or review the pages of a website. A search engine’s crawlers (also known as bots or spiders) follow internal links across your site to understand its URL and internal link structure.
Why Does Google Need to Crawl My Website?
Search engine crawlers make their way around the internet visiting the websites available, following links from page to page to understand whether those sites are easy to navigate and worth showing in their search results. If a website or URL is deemed worthy, it is classed as ‘indexed’ and saved in the search engine’s database (or index). Google (or Bing, Yahoo, even Ask Jeeves if you’re feeling nostalgic) then uses its own algorithm to review the content of those indexed URLs and decide where they should rank in search results for various keywords, phrases and questions. It goes without saying that the higher your site ranks for a search query, the more organic traffic you’re likely to receive from searches for that query, so this is a really important part of SEO.
How Often Does Google Crawl a Site?
The standard timeframe is somewhere between every four and 30 days. Googlebot will continue to crawl your website on a regular basis if you frequently add to or update your content, and a new version of each URL will be stored in the index. If you fail to freshen up your website content, or Google doesn’t deem your website to be important or relevant for users, its crawlers are less likely to visit.
How to Improve Your Crawlability
1. Submit Your Sitemap in Google Search Console
Your sitemap is essentially a roadmap of your website, showing every URL and providing a direct link to each page. If you have a new website, or you’ve recently made some changes or added new content to your existing website, submitting your XML sitemap in Google Search Console will give Google a little nudge to crawl the whole of your website sooner rather than later, and without having to follow all of your internal links. This means the latest version of your website will be saved in the index and can therefore be reviewed by the algorithms of search engines and ranked accordingly.
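For reference, here’s a minimal sketch of what an XML sitemap looks like – the domain and pages below are hypothetical, but the structure is standard for any site:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per page you want crawled; example.com is a placeholder -->
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2024-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/services/seo/</loc>
    <lastmod>2024-01-10</lastmod>
  </url>
</urlset>
```

Most CMS platforms and SEO plugins generate this file for you automatically – you just need to submit its URL (typically yourdomain.com/sitemap.xml) under ‘Sitemaps’ in Google Search Console.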
2. Improve Page Speed across Your Website
Page speed is super important for any website. Not only do slow-loading pages make for a poor user experience, but search engine bots aren’t keen on hanging around waiting for pages to load either. They have millions of web pages to crawl every day, and search engines only allocate a limited amount of time and resource to crawling your site (known as your crawl budget). Slow pages eat into that budget, so if it runs out before the bot reaches the rest of your site, those pages won’t be crawled, and therefore won’t be indexed, leaving them with no visibility in search rankings.
3. Fix Any Broken Links
Broken links can seriously impact the crawlability of your website, as well as being super frustrating for users. You could check every link manually if you have plenty of time on your hands, but the easiest way to check for broken links is using an auditing tool like Screaming Frog, which will crawl your site and highlight any issues, including 4xx errors. You can then export your 404s and work through them, updating each dead link to point to a live page or removing it altogether.
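If you’d prefer to script a quick check yourself, here’s a minimal sketch in Python using the widely available requests library – the URL list is hypothetical, and in practice you’d feed in your own pages or a crawler export:

```python
import requests

# Hypothetical list of internal URLs to check; in practice, export
# this list from your crawler, CMS or XML sitemap.
urls = [
    "https://www.example.com/",
    "https://www.example.com/services/",
    "https://www.example.com/old-blog-post/",
]

for url in urls:
    try:
        # HEAD is lighter than GET; follow redirects to get the final status
        response = requests.head(url, allow_redirects=True, timeout=10)
        if response.status_code >= 400:
            print(f"BROKEN ({response.status_code}): {url}")
    except requests.RequestException as exc:
        print(f"FAILED: {url} ({exc})")
```

Anything flagged as broken can then be updated or removed, just as you would with a Screaming Frog export.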
4. Improve Your Internal Linking Structure
Internal linking is a really important part of SEO, and in turn, the crawlability of your website. It’s also key in helping users navigate their way around your site and find additional information more easily. Here are our top tips for a great internal linking structure:
- Link from your homepage to your main service or subpages, then from those down to the next layer of content, such as your blog posts, as well as back to your homepage and to your other service pages. Your blog posts should link to each other, as well as to relevant service pages and your homepage.
- Only add contextual links to relevant content where it feels natural to do so, using descriptive anchor text rather than images for your links.
- Regularly monitor your internal link error codes with a crawler tool to ensure any misspelt URLs and dead links are picked up (find out how in point 3).
- When new content is added to your website, be sure to add internal links to the new URL using contextual anchor text to connect it to other pages and help it be found by crawlers.
- Ensure your internal links are standard ‘follow’ links; if a link carries a rel="nofollow" attribute, search bots are asked not to follow it, so it won’t help crawlers discover your pages (see the example below this list).
- Google recommends you only add a ‘reasonable amount’ of internal links on any one page – which is, as usual in SEO, open to interpretation!
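To make the anchor text and ‘follow’ points concrete, here’s a simple example of both in HTML – the URL and anchor text are made up for illustration:

```html
<!-- Good: a standard 'follow' link with descriptive anchor text -->
<p>Learn more in our <a href="/guides/technical-seo/">technical SEO guide</a>.</p>

<!-- The rel="nofollow" attribute asks bots not to follow this link -->
<p>Learn more in our <a href="/guides/technical-seo/" rel="nofollow">technical SEO guide</a>.</p>
```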
5. Use robots.txt Files
The majority of websites use a robots.txt file to tell crawler bots how the website owner would like them to crawl the site. It stops your website from being overloaded with crawl requests and lets you block certain sections from being crawled, such as online shopping baskets or redirected directories. (Bear in mind that robots.txt controls crawling rather than indexing – if you need a page kept out of the index, use a noindex tag instead.)
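As a simple illustration, a robots.txt file lives at the root of your domain and might look something like this – the blocked paths are hypothetical and should match whichever sections of your own site you don’t want crawled:

```text
# robots.txt - served at https://www.example.com/robots.txt
User-agent: *
Disallow: /basket/
Disallow: /checkout/

# Point crawlers at your sitemap while they're here
Sitemap: https://www.example.com/sitemap.xml
```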
6. Regularly Refresh and Add New Content to Your Site
It’s important to revisit your website’s content when it becomes out of date, and to add new authoritative, high-quality content on a regular basis, to improve your site’s crawlability. If Google’s bots find you’re adding brand new, valuable content each time they crawl your site, or updating older articles to add fresh value for users, they’ll stop by more often. If you’ve created a new page with some really important content, you can use the URL Inspection tool in Google Search Console to request indexing for that page as a priority, so you don’t have to wait for the bots to come around at their leisure. However, there’s a limit to the number of requests you can make each day, so if you’ve updated a number of pages across your site, resubmitting your sitemap could be a more viable solution.
7. Eliminate Redirects Where Possible
Most websites have redirects in place, whether from a URL being updated or a more relevant page taking over, for example. It’s important to make sure your redirects are being handled correctly, as Google and other search engines frown upon redirect chains and redirect loops when it comes to both crawlability and indexability. A redirect chain is where there are several hops to the redirect – the link heads to one page, is redirected to a different page, then redirected again, and so on. A redirect loop follows the same principle, but the user ends up back on the page they started on, in a never-ending loop.
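How you fix a chain depends on your server, but as a sketch, on an Apache server the idea is simply to point every old URL straight at the final destination – the paths below are hypothetical:

```apache
# Before: a chain (/old-page -> /interim-page -> /new-page)
#   Redirect 301 /old-page https://www.example.com/interim-page
#   Redirect 301 /interim-page https://www.example.com/new-page

# After: both old URLs point directly at the final destination
Redirect 301 /old-page https://www.example.com/new-page
Redirect 301 /interim-page https://www.example.com/new-page
```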
8. Build Backlinks
This step is probably the most difficult and the one you have the least control over, but it’s worth putting time into when planning your SEO strategy. Backlinks from relevant, authoritative sites pass link equity and signal to search engines that your site can be trusted, is valuable for readers and therefore contains crawlable content worth reviewing.
The best way to build your backlink profile is to create brilliant content with professional insights, supporting data and helpful information for users, and hopefully gain natural links from relevant websites in your industry. Of course, there are no guarantees. You could also reach out to sites where your brand name is mentioned and ask them to link to your site, or contact websites that are using your images and ask them to credit you with a link, amongst other outreach tactics.
What Is Indexability?
Indexability refers to the ease with which search engines can review your website’s pages, analyse them and then add them to their database or ‘index’. You need your web pages to be indexed on Google in order for them to appear in search results, so it’s key to make sure you’re monitoring this and making improvements where necessary.
How to Check If Your Website Is Indexed
So we know you want your website to be indexed on Google, but how do you find out if it is? There are a couple of simple ways to check.
With Google Search Console
To find out whether a URL is indexed on Google using GSC, head to ‘URL inspection’ in the menu on the left. Copy and paste your URL into the search field at the top of the page and hit enter. If your page is indexed, it will show ‘URL is on Google’; if not, it will show ‘URL is not on Google’. You can also see when the URL was last crawled from this screen, and resubmit it for crawling.
Without Google Search Console
If you don’t have GSC access, head over to Google.com. In the search field, type site: followed by your website address (for example, site:example.com). If your website appears in the listings, your site is indexed. You can do the same with any individual URL, such as site:example.com/blog/.
How to Improve Indexability
In addition to following the crawlability points mentioned above, there are further steps you can take to help get your website’s pages indexed and showing up in the SERPs for your potential customers.
1. Avoid Duplicate Content
Duplicate pages or content on your website can harm your indexability and cause confusion for crawlers, as they can’t be sure which version of a page to index. This can affect how well your pages rank, as it can cause keyword cannibalisation, so it’s important to make sure all content on your website is unique. Google Search Console will flag duplicate content as the reason why those pages aren’t indexed, or you can find them using an audit tool like Screaming Frog.
2. Update ‘Thin’ Content
If your page has ‘thin’ content, it’s less likely to be indexed by Google. This could be down to it being poorly written and filled with spelling mistakes, or it could be that there’s just not enough content to be useful for a reader. Take some time to add helpful, trustworthy content to these pages, run your content through a spell check to be safe and resubmit them for crawling using GSC.
3. Check Canonical Tags
Canonical tags are a great way to let Google know which version of a page is the primary one you’d like indexed when several URLs carry similar or duplicate content. These can easily become obsolete as your website evolves, though, so be sure to use the URL Inspection tool to check for incorrect tags and fix them, ensuring you’re not hiding your primary content from search results in error.
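For reference, a canonical tag is a single line in a page’s <head>. In this hypothetical example, a filtered product listing points back to the main category page as the version to index:

```html
<!-- In the <head> of https://www.example.com/shoes/?sort=price -->
<link rel="canonical" href="https://www.example.com/shoes/" />
```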
4. Review No-Indexed Pages on Google Search Console
Using GSC, head to Indexing > Pages in the left-hand menu, and it will show you which pages on your website are not indexed and why. Some pages may be no-indexed on purpose, such as those with a noindex tag or pages with redirects in place. However, pages that sit under a reason like ‘Crawled – currently not indexed’ will need to be investigated. It’s important to review no-indexed pages on a regular basis to ensure you’re making improvements to your content where required.
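For context, a deliberate noindex is a single tag in a page’s <head>, like the snippet below – if you find it on a page you do want indexed, removing it is the fix:

```html
<!-- Asks search engines not to add this page to their index -->
<meta name="robots" content="noindex" />
```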
Review Your Website’s Crawlability and Indexability Regularly
As we’ve discussed, there are many factors to consider when it comes to making your website easy for Google’s bots to crawl and getting your pages indexed by search engines. There are no simple fixes as such, but some elements are easier to work on than others, like fixing broken links and submitting your sitemap in GSC. The bigger jobs, like creating relevant, well-structured, keyword-rich content for users, implementing a great internal linking structure and building those all-important backlinks, will take much more time and effort, but they bring great gains when it comes to showing up in the search results for your potential customers.
This is where Embryo can help. Our experts know a thing or two about SEO and getting results when it comes to organic traffic, leads and conversions. If you’re looking for some guidance, get in touch with the team today at 0161 327 2635 or email info@embryo.com.