Search Engine Optimisation

Soft 404s and Other Errors

Imagine you’re a hipster going to your favourite restaurant. You’ve checked their online menu and you’re excited to try the new Eggs Benedict on a gluten-free muffin. You get there, only to be told it’s out of stock. Annoyed, you ask for the scrambled eggs. They’re out of that too. Frustrated, you settle for avocado on sourdough, but they’re out of that as well. You start to wonder if this restaurant is even worth visiting.

For users, encountering an error page that returns an HTTP 200 (OK) status code might not seem like a big deal. However, for web crawlers used by search engines, status codes are crucial in determining whether a page fetch was successful. If the page content is just an error message but the status is 200, crawlers will treat it as a real page and waste resources repeatedly fetching the same useless content. An error page served with a success status like this is often referred to as a “soft 404”.
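To make the idea concrete, here is a minimal sketch of how you might flag soft 404s on your own site. The phrase list and the function name `looks_like_soft_404` are assumptions for illustration; search engines use their own heuristics, which are not public.

```python
# Hypothetical phrases that tend to appear on error pages; tune these for your site.
ERROR_PHRASES = ("page not found", "no longer available", "something went wrong")


def looks_like_soft_404(status_code: int, body: str) -> bool:
    """Return True when a response claims success but reads like an error page."""
    if status_code != 200:
        # A real error status is an honest error, not a soft one.
        return False
    text = body.lower()
    return any(phrase in text for phrase in ERROR_PHRASES)
```

Running something like this over a crawl of your own URLs gives a rough list of pages worth checking by hand.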

Why Soft Errors Are a Problem

Crawlers, such as those from Google, have extensive resources and can afford some wastage. Your site, on the other hand, probably can’t. Soft errors are detrimental because:

  1. Crawl Budget Waste: The limited “crawl budget” allocated to your site is wasted on error pages instead of being used on actual, valuable content.
  2. Poor Indexing: Pages returning soft errors are filtered out during indexing, meaning they won’t appear in search results. This results in no return on investment for the resources you’ve spent serving these pages.

How to Fix Soft Errors

If your server or client encounters an error, it’s crucial to serve the appropriate HTTP status code. This ensures that crawlers understand the issue and don’t waste resources on non-existent content. Use status codes correctly:

  • 404: Not Found – The server can’t find the requested resource.
  • 410: Gone – The resource is permanently removed and will not be available again.
  • 500: Internal Server Error – A generic error message for server issues.
  • 503: Service Unavailable – The server is currently unable to handle the request due to maintenance or overload.
  • 301: Moved Permanently – The resource now lives at a new URL and requests are redirected there.

Here’s a full list of error codes if you want to know more.
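As a rough illustration of the status codes above, the sketch below uses Python’s standard `http.server` module. The site data (`PAGES`, `REDIRECTS`, `GONE`), the `MAINTENANCE` flag and the helper names `resolve`, `Handler` and `serve` are all hypothetical; a real site would apply the same logic in its own framework or web server configuration.

```python
from http import HTTPStatus
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site data, for illustration only.
PAGES = {"/menu": "<h1>Menu</h1>"}
REDIRECTS = {"/old-menu": "/menu"}   # moved permanently
GONE = {"/summer-special"}           # removed for good
MAINTENANCE = False                  # flip to True during planned downtime


def resolve(path):
    """Map a request path to (status, extra headers, HTML body)."""
    if MAINTENANCE:
        # Temporary problem: ask clients to come back in an hour.
        return HTTPStatus.SERVICE_UNAVAILABLE, {"Retry-After": "3600"}, "<h1>Back soon</h1>"
    if path in PAGES:
        return HTTPStatus.OK, {}, PAGES[path]
    if path in REDIRECTS:
        return HTTPStatus.MOVED_PERMANENTLY, {"Location": REDIRECTS[path]}, ""
    if path in GONE:
        return HTTPStatus.GONE, {}, "<h1>This page has been removed</h1>"
    # The key point: the error page is sent with a 404 status, not a 200.
    return HTTPStatus.NOT_FOUND, {}, "<h1>Page not found</h1>"


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        try:
            status, headers, body = resolve(self.path)
        except Exception:
            # Unexpected failure: report it honestly as a server error.
            status, headers, body = HTTPStatus.INTERNAL_SERVER_ERROR, {}, "<h1>Server error</h1>"
        payload = body.encode("utf-8")
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def serve(port=8000):
    """Start a local test server; stop it with Ctrl+C."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `serve()` and requesting a missing path should show a friendly error page to the visitor while the response still carries the 404 status that crawlers rely on.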

By correctly using these status codes, you help crawlers focus on real pages with meaningful content, thereby improving your site’s efficiency and search visibility. Larger sites benefit the most: less server capacity is spent serving pages nobody should be requesting, and more of the crawl budget goes to pages you want indexed.

Don’t try to rate-limit crawlers, or you’ll ultimately create problems for yourself. Instead, manage your crawl budget by ensuring your server returns the correct HTTP status codes, and let Google and other search engines focus their attention on the content that matters.

Exceptions

Google and other search engines consider out-of-stock product pages to be “soft 404s” because, in their opinion, these pages should return a “page not found” error instead of a 200 status. Google has previously confirmed that it would likely drop these out-of-stock products from search results anyway, so should you remove the pages completely? Not necessarily. It depends on the product:

  • If the product or product line will never be stocked again then remove the page completely – 404
  • If the product is temporarily out of stock then leave it be – 200
  • If there’s an equivalent, related or similar product then redirect the page – 301
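The three rules above can be sketched as a small decision function. The names `Stock` and `product_response` are hypothetical, and the sketch assumes that a redirect only applies once a product is discontinued; the rules themselves don’t say which one wins when a product is temporarily out of stock and also has an equivalent.

```python
from enum import Enum
from typing import Optional, Tuple


class Stock(Enum):
    IN_STOCK = "in_stock"
    TEMPORARILY_OUT = "temporarily_out"
    DISCONTINUED = "discontinued"


def product_response(stock: Stock, replacement_url: Optional[str] = None) -> Tuple[int, Optional[str]]:
    """Return (HTTP status code, redirect target or None) for a product page."""
    if stock is Stock.DISCONTINUED:
        if replacement_url:
            # An equivalent product exists: send visitors and crawlers there.
            return 301, replacement_url
        # Never coming back and nothing to replace it: remove the page.
        return 404, None
    # In stock, or only temporarily out of stock: keep serving the page.
    return 200, None
```

For example, `product_response(Stock.DISCONTINUED, "/products/new-model")` gives a 301 to the replacement, while a temporarily out-of-stock product keeps its 200.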

Conclusion

Soft 404s and similar errors are hidden problems that can severely impact your site’s performance if not addressed properly. By understanding and fixing these issues, you ensure that both users and crawlers have a better experience on your site, leading to improved visibility and efficiency. This work falls under technical SEO, but it overlaps naturally with web development, so it’s easy for confusion to arise over who is responsible for it. If you have any questions, please reach out and we’d be happy to help.
