We learnt how Google constantly crawls webpages across the web to build an index of what is what, and what goes where.
The number of websites and webpages on the web is ever increasing! And most of these sites and pages are constantly undergoing changes… maybe new pages are added, things are edited, or errors are rectified…
In this vast digital ocean of endless information, it can be difficult to maintain records! That is a problem, and sitemaps are the solution. On our journey with Google Search Console so far, we have come across the term “sitemap” several times.
But what are these “Sitemaps”?
A sitemap is to crawlers much like what a tourist map is to tourists. A sitemap gives the crawlers directions about which pages are to be crawled, where to find them, and in what order they should be crawled.
So the tourists (crawlers) follow the directions given in the map (sitemap), sightsee (crawl) the spots (webpages), and take photographs (indexing) to remember them. They then fetch the best list of information (search results) for queries (search queries) relevant to the place (website) on the Search Engine Results Page (SERP).
Basically, a sitemap is a list of the pages and files on your website that makes it easier for your site to be crawled optimally. There are several types of sitemaps, but the most commonly used formats are HTML, XML, TXT, RSS, and Atom.
HTML sitemaps are now considered ancient and almost obsolete, as they were meant to help users navigate sites in the cruder, early age of the internet. XML stands for Extensible Markup Language, and XML sitemaps are the most commonly used sitemaps on the modern internet.
The basic difference between XML and HTML sitemaps is that XML sitemaps are hidden behind the curtain, visible only to the bots, whereas an HTML sitemap was an actual page of links that users would visit and click through to navigate the site.
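To make that concrete, here is a minimal sketch of what an XML sitemap file typically looks like; the URLs and dates below are placeholders, not real pages:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <url> entry per page you want the bots to crawl -->
  <url>
    <loc>https://www.example.com/</loc>
    <lastmod>2023-01-15</lastmod>
  </url>
  <url>
    <loc>https://www.example.com/about</loc>
    <lastmod>2023-01-10</lastmod>
  </url>
</urlset>
```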
RSS (Really Simple Syndication) is an XML-based format used by dynamic websites that publish new updates every now and then, e.g., news websites. Atom is an alternative to RSS used for a similar purpose.
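For reference, a bare-bones RSS 2.0 feed, which Google can also read as a sitemap, might look like the sketch below (again, all URLs, titles, and dates are purely illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <link>https://www.example.com/</link>
    <description>The latest stories from Example News</description>
    <!-- One <item> per freshly published page -->
    <item>
      <title>Our newest story</title>
      <link>https://www.example.com/news/newest-story</link>
      <pubDate>Mon, 16 Jan 2023 08:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>
```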
We saw how sitemaps make it easy for your pages to be crawled and indexed, but are they necessary? Sitemaps are not a necessity for your site to be crawled. In fact, if your website is really small, with all its pages well linked, you do not need a sitemap at all. So, how do you figure out whether your site needs one?
Use sitemaps in the following cases:
- You have a massive site: Google states that if your website has more than 500 pages, you should have a sitemap, though we would advise that anything over 100 pages, or even fewer at times, calls for one.
- Key updates: Sometimes changes or updates are time-sensitive. In such cases, it is advisable to submit a sitemap so the Google bots crawl those URLs on priority.
- Poor internal linking: When your pages are not linked to each other well enough, a sitemap will help Google find and index them.
- New website: If your site is new, or you have a lot of content that is updated quickly and regularly (e.g., a recruitment website or a news website), a sitemap will make it easy for your content to be discovered.
Okay, but how do you create a sitemap? Most Content Management Systems come with an inbuilt sitemap generator that links your pages, so in those cases you do not have to do a thing. You could write a sitemap yourself, but there are more efficient ways to get it done with almost no effort or time!
You could use plugins like Yoast SEO, RankMath, Google XML Sitemaps, Jetpack, etc. Setting these up may involve running code on your server, so you might need a developer’s help if you do not have a tech background.
If your bespoke site was created from scratch, check out online tools like XML-sitemaps.com, which crawls your website (free for up to 500 pages) and builds you an XML sitemap. You can find many such tools online, and they are really easy to use.
With most of these online tools, all you have to do is enter your website URL in the space provided, and all your links will be listed. You can remove links as you see fit, and then generate the sitemap. Sitemaps may also carry additional tags that can serve your specific purpose.
For example, the frequency tag tells bots how often a page is likely to change, so they know when (and which sections) to re-crawl. This is especially useful for dynamic sites that have regular updates. The priority tag ranks pages by their relative urgency or importance for crawling and indexing.
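In the standard XML sitemap format, these appear as the optional <changefreq> and <priority> tags inside a <url> entry; here is a small sketch with illustrative values:

```xml
<url>
  <loc>https://www.example.com/news</loc>
  <!-- Hint that this section changes often and is worth re-crawling daily -->
  <changefreq>daily</changefreq>
  <!-- Relative importance within your own site, from 0.0 to 1.0 (default 0.5) -->
  <priority>0.8</priority>
</url>
```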
Now, let us see how you can submit your sitemap on the Google Search Console… Open the Console, and in the navigation pane on the left side of the page, find the Sitemaps report. You will find it in the Index section, below Index Coverage.
Click on “Sitemaps”, and you will be led to the Sitemaps report page, where you will see a space for entering your sitemap’s URL so Google can access it. Under “Add a new sitemap”, enter the complete URL of your sitemap, and click “SUBMIT” to submit it to Google.
Note that Owner permission for the property is required to submit a sitemap. All the sitemaps you have submitted will also be listed in this report.
The list of sitemaps also shows further information under the following columns:
- URL: The URL of the sitemap.
- Type: Whether the sitemap is in TXT, RSS, Atom, HTML, or XML format.
- Submitted: The date on which the sitemap was submitted to Google through the “Add a new sitemap” field above the list.
- Last read: The date the Google bots last read the sitemap.
- Status: Whether the crawl was successful, whether the bots couldn’t fetch the sitemap, or whether the sitemap has errors.
- Discovered URLs: The number of URLs in the sitemap that the bots have been able to discover.
If the status reads “Couldn’t fetch” or “Has errors”, click on the row to see all the detected problems and errors.
Clicking the graph icon on the right side of each listed sitemap leads to the Index Coverage report for that sitemap.
Check out the errors in the report and get them fixed. Visit Search Console Help for information on the types of errors and how to solve them. You do not need to resubmit the sitemap every time you fix issues or make changes; once Google has recorded your sitemap, it will keep updating its index regularly.
In conclusion, are sitemaps necessary for your webpages to be crawled by Google? Nah. Do sitemaps ensure that all your pages will be crawled? Nope. Do you need one? Maybe not. But your site is likely to benefit from one, and it costs you little effort and no money… so why not have one, or even more?
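And if you do go for more than one, the sitemap protocol even lets you bundle them into a single sitemap index file that you can submit in one go; a minimal sketch, again with placeholder URLs:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <!-- One <sitemap> entry per child sitemap -->
  <sitemap>
    <loc>https://www.example.com/sitemap-pages.xml</loc>
  </sitemap>
  <sitemap>
    <loc>https://www.example.com/sitemap-news.xml</loc>
  </sitemap>
</sitemapindex>
```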