3

My static site doesn't generate sitemap.xml automatically.

Hundreds of pages have been modified recently (meta description, etc.). I would like Google to recrawl the whole website. Submitting the URLs one by one in Search Console is not an option.

So I'm about to create a sitemap.xml and submit it in Google Search Console.

Question: If I submit a sitemap.xml only once, will Googlebot still continue to crawl the website on its own in the long term and discover new pages by itself?

Or does submitting a sitemap.xml URL once tell Googlebot something like "No need to crawl automatically in the future, just use this sitemap.xml", which would eventually force me to keep the sitemap.xml up to date? (I did not plan to do this. I wanted to submit a one-shot sitemap.xml and then let Googlebot find future pages on its own.)

0

2 Answers

5

Google will crawl any accessible page, with or without a sitemap. If your pages have changed, Google will crawl them again even if you don't submit a sitemap.

If you do upload a sitemap, it will not restrict Google to only the links it contains.

2
  • Thank you for your answer. Would you have a source for this information? It would be great to add a link for future reference.
    – Basj
    Nov 29, 2021 at 10:26
  • Well, Googlebot accesses any page that is public and has a link pointing to it. That's the principle of these bots, as long as you don't restrict crawlability with a <meta> tag. Sometimes bots can miss a link, and that's why we make sitemaps: to be sure that all the links are accessible. If your page is already indexed, Googlebot will crawl it again. You can read more about bots by searching for how they work: raddinteractive.com/2828-2
    – Lka
    Nov 29, 2021 at 10:38
2

Submitting an XML sitemap may or may not get Googlebot to come re-crawl your entire site more quickly than it otherwise would. Google has suggested using temporary sitemaps to trigger recrawls so it may be worth a try.

One way to speed this up could be to submit a temporary sitemap file listing these URLs with the last modification date (eg, when you changed them to 404 or added a noindex), so that we know to recrawl & reprocess them.

Googlebot will probably re-crawl most of your pages within a couple of weeks anyway. Submitting a temporary sitemap may speed up the process somewhat, but Googlebot may just say "I already know about these URLs" and not recrawl them merely because they appear in a sitemap for the first time.

If you are going to submit an XML sitemap, you should either keep it up to date going forward, or remove it after it serves its purpose. XML sitemaps don't give much control over which pages get indexed or how well they get ranked. At best they get Googlebot to come crawl all your URLs, provide a signal to Google about which of your URLs are canonical, and give you extra stats in Google Search Console. See The Sitemap Paradox. If you don't keep the sitemap up-to-date, it may confuse Google regarding your preferred URLs, and stats in Google Search Console won't be useful.

Having said that, an old outdated sitemap won't prevent new pages on your site from getting crawled and indexed. Google doesn't limit crawling to just the sitemap, nor does Google index only the pages in the sitemap. When you have an XML sitemap, but Google indexes a page that is not included in it, Google will give you a warning in Google Search Console saying "indexed but not submitted in sitemap." See "Google says an indexed page is not in the sitemap even though it is in the sitemap" for an example. Avoiding these warnings is another reason not to leave an outdated sitemap in place.

TLDR: I would recommend one of the following actions:

  • Don't submit the sitemap as planned and just let Googlebot re-crawl your site on its own time.
  • Submit the sitemap but treat it as temporary and take it back down after a couple of weeks.
  • Submit the sitemap and keep it updated by regenerating it periodically going forward.
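
For the second and third options, the sitemap can be generated with a small script instead of by hand. Here is a minimal sketch in Python, assuming the static site is a folder of .html files and that each file's modification time is a reasonable stand-in for the page's last modification date. The function name build_sitemap, the folder name, and the base URL are all hypothetical; adjust them to your setup. Note that it lists index.html files under their literal paths, so you may want to map those to the directory URL if that is your preferred form.

```python
from datetime import datetime, timezone
from pathlib import Path
from urllib.parse import quote
from xml.etree import ElementTree as ET

# Namespace required by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(site_root: Path, base_url: str) -> bytes:
    """Return sitemap XML listing every .html file under site_root.

    Assumption: the file's mtime approximates when the page last changed.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in sorted(site_root.rglob("*.html")):
        # Path relative to the site root, with forward slashes, URL-escaped.
        rel = quote(page.relative_to(site_root).as_posix())
        mtime = datetime.fromtimestamp(page.stat().st_mtime, tz=timezone.utc)
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = f"{base_url.rstrip('/')}/{rel}"
        # W3C date format (YYYY-MM-DD) is accepted for <lastmod>.
        ET.SubElement(url, "lastmod").text = mtime.strftime("%Y-%m-%d")
    return ET.tostring(urlset, encoding="utf-8", xml_declaration=True)
```

To use it, write the result next to your pages, for example `Path("public/sitemap.xml").write_bytes(build_sitemap(Path("public"), "https://example.com"))`, then submit that URL in Google Search Console. Re-running the script as part of your build keeps the file current; deleting the file and removing it from Search Console covers the temporary case.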
