We will launch a project with over 1,000,000 products in approximately 150 categories. Later on there will also be review pages and technical detail pages for those products. So there will be more than 2,000,000 pages at launch, and the number will grow as soon as reviews are generated.
Content: products, product ratings/reviews, product technical details, later on maybe product questions/answers
Our primary goal is to tell Google which content on our portal is important and where the most recent changes were made.
I was a little bit confused when reading some answers to similar questions:
This one says:
The idea of a sitemap is to have this point to all pages.
And this one says:
You should not include any categories, paginated pages or sub categories in your sitemap.xml if there is no unique and distinct content on those pages.
And I thought it would be a good idea to put the product category overview URLs into the XML sitemap, to give Google a good starting point for crawling my site.
However, my question is whether I have to list every existing URL in the XML sitemap, or just the product URLs, from which you (or Google) can reach the reviews and technical details with one click.
And what happens when a product gets a new review? Should I add the review overview URL to the XML sitemap with a last-modified timestamp, or just update the timestamp on the product URL, so that Google crawls it again together with the linked pages?
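To make the second option concrete, here is a minimal sketch of what I have in mind (Python, standard library only). The function names, the example.com URL and the timestamps are made up for illustration and are not our real code. The idea is that the product's <lastmod> is simply the newest of the product's own change time and the times of its reviews:

```python
from datetime import datetime, timezone
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def product_entry(loc, product_changed, review_times):
    """Build a <url> element whose <lastmod> is the newest of the
    product's own change time and all of its review times.
    (Hypothetical helper; expects timezone-aware datetimes.)"""
    newest = max([product_changed, *review_times])
    url = ET.Element("url")
    ET.SubElement(url, "loc").text = loc
    # W3C datetime format, normalised to UTC.
    ET.SubElement(url, "lastmod").text = (
        newest.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S+00:00")
    )
    return url

def build_urlset(entries):
    """Wrap <url> elements in a <urlset> and return it as a string."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    urlset.extend(entries)
    return ET.tostring(urlset, encoding="unicode")

entry = product_entry(
    "https://www.example.com/products/123",            # made-up URL
    datetime(2024, 1, 1, tzinfo=timezone.utc),          # product last edited
    [datetime(2024, 3, 5, 12, 0, tzinfo=timezone.utc)], # one new review
)
print(build_urlset([entry]))
```

With this approach the review overview URL would not appear in the sitemap at all; whether Google then also recrawls the linked review page is exactly what I am unsure about.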
And should I put the category overview pages and top product pages into the XML sitemap too?
How often will Google fetch our XML sitemap index file (we will have many XML sitemaps), or will we have to resubmit them every time a last-modified timestamp changes?
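For context on why we need an index file: as far as I understand the sitemaps.org protocol, a single sitemap file may contain at most 50,000 URLs, so 2,000,000 pages means at least 40 files. A rough sketch of how I would build the index (Python, standard library only; the file names and dates are invented for illustration):

```python
import math
import xml.etree.ElementTree as ET

# Namespace and per-file URL limit from the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS_PER_FILE = 50_000

def files_needed(total_urls):
    """Minimum number of sitemap files for a given number of URLs."""
    return math.ceil(total_urls / MAX_URLS_PER_FILE)

def build_index(files):
    """Build a sitemap index from (sitemap_url, lastmod) pairs.
    (Hypothetical helper; lastmod is an already formatted date string.)"""
    index = ET.Element("sitemapindex", xmlns=SITEMAP_NS)
    for loc, lastmod in files:
        sitemap = ET.SubElement(index, "sitemap")
        ET.SubElement(sitemap, "loc").text = loc
        ET.SubElement(sitemap, "lastmod").text = lastmod
    return ET.tostring(index, encoding="unicode")

print(files_needed(2_000_000))  # 40
print(build_index([
    # Made-up file names and dates.
    ("https://www.example.com/sitemaps/products-1.xml", "2024-03-05"),
    ("https://www.example.com/sitemaps/products-2.xml", "2024-02-20"),
]))
```

The plan would be to regenerate only the sitemap files whose products changed and to update the matching <lastmod> in the index.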