Sitemap URL Counter
Count, Analyze & Validate XML Sitemaps

Paste your XML sitemap to instantly count all URLs, detect sitemap indexes, find duplicate entries, visualize path depth, and validate against Google's 50,000 URL limit.

Paste XML · Sitemap index support · Duplicate detection · Client-side · private

Sitemap URL Counter

Paste the XML from your sitemap file (right-click → View Source on yoursite.com/sitemap.xml, then copy all). Handles both <urlset> and <sitemapindex> formats.

The counter reports total URLs, duplicate URLs, unique URLs, the sitemap type, and the domains found.

XML Sitemap Guide

How XML sitemaps work, what formats exist, and how to optimize them for search engine crawling.

Standard urlset sitemap

The standard XML sitemap uses <urlset> as the root element and <url><loc> for each URL. Optional tags: <lastmod> (last modified date), <changefreq> (how often the page changes), and <priority> (0.0–1.0, relative importance). Google ignores <changefreq> and <priority>, and uses <lastmod> only when it is consistently and verifiably accurate.
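As a rough illustration of the structure described above, the following Python sketch reads a small urlset document and pulls out each <loc> with its optional <lastmod>. The sample XML, the example.com URLs, and the function name read_urlset are assumptions made for this example; this is not the code the counter itself runs.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample document used only for illustration.
SAMPLE_URLSET = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://example.com/blog/first-post</loc>
  </url>
</urlset>"""


def read_urlset(xml_text):
    """Return a list of (loc, lastmod) tuples; lastmod is None when absent."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", default="", namespaces=SITEMAP_NS).strip()
        lastmod = url.findtext("sm:lastmod", default=None, namespaces=SITEMAP_NS)
        if loc:  # skip <url> entries with a missing or empty <loc>
            entries.append((loc, lastmod))
    return entries


for loc, lastmod in read_urlset(SAMPLE_URLSET):
    print(loc, lastmod)
```

The sketch assumes the document declares the standard sitemap namespace; a file without it would need the lookups adjusted.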

Sitemap index files

When your site exceeds 50,000 URLs, or a single sitemap file would exceed 50MB uncompressed, split it into multiple sitemap files and reference them from a sitemap index. The index uses <sitemapindex> as root and <sitemap><loc> to point to each child sitemap. Google Search Console lets you submit the index URL and discovers all child sitemaps automatically. This counter detects and labels both formats.
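One possible way to tell the two formats apart is to look at the root element name, as in the Python sketch below. The function name classify_sitemap and the sample index are hypothetical, and the counter's own detection logic may differ.

```python
import xml.etree.ElementTree as ET

# ElementTree writes namespaced tags as "{namespace}name".
NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

# Hypothetical sample index used only for illustration.
SAMPLE_INDEX = """<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/products-sitemap.xml</loc></sitemap>
  <sitemap><loc>https://example.com/blog-sitemap.xml</loc></sitemap>
</sitemapindex>"""


def classify_sitemap(xml_text):
    """Return (kind, locs) where kind is 'urlset' or 'sitemapindex'."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    root_name = root.tag.rsplit("}", 1)[-1]  # drop the namespace prefix
    if root_name == "urlset":
        child = "url"
    elif root_name == "sitemapindex":
        child = "sitemap"
    else:
        raise ValueError(f"unexpected root element: {root_name}")
    locs = [
        (el.findtext(f"{NS}loc") or "").strip()
        for el in root.findall(f"{NS}{child}")
    ]
    return root_name, [loc for loc in locs if loc]


kind, locs = classify_sitemap(SAMPLE_INDEX)
print(kind, len(locs))
```

For a sitemap index, the returned list holds child sitemap URLs rather than page URLs, so each child would still need to be fetched and counted separately.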

What to include in a sitemap

Include canonical URLs only — the final destination URL without redirect chains. Include: key product pages, blog posts, category pages, and location pages. Exclude: URLs with noindex, paginated pages (except page 1), URLs with parameters if you have canonical versions, admin pages, and URLs returning 4xx or 5xx status codes. A clean, focused sitemap helps Googlebot prioritize crawling your important content.
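The inclusion rules above can be expressed as a simple filter. The Python sketch below assumes you already have crawl data for each page; the Page record and its fields (status, canonical, noindex) are hypothetical and stand in for whatever your crawler or CMS exports.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical crawl record; field names are assumptions for this sketch."""
    url: str
    status: int          # HTTP status code returned for the URL
    canonical: str       # canonical URL declared by the page
    noindex: bool = False


def belongs_in_sitemap(page):
    """Keep only indexable pages that return 200 and are their own canonical."""
    if page.status != 200:          # drops 3xx redirects and 4xx/5xx errors
        return False
    if page.noindex:                # drops pages marked noindex
        return False
    if page.canonical != page.url:  # drops parameter variants and other duplicates
        return False
    return True


pages = [
    Page("https://example.com/a", 200, "https://example.com/a"),
    Page("https://example.com/a?sort=price", 200, "https://example.com/a"),
    Page("https://example.com/gone", 404, "https://example.com/gone"),
    Page("https://example.com/private", 200, "https://example.com/private", noindex=True),
]
print([p.url for p in pages if belongs_in_sitemap(p)])
```

Pagination and admin paths are not handled here; those would need site-specific rules.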

Sitemap best practices

Submit your sitemap in Google Search Console at Search Console → Sitemaps → Add sitemap. Use absolute URLs including the protocol. Keep <lastmod> accurate: setting it to today's date on every page trains Google to ignore it. Split large sites into topic-based sitemaps (products-sitemap.xml, blog-sitemap.xml) so that indexing problems are easier to isolate per section. Compress large sitemaps with gzip to reduce transfer size; the 50MB limit still applies to the uncompressed file.
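Splitting and compressing can be sketched in a few lines of Python. The helper names build_urlset and split_and_compress are assumptions for this example, and the output is kept in memory rather than written to files such as products-sitemap.xml.gz.

```python
import gzip
from xml.sax.saxutils import escape

MAX_URLS = 50_000  # per-file URL limit


def build_urlset(urls):
    """Build a minimal urlset document with absolute, entity-escaped URLs."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ]
    for u in urls:
        lines.append(f"  <url><loc>{escape(u)}</loc></url>")  # escapes &, <, >
    lines.append("</urlset>")
    return "\n".join(lines) + "\n"


def split_and_compress(urls, max_urls=MAX_URLS):
    """Return a list of gzip-compressed sitemap documents, one per chunk."""
    chunks = [urls[i:i + max_urls] for i in range(0, len(urls), max_urls)]
    return [gzip.compress(build_urlset(chunk).encode("utf-8")) for chunk in chunks]


# Small chunk size used here only to make the splitting visible.
demo_urls = [f"https://example.com/p?id={i}&x=1" for i in range(5)]
files = split_and_compress(demo_urls, max_urls=2)
print(len(files))
```

This sketch checks only the URL count; a real generator would also need to confirm that each uncompressed document stays under 50MB.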

Google Sitemap Limits & Rules

Official Google sitemap requirements — what counts, what's mandatory, and what the limits are.

| Rule | Limit / Requirement | What happens if violated |
|---|---|---|
| URLs per sitemap file | 50,000 maximum | Google may stop reading at 50,000; remaining URLs are ignored |
| Sitemap file size | 50MB uncompressed | Google rejects files over 50MB; split into multiple sitemaps (gzip does not raise the limit) |
| Sitemaps per sitemap index | 50,000 maximum | Entries beyond 50,000 may be ignored; use additional index files |
| URL format | Absolute URLs only | Relative URLs are invalid; Google may not process them |
| URL encoding | Must use entity escaping | Unescaped &, ', ", <, > cause XML parse errors |
| Character encoding | UTF-8 only | Other encodings cause parse failures |
| Sitemap location scope | Listed URLs must be at or below the sitemap's directory | Sitemap at /blog/sitemap.xml can only list /blog/* URLs |
| lastmod format | W3C Datetime (YYYY-MM-DD, or a full timestamp with time zone) | Non-standard dates may be ignored by Google |
| priority / changefreq | Optional, largely ignored | No negative effect; Google mostly ignores these |
| Submission | Google Search Console or robots.txt | Unsubmitted sitemaps may still be discovered, but more slowly |
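Several of the rules in the table can be checked mechanically. The Python sketch below is one possible validator, not the counter's actual implementation: check_sitemap is a hypothetical name, and the lastmod pattern is a simplification that accepts a date with an optional time and time zone rather than every form W3C Datetime allows.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
MAX_URLS = 50_000
MAX_BYTES = 50 * 1024 * 1024  # 50MB, measured on the uncompressed file

# Simplified pattern: YYYY-MM-DD, optionally followed by a time and time zone.
LASTMOD_RE = re.compile(
    r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?$"
)


def check_sitemap(xml_text):
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    raw = xml_text.encode("utf-8")
    if len(raw) > MAX_BYTES:
        problems.append("file exceeds 50MB uncompressed")
    try:
        root = ET.fromstring(raw)
    except ET.ParseError as exc:  # e.g. an unescaped & inside <loc>
        return [f"XML parse error: {exc}"]
    urls = root.findall(f"{NS}url")
    if len(urls) > MAX_URLS:
        problems.append(f"{len(urls)} URLs exceeds the 50,000 limit")
    for url in urls:
        loc = (url.findtext(f"{NS}loc") or "").strip()
        parts = urlsplit(loc)
        if parts.scheme not in ("http", "https") or not parts.netloc:
            problems.append(f"not an absolute URL: {loc!r}")
        lastmod = url.findtext(f"{NS}lastmod")
        if lastmod is not None and not LASTMOD_RE.match(lastmod.strip()):
            problems.append(f"non-standard lastmod: {lastmod!r}")
    return problems
```

Location scope and character encoding are left out of this sketch because they depend on where and how the file is served.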

Sitemap URL Counter – FAQ

How many URLs can an XML sitemap contain?
Google's official limit is 50,000 URLs per sitemap file with a maximum uncompressed file size of 50MB. If your site has more than 50,000 URLs, you must create multiple sitemap files and reference them from a sitemap index. Each child sitemap referenced in the index also has the 50,000 URL limit. The counter above validates against this limit and warns you if you're approaching or exceeding it.
What is the difference between a sitemap and a sitemap index?
A regular sitemap uses <urlset> as the root element and lists <url><loc> entries directly. A sitemap index uses <sitemapindex> as the root element and lists <sitemap><loc> entries that point to individual sitemap files rather than page URLs. Sitemap indexes let you manage large sites with more than 50,000 pages by splitting into multiple files. This tool auto-detects which format you've pasted.
Why are duplicate URLs a problem in sitemaps?
Duplicate URLs in a sitemap waste your URL budget (each counts toward the 50,000 limit), confuse Googlebot about which version is canonical, and may indicate deeper canonical issues in your site structure. Common causes: trailing slash vs. no trailing slash (/page vs /page/), HTTP vs. HTTPS versions, www vs. non-www, and paginated pages included multiple times. Clean up duplicates and ensure all sitemap URLs use your canonical URL format.
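The duplicate causes listed above (trailing slash, HTTP vs. HTTPS, www vs. non-www) can be surfaced by comparing URLs after a light normalization. The Python sketch below is one way to do that; normalize and find_duplicates are hypothetical names, and the normalization rules are assumptions that may be too aggressive for sites where these variants serve different content.

```python
from collections import Counter
from urllib.parse import urlsplit


def normalize(url):
    """Reduce a URL to a comparison key: no scheme, no 'www.', no trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    query = f"?{parts.query}" if parts.query else ""
    return f"{host}{path}{query}"


def find_duplicates(urls):
    """Return (exact, near): exact repeats with counts, and variant groups by key."""
    exact = {u: n for u, n in Counter(urls).items() if n > 1}
    groups = {}
    for u in dict.fromkeys(urls):  # de-duplicate while keeping first-seen order
        groups.setdefault(normalize(u), []).append(u)
    near = {key: variants for key, variants in groups.items() if len(variants) > 1}
    return exact, near


sample = [
    "https://example.com/page",
    "https://example.com/page",
    "http://www.example.com/page/",
    "https://example.com/other",
]
exact, near = find_duplicates(sample)
print(exact)
print(near)
```

Exact repeats are literal copies of the same string, while the near groups collect different strings that collapse to the same key and likely point at one canonical page.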
How do I find my sitemap URL?
Common sitemap locations: yoursite.com/sitemap.xml (most common), yoursite.com/sitemap_index.xml, yoursite.com/sitemap-index.xml, and yoursite.com/wp-sitemap.xml (WordPress core, version 5.5 and later). Check your robots.txt file (yoursite.com/robots.txt): it often includes a "Sitemap:" directive pointing to the correct URL. WordPress with Yoast SEO generates its index at /sitemap_index.xml, with child sitemaps such as /post-sitemap.xml. Once found, open the URL in your browser, view the page source, select all text (Ctrl+A), copy, and paste here.
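Reading the "Sitemap:" directive out of robots.txt is a small text-parsing job. The Python sketch below works on robots.txt content you have already downloaded; the function name sitemap_urls_from_robots and the sample file are assumptions, and treating everything after # as a comment is a simplification.

```python
def sitemap_urls_from_robots(robots_txt):
    """Return the URLs from every 'Sitemap:' line, in the order they appear."""
    found = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments (simplification)
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep and key.strip().lower() == "sitemap" and value.strip():
            found.append(value.strip())
    return found


# Hypothetical robots.txt content used only for illustration.
SAMPLE_ROBOTS = """User-agent: *
Disallow: /admin/
Sitemap: https://example.com/sitemap_index.xml
sitemap: https://example.com/news.xml  # news section
"""

print(sitemap_urls_from_robots(SAMPLE_ROBOTS))
```

The directive name is matched case-insensitively, and a file may list more than one sitemap.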
Should I include all my site's pages in the sitemap?
No. Include only pages you want indexed: canonical versions of pages, important product and category pages, blog posts, and location pages. Exclude: noindex pages, paginated URLs beyond page 1, URL parameter variants (?sort=, ?filter=), admin URLs, pages returning 4xx or 5xx status codes, and thin or duplicate content pages. A smaller, focused sitemap of your best content gives search engines a clearer signal of which pages matter than an exhaustive list of every URL.