What Is a Sitemap?
A sitemap is an XML file that lists the important pages on your site. Google reads it to learn which pages exist and when they were last updated. The format also lets you assign each page a relative priority, although Google states that it ignores that value. You typically publish the file as sitemap.xml in the root of your site, so the address looks like this: yourdomain.com/sitemap.xml
A sitemap is especially useful when:
- Your site is new and has not yet received links from other sites
- You have a large number of pages (e-commerce, blog, portfolio, etc.)
- Some pages are hard to reach through your site's internal navigation
- You add new content frequently and want Google to notice it quickly
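To make the file format concrete, here is a minimal sketch in Python that builds a sitemap with the standard library. The function name build_sitemap, the example URLs, and the dates are assumptions chosen for illustration; only the urlset, url, loc, and lastmod elements and the namespace come from the sitemap protocol itself.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML as a string.

    pages: iterable of (url, lastmod) tuples, with lastmod as YYYY-MM-DD.
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, lastmod in pages:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
        ET.SubElement(entry, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    # Prepend the XML declaration by hand so it does not depend on the locale.
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages, for illustration only.
example_pages = [
    ("https://yourdomain.com/", "2024-01-15"),
    ("https://yourdomain.com/blog/first-post", "2024-02-01"),
]
print(build_sitemap(example_pages))
```

Saving the returned string as sitemap.xml in the site root would make it available at the address described above. Most content management systems can generate this file for you, so a script like this is mainly useful for static or custom-built sites.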
What Is robots.txt?
robots.txt is a small text file that lives in the root of your site (yourdomain.com/robots.txt). Using rules written inside it, you can tell search engine bots "don't enter this folder" or "don't crawl this page." Typical uses include blocking admin panels, test pages, or internal search result pages that you don't want crawled. Keep in mind that robots.txt controls crawling, not indexing: a blocked URL can still appear in search results if other sites link to it.
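As a rough illustration of how such rules behave, the sketch below feeds a sample robots.txt to Python's standard urllib.robotparser module and asks whether a given URL may be crawled. The folder names (/admin/, /test/, /search) and the helper is_crawlable are assumptions made up for this example, not rules your site necessarily needs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Disallow: /test/
Disallow: /search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the sample rules allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


for url in (
    "https://yourdomain.com/blog/first-post",
    "https://yourdomain.com/admin/login",
    "https://yourdomain.com/search?q=shoes",
):
    print(url, "->", "allowed" if is_crawlable(url) else "blocked")
```

The rules match by path prefix, so "Disallow: /admin/" covers everything under that folder. Python's parser is a convenient approximation for a quick check, but individual search engines may interpret edge cases differently.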
What Is the Difference Between the Two?
- Sitemap: "Google, look at these pages" — speeds up discovery
- robots.txt: "Google, stay out of these sections" — limits crawling
- Neither replaces the other: if one is missing, the other does not make up for it
- A page listed in the sitemap can still be blocked by robots.txt. This conflict is one of the most common mistakes, because the two files then send Google contradictory signals
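The conflict described in the last point can be detected automatically. The following is a minimal sketch, assuming you already have the contents of both files as strings; the function name blocked_sitemap_urls, the sample data, and the choice of "Googlebot" as the user agent are all illustrative assumptions.

```python
import xml.etree.ElementTree as ET
from urllib.robotparser import RobotFileParser

# Namespace used by sitemap files (sitemaps.org protocol).
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def blocked_sitemap_urls(sitemap_xml, robots_txt, user_agent="Googlebot"):
    """Return the sitemap URLs that robots.txt forbids user_agent to crawl."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", NS)
        if loc.text
    ]
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Hypothetical file contents, for illustration only.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://yourdomain.com/</loc></url>
  <url><loc>https://yourdomain.com/blog/first-post</loc></url>
  <url><loc>https://yourdomain.com/admin/settings</loc></url>
</urlset>
"""

SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
"""

for url in blocked_sitemap_urls(SAMPLE_SITEMAP, SAMPLE_ROBOTS):
    print("Listed in sitemap but blocked by robots.txt:", url)
```

Any URL this reports should either be removed from the sitemap or unblocked in robots.txt, depending on whether you actually want it crawled.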
