A site map (or sitemap) is a list of pages of a web site accessible to crawlers or users. It can be either a document in any form used as a planning tool for web design, or a web page that lists the pages on a web site, typically organized in hierarchical fashion. This helps visitors and search engine bots find pages on the site.
Now I’m going to show you, step by step, how to find the sitemap of most websites. It is normally published for search engines rather than shown to end users.
E.g. 1. TO FIND THE SITEMAP OF GOOGLE (www.google.com)
Step 1: Open the robots file
This can be done by typing “/robots.txt” at the end of the homepage URL, for example www.google.com/robots.txt.
This will display the site's robots exclusion standard file.
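If you would rather build the robots.txt address in a script than type it by hand, here is a minimal Python sketch. The function name robots_url is my own choice for illustration; it only relies on urljoin from the standard library.

```python
from urllib.parse import urljoin


def robots_url(homepage: str) -> str:
    """Return the robots.txt URL for the site that serves `homepage`."""
    # The leading slash makes urljoin resolve against the site root,
    # so any path already present on the homepage URL is dropped.
    return urljoin(homepage, "/robots.txt")


print(robots_url("http://www.google.com"))  # prints http://www.google.com/robots.txt
```

Opening the printed address in a browser shows the same file as typing “/robots.txt” manually.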
Step 2: Look for the lines beginning with “Sitemap:”, usually at the bottom of the page. These give the sitemap URLs, and from them you can partially see how www.google.com is organized. The links below are the open sitemaps of Google:
http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml
http://www.google.com/hostednews/sitemap_index.xml
http://www.google.com/ventures/sitemap_ventures.xml
http://www.google.com/sitemaps_webmasters.xml
http://www.gstatic.com/trends/websites/sitemaps/sitemapindex.xml
http://www.gstatic.com/dictionary/static/sitemaps/sitemap_index.xml
The same procedure can be used to find the sitemaps of other websites too: at the end of the homepage URL, type “/robots.txt”, then go to the bottom of the text file to find the sitemap entries. This works for many corporate and official websites, although not every site lists a sitemap in its robots file.
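The two steps can also be scripted. The following is a rough sketch, not a complete crawler: the names fetch_robots, extract_sitemaps and SAMPLE_ROBOTS are mine, and the sample text is a made-up robots file that borrows two of the URLs listed above, not Google's real file. It assumes sitemap entries use the usual “Sitemap: <url>” form.

```python
from urllib.parse import urljoin
from urllib.request import urlopen


def fetch_robots(homepage: str) -> str:
    """Download robots.txt for the site serving `homepage` (needs network access)."""
    with urlopen(urljoin(homepage, "/robots.txt"), timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


def extract_sitemaps(robots_text: str) -> list:
    """Collect the URLs from every 'Sitemap:' line in a robots.txt body."""
    sitemaps = []
    for line in robots_text.splitlines():
        # Drop trailing comments; this assumes sitemap URLs contain no '#'.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        # Field names are matched case-insensitively.
        if sep and field.strip().lower() == "sitemap":
            sitemaps.append(value.strip())
    return sitemaps


# Hypothetical robots file used only to demonstrate the parsing step.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /search
Sitemap: http://www.google.com/sitemaps_webmasters.xml
Sitemap: http://www.google.com/ventures/sitemap_ventures.xml
"""

for url in extract_sitemaps(SAMPLE_ROBOTS):
    print(url)
```

For a live site, extract_sitemaps(fetch_robots("http://www.google.com")) would run both steps in one go; an empty list means the robots file does not mention a sitemap.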
E.g. 2. ROBOTS FILE OF WIKIPEDIA (look carefully at the robots.txt of Wikipedia)
By observing the robots.txt of Wikipedia, I came to know that these directives are very serious: altering them carelessly can stop search engines from crawling pages, which may lead to those pages being removed from search results.
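To see why a single line in the robots file matters so much, here is a small sketch using Python's standard urllib.robotparser module. The rules in SAMPLE_ROBOTS and the example.com addresses are invented for illustration and are not taken from Wikipedia's file; is_allowed is my own helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: every crawler is told to stay out of /private/.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_text: str, user_agent: str, url: str) -> bool:
    """Report whether `user_agent` may crawl `url` under the given rules."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.can_fetch(user_agent, url)


print(is_allowed(SAMPLE_ROBOTS, "*", "http://example.com/private/page.html"))  # False
print(is_allowed(SAMPLE_ROBOTS, "*", "http://example.com/public/page.html"))   # True
```

Changing the Disallow line to “Disallow: /” would make both checks return False, which is the kind of accidental edit that can keep well-behaved crawlers away from an entire site.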
More info @ http://en.wikipedia.org/wiki/Site_map