STEPS TO FIND A SITEMAP OF A WEBSITE

    A site map (or sitemap) is a list of pages of a web site accessible to crawlers or users. It can be either a document in any form used as a planning tool for web design, or a web page that lists the pages on a web site, typically organized in hierarchical fashion. This helps visitors and search engine bots find pages on the site.


Now I’m going to show you, step by step, how to find the sitemap of most websites. This file is normally meant for search engines rather than shown to end users.


E.g. TO FIND SITEMAP OF GOOGLE (www.google.com)
 
Step 1: Locate the robots.txt file
          This can be done by typing “/robots.txt” at the end of the homepage URL, e.g. www.google.com/robots.txt


This will display the robots exclusion standard file.
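
Step 1 can also be done in code. The following is a minimal Python sketch, assuming only the standard library; the helper name robots_url is made up for illustration and is not part of any tool mentioned here.

```python
from urllib.parse import urljoin


def robots_url(homepage: str) -> str:
    """Return the robots.txt URL for the site that serves `homepage`."""
    # The leading slash makes urljoin resolve against the site root,
    # so any path already present on the homepage URL is discarded.
    return urljoin(homepage, "/robots.txt")


print(robots_url("http://www.google.com"))
```

Because the path is resolved against the site root, passing a deeper page of the same site should give the same robots.txt address.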

Step 2: Look for the lines beginning with “Sitemap:”, which usually sit at the bottom of the page. They give the sitemap URLs, and through them you can partially see how www.google.com is organized. The links below are the open sitemaps of Google:

SITEMAP OF WWW.GOOGLE.COM:

http://www.gstatic.com/s2/sitemaps/profiles-sitemap.xml
 
http://www.google.com/hostednews/sitemap_index.xml
 
http://www.google.com/ventures/sitemap_ventures.xml
 
http://www.google.com/sitemaps_webmasters.xml
 
http://www.gstatic.com/trends/websites/sitemaps/sitemapindex.xml
 
http://www.gstatic.com/dictionary/static/sitemaps/sitemap_index.xml

The same procedure can be used to find the sitemaps of other websites too: type “/robots.txt” at the end of the homepage URL, then go to the bottom of the text file to find the sitemap entries. This works for corporate and official websites.
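
The procedure of reading the robots.txt text and picking out the sitemap entries can be sketched in Python as well. This is only an illustration: the function name sitemap_urls and the sample text are assumptions (the sample reuses two of the URLs listed above and is not a copy of any real robots.txt).

```python
def sitemap_urls(robots_txt: str) -> list[str]:
    """Collect the URLs of all 'Sitemap:' lines in a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Split on the first colon only, so the URL's own colons survive.
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


# Hypothetical sample text, for illustration only.
sample = """\
User-agent: *
Disallow: /search
Sitemap: http://www.google.com/ventures/sitemap_ventures.xml
Sitemap: http://www.google.com/sitemaps_webmasters.xml
"""

for url in sitemap_urls(sample):
    print(url)
```

To run this against a live site, the text could be downloaded first (for example with urllib.request.urlopen) and passed to the same function. Not every site lists a sitemap in robots.txt, so an empty result is possible.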
E.g. 2. Robots of Wikipedia (look carefully at the robots.txt of Wikipedia)
                  By observing the robots.txt of Wikipedia, I came to know that these are very serious parameters: altering them carelessly may cause pages to be removed from search engine indexes.
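
To see why these parameters are serious, here is a small sketch using Python's standard urllib.robotparser module. The rules and the example.org URLs are invented for illustration and are not taken from Wikipedia's file; the helper name is_allowed is also an assumption.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules, for illustration only.
RULES = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(rules: str, url: str, agent: str = "*") -> bool:
    """Report whether `agent` may fetch `url` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(agent, url)


print(is_allowed(RULES, "http://example.org/private/page.html"))
print(is_allowed(RULES, "http://example.org/public/page.html"))
```

A single Disallow line is enough to tell well-behaved crawlers to skip a whole section of a site, which is why changes to this file should be made with care.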


