What is the sitemap feature in Oracle B2C Service and how can I use it for my site?
Oracle B2C Service sites, All versions
Search engines use web crawlers (spiders) to explore your website and index the pages that can appear in search results. Spiders usually find pages by following links within your site and from other sites. However, you may have pages that cannot be reached by browsing the interface (for example, pages that are accessed only when customers complete a form). A spider cannot index those pages because it cannot access them. Content in iFrames is not searched or indexed either.
Sitemap lets you inform search engines about all the answer links that are available on your site, including those that are not accessible through links. Sitemap is an XML-based protocol used by the major search engines, including Yahoo! and Google. By using the Sitemap protocol, you can identify all the web pages on your customer portal instead of waiting for the search engine spiders to find them.
For more information on sitemap technology, refer to http://www.sitemaps.org.
The Sitemap page is an XML-formatted document on a web server that lists all the URLs for a site along with metadata for each URL, including the priority of each page and when it was last updated. Spiders use the page to crawl your end-user interface more intelligently rather than relying solely on links within the site.
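To make the structure described above concrete, here is a minimal Python sketch that reads a small document in the Sitemap protocol format and extracts each URL with its last-modified date and priority. The sample XML, the support.example.com host, and the parse_sitemap helper are illustrative assumptions; they are not the actual output of an Oracle B2C Service site.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org).
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

# Hypothetical sample; the host and paths are placeholders, not real site URLs.
SAMPLE_SITEMAP = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://support.example.com/answers/1</loc>
    <lastmod>2024-01-15</lastmod>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://support.example.com/answers/2</loc>
    <lastmod>2024-02-01</lastmod>
    <priority>0.5</priority>
  </url>
</urlset>
"""


def parse_sitemap(xml_text):
    """Return one dict per <url> entry with its loc, lastmod, and priority."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    entries = []
    for url in root.findall("sm:url", NS):
        priority = url.findtext("sm:priority", namespaces=NS)
        entries.append({
            "loc": url.findtext("sm:loc", namespaces=NS),
            # lastmod and priority are optional in the protocol, so either may be None.
            "lastmod": url.findtext("sm:lastmod", namespaces=NS),
            "priority": float(priority) if priority is not None else None,
        })
    return entries


for entry in parse_sitemap(SAMPLE_SITEMAP):
    print(entry["loc"], entry["lastmod"], entry["priority"])
```

A spider reads the same fields to decide which pages to fetch and which have changed since its last visit.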
When a site is requested through Configuration Assistant, Sitemap is not enabled by default. See Answer ID 12254: Sitemap and robots.txt for instructions on how to enable Sitemap through the Configuration Assistant.
Alternatively, if your site is newly provisioned, then Sitemap is enabled. A Sitemap page is created and registered with Yahoo! and Google. The XML output of the page is available at:
The HTML output of the page is at:
You can disable Sitemap through the Configuration Assistant.
When Sitemap is enabled, a link to it is added to the robots.txt file for your site. Each time a spider reads the robots.txt file, it is directed to the sitemap, which is generated dynamically and reflects all public changes to your pages and answers. You should be able to view the content of your site's robots.txt file via the following path:
Important password considerations: If passwords are required for your end-user pages, a sitemap can still be created and registered. However, because of the password requirement, search engines cannot reach any of the URLs listed in the sitemap file, so those pages will not appear in search results.
Note: Some of the security settings that restrict end-user access, such as SEC_VALID_ENDUSER_HOSTS and SEC_INVALID_ENDUSER_HOSTS, can also affect whether the spider can access this information.
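As an illustration of the robots.txt link described earlier, the following Python sketch shows how a crawler might pick up sitemap locations from a robots.txt file. The sample file content, the support.example.com URL, and the find_sitemaps helper are assumptions made for the example; the entries in your site's robots.txt file may differ.

```python
# Hypothetical robots.txt content; the sitemap URL is a placeholder.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
Sitemap: https://support.example.com/sitemap.xml
"""


def find_sitemaps(robots_text):
    """Collect the URLs given in 'Sitemap:' lines of a robots.txt file."""
    urls = []
    for line in robots_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Split on the first colon only, so the URL's own colons are preserved.
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap":
            urls.append(value.strip())
    return urls


print(find_sitemaps(SAMPLE_ROBOTS))
```

Running a check like this against your own robots.txt content is a quick way to confirm that the sitemap link is present after Sitemap is enabled.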