Sitemap and robots.txt
Answer ID 12254   |   Last Review Date 06/01/2022

How can I update the robots.txt file for my site?

Environment:

Site Indexing, Configuration Assistant
Spiders, Robots

Resolution:

You can now enable Sitemap and update the robots.txt file in the Configuration Assistant. For information on how to access the Configuration Assistant, see Answer ID 7537: Oracle B2C Service Configuration Assistant on Oracle Cloud Portal.
 
Enabling Sitemap
 
Search engines use web crawlers (spiders) to explore your website and index the pages that will be available when searches are conducted. Spiders usually find pages by following links within your site and from other sites. However, you may have pages that cannot be reached by browsing the interface (for example, pages that are displayed only after customers complete a form). A spider cannot index those pages because it cannot reach them.
 
Sitemap lets you inform search engines about all the answer links that are available on your site, including those that are not accessible through links. Sitemap is an XML-based protocol used by the major search engines, including Yahoo! and Google. By using the Sitemap protocol, you can identify all the web pages on your customer portal instead of waiting for the search engine spiders to find them.
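
To make the Sitemap format concrete, the following is a minimal sketch in Python of what a Sitemap document contains. Oracle B2C Service generates the sitemap for you once Sitemap is enabled, so this code is not required; it only illustrates the XML structure. The function name build_sitemap and the example URLs are hypothetical and were chosen for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org).
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(urls):
    """Return a Sitemap XML document (as a string) that lists the given URLs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in urls:
        entry = ET.SubElement(urlset, "url")
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical answer URLs, for illustration only.
example_urls = [
    "https://example.custhelp.com/app/answers/detail/a_id/1",
    "https://example.custhelp.com/app/answers/detail/a_id/2",
]
sitemap_xml = build_sitemap(example_urls)
print(sitemap_xml)
```

Each url element carries one loc element with the full address of a page, which is how pages that no link points to can still be announced to a search engine.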
 
Use this procedure to enable or disable Sitemap on each interface in the Configuration Assistant:
  1. Click the name of the site for which you want to enable or disable Sitemap.
  2. Click the “Interfaces” tab on the left side.
  3. Click the hamburger menu for the interface on which you want to enable or disable Sitemap.
  4. Click “Enable Sitemap” or “Disable Sitemap”, then click Yes when the confirmation message appears.
 
 
 
Robots.txt
 
For Oracle B2C Service sites, a robots.txt file is installed on each interface. This file tells spiders which parts of the site they may crawl, which prevents indiscriminate spider searches against an Oracle B2C Service site. You can edit the robots.txt file for each interface in the Configuration Assistant, provided Sitemap is enabled on that interface. For detailed information on how to modify the robots.txt file to control access to the site, see Answer ID 1669: Allowing other search engines to index the Oracle B2C Service application.
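
As an illustration of how a spider interprets robots.txt rules, the sketch below uses Python's standard urllib.robotparser module to test whether a given user agent may fetch a URL. The robots.txt content, the user agent names (ExampleBot, OtherBot), and the URLs are hypothetical; the actual file on your interface will differ. Note that a "#" starts a comment in robots.txt, so a trailing "# CUSTOM" marker does not change the meaning of a rule.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /

User-agent: ExampleBot # CUSTOM
Allow: /app/answers/ # CUSTOM
Disallow: / # CUSTOM
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the rules in robots_txt let user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


answer_url = "https://example.custhelp.com/app/answers/detail/a_id/1"
home_url = "https://example.custhelp.com/app/home"

print(is_allowed(SAMPLE_ROBOTS_TXT, "ExampleBot", answer_url))
print(is_allowed(SAMPLE_ROBOTS_TXT, "ExampleBot", home_url))
print(is_allowed(SAMPLE_ROBOTS_TXT, "OtherBot", answer_url))
```

In this sample, only the hypothetical ExampleBot is granted access, and only to paths under /app/answers/; every other spider falls back to the "User-agent: *" entry and is disallowed.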
 
Use this procedure to edit the robots.txt file on each interface:
  1. Click the name of the site for which you want to update robots.txt.
  2. Click the “Interfaces” tab on the left side.
  3. Click the hamburger menu for the interface for which you want to update robots.txt.
  4. Click “Update robots.txt file”.
  5. If you add lines, make sure each line ends with “# CUSTOM”. Any line that does not end with this marker is deleted automatically. Do not add lines ending with “# ADDED BY HMS”; these are default entries that are already included in the robots.txt file and cannot be edited.
  6. After making changes, click “Submit” to save them.
  7. To view the current robots.txt file on the interface, either use the “Download robots.txt” button in the robots.txt editing dialog window or access the file directly at the URL: https://<interface_name>.custhelp.com/robots.txt.
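
Before pasting new lines into the editing dialog, it can help to check them against the marker rules described in the procedure above. The following is a minimal sketch in Python, not an Oracle tool: the function names check_custom_lines and robots_txt_url are hypothetical, and the sketch assumes that blank lines can be ignored, which the procedure does not state.

```python
CUSTOM_MARKER = "# CUSTOM"
DEFAULT_MARKER = "# ADDED BY HMS"


def check_custom_lines(lines):
    """Return (line_number, problem) tuples for lines that break the marker rules."""
    problems = []
    for number, line in enumerate(lines, start=1):
        text = line.rstrip()
        if not text:
            # Assumption: blank lines are ignored by this check.
            continue
        if text.endswith(DEFAULT_MARKER):
            problems.append((number, "default entry, do not add"))
        elif not text.endswith(CUSTOM_MARKER):
            problems.append((number, "missing '# CUSTOM', would be deleted"))
    return problems


def robots_txt_url(interface_name):
    """Build the robots.txt URL for an interface, following the pattern in step 7."""
    return f"https://{interface_name}.custhelp.com/robots.txt"


# Hypothetical lines, for illustration only.
new_lines = [
    "User-agent: ExampleBot # CUSTOM",
    "Allow: /app/answers/",
    "Disallow: /ci/ # ADDED BY HMS",
]
for number, problem in check_custom_lines(new_lines):
    print(f"line {number}: {problem}")
print(robots_txt_url("example"))
```

Here the second line would be reported because it lacks the “# CUSTOM” marker, and the third because it ends with the marker reserved for default entries.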