Why do special characters appear as raw HTML entities, such as &amp;, instead of the intended character?
Knowledge Advanced (all versions)
Special characters appear in search results or elsewhere as HTML entities rather than as the intended characters.
HTML entities, such as "&amp;" for "&" and "&eacute;" for "é", are not parsed when included in plain text fields. Enter these characters directly when creating or editing content.
The Knowledge Advanced REST API does not parse these values, as a security precaution against execution of arbitrary code. Before importing articles that contain special characters, convert any remaining entity-encoded characters to the native encoding of the import request. For example, PHP has the built-in function html_entity_decode, which may be useful in preparing content for import.
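As an illustration of this kind of pre-import cleanup, here is a minimal sketch in Python, where the standard library function html.unescape plays a role similar to PHP's html_entity_decode. The article dictionary and its field names ("title", "summary") are hypothetical and are not taken from the Knowledge Advanced API; adapt them to whatever structure your import request actually uses.

```python
import html


def decode_entities(text: str) -> str:
    """Convert HTML entities such as &amp; or &eacute; to literal characters."""
    return html.unescape(text)


# Hypothetical article payload; the field names are illustrative only.
article = {
    "title": "Caf&eacute; hours &amp; locations",
    "summary": "Open 9&ndash;5",
}

# Decode every plain text field before building the import request.
cleaned = {field: decode_entities(value) for field, value in article.items()}

print(cleaned["title"])
```

Note that html.unescape decodes one level only, so a double-encoded value such as "&amp;amp;" becomes "&amp;" and would need a second pass. Make sure the decoded text is then sent in the character encoding the import request declares (for example UTF-8).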
A regular expression that may be helpful in identifying where these characters exist is: &[^\s]*;
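The following sketch shows one way to apply the expression above to a piece of content, using Python's re module. The helper name find_entities and the sample text are hypothetical. Because the pattern is greedy and allows any non-whitespace characters, adjacent entities are reported as a single match; the stricter pattern shown alongside it is an assumption of this example, not part of the product documentation.

```python
import re

# Pattern quoted in the article: "&", any run of non-whitespace, then ";".
ENTITY_PATTERN = re.compile(r"&[^\s]*;")

# Stricter alternative (an assumption, not from the article): named,
# decimal, or hexadecimal entities only.
STRICT_ENTITY_PATTERN = re.compile(
    r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);"
)


def find_entities(text, pattern=ENTITY_PATTERN):
    """Return (offset, matched_text) pairs for each suspected entity."""
    return [(m.start(), m.group()) for m in pattern.finditer(text)]


sample = "Caf&eacute; hours &amp; locations"
for offset, entity in find_entities(sample):
    print(offset, entity)
```

Running either pattern over exported article text before import gives a list of offsets to review or decode.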