How does the Bayesian learning system work within Barracuda?
Incoming Email, Spam Filtering, Barracuda
Oracle B2C Service
Barracuda Bayesian Learning is a linguistic algorithm that profiles language used in both spam messages and legitimate email for any particular user or organization. To determine the likelihood that a new email is spam, Bayesian Analysis compares the words and phrases used in the new email against the corpus of previously identified email. Note that Bayesian training works only on messages with 11 words or more. To effectively "train" your Bayesian database, you must do the following:
- Classify 200 spam messages and 200 not spam messages from your Quarantine Inbox. This trains the Bayesian database to recognize the word and phrase patterns, some of which may appear multiple times throughout a message, that you consider valid content or characteristic of spam. Messages can be marked as spam or not spam from the QUARANTINE > Quarantine Inbox page. See your administrator for details.
- Use the Reset button on a regular basis to clear out old classifications of valid email versus spam. Spam tactics change rapidly, and the word and phrase patterns that appear in spam messages change over time. By resetting your Bayesian database regularly and classifying 200 spam and 200 not spam messages anew, you keep the database current and give it the best chance of identifying spam accurately.
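The general idea behind word-based Bayesian scoring can be illustrated with a small sketch. This is not Barracuda's engine: the class name `ToyBayesFilter`, the tokenizer, and the Laplace-smoothed naive Bayes formula are all assumptions chosen for illustration. Only the 11-word minimum, the spam / not spam labels, and the idea of a reset come from the description above.

```python
import math
import re
from collections import Counter

MIN_WORDS = 11  # per the article: training only applies to messages with 11+ words


class ToyBayesFilter:
    """Illustrative word-frequency Bayesian filter (hypothetical, not Barracuda's)."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Similar in spirit to the Reset button: discard all past classifications.
        self.word_counts = {"spam": Counter(), "not_spam": Counter()}
        self.message_counts = {"spam": 0, "not_spam": 0}

    @staticmethod
    def tokenize(text):
        # Assumed tokenizer: lowercase runs of letters, digits, and apostrophes.
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, text, label):
        """Record a classified message; return False if it is too short to use."""
        words = self.tokenize(text)
        if len(words) < MIN_WORDS:
            return False
        self.word_counts[label].update(words)
        self.message_counts[label] += 1
        return True

    def spam_probability(self, text):
        """Return an estimate in [0, 1]; 0.5 means no evidence either way."""
        if min(self.message_counts.values()) == 0:
            return 0.5  # needs at least one example of each class
        vocab = len(set(self.word_counts["spam"]) | set(self.word_counts["not_spam"]))
        # Start from the prior odds, then add per-word log-likelihood ratios.
        log_odds = math.log(self.message_counts["spam"] / self.message_counts["not_spam"])
        for label, sign in (("spam", 1), ("not_spam", -1)):
            total = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                # Laplace smoothing keeps unseen words from zeroing the estimate.
                log_odds += sign * math.log((self.word_counts[label][word] + 1) / (total + vocab))
        log_odds = max(-700.0, min(700.0, log_odds))  # avoid overflow in exp()
        return 1 / (1 + math.exp(-log_odds))
```

Training this toy filter on one long spam message and one long legitimate message is enough to push a new message containing words from the spam example above 0.5, while calling `reset()` returns every estimate to 0.5 until the filter is trained again. That mirrors why a freshly reset database needs a balanced set of new classifications before it is useful.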
Bayesian Database Backup: To help you get the most out of your Bayesian learning system, each account can back up its past rules. This allows the account owner to update their rules without fear of losing all past decisions. It also protects the account owner in the event of a corrupted Bayesian database. These options are located on the Preferences > Spam Settings page and are as follows:
- Backup Bayesian Database - Allows you to download a copy of your personal Bayesian database.
- Restore Database - Allows you to upload a copy of a saved Bayesian database. The uploaded copy does not have to originate from the Barracuda Email Security Gateway or from the user's database.
Bayesian Poisoning: Some spammers will insert content in messages intended to bypass spam rules, such as excerpts of text from books or other content that may look "legitimate" in order to fool spam filtering algorithms. This tactic is called Bayesian Poisoning and could reduce the effectiveness of a Bayesian database if many of these messages are marked as either spam or not spam. The Barracuda Networks Bayesian engine is, however, very sophisticated and protects against Bayesian Poisoning if administrators or users consistently maintain their databases.
Individual Emails: You can view the Bayesian Score for each individual incident by viewing the email headers for that incident. In order to view the headers, they must be enabled within your Oracle B2C Service application. For more information on enabling email headers, refer to Answer ID 1595: Enabling Email Headers.
When the headers are enabled, click the envelope icon from the incident thread to view the headers for that email. The Barracuda Bayesian score is typically listed towards the bottom of the pop-up along with the Spam Status, which compares the email score to the tag score and quarantine score. For example:
X-Barracuda-Bayes: SPAM USER 0.9966 1.000 4.3029
X-Barracuda-Spam-Status: No, SCORE=1.58 using per-user scores of TAG_LEVEL=3.5 QUARANTINE_LEVEL=7.5 KILL_LEVEL=1000.0 tests=CN_BODY_332, HTML_TAG_BALANCE_BODY, HTML_TAG_EXIST_TBODY, THREAD_INDEX, THREAD_TOPIC
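As a rough illustration of how the Spam Status line relates the score to the configured levels, the following sketch parses a header value shaped like the example above. The function names `parse_spam_status` and `classify` are hypothetical, the regular expression is inferred only from this one example rather than from a documented format, and the action names and the "at or above the level" comparison are assumptions.

```python
import re

# Field names as they appear in the example header above.
_NUMERIC_FIELDS = r"\b(SCORE|TAG_LEVEL|QUARANTINE_LEVEL|KILL_LEVEL)=([-+]?\d+(?:\.\d+)?)"


def parse_spam_status(value):
    """Parse the value part of an X-Barracuda-Spam-Status header."""
    parsed = {"verdict": value.split(",", 1)[0].strip()}
    for name, number in re.findall(_NUMERIC_FIELDS, value):
        parsed[name] = float(number)
    tests = re.search(r"tests=(.*)$", value)
    parsed["tests"] = [t.strip() for t in tests.group(1).split(",")] if tests else []
    return parsed


def classify(parsed):
    """Compare the score to each level, highest first (assumed semantics)."""
    score = parsed["SCORE"]
    if score >= parsed["KILL_LEVEL"]:
        return "block"
    if score >= parsed["QUARANTINE_LEVEL"]:
        return "quarantine"
    if score >= parsed["TAG_LEVEL"]:
        return "tag"
    return "allow"


example = (
    "No, SCORE=1.58 using per-user scores of TAG_LEVEL=3.5 "
    "QUARANTINE_LEVEL=7.5 KILL_LEVEL=1000.0 tests=CN_BODY_332, "
    "HTML_TAG_BALANCE_BODY, HTML_TAG_EXIST_TBODY, THREAD_INDEX, THREAD_TOPIC"
)
status = parse_spam_status(example)
```

For the example header, the score of 1.58 is below the tag level of 3.5, so `classify(status)` falls through every threshold, which is consistent with the "No" verdict at the start of the header.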
NOTE: Spam in recent years has become more of a "moving target". Modern spam campaigns have become shorter and more targeted, so a Bayesian database requires constant maintenance through the frequent addition of similar numbers of Spam and Not Spam messages. With other recent feature enhancements, a Bayesian database is not crucial to effectively blocking spam messages to your email server.