How to use Bayesian Machine Learning Classification?

"Naive Bayes" classifiers are based on Bayes' theorem and can be used to create effective classification models for both structured and unstructured (free-text) data.

The Viabl Platform optionally enables the use of Bayesian Classification models, trained on a set of pre-classified cases, to classify new unseen cases. These models can also incrementally "learn" dynamically (at runtime) from further pre-classified cases to improve their performance.
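To make the idea of training and incremental learning concrete, here is a minimal sketch of a multinomial Naive Bayes classifier with add-one smoothing. It is not the Viabl Platform's implementation; the class name `TinyNaiveBayes`, its methods, and the example categories are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict


class TinyNaiveBayes:
    """Illustrative multinomial Naive Bayes (hypothetical, not Viabl's code)."""

    def __init__(self):
        self.category_counts = Counter()         # cases seen per category
        self.word_counts = defaultdict(Counter)  # word frequencies per category
        self.vocabulary = set()                  # every word seen so far

    def learn(self, words, category):
        """Update the counts from one pre-classified case."""
        self.category_counts[category] += 1
        self.word_counts[category].update(words)
        self.vocabulary.update(words)

    def classify(self, words):
        """Return the category with the highest log-posterior score."""
        total_cases = sum(self.category_counts.values())
        best_category, best_score = None, -math.inf
        for category, n_cases in self.category_counts.items():
            score = math.log(n_cases / total_cases)  # log prior
            total_words = sum(self.word_counts[category].values())
            denominator = total_words + len(self.vocabulary)
            for word in words:
                # Add-one smoothing keeps unseen words from zeroing the score.
                count = self.word_counts[category][word]
                score += math.log((count + 1) / denominator)
            if score > best_score:
                best_category, best_score = category, score
        return best_category


model = TinyNaiveBayes()
model.learn(["engine", "noise", "car"], "Motor")
model.learn(["roof", "leak", "house"], "Property")
print(model.classify(["car", "engine"]))

# Incremental learning: a further pre-classified case only updates the counts.
model.learn(["car", "crash"], "Motor")
print(model.classify(["crash"]))
```

Because the model is just a set of counts, learning from an additional case at runtime does not require retraining on the earlier cases.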

If your Viabl Platform license supports this option, you will have been provided with the software components needed to install your Viabl Platform Bayesian Server.

Once your Viabl Platform Bayesian Server has been installed, your Viabl Platform administrator can add this Server to your User Group to give you access to it.

Using the Bayesian Server Management Portal

The Bayesian Server Management Portal can be accessed via the main burger menu.

You can train your Bayesian Classifier by uploading a CSV file containing the training cases. Alternatively, you can create an empty domain and use the Incremental Bayesian Learning option to train your Classifier.
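As a rough illustration of what a training file might look like, the sketch below builds a small CSV with one free-text field and one classification field. The exact column layout the Portal expects is not described on this page, so the column names `description` and `category`, the example rows, and the helper `build_training_csv` are assumptions.

```python
import csv
import io

# Hypothetical training cases: one free-text field plus the classification field.
training_cases = [
    {"description": "Engine makes a rattling noise", "category": "Motor"},
    {"description": "Water leaking through the roof", "category": "Property"},
]


def build_training_csv(cases):
    """Serialise pre-classified cases to CSV text with a header row."""
    if not cases:
        return ""
    buffer = io.StringIO()
    # Column order is taken from the first case (an assumption for this sketch).
    writer = csv.DictWriter(buffer, fieldnames=list(cases[0].keys()))
    writer.writeheader()
    writer.writerows(cases)
    return buffer.getvalue()


csv_text = build_training_csv(training_cases)
print(csv_text)
```

Using the `csv` module rather than joining strings by hand ensures that free-text values containing commas or quotes are escaped correctly.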

Training can be based on either "Structured" or "Unstructured" (free-text) data. Structured domains are trained using multiple fields of any type (Numeric, List, Free-Text) plus the Classification (category) field. Unstructured domains consist of only one free-text field plus the Classification field. The following optional settings can be applied to free-text fields to improve the Bayesian model's classification accuracy:

  • Stop Words: these words are removed from the text before processing. The list of stop words can be changed or replaced (e.g. for another language).
  • Word Extraction: allows specified phrases to be extracted from the text using a regular expression (e.g. the part of a postcode that represents an area).
  • Minimum Word Length: allows short words to be ignored.
  • Word Stemming: individual English words can be stemmed (using the Porter Stemmer algorithm) to remove the more common morphological and inflexional endings. This should only be used for English-language text.

Text letter-casing is ignored, so for example "Car" and "CAR" are treated as the same phrase.
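The free-text settings above can be pictured as a small preprocessing pipeline. The following is a sketch only: the function `preprocess`, its parameter names, and the order in which the steps are applied are assumptions, not the Bayesian Server's actual behaviour. Stemming is left as an optional callable because a Porter stemmer is not part of the Python standard library (NLTK's `PorterStemmer().stem` is one option that could be passed in).

```python
import re


def preprocess(text, stop_words=frozenset(), min_length=1,
               extract_pattern=None, stem=None):
    """Turn free text into a list of lower-case tokens (illustrative only)."""
    tokens = []
    # Word Extraction: pull out phrases matching the regular expression first,
    # then blank them out so they are not tokenised a second time.
    if extract_pattern is not None:
        tokens.extend(m.group(0).lower()
                      for m in re.finditer(extract_pattern, text))
        text = re.sub(extract_pattern, " ", text)
    # Letter-casing is ignored: everything is lower-cased before tokenising.
    for word in re.findall(r"[a-z0-9']+", text.lower()):
        if word in stop_words:        # Stop Words
            continue
        if len(word) < min_length:    # Minimum Word Length
            continue
        tokens.append(stem(word) if stem else word)  # optional Word Stemming
    return tokens


# Hypothetical settings: a tiny stop-word list and a pattern for the first
# part of a UK-style postcode.
example = preprocess(
    "The CAR was stolen in SW1A 2AA",
    stop_words={"the", "was", "in"},
    min_length=3,
    extract_pattern=r"\b[A-Z]{1,2}\d[A-Z\d]?\b",
)
print(example)
```

With these settings, "Car" and "CAR" both become the token `car`, which mirrors the case-insensitive matching described above.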

Once a Bayesian domain has been created (with or without training), you can use the Use Bayesian Classifier Tool and the Incremental Bayesian Learning Tool against this domain.

The Bayesian Domain maintains a record of the number and percentage of cases that have been processed, broken down by category (Classification).
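A per-category breakdown of this kind can be computed from the list of classifications alone. The sketch below shows one way to do it; the function name `category_breakdown` and the example categories are hypothetical, and this is not how the Portal itself stores or reports the figures.

```python
from collections import Counter


def category_breakdown(categories):
    """Map each category to (number of cases, percentage of all cases)."""
    counts = Counter(categories)
    total = sum(counts.values())
    return {
        category: (n, round(100 * n / total, 1))
        for category, n in counts.items()
    }


processed = ["Motor", "Motor", "Property", "Motor"]
for category, (n, pct) in category_breakdown(processed).items():
    print(f"{category}: {n} cases ({pct}%)")
```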

Bayesian models can also be created, trained and updated via the Viabl Platform Bayesian Server API.
