The Viabl.ai Bayesian Classification Server is a multi-tenanted server providing a Machine Learning classifier, based on the Naïve Bayesian algorithm, for both structured and unstructured data.
The following section describes the Bayesian server API.
The process of using the Bayesian Classifier is detailed in the following steps:
The categorization process is further enhanced by the adoption of some additional text pre-processing steps:
The classifier provides a REST-based API which is called via the HTTP POST method. The operation to perform is specified by the URL path, and the BODY contains a stringified JSON object. The port number on which the classifier listens is defined in the settings.json file.
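As a rough illustration of this calling convention, the sketch below builds and sends a POST request for an arbitrary operation. The host `localhost` and port `3000` are placeholders (the real port comes from settings.json), the helper names `buildRequest` and `callClassifier` are hypothetical, and the `Content-Type` header is an assumption rather than a documented requirement. It relies on the global `fetch` available in Node.js 18+ and in browsers.

```javascript
// Assumed server address: replace with your own host and the port
// defined in settings.json.
const BASE_URL = "http://localhost:3000";

// Build the request: the operation is the URL path and the payload is
// sent as a stringified JSON object in the body.
function buildRequest(path, payload) {
  return {
    url: `${BASE_URL}${path}`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" }, // assumed header
      body: JSON.stringify(payload),
    },
  };
}

// Send the request and parse the reply as JSON when a body is present.
// Whether every operation returns a body is not stated, so an empty
// reply is mapped to null.
async function callClassifier(path, payload, fetchImpl = fetch) {
  const { url, options } = buildRequest(path, payload);
  const response = await fetchImpl(url, options);
  if (!response.ok) {
    throw new Error(`${path} failed with HTTP ${response.status}`);
  }
  const text = await response.text();
  return text ? JSON.parse(text) : null;
}
```

A call such as `callClassifier("/list_domains", {}).then((reply) => console.log(reply.data));` would then print the domain names, assuming the server is reachable at the placeholder address.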
URL path: /add_domain
Add a new categorization domain. The domain is used to collect and classify a group of related texts, for example “spam” or “credit_authorization”.
BODY message:
{
"domain": "the name of the domain to add"
}
URL path: /remove_domain
Remove a previously added domain. Note: all training data for the domain will be lost.
BODY message:
{
"domain": "the name of the domain to remove"
}
URL path: /list_domains
Retrieve the list of existing domains.
BODY message:
{}
Return message:
{
"data": [
"domain name 1",
"domain name 2"
]
}
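The domain operations can be combined so that a domain is only added when it is not already listed. The following is a minimal sketch, not the server's own client: `ensureDomain` and `post` are hypothetical names, the address `http://localhost:3000` and the `Content-Type` header are assumptions, and the behaviour of /add_domain for an existing domain is not documented here, which is why the check is done first.

```javascript
// Assumed server address; the real port is defined in settings.json.
const BASE_URL = "http://localhost:3000";

// Hypothetical helper: POST a stringified JSON body to the given URL path.
async function post(path, payload) {
  const response = await fetch(`${BASE_URL}${path}`, {
    method: "POST",
    headers: { "Content-Type": "application/json" }, // assumed header
    body: JSON.stringify(payload),
  });
  const text = await response.text();
  return text ? JSON.parse(text) : null;
}

// Add `domain` only if /list_domains does not already report it.
// Resolves to true when the domain was added, false when it already existed.
async function ensureDomain(domain, send = post) {
  const reply = await send("/list_domains", {});
  const existing = reply && Array.isArray(reply.data) ? reply.data : [];
  if (existing.includes(domain)) {
    return false;
  }
  await send("/add_domain", { domain });
  return true;
}
```

The `send` parameter exists only so the function can be exercised with a stub instead of a live server.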
URL path: /list_categories
Retrieve the full list of categories for a (trained) domain. The return message also provides the record count of each category.
BODY message:
{
"domain": "the name of the domain to retrieve the categories of"
}
Return message:
{
"data": [
{
"name": "pass",
"count": 2400
},
{
"name": "fail",
"count": 60
}
]
}
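One possible use of the record counts is to check how evenly the training data is spread across categories. The sketch below computes each category's share of the total from a reply shaped like the return message above; `categoryShares` is a hypothetical helper name and is not part of the server API.

```javascript
// Compute each category's share of all training records from a
// /list_categories reply of the form { data: [{ name, count }, ...] }.
function categoryShares(reply) {
  const categories = (reply && reply.data) || [];
  const total = categories.reduce((sum, c) => sum + c.count, 0);
  return categories.map((c) => ({
    name: c.name,
    count: c.count,
    share: total > 0 ? c.count / total : 0, // avoid dividing by zero
  }));
}

// Reply copied from the example return message.
const exampleReply = {
  data: [
    { name: "pass", count: 2400 },
    { name: "fail", count: 60 },
  ],
};

for (const { name, share } of categoryShares(exampleReply)) {
  console.log(`${name}: ${share.toFixed(3)}`);
}
// pass: 0.976
// fail: 0.024
```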
URL path: /train
Train the classifier (for a specified domain) on a single text string.
BODY message:
{
"domain": "the name of the domain to train",
"words": "the text to learn from",
"category": "the category associated with this text"
}
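Because /train accepts a single text string per call, training on a set of labelled texts means issuing one request per text. A minimal sketch of that loop follows; `buildTrainBodies` and `trainAll` are hypothetical names, the sample shape `{ text, category }` is an assumption, and `send(path, body)` stands for whatever function POSTs the stringified body to the server.

```javascript
// Turn labelled samples into one /train body per text.
function buildTrainBodies(domain, samples) {
  return samples.map(({ text, category }) => ({
    domain,
    words: text,
    category,
  }));
}

// Send the bodies one after another so each call finishes before the
// next starts. `send(path, body)` is a hypothetical POST helper.
async function trainAll(domain, samples, send) {
  const bodies = buildTrainBodies(domain, samples);
  for (const body of bodies) {
    await send("/train", body);
  }
  return bodies.length; // number of training calls made
}
```

For example, `trainAll("spam", [{ text: "win a free prize", category: "spam" }], send)` would issue a single /train call. Sequential sending is a conservative choice here; whether the server handles concurrent training calls for one domain is not stated.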
URL path: /write_metadata
Store user-supplied data against a specified domain.
BODY message:
``` javascript
{
"domain": "the name of the domain to associate the data with",
"data": user-supplied data of any type (object, array etc)
}
```
### Retrieve metadata for a domain
URL path: /read_metadata
Retrieve the (previously supplied) data for a specified domain.
BODY message:
``` javascript
{
"domain": "the name of the domain to retrieve data for"
}
```
Return message:
{
"data": user-supplied data of any type (object, array etc)
}
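The two metadata operations can be used together to store a value and read it back. The sketch below assumes a `send(path, body)` function that POSTs the stringified body and resolves with the parsed JSON reply; both `send` and `saveAndReloadMetadata` are hypothetical names, and the metadata object in the usage note is invented purely for illustration.

```javascript
// Store `data` against `domain`, then read it back and return what the
// server reports. `send(path, body)` is a hypothetical POST helper.
async function saveAndReloadMetadata(domain, data, send) {
  await send("/write_metadata", { domain, data });
  const reply = await send("/read_metadata", { domain });
  return reply ? reply.data : undefined;
}
```

A call such as `saveAndReloadMetadata("spam", { owner: "support-team" }, send)` would be expected to resolve with the same object that was written, assuming the server stores the data unchanged.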
URL path: /classify
Perform classification on an unclassified text string (for a specified domain).
BODY message:
{
"domain": "the name of the domain to classify against",
"words": "the text to classify"
}
Example return message:
{
"results": [
{
"category": "reject",
"posterior": 1
},
{
"category": "accept",
"posterior": 0.71
}
]
}
Note: The number of categories returned is limited to a maximum of 10. In addition, any category with a posterior (classification certainty) below 0.005 is not returned.
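A client will usually want the single most likely category from the results. The following sketch picks it without assuming the results arrive in any particular order, since ordering is not stated here. `bestCategory` is a hypothetical helper, and `minPosterior` is a client-side threshold invented for this example, not a server setting.

```javascript
// Pick the result with the highest posterior from a /classify reply.
// Returns null when there are no results (for instance, if the server
// returned no category at all) or when the best posterior is below
// the caller's own threshold.
function bestCategory(reply, minPosterior = 0) {
  const results = (reply && reply.results) || [];
  let best = null;
  for (const result of results) {
    if (best === null || result.posterior > best.posterior) {
      best = result;
    }
  }
  return best !== null && best.posterior >= minPosterior ? best : null;
}

// Reply copied from the example return message.
const exampleReply = {
  results: [
    { category: "reject", posterior: 1 },
    { category: "accept", posterior: 0.71 },
  ],
};

console.log(bestCategory(exampleReply).category); // reject
```

Returning null rather than throwing lets the caller decide how to treat texts that the classifier could not place with enough certainty.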