Semantic Object Classification Project Walkthrough
The Semantic Object Classification (or ‘SOC’) Project in Classify is a powerful feature that comes with Fluree Sense and lets users tag, label, or classify their Data as a specific attribute. It implements classical machine learning problems such as predicting a specific disease from input features; classifying materials or products based on their names, models, descriptions and other such features; predicting whether a loan or credit card will become delinquent; and detecting fraud.
In this video walkthrough we will understand the importance of the SOC project, set up its prerequisites using best practices, run the model and then train it further. By the end, we hope to have a fairly accurate ML model that shows improvement after each round of training. So let’s get started!
1. Understanding the SOC Project & Reviewing Available Data
We’ll take a real-world use case and analyze the defined objective of the project, as well as the prerequisite Data needed to execute it.
2. Creating and Running an SOC Project
We will create this SOC project live using the available Training and Test (or ‘Project’) Data. For the best results, aim for an 80:20 split between Training and Test Data. Once the project is created, we’ll run it.
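To make the 80:20 split concrete, here is a minimal Python sketch of how a dataset might be divided before it is loaded into a project. It is only an illustration: the function name `split_train_test`, the fixed seed and the sample product records are assumptions made for this example and are not part of Fluree Sense.

```python
import random


def split_train_test(records, train_fraction=0.8, seed=42):
    """Shuffle records and split them into (training, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(records)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for a repeatable split
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Hypothetical product records, purely for illustration.
products = [{"id": i, "name": f"product-{i}"} for i in range(100)]
training, test = split_train_test(products)
print(len(training), len(test))  # 80 20
```

Shuffling before the cut helps avoid a split that simply follows the order in which the records were exported, which could otherwise leave whole categories out of one of the two sets.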
3. Reviewing the First Round (Unsupervised Run) Results and Completing Training Tasks
The first run is known as an Unsupervised run because it involves no manual, feedback-based training. It generates training tasks, which project users complete by providing feedback on the generated predictions. This training is then used to boost the model accuracy in subsequent runs.
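The feedback loop described above can be pictured with a small Python sketch. This walkthrough does not describe how Fluree Sense actually chooses its training tasks, so the sketch assumes one common approach: predictions below a confidence threshold are queued for review, and each reviewer answer becomes a new labelled example. The function names, the 0.8 threshold and the sample records are all hypothetical.

```python
def select_training_tasks(predictions, threshold=0.8):
    """Return predictions below the confidence threshold, least confident first."""
    uncertain = [p for p in predictions if p["confidence"] < threshold]
    return sorted(uncertain, key=lambda p: p["confidence"])


def apply_feedback(task, correct_label):
    """Turn a reviewer's answer into a labelled example for the next run."""
    return {"record": task["record"], "label": correct_label}


# Hypothetical first-run predictions, purely for illustration.
predictions = [
    {"record": "steel bolt M8", "label": "Fasteners", "confidence": 0.93},
    {"record": "copper wire 2mm", "label": "Fasteners", "confidence": 0.41},
    {"record": "hex nut M8", "label": "Fasteners", "confidence": 0.72},
]

tasks = select_training_tasks(predictions)
# The reviewer corrects the least confident prediction.
new_examples = [apply_feedback(tasks[0], "Electrical")]
```

Reviewing the least confident predictions first is a design choice in this sketch: those are the cases where a correction is most likely to change what the model learns.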
4. Re-running the Project after Training (Supervised Run) & Reviewing the Results
Once all training tasks are complete, we run the project again in Supervised mode, hoping to see an improvement in model confidence.
Now that we have completed these steps, let’s look at the improvement in model confidence after the first Supervised run, as promised in step 4. As we can see, the confidence has improved from around 75% to 87%. So following the steps in a Data Classification (or SOC) project is likely to bring users to a decent level of model confidence, which can be improved further in subsequent runs. If you’d like to learn more about classification projects, refer to our posts on them, starting here
