Your custom AI solution will require training data. Training data is used to "teach" AI models how to understand the world. A good set of training data will result in a model that will make accurate predictions on data in real-world scenarios. There are two basic considerations to keep in mind when building a training dataset: quantity and quality.
When it comes to training AI models, more training data is generally better. More data means more examples from your model to learn from, and helps you improve the accuracy of your model.
This is one of the most common questions that comes up when building a new model. Unfortunately, there are no hard and fast rules about the number of inputs that will be required for your particular use case. But as a general rule, if you are training a custom model on top of a Clarifai Model, you will need much less training data (typically tens to tens-of-thousands of inputs), than if you are building a "deep trained model" (typically thousands to millions of inputs).
Bias occurs when the scope of your training data is too narrow. If you only see green apples, you’ll assume that all apples are green and think red apples were another kind of fruit. If the training data contains only a small number of examples, it’ll react accordingly, taking it as truth. Small datasets make for a smaller worldview.
Models that perform well tend to be trained on data that is unique and photographed in a consistent way.
For best results, train your model with data that:
Adheres to concept descriptions laid out in a taxonomy
Represents the reality of the use-case
Has visually noticeable qualities - something that is not too subtle for humans to pick up on AND something that can be picked up through the noise of a photo.
Models that tend to perform poorly:
Trained on data with inconsistent compositions
Photos require outside context (relationships to people in portraits, etc)
Subject matter is subtle. Keep in mind, the model has no concept of language, so in essence, “what you see is what you get”.
Training set is cast too wide. If you train a concept of too many different kinds of images, and they are all visually different, the training set will become noisy. This will make it difficult for the model to find the visually distinct qualities to learn from, resulting in high levels of "variance".
When labeling an image, try to avoid labeling what is "in" the image, instead you will get better results if you label what the photo is "of". In cases where there are multiple objects in a scene, use a detector model, and label the detected regions separately.