Custom training for Categorization

Imagga’s powerful auto categorization API makes it possible to conveniently organize photos into predefined categories. This gives end users a much better experience and lets them navigate their photo collections easily. Dealing with user-generated content can be challenging, and custom auto categorization helps you file every single image properly.

Custom training for auto categorization is a perfect match for businesses that need to quickly organize huge amounts of image content into a sophisticated list of categories. Even if you don’t need categorization for the user-facing side of your project, you can use it to extract insights about what type (category) of content is going through your platform.

See how Imagga helped to categorize 25 million hotel photos by building a custom trained classifier for Tavisca.

What custom training is and how it works

Custom training allows you to tweak Imagga’s auto categorization technology for a very specific set of image categories. To do this, we need the structure (preferably a flat list) of your categories and, for each category, a list of public image URLs or an archive of images that belong to it. Ideally we need 500-1000+ representative images for each category. It’s essential that you design the structure of your categories in a way that keeps them distinct and non-overlapping.
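One quick sanity check before submitting data is to confirm that no image is filed under two categories, since shared samples are a sign that categories overlap. The following is a minimal sketch of such a check. The manifest layout (a mapping of category name to image URLs), the function name and the example URLs are assumptions made for illustration, not a format Imagga prescribes.

```python
from collections import defaultdict


def find_overlaps(manifest):
    """Return {url: [categories]} for every URL listed under more than one category.

    `manifest` is a hypothetical layout: {category_name: [image_url, ...]}.
    """
    seen = defaultdict(set)
    for category, urls in manifest.items():
        for url in urls:
            seen[url.strip()].add(category)
    return {url: sorted(cats) for url, cats in seen.items() if len(cats) > 1}


# Illustrative data only; real categories would hold hundreds of URLs each.
manifest = {
    "bedroom": ["https://example.com/img/1.jpg", "https://example.com/img/2.jpg"],
    "bathroom": ["https://example.com/img/3.jpg", "https://example.com/img/2.jpg"],
    "lobby": ["https://example.com/img/4.jpg"],
}

overlaps = find_overlaps(manifest)
for url, categories in overlaps.items():
    print(f"{url} appears in: {', '.join(categories)}")
```

A URL that shows up in several categories is either a filing mistake or a hint that two categories should be merged or redefined.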

After we get your category structure and the sample photos for each category, we do an expert validation of the quality of the input data to make sure you will obtain the best possible results.

The training itself is managed by our machine learning experts, who tune the learning parameters for the given task. Depending on the number of categories and sample photos, the whole process may take up to 5 business days to finish.

What results to expect

We aim to achieve the highest possible precision rates through several iterations of tweaking the parameters of the learning process. As some photo concepts and image compositions can be very complex even for the human brain to categorize, it’s no surprise that our technology can’t guarantee a 100% precision rate. However, we very often reach the 85-95% range, depending on the complexity of the task. After reviewing the training data set, we typically talk with the client and agree on an acceptable precision rate that ensures the client’s business goals are met.
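The text does not spell out how the precision rate is measured. As an illustration only, the sketch below uses the standard per-category definition: of all images the categorizer assigned to a category, the share that truly belong to it. The function name and the validation pairs are made up for the example.

```python
from collections import Counter


def precision_per_category(pairs):
    """Compute precision per predicted category.

    `pairs` is an iterable of (true_label, predicted_label) tuples taken
    from a held-out validation set (hypothetical input format).
    """
    predicted = Counter()
    correct = Counter()
    for true_label, predicted_label in pairs:
        predicted[predicted_label] += 1
        if true_label == predicted_label:
            correct[predicted_label] += 1
    return {label: correct[label] / predicted[label] for label in predicted}


# Made-up validation results, not real measurements.
validation = [
    ("bedroom", "bedroom"),
    ("bedroom", "bedroom"),
    ("bathroom", "bedroom"),
    ("bathroom", "bathroom"),
    ("lobby", "lobby"),
    ("lobby", "bedroom"),
]

precision = precision_per_category(validation)
for label, value in precision.items():
    print(f"{label}: {value:.0%}")
```

Looking at precision per category rather than as a single number makes it easier to spot which categories attract wrongly filed images.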

Once the training is completed, the custom categorizer is accessible via our RESTful API. You get exclusive access to it, tied to your API key and the specific identifier of the newly trained categorizer.
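A request to a custom categorizer could be assembled roughly as follows. This is a sketch under stated assumptions, not the documented API: the base URL is a placeholder, and the path layout, the `image_url` parameter, the use of HTTP Basic authentication with a key and secret, and the categorizer identifier are all hypothetical. Check the official API documentation for the real values. The code only builds the request and does not send it.

```python
import base64
from urllib.parse import quote, urlencode
from urllib.request import Request

# Placeholder host and path layout (assumptions, not the real endpoint).
API_BASE = "https://api.example.com/v2"


def build_categorization_request(api_key, api_secret, categorizer_id, image_url):
    """Build (but do not send) a GET request for one image and one categorizer."""
    query = urlencode({"image_url": image_url})  # parameter name is an assumption
    url = f"{API_BASE}/categories/{quote(categorizer_id, safe='')}?{query}"
    # HTTP Basic auth with key and secret is assumed here for illustration.
    token = base64.b64encode(f"{api_key}:{api_secret}".encode("utf-8")).decode("ascii")
    return Request(url, headers={"Authorization": f"Basic {token}"})


request = build_categorization_request(
    "my_key", "my_secret", "hotel_photos_v1", "https://example.com/room.jpg"
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would perform the call once the placeholder values are replaced with real ones.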

How much does it cost

We charge $300 upfront to review the input data and run the custom training process. Once the training is completed, we share the validation statistics and give you test access via the API so you can evaluate it yourself within a week. If the results meet your needs and you want to go into production, we charge a pre-agreed success fee starting from $899. The success fee varies depending on the complexity of the training task. You will receive a quote before we start working on your case.

Note that after the custom categorization is successfully implemented you need to be subscribed to one of our API subscription plans (based on the volume of categorization API calls you need per month) in order to commercially use it.

Discount options

We work to provide state-of-the-art image recognition technologies. As you may know, this is not just code: it is all based on smart algorithms that work like a brain you are trying to teach. This is the training part, where we need your support. One of the hardest tasks in image recognition is having enough good data to feed this brain.

As a reward for helping us learn from your data, we can offer discounts starting from 20% of the training price. If you allow us, we will store the image samples and continue to use them solely for the purpose of improving our technology. In any case, we WON’T provide any of the image data to third parties, competitors or advertisers. The percentage of the discount we can offer depends on the quality of the data you have.

Need custom categorization?

Have questions about the custom training? Read our FAQ.

Is there an interface where I can set up a custom training?

No, currently this work is performed by our expert machine learning team. We are working on a web interface that will let you define categories, upload sample photos and do the training yourself. Even so, we believe our team can help a lot with tasks such as defining the structure of the categories, pruning the training data set, and running the subsequent training process for optimal results.

How many images per category do you need for the training and how big do the images need to be?

Ideally we need 1000 or more photos per category, but 200-500 might work as well if the categories are easily distinguishable and each category is well represented by the given photos. The images need to be at least 300px on their shortest side.
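These thresholds are easy to pre-check on your side. The sketch below assumes you already know each image's pixel dimensions as (width, height) pairs; the function names and status labels are invented for the example, and the numeric thresholds are the ones quoted above.

```python
MIN_SHORTEST_SIDE = 300  # pixels, from the answer above
IDEAL_COUNT = 1000       # "1000 or more photos per category"
WORKABLE_COUNT = 200     # lower end of the "200-500 might work" range


def is_large_enough(width, height):
    """True if the shortest side meets the 300px minimum."""
    return min(width, height) >= MIN_SHORTEST_SIDE


def review_category(sizes):
    """Count usable images in one category and label the result.

    `sizes` is a list of (width, height) tuples (hypothetical input format).
    """
    usable = sum(1 for width, height in sizes if is_large_enough(width, height))
    if usable >= IDEAL_COUNT:
        status = "ideal"
    elif usable >= WORKABLE_COUNT:
        status = "may work"
    else:
        status = "too few"
    return usable, status


# Made-up category: 250 usable photos plus 30 that are too narrow.
sizes = [(640, 480)] * 250 + [(1200, 200)] * 30
usable, status = review_category(sizes)
print(f"{usable} usable images: {status}")
```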

Is there a limit on the number of categories you can do custom training for?

Theoretically yes, but in practice you don’t need to worry: we can handle training with tens of thousands of categories.

Should each category be represented by the same number of photos?

The number of photos in each category doesn’t need to be exactly the same, but ideally there won’t be more than a 2x difference between the smallest and largest number of sample photos per category.
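The 2x guideline can be checked with a few lines of code. This is a minimal sketch: the function name, the category names and the counts are illustrative assumptions.

```python
MAX_RATIO = 2.0  # largest category should hold at most 2x the smallest


def imbalance_ratio(counts):
    """Return largest / smallest sample count across categories."""
    smallest = min(counts.values())
    largest = max(counts.values())
    if smallest == 0:
        raise ValueError("every category needs at least one sample photo")
    return largest / smallest


# Made-up sample counts per category.
counts = {"bedroom": 1200, "bathroom": 800, "lobby": 450}

ratio = imbalance_ratio(counts)
balanced = ratio <= MAX_RATIO
# Categories holding fewer than half as many photos as the largest one.
underrepresented = [
    name for name, n in counts.items() if n * MAX_RATIO < max(counts.values())
]
print(f"ratio {ratio:.2f}, balanced: {balanced}, add photos to: {underrepresented}")
```

Here the ratio is about 2.67, so the smallest category would benefit from more samples, or the largest one could be trimmed.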

Can I use a hierarchy instead of a flat structure?

Our training works with flat structures, so it’s best if you flatten the structure. Of course internally you can have your own knowledge of what hierarchical structure your flattened categories belong to.
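Flattening can be done mechanically while keeping a record of where each flat category came from. The sketch below assumes the hierarchy is stored as nested dictionaries with empty dictionaries as leaves; that representation, the underscore-joined labels and the category names are illustrative choices, not a required format.

```python
def flatten(tree, prefix=()):
    """Map each leaf of a nested-dict hierarchy to a flat label.

    Returns {flat_label: path_tuple}, so the original position of every
    flattened category can still be looked up.
    """
    flat = {}
    for name, children in tree.items():
        path = prefix + (name,)
        if children:
            flat.update(flatten(children, path))
        else:
            flat["_".join(path)] = path
    return flat


# Hypothetical hierarchy; empty dicts mark leaf categories.
hierarchy = {
    "room": {"bedroom": {}, "bathroom": {}},
    "outdoor": {"pool": {}, "beach": {}},
    "lobby": {},
}

flat_categories = flatten(hierarchy)
for label, path in flat_categories.items():
    print(label, "<-", " > ".join(path))
```

The keys form the flat list to train on, and the stored paths let you map a predicted category back to its parent groups afterwards.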

How long does it take to do the actual training?

Depending on the complexity of the categories and the number of sample images it might take from a few hours up to 5 business days.

Would you use my data for other purposes?

The only way we would use your data is for improving the learning process of our core technology. We don’t share the images in any way. If you don’t want to contribute to the improvement process please let us know.

What if I am not satisfied with the precision of the results?

There are several options. Adding more diverse sample photos per category can significantly increase the precision rate. Overlapping categories can also be the reason for less-than-optimal results, and in such cases redefining the list of categories often works well.
In very rare cases we may jointly conclude that we can’t do anything feasible at this stage, in which case you don’t need to pay the success fee. We try to minimize such cases by carefully analyzing the definition of the categorization task before we start the actual training and request the upfront fee.