Not necessarily. Each model becomes specialized at detecting certain features, so when you average them you are drawing on the expertise of all the models. But there may be some models that are bad and drag down the average. Averaging the predictions is just a naive ensembling technique; there are better methods like model stacking, rank averaging, etc.
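
To make that concrete, here is a toy sketch of plain averaging next to rank averaging (the prediction matrix below is made up; the point is that rank averaging blends each model's ranks instead of its raw scores, which helps when the models' scores are on different scales):

```python
import numpy as np
from scipy.stats import rankdata

# Made-up probabilities: one row per model, one column per test image.
preds = np.array([
    [0.90, 0.40, 0.75],   # model A
    [0.80, 0.10, 0.95],   # model B
    [0.20, 0.30, 0.85],   # model C: a weak model that can hurt the plain mean
])

# Naive ensembling: average the raw scores, every model weighted equally.
mean_blend = preds.mean(axis=0)

# Rank averaging: convert each model's scores to ranks before blending,
# so a badly calibrated model can't dominate through its score scale.
ranks = np.apply_along_axis(rankdata, 1, preds)
rank_blend = ranks.mean(axis=0) / preds.shape[1]   # rescale to (0, 1]

print("mean blend:", mean_blend)
print("rank blend:", rank_blend)
```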

Hi @shushi2000, I tried GoogLeNet; it does not seem to give as good performance as the others. We can try other architectures though: FCN, ResNet-style FCN, Inception-style FCN, etc. Someone on the Kaggle forums posted that clustering the train and test sets gave them better scores. I need to look into that as well.

I think the clustering was especially meant for using the additional data, because there you have multiple photos of the same person, so you can make sure that photos of the same person do not end up in both train and validation and/or test. Are you using the additional data?
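
If it helps, here is a minimal sketch of such a leakage-free split using scikit-learn's GroupShuffleSplit (the filenames and person ids below are made up; in practice the ids would come from the clustering):

```python
from sklearn.model_selection import GroupShuffleSplit

# Made-up data: image paths plus a person/cluster id per photo.
filenames  = ["img0.jpg", "img1.jpg", "img2.jpg", "img3.jpg", "img4.jpg", "img5.jpg"]
person_ids = [0, 0, 1, 1, 2, 2]   # photos 0-1, 2-3, 4-5 each show the same person

# Split so that all photos of a given person land on one side only.
splitter = GroupShuffleSplit(n_splits=1, test_size=0.33, random_state=42)
train_idx, valid_idx = next(splitter.split(filenames, groups=person_ids))

train_files = [filenames[i] for i in train_idx]
valid_files = [filenames[i] for i in valid_idx]
# No person id now appears in both train_files and valid_files.
```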

I used fc_model and trained it with slightly different dropout rates, and scored 0.969. However, I found something really odd: the best score came from an obviously under-trained model after 8 epochs, like on my validation:

Hi @rteja1113, for now VGG16 with customized fc layers gave me my best score of 0.80, and I want to use ResNet next. Could you please let me know how you recreate the ResNet model, as Jeremy did with VGG16 in the lecture? Here's what I have done and how I got stuck:

Greetings,
MobileODT is the first active project I'm running on Kaggle after my first pass at Part #1 and some peeks at Part #2 (couldn't resist).
https://www.kaggle.com/c/intel-mobileodt-cervical-cancer-screening
After reproducing each lesson's notebook on its own until I understood its inner workings, I constructed a dedicated notebook building on each component showcased by @jeremy and @rachel.
I started with a basic ConvNet on the Stage 1 dataset, reduced it to a sample set, and made a …