A question came to mind: how do duplicates in the training data affect the resulting model? It came up while I was tinkering with data augmentation. I realized that if an image data generator is used to augment the data, duplicate samples are likely to be created after a few epochs.

I understand how this happens. For instance, if you use only a few augmentation parameters, say just changing the width of the image, then running for a few epochs can produce a sample with the same width shift as an earlier one. This is less of a problem when more augmentation parameters are used.
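
To make that intuition concrete, here is a rough simulation in plain Python. It does not call any real augmentation library. It assumes each augmentation parameter takes one of a small number of discrete values (for example, whole-pixel shifts), which is a simplification. The function name `count_duplicate_draws` and the parameter counts are made up for illustration.

```python
import random

def count_duplicate_draws(options_per_param, epochs, seed=0):
    """Simulate augmenting one image once per epoch and count repeated settings.

    options_per_param: number of discrete values each (hypothetical)
    augmentation parameter can take.
    """
    rng = random.Random(seed)
    seen = set()
    duplicates = 0
    for _ in range(epochs):
        # Draw one value per augmentation parameter for this epoch.
        setting = tuple(rng.randrange(n) for n in options_per_param)
        if setting in seen:
            duplicates += 1  # same augmented sample as in an earlier epoch
        else:
            seen.add(setting)
    return duplicates

# Assumed setup: one parameter with 11 possible pixel shifts (-5..5), 50 epochs.
few = count_duplicate_draws([11], epochs=50)
# Assumed setup: two shifts and a flip, 11 * 11 * 2 = 242 combinations.
many = count_duplicate_draws([11, 11, 2], epochs=50)
print("duplicates with one parameter:", few)
print("duplicates with three parameters:", many)
```

With a single parameter there are only 11 distinct outcomes, so at least 39 of the 50 draws must repeat an earlier one. With three parameters the space of combinations is much larger and repeats become far less frequent. If the generator draws shifts from a continuous range instead, exact duplicates would be rarer still.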