To improve, or not to improve; how changes in corpora influence the results of machine learning tasks on the example of datasets used for paraphrase identification,Krystyna Chodorowska, Barbara Rychalska, Katarzyna Pakulska and Piotr Andruszkiewicz