Biography

Previously, Research Product Manager at Millward Brown Poland (one of the largest global market and opinion research institutes) and Assistant Professor in the Department of Quantitative and Qualitative Methods at the University of Information Technology and Management in Rzeszow, Poland. In 2017, short-term visiting Assistant Professor at the Center for Social Research & Center for Research Computing at the University of Notre Dame (Indiana, USA).

ODGAR Framework

Selected Publications

It is well documented that financial literacy is at best moderate around the world and that the cost of ignorance in this field may be high at both the microeconomic and macroeconomic levels. We surveyed a representative sample of Poles to measure their debt literacy—a little‐studied aspect of financial literacy—and to gain insight into the factors predicting it. Our study revealed low levels of debt literacy and its overestimation by respondents in their self‐reports. We also confirmed some of the patterns found in previous studies, including the gender gap and a positive relationship between educational attainment and debt literacy. Finally, our segmentation of the sample on the basis of objective and subjective debt literacy scores shows large heterogeneity of debt literacy and thus confirms the need for far‐reaching customization of debt‐oriented education.

Although the percentage of foreign students in Poland has increased more than ninefold over the past 10 years, it is still well below the European Union average. We looked for determinants of willingness to study in Poland among members of the Polish diaspora, who already have ties with the country. We built empirical models of the willingness of people of Polish origin to study abroad, in Poland, and in a peripheral Polish academic centre in particular. Such models can help policy-makers and universities meet the needs of international students and make universities more competitive in the global higher education market.

In recent years, there has been increased interest in methods for gender prediction based on first names that employ various open data sources. These methods have applications ranging from bibliometric studies to customizing commercial offers for web users. Analyses of gender disparities in science based on such methods are published in the most prestigious journals, although they could be improved by choosing the best-suited prediction method with optimal parameters and by performing validation studies using the best data source for a given purpose.
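To make the idea concrete, here is a toy R sketch of first-name-based gender prediction. The name table, counts, and the `predict_gender` helper are all invented for illustration; real methods draw on large open data sources (e.g. national name registries) and tune the probability threshold in a validation study.

```r
# Hypothetical reference table: how often each first name occurs per gender.
name_counts <- data.frame(
  name   = c("anna", "anna", "jan", "jan", "alex", "alex"),
  gender = c("F", "M", "F", "M", "F", "M"),
  n      = c(980, 20, 5, 995, 450, 550)
)

# Classify a name as "F" or "M" only if the gender share clears a threshold;
# otherwise return NA (unknown or too ambiguous to classify).
predict_gender <- function(first_name, counts, threshold = 0.9) {
  rows <- counts[counts$name == tolower(first_name), ]
  if (nrow(rows) == 0) return(NA_character_)
  p_female <- sum(rows$n[rows$gender == "F"]) / sum(rows$n)
  if (p_female >= threshold) "F"
  else if (p_female <= 1 - threshold) "M"
  else NA_character_
}

predict_gender("Anna", name_counts)  # "F"  (0.98 female share)
predict_gender("Alex", name_counts)  # NA   (0.45 is ambiguous at 0.9)
```

The threshold is the key parameter the abstract alludes to: lowering it classifies more names but raises the error rate, which is exactly the trade-off a validation study should quantify.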

Recent Talks

R is great for data analysis and Shiny is great for interactive data visualisation, but could we use R&Shiny for efficient declarative data collection? Moreover, how can we develop web data products in R&Shiny that are based on real-time declarative data collection with after-question and after-survey instant feedback? Users of such web data products should be able to access feedback relevant to their answers immediately. To increase its value, the feedback should be dynamically customised to each respondent. This can be achieved with pre-programmed feedback templates, adaptively customised by the respondent’s answers to the current or previous questions. Employing R’s extensive analytical and data visualisation capabilities, we could try to adapt any type of instant feedback to each user. Using R, we could also combine different feedback sources: a respondent’s answers to a given question and to other questions, other users’ answers, external open data (imported into our app or available via APIs), and aggregated or summarised outcomes from reference studies. What are the possibilities and obstacles for developing such data products natively in R&Shiny? How can the idea of QAF (Question, Answer, and Feedback) objects be implemented in R&Shiny? What is the roadmap for developing the ODGAR framework for On-line Data Gathering, Analysing, and Reporting? Is it possible to build a mobile app in R&Shiny? I will try to answer these questions using experience gained from developing early-stage prototypes.
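A minimal sketch of the after-question instant-feedback idea in Shiny. Everything here (the question, the `feedback_for` helper, the hard-coded reference average) is hypothetical and not the actual ODGAR API: a pure function turns an answer plus reference data into a feedback template, and Shiny renders it reactively as the respondent types.

```r
# A pre-programmed feedback template: compares the respondent's answer
# with an aggregated reference value (e.g. from earlier respondents).
feedback_for <- function(answer, reference_mean) {
  if (is.na(answer)) return("Answer the question to see your instant feedback.")
  diff <- answer - reference_mean
  sprintf("You reported %.1f hours: %.1f hours %s the average of %.1f.",
          answer, abs(diff),
          if (diff >= 0) "above" else "below", reference_mean)
}

# The Shiny wiring (guarded so the sketch also runs where shiny is absent).
if (requireNamespace("shiny", quietly = TRUE)) {
  library(shiny)
  ui <- fluidPage(
    numericInput("hours", "How many hours per week do you use R?", value = NA),
    textOutput("feedback")   # per-question instant feedback
  )
  server <- function(input, output, session) {
    reference_mean <- 6.5    # stand-in for a live aggregate of other users' answers
    output$feedback <- renderText(feedback_for(input$hours, reference_mean))
  }
  # shinyApp(ui, server)  # run interactively
}
```

Keeping the feedback logic in a pure function, separate from the reactive wiring, is what would let one QAF object swap in different feedback sources (other questions, other users, external APIs) without touching the UI.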

There is an urgent need for new declarative data that can help solve important social problems. However, such data are increasingly difficult to obtain, even when the research project is not-for-profit and aims at solving a social problem of great importance. The main reason for this situation is the persistence of the traditional model of the respondent-researcher relationship. This model is harmful to social science research in general, and to often under-funded but socially important research projects in particular. Additionally, traditional on-line techniques for collecting declarative data are obsolete: they do not take full advantage of Internet technologies or of the specific needs of Internet users. In order to advance declarative data collection for social good, we need to implement a new model of a long-term respondent-researcher relationship. This model requires close collaboration between social scientists, programmers, and data scientists to transform old social science research techniques into modern on-line data products that collect declarative data and provide instant customized feedback to respondents. The main goal of these new tools is to support stable on-line panels of respondents willing to participate in important social research projects in exchange for valuable content provided instantly by data scientists via the same research tool.

The global population of researchers, data scientists, and analysts from academia and the private sector is hard to reach for quick and cost-effective surveys. At the same time, the quantified opinions of such experts are a valuable aid to decision-making, public policies, and meta-analyses of (open) science development. In the forthcoming age of Open Science, there is a strong need for tools and methods that allow quick and easy access to members of the scientific community for research purposes.

Recent Posts

The second eRum was organized this year in Budapest (Hungary) and gathered ~500 participants (mostly, but not only, from Europe). It was a great event and a worthy successor to the first eRum, organized in Poznań (Poland) two years ago.
On the Workshop Day I participated in two workshops:
Efficient R programming by Colin Gillespie (author of the Efficient R Programming book by O’Reilly)
Building a package that lasts by Colin Fay.

During my research visit at the University of Notre Dame I had the pleasure of attending Hadley Wickham’s lecture Welcome to the Tidyverse and meeting Hadley in person. Hadley’s talks are always well-structured and worth listening to.
Hadley Wickham has been a prime mover in releasing R upon the masses, enabling hordes of unsuspecting would-be researchers to process and visualize data in ways they never dreamed of. The tidyverse, the culmination of years of effort in the R language, is a universe of packages that facilitate a grammar of data, graphics, and modeling that allows even beginners to speak the language of data science fluently.

Here is a selection of Data Science and programming skills and tools that have proved helpful in my work and that I believe are important for any Data Scientist.
Last updated: 2018-05-23.
Data Science Skills
Data Manipulation
Efficient data manipulation in R;
dplyr, data.table, reshape2.
Working with dates and time-series;
lubridate, xts.
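As a toy illustration of combining two items above, dplyr and lubridate, here is a sketch that aggregates a daily series by ISO week. The data frame is made up for the example and stands in for no particular project data.

```r
library(dplyr)
library(lubridate)

# Toy data: daily user counts over ten days
sessions <- tibble(
  date  = as.Date("2018-05-01") + 0:9,
  users = c(5, 8, 6, 9, 12, 4, 7, 10, 11, 6)
)

# lubridate extracts the ISO week number; dplyr groups and summarises
weekly <- sessions %>%
  mutate(week = isoweek(date)) %>%
  group_by(week) %>%
  summarise(total_users = sum(users))

weekly   # one row per ISO week with the summed user counts
```

The same aggregation can be written with data.table (`sessions[, .(total_users = sum(users)), by = week]`); dplyr tends to read more like a pipeline, while data.table is usually faster on large tables.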
Reproducible Analyses
RStudio IDE, Markdown, LaTeX;