Diana's Blog

After formulating an experimental plan, we needed some datasets to run those experiments on. A good dataset for this task has to meet three requirements: it must be non-trivial in size, it must include a protected attribute (such as gender or race), and it must have some true ranking. While it is possible to find a dataset that satisfies two of the three requirements, it is difficult to find one that satisfies all of them.
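To make the three requirements concrete, here is a minimal sketch of how a candidate dataset could be screened. The function name `check_dataset`, the `min_rows` threshold of 1000, and the column names in the usage example are all assumptions made for illustration; they are not part of our actual code.

```python
from typing import Dict, List, Sequence


def check_dataset(
    rows: List[dict],
    protected_attrs: Sequence[str],
    ranking_column: str,
    min_rows: int = 1000,  # assumed cutoff for "non-trivial in size"
) -> Dict[str, bool]:
    """Report which of the three requirements a candidate dataset meets."""
    columns = set(rows[0].keys()) if rows else set()
    has_ranking = (
        bool(rows)
        and ranking_column in columns
        # a true ranking is only usable if every row has a value for it
        and all(row.get(ranking_column) is not None for row in rows)
    )
    return {
        "non_trivial_size": len(rows) >= min_rows,
        "has_protected_attribute": any(a in columns for a in protected_attrs),
        "has_true_ranking": has_ranking,
    }


# Hypothetical usage: a tiny dataset with a protected attribute but no ranking.
sample = [{"name": "A", "nationality": "X"}, {"name": "B", "nationality": "Y"}]
report = check_dataset(sample, ["gender", "nationality"], "true_rank")
```

A dataset is only a real candidate when all three values in the report are `True`, which is exactly the case that turned out to be hard to find.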

Drawing on a dataset from Rankit, a list of FIFA 2018 players was added as a possibility. It contains 17,981 players, has protected attributes (age and nationality), and potentially has a true ranking. The true ranking could be based on goals scored or on money earned by the player. Some ranking data is also available for this dataset, although it does not cover all of the players.

Hospitals and doctors also come with a significant amount of data, both attribute-wise and entity-wise. However, finding a true ranking might prove to be impossible.

With the Rankit paper submitted, it was time for me to change gears and dive into the research with MaryAnn.

For me, this week revolved around getting up to speed with the fair ranking research. I read over the in-progress fair ranking paper and attended meetings where MaryAnn and Caitlin helped familiarize me with the code base and the research that they've completed.

The last step was to put everything together! The team did a great job of finishing the paper, getting results from the user study, and adding finishing touches to the UI.
The end product looks like this!

The graduate student, Caitlin, updated the backend code to send the score of each ranking. Previously, the score was filled with dummy values simply to get the frontend functionality in place. The entire column was also moved to a more intuitive place.

Furthermore, I worked on improving our datasets. For one, our States dataset had unintuitive attribute values. To change that, I switched to a new dataset. I also looked for a better movies dataset, one that would include categorical attributes. Unfortunately, I could not find a dataset that has as much variety of attributes as the current movies dataset and also has categorical attributes.

In order to meet the deadline, the development team has been on a strict schedule to complete all tasks necessary.

Because we will be applying to a visualization conference, Rankit needs to have more visualization features. Therefore, we've been working on integrating active learning into the tool. With active learning, our tool will have immediate and engaging feedback on the ranking as the user decides whether they should rank more items to get better results or if they are satisfied with the ranking as is and can stop.

The features I worked on were meant to make the Explore tool more robust. The first feature highlights the rows of the data table with a gradient signifying confidence.
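As a rough illustration of the idea, the sketch below maps a confidence value between 0 and 1 to a background color by interpolating between two RGB endpoints. The function name `confidence_to_color`, the choice of white and steel blue as endpoints, and the 0 to 1 confidence scale are assumptions for this example, not the actual Rankit implementation.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def confidence_to_color(
    confidence: float,
    low: RGB = (255, 255, 255),   # assumed color for zero confidence (white)
    high: RGB = (70, 130, 180),   # assumed color for full confidence (steel blue)
) -> str:
    """Linearly interpolate between two colors and return a hex string."""
    # Clamp so that out-of-range values still produce a valid color.
    c = min(1.0, max(0.0, confidence))
    r, g, b = (round(lo + (hi - lo) * c) for lo, hi in zip(low, high))
    return f"#{r:02x}{g:02x}{b:02x}"


# Hypothetical usage: one color per table row, darker means more confident.
row_colors = [confidence_to_color(c) for c in (0.0, 0.25, 1.0)]
```

Each row's background would then be set to its color, so the most confident rows stand out at a glance.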

The second feature was to have a bar for each row signifying the score of the object.
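One simple way to size such bars is to scale each score relative to the lowest and highest score in the table. The sketch below does this with min-max scaling; the function name `score_bar_widths` and the percentage-based `max_width` are assumptions for illustration, and the real tool may scale its bars differently.

```python
from typing import List, Sequence


def score_bar_widths(scores: Sequence[float], max_width: float = 100.0) -> List[float]:
    """Scale scores to bar widths between 0 and max_width (min-max scaling)."""
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # All scores are equal, so every bar gets the full width.
        return [max_width] * len(scores)
    return [max_width * (s - lo) / (hi - lo) for s in scores]


# Hypothetical usage: widths as percentages of the cell width.
widths = score_bar_widths([0, 5, 10])  # [0.0, 50.0, 100.0]
```

With this scaling the lowest-scoring row gets an empty bar; scaling against zero instead of the minimum would be the alternative if every row should show some bar.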

Taking into account the feedback received from our mentors, we updated the section analyzing the outcome of the online user study. We also updated the machine learning section to include more references and added more charts throughout the paper.
The team also discussed the next steps and identified new features to be implemented.