Visual Perception, Data Visualization, and Science

The purpose of Open Access Vis is to highlight open access papers, materials, and data and to see how many papers are available on reliable open access repositories outside of a paywall. See the about page for more details about reliable open access. Also, I just published a paper summarizing this initiative, describing the status of visualization research as of last year, and proposing possible paths for improving the field’s open practices: Open Practices in Visualization Research

Why?

Most visualization research papers are funded by the public, reviewed and edited by volunteers, and formatted by the authors. So for IEEE to charge $33 for each person who wants to read the paper is… well… (I’ll let you fill in the blank). This paywall is contrary to the supposed public good of research and undermines the claim that visualization research helps practitioners (who are not on a university campus).

But there’s an upside. IEEE specifically allows authors to post their version of a paper (not the IEEE version with a header and page numbers) to:

The author’s website

The institution’s website (e.g., lab site or university site)

A pre-print repository (which gives it a static URL and avoids “link rot”)

A few people have brought up concerns that repositories for open data and materials do not have long-term viability. “What happens if the site shuts down in 5 years?” As an alternative, people have proposed storing data and materials in a paywalled IEEE repository. While it’s good to hear that open access is being discussed, the discussion can only be fruitful if it is well informed. So I’ll highlight some critical information about the Open Science Framework (OSF).

1. 50-year preservation fund

The Center for Open Science (COS) has a fund devoted specifically to preserving and maintaining the repository in case the organization ever shuts down. This fund would make a read-only form of the repository accessible for 50+ years. Here is a quote from the sustainability supplement in the COS’s strategic plan (page 24):

In the event of COS’s closing, the preservation fund guarantees long-term hosting and preservation of all the data and content stored on the OSF (50+ years based on present costs and use)

2. An open license with no paywall

Authors who post content to OSF can choose from a variety of open licenses. Any future work that builds upon the content, incorporates it into a meta-analysis, or scrutinizes it can freely access and link to the material. Openness facilitates research without needing to rely on an expensive subscription to the publisher. Furthermore, an open license means that future work will not require the original author to give permission or even reply to emails.

On the other hand, some people want the content to be stored in IEEE’s digital library. That is exactly the opposite of open science. It would be behind a paywall (that’s not open). Also, IEEE would own the copyright of the data and materials. Either IEEE or an obnoxious original author who fears scrutiny could use licensing grounds to obstruct any attempt to publish work that reuses the content (that’s not science).

3. No risk of lock-in

The openness of OSF allows people to copy their content elsewhere in the future. So there is little risk of being “stuck” with OSF if you don’t like it. If someone creates a better site, they could even mirror OSF’s content, so future open science systems could start with all of the information already on OSF.

4. Updates and edits to content

As in version control, most open science repositories allow content to be updated while keeping previous versions accessible. That approach allows for further updates, such as adding documentation or fixing typos, without erasing the peer-reviewed version. In contrast, making a change to the IEEE digital library is a nightmare.

5. Templates for policies and submission forms

There have been some attempts by individuals and organizations such as ACM to “reinvent the wheel” by creating their own policies for open practice requirements and badges. These attempts often fail to consider flexibility and transparency in reporting.

Alternatively, the Transparency and Openness Promotion (TOP) guidelines have pre-written templates for modular policies with various levels of strictness (from simply reporting whether an artifact is available to mandatory submission) and for various artifacts (materials, collected data, analysis code, etc.). A table (artifact × strictness) summarizing the different policies is available on the last page here.

Open data allows people to independently check a paper’s analysis or perform an altogether new analysis. It also allows future work to perform meta-analyses and ask questions that may not have been asked in the original paper. Therefore, for experiment data to be useful to others, it’s important to make it public, complete, and accessible.

But many missteps can happen that reduce the value of open data. These tips should help ensure that your data is indeed open, useful, and accessible.

Sharing experiment data and materials is a key component of open science and is becoming increasingly common (Kidwell et al. 2016). But some in Visualization and HCI have expressed concern that this practice may not be compatible with anonymous submissions. Not true! Open data and open materials can easily be shared anonymously.

I did not discriminate beyond those two criteria. However, I am using a gold star ★ to highlight one property that only a few papers have: a generalizable explanation for why the results occurred. You can read more about explanatory hypotheses here.

It’s that time of year again. InfoVis abstracts have been submitted, and lots of people are scrambling to finish their full submission.

I was curious about the distribution of keywords in the submissions, so I visualized some of the data available to the program committee (PC). After checking with the chairs, I thought others might be curious about the results.

Note that these are only abstracts, so there will probably be some attrition before the full paper submission deadline. To see a MUCH more thorough analysis of multiple years and venues, check out http://keyvis.org

As we all know, the most important part of publishing research is making sure that no one ever reads or cites it. After all, it’d be awful if anyone actually saw the end result of months or even years of your effort. So keep these tips in mind the next time you author a publication.