@samgta and I are in the process of creating a co-op called datactivi.st, dedicated to services around open data (training, helping NGOs to use open data for their lobbying, etc.).

One of the things we would be interested in is helping as many people as possible use open datasets. That implies documenting and curating datasets. Of course, it would be even better if others, and not only we, could do this on their own. So we would like to create a set of tools that we, and anyone else, could use to document, curate and share datasets.

Being an R adept, I’d be interested in using R to develop these tools (including GUIs, with Shiny for example). Compatibility with OKF’s Data Packages spec (or, at least to begin with, the Tabular Data Package spec) would also be an important feature. We would build on the existing ROpenSci package.
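For readers who have not met the spec, here is a rough sketch of what a minimal Tabular Data Package descriptor (`datapackage.json`) could look like and how a tool might generate one. It is written in Python with the standard library only, purely for illustration; it is not the ROpenSci package's API. The helper names (`infer_type`, `build_descriptor`), the file path and the sample rows are all hypothetical, and the type guessing is a deliberately naive heuristic.

```python
import json


def infer_type(values):
    """Guess a Table Schema type from string cell values (naive heuristic)."""
    if not values:
        return "string"
    for caster, type_name in ((int, "integer"), (float, "number")):
        try:
            for value in values:
                caster(value)
            return type_name
        except ValueError:
            continue
    return "string"


def build_descriptor(name, csv_path, rows):
    """Build a minimal descriptor for one CSV resource.

    `rows` is a list of lists of strings; the first row is the header.
    """
    header, *data = rows
    # Transpose the data rows into columns so each field can be typed.
    columns = list(zip(*data)) if data else [() for _ in header]
    fields = [
        {"name": column_name, "type": infer_type(column_values)}
        for column_name, column_values in zip(header, columns)
    ]
    return {
        "name": name,
        "resources": [{"path": csv_path, "schema": {"fields": fields}}],
    }


# Made-up sample rows, for illustration only.
rows = [
    ["label", "count", "ratio"],
    ["a", "10", "1.5"],
    ["b", "20", "2.25"],
]
descriptor = build_descriptor("example-dataset", "data/example.csv", rows)
print(json.dumps(descriptor, indent=2))
```

A GUI for non-technical users would mostly be a friendlier way of filling in and correcting a descriptor like this one (title, licence, field descriptions), which a sketch this small leaves out.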

First of all, congratulations on the initiative. These kinds of tools are very much needed, at least from my perspective. Especially because of this: [quote=“joelgombin, post:1, topic:2756”]
if others could do it on their own too.
[/quote]

It seems current development is focused on Python, which is understandable since it is so widely used, but R deserves attention too, especially because curators need to keep a close eye on the data as they work with it. In that sense, R is friendlier (you can view the data, view your changes, etc., while in Python you have to trust your instinct and skills).

That said, even though I am not a technical user, if there’s anything I can do to help, please let me know. I still wanted to congratulate you guys on the initiative! Good luck!!

To be clear, we are interested in creating two different kinds of tools:

tools for users with some technical background, who could benefit from dedicated R packages. Here the idea would be to ease and promote good practices and standards: the idea that data should be packaged is making its way into the R community these days, but the practices are not really standardised yet.

perhaps more importantly, tools for non-technical users, who couldn’t create a data package on their own. Here R would be the underlying engine but the UI has to be a GUI, and if possible a friendly one. So I think we agree the key here is to get as many people as possible to participate and to lower the barriers to participation!

Hi, thanks for raising this issue! The ROpenSci package is set for some major development work in the near future. Have you tried it out yet? Do you have real data and, specifically, real issues you want addressed in working with that data? At any rate, I’m PM’ing you.

Thanks @mattfullerton for your answer. I’ll try to join the hangout tonight, though I might not be able to (it’s around my son’s bath time ;-)).
In any case, as I was saying to @danfowler by PM, we could set up a hangout call once our own ideas and projects get clearer. At this stage we haven’t decided exactly which direction to go in; we are waiting both to get reactions from you guys and to be able to anticipate our workload over the next few months.
Thanks for the link to the presentation!