** PLEASE NOTE - The following text has been deprecated in favor of the revised Charter Statement attached to this page - 6 June 2018 **

CODATA/RDA Research Data Science Schools for Low and Middle Income Countries IG

Introduction (A brief articulation of what issues the IG will address, how this IG is aligned with the RDA mission, and how this IG would be a value-added contribution to the RDA community):

The goals of this RDA Interest Group are to continue the information sharing, targeted outreach and community collaboration with RDA members about the CODATA-RDA Schools for Research Data Science. To date, there have been two very successful schools hosted by the International Centre for Theoretical Physics in Trieste, Italy: the first in August 2016 and the second in July 2017. Two further events are planned, one in Sao Paulo in December 2017 and a third Trieste school in August 2018. We are also looking into an African school in 2018. We plan to continue providing open and evolving curriculum materials, to create a practical framework for hosting regionalized instances of the course, and to focus on train-the-trainer concepts to grow regional capacity. This is aligned with the RDA mission in that it enables data sharing through its training and adds value to the RDA community by teaching some of the outputs of the RDA in Research Data Management.

User scenario(s) or use case(s) the IG wishes to address (what triggered the desire for this IG in the first place):

The school curriculum focuses on Open Data and FAIR practices and ethical data use, and builds a foundation of Data Science skills for early career researchers in all disciplines. Attending researchers are given priority based on a World Bank ranking of Low or Middle Income Countries (LMICs), so the focus is on resource-constrained researchers. This specifically speaks to the RDA Vision of “…researchers and innovators openly sharing data across technologies, disciplines, and countries to address the grand challenges of society.” At past schools, RDA as an organization, along with its recommendations and outputs, has been introduced to the students. While the events thus far have targeted LMICs, the curriculum should have universal application for Early Career Researchers (ECRs) worldwide.

Objectives (A specific set of focus areas for discussion, including use cases that pointed to the need for the IG in the first place. Articulate how this group is different from other current activities inside or outside of RDA.):

· Continuing to provide successful School for Research Data Science Events.

· Creating a framework for hosting of regional events.

· Providing ECRs with a foundation of skills necessary to thrive in an Open Science and Open Data environment.

· Growing a base of worldwide trainers prepared to lecture, mentor, and provide support to future worldwide and regional events.

· Continually evolving the foundational Data Science curriculum designed for this course, and making it accessible and reusable for other projects with similar goals.

This is distinct from other groups within the RDA.

Participation (Address which communities will be involved, what skills or knowledge should they have, and how will you engage these communities. Also address how this group proposes to coordinate its activity with relevant related groups.):

· The communities targeted for students are ECRs in LMICs with an interest in, or need for, a foundation in Data Science skills and resources. They should also embrace the concepts of Open Science and Open Data.

· The communities targeted for lecturers and mentors are any Research Data Science professionals or academics interested in developing the next generation of Data Scientists in LMICs. This includes, but is not limited to, RDA Members.

· NGO and Corporate sponsorship will need to be leveraged for successful hosting of events.

Outcomes (Discuss what the IG intends to accomplish. Include examples of WG topics or supporting IG-level outputs that might lead to WGs later on.):

· Continued successful instances of the RDA/CODATA School for Research Data Science.

· Continued growth of lecturer and mentor base to regionalize training events and reduce the cost per event by leveraging local experience and talent.

Mechanism (Describe how often your group will meet and how will you maintain momentum between Plenaries.):

Regular event planning meetings will be held prior to upcoming scheduled events and targeted outreach events. A CODATA Task Group will continue to meet about governance, funding, curriculum evolution and operation of the project. We will leverage the RDA IG as an information sharing and recruiting platform.

Timeline (Describe draft milestones and goals for the first 12 months):

September 2017: RDA Plenary. The final CODATA/RDA Summer School for Research Data Science WG session has been accepted.

While the principles of the proposals may be balanced, cogent and encouraging, I find a large gap between what is stated on paper and how that actually translates into practice. "Data" are not, by their very origins and natures, entities that can talk to one another, or even have meaning for one another, without trained scientists in the loop. That is not a draconian request; "data" involve scientific activities, intelligent analyses and patient understanding, and cannot return worthy science without *all* of that. Even data scientists are not a generalized development of trained people; the requirements and related skills vary so widely between even nominally similar scientific domains that it takes several *per domain* to satisfy the needs of the scientists themselves, let alone those who have no background in the specific domain field. "Studying science" at this level is very much akin to "studying languages": you won't be able to master Chinese if all you have studied to date are European languages, and then Hungarian falls outside that basket anyway. Thus, to continue the analogy, a School on Linguistics is not going to offer 'solutions' that are of any practical value unless restricted to simple formulae such as basic greetings in different tongues. The equivalent in science is equally pathetic, unless far more detail and divergence is somehow included. The proposal does not indicate any of the latter.