A short note on DMPs

Dear active-DMP group,
Based on our recent activities with the European Infrastructures and
several conversations we have created a very short (and rather pointed)
note to kickstart some discussion on DMPs in general.
One of the four points in this note is on the need for DMPs to be more
active/adaptable.
The plan is to turn this into a lively workshop around Q4/Q1 for the
European discussion.
I appreciate all comments, especially comments on the active/adaptable
point.
I do want to keep the note very short and to the point, so not long
stories please.
Once we are organising the workshop there will be a lot of room to expand on
these (especially since we have many very positive examples that this
note passes by, including (but not limited to) Oxford, DCC, Wageningen,
Elixir, CLARIN, etc.).
In any case, I look forward to your comments.
Cheers,
Herman
--
Dr. ir. Herman Stehouwer
Max Planck Computing and Data Facility (MPCDF)
RDA Secretariat ***@***.*** 0031-619258815
Skype: herman.stehouwer.mpi

Hi Herman
How soon do you need a response...will Monday be OK?
Best wishes
Helen
From: herman.stehouwer=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of HermanStehouwer
Sent: 14 August 2015 16:11
To: ***@***.***-groups.org
Subject: [rda-datamanagplans] A short note on DMPs

A couple of things that I have been thinking about with respect to active/adaptable DMPs:

It would be great if various disciplines could map DMP elements to corresponding elements in their metadata standards. That way if the DMP is kept up-to-date, the end product could be a metadata record describing the data. Tools would likely need to be developed with targeted questions for initially writing the DMP, an interface for updating the DMP, and a utility for outputting standard XML metadata records.
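As a rough illustration of the mapping idea above, here is a minimal sketch in Python. It assumes a made-up set of DMP field names and crosswalks them to a handful of Dublin Core elements; the field names, the `DMP_TO_DC` table and the `record` wrapper element are all hypothetical and not taken from any existing DMP tool or discipline standard.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace (the target standard is only an example).
DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical crosswalk: DMP field name -> Dublin Core element name.
DMP_TO_DC = {
    "project_title": "title",
    "principal_investigator": "creator",
    "data_description": "description",
    "file_formats": "format",
    "license": "rights",
}


def dmp_to_metadata_xml(dmp: dict) -> str:
    """Render the mapped DMP answers as a simple XML metadata record."""
    ET.register_namespace("dc", DC_NS)
    record = ET.Element("record")  # hypothetical wrapper element
    for dmp_field, dc_name in DMP_TO_DC.items():
        values = dmp.get(dmp_field)
        if values is None:
            continue  # unanswered DMP question: emit nothing
        if isinstance(values, str):
            values = [values]  # single answers and lists are treated alike
        for value in values:
            ET.SubElement(record, f"{{{DC_NS}}}{dc_name}").text = value
    return ET.tostring(record, encoding="unicode")


if __name__ == "__main__":
    print(dmp_to_metadata_xml({
        "project_title": "Snow albedo at Summit, Greenland",
        "file_formats": ["CSV", "NetCDF"],
    }))
```

A real crosswalk would have to be agreed per discipline and would probably need richer structure than a flat dictionary, but the point is that a DMP kept up to date could feed such an export directly.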

With respect to keeping the DMPs updated, the tools could incorporate a schedule tracker. Prior to the start of the project, PIs would input an anticipated schedule for key milestones in the project where deviations from the initial DMP tend to happen. The tool could push notices to the project team asking if certain elements are still accurate and providing an easy way to edit.
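The schedule tracker could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Milestone` class, the `pending_notices` function and the 14-day lead time are invented here, and actually delivering the notices (e-mail, web interface) is left out.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Milestone:
    """A point in the project where the DMP tends to drift (hypothetical model)."""
    name: str
    due: date
    elements_to_review: list[str]
    confirmed: bool = False  # set once the team has checked the elements


def pending_notices(milestones, today, lead_time=timedelta(days=14)):
    """Return one question per DMP element for milestones that are close or past."""
    notices = []
    for milestone in milestones:
        if milestone.confirmed:
            continue  # already reviewed, no need to ask again
        if milestone.due - lead_time <= today:
            for element in milestone.elements_to_review:
                notices.append(
                    f"[{milestone.name}] Is '{element}' in the DMP still accurate?"
                )
    return notices


if __name__ == "__main__":
    schedule = [
        Milestone("Field season ends", date(2015, 9, 1),
                  ["storage volume", "file formats"]),
        Milestone("Final report", date(2016, 6, 1), ["target repository"]),
    ]
    for notice in pending_notices(schedule, today=date(2015, 8, 20)):
        print(notice)
```

Each notice would ideally link straight to the relevant DMP element so that confirming or editing it takes a single step.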

The biggest problem I’ve always seen with even the concept of an active/adaptable DMP is the assumption that data sets and projects are related 1 to 1 or maybe many to 1, but not in the many-to-many fashion that reflects how things really work. If I am a researcher who has been pursuing a line of research for 20 years (e.g., how does volcano plumbing work or what’s going on with the Greenland ice sheet), I may well have a sizable collection of materials that intellectually speaking are one continuous, cohesive collection (e.g., XYZ’s geological study of Antarctica’s Dry Valleys or Joe Blow’s 30 year record of XYZ measurements at Summit Greenland); yet the odds of there only having been one grant and one funding agency involved are probably identically zero. Yes, sure, maybe today being able to pursue a single line of research to a meaningful conclusion is more difficult; but I am not convinced that makes the situation better - I think it might actually make this disconnect worse!
Back in the pre-digital era, when that researcher retired and all of their stuff was handed to an archive, it would have been treated as a single collection. When descriptions of it are put on-line now (perhaps involving digitizing some analog materials), it would probably be split into sub-collections, not by grant but by categories based on scientific utility. For example, in the Antarctic case: Dry Valleys rock samples; Dry Valleys thin slices; Dry Valleys chemical assays; etc. In the Greenland case, something like: 30 year temperature record at Summit, Greenland; 30 year snow albedo at Summit, Greenland. Why would anyone want the data split into stuff collected using grant X, stuff collected using grant Y, etc.? Yet that is exactly what this active/adaptive DMP stuff tries to do, which is I think exactly what Herman was saying in the first bullet… What researchers do and what DMPs aim to do are rather orthogonal at the moment… OK, yes, funding agencies might like to see things organized by grant; but that certainly would not make it easy to re-use those data - in fact, organization by grant rather defeats the purpose of maximizing that data’s value.
Now if I had a DMP that actually discussed my line of research and that was updated not only during a grant but also to include stuff coming in under any new grants, that might be more realistic.
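To make the many-to-many point concrete, here is a small sketch of a DMP organised around a line of research rather than a single grant. The `ResearchLineDMP` class and its method names are hypothetical, and the grant labels are placeholders; the only point is that a collection can be linked to several grants and a grant to several collections.

```python
from collections import defaultdict


class ResearchLineDMP:
    """A DMP keyed on a line of research, not on one grant (hypothetical model)."""

    def __init__(self, research_line):
        self.research_line = research_line
        self._grants_by_collection = defaultdict(set)
        self._collections_by_grant = defaultdict(set)

    def link(self, collection, grant):
        """Record that a grant contributed to a collection (many to many)."""
        self._grants_by_collection[collection].add(grant)
        self._collections_by_grant[grant].add(collection)

    def grants_for(self, collection):
        """The view a re-user wants: one collection, whoever funded it."""
        return sorted(self._grants_by_collection.get(collection, ()))

    def collections_for(self, grant):
        """The view a funder wants: everything a grant contributed to."""
        return sorted(self._collections_by_grant.get(grant, ()))


if __name__ == "__main__":
    dmp = ResearchLineDMP("Summit, Greenland surface record")
    dmp.link("30 year temperature record", "grant X")
    dmp.link("30 year temperature record", "grant Y")
    dmp.link("30 year snow albedo", "grant Y")
    print(dmp.grants_for("30 year temperature record"))
    print(dmp.collections_for("grant Y"))
```

With links kept in both directions, the funder's per-grant view becomes a report derived from the plan rather than the way the data themselves are organised.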
My 2 cents…
Ruth
On Aug 14, 2015, at 11:19 AM, mlangset <***@***.***> wrote:
--
Full post: https://www.rd-alliance.org/group/active-data-management-plans-ig/post/s...
Manage my subscriptions: https://www.rd-alliance.org/mailinglist
Stop emails for this post: https://www.rd-alliance.org/mailinglist/unsubscribe/49532

On 14 Aug 2015, at 22:11, ***@***.*** wrote:
That funders seek a DMP does not necessarily mean a one-to-one mapping with their grant, no? Cannot a single DMP be portable and reusable for all funders that request it?
Ideally yes, but that means that the requirements / policies of the funding agencies are consistent - or at least not contradictory, which is not always the case.
IMHO this means that a 3 way dialogue is needed: funding agencies / policy makers : scientists : service providers.
e.g. there needs to be a feedback loop as some policies may not be realistically implementable under certain conditions, nor even meaningful.
Looks like we will have a healthy discussion in Paris and beyond…
Cheers, Jamie
On Aug 14, 2015, at 3:55 PM, rduerr <***@***.***> wrote:

Hi David,
As Jamie already noted the demands are not always consistent.
Furthermore, what a researcher needs from a DMP is quite different from
what the funder needs.
That is, if a researcher has an overarching DMP for her research, it
would still have to be adapted for every grant, which reduces its direct
usefulness (though having one would still be useful and helpful for
developing good data praxis).
Cheers,
Herman

Hi David,
We’ve been working to implement a lifecycle approach within the DMPonline tool. The roadmap describes the work we’ve got planned this year. https://dmponline.dcc.ac.uk/roadmap
I can imagine that there could be a case for multiple funders wanting a single DMP, but I can imagine that researchers might want to reuse a DMP as part of longitudinal research where a particular data set might be the basis of many related projects. In that case, it may be desirable to be able to import the previous DMP into a new DMP template for a new award as a starting point DMP that can be built upon. If the previous version was clearly distinguished from the new work in the plan, this approach might also help to be clear about roles and responsibilities over time. I’m not sure, but I’d imagine that this could be relatively straightforward to implement in a tool like DMPonline in cases where a single researcher or research group wants to develop a new proposal around data that he/she already collected. However, I imagine that such an approach might be far more difficult if we are aiming to allow new, unrelated researchers to reuse DMPs created by others/other groups. This would require a permission feature of some kind.
One way we might be able to address this would be to encourage researchers to publish a DMP at the end of the project. Sensitive information could be left out if necessary – for example, DMPonline allows users to select which information they want to export and in which format. I’ve long been in favour of funders requesting that researchers publish a post award version of their DMP at the end of the award. The focus on pre-award is helpful to get researchers thinking about the challenges, risks and benefits at the earliest stage of their research, which is great, but the post award version is what we really want to see – what was done rather than what was planned. A post award stage DMP would provide context and provenance information that could help researchers make an informed decision about reuse of data they find (i.e., could help to build trust). In the DCC, we’ve been encouraging this approach, but unless it is mandated by funders, it may be difficult to progress. The final stage DMPs could go in an institutional repository and be linked to the data or could be stored with the data in a subject specific data repository.
In terms of research administration, a post award stage DMP will become an increasingly vital tool to help planning around institutional RDM infrastructure and support services. For instance, it is likely that research offices will look to mine the content of their DMPs to get an understanding of storage requests and various analytical and visualisation tools needed for a certain period of activity (e.g., annually). While having an idea of what researchers thought they would need is a helpful tool for budgeting and planning, having access to data that indicates what they actually used is far better for making investment decisions. The DCC and Jisc are currently working to develop an API for DMPonline and are working with Masud Khokha at the University of Lancaster to test a use case around research administration metrics for DMPs. http://www.dmao.info/blog/2015/07/03/dmponline-api-and-dmaonline.html
All the best,
Joy
From: herman.stehouwer=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of HermanStehouwer
Sent: 17 August 2015 10:00
To: ***@***.***; rduerr; Active Data Management Plans IG
Cc: mlangset
Subject: Re: [rda-datamanagplans] A short note on DMPs

Hi,
I also have some ideas concerning “Static DMPs”.
We can reduce the bureaucracy and improve the quality of DMPs by automating their creation. We should go beyond online questionnaires and automatic notifications (they are useful, but not sufficient). To achieve that, we should identify which information can be sourced automatically, in what (common) format we can store it, and in what way we can integrate DMP generation with existing tools.
We should especially focus on the phase in which the actual research is performed, not the grant application phase. We do not need estimates, but real data. There are already solutions that could be used for generation of technical parts of DMPs. For example, we could use file format characterization (repositories domain) and provenance traces to obtain information on the size and type of data produced in an experiment. We are also able to detect software that was used to produce these results.
We can make the DMPs less static by reducing the workload and skills required to create them. This can be achieved by automatic sourcing of information.
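As a very small illustration of automatic sourcing, the sketch below walks a project output directory and summarises file types and sizes, which could then pre-populate the technical part of a DMP. It is an assumption-laden stand-in: the function name `summarise_outputs` is invented, and it only guesses types from file extensions with Python's standard `mimetypes` module, which is far cruder than real file format characterization tools or provenance traces.

```python
import mimetypes
from collections import defaultdict
from pathlib import Path


def summarise_outputs(root):
    """Count files and total bytes per guessed media type under a directory."""
    summary = defaultdict(lambda: {"files": 0, "bytes": 0})
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue  # skip directories and other non-file entries
        # Extension-based guess only; returns None for unknown extensions.
        media_type, _encoding = mimetypes.guess_type(path.name)
        entry = summary[media_type or "unknown"]
        entry["files"] += 1
        entry["bytes"] += path.stat().st_size
    return dict(summary)


if __name__ == "__main__":
    for media_type, stats in sorted(summarise_outputs(".").items()):
        print(f"{media_type}: {stats['files']} files, {stats['bytes']} bytes")
```

Run periodically during the research phase, a summary like this would give the DMP real figures on data volume and formats instead of estimates made at application time.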
All the best,
Tomasz Miksa

Sorry – that should have read *can’t* imagine a use case for multiple funders wanting access to a single DMP.
From: Joy Davidson
Sent: 17 August 2015 11:08
To: 'HermanStehouwer'; ***@***.***; rduerr; Active Data Management Plans IG
Cc: mlangset; Sarah Jones (HATII); 'ASHLEY Kevin'
Subject: RE: [rda-datamanagplans] A short note on DMPs

We might need some disambiguation of terms here. At CASRAI we think it is
more about reusing existing information elements about some 'thing' in a
way that reduces admin burden and duplication on researchers and their
supporting teams. This is less about a 'single DMP' and more about a common
approach to reusing DMP info so that two or more use cases might reuse 60%
or 80% of the same information without forcing the researcher to endlessly
retype. As we start to analyse use cases we can start to see where reuse
and harmonization is possible.
D

Hi David,
I imagine that reusing information within DMPs will likely be more useful - at least at this point in time - at the institutional level than the funder level. The DCC has been working with HEIs to consider how we might best reuse the information within the DMPs as a means of reducing the burden on researchers when starting a new project. For instance, we’d like to be able to push/pull info between the DMP and ethics approval forms. This might mean a change in the order in which some processes occur within an institution (e.g., some HEIs start ethics approval once funding has been awarded, others include this in the pre-award stage) and might mean that some existing processes need to be amended. This way, we can start to reduce the risk of a disconnect between DMPs, pathway to impact plans, and any consent forms that are developed for the project.
By integrating aspects of DMPs into other University processes, we can reduce the burden and hopefully improve the consistency of data between systems. At the University of Leicester, for example, they have included a few questions from the DMP template in their project approval forms so that issues relating to data storage requirements and/or dealing with sensitive information can be flagged up early on in the lifecycle. We anticipate that the ability to push and pull DMP info will also be valuable when depositing the data into a repository for the required retention period (i.e., fewer fields of metadata would need to be entered manually, as they could be pulled through from other institutional systems).
Funders may also benefit from pulling through DMP information into their own public catalogues and portals (e.g., RCUK’s Gateway to Research).
All the best,
Joy
From: dbaker=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of ***@***.***
Sent: 17 August 2015 14:47
To: Joy Davidson; Active Data Management Plans IG
Cc: HermanStehouwer; rduerr; mlangset; Sarah Jones (HATII); ASHLEY Kevin
Subject: Re: [rda-datamanagplans] A short note on DMPs
We might need some disambiguation of terms here. At CASRAI we think it is more about reusing existing information elements about some 'thing' in a way that reduces admin burden and duplication on researchers and their supporting teams. This is less about a 'single DMP' and more about a common approach to reusing DMP info so that two or more use cases might reuse 60% or 80% of the same information without forcing the researcher to endlessly retype. As we start to analyse use cases we can start to see where reuse and harmonization is possible.
D
On Mon, Aug 17, 2015 at 9:33 AM, Joy Davidson <***@***.***> wrote:
Sorry – that should have read *can’t* imagine a use case for multiple funders wanting access to a single DMP.
From: Joy Davidson
Sent: 17 August 2015 11:08
To: 'HermanStehouwer'; ***@***.***; rduerr; Active Data Management Plans IG
Cc: mlangset; Sarah Jones (HATII); 'ASHLEY Kevin'
Subject: RE: [rda-datamanagplans] A short note on DMPs
Hi David,
We’ve been working to implement a lifecycle approach within the DMPonline tool. The roadmap describes the work we’ve got planned this year. https://dmponline.dcc.ac.uk/roadmap
I can imagine that there could be a case for multiple funders wanting a single DMP but I can imagine that researchers might want to reuse a DMP as part of longitudinal research where a particular data set might be the basis of many related projects. In that case, it may be desirable to be able to import the previous DMP into a new DMP template for a new award as a starting point DMP that can be built upon. If the previous version was clearly distinguished from the new work in the plan, this approach might also help to be clear about roles and responsibilities over time. I’m not sure, but I’d imagine that this could be relatively straightforward to implement in a tool like DMPonline in cases where a single researcher or research group wants to develop a new proposal around data that he/she already collected. However, I imagine that such an approach might be far more difficult if we are aiming to allow new, unrelated researchers to reuse DMPs created by others/other groups. This would require a permission feature of some kind.
One way we might be able to address this would be to encourage researchers to publish a DMP at the end of the project. Sensitive information could be left out if necessary – for example DMPonline allows users to select which information they want to export and in which format. I’ve long been in favour of funders requesting that researchers publish a post award version of their DMP at the end of the award. The focus on pre-award is helpful to get researchers thinking about the challenges, risks and benefits at the earliest stage of their research, which is great, but the post award version is what we really want to see – what was done rather than what was planned. A post award stage DMP would provide context and provenance information that could help researchers make an informed decision about reuse of data they find (i.e., could help to build trust). In the DCC, we’ve been encouraging this approach but unless it is mandated by funders, it may be difficult to progress. The final stage DMPs could go in an institutional repository and be linked to the data or could be stored with the data in a subject specific data repository.
In terms of research administration, a post award stage DMP will become an increasingly vital tool to help planning around institutional RDM infrastructure and support services. For instance, it is likely that research offices will look to mine the content of their DMPs to get an understanding of storage requests and various analytical and visualisation tools needed for a certain period of activity (e.g., annually). While having an idea of what researchers thought they would need is helpful for budgeting and planning, having access to data that indicates what they actually used is far better for making investment decisions. The DCC and Jisc are currently working to develop an API for DMPonline and are working with Masud Khokha at the University of Lancaster to test a use case around research administration metrics for DMPs. http://www.dmao.info/blog/2015/07/03/dmponline-api-and-dmaonline.html
All the best,
Joy
From: herman.stehouwer=***@***.***-groups.org [mailto:***@***.***-groups.org] On Behalf Of HermanStehouwer
Sent: 17 August 2015 10:00
To: ***@***.***; rduerr; Active Data Management Plans IG
Cc: mlangset
Subject: Re: [rda-datamanagplans] A short note on DMPs
Hi David,
As Jamie already noted the demands are not always consistent.
Furthermore, what a researcher needs from a DMP is quite different from what the funder needs.
That is, if a researcher has an overarching DMP for her research it would still have to be adapted for every grant, which reduces the direct usefulness (though having one would still be useful and helpful for developing good data praxis).
Cheers,
Herman
On 14/08/15 22:11, ***@***.*** wrote:
That funders seek a DMP does not necessarily mean a one to one mapping with their grant, no? Cannot a single DMP be portable and reusable for all funders that request it?
Sent while mobile.
On Aug 14, 2015, at 3:55 PM, rduerr <***@***.***> wrote:
The biggest problem I’ve always seen with even the concept of an active/adaptable DMP is the concept that data sets and projects are related 1 to 1 or maybe many to 1; but not in the many to many fashion which is the way things really work. If I am a researcher who has been pursuing a line of research for 20 years (e.g., how does volcano plumbing work or what’s going on with the Greenland ice sheet); I may well have a sizable collection of materials that intellectually speaking are one continuous, cohesive collection (e.g., XYZ’s geological study of Antarctica’s Dry Valleys or Joe Blow’s 30 year record of XYZ measurements at Summit, Greenland); yet the odds of there only having been one grant and one funding agency involved are probably identically zero. Yes, sure maybe today being able to pursue a single line of research to a meaningful conclusion is more difficult; but I am not convinced that makes the situation better - I think it might actually make this disconnect worse!
Back in the pre-digital era, when that researcher retired and all of their stuff was handed to an archive, it would have been treated as a single collection. When descriptions of it are put on-line now (perhaps involving digitizing some analog materials), it would probably be split into sub-collections, not by grant but by categories based on scientific utility. For example, in the Antarctic case: Dry Valleys rock samples; Dry Valleys thin slices; Dry Valleys chemical assays; etc. In the Greenland case, something like 30 year temperature record at Summit, Greenland; 30 year snow albedo at Summit, Greenland. Why would anyone want the data split into stuff collected using grant X, stuff collected using grant Y, etc.? Yet that is exactly what this active/adaptive DMP stuff tries to do, which is I think exactly what Herman was saying in the first bullet… What researchers do and what DMPs aim to do are rather orthogonal at the moment… OK, yes funding agencies might like to see things organized by grant; but that certainly would not make it easy to re-use those data - in fact, organization by grant rather defeats the purpose of maximizing that data’s value.
Now if I had a DMP that actually discussed my line of research that was updated not only during a grant; but to include stuff coming in under any new grants; that might be more realistic.
My 2 cents…
Ruth
On Aug 14, 2015, at 11:19 AM, mlangset <***@***.***> wrote:
A couple of things that I have been thinking about with respect to active/adaptable DMPs:
It would be great if various disciplines could map DMP elements to corresponding elements in their metadata standards. That way if the DMP is kept up-to-date, the end product could be a metadata record describing the data. Tools would likely need to be developed with targeted questions for initially writing the DMP, an interface for updating the DMP, and a utility for outputting standard XML metadata records.
With respect to keeping the DMPs updated, the tools could incorporate a schedule tracker. Prior to the start of the project, PIs would input an anticipated schedule for key milestones in the project where deviations from the initial DMP tend to happen. The tool could push notices to the project team asking if certain elements are still accurate and providing an easy way to edit.
--
Dr. ir. Herman Stehouwer
Max Planck Computing and Data Facility (MPCDF)
RDA Secretariat***@***.*** 0031-619258815
Skype: herman.stehouwer.mpi
--
David Baker
Executive Director
+1 613 291 7635
***@***.***
@casrai_ed
@casrai
dbakerskype
http://casrai.org
http://www.youtube.com/watch?v=Gmx7U9-i3Gg
http://reconnect.casrai.org
http://dictionary.casrai.org

Hi All;
Tools and infrastructure that facilitate real data management are different from those that simply support the review of projects in terms of whether they are meeting obligations.
From my perspective it all depends on what function the data management performs. I feel it is not particularly helpful to separate the “planning elements” of data management from all the information you actually need to perform the data management. There is a real danger that infrastructure will be produced that only supports administrative obligations (which scientists will be resistant to) rather than tools which support the realities of data management. Those realities critically include not being OAIS compliant and accepting higher levels of risk when pragmatic considerations are taken into account.
If you simply look at the planning elements in isolation you simply capture a snapshot of what a project is intending to do at any point in time. This probably supports the function of funders reviewing projects to see what a project intends to do, supports to some degree a common understanding within a project, and gives an archive an indication of what is heading its way. There are some guidelines from funders that then suggest you must initiate this in the first six months and update it mid-term and towards the end of a project. However, on the type of large-scale collaborative projects using HPC resources that we are seeing at CEDA (which I am sure is a common trend across many institutions), this isn’t adequate.
In order to do data management fully and effectively you need project stakeholders to understand and engage with a lot more information than is currently contained in a lot of DMP templates.
I am also in agreement with Ruth and other contributors as to the different relationships that stakeholders (project participants, archives and funders) have with data management.
I include a diagram below which illustrates the different relationships CEDA has with Data Management (DyDaM stands for Dynamic Data Management). Data is managed on research projects (with many different types of participants at different institutions). These projects are funded by different organisations. HPC resources need to be supplied by different institutions to different types of project. Data inputs are supplied from different archives, and data outputs from projects can be deleted or go to different archives, where they are archived in accordance with different standards, different levels of access are provided, and they are maintained/preserved (or not) in different ways during their lifetime. Data management is also required for legacy data which needs to be brought under “Active preservation management”. There are also rescue and recovery operations to be considered.
Herman, this actually includes a Max Planck (Jena) example from the BACI H2020 project.
[Figure: the different relationships CEDA has with data management (DyDaM)]
I would argue (apologies, I have posted an earlier version of this model before, but things have developed since) for a common conceptual model for information which can be transferred between all types of project, archives and resource providers.
[Figure: common conceptual model for information transferred between projects, archives and resource providers]
The important thing to note is that risk management/review procedures drive the processes.
· Aspirational entities, which include Data Entity definitions and their associated Preservation Objectives.
· Risk entities, which act as drivers for change within the data lifecycle. These include Acquisitional Risks, Technical Risks, Strategic Risks and External Risks.
· Plan entities, which detail the actions to bring about change within an archive. These include Acquisition Plans, Preservation Plans and Monitoring Plans which support responsive interactions with the community.
· Result entities, which describe the outcomes of the plans: Acquisitions, Mitigations and Accepted Risks. These allow for imperfect but functioning solutions which can be realistically supported by an archive’s resource levels. Also, scientists participating in a project need to be able to look in on progress (what data has been acquired, validated, scientifically approved and published).
For example, the strategic risk would be that only data that is valuable to the NCEO community is archived at CEDA, so a risk management procedure to review data output is created. If the data is deemed valuable, then and only then would a plan be created for CEDA to acquire the data and carry out archival preservation and access functions.
By having the necessary procedures in place, updates to the plans happen as frequently as the project requires and plans evolve in line with project needs. Plans only become as complex as required. So data that is archived with no long-term preservation objective does not require preservation actions and can tolerate risks of obsolescence in 10 years’ time. If no immediate preservation solutions are affordable, the risk is tolerated and added to the monitoring plan, etc.
However, as Sarah and Joy have pointed out before, this would require a considerable investment if something like DMPonline were to extend its functionality from the “DyDaM project” perspective. But if the community was able to come together to support something like this I think it would be invaluable.
Anyway, that’s my two pennies’ worth.
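To make the risk-driven idea a little more concrete, here is a minimal Python sketch of how a single risk review might lead to a plan or to an accepted risk. It is only one reading of the description above, not the DyDaM model itself: the class names, fields and the three outcome labels are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DataEntity:
    name: str
    # None means there is no long-term preservation objective for this data.
    preservation_objective: Optional[str] = None


@dataclass
class Risk:
    entity: DataEntity
    kind: str  # e.g. "technical" or "strategic" (labels are illustrative)
    description: str
    mitigation_affordable: bool


@dataclass
class Archive:
    preservation_plan: list = field(default_factory=list)
    monitoring_plan: list = field(default_factory=list)

    def review(self, risk: Risk) -> str:
        """Record the outcome of one risk review and return a label for it."""
        if risk.entity.preservation_objective is None:
            # No objective: no preservation action is needed, the risk is tolerated.
            return "accepted"
        if not risk.mitigation_affordable:
            # Objective exists but no affordable solution: tolerate and keep watching.
            self.monitoring_plan.append(risk)
            return "accepted and monitored"
        # Objective exists and a solution is affordable: plan the preservation action.
        self.preservation_plan.append(risk)
        return "mitigation planned"
```

The point of the sketch is that plans only grow when a review calls for it: data with no preservation objective never generates a plan entry, and an unaffordable mitigation becomes a monitored, accepted risk rather than a blocked one.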
Esther