Slideshows by User: dgarijo
https://www.slideshare.net/
SlideShare feed for Slideshows by User: dgarijo
Last updated: Tue, 03 Apr 2018 00:43:24 GMT

Capturing Context in Scientific Experiments: Towards Computer-Driven Science
https://www.slideshare.net/dgarijo/caopturing-context-in-scientific-experiments-towards-computerdriven-science
Tue, 03 Apr 2018 00:43:24 GMT
Scientists publish computational experiments in ways that do not facilitate reproducibility or reuse. Significant domain expertise, time and effort are required to understand scientific experiments and their research outputs. To improve this situation, mechanisms are needed to capture the exact details and the context of computational experiments. Only then will intelligent systems be able to help researchers understand, discover, link and reuse the products of existing research.
In this presentation I introduce my work and vision towards enabling scientists to share, link, curate and reuse their computational experiments and results. In the first part of the talk, I present my work on capturing and sharing the context of scientific experiments by using scientific workflows and machine-readable representations. Thanks to this approach, experiment results are described unambiguously, have a clear trace of their creation process, and include a pointer to the sources used for their generation. In the second part of the talk, I describe examples of how the context of scientific experiments may be exploited to browse, explore and inspect research results. I end the talk by presenting new ideas for improving and benefiting from the capture of the context of scientific experiments, and for involving scientists in the process of curating and creating abstractions over the available research metadata.
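
A minimal sketch of the kind of machine-readable trace described above, assuming Python with rdflib and hypothetical example URIs: an experiment result linked to the process that created it and to the sources used for its generation, in the spirit of the W3C PROV vocabulary.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

PROV = Namespace("http://www.w3.org/ns/prov#")
EX = Namespace("http://example.org/experiment/")  # hypothetical URIs

g = Graph()
g.bind("prov", PROV)

result = EX.result1        # the experiment result being described
run = EX.workflowRun1      # the workflow execution that produced it
source = EX.inputDataset1  # the source data used for its generation

g.add((result, RDF.type, PROV.Entity))
g.add((run, RDF.type, PROV.Activity))
g.add((source, RDF.type, PROV.Entity))

# A clear trace of the creation process, and a pointer to the sources used
g.add((result, PROV.wasGeneratedBy, run))
g.add((run, PROV.used, source))
g.add((result, PROV.wasDerivedFrom, source))
g.add((result, RDFS.label, Literal("Output of workflow run 1")))

print(g.serialize(format="turtle"))
```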

A Controlled Crowdsourcing Approach for Practical Ontology Extensions and Metadata Annotations
https://www.slideshare.net/dgarijo/a-controlled-crowdsourcing-approach-for-practical-ontology-extensions-and-metadata-annotations
Tue, 24 Oct 2017 23:20:19 GMT
Traditional approaches to ontology development involve a large lapse between the time when a user of an ontology finds a need to extend it and the time when it actually gets extended. For scientists, this delay can be weeks or months, and can be a significant barrier to adoption. We present a new approach to ontology development and data annotation that enables users to add new metadata properties on the fly as they describe their datasets, creating terms that can be immediately adopted by others and eventually become standardized. This approach combines a traditional, consensus-based approach to ontology development with a crowdsourced approach where expert users (the crowd) can dynamically add terms as needed to support their work. We have implemented this approach as a socio-technical system that includes: 1) a crowdsourcing platform to support metadata annotation and the addition of new terms, 2) a range of social editorial processes to make standardization decisions for those new terms, and 3) a framework for ontology revision and updates to the metadata created with the previous version of the ontology. We present a prototype implementation for the paleoclimate community, the Linked Earth Framework, currently containing 700 datasets and engaging over 50 active contributors. Users exploit the platform to do science while extending the metadata vocabulary, thereby producing useful and practical metadata.
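
As an illustration of the editorial lifecycle behind point 2 above, here is one way a crowd-proposed term might be modeled in Python; the states, field names and usage threshold are assumptions for illustration, not the rules actually used by the Linked Earth Framework.

```python
from dataclasses import dataclass
from enum import Enum

class TermStatus(Enum):
    PROPOSED = "proposed"          # added on the fly by an expert user
    UNDER_REVIEW = "under_review"  # being discussed through editorial processes
    STANDARDIZED = "standardized"  # adopted into the core ontology
    DEPRECATED = "deprecated"      # rejected or superseded

@dataclass
class ProposedTerm:
    label: str
    proposed_by: str
    usages: int = 0  # datasets annotated with this term so far
    status: TermStatus = TermStatus.PROPOSED

def promote(term: ProposedTerm, min_usages: int = 3) -> None:
    """Move a crowd-proposed term into editorial review once it sees real use."""
    if term.status is TermStatus.PROPOSED and term.usages >= min_usages:
        term.status = TermStatus.UNDER_REVIEW

term = ProposedTerm(label="archiveType", proposed_by="paleo-user-1", usages=4)
promote(term)
print(term.status)  # TermStatus.UNDER_REVIEW
```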

WIDOCO: A Wizard for Documenting Ontologies
https://www.slideshare.net/dgarijo/widoco-a-wizard-for-documenting-ontologies
Tue, 24 Oct 2017 23:08:05 GMT
WIDOCO is a WIzard for DOCumenting Ontologies that guides users through the documentation process for their vocabularies. Given an RDF vocabulary, WIDOCO detects missing vocabulary metadata and creates documentation with diagrams, human-readable descriptions of the ontology terms, and a summary of changes with respect to previous versions of the ontology. The documentation consists of a set of linked, enriched HTML pages that can be further extended by end users. WIDOCO is open source and builds on well-established Semantic Web tools. So far, it has been used to document more than one hundred ontologies in different domains.
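
The metadata-detection step could look roughly like the following sketch (Python with rdflib). This illustrates the idea only; it is not WIDOCO's actual implementation, and the list of expected properties and the input filename are assumptions.

```python
from rdflib import Graph
from rdflib.namespace import DCTERMS, OWL, RDF

# Metadata a documentation wizard might expect on the ontology header
EXPECTED = {
    "title": DCTERMS.title,
    "creator": DCTERMS.creator,
    "license": DCTERMS.license,
    "version": OWL.versionInfo,
}

def missing_metadata(ontology_file: str) -> list[str]:
    g = Graph().parse(ontology_file)
    # The ontology header is the subject typed as owl:Ontology
    onto = next(g.subjects(RDF.type, OWL.Ontology), None)
    if onto is None:
        return sorted(EXPECTED)  # no header at all: everything is missing
    return sorted(name for name, prop in EXPECTED.items()
                  if g.value(onto, prop) is None)

print(missing_metadata("myontology.ttl"))  # hypothetical input file
```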

Towards Automating Data Narratives
https://www.slideshare.net/dgarijo/towards-automating-data-narratives
Thu, 16 Mar 2017 07:10:22 GMT
We propose a new area of research on automating data narratives. Data narratives are containers of information about computationally generated research findings. They have three major components: 1) a record of events that describes a new result through a workflow and/or the provenance of all the computations executed; 2) persistent entries for the key entities involved, such as data, software versions, and workflows; 3) a set of narrative accounts: automatically generated, human-consumable renderings of the record and entities that can be included in a paper. Different narrative accounts can be used for different audiences, with different content and details based on the level of interest or expertise of the reader. Data narratives can make science more transparent and reproducible, because they ensure that the text description of the computational experiment reflects with high fidelity what was actually done. Data narratives can be incorporated in papers, either in the methods section or as supplementary materials. We introduce DANA, a prototype that illustrates how to generate data narratives automatically, and describe the information it uses from the computational records. We also present a formative evaluation of our approach and discuss potential uses of automated data narratives.
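
A toy sketch of the narrative-account idea: rendering the same computational record at different levels of detail for different readers. The record structure and the generated wording are assumptions for illustration, not DANA's actual code.

```python
# A made-up provenance record for one computational result
record = {
    "workflow": "variant-calling-v2",
    "steps": [("align reads", "bwa 0.7.17"), ("call variants", "gatk 4.1")],
    "input": "sample-42.fastq",
    "output": "sample-42.vcf",
}

def narrative(record: dict, expert: bool = False) -> str:
    if expert:  # detailed account: name every step and software version
        body = "; then ".join(f"{s} using {sw}" for s, sw in record["steps"])
    else:       # summary account for a general audience
        body = f"a {len(record['steps'])}-step workflow ({record['workflow']})"
    return f"{record['output']} was produced from {record['input']} by {body}."

print(narrative(record))               # summary account
print(narrative(record, expert=True))  # detailed account
```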

Automated Hypothesis Testing with Large Scale Scientific Workflows
https://www.slideshare.net/dgarijo/automated-hypothesis-testing-with-large-scale-scientific-workflows
Wed, 15 Feb 2017 03:33:42 GMT
(Credit to Varun Ratnakar and Yolanda Gil.)
The automation of important aspects of scientific data analysis would significantly accelerate the pace of science and innovation. Although important aspects of data analysis can be automated, the hypothesize-test-evaluate discovery cycle is largely carried out by hand by researchers. This introduces a significant human bottleneck, which is inefficient and can lead to erroneous and incomplete explorations. We introduce a novel approach to automating the hypothesize-test-evaluate discovery cycle with an intelligent system that a scientist can task with testing hypotheses of interest against a data repository. Our approach captures three types of data analytics knowledge: 1) common data analysis methods, represented as semantic workflows; 2) meta-analysis methods that aggregate those results, represented as meta-workflows; and 3) data analysis strategies that specify, for a type of hypothesis, what data and methods to use, represented as lines of inquiry. Given a hypothesis specified by a scientist, the appropriate lines of inquiry are triggered, which leads to retrieving relevant datasets, running relevant workflows on that data, and finally running meta-workflows on the workflow results. The scientist is then presented with a level of confidence in the initial hypothesis (or a revised hypothesis) based on the data and methods applied. We have implemented this approach in the DISK system and applied it to multi-omics data analysis.
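
A minimal sketch of how a line of inquiry might be represented and triggered; the field names, the matching rule and the example values are assumptions for illustration, not DISK's actual schema.

```python
from dataclasses import dataclass

@dataclass
class LineOfInquiry:
    hypothesis_pattern: str  # type of hypothesis this line applies to
    data_query: str          # how to retrieve relevant datasets
    workflows: list          # analytic methods to run on the retrieved data
    meta_workflow: str       # aggregates workflow results into a confidence level

def matches(loi: LineOfInquiry, hypothesis: str) -> bool:
    # Crude trigger: a real system would match against a semantic
    # representation of the hypothesis, not a substring.
    return loi.hypothesis_pattern in hypothesis

loi = LineOfInquiry(
    hypothesis_pattern="is expressed in",
    data_query="datasets of type 'proteomics' for the named sample",
    workflows=["proteomics-analysis"],
    meta_workflow="confidence-meta-analysis",
)

hypothesis = "protein ABC is expressed in tumor samples"
if matches(loi, hypothesis):
    print(f"run {loi.workflows} on: {loi.data_query}")
    print(f"then aggregate with: {loi.meta_workflow}")
```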

OntoSoft: A Distributed Semantic Registry for Scientific Software
https://www.slideshare.net/dgarijo/ontosoft-a-distributed-semantic-registry-for-scientific-software
Thu, 27 Oct 2016 20:55:45 GMT
(Credit to Yolanda Gil.)
OntoSoft is a distributed semantic registry for scientific software. This paper describes three major novel contributions of OntoSoft: 1) a software metadata registry designed for scientists, 2) a distributed approach to software registries that targets communities of interest, and 3) metadata crowdsourcing through access control. Software metadata is organized using the OntoSoft ontology along six dimensions that matter to scientists: identify software, understand and assess software, execute software, get support for the software, do research with the software, and update the software. OntoSoft is a distributed registry where each site is owned and maintained by a community of interest, with a distributed semantic query capability that allows users to search across all sites. The registry has metadata crowdsourcing capabilities, supported through access control, so that software authors can allow others to expand on specific metadata properties.
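
Purely as an illustration, a software record organized along the six dimensions named above, with access-controlled crowdsourcing on top, might look like the following; the property names and values are assumptions, not the actual OntoSoft ontology terms.

```python
# A made-up record keyed by the six OntoSoft dimensions
record = {
    "identify":   {"name": "FloodModel", "version": "2.3"},
    "understand": {"description": "Simulates river flooding", "domain": "hydrology"},
    "execute":    {"language": "Fortran", "os": "Linux"},
    "support":    {"contact": "author@example.org"},
    "research":   {"citation": "Doe et al. (2016)"},
    "update":     {"repository": "https://example.org/floodmodel.git"},
}

# Crowdsourcing through access control: the author whitelists which
# dimensions each contributor may expand on.
editable_by = {"colleague@example.org": {"understand", "research"}}

def can_edit(user: str, dimension: str) -> bool:
    return dimension in editable_by.get(user, set())

print(can_edit("colleague@example.org", "research"))  # True
print(can_edit("colleague@example.org", "update"))    # False
```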

OEG tools for supporting Ontology Engineering
https://www.slideshare.net/dgarijo/oeg-tools-for-supporting-ontology-engineering
Thu, 21 Jul 2016 14:19:45 GMT
In this talk we give an overview of the suite of tools developed at the OEG to support ontology engineering. The tasks we support are ontology documentation, evaluation, diagramming, and publication with permanent IDs and content negotiation. All the tools are integrated in OnToology, which uses GitHub to publish the outcome produced for each ontology.

Software Metadata: Describing "dark software" in GeoSciences
https://www.slideshare.net/dgarijo/software-metadata-describing-dark-software-in-geosciences
Wed, 22 Jun 2016 07:46:04 GMT
(Credit to Yolanda Gil.)
In this talk I provide an overview of the current state of the art in software description in the geosciences, along with our approach to facilitating this task in OntoSoft, a distributed semantic registry for scientific software. Three key aspects of OntoSoft are: a software metadata ontology designed for scientists, a distributed approach to software registries that targets communities of interest, and metadata crowdsourcing through access control. Software metadata is organized using the OntoSoft ontology, designed to support scientists in sharing, documenting, and reusing software, and organized along six dimensions: identify software, understand and assess software, execute software, get support for the software, do research with the software, and update the software.

Reproducibility Using Semantics: An Overview
https://www.slideshare.net/dgarijo/reproducibility-using-semantics-an-overview
Thu, 28 Jan 2016 15:21:11 GMT
An overview of the different approaches for addressing reproducibility (using semantics) in laboratory protocols, workflow description and publication, and workflow infrastructure. Furthermore, Research Objects are introduced as a means to capture the context and annotations of scientific experiments, together with the privacy and IPR concerns that may arise. Presented at Dagstuhl Seminar 16041: http://www.dagstuhl.de/16041

PhD Thesis: Mining abstractions in scientific workflows
https://www.slideshare.net/dgarijo/phd-thesis-mining-abstractions-in-scientific-workflows
Fri, 04 Dec 2015 10:45:38 GMT
Slides of the presentation for my PhD dissertation. I strongly recommend downloading the slides, as they have animations that are easier to see in PowerPoint. The abstract of the thesis is as follows: "Scientific workflows have been adopted in the last decade to represent the computational methods used in in silico scientific experiments and their associated research products. Scientific workflows have proven useful for sharing and reproducing scientific experiments, allowing scientists to visualize, debug and save time when re-executing previous work. However, scientific workflows may be difficult to understand and reuse. The large number of workflows available in repositories, together with their heterogeneity and lack of documentation and usage examples, may become an obstacle for a scientist aiming to reuse the work of other scientists. Furthermore, given that it is often possible to implement a method using different algorithms or techniques, seemingly disparate workflows may be related at a higher level of abstraction, based on their common functionality.
In this thesis we address the issue of reusability and abstraction by exploring how workflows relate to one another in a workflow repository, mining abstractions that may be helpful for workflow reuse. In order to do so, we propose a simple model for representing and relating workflows and their executions, we analyze the typical common abstractions that can be found in workflow repositories, we explore the current practices of users regarding workflow reuse, and we describe a method for discovering useful abstractions for workflows based on existing graph mining techniques. Our results expose the common abstractions and practices of users in terms of workflow reuse, and show that our proposed abstractions have the potential to become useful for users designing new workflows."
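
A toy sketch of the graph-mining idea behind the thesis: counting step-to-step links that recur across workflows as candidate abstractions. Real frequent-subgraph mining is considerably more involved, and the workflow data here is made up.

```python
from collections import Counter

# Each workflow is a list of (step, next_step) edges in its dataflow graph.
workflows = [
    [("download", "clean"), ("clean", "cluster"), ("cluster", "plot")],
    [("download", "clean"), ("clean", "classify")],
    [("query", "clean"), ("clean", "cluster")],
]

# Count, for each edge, the number of distinct workflows it appears in.
support = Counter(edge for wf in workflows for edge in set(wf))

MIN_SUPPORT = 2  # an edge recurring in >= 2 workflows is a candidate abstraction
for (src, dst), n in support.most_common():
    if n >= MIN_SUPPORT:
        print(f"candidate abstraction: {src} -> {dst} (support {n})")
```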

EDBT 2015: Summer School Overview
https://www.slideshare.net/dgarijo/edbt-2015-summer-school-overview
Thu, 24 Sep 2015 09:01:11 GMT
This presentation gives an overview of the main concepts introduced at the EDBT 2015 Summer School, which took place in Palamos. For each area, we summarize the main issues and current approaches. We also describe the challenges and the main activities that were undertaken at the summer school.

Semantic web 101: Benefits for geologists
https://www.slideshare.net/dgarijo/semantic-web-101-benefits-for-geologists-51515587
Tue, 11 Aug 2015 19:18:25 GMT
A short introduction to the main concepts of the Semantic Web (RDF, Linked Data) for geologists, and an identification of the main challenges.

Is preserving data enough? Towards the preservation of scientific methods
https://www.slideshare.net/dgarijo/open-research-data-day-is-pre
Mon, 01 Jun 2015 10:57:47 GMT
In recent years there have been many efforts towards the preservation of data from scientific research. Institutions like the Virtual Observatory and journals like PLOS ONE, Geoscience Data Journal and Ecological Archives accept datasets that support or were produced in scientific publications. Other efforts, like Figshare, allow citing data from unpublished research and research in progress, acknowledging authors and improving the shareability of their work. At the same time, many of the challenges associated with the preservation and sharing of data have been a topic of discussion in international initiatives like the Research Data Alliance, which through its working and interest groups aims to identify requirements and propose reference solutions for tasks such as data citation and the provision of adequate e-infrastructure for repositories.
However, data per se is often not meaningful without proper descriptive metadata, its provenance, and the software used for its creation. In fact, scientists are becoming more concerned about the preservation of the software and methods used to deliver a particular scientific result. Reproducibility and inspectability are crucial for enabling the interpretation and reusability of a given dataset. In "in vitro" and "in vivo" sciences, protocols exist to capture the methods necessary to reproduce an experiment. In computational sciences this is achieved with scientific workflows, which capture the method (i.e., the steps and data dependencies) used to obtain a specific result. In this short talk we introduce the set of checklists we have developed for the proper conservation of scientific workflows, encapsulated as Research Objects, by adapting existing standards for data preservation.
However, data per se is often not relevant without proper description metadata, its provenance and the software used for its creation. In fact, scientists are starting to be more concerned about the preservation of the software and methods used to deliver a particular scientific result. Reproducibility and inspectability are crucial for enabling the interpretation and the reusability of a given dataset. In &quot;in vitro&quot; and &quot;in vivo&quot; sciences, protocols exist to capture the methods necessary to reproduce an experiment. In computational sciences this is achieved with scientific workflows, which capture the method (i.e., steps and data dependencies) used to obtain a specific result. In this short talk we will introduce the set of checklists we have developed for the proper conservation of scientific workflows, encapsulated as Research Objects, by adapting existing standards for data preservation.

]]>
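The talk hinges on checklists for preserving workflows as Research Objects. As a way to make that idea concrete, here is a minimal, illustrative sketch: it treats a Research Object as a set of aggregated resources tagged by category and reports which preservation requirements are unmet. The category names are assumptions made for illustration, not the actual checklist items from the talk.

```python
# Illustrative preservation checklist; REQUIRED categories are assumed.
REQUIRED = {"data", "workflow", "provenance", "software"}

def check_research_object(aggregated_resources):
    """aggregated_resources: mapping of resource path/URI -> category string.
    Returns whether the RO satisfies the checklist and what is missing."""
    present = set(aggregated_resources.values())
    missing = REQUIRED - present
    return {"complete": not missing, "missing": sorted(missing)}

ro = {
    "data/results.csv": "data",
    "workflow/pipeline.t2flow": "workflow",
    "provenance/run1.prov.ttl": "provenance",
}
print(check_research_object(ro))
# {'complete': False, 'missing': ['software']}
```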
Title: LDP4ROs: Managing Research Objects with the Linked Data Platform (WWW2015 demo)
URL: https://www.slideshare.net/dgarijo/ldp4ros
Author: dgarijo
Posted: Tue, 19 May 2015 09:42:52 GMT

In this demo we present LDP4ROs, a prototype implementation that allows creating, browsing and updating Research Objects (ROs) and their contents using typical HTTP operations. This is achieved by aligning the RO model with the W3C Linked Data Platform (LDP).
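Since the demo is about manipulating ROs through typical HTTP operations via LDP, a brief sketch of what such an interaction could look like follows. The endpoint URL is hypothetical (not the actual LDP4ROs deployment); the Slug and Link headers are standard LDP conventions for creating a named container.

```python
import requests

BASE = "http://localhost:8080/ldp/ros/"  # hypothetical LDP4ROs endpoint

# Create a new Research Object by POSTing to the parent LDP container.
headers = {
    "Content-Type": "text/turtle",
    "Slug": "my-experiment",  # requested name for the new resource
    "Link": '<http://www.w3.org/ns/ldp#BasicContainer>; rel="type"',
}
body = "<> a <http://purl.org/wf4ever/ro#ResearchObject> ."
resp = requests.post(BASE, data=body, headers=headers)
ro_uri = resp.headers["Location"]  # URI the server minted for the RO

# Browse the RO: a GET on the new container returns its RDF description.
print(requests.get(ro_uri, headers={"Accept": "text/turtle"}).text)
```

Updating or deleting RO contents would follow the same pattern with PUT/PATCH and DELETE, which is precisely the "typical HTTP operations" appeal of aligning ROs with LDP.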
Title: Towards Workflow Ecosystems Through Semantic and Standard Representations
URL: https://www.slideshare.net/dgarijo/works-14
Author: dgarijo
Posted: Fri, 21 Nov 2014 03:43:05 GMT

Workflows are increasingly used to manage and share scientific computations and methods. Workflow tools can be used to design, validate, execute and visualize scientific workflows and their execution results; other tools manage workflow libraries or mine their contents. There has been a lot of recent work on workflow system integration as well as on common workflow interlinguas, but interoperability among workflow systems remains a challenge. Ideally, these tools would form a workflow ecosystem in which it is possible to create a workflow with one tool, execute it with another, visualize it with a third, and use yet another to mine a repository of such workflows or their executions. In this paper, we describe our approach to creating a workflow ecosystem through the use of standard models for provenance (OPM and W3C PROV) and extensions (P-PLAN and OPMW) to represent workflows. The ecosystem integrates workflow tools with diverse functions (workflow generation, execution, browsing, mining, and visualization) created by a variety of research groups. This is, to our knowledge, the first time that such a variety of workflow systems and functions have been integrated.
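The abstract names PROV, P-PLAN and OPMW as the representation backbone. The sketch below shows, under assumptions, how a single workflow step execution might be described using rdflib's built-in PROV vocabulary; the run and file URIs are invented, and the OPMW namespace is bound only to indicate where extension terms would attach.

```python
from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import RDF, PROV

# OPMW extension vocabulary, bound for illustration of where it would plug in.
OPMW = Namespace("http://www.opmw.org/ontology/")

g = Graph()
g.bind("prov", PROV)
g.bind("opmw", OPMW)

run = URIRef("http://example.org/run/sortStep1")       # invented URIs
infile = URIRef("http://example.org/data/input.csv")
outfile = URIRef("http://example.org/data/sorted.csv")

# One step execution as a PROV activity that used one artifact
# and generated another.
g.add((run, RDF.type, PROV.Activity))
g.add((infile, RDF.type, PROV.Entity))
g.add((outfile, RDF.type, PROV.Entity))
g.add((run, PROV.used, infile))
g.add((outfile, PROV.wasGeneratedBy, run))

print(g.serialize(format="turtle"))
```

Because every tool in the ecosystem reads and writes this shared graph model, a workflow generated by one system can be browsed, mined or visualized by another without bespoke converters.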
Title: Workflow Reuse in Practice: A Study of Neuroimaging Pipeline Users
URL: https://www.slideshare.net/dgarijo/workflow-reuseinloni
Author: dgarijo
Posted: Mon, 27 Oct 2014 07:01:24 GMT

Presented at eScience 2014, Guarujá (Brazil). Abstract: Workflow reuse is a major benefit of workflow systems and shared workflow repositories, but there are barely any studies that quantify the degree of reuse of workflows or the practical barriers that may stand in the way of successful reuse. In our own work, we hypothesize that defining workflow fragments improves reuse, since end-to-end workflows may be very specific and only partially reusable by others. This paper reports on a study of the current use of workflows and workflow fragments in labs that use the LONI Pipeline, a popular workflow system used mainly for neuroimaging research that enables users to define and reuse workflow fragments. We present an overview of the benefits of workflows and workflow fragments reported by users in informal discussions. We also report on a survey of researchers in a lab that has the LONI Pipeline installed, asking them about their experiences with reuse of workflow fragments and the actual benefits they perceive. This leads to quantifiable indicators of the reuse of workflows and workflow fragments in practice. Finally, we discuss barriers to further adoption of workflow fragments and workflow reuse that motivate further work.
Title: Frag Flow: Automated Fragment Detection in Scientific Workflows
URL: https://www.slideshare.net/dgarijo/frag-flow-automatedfragmentdetectioninscientificworkflows
Author: dgarijo
Posted: Mon, 27 Oct 2014 06:48:28 GMT

Presented at eScience 2014, Guarujá (Brazil). Abstract: Scientific workflows provide the means to define, execute and reproduce computational experiments. However, reusing existing workflows still poses challenges for workflow designers: workflows are often too large and too specific to reuse in their entirety, so reuse is more likely to happen for fragments of workflows. These fragments may be identified manually by users as sub-workflows, or detected automatically. In this paper we present FragFlow, an approach that detects workflow fragments automatically by analyzing existing workflow corpora with graph mining algorithms. FragFlow detects the most common workflow fragments, links them to the original workflows and visualizes them. We evaluate our approach by comparing FragFlow results against user-defined sub-workflows from three different corpora of the LONI Pipeline system. Based on this evaluation, we discuss how automated workflow fragment detection could facilitate workflow reuse.
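FragFlow itself applies graph mining algorithms over workflow corpora; as a rough illustration of the underlying idea only, the sketch below counts the support of two-step fragments (labeled dependency edges) across a toy corpus and keeps those above a threshold. This is a deliberate simplification, not FragFlow's actual algorithm, and the step names are invented.

```python
from collections import Counter

# Toy corpus: each workflow is a list of (step_type, step_type) dependency
# edges. Real fragment mining considers larger connected subgraphs; two-step
# fragments keep the sketch short.
corpus = [
    [("AlignVolume", "Smooth"), ("Smooth", "Segment")],
    [("AlignVolume", "Smooth"), ("Smooth", "Register")],
    [("AlignVolume", "Smooth"), ("Smooth", "Segment")],
]

def frequent_fragments(workflows, min_support=2):
    """Return two-step fragments occurring in at least min_support workflows."""
    support = Counter()
    for wf in workflows:
        for fragment in set(wf):  # count each fragment once per workflow
            support[fragment] += 1
    return {f: n for f, n in support.items() if n >= min_support}

print(frequent_fragments(corpus))
# e.g. {('AlignVolume', 'Smooth'): 3, ('Smooth', 'Segment'): 2}  (order may vary)
```

Fragments found this way can then be linked back to the workflows that contain them, which is the comparison against user-defined sub-workflows that the evaluation describes.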