Data Treasure Hunting

Tags:

Platform, Data Visualization

Background

In recent years, NASA and other government agencies worldwide have been publishing open data on the web in machine-readable, non-proprietary, no-cost formats (e.g., http://data.nasa.gov/). There is broad interest in new ways to search that publicly available data and integrate these information assets into innovative databases and applications.

Inconsistent metadata (i.e., information such as keywords that empower search engines such as Google to discover these assets) is a consistent challenge across organizations. The challenge is to develop a new technique or application that would enable anyone to add meaningful keywords to the descriptions of our data – keywords that describe the hidden potential of these assets to better leverage our data beyond space applications to other data that may appear unrelated.

Challenge

Devise a clever way to discover good keywords to describe the potential, hidden, secondary uses of open data. For example, how might you discover that a particular information asset might be relevant to or benefit from other keywords, such as waste-processing or disaster-preparedness? Remember: without these additional and seemingly unrelated keywords, entrepreneurs like you might not discover and use open data to solve your most perplexing problems.

You can use any technique that might help discover new keywords. For example, a crowdsourcing application could display information about these assets online and query people about how the assets can be used. You may want to consider predictive analytics or machine-learning techniques to compare the metadata and the data of one information asset to another in order to find new keywords. Or, you might use the unique identifiers of the published data-files to search on the web, discover who already used the data and for what purpose, then catalog it. In fact, you could even develop a clever solution to download the data itself and ‘squeeze it’ in order to generate new keywords.
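As one illustration of the metadata-comparison idea above, the following is a minimal sketch rather than a recommended solution. It assumes each asset has a title, a description, and a keyword list, builds TF-IDF vectors from the text, and proposes keywords borrowed from textually similar assets. The toy catalog, the function names, and the similarity threshold are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy catalog; titles, descriptions, and keywords are invented.
ASSETS = [
    {"title": "Water Recovery System Telemetry",
     "description": "Sensor readings from a closed loop water recycling and filtration unit.",
     "keywords": ["life-support", "spacecraft"]},
    {"title": "Municipal Wastewater Filtration Survey",
     "description": "Survey of water recycling and filtration performance at treatment plants.",
     "keywords": ["waste-processing", "water-quality"]},
    {"title": "Lunar Crater Catalog",
     "description": "Positions and diameters of impact craters on the lunar surface.",
     "keywords": ["moon", "geology"]},
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(texts):
    """Return one sparse TF-IDF vector (dict of term -> weight) per text."""
    counts = [Counter(tokenize(t)) for t in texts]
    doc_freq = Counter()
    for c in counts:
        doc_freq.update(c.keys())
    n = len(texts)
    return [
        {term: tf * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
         for term, tf in c.items()}
        for c in counts
    ]


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def suggest_keywords(assets, index, threshold=0.1):
    """Suggest keywords for assets[index] taken from sufficiently similar assets."""
    texts = [a["title"] + " " + a["description"] for a in assets]
    vectors = tfidf_vectors(texts)
    existing = set(assets[index]["keywords"])
    suggestions = []
    for j, other in enumerate(assets):
        if j == index:
            continue
        score = cosine(vectors[index], vectors[j])
        if score >= threshold:
            for kw in other["keywords"]:
                if kw not in existing:
                    suggestions.append((kw, other["title"], round(score, 3)))
    # Strongest matches first.
    return sorted(suggestions, key=lambda s: -s[2])


if __name__ == "__main__":
    for kw, source, score in suggest_keywords(ASSETS, 0):
        print(f"{kw!r} suggested via {source!r} (similarity {score})")
```

In this toy data the spacecraft water-recycling asset shares vocabulary with the wastewater survey, so it would pick up a keyword such as waste-processing. A real catalog would likely need stop-word handling and a tuned threshold.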

Not only are we asking you to discover new keywords, but also to retain the log file that explains how these new keywords were discovered.
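The challenge does not prescribe a log format. One possible approach, shown here only as a sketch, is to append one JSON record per discovered keyword, noting which method produced it and on what evidence. The field names (`asset_id`, `method`, `evidence`) and the example values are assumptions made for illustration.

```python
import io
import json
from datetime import datetime, timezone


def log_keyword(stream, asset_id, keyword, method, evidence):
    """Append one JSON line describing how a keyword was discovered.

    All field names here are hypothetical; adapt them to your own pipeline.
    """
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "asset_id": asset_id,
        "keyword": keyword,
        "method": method,      # e.g. "crowdsourcing", "metadata-similarity"
        "evidence": evidence,  # free-form details that justify the keyword
    }
    stream.write(json.dumps(record, sort_keys=True) + "\n")
    return record


if __name__ == "__main__":
    # An in-memory stream stands in for a real log file opened in append mode.
    log = io.StringIO()
    log_keyword(log, "asset-001", "waste-processing", "metadata-similarity",
                {"similar_to": "asset-042", "score": 0.23})
    print(log.getvalue(), end="")
```

A line-per-record format keeps the log appendable and easy to filter later by method or asset.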

Considerations

A starter toolkit is now available. It includes the complete existing metadata and download links for information assets that were published on Open Data websites or by other agencies worldwide.

Sample Resources (Participants do not have to use these resources, and NASA in no way endorses any particular entity listed).

https://project-open-data.cio.gov/schema - Provides a dictionary of existing metadata-fields in the popular data.json catalog that agencies use to prepare and upload information assets to their Open Data portals.

http://www.engagedata.eu - The European Engage project that uses a crowdsourcing technique to address a similar problem.
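As a starting point for working with a data.json catalog like the one described in the schema resource above, the sketch below lists the keywords already attached to each dataset and flags sparsely tagged ones. It assumes the catalog has a top-level `dataset` array whose entries carry `identifier`, `title`, and `keyword` fields, as in the Project Open Data schema. The sample catalog content and the helper names are invented for illustration.

```python
import json

# Hypothetical, heavily trimmed catalog; real data.json files carry more fields.
SAMPLE_CATALOG = """
{
  "dataset": [
    {"identifier": "example-001",
     "title": "Example Telemetry Archive",
     "description": "Placeholder description.",
     "keyword": ["telemetry", "spacecraft"]},
    {"identifier": "example-002",
     "title": "Example Image Set",
     "description": "Placeholder description.",
     "keyword": ["imagery"]}
  ]
}
"""


def keywords_by_dataset(catalog):
    """Map each dataset identifier to its existing keyword list."""
    return {
        ds["identifier"]: list(ds.get("keyword", []))
        for ds in catalog.get("dataset", [])
    }


def sparsely_tagged(catalog, minimum=2):
    """Return identifiers of datasets with fewer than `minimum` keywords."""
    return [
        ident
        for ident, kws in keywords_by_dataset(catalog).items()
        if len(kws) < minimum
    ]


if __name__ == "__main__":
    catalog = json.loads(SAMPLE_CATALOG)
    for ident, kws in keywords_by_dataset(catalog).items():
        print(ident, kws)
    print("Needs more keywords:", sparsely_tagged(catalog))
```

Datasets flagged this way are natural first candidates for any keyword-discovery technique.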

Ecosystem Treasure Hunting Team

NLP_HACKER_CLAN

>The aim of this project is to not create a list of keywords, but to devise
a method that allows semantic recognition of entered keywords, and newly
formed keywords, to provide reference to cross-agency documentation and
other information, as well as external documentation via the web.
>
>T...

DHM - Data Hunters Macedonia

DHM (Data Hunters Macedonia)
1. Already achieved
1.1. Discover new keywords using an API.
1.2. Storage and categorization of the gathered keywords
1.4. Using them in a ("web") application that we developed for Crowd-Sourcing
2. Future plans (Software)
2.0. Ranking the gathered information...

Degrees of Data

The problem facing data mining is that it's difficult to find relevant data given a keyword due to badly tagged data. We aim to solve this using the Twitter API. Humans have already tagged their tweets with relevant hashtags (that are more often than not related to each other). With this, we sear...

KeyRecommender

Our team will propose a Statistical Machine Learning model to learn the semantics of the text in an unsupervised way. We will use the Word Vector representation model to represent the text, then apply clustering techniques to identify the relevance between words.

Metatron

*Metatron: Metadata improvements for public data sets*
Metatron is a public dataset search portal for people who don't know in advance what data set they might want, comprising:
* A data search front end with a catalog of general topics
* A cloud-based back end comprising:
* Robo...

Keyword Distillery

# Keyword Distillery
NASA Space Apps Challenge 2015 (Data Treasure Hunting)
Generates a ranked list of keywords. Each keyword can generate a ranked list of databases which relate to it, ranked by the strength of the relationship.
#Usage
* Generate a list of keywords you would like to map,...

An Extendible Bluemix Framework for Improving keywords and themes in data.json Files.

This framework uses Bluemix services to accomplish the following:
Allow data set owners to solicit input about their data sets related to keywords and themes. The framework uses Relationship Extraction and Visual Recognition services to pull entities from user's social media feeds such as Fac...

Smart Suit for Travelers

Creation and conception of a multiplatform application Linked to a space suit
3 different space and time circumstances analysis
-real time healthcare support
-data mining
-Enhance simulations
-Adapt environment
-Surpass time limit in space i

Ap^2 - Analitycs Projections Program

The aim of our project is to solve the problem of open data connection.
We have used the NASA Open data and crossed them with other external open data to trace relationships between space and earth activities. Through a mobile app, users can choose existing keywords or add new ones and see how ...

Kvasir

##Kvasir
Pronounced Kwah-seer, is a tool which aims to create a centralized place for users to find open government data.
This project aims to use open sourced data visualization tools to represent the government data in a much more user friendly fashion. We plan to implement maps for data ...

SMART SPACE SUIT

Creation and conception of a multiplatform application Linked to a space suit
3 different space and time circumstances analysis
real time healthcare support
data mining
Enhance simulations
Adapt environment
Surpass time limit in space

Open Data Social Gold Miner

# Open Data Gold Digger
*Formerly known as Open Data Social Gold Miner*
This project is focused on taking the conglomerates of raw data and mining this data into smaller streams of useful, relevant data. We have done this by utilizing the Project Open Data Metadata Schema v1.1 in analyzing a...

opendatapedia.com

Objectives:
- Generate "new keywords" for opendata assets.
- Create a free online and mobile source of easy to use open-data around the world.
- Use Crowdsourcing for integrity of information.
- dashboard of opendata trending in social.
How to achieve:
- Free Online Web "opendatapedia.co...

Data Odyssey

This project aims to solve the data treasure hunting challenge.
This project will provide a manual way for users to search for data extensively when the search algorithms fail to find the relevant data.
It will use this data to help create new keywords dynamically which may be used to generat...

medapp

I have made an application in which several salts that exist on Mars can be combined to form medicines that can help human beings survive on Mars. I have created a database of 25-30 medicines in which I have combined several elements to form the effective...

miningDTH

##Background
Every day, data from space are generated by NASA; however, these data need to be analysed to allow people to have a better understanding.
For this challenge, the APOD catalog was chosen to be a study case. The goal of this project is to build a dynamic knowledge using auto generated key...

epilep.si

Most (scientific) discoveries come from the outer edges of the »known«. On one hand we have »superhuman« astronauts with robust biosensors (brains) in space, and on the other hand there is an exponentially growing number of persons challenged with epilepsy (sensitive biosensors – brains).
We a...

NYSpaceTag

NASA has a lot of data, but it's hard to find what it's about, and how it connects to other data. To solve the first, we have built a tagging system that extracts natural keywords from titles and descriptions. We ran this across not only NASA data, but on datasets from all government departments....
