UMETRICS Presentation (Weinberg)

Developments in the US:
STAR & U METRICS
Motivation
The President recently asked his Cabinet to carry out an aggressive management agenda
for his second term that delivers a smarter, more innovative, and more accountable
government for citizens. An important component of that effort is strengthening agencies'
abilities to continually improve program performance by applying existing evidence
about what works, generating new knowledge, and using experimentation and
innovation to test new approaches to program delivery.
STAR METRICS
• National program: White House-led interagency initiative, now housed at NIH
• Broad participation: >100 research orgs (45% of NSF/NIH funding)
• Unique data: project-level internal financial and HR data on expenditures from federal grants
• Low burden / cost: uses algorithms & existing data
• Theoretically grounded: builds on microfoundations
Conceptual Framework
Empirical Framework
• Level 1: Document science inputs: the
workforce and equipment expenditures
supported by federal funding
• Level 2: Develop an open automated data
infrastructure and tools that will enable us to
document and analyze the inputs, outputs,
and outcomes resulting from federal
investments in science
U METRICS
• Private initiative to use STAR METRICS data
from the 15 major research universities that
comprise the Committee on Institutional
Cooperation (CIC) to analyze:
1. The impact of science
2. The structure of the research workforce
3. How to optimize research
Products
• Templated, scalable reports
• Integrated dashboard
• Open source data infrastructure
• Sandbox for research to develop tools
The CIC
• University of Chicago
• University of Illinois
• Indiana University
• University of Iowa
• University of Maryland
• University of Michigan
• Michigan State University
• University of Minnesota
• University of Nebraska-Lincoln
• Northwestern University
• Ohio State University
• Pennsylvania State University
• Purdue University
• Rutgers University
• University of Wisconsin-Madison
The CIC
• Most of the leading research universities in
the Midwest United States
• $9.3B in research
• “Shanghai” ARWU rankings range from #8 (U Chicago)
to #68; median ranking is #35
U METRICS Projects
• Impact, NSF Funded
– Analyze the distribution, geographic location, and industry of vendors
– Analyze the collaborative networks supported by research funding
– Offer the first comprehensive picture of the effects of federal R&D on
economic resilience and job creation
• Training (Discuss tomorrow), NSF Funded
• Food Security, US Dept. of Agriculture proposal
– Impact from federally funded research targeted at the agricultural
sector in general and food safety in particular
• Communication, NSF proposal
– How new comp sci, data, and tools can inform policymakers’ attitudes,
decisions, and understanding of WHAT science research is funded and
with what RESULTS
• Innovation and Aging, National Institute on Aging proposal
– Understand the impact of, and responses to, an aging innovative workforce
– Build data infrastructure to support research
CIC Activity: Economic Impacts
Levels of analysis, from Zoom Out to Zoom In:
• Entire scientific enterprise
– Useful for: government and institutions to justify and set the level of science investments
– Method: econometric analysis
– Status: in progress
• Scientific fields
– Useful for: government to justify and allocate investments to fields
– Method: econometric analysis
– Status: in progress
• Entire research institutions and funders
– Useful for: institutions and funders to document performance
– Method: bibliometric, text, + econometric
– Status: planned
• Specific fields at specific institutions
– Useful for: identify best practices, target investments
– Method: bibliometric, text, + econometric
– Status: planned
• Specific labs, researchers
– Useful for: micro-benchmark performance, identify underexploited opportunities
– Method: bibliometric, text, + econometric
– Status: planned
Training Environments Project
• Funded by National Science Foundation
• Uses CIC STAR METRICS Level 1 data
• Researchers from OSU, Iowa, Illinois, Penn State,
Chicago and AIR
• Examines the impact of different research funding
structures on the training of future scientists,
particularly graduate students and postdoctoral fellows,
and the impact on their subsequent outcomes.
• Links the data to universe data on students' jobs, earnings,
and industries
• Uses computer science techniques that capture
information from text documents to describe what
research the students are being trained in
Building a Social Science Research Community around
a new R&D Data Infrastructure
• Funded by National Science Foundation
• Uses CIC STAR METRICS Level 1 Data
• Researchers: Jason Owen-Smith and Maggie Levenstein,
University of Michigan
• Analyses the distribution, geographic location, and
industry of vendors that supply federally funded
research
• Analyses the collaborative networks supported by
research funding
• Offers the first comprehensive picture of the effects of
federal R&D spending on economic resilience and job
creation via companies that support university research.
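The vendor analysis described above can be illustrated with a minimal sketch. It assumes vendor payment records with hypothetical fields `state`, `industry`, and `amount`; these are illustrative names, not actual STAR METRICS fields, and the records in the example are made up.

```python
from collections import defaultdict


def spending_by(records, key):
    """Sum payment amounts by one attribute (e.g. state or industry)."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key]] += record["amount"]
    return dict(totals)


def shares(totals):
    """Convert totals into shares of overall spending."""
    overall = sum(totals.values())
    if not overall:
        return {}
    return {group: value / overall for group, value in totals.items()}


if __name__ == "__main__":
    # Hypothetical vendor payments charged to federal grants.
    payments = [
        {"state": "OH", "industry": "Lab supplies", "amount": 100.0},
        {"state": "OH", "industry": "Instruments", "amount": 50.0},
        {"state": "MI", "industry": "Lab supplies", "amount": 50.0},
    ]
    print(shares(spending_by(payments, "state")))
    print(shares(spending_by(payments, "industry")))
```

The same two functions cover both the geographic and the industry breakdown; only the grouping key changes.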
Proposed study on food safety
• Proposed to USDA
• Uses CIC STAR METRICS Level 1 data
• Researchers: Kaye Husbands Fealing, University of Minnesota
• Goal: Outcomes and impacts from federally funded research
targeted at the agricultural sector in general and food safety in
particular.
• Basic facts: What expenditures have been made in food safety
and security, and how have these expenditures changed over
time? Who is doing research in food safety? What are the
research outputs that are most relevant in the near and longer
term to food safety and food security in the United States and
abroad?
• Identify the competitive advantage of each of the
universities, and shed light on how to facilitate technology
transfer without changing the main mission of the university,
which is to educate students and do high-quality faculty research.
Proposed study on communicating
research to policy makers
• Proposed to NSF
• Uses CIC STAR METRICS Level 1 data
• Researchers: U of Nebraska, Chicago,
Maryland, AIR, and OSU
• Goal: assess the extent to which new computer
science techniques, new data, and new tools
can inform policymakers’ attitudes, decisions,
and understanding of WHAT science research is
funded and with what RESULTS
Complementary Aging Project
• Proposed to the National Institute on Aging (NIA) to
1. Document aging of biomedical research workforce
2. Estimate effect on innovation, health, and economy
3. Make policy recommendations
• Team from NBER, Albany, Harvard, Illinois, Maryland,
MIT, OSU, Stanford, UCSD, Waterloo
• Develop STAR METRICS Level II Data Infrastructure:
link researchers to their support (grants) → scientific
output (publications and citations) → technological
products (patents and drug approvals) → impacts
(health and economy)
What goes back to CIC
• Reports and analyses
• Dashboard that links funding (top left), STAR
METRICS Level 1 data (top right and bottom
left), and results (bottom right)
International Context
ASTRA (Australia)
HELIOS (France)
CAELIS (Czech Republic)
NORDSTJERNEN (Norway)
STELLAR (Germany)
TRICS (UK)
SOLES (Spain)
First International Workshop in Paris (Sept 16/17)
...which is where Julia, Bruce, and Joshua all are!
The U METRICS Initiative
STEM Workforce Training:
A Quasi-Experimental Approach Using
the Effects of Research Funding
Overview and Goals
• Examine the impact of research environment
and funding structures on the training and
outcomes of graduate students and post docs
• Link to universe data on student jobs, earnings
and industries
• Use emerging Comp. Sci. methods to mine
text of grants to describe what people are
being trained in
Teams and Funding Structures
Teams have historically been measured by co-authorship, but this
is a late-stage and selective measure
1. Labs rather than just co-authors
• How to measure a lab?
– PI and PI’s funding?
2. Networks (PI collaborations on projects)
• How to measure?
– Collaborative projects and clusters of projects joined by PIs
– Subsequent projects initiated by postdocs or graduate
students getting new jobs OR returning to their firms
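One way to make "clusters of projects joined by PIs" concrete is to treat projects as nodes and link any two that share a PI, then take the connected components. The sketch below is an illustration under that assumption, not the project's actual method; the project and PI identifiers are hypothetical.

```python
from collections import defaultdict


def project_clusters(project_pis):
    """Group projects into clusters connected through shared PIs.

    project_pis maps a project id to the set of PI ids on that project.
    Both kinds of id are hypothetical, for illustration only.
    """
    parent = {project: project for project in project_pis}

    def find(project):
        # Walk up to the cluster representative, shortening the path as we go.
        while parent[project] != project:
            parent[project] = parent[parent[project]]
            project = parent[project]
        return project

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    # Two projects that share a PI belong to the same cluster.
    first_project_for_pi = {}
    for project, pis in project_pis.items():
        for pi in pis:
            if pi in first_project_for_pi:
                union(first_project_for_pi[pi], project)
            else:
                first_project_for_pi[pi] = project

    clusters = defaultdict(set)
    for project in project_pis:
        clusters[find(project)].add(project)
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])


if __name__ == "__main__":
    example = {
        "grant_A": {"pi_1", "pi_2"},
        "grant_B": {"pi_2"},
        "grant_C": {"pi_3"},
        "grant_D": {"pi_3", "pi_4"},
        "grant_E": {"pi_5"},
    }
    for cluster in project_clusters(example):
        print(cluster)
```

In the example, grant_A and grant_B form one cluster through pi_2, grant_C and grant_D form another through pi_3, and grant_E stands alone.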
Automated Data Construction
• Most data efforts focus on hand-curated data
• Scalable, low cost / burden: Algorithmically
link researchers to their support (grants) →
scientific output (publications and citations)
→ technological products (patents and drug
approvals) → impacts (health, economy,
productivity)
• Link to linked employee / employer data
• Probabilistic matches
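As a rough illustration of probabilistic matching, the sketch below scores name pairs with a string similarity from Python's standard `difflib` and keeps the best match above a threshold. This is a stand-in under simplifying assumptions, not the matching algorithm used by the project: the similarity score is not a calibrated probability, the 0.85 threshold is arbitrary, and a real pipeline would likely use more fields (institution, dates, research field).

```python
from difflib import SequenceMatcher


def normalize(name):
    """Lowercase a name and drop punctuation so variants compare fairly."""
    kept = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(kept.split())


def match_score(name_a, name_b):
    """Return a similarity in [0, 1], used here as a stand-in match score."""
    return SequenceMatcher(None, normalize(name_a), normalize(name_b)).ratio()


def best_matches(payroll_names, author_names, threshold=0.85):
    """For each payroll name, keep the best-scoring author name above threshold."""
    matches = {}
    for person in payroll_names:
        scored = [(match_score(person, author), author) for author in author_names]
        if not scored:
            continue
        score, author = max(scored)
        if score >= threshold:
            matches[person] = (author, round(score, 2))
    return matches


if __name__ == "__main__":
    # Hypothetical names from grant payroll records and publication bylines.
    payroll = ["Jane Q. Smith", "Robert Lee"]
    authors = ["Jane Q Smith", "Roberta Leigh", "Maria Garcia"]
    print(best_matches(payroll, authors))
```

Names that fall below the threshold are left unmatched rather than forced to the nearest candidate, which trades coverage for fewer false links.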
Possible Analyses
• Estimate how training environment affects
retention in US, sector of employment, wages
• Estimate how flows of trainees to companies
affects productivity
• Measure impact on innovation by linking text
of patents to the research done in the labs
where people trained
• Open the knowledge transfer black box and
estimate returns to training
Team
• Bruce Weinberg (Ohio State U, Econ)
• Julia Lane (American Inst. for Research, Econ)
• Lee Giles (Pennsylvania State, Comp Sci)
• Vetle Torvik (Illinois, Comp Sci)
• Christopher Morphew (Iowa, Educ)
• James Evans (U Chicago, Sociology)
• Barbara Allen (CIC, Executive Director)
• Roy Weiss (U Chicago, Medicine)