Staff Notes Daily Announcements

All research projects are undertaken in the hope of producing findings and products of lasting value. It is often unthinkable that someone could forget the details of a project, especially how the results were produced. However, a data set often becomes “unloved” unintentionally over time. Specifically, if a research project loses sight of data management, its results and products are at risk of being forgotten or “unloved” once the team moves on to new projects.

The Data Stewardship Engineering Team (DSET) is a cross-organizational team formed by the NCAR Directors. The DSET charter specifies that the team leads the organization’s efforts to provide enhanced, comprehensive digital data discovery and access, and that it concentrates on providing a user-focused, integrated system for the discovery and access of digital scientific assets.

The DSET and the DASH services are here to help promote NCAR’s scientific results and make them usable, so that they remain valued for the long term.

If you would like to learn more about DSET/DASH and its services after Love Your Data (LYD) week, please contact us at datahelp@ucar.edu.

Thank you for participating in Love Your Data Week by reading this and the previous four posts. If you missed any of the five posts this week, they are available in Staff Notes, or please feel welcome to contact the Data Curation & Stewardship Coordinator.

Are you interested in K-12 education and public outreach? If so, please join your colleagues in a discussion about this topic on Tuesday, February 20th from 2:30-3:30pm in CG1-3131 (note the room change from the previous announcement). During this meeting we'll share updates on education and outreach efforts happening across the organization, discuss ideas for collaborations across groups, and discuss how we'd like these meetings to be structured in the future.

To RSVP please send an email to Becca Hatheway (hatheway@ucar.edu) and you'll be added to the calendar invite. In addition, if you haven't already signed up for the email list for this group (k12@ucar.edu) please let Becca know that you would like to be added to the list.

In this paper we show how multiple data sets, including observations and models, can be combined using the “N-cornered hat method” to estimate vertical profiles of the errors of each system. Using data from 2007, we estimate the error variances of radio occultation, radiosondes, ERA-Interim and GFS model data sets at four radiosonde locations in the tropics and subtropics. A key assumption is the neglect of error covariances among the different data sets, and we examine the consequences of this assumption on the resulting error estimates.
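For three data sets, the idea can be illustrated with the classic three-cornered hat, the N = 3 case of the method. If each data set equals the unknown truth plus an error, and the errors are mutually uncorrelated (the key assumption noted above), then the variance of the difference between two data sets is the sum of their error variances, and the three pairwise difference variances can be solved for the individual error variances. The following is a minimal sketch under that assumption, not the paper's implementation; the function name `three_cornered_hat` and the synthetic data are hypothetical and chosen only for illustration.

```python
import random
from statistics import variance


def three_cornered_hat(a, b, c):
    """Estimate the error variances of three collocated data sets.

    Assumes the errors of a, b and c are mutually uncorrelated, so that
    Var(a - b) = Var(err_a) + Var(err_b), and likewise for the other pairs.
    """
    def diff_var(x, y):
        # Sample variance of the pointwise difference; the truth cancels out.
        return variance([xi - yi for xi, yi in zip(x, y)])

    v_ab = diff_var(a, b)
    v_ac = diff_var(a, c)
    v_bc = diff_var(b, c)

    # Solve the three pairwise equations for the individual error variances.
    var_a = 0.5 * (v_ab + v_ac - v_bc)
    var_b = 0.5 * (v_ab + v_bc - v_ac)
    var_c = 0.5 * (v_ac + v_bc - v_ab)
    return var_a, var_b, var_c


def synthetic_demo(n=20000, seed=0):
    """Build three noisy copies of one synthetic 'truth' series (illustrative only)."""
    rng = random.Random(seed)
    truth = [rng.uniform(200.0, 300.0) for _ in range(n)]
    a = [t + rng.gauss(0.0, 1.0) for t in truth]   # true error variance 1.0
    b = [t + rng.gauss(0.0, 2.0) for t in truth]   # true error variance 4.0
    c = [t + rng.gauss(0.0, 0.5) for t in truth]   # true error variance 0.25
    return three_cornered_hat(a, b, c)


if __name__ == "__main__":
    print(synthetic_demo())
```

With finite samples the estimates only approximate the true error variances, and any real correlation between the errors of two data sets biases the result, which is why the consequences of the zero-covariance assumption need to be examined.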

This three-day course, April 17-19, 2018, will provide an introduction to fundamental methods used in Machine Learning. We will start with dimension reduction methods, which are often used as precursors to subsequent analysis. This will be followed by an overview of unsupervised vs. supervised learning. For unsupervised learning, we will cover various cluster analysis methods such as k-means. For supervised learning, we will introduce data-driven approaches such as regression trees and modeling-based approaches with a special focus on artificial neural networks and deep learning. The course is aimed at an applied audience and will make heavy use of data examples to illustrate the concepts. We'll use the open-source statistical software R [https://www.r-project.org/]. The format of the course is hands-on and participants will use their own laptops.

Instructors

The lead instructor for the course is Valerie Monbet, Professor of Statistics at the University of Rennes. She will be assisted by graduate students and post-doctoral fellows specializing in Statistics and Machine Learning. Seats are limited to 12 participants to allow for effective one-on-one coaching. To apply, please visit the Machine Learning Application link on the left-hand side of the workshop webpage. Note that the application deadline is March 2 at 5:00 PM MST.

This training is for UCAR employees only. For more information, see the webpage here. To apply, click here.

Announcement regarding HR Staff availability from 2/14/2018-2/23/2018: As part of the Human Capital Management initiative, many of our Human Resources staff will be attending all-day vendor demonstrations. These all-day demos are scheduled for February 14-February 15 and February 20-February 23. Please be aware that we will be checking our emails and phone messages only intermittently during these days, though we will do our best to answer requests in a timely manner. The HR Department will have a limited number of employees available in our FLA office to handle in-person requests.

If your request is urgent, you may contact Martha Jones, x8715, martha@ucar.edu, and she will notify the appropriate individual. Additionally, all Data Entry changes for Pay Period 5 are due by 5:00PM on Wednesday, February 21. Submissions received after this deadline will not be processed until the following pay period. We appreciate your understanding and are excited to embark on this new initiative.

Higher model resolution implies a higher number of degrees of freedom and a need for dense observation networks (e.g. satellite, radar and surface observations) to constrain the model initial state. As in many other NWP centers, only a small fraction of the available observations is being used in ECCC operational systems. The horizontal thinning for all assimilated radiances is 150 km; radar observations are not yet assimilated operationally; and screen-level wind observations are not yet operationally assimilated over land. Although data assimilation for convective-scale NWP has been the object of intense research lately, the resolution and the quality of background error covariances remain factors limiting the assimilation of dense observations.

The data assimilation component for a new short-term convective-scale numerical weather prediction (NWP) system covering most of Canada at 2.5 km resolution is currently being developed. It is based on a fully cycling deterministic 4DEnVar scheme with analysis increments initially computed at 10 km resolution. Several practical approaches have been evaluated and compared for generating ensembles of short-term forecasts for specifying the required background-error covariances. These include ensembles from an EnKF and also from much simpler approaches. The new system is evaluated against Environment and Climate Change Canada's currently operational regional data assimilation system (with increments computed at 50 km resolution), with both used to initialize forecasts from the identically configured atmospheric model.

The NCAR Fellows Association is hosting a panel on proposals and budgets on Wed., Feb. 21st, and we would like to invite all postdocs and grad students at NCAR|UCAR|UCP. Please invite those in your lab who might not have access to Staff Notes because they are visitors.

Panel Agenda: Budget & Planning manager Valerie Koch will provide insights on important aspects of budgets. NCAR scientists Wiebke Dierling and Matthew Long will answer questions and share their experiences.

Topics might include: how to develop collaborations that lead to grant writing, whether you can send a short description to a program manager to see if your ideas align with funding goals, and what is important to understand about budgets.

The NCAR Fellows Association is here to create a supportive community and provide career development workshops for postdocs and graduate students, whether they are visiting or employed here.

Thank you for your help in spreading the word to postdocs and grad students in your lab!

The Community Earth System Model (CESM) is NCAR’s flagship, state-of-the-art climate model. It is used to simulate the Earth’s climate system, from the distant past into the future, and to investigate the processes underlying the climate system. Depending on the specific model configuration, simulations done with CESM can replicate time periods from as short as a few days to tens of thousands of years.

In the last few years, it has become more common to use large ensembles of CESM simulations, in which an identical model configuration is run dozens to thousands of times. As a result, large volumes of model data are generated, from tens of terabytes to over a petabyte from a single project. The management and analysis of the output from these petascale projects can be a daunting task.

The talk will go over the past, present and future of the data engineering and management of CESM data. I'll focus on the tools that I use to handle the scale and complexity of these data, and their application to some recent CESM petascale projects, such as LENS, LME, DPLE, and GLENS. The upcoming set of simulations for CMIP6 and the future directions of large-scale data engineering and management within CESM will also be discussed. I'll also share my views on best practices in CESM data management, and on the policies that guide and influence CESM data management now and in the future.

Biography

Gary Strand is a software engineer in the Climate Change Prediction group of the Climate and Global Dynamics Laboratory (CGD) of NCAR. He began work at NCAR as a student assistant, and has been involved in several generations of climate model development in CGD. He is the primary data manager and data scientist for the NCAR climate model, the Community Earth System Model (CESM). He has led major data management activities and projects for the CESM since 2003, including CMIP3, CMIP5 and other large-scale CESM projects.

All registrations include conference, tutorials and symposia (we can NOT offer separate registrations at a lower cost)

IMPORTANT INFORMATION

All attendees, speakers, sponsors and volunteers at UCAR/NCAR-run events are required to agree with the UCAR/NCAR code of conduct.

Gender specific restrooms are located on the first floor. All-gender/gender neutral, single-stall restrooms are located on the third floor of the conference venue, next to conference room CG-3150.

Lactation rooms are available for workshop participants and are located on the second floor, in CG-2615 and CG-2617. If you require a small refrigerator to be placed in either of these rooms, please contact the registration desk. Child care may also be arranged if necessary. Contact Davide Del Vento if you'd like more information about it.

The conference venue is fully accessible to wheelchairs, including restrooms, but feel free to contact Davide Del Vento should you have a special need and would like to double-check.

The random forest (RF) algorithm and logistic regression (LR) are implemented to develop skillful, calibrated contiguous United States (CONUS)-wide probabilistic forecasts of locally extreme precipitation, as quantified by 1- and 10-year average recurrence interval (ARI) exceedances. Models are created for two different 24-hour periods representing lead times of 36-60 hours and 60-84 hours. CONUS is partitioned into eight regions which exhibit similar hydrometeorological properties. Within each of these regions, a model is trained to produce probabilistic exceedance forecasts on a ~0.5° grid, based on historical forecasts spanning an 11-year period, 2003-2013. Predictor data used to generate forecast probabilities come from simulated atmospheric fields taken from a record of NOAA’s 11-member Second Generation Global Ensemble Forecast System Reforecast (GEFS/R), and include not only the quantitative precipitation forecast (QPF) output from the model, but also variables that characterize the meteorological regime, including winds, moisture, and instability; spatiotemporal variability of fields is also considered. Results from a variety of sensitivity experiments are presented, and the use of these models to explore the physics of the forecast problem and objectively quantify the statistical biases of the GEFS/R is explored. These models are being developed for operational implementation at the Weather Prediction Center to assist forecasters with Excessive Rainfall Outlook generation. In association with that effort, forecasts from this model were evaluated during the center's 2017 Flash Flood and Intense Rainfall Experiment. Subjective forecaster evaluations are presented alongside objective verification during this period as well as an extended 4-year period beginning in September 2013.
Overall, it is found that the machine learning-based forecasts add significant (>1 day lead time) skill over forecasts produced from the raw QPFs of both the GEFS/R and ECMWF ensemble across almost all regions of CONUS. The seminar will conclude with discussion of how well the methodology extrapolates to other datasets and predictands.

Speaker Bio

Greg Herman is currently a Ph.D. candidate in atmospheric science at Colorado State University, advised by Dr. Russ Schumacher. Greg graduated from the University of Washington with a B.S. in atmospheric science, computer science, and physics, and defended his M.S. research at CSU in autumn 2015. His research primarily concerns the application of machine learning to the improvement of probabilistic high-impact weather forecasts at short-to-medium range timescales, with particular emphasis on extreme precipitation forecasting. Greg has developed machine learning-based real-time forecast products for cloud ceiling and visibility at select airports, warm-season convection over northeastern Colorado, and severe weather and extreme precipitation across the contiguous United States, in analogous fashion to the Storm Prediction Center’s convective outlooks and Weather Prediction Center’s (WPC’s) excessive rainfall outlooks. He is currently working with colleagues at WPC to transition the latter forecast product into operations. Greg has also performed several forecast verification studies of extreme precipitation in order to make better informed machine learning model design choices and better contextualize the performance of those models. As a graduate student, Greg has acquired extensive field project experience from numerous field programs across the country. Furthermore, he has engaged in interdisciplinary collaborations among National Weather Service forecasters, social scientists, and atmospheric researchers to better understand the meteorological and communication challenges associated with concurrent and collocated tornado and flash flood hazards. Currently assembling his dissertation, Greg anticipates graduating in autumn 2018.