Calendar

During the fall 2018 semester, the Computational Social Science (CSS) and Computational Sciences and Informatics (CSI) programs have merged their seminar/colloquium series, in which students, faculty, and guest speakers present their latest research. These seminars are free and open to the public. The series takes place on Fridays from 3:00 to 4:30 p.m. in the Center for Social Complexity Suite, located on the third floor of Research Hall.

If you would like to join the seminar mailing list, please email Karen Underwood.

The behavior of electrons in materials underpins key materials properties, which means that computing and understanding the electronic structure of various materials systems is of vital importance. These computations are nowadays often performed using density functional theory (DFT), a first-principles methodology that is an important tool in the computational materials scientist’s toolbox. One of the materials properties accessible via DFT is magnetism, and DFT can be used to study the magnetic interaction between electrons (called the exchange interaction) in a material. This involves constructing models such as the Heisenberg model and mapping DFT calculations onto them, which allows one to understand how tuning different features impacts important parameters such as the critical temperature and the stability of the magnetic ground state. This approach to studying magnetic materials is of particular appeal in the spin electronics field, where encoding and processing information using the magnetic states of electrons is of central importance. In this talk I will: 1) introduce the basic concepts of computational materials science using DFT in an accessible manner, and 2) present calculations on two different materials where I used DFT in conjunction with modeling to analyze the magnetic interactions. The first material is the dilute magnetic semiconductor (Ba, K)(Zn, Mn)2As2, which exhibits ferromagnetism when small amounts of manganese and potassium are substituted into the material, and where changing the relative quantity of potassium influences the strength of the magnetic interactions. The second material is MnAu2, a magnetic metal with a corkscrew noncollinear magnetic ground state that can be tuned in intriguing ways using pressure and chemical substitution.
Using modeling in combination with DFT, I will show how we are able to understand the nature of the microscopic magnetic interactions in each material, and that the microscopic mechanisms driving the magnetic interactions in both compounds are the same. These results can then be used to resolve several experimental questions, one of which had gone unaddressed for several decades.
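The mapping described above is typically built on the Heisenberg Hamiltonian; a minimal sketch of the standard textbook form (the symbols and the mean-field estimate below are general conventions, not details taken from the talk):

```latex
% Classical Heisenberg model: J_{ij} are exchange couplings between spins S_i, S_j
H = -\sum_{i<j} J_{ij}\,\mathbf{S}_i \cdot \mathbf{S}_j
% DFT total energies of different magnetic configurations are mapped onto H to
% extract the J_{ij}; a simple mean-field estimate of the critical temperature is
k_B T_c \approx \frac{2}{3} \sum_{j} J_{0j} S^2
```

The sign of the dominant couplings determines whether the ground state is ferromagnetic, antiferromagnetic, or (when couplings compete) noncollinear, as in the MnAu2 example.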

COLLOQUIUM OF THE COMPUTATIONAL MATERIALS SCIENCE CENTER
AND THE DEPARTMENT OF COMPUTATIONAL AND DATA SCIENCES (CSI 898-Sec 001)

Local Structure Analysis in Simulated Materials via Voronoi Topology

Emanuel A. Lazar
Laboratory for Research on the Structure of Matter
School of Engineering and Applied Science
University of Pennsylvania
Philadelphia, PA

Describing how atoms are arranged in real and simulated materials is a very natural problem that arises in numerous computational materials science applications. However, aside from perfect crystals, insightful yet tractable descriptions of local arrangements of atoms can be tricky to develop. We consider several conventional order-parameter methods for describing local structure and highlight their theoretical and practical limitations. We then introduce a topological approach more naturally suited for structure analysis and highlight its versatility and robustness. In particular, the Voronoi tessellation method can aid in the study of materials at high temperatures, close to melting, without uncontrolled modification of raw data. Applications to the study of grain boundary evolution and melting will be briefly presented.

Making sense of user-generated Web content such as social media data, blogs, or even Wikipedia entries poses an interesting research challenge, given its lack of structure, its volume, and its associated noise. This work introduces a range of online content summaries for such unstructured data. Besides the typical spatial, temporal, and thematic summaries, we introduce two additional views. Geoevents are emerging events, limited in spatial scope, that are detected automatically by comparing the spatial distribution of their hashtags to a global topic search.

Links-of-Interest (LOIs) are connection summaries between geographic locations, people, and concepts. These content summaries are available as part of the functionality of a Web-based tool that allows for the interactive visualization, querying, and exploration of such unstructured data. This talk will discuss research results and demo the capabilities of the visualization prototype.

In this talk I will give an overview of scientific data mining (SDM), followed by a few examples of SDM applications to the analysis of tropical cyclones (TCs), focusing on their intensity changes. Because rapidly intensifying (RI) tropical cyclones are the major error source in TC intensity forecasting, association rule mining supports the study of the RI process by searching for sets of conditions that interact strongly with rapidly intensifying TCs. The technique of association rules explores associations among multiple conditions in a simple manner, identifying a predictor set with fewer factors but improved RI probabilities. Furthermore, in searching for the “optimal” RI condition combinations, a peculiar condition combination was identified that gives a very high RI probability. Such a combination can be considered a sufficient condition for RI that almost guarantees an RI will take place. Applications of classification techniques to intensity forecasting will also be discussed. Several drawbacks and future directions for SDM with the TC intensity change problem will be discussed at the end of the talk.
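Association rule mining of this kind can be illustrated with the standard support/confidence measures. The sketch below uses made-up condition names and toy data, not the datasets or predictors from the talk:

```python
from itertools import combinations

# Toy records: each maps hypothetical predictor conditions to True/False,
# plus whether the storm rapidly intensified (RI). All values are illustrative.
records = [
    {"low_shear": True,  "warm_sst": True,  "moist_core": True,  "RI": True},
    {"low_shear": True,  "warm_sst": True,  "moist_core": False, "RI": True},
    {"low_shear": False, "warm_sst": True,  "moist_core": True,  "RI": False},
    {"low_shear": True,  "warm_sst": False, "moist_core": True,  "RI": False},
    {"low_shear": False, "warm_sst": False, "moist_core": False, "RI": False},
]

def confidence(conditions, records, outcome="RI"):
    """P(outcome | all conditions hold): support(conditions and outcome) / support(conditions)."""
    matching = [r for r in records if all(r[c] for c in conditions)]
    if not matching:
        return 0.0
    return sum(r[outcome] for r in matching) / len(matching)

# Evaluate every combination of predictors and keep high-confidence rules.
predictors = ["low_shear", "warm_sst", "moist_core"]
for k in range(1, len(predictors) + 1):
    for combo in combinations(predictors, k):
        conf = confidence(combo, records)
        if conf >= 0.9:
            print(combo, round(conf, 2))
```

A rule whose confidence approaches 1.0 on sufficient data is exactly the kind of “near-sufficient condition” for RI described above; real studies would also require a minimum support threshold so rules are not based on a handful of storms.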

Establishing robust quantitative metrics that allow decision makers to determine the amount of risk in a system with extreme loss events is a problem of interest in many scientific fields. One of the fundamental metrics universally accepted across risk management is the quantity known as Value-at-Risk (VaR). A subfield of risk management, modern Operational Risk Management (ORM), closely investigates methodologies for robustly estimating VaR. Currently, academic researchers and industry practitioners are actively looking at ways to make this estimate more statistically robust and accurate with minimal assumption requirements.

In this talk I will present two new quantitative approaches for estimating VaR that are agnostic regarding the relationship between frequency and severity: (1) Data Partition of Frequency and Severity (DPFS), which uses K-means clustering to estimate VaR; and (2) Distribution-Based Partitioning (DBP) of frequency and severity using copulas. Verification is conducted on five simulated scenario datasets, while validation is conducted on five publicly available datasets from four different domains: US financial index data from the Standard & Poor’s 500 and the Dow Jones Industrial Average; chemical spills tracked by the US Coast Guard; Australian automobile accidents; and US hurricane data. We observe that existing VaR methodologies estimate VaR inaccurately for 80% of the cases in the simulated data and 60% of the cases in the real-world data, while the new methodologies attain accurate VaR estimates within the 95% confidence interval bounds for both simulated and real-world data.
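VaR itself is simply a high quantile of the loss distribution. The sketch below shows a plain empirical estimator as an illustration of the metric only; it is not the DPFS or DBP methodology, and the lognormal severity data is a made-up example:

```python
import random

def empirical_var(losses, level=0.95):
    """Empirical Value-at-Risk: the loss not exceeded with probability `level`."""
    ordered = sorted(losses)
    # Simple order-statistic estimator of the `level` quantile.
    idx = min(len(ordered) - 1, int(level * len(ordered)))
    return ordered[idx]

random.seed(0)
# Toy heavy-tailed operational losses: lognormal severities.
losses = [random.lognormvariate(0, 1.5) for _ in range(10_000)]
print(f"95% VaR: {empirical_var(losses, 0.95):.2f}")
print(f"99% VaR: {empirical_var(losses, 0.99):.2f}")
```

The difficulty the talk addresses is that with heavy tails and a frequency/severity split, the naive empirical quantile can be badly biased, which motivates partitioning the data before estimation.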

Polymers and other soft materials play an important role as coatings for nanoscale building blocks such as metallic nanoparticles (NPs) and nanorods. This coating mediates interactions between the building blocks and their environment. Atomistic molecular dynamics (MD) simulations are ideal for examining the role of chemistry and atomic interactions at the sub-nanometer scale, for example in the interactions between an NP and a solvent, or between pairs of NPs. Unfortunately, atomistic MD simulations are limited to lengths of order 50 nm and times of order 50 ns. The time-scale limitation precludes modeling nanoscale self-assembly and limits dynamic simulations to extremely high rates of deformation or thermal forcing. These simulations are also limited to sizes that represent a small number of NPs, making it impossible to model large assembled structures.

Faced with these limitations, we have developed coarse-grained (CG) models of polyethylene, a simple polymer used to coat NPs and nanorods. These models have enabled simulations of bulk polymer melts that overcome the limits of atomistic MD by providing a computational speedup of greater than 10^4 while retaining fundamental details at the sub-nanometer scale. These details produce the viscoelastic properties and semi-crystalline behavior that are intrinsic to polyethylene and that are missed by generic CG models. When applied to an NP coating, the CG models capture the coating morphology, indicating the value of using these CG models in nanoscale applications.
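The core of any coarse-graining scheme is the mapping from groups of atoms to single beads. A minimal sketch of that mapping step follows; the group size, masses, and coordinates are illustrative placeholders, not the CG model from the talk:

```python
def coarse_grain(positions, masses, atoms_per_bead):
    """Map atomistic coordinates to CG bead positions at each group's center of mass."""
    beads = []
    for start in range(0, len(positions), atoms_per_bead):
        group_pos = positions[start:start + atoms_per_bead]
        group_mass = masses[start:start + atoms_per_bead]
        total = sum(group_mass)
        com = tuple(
            sum(m * p[d] for m, p in zip(group_mass, group_pos)) / total
            for d in range(3)
        )
        beads.append(com)
    return beads

# Toy polyethylene-like chain: 6 united atoms along x, mapped 3 atoms -> 1 bead.
chain = [(float(i), 0.0, 0.0) for i in range(6)]
masses = [14.0] * 6  # CH2 united-atom mass in amu, for illustration
print(coarse_grain(chain, masses, 3))
```

The speedup comes from evolving far fewer degrees of freedom with softer interactions and larger time steps; the hard part, which the talk addresses, is choosing bead interactions that preserve viscoelastic and semi-crystalline behavior.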

Many applied processes generate complex microstructures or patterns that are hard to quantify due to the lack of any underlying regular structure. These patterns may evolve with time or include some element of stochasticity. The resulting variations in the fine-scale structure frequently force one to concentrate on rougher geometric features. From a mathematical point of view, several notions from algebraic topology suggest themselves as natural quantification tools in such a setting. In this talk I will describe some of these tools, in particular homology and persistent homology, and how they can be efficiently computed using open source software. I will also present some applications motivated by materials science problems.
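As a concrete illustration of the homology computations mentioned above, here is a small pure-Python sketch that computes Betti numbers of a simplicial complex from boundary-matrix ranks over GF(2). It is a toy version of what the open-source packages referenced in the talk do at scale:

```python
from itertools import combinations

def gf2_rank(rows):
    """Rank over GF(2) of a binary matrix whose rows are int bitmasks."""
    rows = list(rows)
    rank = 0
    for i in range(len(rows)):
        pivot = rows[i]
        if pivot == 0:
            continue
        rank += 1
        low = pivot & -pivot  # lowest set bit acts as the pivot column
        for j in range(i + 1, len(rows)):
            if rows[j] & low:
                rows[j] ^= pivot
    return rank

def boundary_rows(k_simplices, face_index):
    """Each k-simplex becomes a bitmask over its (k-1)-dimensional faces."""
    rows = []
    for s in k_simplices:
        mask = 0
        for face in combinations(s, len(s) - 1):
            mask |= 1 << face_index[face]
        rows.append(mask)
    return rows

def betti_numbers(simplices, top_dim):
    """Betti numbers b_0..b_top_dim via rank-nullity over GF(2)."""
    by_dim = {d: sorted(s for s in simplices if len(s) == d + 1)
              for d in range(top_dim + 1)}
    ranks = {}
    for d in range(1, top_dim + 1):
        face_index = {f: i for i, f in enumerate(by_dim[d - 1])}
        ranks[d] = gf2_rank(boundary_rows(by_dim[d], face_index))
    betti = []
    for d in range(top_dim + 1):
        kernel = len(by_dim[d]) - ranks.get(d, 0)  # dim ker of boundary map d
        image = ranks.get(d + 1, 0)                # dim im of boundary map d+1
        betti.append(kernel - image)
    return betti

# Hollow triangle (topological circle): one component, one loop.
circle = [(0,), (1,), (2,), (0, 1), (0, 2), (1, 2)]
print(betti_numbers(circle, 1))  # -> [1, 1]
```

Persistent homology extends this by tracking how these Betti numbers are born and die as a scale parameter grows, which is what makes the approach robust to the noise and thermal motion in simulated materials data.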

Leveling the Playing Field: Information Asymmetry in the Used Vehicle Buying Process

Monday, January 29, 4:30-5:45
Exploratory Hall, Room 3301

Abstract

In 1970, economist and Nobel Prize winner (2001) George Akerlof published the study “The Market for ‘Lemons’: Quality Uncertainty and the Market Mechanism” (The Quarterly Journal of Economics, Vol. 84, No. 3, Aug. 1970, pp. 488-500). In the study, Akerlof shows that a market in which the seller of a product has more information than the buyer about the product’s quality can result in “an adverse selection of low-quality products.” Nowhere is Akerlof’s theory better represented than in the used vehicle market, where buyers and sellers don’t always have the same information about a vehicle’s quality, potentially resulting in low-quality cars being bought and sold. In 1986, CARFAX sought to begin to “level the playing field” between buyers and sellers of used vehicles by collecting, analyzing, and making relevant data available in the marketplace to both buyers and sellers.
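Akerlof’s unraveling argument can be made concrete with a toy calculation. All numbers below are illustrative assumptions, not figures from the paper or the talk:

```python
# Toy market-for-lemons calculation. Buyers cannot tell good cars from lemons,
# so they will pay at most the expected value across the cars actually offered.
GOOD_VALUE = 10_000   # what a buyer would pay for a known-good car
LEMON_VALUE = 4_000   # what a buyer would pay for a known lemon
GOOD_RESERVE = 8_000  # minimum a good-car owner will accept

lemon_share = 0.5
expected_value = (1 - lemon_share) * GOOD_VALUE + lemon_share * LEMON_VALUE
print(f"Buyers offer at most ${expected_value:,.0f}")

# The offer is below GOOD_RESERVE, so good-car owners withdraw, only lemons
# remain, and the price falls further: adverse selection.
if expected_value < GOOD_RESERVE:
    print("Good cars exit the market; only lemons trade.")
```

A vehicle history report works by shrinking the information gap itself, so buyers no longer have to price every car at the pooled expected value.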

What data does CARFAX collect?

Why this data vs other data?

How does the market drive the type of data collected?

How does CARFAX analyze and make this data available to answer real world questions?

Is this car safe?

How much should I pay for it?

What’s it worth?

What’s the risk to insure it?

What’s the risk to finance it?

Faisal Hasan is the General Manager, Data & Public Policy at CARFAX. In his over 17 years at CARFAX, Faisal has been responsible for helping to build CARFAX’s Vehicle History Database through public and private data acquisition efforts across North America, including overcoming legislative and regulatory hurdles to data access. Faisal focuses on CARFAX’s efforts to secure and analyze data to feed the CARFAX “Onetime to Lifetime” Game Plan and develop future CARFAX products. Faisal earned his B.A. in Government & Politics at George Mason University and his M.A. in Government at the Johns Hopkins University. Faisal has been a Fairfax County resident for over 35 years. He is married with four kids, including a GMU junior studying Biology and a 2017 GMU Kinesiology graduate.

Sri Melkote is Head of Business Analytics at CARFAX. Sri has been at CARFAX for 2 years and leads teams responsible for valuation modeling, pricing analytics, marketing research, media measurement and optimization. He has over 12 years of data science experience. Prior to joining CARFAX, Sri developed dynamic pricing systems, personalized recommendation engines and inventory planning systems for the travel industry. He holds a Master’s degree in Mathematics from Purdue University and a Bachelor’s degree in Chemical Engineering from Indian Institute of Technology, Madras. He is the author of several journal articles and a recipient of the 2011 INFORMS Revenue Management and Pricing Practice Award.

Feras A. Batarseh
Research Assistant Professor
Department of Geography and Geoinformation Science
College of Science
George Mason University

Why an open mind on open data can transform our collective intelligence

Monday, February 5, 4:30-5:45
Exploratory Hall, Room 3301

Abstract: In 1822, founding father James Madison wrote: “A popular government, without popular information, or the means of acquiring it, is but a prologue to a farce or a tragedy, or perhaps both.” Recent technological waves have clearly served Madison’s vision of government transparency. The latest advancements in Artificial Intelligence (AI), Data Science, and Machine Learning can make federal data openness low-hanging fruit. Moreover, the big data and open government initiatives (signed in 2012 and 2013) are major enablers for transforming government into a new era of intelligent, data-driven policy making. However, to be able to use data to reform the political discussion, public federal data needs to deliver the promised openness.

Besides benefiting government, Open Data benefits many other domains and applications of data science, such as healthcare, finance, and academia. For example, Open Data could lead to a general openness in science (i.e., Open Science), clearer experimental research, and a reshaping of human knowledge in general. These topics and other facets will be discussed in this talk.

Bio: Feras A. Batarseh is a Research Assistant Professor in the Department of Geography and Geoinformation Science, College of Science, George Mason University in Fairfax, VA. His research spans the areas of Data Science, Artificial Intelligence, and Context-Aware Software Systems. Dr. Batarseh obtained his Ph.D. and M.Sc. in Computer Engineering from the University of Central Florida (UCF) (2007, 2011), and a Graduate Certificate in Project Leadership from Cornell University (2016). His research work has been published in various prestigious journals and at international conferences. Additionally, Dr. Batarseh has published and edited several book chapters.

Dr. Batarseh has taught data science and software engineering courses at multiple universities, including GMU, UCF, and George Washington University (GWU). Prior to joining GMU, Dr. Batarseh was a Program Manager with the Data Mining and Advanced Analytics team at MicroStrategy, Inc., a global business intelligence corporation based in Tysons Corner, Virginia. During his tenure, he helped several clients make sense of their data and gain insights into improving their operations. For more information on his research and contact details, please refer to his webpage: http://ferasbatarseh.com/

Abstract: Modernization of nearly all the technology that underlies the provision of rail and bus transit service over the past 30 years has resulted in a vast amount of data that until recently has been more or less neglected. Meanwhile, challenges that face rail and bus transit systems continue to mount, from maintaining a state of good repair to capturing and keeping riders in the age of Uber/Lyft and bike share. The key to providing safe, convenient, affordable, and reliable transit service into the next century lies in the hands of data scientists and policy analysts. This talk will review the different data-generating technologies and the types of data they create, followed by an exploration of the pressing issues faced by transit agencies and the questions begging for answers.

Bio: Michael currently serves as Strategic Planning Advisor at WMATA in the Office of Planning’s Applied Planning Intelligence unit, where he focuses on transforming data into information to help inform policy and planning decisions. He currently focuses on fare policy, crowding, GTFS data and online tools, and customer-focused performance metrics. Before joining WMATA in 2010, he worked for Oracle Corporation, an IT start-up, and the Metropolitan Washington Council of Governments. He holds a BS in Systems Analysis and Engineering from The George Washington University, and master’s degrees in City and Regional Planning and Transportation Engineering from UC Berkeley.