Distinguished Professor, Department of Government
Department of Mathematics & Statistics,
Founding Director, Center for Data Science
Member, Center for Neuroscience and Behavior

American University, 4400 Massachusetts Avenue, NW, Washington, DC 20016

Current Research

Selected Current Research: Jeff Gill, Updated 1/10/2019

Models for Identifying Substantive Clusters and Fitted Subclusters in Social Science Data
(with George Casella)

Unseen grouping, often called latent clustering, is a common feature in social science data. Subjects may intentionally or unintentionally group themselves in ways that complicate the statistical analysis of substantively important relationships. This work introduces a new model-based clustering design that incorporates two sources of heterogeneity. The first source is a random effect that introduces substantively unimportant grouping but must be accounted for. The second source is more important and more difficult to handle since it is directly related to the relationships of interest in the data. We develop a model to handle both of these challenges and apply it to data on terrorist groups, which are notoriously hard to model with conventional tools.

An Imputation Solution for Differentiating between Unreported Attitudes and Genuine Nonattitudes in Survey Data
(with Natalie Jackson)

Most survey analyses treat “don’t know” or nonattitude responses as missing values and drop them from analysis with casewise (listwise) deletion. To date, considerable research has been devoted to minimizing such responses, so that missing data are minimized. There are two problems with this approach: (1) We know that casewise deletion is the wrong way to deal with unrecorded data unless they are missing completely at random (not conditional on other data, observed or unobserved). Otherwise, statistical principles dictate that we should use some form of imputation. Imputation, though, implies that these respondents actually have attitudes on the questions but have declined to state them, leading to the second issue: (2) We do not know whether non-substantive responses are true nonattitudes or whether the respondent is choosing not to reveal an existing attitude. In this work we demonstrate first that nonattitudes and “don’t know” responses are not random, but rather come from a distinct group of survey respondents. This is shown by modeling relevant missingness as a dichotomous outcome variable explained by various characteristics, including demographic attributes, other attitudinal questions, and group-level contexts. This model allows us to produce an imputational model to predict missingness due to ignorance versus intransigence. We use these “data” as part of the survey analysis, using the appropriate statistical treatment of the coefficient variability, to produce estimates that are not plagued by casewise deletion or fictitious attitudes generated by imputation. Our results demonstrate that this approach is useful for a wide range of survey research, including pre-election polls and non-political surveys.

A Flexible Class of Bayesian Frailty Models For Political Science Data
(with Jonathan Homola)

This manuscript reviews basic nonparametric (Cox) survival models and shows how heterogeneous effects on time-to-event outcomes can be captured by frailty terms, which are analogous to hierarchies in multilevel models. A derivation and simulations are provided to emphasize that not accounting for frailties when present in the data leads to biased coefficients. We then extend the use of frailty models in political science by adding multiple nested and non-nested hierarchies in a Bayesian context. We also specify group-level covariates, which has not been done with political science data even though data in the discipline frequently have levels of aggregation. We illustrate the strength and flexibility of our model with applications in American Politics, Comparative Politics, and the Women in Politics literature.
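
For reference, the textbook single-level shared-frailty extension of the Cox model can be written as below. This is a generic form offered as a sketch, not necessarily the authors' specification; the symbols (h_0 for the baseline hazard, w_j for the frailty of group j, b_j for its logarithm, x_ij for the covariates of subject i in group j) are notation assumed here for illustration.

```latex
h_{ij}(t \mid w_j)
  = h_0(t)\, w_j \exp\!\left(\mathbf{x}_{ij}^{\top}\boldsymbol{\beta}\right)
  = h_0(t) \exp\!\left(\mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + b_j\right),
\qquad b_j = \log w_j, \quad w_j > 0 .
```

Written with b_j in the exponent, the frailty plays the same role as a group-level random intercept in a multilevel model, which is the analogy the abstract draws. A common convention is to give w_j a gamma distribution with mean one so that the baseline hazard remains identified.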

(with Beat Kaufmann and Allan Doctor)

Introduction: Exaggerated hypoxic pulmonary vasoconstriction is a hallmark of high altitude pulmonary edema (HAPE) and has been ascribed to reduced nitric oxide (NO) availability. Because erythrocytes capture, process and dispense NO as a function of oxygen (O2) gradients, we investigated the role of hemoglobin-bound NO in HAPE. Methods: 23 mountaineers were studied at low altitude and after ascent to high altitude (HA, 4559 m). Echocardiographic parameters including systolic pulmonary artery pressure (sPAP), blood gases, and the erythrocyte NO content (as total (NO:Hb) and that bound to either thiol (SNO:Hb) or to heme (FeNO:Hb)) were studied in mountaineers susceptible or resistant to developing HAPE (HAPE-S and HAPE-R). Results: At HA, hemoglobin O2 saturation decreased by 15-20%. sPAP increased from 20±3 mmHg to 38±8 mmHg in HAPE-R, and from 21±3 mmHg to 47±9 mmHg in HAPE-S (p<0.05 vs HAPE-R). The absolute amount of NO:Hb, SNO:Hb, and FeNO:Hb in arterial and venous blood did not differ between HAPE-S and HAPE-R, but regression analysis demonstrated an influence of erythrocyte NO metabolism parameters on sPAP. Pooled analysis of all studied subjects showed significant increases of venous NO content at HA (p<0.001) and an unexpected reversal of the erythrocyte NO gradient across the systemic circulation, irrespective of HAPE susceptibility. Conclusions: Erythrocyte NO metabolism does not contribute to the increased sPAP in HAPE-S, but has an influence on changes in sPAP at HA. Also, at HA, there is a reversal of the erythrocyte NO gradient across the pulmonary vasculature, suggesting peripheral erythrocyte NO loading as a mechanism counterbalancing hypoxic pulmonary vasoconstriction at altitude.

Using Novel Biomarkers to Predict the Progression of Intracranial Pressure in Pediatric Traumatic Brain Injury
(with Jose Pineda)

Severe traumatic brain injury (TBI) remains a leading cause of pediatric death and disability (1). Motor vehicle accidents, falls and abusive head trauma constitute the most common etiologies (2, 3). Pharmacological neuroprotective therapies are not available for severe TBI, but guideline-based intensive care can improve outcomes (4-10). Guideline-based intensive care recommends avoidance of secondary insults that are consistently associated with abnormal brain metabolism and poor outcome, including intracranial hypertension, hyperventilation, hypoxia and hypotension (11, 12). While all these secondary insults continue to be reported in pediatric patients with severe TBI, intracranial hypertension (ICH) is the most common one [REF]. It can result in direct brain injury and even cerebral herniation and death, or contribute to low cerebral perfusion, worsening brain metabolism. Recommendations for initiation of intracranial pressure (ICP) monitoring and management of ICH are based on clinical and radiological indicators of injury severity (Glasgow Coma Scale score and computerized tomography) and ICH thresholds associated with worse outcome in a time-dose dependent fashion. Multiple reports support neuropathological effects of exposure to ICH, and improved outcomes have been observed when more aggressive ICH-directed therapy is provided. In contrast, pediatric randomized trials lowering ICH have not demonstrated a benefit on outcomes [REF]. Not surprisingly, clinical practice and outcomes remain highly variable, suggesting that a better understanding of the patient trajectories that influence outcome is needed. Efforts to address these challenges by incorporating anatomical gradation of injury and quantification of physiological trajectories are of limited utility or in early stages of validation (13-16).
We propose that neuropathological information from two serum-based biomarkers of cellular injury, ubiquitin carboxyl-terminal esterase L1 (UCH-L1) and glial fibrillary acidic protein breakdown products (GFAP-BDPs), will improve our ability to characterize injury progression in pediatric severe TBI patients. These biomarkers may in the future also allow biological quantification of response to therapy, facilitating timely and individualized adjustments in patient care and consequently better outcomes.

(with Jose Pineda)

The Defense Neurotrauma Pharmacology Group highlights the absence of effective pharmacological agents for neuroprotection in patients with severe TBI. In contrast, multiple reports provide evidence that the adoption of guideline-based care reduces mortality and improves functional outcome in adult and pediatric patients with severe TBI. Although guideline-based care in TBI improves outcomes, implementation of these guidelines is limited, particularly in pediatric patients. There is a need to implement guideline-based care more effectively. Challenges preventing wide and effective implementation include the lack of strategies that account for the complex, multilevel nature of both TBI care and the teams caring for these patients. These challenges often compromise fidelity to the guidelines, resulting in large variability in care and outcomes. Our long-term goal is to develop and rigorously test an implementation strategy that fits the realities of patient care and contributes to sustained implementation of guideline-based care for children with severe TBI. This innovative proposal enables us to take the first step in this long-range implementation program. Our electronic standalone system aims to overcome substantial challenges associated with less effective approaches such as non-electronic guideline implementation strategies and presentation of advice within electronic health record (EHR) systems [Bickman+][Roshanov][Vison]. Our user-centered design brings together characteristics of clinical decision support systems associated with improved clinical practice and patient outcomes. These characteristics include point-of-care, real-time, automatic provision of recommendations, rather than just clinical assessments [Roshanov, Kawamoto and Bickman].
Importantly, by using a dynamic sustainability framework, we account for variation in resources, infrastructure and operating procedures [Vison and Chambers].

Measuring the Ideology of State and Congressional Districts Using Universal Kriging
(with Jamie Monogan)

In this paper, we develop and make available measures of public ideology in 2010 for the 50 American states, 435 congressional districts, and state legislative districts. We do this using the geospatial statistical technique of Bayesian kriging, which uses the locations of survey respondents, as well as population covariate values, to predict ideology for simulated citizens in districts across the country. In doing this, we improve on past research that uses the kriging technique for forecasting public opinion by incorporating Alaska and Hawaii, making the important distinction between ZIP codes and ZIP code tabulation areas, and introducing more precise data from the 2010 Census. We show that our estimates of ideology at the state, congressional district, and state legislative district levels appropriately predict the ideology of legislators elected from these districts, serving as an external validity check.
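
To make the prediction step concrete, here is a minimal sketch of the classical (non-Bayesian) universal kriging predictor, which combines a covariate trend with spatially correlated errors. It is not the authors' implementation. The exponential covariance function, its fixed sill and range values, and the function names `exp_cov`, `solve`, and `universal_krige` are all assumptions made for illustration; a Bayesian treatment would place priors on these parameters instead of fixing them.

```python
import math


def exp_cov(p, q, sill=1.0, corr_range=2.0):
    """Exponential covariance between two (x, y) locations (assumed form)."""
    return sill * math.exp(-math.dist(p, q) / corr_range)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def universal_krige(locs, trend, y, loc0, trend0):
    """Predict the response at loc0 from observations y at locs.

    trend holds one row of trend covariates per observation (including a
    leading 1 for the intercept); trend0 is the same row for the target.
    """
    n, p = len(locs), len(trend0)
    # Augmented system [[Sigma, X], [X^T, 0]] [lambda; mu] = [c0; x0]:
    # lambda are the kriging weights, mu the Lagrange multipliers that
    # enforce unbiasedness with respect to the trend.
    a = [[exp_cov(locs[i], locs[j]) for j in range(n)] + list(trend[i])
         for i in range(n)]
    a += [[trend[i][k] for i in range(n)] + [0.0] * p for k in range(p)]
    b = [exp_cov(locs[i], loc0) for i in range(n)] + list(trend0)
    weights = solve(a, b)[:n]
    return sum(w * yi for w, yi in zip(weights, y))


if __name__ == "__main__":
    # Toy data: five hypothetical respondents with one covariate each.
    locs = [(0, 0), (1, 0), (0, 1), (2, 2), (3, 1)]
    trend = [[1.0, 0.2], [1.0, 0.5], [1.0, 0.1], [1.0, 0.9], [1.0, 0.4]]
    ideology = [-0.3, 0.1, -0.5, 0.8, 0.2]
    print(universal_krige(locs, trend, ideology, (1.5, 1.5), [1.0, 0.6]))
```

Solving one augmented system keeps the sketch short. With no nugget term the predictor interpolates exactly at observed locations, so a real application to noisy survey responses would typically add a nugget variance to the diagonal of the covariance matrix.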

Optimized Formulation, Delivery and Dosing for ErythroMer (Artificial Red Cell)
(with Allan Doctor)

There is a need for an artificial oxygen (O2) carrier for use when banked Red Blood Cells (RBC) are (1) unavailable or (2) undesirable. To address this need, we developed ‘ErythroMer’ (EM), a first-in-class, bio-synthetic, nano-cyte RBC mimetic. EM is a deformable, hybrid polymeric nanoparticle that incorporates high per particle payloads of hemoglobin (Hb). Our bio-inspired ‘artificial cell’ design has yielded a prototype that emulates RBC physiology in all key respects and represents a potentially disruptive introduction into Transfusion Medicine.
Two major approaches have been pursued to develop an artificial O2 carrier: perfluorocarbon emulsions (PFCs) and modified Hb agents (HBOCs) [1-3]. Both have fallen short [4,5], possibly because designs do not emulate normal physiology, resulting in interactions with NO and O2 that disrupt homeostatic controls (particularly, controls matching vascular tone to tissue metabolism, e.g. hypoxic vasodilation (HVD)) [6-8]. When free in plasma (as for most HBOCs), (1) Hb loses allosteric control and exhibits abnormally high O2 affinity and (2) globin-chain crosslinking (required to stabilize HBOC tetramers) interferes with normal cooperativity. Both changes impair O2 delivery. Moreover, free Hb disturbs vasoregulation due to avid NO trapping/consumption [9-13]. Such impaired vasoregulation is a critical problem; because this effect reduces blood flow, O2 delivery even by native RBCs is prevented [14] (and in particular, to hypoxic tissue, by impairing HVD). Chemically modified cell-free Hbs have also suffered an unfavorable risk-benefit profile: a recent HBOC meta-analysis demonstrated a significant increase in hypertension, myocardial damage and mortality in surgical patients [15]. Alternatively, perfluorocarbon-based O2 carriers exhibit fewer side effects. However, for any given pO2, Hb binds significantly more O2 than can be dissolved in PFCs, and in contrast to the Hb sigmoidal binding curve, PFCs demonstrate a flat O2 solubility curve. As a result, most of the O2 carried by PFCs is prematurely released [16,17], limiting tissue delivery [18]. Finally, neither PFCs nor most HBOCs can be lyophilized for prolonged storage. At this time, the majority of products under active development are RBC-imitating vesicles or nanoparticles; these continue to struggle with: 1) complement activation by liposomal shells, 2) static O2 affinity, 3) NO trapping, 4) complex metHb reduction systems, and 5) designs not amenable to lyophilization [19-25].
The EM design surmounts these weaknesses by: 1) encapsulating Hb in a novel bio-compatible polymeric shell with RBC-emulating morphology, 2) controlling O2 capture/release with a novel 2,3-DPG shuttle (2,3-DPG is the major heterotropic effector for Hb and diminishes O2 affinity), 3) attenuating NO uptake through shell properties, and 4) retarding metHb formation by co-packaging a reduction system. Moreover, EM is designed for sterile lyophilization and so is amenable to facile reconstitution after extended dry storage under ambient conditions. EM offers a pragmatic approach to a complex need and is designed for cost-effective production at scale. To date, the prototype has passed rigorous initial ex vivo and in vivo “proof of concept” testing. Notably, each parameter is independently controllable by manipulating components of the EM formulation (particle size and payload density, membrane bi-layer thickness and degree of intra-particulate surface cross-linking, and molar ratios amongst payload components [Hb, 2,3-DPG, and leucomethylene blue]). Most importantly, EM is amenable to ongoing optimization through systematic design and structure/activity study.
This project will optimize parameters essential to pragmatic in-field ErythroMer use: (1) formulation to produce a stable, sterile, easily reconstituted, lightweight dry preparation, (2) field-deployable reconstitution and administration procedure and (3) dosing that optimally balances efficacy/toxicity.

Should Missing Values of the Outcome Variable Be Imputed for Regression Models?
(with Ben Baggozi)

While approaches to missingness in explanatory variables are now well understood, researchers are often confused about missingness in modeled outcomes. In this work we look at the question of whether missing outcome variables should be imputed with standard tools. We summarize the current state of practice in statistics and empirical social science, analytically derive the effects of imputation, and demonstrate properties with a Monte Carlo simulation. In general, standard imputation methods (multiple imputation, random imputation, Bayesian stochastic imputation, etc.) are appropriate for missing Y-variables, but we also point out areas of caution. We specifically address the questions of when and how to impute missing values in the outcome variable. Some scholars still feel that one should not use various methods to fill in such missingness, since the subsequent model specification is providing fitted outcome values given levels of covariates and this would then be a redundant process. We have shown that this is misguided since the general imputation process differs from the modeling process, and failure to include the outcome variable in the imputation process leads to biased results which can be worse than listwise deletion. Our new method for including the outcome variable in the imputation process, easily implemented in R and other languages, was shown to be superior to alternative approaches through the Monte Carlo simulations. We then demonstrate through a published example that the strategy for handling missing outcome variables can have a profound effect on the key substantive conclusions. Our intention here has been to raise the issue of one kind of missingness, provide solutions for the problem, and demonstrate that it matters to all empirical political scientists.
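
The claim that leaving the outcome out of the imputation model can do worse than listwise deletion can be illustrated with a small Monte Carlo sketch. This is not the authors' method or simulation design. It assumes one normally distributed covariate that is missing completely at random, a linear outcome with a true slope of 2, and single stochastic imputation; the function names and all parameter values are hypothetical.

```python
import math
import random
import statistics


def ols(x, y):
    """Return (intercept, slope) from a least-squares fit of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residual_sd(x, y, intercept, slope):
    """Residual standard deviation of the fitted line y ~ x."""
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    return math.sqrt(rss / (len(x) - 2))


def simulate(n=2000, p_missing=0.4, true_slope=2.0, reps=20, seed=1):
    """Average slope estimates of y on x under three missing-x strategies."""
    rng = random.Random(seed)
    totals = {"listwise": 0.0, "impute_without_y": 0.0, "impute_with_y": 0.0}
    for _ in range(reps):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        y = [1.0 + true_slope * xi + rng.gauss(0.0, 1.0) for xi in x]
        miss = [rng.random() < p_missing for _ in range(n)]  # MCAR in x
        x_obs = [xi for xi, m in zip(x, miss) if not m]
        y_obs = [yi for yi, m in zip(y, miss) if not m]

        # (1) Listwise deletion: fit on complete cases only.
        totals["listwise"] += ols(x_obs, y_obs)[1]

        # (2) Impute x from its own observed distribution, ignoring y.
        mu, sd = statistics.fmean(x_obs), statistics.stdev(x_obs)
        x_a = [rng.gauss(mu, sd) if m else xi for xi, m in zip(x, miss)]
        totals["impute_without_y"] += ols(x_a, y)[1]

        # (3) Impute x from a regression on y, adding residual noise.
        a, b = ols(y_obs, x_obs)
        s = residual_sd(y_obs, x_obs, a, b)
        x_b = [a + b * yi + rng.gauss(0.0, s) if m else xi
               for xi, yi, m in zip(x, y, miss)]
        totals["impute_with_y"] += ols(x_b, y)[1]
    return {k: v / reps for k, v in totals.items()}


if __name__ == "__main__":
    for name, slope in simulate().items():
        print(f"{name:>18}: {slope:.3f}")
```

Under these assumptions, imputing without the outcome shrinks the estimated slope toward roughly (1 - p_missing) times its true value, because the imputed covariate values are unrelated to y. Listwise deletion and imputation that conditions on y both stay close to the true slope.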



Copyright ©2024 American University