Updated Current Research

Models for Identifying Substantive Clusters and Fitted Subclusters in Social Science Data
(with George Casella)

Unseen grouping, often called latent clustering, is a common feature in social science data.  Subjects may intentionally or
unintentionally group themselves in ways that complicate the statistical analysis of substantively important relationships.
This work introduces a new model-based clustering design which incorporates two sources of heterogeneity.  The first source
is a random effect that introduces substantively unimportant grouping but must be accounted for.
The second source is more important and more difficult to handle since it is directly related to the relationships of interest in
the data.  We develop a model to handle both of these challenges and apply it to data on terrorist groups, which are
notoriously hard to model with conventional tools.

An Imputation Solution for Differentiating between Unreported Attitudes and Genuine Nonattitudes in Survey Data
(with Natalie Jackson)

Most survey analyses treat “don’t know” or nonattitude responses as missing values and drop them from analysis with casewise
(listwise) deletion.  To date, considerable research has been devoted to reducing such responses so that missing data are
minimized. There are two problems with this approach: (1) We know that casewise deletion is the wrong way to deal with unrecorded
data unless it is missing completely at random (not conditional on other data, observed or unobserved). Otherwise, statistical
principles dictate that we should use some form of imputation. Imputation, though, implies that these respondents actually have
attitudes on the questions but have declined to state them, leading to the second issue: (2) We do not know whether
non-substantive responses are true nonattitudes or the respondent is choosing not to reveal an existing attitude. In this work we
demonstrate first that nonattitudes and “don’t know” responses are not random, but rather come from a distinct group of survey
respondents. This is shown by modeling relevant missingness as a dichotomous outcome variable explained by various
characteristics, including demographic attributes, other attitudinal questions, and group level contexts. This model allows us to
produce an imputational model to predict missingness due to ignorance versus intransigence. We use these "data" as part of the
survey analysis, using the appropriate statistical treatment of the coefficient variability, to produce estimates that are not
plagued by casewise deletion or fictitious attitudes generated by imputation. Our results demonstrate that this approach is
useful for a wide range of survey research, including pre-election polls and non-political surveys.
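
As a rough illustration of the first step described above (treating missingness as a dichotomous outcome), the following Python sketch fits a logistic regression of a "don't know" indicator on a single respondent characteristic. It is not the authors' specification: the covariate name (political interest), the simulated data, and the true coefficients are all assumptions made for the example, and a real analysis would include demographic, attitudinal, and group-level predictors.

```python
import math
import random


def fit_logistic(xs, ys, iters=25):
    """Fit P(y = 1) = 1 / (1 + exp(-(b0 + b1 * x))) by Newton-Raphson."""
    b0, b1 = 0.0, 0.0
    for _ in range(iters):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p            # score for the intercept
            g1 += (y - p) * x      # score for the slope
            h00 += w               # entries of the 2x2 information matrix
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: add (information matrix)^-1 times the score vector.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Hypothetical data: respondents with lower political interest are assumed
# to be more likely to answer "don't know" (true b0 = -1.0, b1 = -1.5).
rng = random.Random(1)
interest = [rng.gauss(0.0, 1.0) for _ in range(5000)]
dont_know = [
    1 if rng.random() < 1.0 / (1.0 + math.exp(-(-1.0 - 1.5 * z))) else 0
    for z in interest
]

b0_hat, b1_hat = fit_logistic(interest, dont_know)
print("intercept:", round(b0_hat, 2), "slope on interest:", round(b1_hat, 2))
```

A slope clearly different from zero in such a model is evidence that nonresponse is not missing completely at random, which is the condition under which casewise deletion would be defensible.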

A Flexible Class of Bayesian Frailty Models For Political Science Data
(with Jonathan Homola)

This manuscript reviews basic semiparametric (Cox) survival models and shows how heterogeneous effects on time-to-event outcomes
can be captured by frailty terms, which are analogous to hierarchies in multilevel models. A derivation and simulations are
provided to emphasize that not accounting for frailties when present in the data leads to biased coefficients. We then extend the
use of frailty models in political science by adding multiple nested and non-nested hierarchies in a Bayesian context. We also
specify group-level covariates, which has not been done with political science data even though data in the discipline frequently
have levels of aggregation. We illustrate the strength and flexibility of our model with applications in American Politics,
Comparative Politics, and the Women in Politics literature.
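
The claim that ignoring frailties biases coefficients can be sketched with a small simulation. The Python example below is a simplification and not the manuscript's derivation: it assumes an exponential baseline hazard rather than a Cox model, a gamma-distributed frailty shared within clusters, a binary covariate, and censoring at a fixed time. All parameter values are illustrative. The naive estimator is the log ratio of events per unit of exposure, which is the maximum likelihood estimate for an exponential model without frailty.

```python
import math
import random


def naive_log_hazard_ratio(theta, n_clusters=4000, cluster_size=10,
                           beta=math.log(2.0), censor=1.0, seed=0):
    """Simulate clustered exponential event times with an optional shared
    gamma frailty, then estimate the log hazard ratio while ignoring it."""
    rng = random.Random(seed)
    events = [0, 0]
    exposure = [0.0, 0.0]
    for _ in range(n_clusters):
        # Frailty with mean 1 and variance theta; theta == 0 means no frailty.
        w = rng.gammavariate(1.0 / theta, theta) if theta > 0 else 1.0
        for j in range(cluster_size):
            x = j % 2                                # binary covariate
            t = rng.expovariate(w * math.exp(beta * x))
            events[x] += 1 if t <= censor else 0
            exposure[x] += min(t, censor)
    rate = [events[g] / exposure[g] for g in (0, 1)]
    return math.log(rate[1] / rate[0])


true_beta = math.log(2.0)
no_frailty = naive_log_hazard_ratio(theta=0.0)
with_frailty = naive_log_hazard_ratio(theta=1.0)
print("true coefficient:          ", round(true_beta, 3))
print("estimate, no frailty:      ", round(no_frailty, 3))
print("estimate, frailty ignored: ", round(with_frailty, 3))
```

Under these assumptions the estimate that ignores the frailty is pulled toward zero, because high-frailty units fail early and the surviving risk sets in the two covariate groups become less comparable over time.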

(with Beat Kaufmann and Allan Doctor)

Introduction: Exaggerated hypoxic pulmonary vasoconstriction is a hallmark of high altitude pulmonary edema (HAPE) and has been
ascribed to reduced nitric oxide (NO) availability. Because erythrocytes capture, process and dispense NO as a function of oxygen
(O2) gradients, we investigated the role of hemoglobin-bound NO in HAPE.  Methods: 23 mountaineers were studied at low altitude
and after ascent to high altitude (HA, 4559m). Echocardiographic parameters including systolic pulmonary artery pressure (sPAP),
blood gases, and the erythrocyte NO content (as total (NO:Hb) and that bound to either thiol (SNO:Hb) or to heme (FeNO:Hb)) were
studied in mountaineers susceptible or resistant to developing HAPE (HAPE-S and HAPE-R).  Results: At HA, hemoglobin O2 saturation
decreased by 15-20%. sPAP increased from 20±3mmHg to 38±8mmHg in HAPE-R, and from 21±3mmHg to 47±9mmHg in HAPE-S (p<0.05 vs
HAPE-R). The absolute amount of NO:Hb, SNO:Hb, and FeNO:Hb in arterial and venous blood did not differ between HAPE-S and HAPE-R,
but regression analysis demonstrated an influence of erythrocyte NO metabolism parameters on sPAP. Pooled analysis of all studied
subjects showed significant increases of venous NO content at HA (p<0.001) and unexpected reversal of the erythrocyte NO gradient
across the systemic circulation, irrespective of HAPE susceptibility.  Conclusions: Erythrocyte NO metabolism does not contribute
to the increased sPAP in HAPE-S, but has an influence on changes in sPAP at HA. Also, at HA, there is a reversal of the
erythrocyte NO gradient across the pulmonary vasculature suggesting peripheral erythrocyte NO loading as a mechanism
counterbalancing hypoxic pulmonary vasoconstriction at altitude.

Using Novel Biomarkers to Predict the Progression of Intracranial Pressure in Pediatric Traumatic Brain Injury
(with Jose Pineda)

Severe traumatic brain injury (TBI) remains a leading cause of pediatric death and disability (1). Motor vehicle accidents, falls
and abusive head trauma constitute the most common etiologies (2, 3). Pharmacological neuroprotective therapies are not available
for severe TBI, but guideline-based intensive care can improve outcomes (4-10). Guideline-based intensive care recommends
avoidance of secondary insults that are consistently associated with abnormal brain metabolism and bad outcome, including
intracranial hypertension, hyperventilation, hypoxia and hypotension (11, 12).  While all these secondary insults continue to be
reported in pediatric patients with severe TBI, intracranial hypertension (ICH) is the most common one [REF]. It can result in
direct brain injury and even cerebral herniation and death, or contribute to low cerebral perfusion, worsening brain metabolism.
Recommendations for initiation of intracranial pressure (ICP) monitoring and management of ICH are based on clinical and
radiological indicators of injury severity (Glasgow Coma Scale score and computerized tomography) and ICH thresholds associated
with worse outcome in a time-dose dependent fashion. Multiple reports support neuropathological effects of exposure to
ICH, and improved outcomes have been observed when more aggressive ICH-directed therapy is provided. In contrast,
pediatric randomized trials lowering ICH have not demonstrated a benefit on outcomes [REF]. Not surprisingly, clinical practice
and outcomes remain highly variable, suggesting a better understanding of patient trajectories that influence outcome is needed.
Efforts to address these challenges by incorporating anatomical gradation of injury and quantification of
physiological trajectories are of limited utility or in early stages of validation (13-16).  We propose that
neuropathological information from two serum-based biomarkers of cellular injury, ubiquitin carboxyl-terminal esterase L1
(UCH-L1) and glial fibrillary acidic protein breakdown products (GFAP-BDPs), will improve our ability to characterize injury
progression in pediatric severe TBI patients. These biomarkers may in the future also allow biological quantification of
response to therapy, facilitating timely and individualized adjustments in patient care and consequently better outcomes.

(with Jose Pineda)

The Defense Neurotrauma Pharmacology Group highlights the absence of effective pharmacological agents for neuroprotection in patients
with severe TBI. In contrast, multiple reports provide evidence that the adoption of guideline-based care improves mortality and
functional outcome in adult and pediatric patients with severe TBI.  Despite this evidence, implementation of these guidelines
is limited, particularly in pediatric patients, and more effective implementation is needed. Challenges preventing wide and
effective implementation of guideline-based care include the lack of strategies that account for the complex, multilevel nature of
both TBI care and the teams caring for these patients. These challenges often compromise fidelity to the guidelines,
resulting in large variability in care and outcomes. Our long-term goal is to develop and rigorously test an
implementation strategy that fits the realities of patient care and contributes to sustained implementation of guideline-based
care for children with severe TBI. This innovative proposal enables us to take the first step in this long-range implementation
program. Our electronic standalone system aims to overcome substantial challenges associated with less effective approaches such
as non-electronic guideline implementation strategies and presentation of advice within electronic health record (EHR) systems
[Bickman+][Roshanov][Vison]. Our user-centered design brings together characteristics of clinical decision support systems
associated with improved clinical practice and patient outcomes. These characteristics include point-of-care, real-time automatic
provision of recommendations, rather than just clinical assessments [Roshanov, Kawamoto and Bickman]. Importantly, by using a
dynamic sustainability framework we account for variation in resources, infrastructure and operating procedures [Vison and

Measuring the Ideology of State and Congressional Districts Using Universal Kriging
(with Jamie Monogan)

In this paper, we develop and make available measures of public ideology in 2010 for the 50 American states, 435 congressional
districts, and state legislative districts. We do this using the geospatial statistical technique of Bayesian kriging, which uses
the locations of survey respondents, as well as population covariate values, to predict ideology for simulated citizens in
districts across the country.  In doing this, we improve on past research that uses the kriging technique for forecasting public
opinion by incorporating Alaska and Hawaii, making the important distinction between ZIP codes and ZIP code tabulation areas, and
introducing more precise data from the 2010 Census. We show that our estimates of ideology at the state, congressional district,
and state legislative district levels appropriately predict the ideology of legislators elected from these districts, serving as
an external validity check.
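
To give a sense of how kriging turns respondent locations into a prediction at an unsampled location, here is a minimal Python sketch of ordinary kriging. It is a simpler relative of the approach in the abstract, not the paper's model: it assumes a constant mean (no population covariates), is not Bayesian, and uses an exponential covariance function whose sill and length scale are chosen arbitrarily. The coordinates and ideology scores are made up for illustration.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def cov(p, q, sill=1.0, length=2.0):
    """Exponential covariance; sill and length are assumed, not estimated."""
    return sill * math.exp(-math.dist(p, q) / length)


def krige(points, values, target):
    """Ordinary kriging prediction at target; returns (prediction, weights)."""
    n = len(points)
    # Covariances among observations, bordered by the unbiasedness
    # constraint that forces the weights to sum to one.
    a = [[cov(points[i], points[j]) for j in range(n)] + [1.0] for i in range(n)]
    a.append([1.0] * n + [0.0])
    b = [cov(p, target) for p in points] + [1.0]
    weights = solve(a, b)[:n]        # the last unknown is the Lagrange multiplier
    return sum(w * v for w, v in zip(weights, values)), weights


# Hypothetical respondent coordinates and ideology scores.
points = [(0.0, 0.0), (1.0, 0.5), (2.0, 2.0), (0.5, 2.5), (3.0, 1.0)]
values = [-0.8, -0.2, 0.6, 0.1, 0.9]

prediction, weights = krige(points, values, (1.5, 1.0))
print("predicted ideology at (1.5, 1.0):", round(prediction, 3))
print("kriging weights:", [round(w, 3) for w in weights])
```

Nearby respondents receive larger weights, and with no measurement-error term the predictor reproduces an observed value exactly when the target coincides with a respondent's location.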

Optimized Formulation, Delivery and Dosing for ErythroMer (Artificial Red Cell)
(with Allan Doctor)

There is a need for an artificial oxygen (O2) carrier for use when banked red blood cells (RBCs) are (1) unavailable or (2)
undesirable. To address this need, we developed ‘ErythroMer’ (EM), a first-in-class, bio-synthetic, nano-cyte RBC mimetic. EM is a
deformable, hybrid polymeric nanoparticle that incorporates high per particle payloads of hemoglobin (Hb). Our bio-inspired
‘artificial cell’ design has yielded a prototype that emulates RBC physiology in all key respects and represents a potentially
disruptive introduction into Transfusion Medicine.  Two major approaches have been pursued to develop an artificial O2 carrier:
perfluorocarbon emulsions (PFCs) and modified Hb agents (HBOCs) [1-3]. Both have fallen short [4,5], possibly because designs do not
emulate normal physiology, resulting in interactions with NO and O2 that disrupt homeostatic controls (particularly, controls
matching vascular tone to tissue metabolism, e.g. hypoxic vasodilation (HVD)) [6-8]. When free in plasma (as for most HBOCs), (1) Hb
loses allosteric control and exhibits abnormally high O2 affinity and (2) globin-chain crosslinking (required to stabilize HBOC
tetramers) interferes with normal cooperativity. Both changes impair O2 delivery. Moreover, free Hb disturbs vasoregulation due to
avid NO trapping/consumption [9-13]. Such impaired vasoregulation is a critical problem; because this effect reduces blood flow, O2
delivery even by native RBCs is prevented [14] (and in particular, to hypoxic tissue, by impairing HVD). Chemically modified
cell-free Hbs have also suffered an unfavorable risk-benefit profile: a recent HBOC meta-analysis demonstrated a significant
increase in hypertension, myocardial damage and mortality in surgical patients [15]. Alternatively, perfluorocarbon-based O2 carriers
exhibit fewer side effects. However, for any given pO2, Hb binds significantly more O2 than can be dissolved in PFCs, and in
contrast to the Hb sigmoidal binding curve, PFCs demonstrate a flat O2 solubility curve. As a result, most of the O2 carried by
PFCs is prematurely released [16,17], limiting tissue delivery [18]. Finally, neither PFCs nor most HBOCs can be lyophilized for
prolonged storage. At this time, the majority of products under active development are RBC-imitating vesicles or nanoparticles;
these continue to struggle with: 1) complement activation by liposomal shells, 2) static O2 affinity, 3) NO trapping, 4) complex
metHb reduction systems, and 5) designs not amenable to lyophilization [19-25].  The EM design surmounts these weaknesses by: 1)
encapsulating Hb in a novel bio-compatible polymeric shell with RBC-emulating morphology, 2) controlling O2 capture/release with a
novel 2,3-DPG shuttle (2,3-DPG is the major heterotropic effector for Hb and diminishes O2 affinity), 3) attenuating NO uptake
through shell properties, and 4) retarding metHb formation by co-packaging a reduction system. Moreover, EM is designed for
sterile lyophilization and so is amenable to facile reconstitution after extended dry storage under ambient conditions. EM offers
a pragmatic approach to a complex need and is designed for cost-effective production at scale. To date, the prototype has passed
rigorous initial ex vivo and in vivo “proof of concept” testing. Notably, each parameter is independently controllable by
manipulating components of the EM formulation (particle size and payload density, membrane bi-layer thickness and degree of
intra-particulate surface cross-linking, and molar ratios amongst payload components [Hb, 2,3-DPG, and leucomethylene blue]). Most
importantly, EM is amenable to ongoing optimization through systematic design and structure/activity study.  This project will
optimize parameters essential to pragmatic in-field ErythroMer use: (1) formulation to produce a stable, sterile, easily
reconstituted, lightweight dry preparation, (2) field-deployable reconstitution and administration procedure and (3) dosing that
optimally balances efficacy/toxicity.

Should Missing Values of the Outcome Variable Be Imputed for Regression Models?
(with Ben Bagozzi)

While approaches to missingness in explanatory variables are now well understood, researchers are often confused about missingness
in modeled outcomes.  In this work we look at the question of whether missing outcome variables should be imputed with standard
tools.  We summarize the current state of practice in statistics and empirical social science, analytically derive the effects of
imputation, and demonstrate properties with a Monte Carlo simulation. In general, standard imputation methods (multiple imputation, random
imputation, Bayesian stochastic imputation, etc.) are appropriate for missing Y-variables, but we also point out areas of caution.
We specifically address the questions of when and how to impute missing values in the outcome variable.  Some scholars still feel
that one should not use various methods to fill in such missingness, since the subsequent model specification is providing fitted
outcome values given levels of covariates and this would then be a redundant process.  We show that this is misguided since
the general imputation process differs from the modeling process, and failure to include the outcome variable in the imputation
process leads to biased results which can be worse than listwise deletion.  Our new method for including the outcome variable in
the imputation process, easily implemented in R and other languages, was shown to be superior to alternative approaches through
the Monte Carlo simulations.  We then demonstrate through a published example that the strategy for handling missing outcome
variables can have a profound effect on the key substantive conclusions.  Our intention here has been to raise the issue of one kind
of missingness, provide solutions for the problem, and demonstrate that it matters to all empirical political scientists.
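
The point that leaving the outcome out of the imputation process biases results can be illustrated with a toy simulation. The Python sketch below is not the method developed in the paper. It assumes one covariate x with half of its values missing completely at random, a linear outcome y = 2x + noise, and a single stochastic imputation rather than multiple imputation. It compares the estimated slope of y on x under listwise deletion, under imputation of x that ignores y, and under regression imputation of x that uses y.

```python
import random


def slope(xs, ys):
    """Least-squares slope from regressing ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    return sxy / sxx


rng = random.Random(42)
n = 20000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]      # assumed true slope: 2
missing = [rng.random() < 0.5 for _ in range(n)]      # x missing completely at random

obs_x = [xi for xi, m in zip(x, missing) if not m]
obs_y = [yi for yi, m in zip(y, missing) if not m]
k = len(obs_x)
mean_x, mean_y = sum(obs_x) / k, sum(obs_y) / k

# Listwise deletion: use complete cases only.
slope_listwise = slope(obs_x, obs_y)

# Imputation that ignores y: draw from the observed marginal distribution of x.
sd_x = (sum((xi - mean_x) ** 2 for xi in obs_x) / (k - 1)) ** 0.5
x_no_y = [rng.gauss(mean_x, sd_x) if m else xi for xi, m in zip(x, missing)]
slope_without_y = slope(x_no_y, y)

# Imputation that includes y: regress x on y in the complete cases and add
# residual noise so that the imputed values keep the right spread.
b = slope(obs_y, obs_x)
a = mean_x - b * mean_y
resid_sd = (sum((xi - a - b * yi) ** 2 for xi, yi in zip(obs_x, obs_y)) / (k - 2)) ** 0.5
x_with_y = [a + b * yi + rng.gauss(0.0, resid_sd) if m else xi
            for xi, yi, m in zip(x, y, missing)]
slope_with_y = slope(x_with_y, y)

print("listwise deletion:    ", round(slope_listwise, 2))
print("imputation without y: ", round(slope_without_y, 2))
print("imputation with y:    ", round(slope_with_y, 2))
```

In this setup the imputed values that ignore y are unrelated to the outcome by construction, so the estimated slope is attenuated, while the other two estimates stay close to the assumed true value.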





Distinguished Professor, Department of Government

Government, Mathematics & Statistics, Center for Data Member, Behavioral Neuroscience
American University, 4400 Massachusetts Avenue, NW, Washington, DC 20016