Purpose Past studies of sepsis epidemiology did not address the misclassification bias that arises from imperfect verification of sepsis detection methods when estimating the true prevalence. Methods We examined 273,126 hospitalizations from 2008 to 2012 at a tertiary-care center to develop surveillance-aimed sepsis detection criteria, based on the presence of the sepsis-explicit International Classification of Diseases, Ninth Revision, Clinical Modification codes (995.92 or 785.52), blood culture orders, and antibiotics administration. We used Bayesian multinomial latent class models to estimate the true prevalence of sepsis, while adjusting for the imperfect sensitivity and specificity and the conditional dependence among the individual criteria. Results The apparent annual prevalence of sepsis hospitalizations based on explicit International Classification of Diseases, Ninth Revision, Clinical Modification codes was 1.5%, 1.4%, 1.6%, 2.2%, and 2.5% for the years 2008 to 2012, respectively. Bayesian posterior estimates for the true prevalence of sepsis suggested that it remained stable from 2008, 19.2% (95% credible interval [CI]: 17.9%, 22.9%), to 2012, 17.8% (95% CI: 16.8%, 20.2%). The sensitivity of sepsis-explicit codes, however, increased from 7.6% (95% CI: 6.4%, 8.4%) in 2008 to 13.8% (95% CI: 12.2%, 14.9%) in 2012. Conclusions The true prevalence of sepsis remained high but stable despite an increase in the sensitivity of sepsis-explicit codes in administrative data.
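The reported figures can be checked against each other with the standard relation between apparent and true prevalence for an imperfect classifier: apparent = sensitivity × true prevalence + (1 − specificity) × (1 − true prevalence). The sketch below is only an arithmetic consistency check, not the latent class model used in the study. It assumes a specificity of 100% for the sepsis-explicit codes, which the abstract does not report, and the function and variable names are illustrative.

```python
# Posterior point estimates quoted in the abstract, expressed as proportions.
reported = {
    2008: {"true_prev": 0.192, "sensitivity": 0.076, "apparent": 0.015},
    2012: {"true_prev": 0.178, "sensitivity": 0.138, "apparent": 0.025},
}


def apparent_prevalence(true_prev, sensitivity, specificity=1.0):
    """Expected fraction flagged by an imperfect classifier.

    specificity=1.0 is an assumption made for this sketch; the abstract
    does not state the specificity of the sepsis-explicit codes.
    """
    true_positives = sensitivity * true_prev
    false_positives = (1.0 - specificity) * (1.0 - true_prev)
    return true_positives + false_positives


# Apparent prevalence implied by the reported sensitivity and true prevalence.
implied = {
    year: apparent_prevalence(v["true_prev"], v["sensitivity"])
    for year, v in reported.items()
}

for year, value in implied.items():
    print(f"{year}: implied {value:.2%}, reported {reported[year]['apparent']:.1%}")
```

Under that assumption the implied apparent prevalence is about 1.5% for 2008 and 2.5% for 2012, matching the code-based figures in the Results, so the rise in coded sepsis is consistent with rising sensitivity alone.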
Publications by Type: Journal Article
An accelerometer, a wearable motion sensor worn on the hip or wrist, is becoming a popular tool in clinical and epidemiological studies for measuring physical activity. Such data provide a series of activity counts every minute or even more often and display a person’s activity pattern throughout the day. Unfortunately, the collected data can include irregular missing intervals because of participant noncompliance, which makes the statistical analysis more challenging. The purpose of this study is to develop a novel imputation method to handle multivariate count data, motivated by the accelerometer data structure. We specify the predictive distribution of the missing data with a mixture of zero-inflated Poisson and log-normal distributions, which is shown to be effective in dealing with the minute-by-minute autocorrelation as well as under- and over-dispersion of count data. The imputation is performed at the minute level and follows the principles of multiple imputation using a fully conditional specification with the chained algorithm. To facilitate the practical use of this method, we provide an R package, accelmissing. Our method is demonstrated using 2003-2004 National Health and Nutrition Examination Survey data. Keywords: Accelerometer, physical activity, missing count data, multiple imputation, zero-inflated model, Poisson log-normal
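To illustrate why zero inflation matters for count data of this kind, the following sketch computes the probability mass function, mean, and variance of a plain zero-inflated Poisson distribution. It covers only that one component and is not the mixture model or the imputation algorithm of the paper, and it does not reproduce the accelmissing package. The parameter values `lam` and `pi` are hypothetical.

```python
import math


def zip_pmf(k, lam, pi):
    """P(Y = k) for a zero-inflated Poisson variable.

    With probability pi the count is a structural zero (for example,
    a non-wear minute); otherwise it is drawn from Poisson(lam).
    """
    poisson = math.exp(-lam) * lam**k / math.factorial(k)
    structural_zero = pi if k == 0 else 0.0
    return structural_zero + (1.0 - pi) * poisson


# Hypothetical parameters chosen for illustration only.
lam, pi = 4.0, 0.3

# Truncate the support where the remaining Poisson tail mass is negligible.
support = range(60)
mean = sum(k * zip_pmf(k, lam, pi) for k in support)
var = sum((k - mean) ** 2 * zip_pmf(k, lam, pi) for k in support)

print(f"mean = {mean:.3f}, variance = {var:.3f}")
```

The variance exceeds the mean here, whereas an ordinary Poisson distribution forces the two to be equal. The closed forms are mean = (1 − pi)·lam and variance = (1 − pi)·lam·(1 + pi·lam).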
Background. The long-term and cumulative effect of multiple episodes of bacteremia and sepsis across multiple hospitalizations on the development of cardiovascular (CV) events is uncertain.
Purpose To quantify the coinciding improvement in the clinical diagnosis of sepsis, its documentation in the electronic health records, and subsequent medical coding of sepsis for billing purposes in recent years. Methods We examined 98,267 hospitalizations in 66,208 patients who met systemic inflammatory response syndrome criteria at a tertiary care center from 2008 to 2012. We used g-computation to estimate the causal effect of the year of hospitalization on receiving an International Classification of Diseases, Ninth Revision, Clinical Modification discharge diagnosis code for sepsis by estimating changes in the probability of getting diagnosed and coded for sepsis during the study period. Results When adjusted for demographics, Charlson-Deyo comorbidity index, blood culture frequency per hospitalization, and intensive care unit admission, the causal risk difference for receiving a discharge code for sepsis per 100 hospitalizations with systemic inflammatory response syndrome, had the hospitalization occurred in 2012, was estimated to be 3.9% (95% confidence interval [CI], 3.8%–4.0%), 3.4% (95% CI, 3.3%–3.5%), 2.2% (95% CI, 2.1%–2.3%), and 0.9% (95% CI, 0.8%–1.1%) from 2008 to 2011, respectively. Conclusions Patients with similar characteristics and risk factors had a higher probability of getting diagnosed, documented, and coded for sepsis in 2012 than in previous years, which contributed to an apparent increase in sepsis incidence. Keywords: Causality, ICD-9-CM, Sepsis, Systemic inflammatory response syndrome, Risk difference
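The idea behind g-computation can be sketched with a small standardization example. The counts below are hypothetical and are not study data. The sketch adjusts for a single binary covariate (intensive care unit admission) without a model, whereas the study adjusted for several covariates. All names are illustrative.

```python
# Hypothetical counts, NOT study data:
# (year, icu_admission) -> (hospitalizations, hospitalizations coded as sepsis)
counts = {
    (2008, 0): (800, 16),
    (2008, 1): (200, 20),
    (2012, 0): (700, 35),
    (2012, 1): (300, 60),
}


def stratum_risk(year, icu):
    """Observed risk of a sepsis code within one year-by-covariate stratum."""
    n, events = counts[(year, icu)]
    return events / n


def standardized_risk(year):
    """Average risk if every hospitalization in the pooled sample had occurred in `year`."""
    total = sum(n for n, _events in counts.values())
    risk = 0.0
    for icu in (0, 1):
        # Share of the pooled population that falls in this covariate stratum.
        stratum_n = sum(n for (_yr, z), (n, _ev) in counts.items() if z == icu)
        risk += (stratum_n / total) * stratum_risk(year, icu)
    return risk


# Risk difference standardized to the pooled covariate distribution.
risk_difference = standardized_risk(2012) - standardized_risk(2008)

# Crude (unadjusted) comparison, for contrast.
crude_2012 = (35 + 60) / (700 + 300)
crude_2008 = (16 + 20) / (800 + 200)
crude_difference = crude_2012 - crude_2008

print(f"standardized RD = {risk_difference:.4f}, crude RD = {crude_difference:.4f}")
```

In this toy example the crude difference overstates the standardized one because the hypothetical 2012 sample contains more intensive care admissions. Standardizing both years to the same covariate distribution removes that imbalance.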
Identification of modifiable risk factors is gravely needed to prevent adverse prostate health outcomes. We previously developed a murine precancer model in which exposure to maternal obesity stimulated prostate hyperplasia in offspring. Here, we used generalized linear modeling to evaluate the influence of additional environmental covariates on prostate hyperplasia. As expected from our previous work, the model revealed that aging and maternal diet-induced obesity (DIO) each correlated with prostate hyperplasia. However, prostate hyperplasia was not correlated with the length of maternal DIO. Cage density positively associated with both prostate hyperplasia and offspring body weight. Expression of the glucocorticoid receptor in prostates also positively correlated with cage density and negatively correlated with age of the animal. Together, these findings suggest that prostate tissue was adversely patterned during early life by maternal overnutrition and was susceptible to alteration by environmental factors such as cage density. Additionally, prostate hyperplasia may be acutely influenced by exposure to DIO, rather than occurring as a response to worsening obesity and comorbidities experienced by the mother. Finally, cage density correlated with both corticosteroid receptor abundance and prostate hyperplasia, suggesting that overcrowding influenced offspring prostate hyperplasia. These results emphasize the need for multivariate regression models to evaluate the influence of coordinated variables in complicated animal systems. Keywords prostate hyperplasia, cage overcrowding, maternal obesity, developmental programming, generalized linear modeling
Background: The National Cancer Institute’s Transdisciplinary Research in Energetics and Cancer (TREC) initiative is in its second round of funding. Despite increasing agreement that transdisciplinary team-based research is valuable in addressing complex problems like energy balance and cancer, methods for constructing and maintaining transdisciplinary teams are lacking. Purpose: We articulate a method that uses social network analysis to assess transdisciplinary teams and applies this knowledge to improve their functioning. Methods: Using data from the Washington University TREC site in 2011 and 2013, we demonstrate the use of social network analysis to assess and provide feedback on team functioning. Results: We portray broker functioning in both years. By 2013, the director and co-director had begun to share broker functions with other members. Some brokers fostered communication with less central network members. Conclusions: The information obtained can help to train a new generation of investigators to optimally participate on transdisciplinary research teams. Available from: https://www.researchgate.net/publication/279790188_A_Social_Network_Analysis_Approach_to_Diagnosing_and_Improving_the_Functioning_of_Transdisciplinary_Teams_in_Public_Health
We apply a specialized Bayesian method that helps us deal with the methodological challenge of unobserved heterogeneity among immigrant voters. Our approach is based on generalized linear mixed Dirichlet models (GLMDM) where random effects are specified semiparametrically using a Dirichlet process mixture prior that has been shown to account for unobserved grouping in the data. Such models are drawn from Bayesian nonparametrics to help overcome objections handling latent effects with strongly informed prior distributions. Using 2009 German voting data of immigrants, we show that for difficult problems of missing key covariates and unexplained heterogeneity this approach provides (1) overall improved model fit, (2) smaller standard errors on average, and (3) less bias from omitted variables. As a result, the GLMDM changed our substantive understanding of the factors affecting immigrants’ turnout and vote choice. Once we account for unobserved heterogeneity among immigrant voters, whether a voter belongs to the first immigrant generation or not is much less important than the extant literature suggests. When looking at vote choice we also found that an immigrant’s degree of structural integration does not affect the vote in favor of the CDU/CSU, a party which is traditionally associated with restrictive immigration policy.