# American University: Statistics 618/GOV 618 (Every FALL): Bayesian Statistics for Social and Biomedical Sciences

## Offered:

Fall 2020

Course Description
Principles and applications of modern statistical decision theory, with a special focus on Bayesian modeling, data analysis, inference, and optimal decision making. Prior and posterior; comparison of Bayesian and frequentist approaches, including minimax decision making and elementary game theory. Bayesian estimation, hypothesis testing, credible sets, and Bayesian prediction. Introduction to Bayesian computing software and applications to diverse fields. Grading: A-F only. Prerequisite: STAT-514 or permission of instructor.

Location: Online.

Learning Outcomes: By the end of this course, students will be able to:
1. Demonstrate a basic understanding of Bayesian model specification, Bayesian posterior inference, and model assessment and comparison.

2. Use this understanding of Bayesian statistics to specify and estimate Bayesian multilevel (hierarchical) models with linear and nonlinear outcomes, treat missing data in a principled and correct manner using multiple imputation, gain facility in the R and Bugs statistical languages, compute appropriate sample sizes and power for Bayesian models, gain exposure to Bayesian approaches including MCMC computation, and assess model reliability and fit in complex models.

3. Apply this understanding of Bayesian statistics to data in the social and biomedical sciences.

4. Convey analytical results from these models to both lay and technical audiences clearly in both writing and speech.

Prerequisite Details: This course assumes a knowledge of basic statistics as taught in a first-year undergraduate or graduate sequence. Topics should include: probability, cross-tabulation, basic statistical summaries, and linear regression in either scalar or matrix form. Knowledge of R is essential; knowledge of basic matrix algebra and calculus is helpful.

Course Requirements and Expectations: The final grade will be based on two components: weekly attendance and participation (20%) and exercises (80%). Late assignments will not be accepted. Graduate students will have one additional component of their exercise grade that constitutes 10 points out of the 80 points total: submission of an analysis of real research using a multilevel model applied to data in their field, along with 5-10 pages of discussion that includes a description of the data, model diagnostics, and the subsequent findings. Consider this assignment to be the start of a research manuscript to be eventually submitted to an academic journal. Graduate students will still submit all exercises assigned below in addition to this work. Some guidelines are here.

Office Hours: Tuesday 12-3. Zoom, link on the class Github page.

Incompletes: Due to the scheduled nature of the course, no incompletes will be given.

Teaching Assistant: Le Bao, lb4126a@american.edu, Zoom hours Monday 10-12, link on the Github page.

Required Reading: Gelman and Hill, "Data Analysis Using Regression and Multilevel/Hierarchical Models" (Cambridge University Press, 2007). Some papers will be available at jstor.org or distributed by the instructor. Readings should be completed before class. A NYT story about radon.

Statement Regarding Student Resources: see the following link https://edspace.american.edu/ctrl/classroomsupport/.

Emergency Preparedness. In the event of an emergency, students should refer to the AU Web site (http://www.american.edu/emergency) and the AU information line at (202) 885-1100 for general university-wide information. In case of a prolonged closure of the University, I will send updates to you by email and will post all announcements on the course web site.

Support Services. A wide range of services is available to support you in your efforts to meet the course requirements. Mathematics & Statistics Tutoring Lab (x3154, x3120, Don Myers Building, Room 103) provides tutoring in Mathematics and Statistics. Lab hours are Mo-Th 11 am – 8 pm, Fr 11 am – 3 pm, and Su 3 pm – 8 pm. http://www.american.edu/cas/mathstat/tutoring.cfm.  Academic Support and Access Center (x3360, MGC 243) offers study skills workshops, individual instruction, tutor referrals, Supplemental Instruction, writing support, and technical and practical support and assistance with accommodations for students with physical, medical, or psychological disabilities. Writing support is also available in the Writing Center, Battelle-Tompkins 228. CTRL Connect – software support with R (ctrl@american.edu, x2117). Counseling Center (x3500, MGC 214) offers counseling and consultations regarding personal concerns, self-help information, and connections to off-campus mental health resources.

Datasets: the data are either provided in links below or are available at Gelman's webpage for the book.

Topics (subject to minor change):
August 26: Introducing Bayesian Inference. Reading: R For Beginners (to make sure you are fluent on the basics), Gelman & Hill, Chapters 1 and 2,  MLE Review, Intro code from the lecture, Bayesian mechanics slides, Preview of multilevel models. Exercises: Gelman & Hill 2.2, 2.3, 2.4.

September 2: Linear Model Theory Review. Reading: Gelman & Hill, Chapters 3 and 4, Chapter 3-4 code from the lecture, Binomial PMF likelihood grid search, lecture slides (do not print!). Anaemia data. Tweed data. clx.R. Exercises: Gelman & Hill 3.4, 4.4, 5.4, 6.1.

September 9: Multilevel Structures and Multilevel Linear Models: the Basics. Reading: Gelman & Hill, Chapters 11 and 12, Introductory Chapter (Gill and Womack, from the SAGE Handbook of Multilevel Modeling). Lecture slides and chapter 11-12 code. Radon data. Uranium data. Smoking data. Exercises: Gelman & Hill 11.4, 12.2, 12.5.

September 16: Multilevel Linear Models: Varying Slopes, Non-Nested Models and Other Complexities. Reading: Gelman & Hill, Chapter 13, Lecture slides, Chapter 13 code from the lecture. Exercises: Gelman & Hill 13.2, 13.4, 13.5.

September 23: Multilevel Logistic Regression, Multilevel Generalized Linear Models. Reading: Gelman & Hill, Chapter 14 (skip Section 14.3), Chapter 15, Lecture slides, Chapter 14 code from the lecture. Exercises: Gelman & Hill 14.5, 14.6, 15.1, 15.2. Speed Dating Data, NES Data (remove the .txt extension, load with the foreign library), polls.dta file (remove the .txt extension, load with the foreign library), cheney.asia.sub.txt, police_stops_data.txt.

September 30: Multilevel Modeling in Bugs and R: the Basics, MCMC Theory. Part 1. Reading: Gelman & Hill, Chapter 18, Bayesian Estimation Case Study (Gill and Witko 2012), R to JAGS code for the model (get data from here), Lecture slides. Exercise: Replicate the model in Gill and Witko (2012).

October 7: Causal Inference. Guest lecture by Dr. Ryan Moore. Reading: Gelman & Hill Chapters 9 and 10. Exercises: 9.4.

October 14: Multilevel Modeling in Bugs and R: the Basics, MCMC Theory. Part 2. Reading: Gelman & Hill Chapter 16, Chapter 16 code from the lecture. Lecture slides. Exercises: Gelman & Hill 16.1, 16.2, 16.3.

October 21: Fitting Multilevel Linear and Generalized Linear Models in Bugs and R, MCMC Coding. Reading: Gelman & Hill, Chapter 16, Chapter 17 code from the lecture. Exercises: Gelman & Hill: rerun 16.3 with the instructions from 17.2 and 17.3, and do 17.5 using the age guessing data.

October 28: Understanding and Summarizing the Fitted Models, Multilevel Analysis of Variance. Reading: Gelman & Hill, Chapter 21 slides, Chapter 22 slides, Chapter 21 code from the lecture, Chapter 22 code from the lecture. CD4 data. Caesarian data. Bypass data. Depression data. Exercises: 21.1, 21.3, 21.4, 22.1.

November 4: Model Checking and Comparison. Reading: Gelman & Hill, Chapter 24. Lecture slides. Chapter 24 code from the lecture. Exercises: 24.2, 24.3. Dogs data.

November 11: Treatment of Missing Data. Reading: Gelman & Hill, Chapter 25, Paper by van Buuren and Groothuis-Oudshoorn. Lecture slides. Chapter 25 code from the lecture. Exercises: missing data problem set (use this dataset).

November 18: Sample Size and Power Calculations. Reading: Gelman & Hill, Chapter 20. Lecture slides. Exercises: 20.1 and 20.3.

November 25: Bayesian Nonparametrics. Reading: Gill & Casella, "Nonparametric Priors For Ordinal Bayesian Social Science Models: Specification and Estimation." Journal of the American Statistical Association, 104, 453-464, (June) 2009.

December 2: Online Wrap Up and Presentation of Projects. Exercises: none. All remaining homework and the project are due this day.