American University: Statistics 618/GOV 618 (Every FALL): Bayesian Statistics for Social and Biomedical Sciences
Semester: Fall
Year offered: 2023
Course Description:
Principles and applications of modern statistical decision theory, with a special focus on Bayesian modeling, data analysis, inference, and optimal decision making. Prior and posterior; comparison of Bayesian and frequentist approaches, including minimax decision making and elementary game theory. Bayesian estimation, hypothesis testing, credible sets, and Bayesian prediction. Introduction to Bayesian computing software and applications to diverse fields. Grading: A-F only. Prerequisite: STAT-514 or permission of instructor.
Location: Wednesday, 5:30 PM – 8:00 PM, DMTI 217.
Learning Outcomes: By the end of this course, students will be able to:
1. Demonstrate a basic understanding of Bayesian model specification, Bayesian posterior inference, and model assessment and comparison.
2. Use this understanding of Bayesian statistics to: specify and estimate Bayesian multilevel (hierarchical) models with linear and nonlinear outcomes; treat missing data in a principled manner using multiple imputation; gain facility in the R and BUGS statistical languages; compute appropriate sample size and power calculations for Bayesian models; gain exposure to Bayesian approaches including MCMC computation; and assess model reliability and fit in complex models.
3. Apply this understanding of Bayesian statistics to data in the social and biomedical sciences.
4. Convey analytical results from these models to both lay and technical audiences clearly in both writing and speech.
Prerequisite Details: This course assumes a knowledge of basic statistics as taught in a first-year undergraduate or graduate sequence. Topics should include: probability, cross-tabulation, basic statistical summaries, and linear regression in either scalar or matrix form. Knowledge of R is essential; knowledge of basic matrix algebra and calculus is helpful.
Course Requirements and Expectations: The final grade will be based on two components: weekly attendance and participation (20%) and exercises (80%). Graduate students will have one additional component of their exercise grade that constitutes 30 points out of the 80 points total: submission of an analysis of real research using a multilevel model applied to data in their field, along with 5-10 pages of discussion that include a description of the data, model diagnostics, and the subsequent findings. Consider this assignment to be the start of a research manuscript to be eventually submitted to an academic journal. Graduate students will still submit all exercises assigned below in addition to this work. Some guidelines are here.
Office Hours: Wednesdays 12-3 in Kerwin 109B or via Zoom, https://american.zoom.us/j/98991201424
Incompletes: Due to the scheduled nature of the course, no incompletes will be given.
Teaching Assistant: Yasir Atalan, ayasiratalan@gmail.com, office hours Monday 9:30-11:30.
Required Reading: Gelman and Hill, “Data Analysis Using Regression and Multilevel/Hierarchical Models” (Cambridge University Press 2007). Some papers will be available at jstor.org or distributed by the instructor. Readings should be completed before class. A NYT story about radon.
Statement Regarding Student Resources: see the following link https://edspace.american.edu/ctrl/classroomsupport/.
The Academic Integrity Code: http://www.american.edu/academics/integrity/code.cfm.
Emergency Preparedness. In the event of an emergency, students should refer to the AU Web site (http://www.american.edu/emergency) and the AU information line at (202) 885-1100 for general university-wide information. In case of a prolonged closure of the University, I will send updates to you by email and will post all announcements on the course web site.
Support Services. A wide range of services is available to support you in your efforts to meet the course requirements. Mathematics & Statistics Tutoring Lab (x3154, x3120, Don Myers Building, Room 103) provides tutoring in Mathematics and Statistics. Lab hours are Mo-Th 11 am – 8 pm, Fr 11 am – 3 pm, and Su 3 pm – 8 pm. http://www.american.edu/cas/mathstat/tutoring.cfm. Academic Support and Access Center (x3360, MGC 243) offers study skills workshops, individual instruction, tutor referrals, Supplemental Instruction, writing support, and technical and practical support and assistance with accommodations for students with physical, medical, or psychological disabilities. Writing support is also available in the Writing Center, Battelle-Tompkins 228. CTRL Connect – software support with R (ctrl@american.edu, x2117). Counseling Center (x3500, MGC 214) offers counseling and consultations regarding personal concerns, self-help information, and connections to off-campus mental health resources.
Datasets: the data are either provided in links below or are available at Gelman’s webpage for the book.
Topics (subject to minor change):
August 30: Introducing Bayesian Inference. Reading: R For Beginners (to make sure you are fluent in the basics), Gelman & Hill, Chapters 1 and 2, Bayes Intro, MLE Review, Intro code from the lecture, Bayesian mechanics slides, Preview of multilevel models. Exercises: Gelman & Hill 2.2, 2.3, 2.4.
September 6: Linear Model Theory Review. Reading: Gelman & Hill, Chapters 3 and 4, Chapter 3-4 code from the lecture, Binomial PMF likelihood grid search, lecture slides (do not print!). Anaemia data. Tweed data. clx.R. Exercises: Gelman & Hill 3.4, 4.4, 5.4, 6.1.
September 13: Multilevel Structures and Multilevel Linear Models: the Basics. Reading: Gelman & Hill, Chapters 11 and 12, Introductory Chapter (Gill and Womack, from the SAGE Handbook of Multilevel Modeling). Lecture slides and chapter 11-12 code. Radon data. Uranium data. Smoking data. Exercises: Gelman & Hill 11.4, 12.2, 12.5.
September 20: Multilevel Linear Models: Varying Slopes, Non-Nested Models and Other Complexities. Reading: Gelman & Hill, Chapter 13, Lecture slides, Chapter 13 code from the lecture. Exercises: Gelman & Hill 13.2, 13.4, 13.5.
September 27: Multilevel Logistic Regression, Multilevel Generalized Linear Models. Reading: Gelman & Hill, Chapter 14 (skip Section 14.3), Chapter 15, Lecture slides, Chapter 14 code from the lecture. Exercises: Gelman & Hill 14.5, 14.6, 15.1, 15.2. Speed Dating Data, NES Data (remove the .txt extension, load with the foreign library), polls.dta file (remove the .txt extension, load with the foreign library), cheney.asia.sub.txt, police_stops_data.txt.
October 4: Multilevel Modeling in Bugs and R: the Basics, MCMC Theory. Part 1. Reading: Gelman & Hill, Chapter 18, Bayesian Estimation Case Study (Gill and Witko 2012), R to JAGS code for the model (get data from here), Lecture slides. Exercise: Replicate the model in Gill and Witko (2012).
October 11: Multilevel Modeling in Bugs and R: the Basics, MCMC Theory. Part 2. Reading: Gelman & Hill Chapter 16, Chapter 16 code from the lecture. Lecture slides. Exercises: Gelman & Hill 16.1, 16.2, 16.3.
October 18: Fitting Multilevel Linear and Generalized Linear Models in Bugs and R, MCMC Coding. Reading: Gelman & Hill, Chapter 17, Chapter 17 code from the lecture. Exercises: Rerun Gelman & Hill 16.3 with instructions from 17.2 and 17.3, and do 17.5 using the age guessing data. Continuation of lecture slides from the previous week.
October 25: Understanding and Summarizing the Fitted Models, Multilevel Analysis of Variance. Reading: Gelman & Hill, Chapter 21 slides, Chapter 21 code from the lecture. CD4 data. Caesarian data. Bypass data. Depression data. No exercises for this week. Finish last assignment for next week.
November 1: Multilevel Analysis of Variance. Chapter 22 slides, Chapter 22 code from the lecture. Exercises: 21.1, 21.3, 21.4, 22.1.
November 8: Model Checking and Comparison. Reading: Gelman & Hill, Chapter 24. Lecture slides. Chapter 24 code from the lecture. Exercises: 24.1, 24.4. Dogs data.
November 15: Causal Inference. Guest lecture by Dr. Ryan Moore. Reading: Gelman & Hill Chapters 9 and 10. Exercises: 9.4.
November 22: No meeting due to Thanksgiving holiday.
November 29: Treatment of Missing Data. Reading: Gelman & Hill, Chapter 25, Paper by van Buuren and Groothuis-Oudshoorn. Lecture slides. Chapter 25 code from the lecture. Exercises: missing data problem set (use this dataset).
December 6: Sample Size and Power Calculations. Reading: Gelman & Hill, Chapter 20. Lecture slides. Chapter 20 code from the lecture. Exercises: 20.1, 20.2, 20.3.
December 13: Wrap Up and Presentation of Projects. Exercises: none. All remaining homework and the project due this day.