Boot Camp Syllabus

Click on this PDF icon to download a copy of the syllabus with full module details.

Boot Camp Module Completion Map and Checklists
Click on this WORD DOCX icon to download these to help you stay on track.


Modern data streams, whether from biomedical research labs or environmental and social-science research teams, are quantitative and noisy. The goal of this class is to help you begin to think quantitatively and statistically about data, and to help you answer the question “What do your data (actually) say?”


To help you answer this question, we present an approach centered on applying statistical and mathematical techniques the modern way: with a computer. This course is not a substitute for a machine learning, AI, or classical statistics course. Instead, we will cover topics like parameter estimation, hypothesis testing, and dimensionality reduction in a way that is centered on the specific information that you know about your data. By the end of the course, you will have a set of tools that should allow you to attack any quantitative problem, draw a conclusion, and assess your confidence in that conclusion.

This boot camp is taught as a 12-week course at Northwestern University. The course listing is ESAM 421 Models in Applied Mathematics.


Overarching Goals:

  • Students will learn to make and manipulate empirical data distributions in their computers.  Students will be able to use these distributions to test different hypotheses about their data without relying on specialized statistics software.  

  • Students will be able to identify the advantages and disadvantages of different statistical, mathematical, or computational techniques and will be able to determine appropriate methods for their data.  

  • Students will be able to use data to construct models appropriate to their research question.  

  • Students will be able to evaluate their model’s accuracy and efficacy and to compare their models to others.
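As an illustration of the first goal, testing a hypothesis directly from an empirical distribution can be sketched in a few lines of plain Python. This is only a minimal sketch and is not taken from the course materials: the function name `permutation_test`, the number of shuffles, and the two small data sets are all assumptions made up for the example. It builds the distribution of mean differences that arises when group labels are shuffled at random, then asks how often a shuffled difference is at least as large as the observed one.

```python
import random


def permutation_test(a, b, n_shuffles=10_000, seed=0):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference (mean of a minus mean of b) and the
    fraction of label shuffles whose absolute difference is at least as
    large as the observed one.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n_a = len(a)
    observed = sum(a) / n_a - sum(b) / len(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_shuffles):
        # Shuffling the pooled values simulates "the labels do not matter".
        rng.shuffle(pooled)
        shuffled_a = pooled[:n_a]
        shuffled_b = pooled[n_a:]
        diff = sum(shuffled_a) / len(shuffled_a) - sum(shuffled_b) / len(shuffled_b)
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_shuffles


# Hypothetical measurements, invented purely for illustration.
control = [4.1, 3.8, 5.0, 4.4, 4.7, 3.9, 4.2, 4.5]
treated = [5.2, 4.9, 5.8, 5.1, 6.0, 4.8, 5.5, 5.3]

observed_diff, p_value = permutation_test(control, treated)
print(f"observed difference = {observed_diff:.3f}, estimated p = {p_value:.4f}")
```

The only ingredients are the data and a random number generator, which is the sense in which no specialized statistics software is needed. The estimated p-value is itself a noisy quantity that depends on the number of shuffles.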

Specific Learning Objectives:

  • Students will use Python to simulate experiments, generate statistics, build and test models, and execute most other learning objectives.

  • Students will visualize and plot distributional features of data in useful and easily-understood ways.

  • Students will construct and analyze probability distributions empirically and theoretically.

  • Students will learn to analyze arbitrary data distributions, even those that are not Gaussian.

  • Students will assess the ways by which any finite data sample places fundamental limits on what can be learned from a dataset.

  • Students will assess the likelihood that different explanatory models will produce given data sets and clearly articulate their reasons for “confidence” in a given model. 
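The objectives about non-Gaussian data and the limits imposed by a finite sample can be illustrated with a bootstrap, sketched below under stated assumptions. Nothing here comes from the course materials: the function name `bootstrap_interval`, the synthetic exponential data, and the sample sizes are hypothetical choices for the example. The sketch resamples each data set with replacement, recomputes the mean each time, and reports a percentile interval, so the width of the interval shows how much the sample size constrains what can be said about the mean.

```python
import random


def bootstrap_interval(sample, n_resamples=2_000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `sample`."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    # Each resample draws n values with replacement from the data itself.
    means = sorted(
        sum(rng.choices(sample, k=n)) / n for _ in range(n_resamples)
    )
    lo_idx = int((1 - level) / 2 * n_resamples)
    hi_idx = int((1 + level) / 2 * n_resamples) - 1
    return means[lo_idx], means[hi_idx]


# Synthetic, deliberately non-Gaussian data: exponential draws with true mean 2.0.
data_rng = random.Random(1)
draws = [data_rng.expovariate(0.5) for _ in range(400)]

small_sample = draws[:20]   # first 20 draws only
large_sample = draws        # all 400 draws

small_lo, small_hi = bootstrap_interval(small_sample)
large_lo, large_hi = bootstrap_interval(large_sample)

print(f"n = 20:  mean lies roughly in [{small_lo:.2f}, {small_hi:.2f}]")
print(f"n = 400: mean lies roughly in [{large_lo:.2f}, {large_hi:.2f}]")
```

The interval from 20 points should come out noticeably wider than the one from 400 points, even though both samples come from the same source.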

Instructional Strategies

  1. Refer to the Whatdoyourdatasay_CompletionHelpAids document, which contains the links to course materials, a module completion map, and handy checklists.

  2. Course Notes will be made available with each Module in Northwestern Box.

  3. Core material (i.e. key concepts) will be discussed in Lecture Videos on Panopto.  

  4. “Class time” will consist of facilitated Study Sessions via Zoom, specifically devoted to addressing student challenges including conceptual difficulties with core material, coding questions, etc.  

  5. Utilize the Discussion Board on the website to address any questions to the community. We encourage you to answer posted questions as well.


Other resources

  • Course resources can be found on the course website:

  • Useful texts include Physical Models of Living Systems (Nelson, 1st ed. 2015) and The Elements of Statistical Learning (Hastie, Tibshirani & Friedman, 2nd ed. 2016).

  • A Python coding tutorial is available to students who have never coded before. It is essential to complete this tutorial before you start the core material of the class. Worksheets and assignments provide further opportunities to build your coding proficiency.


Course Assessments

As in an experimental lab course, students will spend the majority of their time solving problems: writing code on their own computers to analyze a dataset. Problems will be assigned as daily worksheets and weekly assignments.


  1. Worksheets are short, focused problem-solving adventures. It is through coding and analyzing data that the practical methods taught in this class will come into focus.

  2. Assignments are longer and more challenging, typically with far less guidance than the worksheets. Pre-recorded videos will introduce each worksheet and assignment.

  3. Self-Assessments will be assigned with each Module Assignment. These are to be completed after receiving the Solution Key.


2200 Campus Drive, Evanston, IL 60208