Schedule
THIS IS AN INCOMPLETE DRAFT SCHEDULE
Submit your homework answers, as an html file, via this Google form before midnight on the day before the class for which the homework was assigned.
Key resources: R for Data Science (2e) by Hadley Wickham, Mine Çetinkaya-Rundel, and Garrett Grolemund, Preceptor’s Primer for Bayesian Data Science by David Kane, Analyzing US Census Data: Methods, Maps, and Models in R by Kyle Walker.
See the home pages for positron.tutorials, r4ds.tutorials, primer.tutorials, and tidycensus.tutorials for installation instructions.
Always reinstall the relevant package before starting a new tutorial. Example for primer.tutorials:
remove.packages("primer.tutorials")
remotes::install_github("PPBDS/primer.tutorials")
We are always fixing mistakes. You want to use the latest version.
If you do not complete the tutorials on time, you will be removed from the class. Send in your tutorial answers saved in html format. Please try to ensure that the name of the file is the default name, like tutorials-in-positron_answers.html
. Avoid using the name which will be assigned to the file if you download the same answers twice (or more), stuff like tutorials-in-positron_answers (2).html
. You can just change the name of the file by hand if this happens.
Registration
Do not start the registration process yet.
Week 1: June 16
Introduction to the tools we use for doing data science.
Monday
Note that Monday’s homework must be completed before Monday’s class. I ensure this by only sending the Zoom link for class to students who submitted the homework.
Read the Introduction from R for Data Science (2e).
From the r4ds.tutorials package, complete the “Introduction” (
00-introduction
) tutorial.From the positron.tutorials package, complete the “Positron and Code” (
01-code
) tutorial. Make sure that you have already completed the “Introduction” tutorial above before you start the “Positron and Code” tutorial.
Wednesday
From the positron.tutorials package, complete the “Positron and Quarto” (
02-quarto
) and “Terminal” (03-terminal
) tutorials.Read Chapters 1 - 3 from R for Data Science (2e).
From the r4ds.tutorials package, complete the “Data visualization” (
01-data-visualization
) and “Data transformation” (03-data-transformation
) tutorials.
Reminder: All work must be completed before class. Failure to submit your tutorial answers will result in you being removed from the course. It is not fair to your fellow students, with whom you will be working in small groups, for you to not be prepared for class.
Thursday
From the positron.tutorials package, complete the “Positron and GitHub Introduction” (
04-github-1
) tutorial.Read Chapters 4 and 5 from R for Data Science (2e).
From the r4ds.tutorials package, complete the “Data tidying” (
05-data-tidying
).Read Cardinal Virtues.
Week 2: June 23
Going forward, we recommend that you read the chapters in R for Data Science (2e) associated with each tutorial we assign from the r4ds.tutorials package. However, we stop listing them explicitly.
Finish learning about tools like Quarto and GitHub Pages. Start learning about models and inference.
Monday
Read Chapter 1 Rubin Causal Model and complete the “Rubin Causal Model” tutorial (
011-rubin-causal-model
) from the primer.tutorials package.From the positron.tutorials package, complete the “Positron and GitHub Advanced” (
05-github-2
) tutorial.From the r4ds.tutorials package, complete the “Data import” (
07-data-import
) and “Layers” (09-layers
) tutorials.
Wednesday
Read Chapter 2 Probability and complete the “Probability” tutorial (
021-probability
) from the primer.tutorials package.From the positron.tutorials package, complete the “Quarto Websites Introduction” (
06-websites-1
) tutorial.From the r4ds.tutorials package, complete the “Exploratory data analysis” (
10-exploratory-data-analysis
) and “Communication” (11-communication
) tutorials.
Thursday
Read Chapter 3 Sampling and complete the “Sampling” tutorial (
031-Sampling
) from the primer.tutorials package.From the positron.tutorials package, complete the “Quarto Websites Advanced” (
07-websites-2
) tutorial.From the r4ds.tutorials package, complete the “Logical vectors” (
12-logical-vectors
) tutorial.Read Cardinal Virtues.
Week 3: June 30
Identify the data which you will use for your Data Project. The U.S. Census is a great source.
Monday
Read Chapter 4 Models and complete the “Models” tutorial (
041-Models
) from the primer.tutorials package.Read chapters 1 – 3 from from Analyzing US Census Data: Methods, Maps, and Models in R by Kyle Walker.
From the tidycensus.tutorials package, complete the “An introduction to tidycensus” (
02-introduction
) and “Wrangling Census data with Tidyverse tools” (03-wrangling
) tutorials.Come to class with the url for at least one data source you will be using for your Data Project.
Wednesday
Read Chapter 5 Two Parameters and complete the “Two Parameters” tutorial (
051-two-parameters
) from the primer.tutorials package.Read chapters 4 and 5 from from Analyzing US Census Data: Methods, Maps, and Models in R by Kyle Walker.
From the tidycensus.tutorials package, complete the “Exploring US Census data with visualization” (
04-exploring
) and “Census geographic data and applications in R” (05-geographic
) tutorials.
Thursday
Read Chapter 6 Three Parameters: Causal and complete the “Three Parameters: Causal” tutorial (
061-three-parameters-causal
) from the primer.tutorials package.From the r4ds.tutorials package, complete the “Numbers” (
13-numbers
) tutorial.Read Cardinal Virtues.
Week 4: July 7
The main focus on the 4th week is individual presentations of your Data Project. Do you love soccer or wine or NYC politics? The Data Project provides you with an opportunity to study that topic in depth. Your Data Project will be, for most of you, the first item in your professional portfolio, something so impressive that you will be eager to show it to graduate schools or potential employers.
Monday
From the primer.tutorials package, complete the “Project” tutorial before class, or you will be removed the course. There are no extensions for this assignment.
You must have a draft of your final project, including a Quarto website and your paragraph introduction.
Wednesday
- Practice presentations and feedback. You must have your presentation ready to go!
Thursday
- Data project presentations. You must invite someone, and bcc your TF when you do so.
Week 5: July 14
Renewed focus on models for inference.
Monday
Read Cardinal Virtues.
Read Chapter 7 Mechanics and complete the “Mechanics” tutorial (
071-mechanics
) from the primer.tutorials package.Read chapters 14, 15 and 16 from R for Data Science (2e) and, from the r4ds.tutorials package, complete the “Strings,” “Regular expressions” and “Factors” tutorials.
Wednesday
Read Chapter 8 Four Parameters: Categorical and complete the “Four Parameters: Categorical” tutorial (
081-four-parameters-categorical
) from the primer.tutorials package.Read chapter 17 from R for Data Science (2e) and, from the r4ds.tutorials package, complete the “Dates and times” tutorial.
Read chapter 18 from R for Data Science (2e) and, from the r4ds.tutorials package, complete the “Missing values” tutorial.
Thursday
Read Chapter 9 Five Parameters and complete the “Five Parameters” tutorial (
091-five-parameters
) from the primer.tutorials package.Read chapter 19 from R for Data Science (2e) and, from the r4ds.tutorials package, complete the “Joins” tutorial.
Week 6: July 21
Monday
Read Chapter 10 N Parameters and complete the “N Parameters” tutorial (
101-n-parameters
) from the primer.tutorials package.Read chapter 20 from R for Data Science (2e) and, from the r4ds.tutorials package, complete the “Spreadsheets” tutorial.
Read chapter 21 from R for Data Science (2e) and, from the r4ds.tutorials package, complete the “Databases” tutorial.
Wednesday
Complete the “Cumulative” tutorial (
111-cumulative
) from the primer.tutorials package.Read chapter 22 from R for Data Science (2e) and, from the r4ds.tutorials package, complete the “Arrow” tutorial.
Read chapter 23 from R for Data Science (2e) and, from the r4ds.tutorials package, complete the “Hierarchical data” tutorial.
Thursday
Complete the “Stops” tutorial (
131-stops
) from the primer.tutorials package.Read chapter 24 from R for Data Science (2e) and, from the r4ds.tutorials package, complete the “Web scraping” tutorial.
Week 7: July 28
Monday
- Read chapter 25 from R for Data Science (2e) and, from the r4ds.tutorials package, complete the “Functions” tutorial.
Wednesday
- Read chapter 26 from R for Data Science (2e) and, from the r4ds.tutorials package, complete the “Iteration” tutorial.
Thursday
- Read chapter 27 from R for Data Science (2e) and, from the r4ds.tutorials package, complete the “A field guide to base R” tutorial.
Week 8: August 4
Monday
From the primer.tutorials package, complete the “Project” tutorial before class, or you will be removed the course. There are no extensions for this assignment.
You must have a draft of your final project, including a Quarto website and your paragraph introduction.
Wednesday
- Practice presentations and feedback. You must have your presentation ready to go!
Thursday
- Final project presentations. You must invite someone, and bcc your TF when you do so.