Week 1

This is a short week, with only one class. We’ll mostly be getting to know one another and getting up to speed with Google Drive and Slack. By Sunday night, you need to have introduced yourself on Slack and written your first Data Diary entry. For Tuesday in class, you need to come prepared with some links for your Wikipedia article.


  • Introduction to the practice of data journalism
  • Paper and pencil data collection on NYTimes
  • What is newsworthy?

Week 2

On Tuesday, we wrapped up our NYTimes data collection, discussed chapter 1 of Numbers in the Newsroom, and talked through the Wikipedia authoring process. Thursday’s class was cancelled.

Wikipedia entries are due Monday at midnight.

By Tuesday, I’d like you to have read Numbers in the Newsroom through the section “Going further with changes” (“through” means you read that section but don’t need to go beyond it), and Data Organization in Spreadsheets. We’ll start interviewing a spreadsheet during our Zoom meeting.


  • Wrapping up NYTimes data collection
  • Discussing Chapter 1 of Numbers in the Newsroom
  • Starting Wikipedia entry


  • No class

Week 3

On Tuesday, we discussed the process of Wikipedia authoring, broke down some pieces of a story, talked about the news (and how it related to the Numbers in the Newsroom piece), and started interviewing our spreadsheet.

For Thursday, you should do the following:

  • Read this article on interviewing, and compare and contrast it with the infographic on starting a data analysis that I showed in the last class.
  • Go to the Google Sheet on the natural amenities scale, choose File -> Make a Copy, and save the copy in the Google Drive folder with your name.
  • Once it’s copied into your folder, you can do whatever seems useful to start “interviewing” it.
  • If you do something to the data, write down the steps you’ve taken so we can discuss them together.


  • Zoom meeting
  • Discuss the experience of writing a Wikipedia article
  • Discussing Chapter 2 (beginning) of Numbers in the Newsroom
  • Journalistic style/elements
  • Beginning to interview a dataset (natural amenities score)


Week 4

Your one-number stories are due Monday at midnight. As always, you should be completing four data diaries for the week.

For Tuesday, I would like you to read the following:

For Thursday, I would like you to read:



  • FOIA the dead
  • Science writing

Week 5

You have a few things due on Monday this week:

  • Edited one-number story
  • Edited Wikipedia article
  • FOIA the dead submission

You also need to do some reading:

  • Your group’s assigned research paper
  • Chapter 3 of Numbers in the Newsroom



Week 6

For Monday, you need to finish your one-variable visualization. Upload it to Moodle or send me a link on Slack. You should be working on your Science Reporting, and trying to find a time to interview your faculty member. We’ll keep talking about data visualization (read the selections linked below), and also explore a few new technologies: OpenRefine and Tabula.


  • More on data visualization
  • What makes something data journalism?


  • Pulling data from PDFs using Tabula
  • Research tips from the library

Week 7

No writing is due on Monday, but please come to class with your CSV of data from the Freeing Data from PDFs exercise, and be prepared to work with git and GitHub (instructions are linked on the same page). This will require you to create a GitHub username, install git, and install a git client (optional, but recommended). I would also like you to install OpenRefine (linked at the top of Moodle) for data wrangling.

You also need to do some reading.

  • Chapter 4 of Numbers in the Newsroom, The Standard Stories
  • The LA Times investigation on arrests of homeless people, and the associated Ask Me Anything on Reddit. I’ve also linked to the GitHub repo, which I would like you to glance at, in particular the IPython notebook where they organized their R code.
  • Finally, you should be moving forward with your Science Reporting story, coordinating with the faculty member to find an interview time, prepping questions, doing background research, etc. Remember, you may want to bring a recording device to the interview with you, but be sure to ask the faculty member if it’s okay to record.


  • Standard stories
  • Standardizing data and writing a data dictionary


  • Standard stories
  • Story ideas
  • git and GitHub: fork, clone

Spring break

Over spring break, you should be thinking about your Science Reporting (due March 26) and working on your Standard Story (due March 19, right when we get back). I’d also like you to try working through some of the [redacted] activities I assigned. There are no data diaries due for break.

Week 8

This is going to be a pretty light week. On Tuesday, we’re visiting the College Archives. You don’t need to do anything to prepare for that.


  • Visit to the Smith Archives
  • Presentation of the SDS major, Ford 240


  • OpenRefine
  • More git and GitHub: fetching upstream
  • Working with survey data

Week 9

We’re starting to think about programming now. You should be finishing up those DataCamp courses. We’re also considering polling data! For Thursday, I’d like you to read:


  • Final story idea brainstorming
  • Intro to R


  • Survey data
  • More R

Week 10

Based on the initial feedback from the mid-semester assessment, we’re going to take a bit of a step back on Tuesday and talk more about journalism and writing. There is no writing assignment due on Monday. However, I would like you to complete your [redacted] courses (or, if they are all review, choose a different course to work through) and submit another pull request for your cleaned-up BLL data. Reading for Tuesday:

  • Handbook of Independent Journalism, sections 1, 3, 4, and 7
  • Failing the Frail
  • Nursing Home Neglect (starts page 11 of the PDF below)

Reading for Thursday:


  • Taking a step back
  • What is journalism?
  • What is data journalism?


  • Sentence workshopping
  • Mapping in R

Week 11

This week, we’re thinking about algorithms. We’ve been discussing them throughout the course because they’ve been in the news a lot, but now we’ll turn our focus there more directly. For Tuesday, I would like you to read the following:

For Thursday, read:


  • More mapping
  • Algorithms


  • Algorithms

Week 12

On Tuesday, I’d like us to think about timelines in data journalism. I think these can be either written or visual, although most of them blend elements of both. Please read:

Thursday, we’ll be hosting Rachel Schutt, who will be on campus for a talk later in the day. I’d like you to come with some questions for her. These can be about her career path, advice for students, work she’s done, etc. Rachel has done a ton of interesting stuff in her life. She coauthored a book called Doing Data Science with Cathy O’Neil, who wrote Weapons of Math Destruction. She was chief data scientist at News Corp, and is now a managing director at BlackRock. Both those places are major organizations, and not without controversy. I often talk to people about whether they should try to work for companies whose mission they totally believe in, or try to be a force for good in a place that might not be the most admirable. It seems to me that Rachel has chosen to do the latter, which is really interesting and probably hard!

I’d like you to research Rachel for yourself. The News resources on the library website could be a good place to look (maybe try Nexis Uni). Here are a few links to get you started:


  • Timelines


  • Rachel Schutt visiting class
  • Extra credit opportunity: Rachel Schutt’s talk, A Humanistic Approach to Data Science, 5 pm in Stoddard G2.

Week 13

This week, we’ll be thinking about interactivity in journalism. I think we can continue to consider many of the pieces I brought in last week, because they all have interactive elements. In addition to those, I’d like you to look at the following:

Like last week, I’m not expecting you to read every word of these pieces, but rather to focus on the form. How do the interactive elements support the story? We’ll be trying out some interactives ourselves in class.


  • Interactivity


  • Interactivity


  • Final workshop


  • Present final work