Assignment 1: Evaluation and extension of Hegre et al.’s predictions of armed conflict
Description
Hegre et al. (2013) predict “changes in global and regional incidences of armed conflict for the 2010-2050 period.” Hegre et al. (2021) evaluate these predictions over the 2010-2018 period. Both papers provide their code in their supplemental materials, and the UCDP/PRIO Armed Conflict Data Set is publicly available.
You have two goals for this assignment:
- Evaluation. Perform your own evaluation of the quality and value of the 2013 paper’s predictions over the 2010-2018 period. You have complete freedom in how you evaluate these predictions, as long as your decisions are justified. You may use the 2021 paper for inspiration, but your goal is not simply to replicate their evaluation procedure or results.
- Extension. Extend the 2013 and/or the 2021 paper(s) in some way. Here are some possible directions, but please do not feel limited by these ideas:
- Calibration. How calibrated are the 2013 paper’s risk scores? Table 1 suggests that reducing the classification threshold can substantially increase the true positive rate without a large increase in the false positive rate.
- Longer-run predictions. How well do longer-run predictions from the 2013 paper’s models hold up? You could consider generating additional predictions from the 1970-2000 model and/or using additional data from the 2022 update of the UCDP/PRIO Armed Conflict Data Set.
- Time-varying drivers. Relatedly, the 2021 paper states that “the results indicate that the drivers of armed conflict are fairly stable over time” because the model did not perform much worse over the 2010-2018 period than over the 2001-2009 period of the original study. But note that both periods are the first 9 years after the cutoff dates of their respective models, and perhaps drivers do not shift over such a short span. Is there another way to get at this question of the time variance of drivers?
- Extending the original models. Does training the original models on more of the now-available data improve their performance? The 2021 paper suggests that adding data about political institutions could increase the predictive power of their models, and cites several recent papers with more comprehensive democracy data or forecasts of changes to political institutions.
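To make the calibration and threshold ideas above concrete, here is a minimal sketch using synthetic data. The variable names (`risk`, `onset`) and the data itself are placeholders, not the Hegre et al. outputs; with the real predictions you would substitute the papers’ country-year risk scores and observed conflict indicators.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: predicted conflict risk and observed onset.
# By construction the risks are well calibrated, which real models may not be.
risk = rng.uniform(0, 1, 1000)
onset = (rng.uniform(0, 1, 1000) < risk).astype(int)

def tpr_fpr(y, p, threshold):
    """True and false positive rates at a given classification threshold."""
    pred = p >= threshold
    tpr = (pred & (y == 1)).sum() / max((y == 1).sum(), 1)
    fpr = (pred & (y == 0)).sum() / max((y == 0).sum(), 1)
    return float(tpr), float(fpr)

def calibration_table(y, p, n_bins=10):
    """Mean predicted risk vs. observed frequency within equal-width bins."""
    rows = []
    edges = np.linspace(0, 1, n_bins + 1)
    for lo, hi in zip(edges[:-1], edges[1:]):
        mask = (p >= lo) & (p < hi)
        if mask.any():
            rows.append((float(p[mask].mean()), float(y[mask].mean()), int(mask.sum())))
    return rows

# Lowering the threshold raises the TPR (and typically the FPR as well),
# the trade-off that Table 1 of the 2013 paper points to.
for t in (0.5, 0.3, 0.1):
    print(t, tpr_fpr(onset, risk, t))
```

A reliability diagram is just `calibration_table` plotted: mean predicted risk on the x-axis against observed frequency on the y-axis, with the 45-degree line marking perfect calibration.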
Groups
You should work on this assignment in groups of 2-3. Interdisciplinary groups are strongly recommended, but not required. If you would like to work in a group of 1 or 4, please email us for permission (we will only permit this if there is a compelling reason).
Schedule
- Wednesday, February 21, 2024, 11:59pm: Send an email to Shreyas at sgandlur@princeton.edu with a sketch of how you intend to evaluate and extend the Hegre papers.
- Please send one email per group and include the names of all team members.
- You don’t have to stick to this sketch, but we do think it is important for you to spend some time thinking and designing your study before you begin computing.
- If we think this sketch doesn’t meet the bar for what we’re looking for, we will let you know before class on Monday, February 26. If you don’t hear from us by then, you can assume that your sketch is adequate.
- Wednesday, March 6, 2024, 11:59pm: Send an email to Shreyas at sgandlur@princeton.edu with your completed assignment:
- Along with a writeup, your submission should include documented code, any necessary datasets (or scripts to download such datasets), and a script that allows us to reproduce all the figures and numbers in your writeup. This is called computational reproducibility and is an important part of reproducible science.
- Your submission can be a ZIP file or a link to a git repository (make a separate submission branch or tag if you plan to keep working on it).
We recognize that you only have 3 weeks to complete this assignment so please be realistic about what is possible. If the assignment seems interesting and generative, you are welcome to keep working on it for the final project.
This assignment is worth 25% of your course grade. We will evaluate you on problem selection, creativity, correctness, thoroughness, quality of writing, and related work. See here for details.
Please submit on time. This being a grad seminar, adjudicating lateness penalties and such is not a good use of instructor time. If you are unable to submit on time due to unforeseeable circumstances, reach out to us.
Assignment 2: Participate in the Predicting Fertility data challenge
Participate in the Predicting Fertility data challenge (PreFer).
- Complete your application by Sunday, March 24, 2024.
- Submit a model by the last day of classes: Friday, April 26, 2024.
- Participants in the PreFer challenge will have the chance to be co-authors in a community paper and submit to a special issue of a journal, as described here.
- Some students may choose to continue to develop their submissions as their final project for this class.
Description
Your assignment has three main parts:
- Make and test one hypothesis about predictability using the PreFer data. In your write-up you should be clear about how your hypothesis relates to some of the readings in class (or how it differs from what we learned about in class).
- Submit a model to PreFer by Friday, April 26, 2024 by 5pm ET.
- Submit a write-up to us describing what you did by the end of reading period (Friday, May 3, 2024 at 5pm ET).
Rubric
Title and abstract (5 points)
- 5 pts: Clear and correct
- 3 pts: Problems with clarity or correctness
- 1 pt: Problems with clarity and correctness
State and test one hypothesis about predictability (20 points)
- 16-20 pts: Thorough investigation with no obvious holes. Clear connection to some of the ideas in class, or discussion of why such a connection is not possible.
- 11-15 pts: Covered a few bases but left some obvious holes.
- 5-10 pts: Left gaping holes in the investigation, incorrect data analysis, etc.
Submit a model to PreFer by April 26, 2024 (20 points)
- 20 pts: Correctly submit a model following the rules of PreFer (partial credit possible).
Quality of writing and presentation (25 points)
This includes linguistic clarity, exposition of technical concepts, logical structure, justification of claims, explanation of background concepts, quality of figures, and discussion of results.
- 25 pts: The writing makes the work more interesting and accessible
- 21-24 pts: The writing neither adds to nor detracts from the research itself
- 10-20 pts: The writing interferes with the reader’s ability to learn from the work