Predicting how Ukrainian refugees move under a changing war situation
Published:
This project proposes a simple framework for predicting how Ukrainian refugees move as the war situation changes. The model predicts refugee movement under unseen war situations with about 80% accuracy. The first part shows simulations produced by the trained model. The second part describes the methodology behind them.
Simulation
- Simulation 1: the movement of 10000 people in Bakhmut district from January 24, 2022 to July 24, 2022.
- Simulation 2: the movement of 10000 people in Mariupol district from January 24, 2022 to July 24, 2022.
- Simulation 3: the movement of 10000 people in Luhansk district from January 24, 2022 to July 24, 2022.
Dataset
- Two datasets are used in this project. The first contains movement data of people in Ukraine from February 2021 to July 2022. The second is the Ukraine-Russia war events dataset collected by ACLED.
Framework
- Derive trajectories:
- Filtering: We look at the data at the week level. For every person and every week, we find the raion in which the person spent the most days and use it as that person's location for the week. The goal of this step is to filter out unwanted noise in the raw dataset (we found heavy, unexplained daily patterns in the raw data).
- We then take the intersection across weeks to keep only individuals who appear in every week, and drop those who never moved. This leaves 10913 trajectories that can be used for training.
- Derive war vectors: We group the war events by week and by raion. Each resulting vector contains the severity of the war situation in every raion for one specific week.
- Below are three visualizations of war vectors in different time periods. We can clearly see the outbreak and progression of the war.
- Train a neural network predictor: The neural network takes the individual's current raion and the current war vector as input and outputs a probability distribution over the raion the individual will be in next. The underlying heuristic is that individuals largely base their choice of next location on the current war situation.
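The first two steps above can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the record layouts (`(person, day, raion)` for movement data and `(day, raion, severity)` for war events), the function names, and the choice to sum a per-event severity score are all assumptions, since the source does not say how severity is measured. It also assumes at most one movement record per person per day.

```python
from collections import Counter, defaultdict
from datetime import date


def week_key(d):
    """ISO (year, week) pair used to bucket days into weeks."""
    iso = d.isocalendar()
    return (iso[0], iso[1])


def weekly_locations(pings):
    """Map person -> {week: raion where the person spent the most days}.

    Assumes one (person, day, raion) record per person per day.
    """
    days = defaultdict(Counter)
    for person, day, raion in pings:
        days[(person, week_key(day))][raion] += 1
    result = defaultdict(dict)
    for (person, week), counts in days.items():
        result[person][week] = counts.most_common(1)[0][0]
    return result


def build_trajectories(pings):
    """Keep people observed in every week who changed raion at least once."""
    per_person = weekly_locations(pings)
    all_weeks = sorted({w for weeks in per_person.values() for w in weeks})
    trajectories = {}
    for person, weeks in per_person.items():
        if len(weeks) != len(all_weeks):
            continue  # missing in at least one week
        path = [weeks[w] for w in all_weeks]
        if len(set(path)) > 1:  # drop people who never moved
            trajectories[person] = path
    return all_weeks, trajectories


def war_vectors(events, raions):
    """Map week -> list of summed severities, one entry per raion.

    Summing a per-event severity score is an assumption of this sketch.
    """
    index = {r: i for i, r in enumerate(raions)}
    vectors = defaultdict(lambda: [0.0] * len(raions))
    for day, raion, severity in events:
        vectors[week_key(day)][index[raion]] += severity
    return dict(vectors)
```

Each trajectory is then a list with one raion per week, and each week has one war vector, so a training sample for the predictor can be formed by pairing a person's raion in week t and that week's war vector with the person's raion in week t+1.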
Result
- To test how the network performs under unseen trajectories and unseen war situations, we split the data into one training dataset and two test datasets.
- We first cut every trajectory at a chosen point in time. The part of each trajectory after this point serves as the dataset for testing the model's performance under unseen war situations.
- We then split the trajectories before this point at a ratio of 0.8 to 0.2. The 0.2 part serves as the dataset for testing the model's performance on unseen trajectories.
- The model achieves an accuracy of 86.6% on unseen trajectories and 80.3% on unseen war situations.
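The split described above might look like the following sketch. It assumes the 0.8/0.2 split is made per person (so that held-out trajectories are truly unseen), that each trajectory is a list with one raion per week, and that `war_by_week` is a list of war vectors aligned with those weeks. The names `split_dataset`, `make_samples`, `cutoff`, and `accuracy` are hypothetical, and `predict` stands in for the trained network's most likely next raion.

```python
import random


def make_samples(trajectory, war_by_week):
    """Turn one trajectory into (week, current raion, war vector, next raion)."""
    return [
        (t, trajectory[t], war_by_week[t], trajectory[t + 1])
        for t in range(len(trajectory) - 1)
    ]


def split_dataset(trajectories, war_by_week, cutoff, train_ratio=0.8, seed=0):
    """Return (train, unseen_trajectories, unseen_war) sample lists.

    Samples from week `cutoff` onward test unseen war situations.
    Earlier samples are split by person (an assumption of this sketch).
    """
    people = sorted(trajectories)
    random.Random(seed).shuffle(people)
    n_train = int(len(people) * train_ratio)
    train_people = set(people[:n_train])

    train, unseen_traj, unseen_war = [], [], []
    for person in people:
        for sample in make_samples(trajectories[person], war_by_week):
            week = sample[0]
            if week >= cutoff:
                unseen_war.append(sample)
            elif person in train_people:
                train.append(sample)
            else:
                unseen_traj.append(sample)
    return train, unseen_traj, unseen_war


def accuracy(predict, samples):
    """Share of samples where predict(current, war) equals the next raion."""
    hits = sum(predict(cur, war) == nxt for _, cur, war, nxt in samples)
    return hits / len(samples)
```

A useful sanity check with this setup is to score a trivial baseline such as `lambda cur, war: cur` (everyone stays where they are) on both test sets, which shows how much of the reported accuracy comes from people simply not moving in a given week.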
References
Raleigh, C., Kishi, R. & Linke, A. Political instability patterns are obscured by conflict dataset scope conditions, sources, and coding choices. Humanit Soc Sci Commun 10, 74 (2023). https://doi.org/10.1057/s41599-023-01559-4