I am an (almost finished) PhD student in biostatistics and infectious disease modelling (population-level); my research focuses on Bayesian statistical methods to produce improved estimates of the number of new COVID-19 infections. During the pandemic, I was a member of SPI-M-O (the UK government committee providing expert scientific advice based on infectious disease modelling and epidemiology).
I enjoy applying my knowledge broadly, including to models of future pandemics, big picture thinking on pandemic preparedness, and forecasting.
Could you please expand on why you think a Pareto distribution is appropriate here? Tail probabilities are often quite sensitive to the distributional assumptions, and it can be tricky to determine whether something is truly power-law distributed.
When I looked at the same dataset, albeit processing the data quite differently, I found that a truncated (cutoff) power law appeared to be a good fit. With the best-fit parameters, this gives much lower probabilities for extreme events. In particular, the dataset contains too few of the most severe pandemics (COVID-19 and 1918 influenza) relative to what a pure power law would predict; this issue is visible in Fig. 1 of Marani et al. Could you please add the data to your tail distribution plot to assess how good a fit it is?
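To illustrate how much the choice of tail matters, here is a minimal Python sketch comparing the survival function P(X > x) of a pure Pareto with that of a power law with an exponential cutoff. The parameter values (`X_MIN`, `ALPHA`, `X_C`) and the function names are made up for illustration; they are not fitted to the Marani et al. data, and this is not necessarily the parameterisation used in the post. For simplicity the same exponent is used in both models, whereas in a real comparison each model would be fitted separately.

```python
import math


def pareto_sf(x, x_min, alpha):
    """P(X > x) for a pure Pareto with tail index alpha and lower bound x_min."""
    if x <= x_min:
        return 1.0
    return (x / x_min) ** (-alpha)


def _cutoff_mass(lo, hi, alpha, x_c, n=20000):
    """Trapezoid-rule integral of t^(-alpha-1) * exp(-t / x_c) over [lo, hi].

    Uses the substitution t = exp(u), so the integrand in u is
    t^(-alpha) * exp(-t / x_c), evaluated on an evenly spaced grid in u.
    """
    u_lo, u_hi = math.log(lo), math.log(hi)
    h = (u_hi - u_lo) / n
    total = 0.0
    for i in range(n + 1):
        t = math.exp(u_lo + i * h)
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * t ** (-alpha) * math.exp(-t / x_c)
    return total * h


def cutoff_sf(x, x_min, alpha, x_c):
    """P(X > x) for density proportional to x^(-alpha-1) * exp(-x / x_c), x >= x_min."""
    if x <= x_min:
        return 1.0
    # Beyond x + 50 * x_c the exponential factor makes the integrand negligible.
    upper = x + 50.0 * x_c
    return _cutoff_mass(x, upper, alpha, x_c) / _cutoff_mass(x_min, upper, alpha, x_c)


# Hypothetical values in arbitrary severity units, chosen only for illustration.
X_MIN, ALPHA, X_C = 1.0, 0.5, 100.0

for x in (10.0, 100.0, 1000.0):
    print(f"x = {x:7.1f}  Pareto: {pareto_sf(x, X_MIN, ALPHA):.3e}  "
          f"cutoff: {cutoff_sf(x, X_MIN, ALPHA, X_C):.3e}")
```

The two curves are similar well below the cutoff scale and diverge by orders of magnitude beyond it, which is exactly the region that matters for extinction-level estimates.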
A final note: I think you’re calculating the probability of extinction in a single year, but the worst pandemics historically have lasted multiple years. The total death toll from a pandemic is perhaps the quantity of most interest.