From the Nested Knowledge Team
Our team at Nested Knowledge is exploring some interesting questions on the relationship between social behavior and the spread of the novel coronavirus. One idea we explored was finding an indicator of compliance with the law; in this case, we used tax compliance as a proxy for law-abiding behavior. The basic hypothesis: if a country is more tax-compliant, we could assume that its residents are law-abiding and will therefore follow social distancing and other guidelines conscientiously.
Our analysts used this shared datasheet of statistics on COVID-19 cases and mortality, which summarizes the raw data made available from a range of sources here, including cases and deaths by country and population. The data on tax evasion comes from a publication in the Journal of International Development here, and the data on tax revenue for each country comes from the International Monetary Fund World Revenue Longitudinal Data (IMF-WoRLD).
Some disclaimers: pandemic infection rates are complex, with many variable factors. Countries differ in population density, pandemic timeline, diagnostic capability, standards for tracking patients, and quality of healthcare, in addition to how widely social distancing is recognized and followed. Because of the potential impact of these confounding variables, our statisticians do not perform inferential statistics. However, one thing that we do know is that human contact spreads the virus faster, so a culture of disregarding laws or rules surrounding social distancing *could* potentially lead to an increased occurrence of COVID-19 infection. To analyze this data, our experts at Nested Knowledge performed correlation analysis, simple linear regression, and Mann-Whitney U tests.
Some key results from the analysis:
- In the analyzed data, Pakistan experienced serious revenue loss due to tax evasion: it is the only country in the top 10 for both total tax loss and percentage tax loss (total tax loss as a percentage of total government tax revenue). It lost 12.06 billion dollars, equal to 51.16% of its total tax revenue in 2013. Yet its confirmed-case and death rates (total confirmed cases or deaths as a percentage of the total population) are not high (Confirmed % Pop: 0.0033%, Death % Pop: 0.000064%).
- The percentage tax loss (total revenue loss / gross domestic product) has a moderate negative correlation with the percentage confirmed/death rate (total confirmed/death cases divided by population). The Spearman correlation between the percentage tax loss and the percentage confirmed rate is -0.48, and that between the percentage tax loss and the percentage death rate is -0.43 (Figure 1).
- The result above is very counterintuitive, and we felt that excluding some potential outliers might help. We therefore applied 2 exclusion criteria: ① excluding countries with a population of less than 1M (10 countries, Figure 2), ② excluding countries with a percentage confirmed rate of more than 20%. Applying either criterion produced results very similar to the original ones (Figure 1).
- We also found that the Pearson correlation increases after applying the exclusion criteria, and we performed simple linear regression on the data remaining under exclusion criterion ② (Figure 3). Although the correlation is slight, it reconfirms the counterintuitive negative correlation between the percentage tax loss and the percentage confirmed rate.
- We divided all the countries into 2 groups: countries with a positive tax loss are in the tax loss group, and all others are in the tax gain group. A Mann-Whitney U test found a significant difference between these 2 groups in both the percentage confirmed rate and the percentage death rate (p-values of 0.001 and 0.009, respectively), with higher rates in the tax gain group than in the tax loss group. This also suggests that a greater tax loss is associated with lower percentage confirmed/death rates.
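For readers who want to try the group comparison from the last bullet on their own data, here is a minimal sketch of a two-sided Mann-Whitney U test in plain Python. It is an illustration only, not the code used for this analysis: the function names are hypothetical, the p-value uses the normal approximation without tie or continuity correction, and the rates in the two lists are made-up numbers.

```python
from math import erfc, sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    ordered = sorted(values)
    ranks = []
    for v in values:
        first = ordered.index(v)   # 0-based position of the first occurrence
        count = ordered.count(v)   # how many values are tied with v
        ranks.append(first + (count + 1) / 2)
    return ranks

def mann_whitney_u(group_a, group_b):
    """Two-sided Mann-Whitney U test (normal approximation, an assumption)."""
    n_a, n_b = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    rank_sum_a = sum(ranks[:n_a])
    u_a = rank_sum_a - n_a * (n_a + 1) / 2          # U statistic for group A
    mean_u = n_a * n_b / 2                          # expected U under the null
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12)   # standard deviation of U
    z = (u_a - mean_u) / sd_u
    p_two_sided = erfc(abs(z) / sqrt(2))            # equals 2 * (1 - Phi(|z|))
    return u_a, p_two_sided

# Made-up percentage confirmed rates, for illustration only.
tax_loss_rates = [0.01, 0.02, 0.03, 0.05]
tax_gain_rates = [0.04, 0.08, 0.10, 0.20]

u_stat, p_value = mann_whitney_u(tax_loss_rates, tax_gain_rates)
print(f"U = {u_stat}, two-sided p = {p_value:.4f}")
```

With groups this small an exact test is preferable to the normal approximation; SciPy's `scipy.stats.mannwhitneyu` is a more complete alternative if it is available.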
Figure 1. Spearman correlation between the percentage tax loss and the percentage confirmed/death rate for 3 scenarios (Original, Excluding countries with less than 1M population, Excluding countries with the percentage confirmed rate more than 20%).
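The three scenarios in Figure 1 can be reproduced in outline with a short script. The sketch below is an assumption-laden illustration rather than our actual workflow: the row fields (`population`, `tax_loss_pct`, `confirmed`), the helper names, and every number in `ROWS` are hypothetical. Spearman correlation is computed here as the Pearson correlation of the ranks.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

def confirmed_pct(row):
    """Confirmed cases as a percentage of the total population."""
    return 100 * row["confirmed"] / row["population"]

# Made-up rows for six placeholder countries, for illustration only.
ROWS = [
    {"name": "A", "population": 500_000, "tax_loss_pct": 9.0, "confirmed": 500},
    {"name": "B", "population": 5_000_000, "tax_loss_pct": 1.0, "confirmed": 50_000},
    {"name": "C", "population": 20_000_000, "tax_loss_pct": 2.0, "confirmed": 100_000},
    {"name": "D", "population": 80_000_000, "tax_loss_pct": 4.0, "confirmed": 160_000},
    {"name": "E", "population": 10_000_000, "tax_loss_pct": 0.5, "confirmed": 2_500_000},
    {"name": "F", "population": 200_000_000, "tax_loss_pct": 6.0, "confirmed": 100_000},
]

# Each scenario is a filter deciding which rows are kept.
SCENARIOS = {
    "original": lambda row: True,
    "population >= 1M": lambda row: row["population"] >= 1_000_000,
    "confirmed rate <= 20%": lambda row: confirmed_pct(row) <= 20.0,
}

results = {}
for label, keep in SCENARIOS.items():
    kept = [row for row in ROWS if keep(row)]
    results[label] = spearman(
        [row["tax_loss_pct"] for row in kept],
        [confirmed_pct(row) for row in kept],
    )
    print(f"{label}: n = {len(kept)}, Spearman rho = {results[label]:.3f}")
```

Writing each exclusion criterion as a filter function keeps the correlation code identical across scenarios, so only the set of countries changes between runs.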
Figure 2. The 10 countries excluded under exclusion criterion ①.
Figure 3. Left: Pearson correlation between the percentage tax loss and the percentage confirmed/death rate for 3 scenarios (Original, Excluding countries with less than 1M population, Excluding countries with the percentage confirmed rate more than 20%). Right: Linear regression for the last scenario.
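A simple linear regression like the one in the right panel of Figure 3 can be sketched with ordinary least squares. This is a minimal example under stated assumptions: the function name `fit_line` is hypothetical, and the x and y values are made up rather than taken from our data sheet.

```python
from math import sqrt

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x.

    Returns (slope, intercept, r), where r is the Pearson correlation.
    """
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Made-up values: percentage tax loss (x) and percentage confirmed rate (y).
tax_loss_pct = [1.0, 2.0, 3.0, 4.0, 5.0]
confirmed_rate_pct = [0.9, 0.8, 0.5, 0.4, 0.4]

slope, intercept, r = fit_line(tax_loss_pct, confirmed_rate_pct)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}, r = {r:.3f}")
```

A negative slope in this toy data mirrors the direction of the relationship described above; it says nothing about its cause, for the reasons given in the disclaimers.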
These observations do not represent our opinion on tax compliance in different countries in any way, and expert recommendations (not ours) on social activity and social distancing should be the central guide to behavior during this crisis.
As the scientific community rallies to fight the pandemic, we encourage data analytics experts and interested scientists to use our data sheet to build their own data models; you can “mirror” it in your own Google Sheet using the ImportRange function. We hope that this can be a start for researchers to pool data and construct their own analyses of factors related to population, socioeconomics, culture, and other datasets.