
From the “correlation is not necessarily causation” department

March 6, 2017

We have added new population data from the U.S. Census Bureau to State Data Lab. The data break down each state's population into two categories: the “Native Born” share of total population and the “Foreign Born” share of total population.

In turn, the Census Bureau breaks down the latter category into two sub-categories – “Foreign Born, Naturalized Citizens” and “Foreign Born, Non-US Citizens.”

Across the 48 contiguous states, there is a modest tendency: states that rank higher on the “Foreign Born, Non-US Citizens” share of total population tend to have weaker state government financial conditions, as measured by our Taxpayer Burden.

The relationship, if any, is a weak one, however.  Here’s a look at a scatter plot with those rankings.  The 48 states are ranked across the bottom on “Foreign Born, Non-US Citizens” share of total population, and the dots relate those rankings to the corresponding state ranking on our “Taxpayer Burden” measure of state government financial condition. 

The trend line indicates the tendency: states that rank higher on the “Foreign Born, Non-US Citizens” share of population tend to rank lower on our measure of state financial conditions. But you can also see that there is a wide range of outcomes. In the language of statistics, this is a weak correlation.
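For readers who want to see what a weak rank correlation looks like in numbers, here is a minimal sketch in Python. The six states and their rankings are made up for illustration and are not State Data Lab data. The sketch also assumes that rank 1 means the highest non-citizen share and the best financial condition, which is a convention chosen for the example. Computing the ordinary correlation coefficient on rankings (with no ties) gives what statisticians call a Spearman rank correlation.

```python
from math import sqrt


def pearson(xs, ys):
    """Ordinary (Pearson) correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical rankings for six made-up states (NOT actual State Data Lab data).
# Assumed convention: rank 1 = highest non-citizen share / best financial condition.
noncitizen_rank = [1, 2, 3, 4, 5, 6]
financial_rank = [5, 2, 6, 1, 4, 3]

# Applying Pearson's formula to tie-free rankings yields the Spearman rank correlation.
r = pearson(noncitizen_rank, financial_rank)
print(f"rank correlation: {r:.3f}")
print(f"share of variation in one ranking associated with the other: {r * r:.3f}")
```

With these made-up numbers the coefficient comes out at about -0.26, so less than 7 percent of the variation in one ranking is associated with the other. That is the kind of wide scatter around a trend line that "weak correlation" describes.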

And when you compare the simple pairwise correlation coefficients between our Taxpayer Burden rankings and state rankings for the share of public sector employees covered by collective bargaining agreements, the share of lawyers in total population, and the share of votes for Democratic presidential electors, you see markedly stronger correlation coefficients, i.e., stronger apparent relationships.

Statistics also has a branch of analysis called “Regression Analysis.” Once you start dipping your toes in these waters, you can get into endless arguments about mathematics. But when we run a simple multivariate equation explaining state financial conditions with things that seem to matter, and then add a variable for state rankings on the share of population that is “Foreign Born, Non-US Citizen,” that added variable shows up as “statistically insignificant.”
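To make "statistically insignificant" a little more concrete, here is a rough sketch of the idea in Python. It is not the equation we ran. The eight states, the single "thing that seems to matter" (a hypothetical collective bargaining ranking), and all the rankings are invented for illustration. The sketch fits an ordinary least squares regression and reports a t-statistic for each variable. A t-statistic near zero means the data cannot distinguish that variable's effect from no effect at all.

```python
from math import sqrt


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def ols_t_stats(columns, y):
    """Fit y = b0 + b1*x1 + b2*x2 + ... by least squares.

    Returns (coefficients, t_statistics), intercept first.
    """
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    xtx = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(k)]
           for a in range(k)]
    xty = [sum(X[i][a] * y[i] for i in range(n)) for a in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[a][b] * xty[b] for b in range(k)) for a in range(k)]
    resid = [y[i] - sum(X[i][a] * beta[a] for a in range(k)) for i in range(n)]
    s2 = sum(e * e for e in resid) / (n - k)  # residual variance
    t = [beta[a] / sqrt(s2 * inv[a][a]) for a in range(k)]
    return beta, t


# Hypothetical rankings for eight made-up states (NOT actual State Data Lab data).
bargaining_rank = [1, 2, 3, 4, 5, 6, 7, 8]  # assumed "variable that seems to matter"
noncitizen_rank = [5, 2, 8, 3, 6, 1, 7, 4]  # the variable added afterwards
burden_rank = [2, 1, 3, 5, 4, 6, 8, 7]      # outcome being explained

coefs, t_stats = ols_t_stats([bargaining_rank, noncitizen_rank], burden_rank)
for name, b, t in zip(["intercept", "bargaining", "noncitizen"], coefs, t_stats):
    print(f"{name:>10}: coefficient {b:7.3f}, t-statistic {t:6.2f}")
```

With eight observations and three estimated coefficients, a t-statistic needs to exceed roughly 2.57 in absolute value to count as significant at the conventional 5 percent level. In this invented example the bargaining ranking clears that bar easily, while the added non-citizen ranking does not come close.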

 
 