Clustering Of Ability In The US

Part II of a two-part series on ability and income. Part I focused on the income payoff to ability by occupation and on ability’s contribution to income inequality. Part II examines the geographical distribution of ability and the degree to which measured payoffs to ability are distorted by state-level cost of living differences.


  • Part I of this study looked at “ability” as measured by the gap between what someone earns and the average wage they would be expected to earn based on the primary determinants of income: age, education, occupation, gender and race (“AEOSR” wage). The two main findings were that the income payoff to ability varies by occupation and that ability contributes significantly to income inequality.
  • This essay analyzes the distribution of ability by state. My measure of ability shows geographic concentrations. I examine whether this result is being partially caused by cost of living (“COL”) differences that may affect wage levels.
  • The data suggest COL is not correlated with clustering of high ability workers in high wage occupations. This indicates high ability workers in those occupations concentrate where their earning potential is highest. Because income inequality is driven by high wage occupations, the two main findings of Part I are not affected.
  • Cost of living does appear correlated with clustering of "high ability" workers in low wage occupations, particularly those that are more heavily unionized. For those occupations, intra-occupational wage differences are a less convincing indicator of ability.
  • Low ability workers appear to be more concentrated in low cost of living states. Because COL is correlated with wages in low wage occupations, it is very likely both that low ability clustering is occurring and that employers are paying lower wages in low COL states. Thus the Wage/AEOSR ratio underestimates the ability of low wage workers in low COL states (and overestimates it in high COL states).


This analysis used US American Community Survey data from 2015 provided by IPUMS-USA, University of Minnesota.[1] I looked at the subset of individuals with positive wage income (incwage > 0) who were the primary “breadwinners” (the individual who earned the most in a household). I defined “high ability” as those individuals who earned 120% or more of their AEOSR expected income. Low ability was defined as those earning 80% or less of their AEOSR expected income. State location was based on place of work if available, otherwise on place of residence.
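The selection and classification rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the R code behind the figures: the `household_id` field, the "middle" label for workers between the two cutoffs, and the tie-breaking rule (first person encountered wins) are assumptions made for the example. Only `incwage` and the 120%/80% cutoffs come from the text.

```python
HIGH_CUTOFF = 1.20  # from the text: 120% or more of AEOSR expected income
LOW_CUTOFF = 0.80   # from the text: 80% or less of AEOSR expected income


def classify_ability(wage, aeosr_wage):
    """Label a worker by the ratio of actual wage to AEOSR expected wage."""
    if aeosr_wage <= 0:
        raise ValueError("AEOSR expected wage must be positive")
    ratio = wage / aeosr_wage
    if ratio >= HIGH_CUTOFF:
        return "high"
    if ratio <= LOW_CUTOFF:
        return "low"
    return "middle"  # hypothetical label; the text names only high and low


def primary_earners(people):
    """Keep the top wage earner in each household, among incwage > 0.

    `people` is a list of dicts with hypothetical keys
    'household_id' and 'incwage'. Ties go to the first person seen.
    """
    best = {}
    for person in people:
        if person["incwage"] <= 0:
            continue  # the analysis only covers positive wage income
        hh = person["household_id"]
        if hh not in best or person["incwage"] > best[hh]["incwage"]:
            best[hh] = person
    return list(best.values())
```

A household with no positive wage earner simply drops out of the sample under this sketch.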

The Cost of Living Problem

I defined "ability" as the ratio of a worker's actual wage to the expected AEOSR Wage. However, if employers in high cost of living (“COL”) states must pay a wage premium to entice a worker to live there, my measure overestimates ability in those areas because the wage is inflated by a COL bonus. Similarly, the measure would underestimate ability in low COL areas if employers there are able to pay workers less.
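To make the distortion concrete, here is a minimal Python sketch of one possible COL adjustment: deflate the wage by a state COL index before dividing by the AEOSR wage. The index convention (national average = 1.0), the function name and the numbers are all hypothetical; the text does not specify the exact adjustment used.

```python
def col_adjusted_ratio(wage, aeosr_wage, col_index):
    """Wage/AEOSR ratio after deflating the wage by a state COL index.

    `col_index` is assumed to be relative to a national average of 1.0.
    This is one plausible adjustment, not necessarily the one used
    to produce the figures.
    """
    if aeosr_wage <= 0 or col_index <= 0:
        raise ValueError("aeosr_wage and col_index must be positive")
    return (wage / col_index) / aeosr_wage


# Hypothetical worker: earns $72,000 against an expected $60,000
# in a state where living costs are 25% above the national average.
raw_ratio = 72000 / 60000                                  # about 1.20
adjusted_ratio = col_adjusted_ratio(72000, 60000, 1.25)    # about 0.96
```

Under these made-up numbers the worker clears the 120% "high ability" cutoff on the raw ratio but not after the COL adjustment, which is exactly the overestimation concern described above.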

High Ability Distribution And Cost Of Living

Figure 1 shows the statistically significant positive correlation between a state’s cost of living and its percentage of high ability workers (r-squared = 0.38). This correlation could be seen to support the idea that ability is overestimated in high COL states and underestimated in low COL states.

However, it is also possible that high ability workers independently cluster in high cost of living states (i.e., they would do so whether or not they receive a COL wage premium). This second possibility hypothesizes that high ability workers migrate to high COL states because their net earnings potential is higher there due to the presence of skill centers with customer concentration (e.g., Silicon Valley, New York City’s financial center, Hollywood etc.). If this is true, normalizing wages by COL would underestimate the role of ability as a driver of income.

Figure 1: Percent of State Workers That Are High Ability vs. Cost of Living
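For readers who want to see how an r-squared like the one reported for Figure 1 is obtained, the sketch below fits a simple one-variable least-squares line in plain Python. It assumes an ordinary linear regression of the state's high ability share on its COL, which the text implies but does not spell out. The function name and the toy data in the usage note are hypothetical.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, r_squared). For a single predictor,
    r-squared equals the squared Pearson correlation.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series of at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0 or syy == 0:
        raise ValueError("both series must vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

In use, `xs` would hold one COL value per state and `ys` the matching percentage of high ability workers, for example `linear_fit(col_by_state, high_ability_share_by_state)` with two hypothetical lists in the same state order.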

Insight into the relationship between COL and high ability concentration is gained by looking in more detail at specific occupations. Despite the pattern in Figure 1, Figure 2 shows there is no relationship between cost of living and a state’s share of physicians that are high ability. At an average wage of $235,000, physicians have the highest mean wage of any Census occupation code. At that salary level, inter-state cost of living differences do not appear to be important.

Figure 2: Percent of State Physicians That Are High Ability vs. Cost of Living

However, for nurses, with a mean salary of $69,900, there is a strong, statistically significant relationship between a state’s share of apparent high ability nurses and cost of living (r-squared = 0.55), as shown in Figure 3.

Figure 3: Percent of State Registered Nurses That Are High Ability vs. Cost of Living

This strong correlation suggests that my simple ability measure of Wage/AEOSR Wage may overstate nurses’ ability in high COL states, unless one believes the best nurses migrate to high COL states because their best earnings potential (net of COL) happens to exist there. I suspect that is not what is happening, however, because the 29% difference between the 75th percentile and the median nurse’s salary would be largely offset by differences in state COLs. For doctors, there is a 64% spread between the 75th percentile and the median.
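The arithmetic behind that suspicion can be illustrated with a short Python sketch. The 29% and 64% spreads come from the text; the 25% COL gap between two states is a hypothetical figure chosen only to show the mechanics, and the function name is an assumption.

```python
def net_gain_from_move(wage_spread, col_ratio):
    """Real (COL-adjusted) wage change for a worker who earns
    `wage_spread` more after moving to a state whose cost of living
    is `col_ratio` times that of the origin state."""
    if col_ratio <= 0:
        raise ValueError("col_ratio must be positive")
    return (1.0 + wage_spread) / col_ratio - 1.0


NURSE_SPREAD = 0.29       # from the text: 75th percentile vs. median
PHYSICIAN_SPREAD = 0.64   # from the text: 75th percentile vs. median
ASSUMED_COL_RATIO = 1.25  # hypothetical: destination is 25% more expensive

nurse_gain = net_gain_from_move(NURSE_SPREAD, ASSUMED_COL_RATIO)
physician_gain = net_gain_from_move(PHYSICIAN_SPREAD, ASSUMED_COL_RATIO)
# nurse_gain is roughly 0.03 and physician_gain roughly 0.31
```

Under this assumed COL gap, nearly all of the nurse's wage premium is absorbed by living costs, while most of the physician's premium survives.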

In general this is the pattern in the data: occupations with higher average income show less correlation between COL and a state's percentage of high ability workers. Interestingly, the occupations which show the highest correlation between apparent ability share and COL -- police and teachers (followed by nurses) -- are significantly more unionized. For those occupations, the ratio of actual wage to the AEOSR expected wage is likely to be a misleading signal of high ability.

In aggregate, 52% of all of the Census occupational codes show a statistically significant relationship between state COL and its percentage of workers that are high ability. However, those occupations are lower wage jobs: the average salary of occupations showing a relationship with COL was only 24% of the average salary of all occupations in 2015.
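A summary of this kind could be computed along the following lines. This Python sketch assumes each occupation already has a p-value from its own COL regression and a mean wage. The 0.05 significance threshold, the unweighted averaging, the field names and the toy data are all assumptions, since the text does not state them.

```python
def summarize_col_sensitivity(occupations, alpha=0.05):
    """Return (share of occupations with p_value < alpha,
    mean wage of those occupations relative to the mean of all).

    `occupations` is a list of dicts with hypothetical keys
    'p_value' and 'mean_wage'. Averages are unweighted.
    """
    if not occupations:
        raise ValueError("need at least one occupation")
    flagged = [o for o in occupations if o["p_value"] < alpha]
    share = len(flagged) / len(occupations)
    mean_all = sum(o["mean_wage"] for o in occupations) / len(occupations)
    if not flagged:
        return share, None  # no occupation shows a significant relationship
    mean_flagged = sum(o["mean_wage"] for o in flagged) / len(flagged)
    return share, mean_flagged / mean_all


# Toy data, not taken from the survey.
example = [
    {"p_value": 0.01, "mean_wage": 30000},
    {"p_value": 0.03, "mean_wage": 40000},
    {"p_value": 0.40, "mean_wage": 90000},
    {"p_value": 0.70, "mean_wage": 240000},
]
share, relative_wage = summarize_col_sensitivity(example)
```

With the toy data, half of the occupations are flagged and their mean wage is 35% of the overall mean.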

If we look at all primary earners from the top decile of household income (regardless of occupation), we see no strong relationship between COL and a state's percentage of high ability workers, as shown in Figure 4.[2] Seventy-five percent of the primary earners in that decile earned $100,000 or more in 2015.

Figure 4: Percent of State Top Household Income Decile Primary Earners That Are High Ability vs. Cost of Living

Part I showed that high ability in the top household income decile significantly amplifies national income inequality. Figure 4 shows that the findings in Part I are not affected by cost of living. This is also the reason that the headline map at the top of this essay showing the percent of state workers that are high ability adjusted for COL is indistinguishable from a map that is not adjusted for COL (and also why both are correlated with mean income and income inequality by state).[3]  

Low Ability Distribution And Cost Of Living

Figure 5 shows that a state’s percentage of low ability workers is negatively correlated with COL (statistically significant, r-squared = 0.54). That is, low ability workers are concentrated in states where the cost of living is low.

Figure 5: Percent of State’s Workers That Are Low Ability vs. Cost of Living

In aggregate, 66% of all of the Census occupational codes show a statistically significant negative relationship between state COL and its share of low ability workers. Once again, the relationship occurs primarily in lower wage occupations, whose mean salary is only 34% of the average salary for all occupations.

It is difficult to ascertain how much of this relationship is due to 1) low ability workers really clustering in low COL states because it is cheaper to live there, and how much is due to 2) employers paying “low ability” workers less in low COL states (because of labor supply and prevailing wages). In the second case, the ability of “low ability” workers in low COL states is underestimated by the Wage/AEOSR ratio.

It was easier to draw conclusions for high income households because so few high wage occupations show a relationship with COL. That is not the case for low income households, where low wage jobs predominate. It is quite probable both that low ability clustering occurs and that employers take advantage of the greater supply of low ability labor and the prevailing COL to pay lower wages.

It is worth noting that the distribution of low ability workers is not correlated with state welfare benefits, unlike the distribution of the homeless. The distribution of the homeless is positively correlated with state COL, as shown in Figure 6.

Figure 6: Percent of State’s Residents That Are Homeless vs. Cost of Living

The homeless and other non-working poor may concentrate in high COL states because those states pay higher benefits as shown in Figure 7.

Figure 7: Welfare Benefits vs. Cost of Living

There are no obvious solutions for national household income inequality arising from “apparent low ability” clustering. Even if these workers’ abilities are truly higher than that implied by their low relative wages, the wage benefit from moving may be offset by higher costs of living.[4] Indeed, interstate mobility has declined. Clustering of low ability/low wage occupations also makes it more difficult to locally fund improved resources for the poor.

[1] IPUMS-USA, University of Minnesota. The data is provided freely but is subject to their licensing restrictions.
[2] This is also true for the top household income quintile: there is no statistically significant relationship between a state’s percentage of high ability workers and its cost of living.
[3] Sample sizes are smaller for less populous states, so less confidence should be placed on estimated ability shares in those states (e.g., the Dakotas, Wyoming, Delaware etc.).
[4] Those who move may also lose the economic benefit of extended family networks that can provide childcare, financial assistance etc.

Transparent and reproducible: All Figures can be generated using the free, publicly available R program, the data links in this article, and the R code in "AbilityCostOfLivingAdj.r" on github (look for the comments with "Figure X").

