Gay money

The truth about lesbian & gay economics

Office of Tax Analysis

Working Paper 108

August 2016

Joint Filing by Same-Sex Couples after Windsor:
Characteristics of Married Tax Filers in 2013 and 2014

Robin Fisher, Geof Gee, and Adam Looney

[Abstract]

In June 2013, the Supreme Court invalidated a key provision of the 1996 Defense of Marriage Act (Windsor v. United States), allowing same-sex spouses to be treated as married for all federal tax purposes. Treasury and the Internal Revenue Service (IRS) subsequently ruled that same-sex spouses legally married in jurisdictions that recognize their marriages will be treated as married for federal tax purposes. This paper provides estimates of the population of same-sex tax filers in the first two years affected by the decision drawn from the population of returns filed and using methods developed by the Census to address measurement error in gender classification. In 2014, we estimate that about 0.35% of all joint filers were same-sex couples or about 183,280 couples.

Introduction

In June 2013, the Supreme Court invalidated a key provision of the 1996 Defense of Marriage Act (Windsor v. United States) in a case concerning whether a same-sex partner was eligible to claim the estate tax exemption for surviving spouses. The ruling allowed same-sex couples to be treated as married for all federal tax purposes, including income and gift and estate taxes. Treasury and the Internal Revenue Service (IRS) subsequently ruled that same-sex couples legally married in jurisdictions that recognize their marriages will be treated as married for federal tax purposes. The 2013 ruling applied regardless of whether the couple lives in a jurisdiction that recognizes same-sex marriage or a jurisdiction that does not recognize same-sex marriage. As a result, legally-married same-sex couples generally were required to file their 2013 federal income tax return using either married filing jointly or married filing separately filing status. In 2015, the Supreme Court, in Obergefell v. Hodges, established the right to same-sex marriage in all states, including those whose state governments had not permitted same-sex marriage.

This paper provides the first estimates of the U.S. domestic population of married same-sex tax filers from the first two tax years affected by the decision. In 2013, we estimate that about 0.25% of all joint filers were same-sex couples, or about 131,080 couples (out of 52.6 million total joint filers). In 2014, the number of same-sex joint filers increased by 40% to about 183,280 (0.35% of all joint filers). Same-sex joint filers are generally younger, higher income, less likely to claim dependent children (especially for male couples), and disproportionately located in metropolitan areas and coastal states. Tabulations by state and finer geographic areas reveal large differences in the rate of same-sex marriage across the country, with the highest rates in states which had legalized same-sex marriage prior to 2013.

These data also provide new insights into the demographics of same-sex couples that differ in important ways from information available from survey data. Because these estimates are drawn from the universe of returns filed and because most married couples file joint returns, these estimates also provide new and more accurate information on the distribution and frequency of same-sex marriage. Measuring the rate of same-sex marriage and how it changes over time is difficult in survey-based data because of the relatively small share of the population in same-sex marriages and because of serious mismeasurement problems arising from misclassification of gender. Building on methods developed by the Census to address such errors, these data provide greater detail on the geographic distribution of the same-sex married population and, in some cases, reveal substantial differences between Census‑ and tax-derived estimates. For instance, the number of same-sex filers is roughly 55% of the Census estimate of same-sex spouses.

One source of this difference may be the influence of state policies and state tax systems on whether same-sex couples filed joint returns, at least in the years prior to the 2015 Obergefell ruling. In particular, the rate of joint filing among couples – both relative to same-sex joint filers, and relative to Census-estimated same-sex couples – is highest in those states that both recognize same-sex marriage and, correspondingly, have state income tax systems that accommodated filing same-sex joint returns. Filing rates are generally lowest in states which barred same-sex marriage and whose income tax systems required same-sex couples to file separate state returns, which imposed substantial additional compliance burdens.

A central empirical challenge for providing estimates for small populations, such as the population of same-sex marriages, is that small measurement errors may lead to large biases. For example, if same-sex marriages make up roughly 0.2% of all filers filing joint returns, a 1-in-1000 error in the reported gender of either spouse would lead to a measured estimate of the same-sex filing population that is roughly double the actual rate.
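The arithmetic behind this example can be sketched as follows. This is a minimal illustration rather than the paper's estimation code: the function name is hypothetical, and it assumes each spouse's recorded gender is wrong independently and with the same probability.

```python
def measured_same_sex_rate(true_rate, error_rate):
    """Expected measured share of same-sex couples among joint filers.

    Assumption (not from the paper): each spouse's recorded gender is
    flipped independently with probability `error_rate`.
    """
    # A couple's apparent type changes when exactly one spouse is misrecorded.
    flip = 2 * error_rate * (1 - error_rate)
    # True same-sex couples that still look same-sex, plus
    # different-sex couples that now look same-sex.
    return true_rate * (1 - flip) + (1 - true_rate) * flip


true_rate = 0.002   # roughly 0.2% of joint filers, as in the text
error_rate = 0.001  # a 1-in-1000 error per spouse, as in the text

measured = measured_same_sex_rate(true_rate, error_rate)
print(f"measured rate: {measured:.4%}")
print(f"ratio to true rate: {measured / true_rate:.2f}")
```

Because different-sex couples outnumber same-sex couples by several hundred to one, almost all of the misclassified couples flow into the same-sex category, which is why the measured rate comes out at about twice the true rate.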

Data and Methodology

The data are tabulated from individual returns of married-filing-jointly (MFJ) taxpayers to which information on the gender of the primary and secondary taxpayer listed is linked from Social Security Administration (SSA) records. The vast majority of married tax filers (roughly 97.5%) file joint tax returns. The remaining 2.5% of couples file married-filing-separately on different returns, and we do not examine those returns in this paper. The data were extracted in late 2015 for tax years 2013 and 2014. While most returns are filed and available in the year they are due, roughly 1% are filed late. Hence, a small number of returns for those years have not yet been filed and processed. Nevertheless, the data include information on about 52.5 million couples per year.

While administrative records appear to have much lower classification errors than survey estimates, classification errors still appear to result in large biases. Indeed, an initial tabulation of the data showed that approximately 0.8% of Married Filing Jointly (MFJ) returns appeared to be same-sex couples, roughly double the rate estimated by the Census. Moreover, the correlation of tax- and Census-estimated rates across geography and demographic characteristics was weak, which is consistent with attenuation bias from measurement error. In short, the key methodological challenge to accurately estimating rates of same-sex filing is addressing the very small, but economically significant, measurement error in the SSA administrative data.

The estimates in this paper adapt Census-developed methods for reducing misclassification error using indices based on the gender specificity of first names. The Census method relies on an internally-developed name directory for each state identifying the ratio of the number of times each name was associated with a male respondent to the total number of times the name was recorded. If this index is inconsistent with the respondent-reported gender of a member of an apparent same-sex couple (at an index level of 95% or more), the gender is edited to match the gender indicated by the name (e.g. they are re-classified as different sex) (O’Connell and Feliz 2011).

Similarly, we construct an index indicating the likelihood an individual is male (female) based on first name, birth year, state, and whether the individual is listed as the primary or secondary filer among different-sex filers. The index is constructed from the 2013 and 2014 return data of different-sex couples and the Social Security Administration’s database of names, which includes all first names of Social Security Card applicants that occur at least 5 times since the 1880 birth cohort. For individuals whose first name appears in the SSA name database, the index indicates the fraction of individuals with a given first name that are male (or female). (Details of the construction of the index are provided in the appendix.)

… misclassified “F” instead of “M” or secondary filer misclassified as “M” instead of “F”), which can be used to improve the accuracy of the correction.

We use the index to provide an independent estimate of whether a couple is likely to be in a Male-Female (MF), Male-Male (MM), or Female-Female (FF) relationship. Specifically, we assume that an individual’s gender is reported accurately (‘validated’) if their name index is greater than 95% specific to their SSA-reported gender. We classify couples as MF, MM, or FF based on the validated gender. (As described below, couples for whom the index and SSA-reported gender disagree or are non-informative are imputed the rate of marriage based on their characteristics and state of residence.)
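The validation rule can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the authors' implementation: the index here depends on first name only (the paper's index also uses birth year, state, and filer position), and the function names and toy records are hypothetical.

```python
from collections import Counter

THRESHOLD = 0.95  # gender-specificity cutoff used in the text


def build_name_index(records):
    """Map each first name to the fraction of its bearers recorded as male.

    `records` is an iterable of (first_name, gender), gender being "M" or "F".
    """
    male, total = Counter(), Counter()
    for name, gender in records:
        key = name.strip().upper()
        total[key] += 1
        if gender == "M":
            male[key] += 1
    return {name: male[name] / total[name] for name in total}


def validated_gender(name, reported, index, threshold=THRESHOLD):
    """Return the reported gender if the name index confirms it, else None."""
    share_male = index.get(name.strip().upper())
    if share_male is None:
        return None  # name missing from the index
    share = share_male if reported == "M" else 1 - share_male
    return reported if share > threshold else None


def classify_couple(primary, secondary, index):
    """Classify a couple as "MF", "MM", or "FF" using validated genders.

    `primary` and `secondary` are (first_name, reported_gender) pairs.
    Returns None when either spouse's gender is not validated.
    """
    g1 = validated_gender(*primary, index)
    g2 = validated_gender(*secondary, index)
    if g1 is None or g2 is None:
        return None
    return "MF" if g1 != g2 else g1 + g2


# Toy records, invented for illustration only.
records = (
    [("James", "M")] * 200
    + [("Mary", "F")] * 200
    + [("Mary", "M")]
    + [("Jordan", "M")] * 60
    + [("Jordan", "F")] * 40
)
index = build_name_index(records)

print(classify_couple(("James", "M"), ("Mary", "F"), index))
print(classify_couple(("James", "M"), ("James", "M"), index))
print(classify_couple(("Mary", "F"), ("Mary", "M"), index))
print(classify_couple(("Jordan", "M"), ("Mary", "F"), index))
```

In this toy data the last two couples are left unclassified: one spouse's reported gender contradicts a highly gender-specific name, and the other has a name that is not gender-specific enough to validate either way.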

For extremely rare names (fewer than 5 occurrences in the history of SSA records), or individuals whose name is not recorded in the tax data, the name index is missing. The name index may be missing in the tax data because the first name is recorded only by the first initial, there is a typographical error in the name so it cannot be recognized as a proper name, or only the last name is included. In about 9.5% of couples either the primary or the secondary’s name index is missing.

For cases where the index is available, in 85% of couples the information from the name index matches the SSA-reported gender of both individuals. (This means, for example, that 85% of the time when we observe M-F in the SSA-reported gender, the primary taxpayer’s name is more than 95% male according to the index and the secondary taxpayer’s name is more than 95% female. For observed MM and FF couples, however, the correspondence rate is 33% – in two thirds of cases the name and reported gender of at least one individual do not match.) The name index is highly concentrated close to 1 – primary filers whose name index is greater than 95% male are reported to be male in the SSA data 99.65% of the time. Excluding couples missing one or both name indices, and those couples where the name index fails to confirm the SSA-reported gender, leaves 77% of the original population with name-validated gender information.

This method substantially reduces the extent of misclassification error. Intuitively, the likelihood of misclassification of gender in the administrative data is very small, on the order of 1-in-1000. By construction, the likelihood that an individual’s gender does not match their name index is less than 5% (and closer to 0.4%, on average). As a result, the likelihood that an individual is both misclassified in the SSA data and according to the name index is roughly two orders of magnitude smaller (proportionate to the product of the two probabilities).
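As a rough check on the orders of magnitude involved, the product of the two probabilities can be worked out directly. The figures are the approximate values quoted in the text; treating the two errors as independent is an assumption made here for illustration.

```python
import math

p_ssa_error = 0.001      # about 1-in-1000, from the text
p_name_mismatch = 0.004  # about 0.4% on average, from the text

# Assumption: the two errors are independent, so the joint
# probability is the product of the individual probabilities.
p_both = p_ssa_error * p_name_mismatch
orders_smaller = math.log10(p_ssa_error / p_both)

print(f"P(both wrong) = {p_both:.1e}")
print(f"orders of magnitude below the SSA error rate: {orders_smaller:.1f}")
```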

Alternatively, this method can be viewed as examining the rates of same-sex marriage within the population of individuals with highly gender specific names, like James, John, or Robert (all more than 99.5% male) and Mary, Elizabeth, and Patricia (all more than 99.5% female). In effect, we estimate rates of same-sex marriage by comparing the number of James-Robert (and male-male, according to SSA) and Mary-Elizabeth (female-female) marriages to the number of John-Mary and Elizabeth-James marriages. Of the 90,025 individual first names included in the SSA database, 89,199 names are more than 95% male or female, which means that the index includes not just Roberts and Elizabeths, but names ranging from Aaditya, Brazos, and Candarius to Xana, Yasmeen, and Zayne. Hence, this method includes individuals from a very wide range of geographic, ethnic, national, and religious naming conventions.

To arrive at national estimates, and estimates by state, AGI class, age, and presence of children, the subsample of name-validated couples was raked to match population totals.

Specifically, the data are weighted by the ratio of the population total to the name-validated population within cells formed by tax year, state of residence, an indicator for presence of children, age of primary taxpayer, and AGI income class. In effect, this method estimates the rate of same-sex marriage within these detailed demographic groups with the name-validated sample and weights the rates by the share of the population of each group to arrive at national totals.
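The cell-ratio weighting described above can be sketched as follows. This is a minimal illustration with invented counts, not the authors' code; the function names and the cell labels are hypothetical, and each cell stands for one combination of tax year, state, presence of children, age group, and AGI class.

```python
from collections import defaultdict


def cell_weights(population_counts, validated_counts):
    """Weight per cell: population total divided by name-validated count."""
    return {
        cell: population_counts[cell] / n
        for cell, n in validated_counts.items()
        if n > 0
    }


def weighted_totals(validated_rows, weights):
    """Sum the weights of name-validated couples by couple type.

    `validated_rows` is an iterable of (cell, couple_type) pairs.
    """
    totals = defaultdict(float)
    for cell, couple_type in validated_rows:
        totals[couple_type] += weights[cell]
    return dict(totals)


# Invented example: two cells with different validation rates.
population_counts = {"cell_a": 1000, "cell_b": 600}
validated_rows = (
    [("cell_a", "MF")] * 796
    + [("cell_a", "MM")] * 4
    + [("cell_b", "MF")] * 399
    + [("cell_b", "FF")] * 1
)
validated_counts = {"cell_a": 800, "cell_b": 400}

weights = cell_weights(population_counts, validated_counts)
totals = weighted_totals(validated_rows, weights)
print(weights)
print(totals)
```

By construction the weighted totals add up to the full population in each cell, so the same-sex share estimated within the validated sample is carried over to the couples whose gender could not be validated.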

Under the assumption that the sample of name-validated filers is representative of the population within each demographic group, we believe the estimates of the relative frequency of same-sex marriage in this population provide an accurate estimate of the rate of filing in the population.

Put another way, we have chosen a subset of the population where the classification error is relatively small, by testing for consistency between the name and assigned sex. If we assume that most of the covariance between the demographic variables and the classification error is explained by names, and the relationship between names and gender is consistent among the demographic groups, then this method should work well. These assumptions are probably not strictly kept, but we can still expect to reduce the classification error by reducing the component that varies with name. The approximation is less exact when the gender-specificity has a large variability by demographic variable.

Overall, the method is an attempt to find a good compromise between classification error and model error. We can show that, as the classification error gets small, the effect of the model on the expectation also gets smaller, while the effect of dropping part of the sample and of the simplifying assumptions become important. It’s an open question what the best value of the threshold is, though we keep the tradition established by previous authors by setting it at 0.95. In the appendix to this paper we present a table produced with several alternative values of the threshold. The estimates appear not to be very sensitive to the exact value, which is reassuring.

Nevertheless, one concern with this approach is that naming conventions may vary across groups because of factors like changes in naming conventions across birth cohorts, regional differences, or differences among immigrant or ethnic groups. For instance, a name which is highly gender-specific in some parts of the country or age groups may not be in others, even though the name still meets the 0.95 threshold overall. The resulting reduction in classification error might therefore not be as great in those areas where the name index is less specific, leading us, for instance, to identify more individuals in those areas as misclassified even though they were not.

Similarly, classification error may vary by region, birth cohort, demographic, or filing characteristics. If that error is correlated with likelihood of being in a same-sex couple, that could result in bias (either up or down) toward the rate of same-sex marriage in the population less likely to be misclassified. In effect, our method diminishes the contribution of misclassified groups, which matters for the average reported to the extent the same-sex marriage rate of the group differs from the overall population.

The adjusted data were then tabulated by state, 3-digit zip code, AGI class, age categories, and the presence of children. Totals were rounded to the nearest 5 filers and the number (and rate) of same-sex filers was bottom coded at “less than 10” by assigning them a value of 5 (and rate of 5/(number of observations)) in small-population geographic areas. Because the data represent population tabulations where any error arises from misclassification and our model-based correction, no standard errors are computed.
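The rounding and bottom-coding rule can be sketched as below. The function name is hypothetical, and two details are assumptions rather than statements from the paper: that bottom coding applies whenever a cell has fewer than 10 same-sex filers, and that the published rate is computed from the rounded count.

```python
import math


def disclose(count, n_couples):
    """Return (published_count, published_rate) for one tabulation cell.

    Assumptions (not confirmed by the text): bottom coding applies to any
    cell with fewer than 10 same-sex filers, and rates use rounded counts.
    """
    if count < 10:
        # Bottom-coded "less than 10": assign 5 and a rate of 5 / observations.
        return 5, 5 / n_couples
    # Round to the nearest 5 filers (halves round up).
    rounded = 5 * math.floor(count / 5 + 0.5)
    return rounded, rounded / n_couples


print(disclose(183278, 52_600_000))
print(disclose(12, 1000))
print(disclose(7, 1000))
```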

Estimates of the population and characteristics of same-sex joint filers

In 2013, we estimate that about 0.25% of all joint filers were same-sex filers, or about 131,080 couples (out of 52.6 million joint filers). Table 1A provides estimates of the number and share of joint filers that are same-sex male, same-sex female, and different sex couples by state in 2013. According to these estimates, the proportion of same-sex couples varied substantially across the country, from about 3.0% of couples in Washington DC, 0.8% in Massachusetts and Vermont, and close to 0.5% in Delaware, California, Washington, Maine, New Hampshire, New York, and Connecticut to less than 0.08% in North Dakota, Montana, Mississippi, Wisconsin, Kentucky, Idaho, and Arkansas.

Table 1B provides the corresponding estimates for 2014. In almost every state, rates of same-sex filing appeared to increase. Rates were little changed in Alabama, Mississippi, Missouri, and South Dakota. Rates of same-sex marriage more than doubled in Indiana, Illinois, Montana, Wisconsin, Idaho, Pennsylvania, Oklahoma, North Carolina, and Colorado. Between 2013 and 2014, for the country as a whole, the number of same-sex joint filers increased by about 52,200, an increase of about 40%.

To examine one source of differences in the rate of same-sex joint filing across states, Figure 1 relates the proportion of same-sex filers by state to the year in which same-sex marriage was recognized or legalized. In general, rates of same-sex filing are highest in states that had legalized same-sex marriage prior to 2013 or in 2013. While rates were relatively lower in 2013 and 2014 in states that had not legalized same-sex marriage until 2014, the percentage increase in filing rates between 2013 and 2014 was relatively high in those states.

Tables 2A and 2B provide estimates of the share of joint filers by type of couple, income class, age of primary taxpayer, and presence of dependents. According to Table 2B, in 2014 same-sex couples were slightly younger (based on the age of the primary taxpayer) relative to different-sex couples, and substantially less likely to be over age 65. While 49% of different-sex couples claimed children as dependents, only about 7% of male-male couples claimed children, and about 28% of female-female couples. Same-sex couples generally appeared to be higher income than different-sex couples. For instance, male-male couples were almost twice as likely as different-sex filers to earn more than $150,000, and female-female filers were somewhat more likely. The average adjusted gross income (AGI) of male-male filers was about $176,000, versus $124,000 for female-female couples and $113,000 for different-sex couples.

These differences in income partly reflect the fact that same-sex couples are more likely to be of working age and, as described more below, to live in major metropolitan areas and coastal states where incomes (and costs of living) are high. Table 3 provides more detailed analysis of the economic characteristics of different-sex and same-sex filers in 2014 and examines the relationship between these and other factors and income. For each group of different-sex couples, FF couples, and MM couples, the table provides information on the average income and distribution of income for each group and by subsample. For instance, the table shows that the average AGI of different-sex couples is about $113,115 and about 18% had income over $150,000. Different-sex couples with dependents had slightly higher income ($122,150) and were only slightly more likely (20%) to earn more than $150,000.

This pattern in which families with dependent children are higher income is also true of FF and MM couples, but is particularly striking for MM couples where the average income of couples with children is almost $275,000; more than half of MM couples with children earn more than $150,000.

Geographic differences in where same-sex couples live are an important contributor to differences in incomes across groups. Table 3 presents two measures intended to illustrate how geographic differences in where same-sex couples live affect their relative economic status. The first measure takes the population of working age (25-55) different-sex couples and weights the sample according to the geographic residence (measured by 3-digit zip code) of MM and FF couples. In effect, this adjustment is intended to reflect what the distribution of income would be among MF couples whose geographic residence is the same as that of MM or FF couples. This analysis, presented as “reweighted to MM (and FF) geographic distribution,” shows that the average income of MF couples weighted to correspond to FF places of residence is about $132,360. In contrast, the average income of FF couples in the same age range is about $121,220. In other words, while FF couples appear to be higher income than different-sex couples nationwide, relative to MF couples in their local neighborhoods their income is somewhat lower. A similar analysis, which provides the mean income of different-sex couples living in each FF couple’s 3-digit zip code, also suggests that the income of local MF couples is more than $9,000 greater.

Reweighting MF couples to approximate the geographic distribution of MM couples shows that the average income of these MF couples is higher than in the nation as a whole ($155,425), but MM couples still have much higher incomes. The average income of MM couples in the same age range is about $180,525. Likewise, the average income of MF couples living in the same 3-digit zip code as MM couples is about $154,265, showing that MM couples are relatively higher income even relative to other couples in their own neighborhoods.
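The geographic reweighting can be illustrated with a small sketch. The numbers and zip codes below are invented, the function name is hypothetical, and the calculation works from zip-level mean incomes rather than the individual returns used in the paper.

```python
def reweighted_mean_income(mf_mean_by_zip, target_counts_by_zip):
    """Mean MF income with each 3-digit zip weighted by the target group's
    couple count there (zips with no MF couples are skipped)."""
    shared = [z for z in target_counts_by_zip if z in mf_mean_by_zip]
    total = sum(target_counts_by_zip[z] for z in shared)
    weighted = sum(mf_mean_by_zip[z] * target_counts_by_zip[z] for z in shared)
    return weighted / total


# Invented toy data: mean MF income and couple counts by 3-digit zip.
mf_mean_by_zip = {"941": 180_000, "100": 160_000, "799": 80_000}
mf_counts = {"941": 100, "100": 200, "799": 700}
ff_counts = {"941": 50, "100": 30, "799": 20}

national_mf = reweighted_mean_income(mf_mean_by_zip, mf_counts)
mf_like_ff = reweighted_mean_income(mf_mean_by_zip, ff_counts)
print(f"MF mean, own geography: {national_mf:,.0f}")
print(f"MF mean, reweighted to FF geography: {mf_like_ff:,.0f}")
```

In the toy data, FF couples are concentrated in the two high-income zip codes, so the reweighted MF mean is well above the national MF mean. Comparing a group's own mean income with the reweighted figure, rather than with the national one, is what separates the effect of location from other income differences.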

Table 4 compares the number of same-sex joint filers to the number of same-sex marriages estimated for the same year (2013 or 2014) by the U.S. Census Bureau using the American Community Survey (ACS). The first two columns for each year provide the Census estimates of the number of same-sex householders and the number of same-sex spouses. The third column provides the relevant estimates from Table 1 of the number of same-sex filers by state. The fourth column is the ratio of same-sex filers to same-sex spouses. The final column shows the percent change in the number of same-sex filers between 2013 and 2014.

Overall, the estimated number of same-sex filers is just over half the estimated number of same-sex spouses in the ACS in both 2013 and 2014. One potential source for this difference is measurement error and/or estimation error arising from the application of the name index in our sample, or sampling error or measurement error in the Census based estimates. Because the population of tax filers changes little from year to year, and because the methodology applied to the tax data is unchanged between years, there is effectively no sampling error in the tax estimates. In contrast, the ACS-based estimates are derived from samples and sampling error may be especially pronounced in the state-by-state estimates. (For instance, the variance of changes in same-sex marriage rates from year to year is greater in the ACS data and eight states are reported to have declining rates of same-sex marriage, which seems improbable in the first years when it became legally recognized at the federal level.) Moreover, misclassification, non-response, or missing information appears to occur much less frequently in the administrative data, suggesting that errors from imputation of marital status or gender in the ACS may be larger.

Another source of difference is that not all households file tax returns. For instance, the 2014 ACS estimate of the number of married-couple households is 56.1 million compared to the 52.6 million married-filing-jointly couples in the 2014 domestic filing population. This difference – about 6.7% – could explain part of the gap. However, non-filers tend to be older and lower income, which are both associated with lower rates of same-sex marriage in Table 2. Hence, non-filers are unlikely to account for a large share of the difference.

Nevertheless, measurement-related errors seem unlikely to account for all of the differences in estimates. For instance, in several states (Alaska, Delaware, DC, Hawaii, Maine, New Mexico, Washington, and Oregon) the estimated populations are consistently relatively close, and sampling error in large states like California, Texas, or New York should be relatively smaller.

An alternative explanation for relatively low rates of same-sex filing is that legal, administrative, or other economic barriers made it difficult for same-sex couples to file in the first years after Windsor. The Windsor decision occurred mid-year in 2013 and the official Treasury and IRS guidance was released somewhat later. Hence, considerable uncertainty existed regarding the legal status, filing requirements, and other tax-related issues until late in the year.

In more than a dozen states, same-sex couples were prohibited from filing joint state returns even if they filed joint returns federally, imposing considerable uncertainty and compliance costs on would-be joint filers. For instance, in 2014 in 10 states taxpayers were faced with state tax systems that required them to file a joint state return if they filed a joint federal return, while simultaneously prohibiting same-sex couples from filing joint state returns. Some couples may have filed married-filing-separately returns or continued filing separate single or head-of-household returns pending the resolution of these differences. While these states (and several others that did not recognize same-sex marriage) provided guidance to same-sex taxpayers on how to file, the procedures often involved substantial compliance burdens, such as providing duplicative pro forma single federal returns to accompany their state returns.

Figure 2 provides some evidence that the rate of joint filing among same-sex spouses was relatively low in states that delayed legalizing same-sex marriage. The figure presents the ratio of joint filers to Census-estimated counts of same-sex spouses by state in 2014 (from Table 4) according to the year in which same-sex marriage was recognized in each state. It is clear that the propensity to file a joint return is lower in states where same-sex marriage is not legally recognized or was recognized only in 2014.

Indeed, in 2014 the states with the lowest apparent joint filing rate among ACS-estimated same-sex spouses were almost uniformly those that prohibited same-sex couples from filing joint state returns. For example, 10 of the 11 states with the lowest rate of joint filing among ACS-estimated same-sex spouses were those that prohibited same-sex couples from filing joint state returns (according to the Tax Foundation 2014): Mississippi, Louisiana, Alabama, Arkansas, Ohio, Michigan, Tennessee, Kansas, North Dakota, and Kentucky; South Carolina is the exception. Missouri, Georgia, and Nebraska, which also prohibited joint filing on state returns, also fell into the bottom 20 states.

Table 5 provides additional information on geographic differences in the rate of same-sex marriage and presents the range in rates among the 100 largest commuting zones in the U.S. (Commuting Zones (CZs) provide a local labor market geography that covers the entire land area of the United States (Autor and Dorn 2013).) Even within the most populous labor markets in the country, the rate of same-sex marriage differs widely. In the San Francisco area, the rate is 1.4% of married couples, more than 22 times the rate in Brownsville, TX (0.06%).

Figure 3 provides an expanded illustration of the geographic distribution of same-sex couples by 3-digit zip code. Same-sex filers are highly concentrated in certain regions: the North East, Mid-Atlantic states, the West Coast, and New Mexico. In between, same-sex filers are concentrated in very small geographic areas, particularly urban areas of otherwise rural states, or cities and towns hosting colleges and universities.

To examine some of these differences, Tables 6 and 7 list the 20 3-digit zip code areas with the highest rates of male and female same-sex marriage among the 500 most populous 3-digit zip code areas (those with more than about 31,000 married couples). For example, Table 6 shows that more than 3% of married couples in downtown San Francisco are male same-sex couples. The highest rates of male same-sex marriage exist in the central areas of San Francisco, Washington DC, New York, and in other major cities like Seattle, Boston, Atlanta, Chicago, Portland, and Minneapolis. While many of the same major cities also appear in Table 7, which provides a similar analysis for female same-sex couples, so do relatively small cities and towns like Springfield, MA; Madison, WI; Santa Fe, NM; Durham, NC; Burlington, VT; and those on the coast of Delaware.

Conclusion

This paper provides new, detailed statistics on the characteristics of same-sex married couples filing joint tax returns in 2013 and 2014 drawn from administrative data sources. The use of administrative data has strong advantages over survey-based measures for studying small populations like married same-sex couples, providing more precise information regarding their economic and demographic characteristics and geographic distribution.

The data show striking differences between same-sex and different-sex couples in terms of income, presence of children, and place of residence. While we explore some sources of differences and speculate as to others, many interesting and important questions related to employment, income, family structure, living arrangements of children, the relationship between family responsibilities and economic outcomes, or the role of state and federal policies fall beyond the scope of this analysis.