September 23, 2013

Population Decline of Unauthorized Immigrants Stalls, May Have Reversed

Appendix B: Methodology

Overview

The estimates of the unauthorized immigrant population presented in this report are based on a residual estimation methodology. The method compares a demographic estimate of the number of immigrants residing legally in the country with the total number of immigrants as measured by a survey, either the American Community Survey or the March Supplement to the Current Population Survey. The difference is assumed to be the number of unauthorized immigrants in the survey, a number that is later adjusted for omissions from the survey (see below). The basic estimate is:

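Unauthorized immigrants (U) = Survey total of the foreign-born population (F) - Estimated legal foreign-born population (L)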

The legal resident immigrant population is estimated by applying demographic methods to counts of legal admissions covering the period from 1980 to 2012 obtained from the Department of Homeland Security’s Office of Immigration Statistics and its predecessor at the Immigration and Naturalization Service. The initial estimates here are calculated separately for age-gender groups in six states (California, Florida, Illinois, New Jersey, New York and Texas) and the balance of the country; within these areas the estimates are further subdivided into immigrant populations from 35 countries or groups of countries by period of arrival in the United States. Variants of the residual method have been widely used and are generally accepted as the best current estimates (Hoefer, Rytina and Baker, 2012; Warren and Warren, 2013). See also Passel and Cohn (2008) and Passel (2007) for more details.
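The cell-by-cell nature of the residual calculation can be illustrated with a minimal Python sketch. It is not the Pew Research Center's code. The cell keys, function names and all figures are invented for illustration, and the real estimates also subdivide by age-gender group and period of arrival.

```python
from collections import defaultdict

# Hypothetical inputs: every number below is invented for illustration.
# Keys are (area, origin group); values are population counts.
survey_foreign_born = {            # F: foreign born measured in the survey
    ("California", "Mexico"): 4_300_000,
    ("California", "Other"): 5_800_000,
    ("Texas", "Mexico"): 2_500_000,
    ("Texas", "Other"): 1_600_000,
}
legal_estimate = {                 # L: demographic estimate of legal immigrants
    ("California", "Mexico"): 2_600_000,
    ("California", "Other"): 5_000_000,
    ("Texas", "Mexico"): 1_400_000,
    ("Texas", "Other"): 1_300_000,
}

def residual_by_cell(f, l):
    """Return the residual F - L for every cell present in both inputs."""
    return {cell: f[cell] - l[cell] for cell in f if cell in l}

def total_by_area(residuals):
    """Sum cell residuals to one total per area."""
    totals = defaultdict(int)
    for (area, _origin), value in residuals.items():
        totals[area] += value
    return dict(totals)

residuals = residual_by_cell(survey_foreign_born, legal_estimate)
area_totals = total_by_area(residuals)
print(area_totals)
```

The residuals here are unauthorized immigrants as counted in the survey, before any adjustment for survey omissions.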

The estimates presented in this report are the residual totals, adjusted for survey omissions for these six states and the balance of the country, subdivided for Mexican immigrants and all others. Subsequent work by the Pew Research Center will assign individual foreign-born respondents in the survey a specific status (one option being unauthorized immigrant) based on the individual’s demographic, social, economic, geographic and family characteristics. Later reports will focus on more detailed information on the countries and regions of origin of the immigrants, estimates for all states and major metropolitan areas, and various demographic, social and economic characteristics of the unauthorized and legal immigrant populations.

Data Sources

The American Community Survey is an ongoing survey conducted by the U.S. Census Bureau. The survey collects detailed information on a broad range of topics, including country of birth, year of immigration and citizenship, which is the information required for the residual estimates. The ACS has a continuous collection design with monthly samples of about 250,000; the nominal annual sample size is about 3.1 million households with about 2.1 million included in the final sample through 2011 (http://www.census.gov/acs/www/methodology/sample_size_and_data_quality/).

For this report, public-use samples of individual survey records from the ACS are tabulated to provide the data used in the estimation process. The public-use file is a representative 1% sample of the entire U.S. (including about 3 million individual records for each year 2005-2011) obtained from the Integrated Public-Use Microdata Series or IPUMS (Ruggles et al., 2010). The ACS began full-scale operation in 2005 covering only the household population; since 2006 it has covered the entire U.S. population. ACS data are released by the Census Bureau in September for the previous year. At the time this research was prepared, the most recent ACS data were for 2011.

The other survey data source used for residual estimates comes from March Supplements to the Current Population Survey. The CPS is a monthly survey, currently of about 55,000 households, conducted jointly by the U.S. Bureau of Labor Statistics and the Census Bureau. Since 2001, the March Supplement sample has been expanded to about 80,000 households; before then, the expanded March Supplement sample included about 50,000 households. The CPS universe covers the civilian noninstitutional population. The CPS was redesigned in 1994 and, for the first time, included the information required for the residual estimates (i.e., country of birth, date of immigration and citizenship). Some limitations of the initial March Supplement of the redesigned CPS, in 1994, preclude its use in making these estimates, so the first CPS-based estimates are for March 1995. CPS data are released by the Census Bureau in September for the previous March. At the time this research was prepared, the most recent March CPS data were for 2012.

Survey Weights

Population figures from both the ACS and CPS are based on the Census Bureau’s official population estimates for the nation, states and smaller areas through a weighting process that ensures the survey figures agree with pre-specified national population totals by age, sex, race and Hispanic origin. At the sub-national level, the two surveys differ in their target populations. The March CPS data agree with state-level totals by age, sex and race and are based on a process that imposes other conditions on weights for couples (U.S. Census Bureau, 2006). The ACS weights use estimates for much smaller geographic areas that are summed to state totals (http://www.census.gov/acs/www/methodology/methodology_main/ – especially Chapter 11).

The population estimates for the surveys are based on the latest available figures at the time the survey weights are estimated. This process produces the best estimates available at the time of the survey, but it does not guarantee that a time series produced across multiple surveys is consistent or accurate. Significant discontinuities can be introduced when the Census Bureau changes its population estimation methods, as it did several times early in the 2000s and in 2007 and 2008 (Passel and Cohn, 2010), or when the entire estimates series is recalibrated to take into account the results of a new census.

Previous ACS or CPS weights are not revised to take into account updated population estimates (see the note at the end of this appendix). One clear example of the impact of such a discontinuity occurred between the 2009 ACS, which was weighted to population estimates based on the 2000 Census, and the 2010 ACS, which was weighted to results of the 2010 Census. Of the apparent change in the foreign-born population between these two surveys (1.5 million, Table B1), about 60% could be attributed to the weighting change (Table B1 and Passel and Cohn, 2012b).

The estimates shown for unauthorized immigrants and the underlying survey data are derived from ACS IPUMS 1% samples for 2005-2011 and March CPS public-use files for 1995-2012, which have been reweighted to take into account population estimates consistent with the 1990 Census, the 2000 Census, the 2010 Census and the 2011 population estimates. The population estimates used to reweight the March 2011 CPS come from the Census Bureau’s Vintage 2011 population estimates (http://www.census.gov/popest/data/index.html); they are consistent with the 2010 Census and the estimates used to weight the March 2012 CPS. The population estimates used to reweight the CPS for March 2001 through March 2010 are the Census Bureau’s intercensal population estimates for the 2000s (http://www.census.gov/popest/data/intercensal/index.html); these population estimates use demographic components of population change for 2000-2010 and are consistent with both the 2000 and 2010 censuses. Similarly, the population estimates used to reweight the CPS for March 1995 through March 2000 are the intercensal population estimates for the 1990s (http://www.census.gov/popest/data/intercensal/index.html), which are consistent with the 1990 and 2000 censuses. The ACS data for 2010 and 2011 do not require reweighting as they are weighted to the Vintage 2011 population estimates, which are based on the 2010 Census. For the 2005-2009 ACS, the reweighting uses the same intercensal population estimates as used for the CPS.

The reweighting methodology for both the ACS and CPS follows, to the extent possible, the methods used by the Census Bureau in producing the sample weights that equal the population totals. For both surveys, the process followed by the Pew Research Center starts from the existing weights and adjusts them to equal the revised population estimates. It is not possible to completely replicate the Census Bureau’s weighting process because not all the information the Census Bureau used is publicly available. The CPS reweighting adheres more closely to the final phase of the Census Bureau’s weighting process because all of the variables used can be found in the public-use data sets. A more detailed discussion of the methods can be found in the Methodological Appendix to Passel and Cohn (2010) and in the Census Bureau’s documentation of CPS weighting procedures (http://www.census.gov/prod/2006pubs/tp-66.pdf).
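The idea of starting from existing weights and adjusting them to equal revised population totals can be sketched with a simple ratio adjustment. This is a deliberate simplification and not the Census Bureau's or the Pew Research Center's procedure, which controls to several sets of totals at once. The record layout, cell labels and numbers below are hypothetical.

```python
def reweight(records, revised_totals):
    """Scale each record's weight so weighted cell totals match revised totals.

    records: list of dicts with 'cell' and 'weight' keys (hypothetical layout).
    revised_totals: dict mapping cell label -> revised population estimate.
    Returns new dicts with an added 'new_weight' key; inputs are not modified.
    """
    # Sum the existing weights within each cell.
    current = {}
    for rec in records:
        current[rec["cell"]] = current.get(rec["cell"], 0.0) + rec["weight"]

    # One ratio factor per cell: revised total / current weighted total.
    factors = {cell: revised_totals[cell] / total for cell, total in current.items()}

    return [dict(rec, new_weight=rec["weight"] * factors[rec["cell"]]) for rec in records]

# Invented example: two cells defined by state, sex and age group.
records = [
    {"cell": "CA|male|18-29", "weight": 100.0},
    {"cell": "CA|male|18-29", "weight": 150.0},
    {"cell": "CA|female|18-29", "weight": 200.0},
]
revised_totals = {"CA|male|18-29": 275.0, "CA|female|18-29": 190.0}

reweighted = reweight(records, revised_totals)
for rec in reweighted:
    print(rec["cell"], round(rec["new_weight"], 2))
```

Because every record in a cell is scaled by the same factor, the relative weights within a cell are preserved while the cell total moves to the revised estimate.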

For the ACS reweighting, more approximations are required because the geographic detail available in the IPUMS data set does not replicate the small weighting areas used by the Census Bureau. Moreover, not all the detailed population estimates used are available and not all the weighting procedures are spelled out in detail. The ACS reweighting uses states as the basic weighting areas as well as estimates of the household population and the institutional and noninstitutional group quarters populations. A more detailed discussion of the ACS reweighting can be found in the Methodology Appendix of Passel and Cohn (2011) and in Chapter 11 of the Census Bureau’s documentation (http://www.census.gov/acs/www/methodology/methodology_main/).

In a few instances, additional changes or modifications beyond simple reweighting were required to arrive at a consistent data series. The 2005 ACS did not cover the entire U.S. population; it included only the household population and omitted the group quarters population. Because the group quarters population tends to change little from year to year, either in numbers or characteristics, we augmented the 2005 ACS with individual microdata records from the group quarters population in the 2006 IPUMS data set. The records were initially reweighted to produce 2005 data that agreed with the Census Bureau’s original estimates for the 2005 group quarters population. The augmented 2005 ACS, including the household and group quarters populations, was then reweighted to agree with the revised 2005 population estimates.

In previous Pew Research Center estimates and analyses based on the 2000 March CPS, the weights were not the original CPS weights. Rather, they were a set of research weights produced by the Census Bureau to bring the March 2000 CPS into line with the 2000 Census (Passel, 2001). For 2001, the CPS March Supplement used by the Census Bureau in its published data series had a sample size and survey design consistent with the CPS for 1994-2000. For March 2002, the CPS sample size was greatly expanded, a new sample design was implemented and new weighting procedures were introduced, among other significant changes (http://www.census.gov/prod/2006pubs/tp-66.pdf). To test the new procedures and to provide an overlapping data point, the Census Bureau released an alternative March 2001 CPS data set (called the “SCHIP” file) that used the new sample size, survey design and weighting schemes. The Pew Research Center has used this SCHIP file in all previous analyses and as a basis for the revised weights.

Finally, the original weights released by the Census Bureau for the March 1995 CPS contained a significant error that had a large impact on the numbers of Asians and immigrants overall. Passel and Clark (1998) produced a set of alternative weights that corrected this initial error. These weights were used as input to the revised weights produced by the Pew Research Center.

Results. Although the changes caused by reweighting are relatively small as a share of the population (see Table B2), their impact can be relatively greater on subgroups such as the foreign-born population and, hence, residual estimates of unauthorized immigrants.

Table B1 compares the total and foreign-born populations based on the original weights with the same figures based on the new weights. For recent years, the revised weights increased the foreign-born population by 2% or more (roughly 800,000-1.2 million) for the 2008-2009 ACS and 2009-2011 CPS while there were smaller changes in the total population; for some years, the revisions led to reductions in the total population. For the 2007 ACS and 2008 CPS, the increases in the foreign-born population were just shy of 1%, amounting to increased numbers of about 350,000. Changes to the foreign-born estimates in the remaining two ACS years, 2005-2006, were much smaller, less than 50,000.

For the 2001-2007 CPS estimates (which are based on the expanded and redesigned post-2000 Census samples), the changes introduced by reweighting are erratic. Some years (2001, 2003, 2004) show reductions in the foreign-born estimates exceeding 100,000; 2006 has an increase of more than 100,000. These different patterns of change reflect a number of factors: changes in the methodology used to measure migration in the population estimates; the substitution of final for preliminary data in the population estimates; and the smoothing introduced into the intercensal population estimates to close differences between the initial estimates and the 2010 Census results.

For the 1996-2000 CPS, the changes introduced by the revised weights are much larger, exceeding 1 million for 1996-1999 and more than 600,000 in 2000. These differences reflect the failure of the pre-2000 population estimates to fully capture the immigration that was occurring during the 1990s, particularly the second half of the decade (Passel, 2001). The new weights for these years accurately capture the trends in the changing overall population and correctly attribute most of the shortfall of the estimates to the immigrant population.

Adjustment for Undercount

Adjustments for omissions from the surveys (also referred to as adjustments for undercount) are introduced into the estimation process at several points. The initial comparisons with the survey (based on the equation shown above) take the difference between the immigrants in the survey and the estimated legal population. Since the comparison is people appearing in the survey, the estimated legal population must be discounted slightly because some legal immigrants are missed by the survey. This initial estimate represents unauthorized immigrants included in the survey. To estimate the total number of unauthorized immigrants in the country, it must be adjusted for those left out. Similarly, the estimated number of legal immigrants appearing in the survey must also be adjusted for undercount to arrive at the total foreign-born population.
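The sequence of adjustments described above can be sketched as follows. This is an illustrative reading of the text, not the report's actual computation. It assumes each omission rate is the share of the true population missed by the survey, uses invented toy numbers, and treats one group at a time, whereas the real adjustments are applied separately by age, sex, country of birth and year of arrival.

```python
def adjust_for_undercount(f_survey, l_total, legal_miss_rate, unauth_miss_rate):
    """Apply the coverage adjustments for one group.

    f_survey: foreign born counted in the survey (F).
    l_total: estimated total legal immigrant population (L).
    legal_miss_rate, unauth_miss_rate: assumed shares of each true
        population omitted from the survey (hypothetical parameters).
    """
    # Discount the legal estimate to those expected to appear in the survey.
    legal_in_survey = l_total * (1.0 - legal_miss_rate)
    # Residual: unauthorized immigrants included in the survey.
    unauth_in_survey = f_survey - legal_in_survey
    # Inflate the residual to account for those left out of the survey.
    unauth_total = unauth_in_survey / (1.0 - unauth_miss_rate)
    return {
        "legal_in_survey": legal_in_survey,
        "unauth_in_survey": unauth_in_survey,
        "unauth_total": unauth_total,
        "foreign_born_total": l_total + unauth_total,
    }

# Toy numbers only, chosen to make the arithmetic easy to follow.
result = adjust_for_undercount(
    f_survey=1000.0, l_total=700.0, legal_miss_rate=0.02, unauth_miss_rate=0.10
)
for name, value in result.items():
    print(f"{name}: {value:.1f}")
```

Under this definition, an omission rate of 10% raises the in-survey figure by about 11%, since dividing by 0.90 is the same as multiplying by roughly 1.11.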

These various coverage adjustments are done separately for groups based on age, sex, country of birth and year of arrival. The patterns and levels of adjustments are based on Census Bureau studies of overall census coverage (see http://www.census.gov/coverage_measurement/ for links to evaluation studies of the 1980, 1990, 2000 and 2010 Censuses; also Passel, 2001) that are adjusted up or down to reflect the results of a number of specialized studies that focus on immigrants. Census Bureau undercount estimates have generally been subdivided by race/Hispanic origin, age and sex, so the adjustments to the Pew Research Center data use rates for countries of birth based on the predominant race or Hispanic origin of immigrants from the country: Hispanic, or non-Hispanic white, black or Asian. Undercount rates for children do not differ by gender, but for younger adults (ages 18-29 and 30-49) the undercount rates for males tend to be higher, and for some groups much higher, than those for females. At older ages, the undercount rates are lower than for younger adults with no strong patterns of gender differences (and with some estimated overcounts).

The basic information on specific coverage patterns of immigrants is drawn principally from comparisons with Mexican data, U.S. mortality data and specialized surveys conducted at the time of the 2000 Census (Van Hook et al., forthcoming; Bean et al., 1998; Capps et al., 2002; Marcelli and Ong, 2002). In these studies, unauthorized immigrants generally have significantly higher undercount rates than legal immigrants who, in turn, tend to have higher undercounts than U.S. natives. More recent immigrants are more likely than longer-term residents to be missed. The most recent study (Van Hook et al., forthcoming) finds marked improvements in coverage of Mexicans in the ACS and CPS between the late 1990s and the 2000s. This and earlier work suggest very serious coverage problems with immigrants in the data collected before the 2000 Census but fewer issues in the 2000 Census and subsequent data sets. This whole pattern of assumptions leads to adjustments of 10% to 20% for the estimates of unauthorized immigrants in the 1995-2000 CPS, with slightly larger adjustments for unauthorized Mexicans in those years. (Note that this means even larger coverage adjustments, sometimes exceeding 30% for adult men below age 40.)

After 2000, the coverage adjustments build in steady improvements in overall coverage and improvements specifically for Mexican immigrants. The improvements are even greater than noted in the research comparing Mexico and U.S. sources because the reweighted ACS and CPS data provide even greater improvements in reducing undercounts, since they incorporate results of the 2010 Census (Passel and Cohn, 2012b). With all of these factors, coverage adjustments increase the estimate of the unauthorized immigrant population by 8% to 13% for 2000-2009 and by 5% to 7% for 2010-2012. For the overall immigrant population, coverage adjustments hovered slightly below 5% during the 1990s and trended downward to around 2% to 3% by 2012. Since the population estimates used in weighting the ACS and the CPS come from the same sources, the coverage adjustments tend to be similar.

Margins of Error

Estimates of the unauthorized immigrant population are computed as the difference between a deterministic, administratively based estimate (i.e., the legal foreign-born population, or “L” in the equation above) and a sample-based estimate (i.e., the survey total of the foreign-born population, or “F”). Consequently the margin of error (or variance) for the estimated unauthorized population is the margin of error for “F,” the sample-based estimate of the foreign-born population. The margins of error shown in this report, for the entire U.S. and for individual states, are based on the variance of the foreign-born population entering since 1980. For all ACS years and for the 2005-2012 CPS, variances were computed with replicate weights supplied for the CPS by the Census Bureau (U.S. Census Bureau, 2012a; data available at http://thedataweb.rm.census.gov/ftp/cps_ftp.html#cpsmarch) and for the ACS through IPUMS (Ruggles et al., 2010; documentation of the weights at http://www.census.gov/acs/www/methodology/methodology_main/, especially Chapter 12); for earlier CPS data, generalized variance formulas supplied in Census Bureau documentation were used to compute margins of error (U.S. Census Bureau, 2012b, especially Appendix G).
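A replicate-weight variance calculation of the kind referred to above can be sketched as follows. The sketch uses the successive-difference replication formula that the Census Bureau documents for its replicate weights, in which the variance is 4/R times the sum of squared differences between each replicate estimate and the full-sample estimate (R is 80 for the ACS and 160 for the March CPS). The 90% confidence multiplier of 1.645 is an assumption here, and the function names and the example values are hypothetical.

```python
import math

def replicate_variance(full_estimate, replicate_estimates):
    """Variance from replicate estimates: (4 / R) * sum((rep - full) ** 2)."""
    r = len(replicate_estimates)
    squared_diffs = sum((rep - full_estimate) ** 2 for rep in replicate_estimates)
    return 4.0 / r * squared_diffs

def margin_of_error(full_estimate, replicate_estimates, z=1.645):
    """Margin of error as z times the standard error (z=1.645 assumed, 90% level)."""
    return z * math.sqrt(replicate_variance(full_estimate, replicate_estimates))

# Toy example with only four replicates; real files carry 80 or 160.
# Each value stands for the weighted count of the foreign born who entered
# since 1980, computed once with the full-sample weight and once per replicate.
full = 100.0
replicates = [98.0, 102.0, 101.0, 99.0]

variance = replicate_variance(full, replicates)
moe = margin_of_error(full, replicates)
print(variance, moe)
```

Because the legal population L is treated as deterministic, the margin of error computed for F in this way carries over unchanged to the residual estimate.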

Other Methodological Issues

Rounding of Estimates. All state- and national-level estimates for unauthorized immigrant populations are presented as rounded numbers to avoid the appearance of unwarranted precision in the estimates. No estimates smaller than 10,000 are shown. Estimates in the range of 10,000-100,000 are rounded to the nearest 5,000; estimates in the range of 100,000-250,000 to the nearest 10,000; estimates in the range of 250,000-1 million to the nearest 25,000; estimates of 1-10 million to the nearest 50,000; and estimates larger than that to the nearest 100,000. Unrounded numbers are used for significance tests, in plotting charts and in computations of differences and percentages.
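The rounding rules can be expressed as a short function. This is a sketch of the rules as stated, not the code used for the report. The treatment of values that fall exactly on a range boundary and of exact halves (rounded up here) is an assumption, since the text does not specify either.

```python
def round_estimate(value):
    """Round a population estimate using the tiers described above.

    Returns None for estimates below 10,000, which are not shown.
    Boundary values are assigned to the higher tier (an assumption).
    """
    if value < 10_000:
        return None
    if value < 100_000:
        step = 5_000
    elif value < 250_000:
        step = 10_000
    elif value < 1_000_000:
        step = 25_000
    elif value < 10_000_000:
        step = 50_000
    else:
        step = 100_000
    # Integer arithmetic rounds halves up and avoids Python's
    # round-half-to-even behavior in the built-in round().
    return (int(value) + step // 2) // step * step

# Invented inputs, one per tier.
examples = [9_999, 42_499, 137_400, 613_000, 1_234_567, 11_660_000]
for value in examples:
    print(value, "->", round_estimate(value))
```

Unrounded values would still be kept alongside the rounded ones, since significance tests, charts, differences and percentages rely on them.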

Country of Birth. Some modifications in the original CPS countries of birth were introduced to ensure that all foreign-born respondents could be assigned to a specific country or region of birth. See Passel and Cohn (2008) for a detailed treatment of how persons with unknown country of birth were assigned to specific countries.

Population by Nativity and Alternative Weights: 2005-2011 American Community Survey and 1995-2012 March Current Population Survey

New and Previous Estimates of the U.S. Unauthorized Immigrant Population, 2000-2011

  1. The only recent exception was for the monthly 2000-2002 CPS to incorporate large changes engendered by the replacement of the updated 1990 Census with results from the 2000 Census. Because of a large change in estimates due to revised methods between the estimates produced for 2006 and those for 2007, the Census Bureau revised CPS weights for research purposes, but for only one month of data—December 2007.