Fixing the Facts: Editing of the 1880 U.S. Census of Occupations with Implications for Long-Term Labor Force Trends and the Sociology of Official Statistics

Susan B. Carter, University of California-Riverside, and Richard Sutch, University of California-Berkeley

Occupation statistics published in the Federal census of 1880 have had a disproportionate influence on statistical portraits of the level and distribution of employment in 19th-century America. They are as complete as any other 19th-century census of occupations. They were collected and compiled by Francis Amasa Walker, the most prominent and decorated economist, statistician, and public servant in the America of his time. Because Walker placed special emphasis on the collection and reporting of accurate occupation data, the 1880 occupation statistics are generally viewed as the most trustworthy of all those collected in the 19th century.

We challenge this portrait of reliability and accuracy. We believe that the enumerators' occupational returns from this crucial census were heavily edited to reduce the number of individuals with occupations prior to their publication. For youthful and older males and all women the editing was so substantial as to qualitatively affect the apparent trend in labor force participation for these groups over time. The stylized facts regarding labor market dynamics during the period of American industrialization and the historical stories constructed around them will now need to be reexamined. We contend that the editing was secretly authorized by Walker himself. No one has heretofore suggested that the official statistics of the United States were covertly altered to present a picture different from information collected by census enumerators. If we are correct, the sociology of official 19th-century American statistics will require rethinking.

1880 Occupation Statistics And Labor Force Trends

According to the standard estimates, industrialization had profoundly different effects on the labor force involvement of youths, older workers, and married women. The participation of youths is thought to have traced a humped shape, first rising and then falling in the later stages of industrialization; the rate for older men is believed to have declined monotonically; and paid employment for married women is said to have followed a U shape, falling to nearly zero levels before beginning the well-known steady advance in the 20th century.

In the standard view child labor became increasingly common in the latter part of the 19th century. This stylized fact suggests that the social movement against child labor was a response to a growing evil and that the decline of youth employment after 1900 was due to the success of Progressive Era reforms embodied in compulsory schooling and child labor legislation (Cubberley 1909; Ensign 1921; Abbott 1947; Trattner 1970; Ashby 1985; Licht 1992).

The late-19th century rise in the employment rate of youths, however, should be rather surprising. Between 1870 and 1900 real wages rose, including those of the lowest paid workers (Abbott 1905; Coombs 1926; Long 1960; Rees 1961; Williamson and Lindert 1980). Since, in the cross-section, child labor is associated with low relative incomes of fathers (Haines 1979, 309; Horan and Hargis 1991, 590-593), the rise in real wages would be expected to lead to a lower incidence of child labor over time. Throughout most of the 19th century compulsory school laws were either absent or not enforced (Tyack 1974). Nonetheless, school expenditures rose, school attendance increased, and illiteracy fell (Fishlow 1966; Bowles and Gintis 1976; Field 1978; Tyack 1974). Since children's primary alternative to paid labor is schooling, this evidence of a voluntary increase in school attendance might be supposed to imply greater costs and lower returns to child labor, again leading to a reduction in its incidence (Solmon 1970; Lindert 1978; Landes and Solmon 1972; Tyack 1976; Rodgers and Tyack 1982).

The standard belief in a rising incidence in the employment of children also clashes with the findings of demographic and social historians that, during the last half of the 19th century, fertility declined and parental attitudes toward children shifted from opportunistic to altruistic. According to this view, parents chose to have smaller numbers of children and began to downplay their children's value as economic assets and as potential contributors to family income. Increasingly children came to be viewed as precious (priceless) beings upon which parents were willing to lavish considerable affection, education, and protective attention (Lindert 1978; Becker 1981; Zelizer 1985). Presumably, these precious children would not face an increasing risk of work.

The traditional story about the employment behavior of older men is that their involvement in paid labor declined continuously over time. Durand offers two explanations for this decline. First, over time, in response to the rise in incomes and the spread of corporate-based pension plans, more men decided to "retire voluntarily to live on savings or pensions after the age of 60 or 65" (Durand 1948, 34). Second, this economy-wide tendency was reinforced by the shift of employment out of agriculture and into industry. In industry, especially in large corporate firms, according to Durand, older workers experienced an increasing difficulty in retaining their jobs (Durand 1948, 34-35).

Ransom and Sutch have challenged this traditional view, arguing instead that overall retirement rates were stable, at least between 1870 and 1937. Recently the Ransom and Sutch view received additional support from work by Costa (1995), Moen (1994), and Carter and Sutch (1996) who develop evidence suggesting that many farmers retired about the turn of the century and that they retired at rates very comparable to those of workers in the industrial sector. These findings raise the possibility that retirement may have been common in the 19th century as well.

The employment of married women is thought to trace a U-shape with industrialization (Goldin 1994). Though comprehensive statistics for the pre-industrial era in the United States are not available, documentary evidence suggests that the overwhelming majority of married women were involved in the household production of commodities for consumption and sale in that early period (Goldin 1986b). The rise of the factory system and the separation of home from work is thought to have reduced married women's involvement in the production of market goods. By 1890 when the first economy-wide figures become available, only 4.6% of married women appear to have been directly involved in the production of commodities. It wasn't until the 20th century that improvements in women's education and the spread of white-collar work reversed this trend.

Accuracy Of The Published Occupation Returns For 1880

This paper has its origins in our discovery that certain occupational statistics which appear in the official publications of the Tenth Census differ radically from those collected by the census enumerators in 1880. The enumerators' reports can be analyzed using the Public Use Microdata Sample (PUMS) from that census. The 1880 PUMS is a random, 1-in-100 sample of the original enumerators' returns. Except for the recoding of some number of persons returned with the occupation "housekeeper" and a small number of laborers and nurses, occupation in the 1880 PUMS was coded exactly as entered by the original census enumerators. The recoded housekeepers were assigned a code that allowed them to be easily identified and the enumerators' tally replicated (Ruggles and Menard et al. 1993, 10).

The discrepancies between the enumerators' occupation returns and the published census occupation figures for youth, the aged, and prime-age married women are large. Among males, the PUMS indicates 33% more employment among youths and 29% more employment among the aged than the published reports. Among younger, prime-age, and older females, the differentials are 35%, 44%, and 146%, respectively. There are no such discrepancies between the published and PUMS totals for 1900, 1910, and 1940; for those years the ratios of PUMS to published totals are all close to 1.00. There is also no large discrepancy for prime-age men in 1880. The discrepancies between the published numbers and the sample values for 1880 are large and lie entirely outside the deviations that might arise from sampling variance. If the PUMS values are accepted, they imply a radical revision to the standard estimates for 1880 and force us to reconsider trends in labor force participation of children, older men, and married women during the last third of the 19th century.
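The comparison with sampling variance can be illustrated with a rough calculation. The sketch below is not the procedure used in the paper. It assumes simple random sampling, ignores the finite population correction, and uses invented counts chosen only so that the sample implies 33% more employment than a published figure. The function names and every number in the example are hypothetical.

```python
import math


def ratio_to_published(sample_workers, published_workers, scale=100):
    """Scale a sample count up to the population and divide by the published count.

    scale=100 reflects a 1-in-100 sample; a ratio near 1.00 means the two
    sources agree.
    """
    return (sample_workers * scale) / published_workers


def sampling_z_score(sample_workers, sample_persons, published_rate):
    """Number of standard errors separating the sample rate from the published rate.

    Uses the binomial standard error under simple random sampling
    (an assumption; no finite population correction is applied).
    """
    sample_rate = sample_workers / sample_persons
    std_error = math.sqrt(published_rate * (1 - published_rate) / sample_persons)
    return (sample_rate - published_rate) / std_error


# Hypothetical illustration only: 10,000 sampled persons in some group,
# 3,990 of them returned with an occupation, against a published rate of 30%
# (300,000 workers in a population of 1,000,000).
print(ratio_to_published(3990, 300000))
print(sampling_z_score(3990, 10000, 0.30))
```

Under these invented counts the gap is many standard errors wide, which is the sense in which a discrepancy of this size cannot plausibly be attributed to sampling variance alone.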

Explaining The Discrepancy Between The Published Occupation Figures And The PUMS

Which figures ought to be accepted - those which appear in the published census volumes or those in the PUMS? The answer depends, in part, on one's beliefs about the source of the discrepancy. We examine three possibilities.

The Sampling Bias Hypothesis

The 1880 PUMS may have been selected in an inappropriate fashion. If the PUMS sample was drawn in a way that misrepresented the gainful occupations of youth, the aged, and women, this could explain the patterns we have observed. In the paper we demonstrate that the sampling bias hypothesis is easily rejected.

The Mistabulation Hypothesis

Perhaps clerical errors in the Census Office can explain the divergence between the enumerators' manuscripts and the published totals in 1880. Daniel Scott Smith, who drew his own sample from the 1880 census manuscripts for a study of old age, has suggested this possibility. Smith conjectured that when tallying the occupational results into a three-column worksheet (corresponding to the three broad age groupings, 10-15, 16-59, and 60 and older) with many individual occupations listed, the tallier would focus on finding the correct occupational row upon which to enter the tally mark. Once the occupation line was located, over 85% of the cases would fall into the broad age category 16 to 59 and would be properly recorded in the middle column. Since youthful and older laborers were comparatively rare, there might have been a tendency to misrecord some of these workers in column two. If Smith's conjecture is correct, it would explain the apparent underreporting of both youthful and older males, though not of females, in 1880. It would also imply that youthful and older male workers were underreported in 1870, since the tabulation forms and procedures were the same. Finally, it would explain the consistent behavior of the participation rates of older men across the two censuses when examined by state and specific occupation by Ransom and Sutch (1989, 175-183).

It is difficult to reject the Mistabulation Hypothesis outright since there are other puzzling discrepancies between the PUMS and the published occupation tallies. Before embracing this conclusion, however, we consider a third possibility.

The Editing Hypothesis

Our Editing Hypothesis states that Census officials in Washington decided to remove a large number of youthful, older, and female workers prior to the tabulation and publication of the gainful worker count. One difficulty with this hypothesis is that, to the best of our knowledge, there is no record of any such action in the annals of the Tenth Census. Another difficulty with our Editing Hypothesis is that it charges the Superintendent of the 1880 Census with knowingly, willfully, and secretly directing the alteration of the enumerators' returns. The Superintendent of the 1880 Census was no faceless bureaucrat, either. He was Francis Amasa Walker (1840-1897), one of the most prominent economists and statisticians of the late 19th century. Walker's accomplishments include presidency of the Massachusetts Institute of Technology (1881-1897), the American Statistical Association (1883-1897), and the American Economic Association (1885-1892). He was the first to develop explicitly the concept of "perfect competition" (Newton 1967, 24) and was the author of over a hundred technical books and articles on economics and statistics. Walker, who also superintended the Census of 1870, is generally credited as the first Bureau Chief to approach the job in a professional manner. He insisted on examinations for the Bureau clerical staff, wrested control over appointment of enumerators away from local politicians, and developed the general methodology still in use today by agencies the world over that collect and present mass data (Newton 1967, 150; Magnuson and King 1995, 28-29). The Editing Hypothesis would appear to accuse this seemingly scrupulous, public-spirited, and highly decorated man of frivolous or perhaps malicious tampering with the public record. To help motivate such an audacious hypothesis we examine Walker's character and career. We argue that the evidence warrants the following conclusions:

1) Walker was a man of integrity, imagination, and sophistication, dedicated to developing occupation statistics which would accurately portray the industrial structure of the nation's economy.

2) Walker was concerned with the internal consistency of the occupation tally as well as with its overall level.

3) Walker relied upon a wide range of evidence to form prior expectations about the level of occupational attachment and its distribution across age and gender groups. He considered it his duty to edit enumerators' reports whenever he felt they were inconsistent with other evidence.

In addition, we formulated two working hypotheses:

1) We suggest that, to achieve a consistent measure of attachment to gainful occupations, Walker questioned the reported occupation of all those he suspected of part-time employment. We suspect that this led him to question the labor force attachment, in particular, of youths attending school; women of any age living with spouses, parents, or children; and the elderly with some visible, nonwage means of support.

2) We further suspect that, to achieve consistency across similar categories, Walker edited out positive responses which implied that the person had broken the law. This might include, in addition to occupations such as prostitution and gambling, the illegal work of minors and women.

Tests Of The Editing Hypothesis

We test our two working hypotheses by exploring the extent to which they are able to identify systematic patterns of editing of the manuscript returns. Our tests rely on the PUMS together with published tables on gainful employment by state, gender, and age group.

The method is weighted least squares regression analysis. The dependent variable is the difference between the enumerated and published occupation totals by state. Our independent variables are proxies for labor force attachment and illegal activity. Our results are consistent with the Editing Hypothesis. Walker appears to have authorized the removal of a substantial number of youthful, elderly, and female workers from the occupation returns prior to their publication.
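The general form of such a regression can be sketched for the simplest case of a single proxy variable. This is a minimal illustration of weighted least squares, not a reconstruction of the paper's specification: the choice of one regressor, the use of state size as the weight, and all of the data values are assumptions made for the example, and the function name is hypothetical.

```python
def weighted_least_squares(x, y, w):
    """Fit y = intercept + slope * x by minimizing sum(w_i * residual_i ** 2).

    x: one proxy value per state (for example, a school attendance share)
    y: enumerated-minus-published occupation gap per state
    w: nonnegative weight per state (assumed here to reflect state size)
    Returns (intercept, slope).
    """
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Invented data for four hypothetical states; none of these values come
# from the census or from the paper.
proxy = [0.2, 0.4, 0.6, 0.8]
gap = [0.05, 0.08, 0.13, 0.15]
weights = [1.0, 2.0, 3.0, 4.0]
print(weighted_least_squares(proxy, gap, weights))
```

In a sketch of this kind, a positive slope would indicate that the gap between enumerated and published totals is larger where the proxy is higher, which is the type of systematic pattern the tests look for.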

Editing And Labor Force Trends

Walker's interests in the occupation statistics were somewhat different from those of scholars today. Walker was primarily interested in assessing the industry of the country. Today, the principal use of the occupation statistics of 1880 is in identifying trends in labor force participation over time. For this modern goal, consistency in the definition of the labor force across census years is crucial. In this section of the paper we ask whether Walker's editing is appropriate for our contemporary purposes and whether Walker's editing made the published occupation statistics for 1880 more or less consistent with those for later years.

After examining instructions to enumerators at various censuses as well as narrative evidence we conclude that Walker's editing reduced the comparability of the 1880 occupation statistics for children and women with those for later years but that his editing improved the comparability of statistics for older males. Our study thus suggests that the incidence of child labor was far greater in 1880 (and quite probably in 1870) than has been previously thought. The required correction is large enough to reverse the supposed upward trend between 1880 and 1900. Our findings further suggest stability in the employment trend of older males about the turn of the century and a need to reexamine the view that a deep U-shaped trend characterized the involvement of married women over the period of industrialization. These revisions to the accepted view raise important questions about the role of compulsory schooling and child labor legislation, Social Security and private pensions, and the appearance of white-collar jobs in accounting for the evolution of the American labor force over time.

Editing And The Sociology Of Official Statistics

The official statistics of the United States were often described in the 19th century as the best in the world. They were more voluminous than those of any other country. They were also extraordinarily accessible, objective, and reliable. Americans devised institutional safeguards shielding statistical agencies from meddling by politicians and interest groups. They developed questionnaires, instructions to enumerators, classification systems, coding instructions and an administrative structure that greatly increased uniformity and precision. The objectivity and accuracy of the resulting data made the American protocols a model for official statistical agencies around the world.

Much of the credit for the high quality of these official statistics belongs to Walker and his protege, Carroll Wright. Walker and Wright elevated the 19th-century American "celebration of numbers" to its greatest height. Their particular genius was to translate the popular love of and confidence in statistics into a vehicle for addressing the social upheavals that accompanied industrialization. If workers and bosses, the native- and foreign-born, or parents and children were at odds, then, Walker argued, an impartial statistical survey might suggest a resolution. Scholars have interpreted Walker's statements to mean that, except where they corrected obvious errors and inconsistencies, the census and other government statistical agencies published the exact responses to questions asked.

Our discovery that Walker apparently authorized an extensive and secret editing of the manuscript occupation returns is therefore a surprise. Walker had the opportunity to describe his revisions. We show that he did in fact describe other editing he had authorized. Thus there are two puzzles: why did he do it and why didn't he tell?

It is easy to imagine why he did it. His purpose in collecting occupation statistics was to assess the industrial structure of the country. The inclusion of marginal workers would distort this picture. In retrospect, he might have preferred to have issued enumerator instructions that eliminated these marginal workers from the gainful worker count in the first place. Given that they were recorded as gainfully occupied, however, it was better to edit them out of the published totals than to slavishly include them. This approach, which we associate today with post-modernism, holds that a number is not a "fact" but a text. As text, numbers require interpretation. Our suggestion, then, is that Walker removed marginal workers so that the occupational count would more accurately portray the industrial structure of the country.

Why didn't he tell? We don't know. Perhaps he was afraid of igniting social protest. Even the publication of his lower, modified figures brought a storm of protest over the evils of child labor from labor and social reform groups (Abbott 1908, 36). He may have worried that the unadjusted figures would exacerbate rather than ameliorate social problems. Perhaps he was afraid of undermining public trust in the integrity and reliability of official statistics. He may have worried that the public would perceive his actions as an effort at manipulation or distortion. Perhaps he was afraid of being held up to public scorn. He was appointed Superintendent of the Census after ridiculing his predecessor. He may have worried that a public disclosure that his instructions to enumerators produced a badly-distorted picture of full-time labor input would expose him to the same sort of judgement.

In any case, our study suggests that American statisticians took less of a positivist approach to their work than previously thought. If true, this fact is important both for data users and for those interested in the social construction of official information.