Internet Data Sources for Social Scientists
Online Reference Tools
BLS Handbook of Methods
| Describes the methodology, evolution, and coverage of the major economic and labor force surveys produced by the Bureau of Labor Statistics. An important reference for users of the CPS, PPI and CPI statistics, Consumer Expenditure Surveys, and many others. |
Carnegie Classification Codes
| A system for classifying accredited, degree-granting colleges in the U.S. by size and character of enrollment, scope of degrees offered, research focus, and other characteristics. Created by the Carnegie Foundation for the Advancement of Teaching, the scheme was first developed in 1970 and has recently been overhauled. |
Cartographic Boundary Files
| For use with Census data products. Includes many specialized geographies; for example, traffic zones, school districts, state legislative districts, voting districts. |
Dictionary of Occupational Titles
| A classification system commonly used in datasets and compiled at the Department of Labor. The most recent edition is 1991. The DOT has been replaced by O*Net, which defines and describes occupations: http://online.onetcenter.org |
Eagle Geocode
| Provides geocodes (Census geography, latitude/longitude) for standard addresses. You can search a few addresses for free for evaluation purposes; large-scale use is available for a fee. Batch processing options are useful for coding survey respondents. |
FIPS (ANSI) codes
| These standardized codes identify U.S. geographic areas. States are assigned 2-digit codes, counties have 3-digit codes, and there are also codes for metropolitan areas and places. FIPS codes are fairly ubiquitous in data files and useful for joining geographic records from different files. FIPS stands for Federal Information Processing Standards, and codes are assigned by the National Institute of Standards and Technology (NIST). |
| Terms and concepts: Federal Information Processing Series (FIPS) and American National Standards Institute (ANSI). |
| Codes for places (including unincorporated ones), primary county divisions, and other entities. This link connects to the online search mechanism to display or download results. You can download the entire file from here: http://geonames.usgs.gov/domestic/download_data.htm (The GNIS features database incorporates and supersedes the FIPS55 files.) Maintained by the U.S. Board on Geographic Names, which was created to maintain uniform geographic name usage. |
| One page for each State, listing FIPS codes for every geography one could imagine, plus school districts (for which there are no FIPS codes). Handy beyond your wildest dreams. |
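A common stumbling block when joining files on FIPS codes is that leading zeros are lost when codes are stored as numbers. The sketch below is a minimal illustration, not a method prescribed by any of the sources listed here: it rebuilds a 5-character state+county key from a 2-digit state code and a 3-digit county code, then matches records across two files. The helper name `county_fips`, the field names, and the data values are all made up for illustration.

```python
def county_fips(state_code, county_code):
    """Build a 5-character key: 2-digit state code + 3-digit county code."""
    return f"{int(state_code):02d}{int(county_code):03d}"

# File A stores codes as integers, so leading zeros are gone (values are illustrative).
file_a = [{"state": 1, "county": 1, "pop": 1000}]

# File B stores the combined code as a zero-padded string (values are illustrative).
file_b = [{"fips": "01001", "income": 50000}]

# Index file B by its FIPS key, then look up each file A record.
index_b = {row["fips"]: row for row in file_b}
merged = []
for row in file_a:
    match = index_b.get(county_fips(row["state"], row["county"]))
    if match is not None:
        merged.append({**row, **match})

print(merged)
```

Keeping the key as a string rather than a number is the design choice that matters here; a numeric key would turn "01001" back into 1001.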
Geographic changes to counties,
| Boundary changes to counties or their equivalents deemed "significant" by Census. |
Geography Tools
| A wealth of codes for use with historical Census products, displayed in tabular format or ready to download as ASCII files. Includes labor market areas and commuting zones, state economic areas, PUMAs of migration, county composition of metro areas back to the mid-1800s, and much more. Collected by the nice people at IPUMS. |
Glossary of Decennial Census Terms and Acronyms
| Maintained by the Census Bureau, defines every imaginable term used within the Census context, both current and superseded. |
Glossary of Social Science Computing Terms
| Although aimed at those who staff data archives, this list is handy for anyone working with varying formats of research data. Compiled by Jim Jacobs. |
MABLE/GeoCorr
| Generates equivalency files for geographic areas used in the 1980 and 1990 Censuses, |
MABLE/Geocorr2K
| Creates files or reports of equivalency codes for Census 2000 geographies and more (State legislative and Congressional districts, school districts, voter tabulation districts, and more). Not able to create correlations with previous Census geographies. |
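As a rough sketch of how an equivalency file is typically used, the example below reallocates counts from one set of areas to another using an allocation factor for each source-target pair. The row layout (source area, target area, factor), the area labels, and the counts are hypothetical; the actual column names and weighting options depend on the file you generate.

```python
from collections import defaultdict

# Hypothetical equivalency rows: (source_area, target_area, allocation_factor).
# Factors for each source area sum to 1.
equivalency = [
    ("A", "X", 0.75),
    ("A", "Y", 0.25),
    ("B", "Y", 1.0),
]

# Illustrative counts reported for the source areas.
source_counts = {"A": 400, "B": 100}

def reallocate(counts, rows):
    """Distribute each source count across target areas by allocation factor."""
    totals = defaultdict(float)
    for source, target, factor in rows:
        totals[target] += counts.get(source, 0) * factor
    return dict(totals)

target_counts = reallocate(source_counts, equivalency)
print(target_counts)
```

Because the factors for each source area sum to 1, the grand total is preserved: 500 units in, 500 units out.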
Master Area Geographic Glossary of Terms
| Definitions for geographic entities used by Census products and many corresponding SAS format label programs. Maintained by the Missouri Census Data Center and OSEDA. |
Metropolitan areas and codes
| Based on the application to the Census 2000 and Census 2010 data. |
National Crosswalk Service Center
| Delivers many occupational and educational crosswalk files, including DOT-to-1980 Census occupations, 1970-to-1980 Census, and OES-to-CIP classifications. |
North American Industry Classification System (NAICS)
PUMA
| A geographic concept used with Census microdata files. The composition of PUMAs varies according to the microdata sample. For the 2000 Census, PUMAs for the 5% sample must contain at least 100,000 people. PUMAs for the 1% sample (also called Super-PUMAs) have a population threshold of 400,000 people. PUMAs are not compatible across decennial Censuses. |
| Lists PUMAs and their component parts for 1980-2000 Censuses. |
| Handy maps of 1970-2000 PUMAs. PUMA maps for 2000 are available from Census: http://www.census.gov/geo/www/maps/CP_MapProducts.htm |
Rural Urban Continuum Codes
| Classifies counties or county equivalents by degree of urbanization and proximity to urban areas. Also known as Beale codes. |
| This link downloads an Excel file. |
| Look up individual counties or download the entire file in Excel format. |
SIC (Standard Industrial Classification) Codes
| Page also links to a 1972/1987 SIC concordance. |
| Search and browse by keyword or code. |
Standard Industry Classifications
| Links to classification systems such as NAICS 1997 and 2002, SIC, ISIC, and their revisions. |
Using, Documenting, and Citing Data
Bibliographic Citations for Data Files
| Dedicated to citing numeric files, with liberal use of Canadian datasets as examples. |
Citing Electronic Data Files
| Uses examples based on ICPSR studies. |
Guide to Social Science Data Preparation and Archiving
| Although tailored to the needs of those preparing datasets for archiving at ICPSR, this document is handy for any researcher who collects, manages, and shares data. It takes a "life cycle" approach to archiving, in that the very first steps in the process begin well before data collection. The PDF version of the document links from this page. |
How to Cite Electronic Media
| Don't forget to accurately and appropriately cite data you use! This page has examples for datafiles, web and FTP sites, e-mails, e-lists, and more. |
How to Use a Codebook
| Detailed instructions for translating codebook and record information into SAS, SPSS, and Stata programs to read and prepare data for analysis. |
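The guide above targets SAS, SPSS, and Stata; the same idea can be sketched in Python. A codebook for a fixed-width file gives each variable's column positions, and the reading program simply slices each record at those positions. The layout, variable names, and sample record below are hypothetical.

```python
# Hypothetical codebook layout: variable -> (start column, end column),
# 1-based and inclusive, as codebooks usually list them.
layout = {
    "state": (1, 2),
    "age": (3, 5),
    "sex": (6, 6),
}

def parse_record(line, layout):
    """Slice one fixed-width record into named fields using the codebook layout."""
    return {
        name: line[start - 1:end].strip()
        for name, (start, end) in layout.items()
    }

record = parse_record("26 342", layout)
print(record)
```

Note the `start - 1`: codebooks count columns from 1, while Python slices count from 0.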
Introduction to Data Handling
| Introduction to data structures (rectangular, hierarchical, et al.), how to use a codebook, and merging files. The nuts and bolts of preparing data for analysis. Not updated recently but still useful. Compiled by Social Sciences Computing Services, University of Chicago. |
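To make the rectangular versus hierarchical distinction concrete, here is a minimal sketch that flattens a hierarchical file into rectangular rows. It assumes a made-up layout in which "H" lines are household records and the "P" lines that follow are persons in that household; real files define their own record types and column positions in the codebook.

```python
# Hypothetical hierarchical file: each "H" (household) line is followed
# by the "P" (person) lines that belong to it.
lines = ["H001 2", "P01 34", "P02 29", "H002 1", "P01 61"]

def flatten(lines):
    """Return one rectangular row per person, carrying the household ID along."""
    rows = []
    household = None
    for line in lines:
        record_type = line[0]
        if record_type == "H":
            household = line[1:4]  # household ID in columns 2-4
        elif record_type == "P":
            if household is None:
                raise ValueError("person record found before any household record")
            rows.append({
                "hhid": household,
                "person": line[1:3],           # person number in columns 2-3
                "age": int(line[3:].strip()),  # age in the remaining columns
            })
    return rows

rows = flatten(lines)
for row in rows:
    print(row)
```

The resulting rows can then be merged with other rectangular files on the household ID.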
Suggested Citation Styles for Internet Information
| Recommended citation formats for static and dynamic products provided on US Census sites. |
Tools and Guidelines for managing household survey microdata
| These pages cover metadata creation, file formats and organization, data editing, principles of archiving, and minimizing disclosure risk. Compiled by the International Household Survey Network. |