Data Archive

The collection includes federal and state census files, administrative records, public opinion surveys, economic and social data from national and international organizations, along with studies compiled by individual researchers. In addition, Cornell University subscribes to many different data resources including ICPSR and the Roper Center, providing access to a large variety of additional data sources. The CISER archive was founded in 1982.

Finding Data

Please use the search archive or browse archive option to find data that matches your research needs. CISER also provides several download centers for your convenience.

Using the CISER Archive

Data files, most documentation, and other files (for example, SAS, SPSS, and Stata programs) are housed on the CISER file server. Some data require facility with a statistical software package that can read and manipulate raw data files or software-specific datasets.

Cornell users can use or obtain data archive holdings in the following ways:

USAGE NOTE: Use of Archive data and documentation is limited to academic research. Any other use or redistribution violates Cornell’s Campus Code of Conduct and Policy Regarding Abuse of Computers and Network Systems. Some datasets have additional requirements for their use; see Cornell’s ICPSR Membership and Cornell’s Roper Center Membership for additional information. Acknowledgement of the use of data (or other services) from CISER is respectfully requested. Please see details here.

Understanding Traffic Light File Permissions

Some studies are restricted to those having accounts on the CISER Research Computing System. Other studies have additional restrictions from data distributors; see our policies page for additional information. Restricted files cannot be downloaded from the archive catalog.

Green Light: These files can be downloaded by anyone. CISER staff provide help with their use to current faculty, staff, and students affiliated with Cornell University.

CISER staff continue to review studies in the data archive to determine which can be made widely available to the public. When studies are supplied to non-Cornell users, they are provided as is. Non-Cornell users are expected to rely on assistance provided by their local institutions.

Yellow Light: These files can be downloaded from links in the online catalog with CUWebAuth authentication. In some cases, these files must be used within the CISER research computing environment (CISERRSCH). Cornell faculty, staff, and students can apply for a computing account.

Red Light: These files may be used by Cornell-affiliated users with a research computing account and with appropriate authorization from the data provider. Contact the CISER Data Archive for further information.

Downloads from catalog links

Current Cornell faculty, staff, and students can download most archive files to their own machines from links in our online catalog. Files are downloaded in ZIP compressed format. If needed, use 7-Zip or a similar utility to open them.
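
If you would rather unpack a downloaded file from a script than with 7-Zip, the following is a minimal sketch in Python, assuming Python is installed on your own machine. The function name extract_archive and the file names in the comment are hypothetical and are not part of the CISER catalog.

```python
import zipfile
from pathlib import Path


def extract_archive(zip_path, dest_dir):
    """Extract every member of a ZIP file into dest_dir and return the member names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the target folder if needed
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Hypothetical usage; substitute the name of the ZIP file you actually downloaded:
#   extract_archive("cpsnov03.zip", "cps_nov_2003")
```

The function returns the names of the extracted files, so you can confirm that the data file, file layout, and codebook you expected are all present.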

The following illustrative example (its links do not work) shows how file information appears in the catalog.

Current Population Survey, November 2003: Tobacco Use Supplement

Bibliographic Information: U.S. Bureau of the Census. — June 2006 — Washington: The Bureau, 2006 [producer]. Note: Also known as the Tobacco Use Special Cessation Supplement. Co-sponsored by the National Cancer Institute and the Centers for Disease Control and Prevention. Codebook: CPH-011(2003).

Abstract: Data are provided on labor force activity for the week prior to the survey. Comprehensive data are available on the employment status, occupation, and industry of persons 15 years old and over. Also shown are personal characteristics such as age, sex, race, marital status, veteran status, household relationship, educational background, and Hispanic origin. The data also contain information about smoking history, prevalence, tobacco type use, workplace smoking policies, smoking rules in the home, attitudes towards smoking in public places, opinions about the degree of youth access to tobacco in the community, and attitudes toward advertising and promotion of tobacco.

File Information:

Type of file    Directory \ File Name              Size / Size Zipped
Codebook        V:\cph\011\cps_febjunnov_03.pdf    3 MB / 2 MB
File Layout     V:\cph\011\cpsnov03.txt            168 KB / 32 KB
Data            V:\cph\011\cpsnov03.dat            217 MB / 19 MB

Note on the codebook: binary file; use Adobe Acrobat to view. Also used for CPH-002(2003) & CPH-006(2003).

Direct from CISER’s research servers

Social science faculty, students, and research staff are eligible for accounts on our computing environment. CISER’s User Accounts Guides have helpful information on using the system.

The archive’s holdings catalog lists the directory locations and files associated with each study. See the “Downloads from catalog links” section above for an example of how this information appears in the catalog.

Once you log on to a research server, archive files are located on the V:\ drive. For example, the following files make up the World Values Surveys (CISER codebook SIND-071):

V:\sind\071\da2790 data file
V:\sind\071\cb2790 codebook
V:\sind\071\sp2790 SPSS program
V:\sind\071\sa2790 SAS program

Usually, you don’t have to copy archive files to your own user space or a CISER research server to use them. Most documentation files can be viewed from their original location using Notepad, WordPad, or the Adobe Acrobat Reader. You can read data files from their original location within your program: use the infile statement in SAS, the FILE HANDLE command with its /NAME= subcommand in SPSS, or, in Stata, the infix command for raw data and the use command for Stata-format datasets. Here is a simple SAS program that extracts three variables from the World Values Survey data file:

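/* Assumes the libref mylib has already been assigned with a LIBNAME statement. */
data mylib.world;
  /* Read the raw file in place; each record is 352 characters long. */
  infile 'v:\sind\071\da2790' lrecl=352;
  input
    survey   1
    country  2-3
    religion 214-215;
run;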

SAS, SPSS, and Stata input programs can be copied to your own user space and edited, or you can cut and paste the needed sections as you write your program.
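
The same column-based extraction can be sketched outside the statistical packages. The Python example below mirrors the three column ranges from the SAS program above. It is only an illustration: the names COLUMNS, parse_record, and read_records are hypothetical, and it assumes that each field holds an integer code and that the data file is plain text with one record per line.

```python
# 1-based, inclusive column ranges copied from the SAS input statement.
COLUMNS = {"survey": (1, 1), "country": (2, 3), "religion": (214, 215)}


def parse_record(line, columns=COLUMNS):
    """Slice one fixed-width record into a dict of integers (None if the field is blank)."""
    record = {}
    for name, (start, end) in columns.items():
        # Convert the 1-based inclusive range to a Python slice.
        field = line[start - 1:end].strip()
        record[name] = int(field) if field else None
    return record


def read_records(handle, columns=COLUMNS):
    """Yield one parsed dict per non-empty line of an open text file."""
    for line in handle:
        line = line.rstrip("\r\n")
        if line:
            yield parse_record(line, columns)


# Hypothetical usage on a research server (the path is the one listed above):
#   with open(r"V:\sind\071\da2790") as handle:
#       for row in read_records(handle):
#           ...
```

Always check the column positions against the codebook or file layout for the study you are using; the ranges here are taken only from the sample program.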

Printed Documentation and Reports

The archive has printed documentation for most datasets: codebooks, data dictionaries, technical reports, and questionnaires. You can browse our shelves, or use the online catalog to find documentation for which we have no print equivalent. Archive copies are intended for on-site use only. You might also find copies at Cornell University Library locations or on the Internet.

Archive Data Files in other Formats

We own selected datasets on CD-ROM or DVD. To find these in the archive’s catalog, check the “CD-ROM/DVD” box in part 2 of the search options. The following example illustrates a study owned on CD-ROM:

County and City Data Book 2000
U.S. Bureau of the Census. — 13th Edition — Washington, DC : The Bureau, 2003 [producer]. Washington DC : The Bureau, 2003 [distributor]. Files on CDROM/DVD# 773.

Some may be borrowed for use outside the archive; others must be used in the archive on a public machine. Please ask staff for help with locating these items.

Depositing Research Data

If you have data that you would like to deposit into a data repository, please contact us to discuss your repository and archival needs.