Using the CISER Archive

Please note the following:

  • USAGE: Use of Archive data and documentation is limited to the purpose of academic research. Other use or redistribution is a violation of Cornell’s Campus Code of Conduct and Policy Regarding Abuse of Computers and Network Systems. Acquiring Archive files means you understand and acknowledge these policies.
  • FORMAT: All files are downloaded in compressed (ZIP) format. Use 7-Zip or another utility to unzip the files.
  • RESTRICTIONS: Some studies are restricted to those having accounts on the CISER Research Computing System. Other studies have additional restrictions from data distributors; see our policies page for additional information. Restricted files cannot be downloaded from the archive catalog.

ACCESSING AND USING ARCHIVE DATA

Data files, most documentation, and other files (for example, SAS, SPSS, and Stata programs) are housed on the CISER file server. Some data require facility with a statistical software package that can read and manipulate raw data files or software-specific datasets.

Cornell users can use or obtain data archive holdings in the following ways:

Direct from CISER’s research servers

Social science faculty, students, and research staff are eligible for accounts on our computing environment. CISER’s User Accounts Guides have helpful information on using the system.

The archive’s holdings catalog lists the directory locations and files associated with each study.  See the “Downloads from catalog links” section below for an example of how this information appears in the catalog.

Once you log on to a research server, archive files are located on the V:\ drive.  For example, the following files make up the World Values Surveys (CISER codebook SIND-071):

V:\sind\071\da2790     data file
V:\sind\071\cb2790     codebook
V:\sind\071\sp2790     SPSS program
V:\sind\071\sa2790     SAS program

Usually, you don’t have to copy archive files to your own user space or a CISER research server to use them.  Most documentation files can be viewed from their original location using Notepad, WordPad, or Adobe Acrobat Reader.  You can read raw data files from their original location within your program, using the INFILE statement in SAS, the FILE HANDLE ... /NAME= statement in SPSS, or the infile or infix command in Stata (the use command opens files that are already Stata datasets).  Here is a simple SAS program that extracts three variables from the World Values Survey data file:

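/* Assumes the libref mylib was assigned earlier with a LIBNAME statement. */
data mylib.world;
  /* Read the raw file in place; lrecl sets the record length to 352. */
  infile 'v:\sind\071\da2790' lrecl=352;
  /* Column input: each variable name is followed by its column position(s). */
  input
    survey   1
    country  2-3
    religion 214-215;
run;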

SAS, SPSS, and Stata input programs can be copied to your own user space and edited, or you can cut and paste the needed sections as you write your program.
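
For users who download the raw file and work outside SAS, the same column-based extraction can be sketched in Python. This is a minimal sketch, not a CISER-provided program: the names COLUMNS, parse_record, and read_records are hypothetical, the column positions are copied from the SAS INPUT statement above, and the latin-1 encoding is an assumption about the raw file.

```python
from typing import Optional

# Column positions copied from the SAS INPUT statement (1-based, inclusive).
COLUMNS = {
    "survey": (1, 1),
    "country": (2, 3),
    "religion": (214, 215),
}


def to_int(text: str) -> Optional[int]:
    """Convert a fixed-width field to int; blank or non-numeric fields become None."""
    text = text.strip()
    try:
        return int(text)
    except ValueError:
        return None


def parse_record(line: str) -> dict:
    """Slice one raw record into the three variables."""
    record = {}
    for name, (start, end) in COLUMNS.items():
        # SAS columns are 1-based and inclusive; Python slices are
        # 0-based and end-exclusive, so only the start shifts by one.
        record[name] = to_int(line[start - 1:end])
    return record


def read_records(path: str) -> list:
    """Parse every record in a raw data file (encoding is an assumption)."""
    with open(path, "r", encoding="latin-1") as handle:
        return [parse_record(line.rstrip("\r\n")) for line in handle]
```

Calling read_records with the path to your copy of the data file returns one dictionary per record. Fields that are blank or not numeric come back as None, which roughly mirrors how SAS assigns missing values.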

Downloads from catalog links

Current Cornell faculty, staff, and students can download most archive files to their own machines from links in our online catalog.  Files are downloaded in ZIP compressed format. If needed, use 7-Zip or a similar utility to open them.
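
If you would rather script the extraction than use 7-Zip, Python’s standard zipfile module can open a downloaded file. The following is a minimal sketch; the function name unzip_download is hypothetical and not part of any CISER tool.

```python
import zipfile
from pathlib import Path


def unzip_download(zip_path, dest_dir):
    """Extract every member of a ZIP file into dest_dir and return the member names."""
    dest = Path(dest_dir)
    # Create the destination folder if it does not exist yet.
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        archive.extractall(dest)
    return names
```

For example, unzip_download("cpsnov03.zip", "cps_nov_2003") would extract the contents into a cps_nov_2003 folder. Both names are placeholders, since the actual download file name depends on the catalog link.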

The example below shows what file information looks like in the catalog. It is for illustration only; the links are not active.

Current Population Survey, November 2003: Tobacco Use Supplement

Bibliographic Information:  U.S. Bureau of the Census.  — June 2006 — Washington: The Bureau, 2006 [producer].   Note: Also known as the Tobacco Use Special Cessation Supplement. Co-sponsored by the National Cancer Institute and the Centers for Disease Control and Prevention.   Codebook: CPH-011(2003).

Abstract:  Data are provided on labor force activity for the week prior to the survey. Comprehensive data are available on the employment status, occupation, and industry of persons 15 years old and over. Also shown are personal characteristics such as age, sex, race, marital status, veteran status, household relationship, educational background, and Hispanic origin. The data also contain information about smoking history, prevalence, tobacco type use, workplace smoking policies, smoking rules in the home, attitudes towards smoking in public places, opinions about the degree of youth access to tobacco in the community, and attitudes toward advertising and promotion of tobacco.

File Information:

Type of file    Directory \ File Name                 Size / Size Zipped
Codebook        V:\cph\011\cps_febjunnov_03.pdf       3 MB / 2 MB
File Layout     V:\cph\011\cpsnov03.txt               168 KB / 32 KB
Data            V:\cph\011\cpsnov03.dat               217 MB / 19 MB

Note:  The codebook is a binary file; use Adobe Acrobat to view it. It is also used for CPH-002(2003) and CPH-006(2003).

Printed Documentation and Reports

The archive has printed documentation for most datasets: codebooks, data dictionaries, technical reports, and questionnaires. You can browse our shelves, or use the online catalog to find documentation for which we have no print equivalent. Archive copies are intended for on-site use only. You might also find copies at Cornell University Library locations or on the Internet.

Archive Data Files in other Formats

We own selected datasets on CD-ROMs or DVDs. You can use the archive’s catalog to find these – check the “CD-ROM/DVD” box in part 2 of the search options.  The following example illustrates a study owned on CD-ROM:

County and City Data Book 2000
U.S. Bureau of the Census.  — 13th Edition — Washington, DC : The Bureau, 2003 [producer].   Washington DC : The Bureau, 2003 [distributor].   Files on CDROM/DVD# 773.

Some may be borrowed for use outside the archive; others must be used in the archive on a public machine.  Please ask staff for help with locating these items.