Cornell University CISER

CISER Data Archive

Locating and Using Archive Data on the CISER Research Computing System




This page explains how to find information about data in CISER's archive and how to locate individual files. If you can't locate a specific dataset, or one that meets your needs, please contact the CISER Help Desk for assistance.

Checking Archive Holdings

CISER's collection database allows you to identify a specific study and associated files:

  • Search our catalog; for example, by title or title words, principal investigator, issuing agency, or ICPSR number.
  • Browse our catalog by subject areas, including 19 broad categories and 57 subcategories.

Lists of recent additions to the archive on the CISER file server and on CD-ROM/DVD are also available.  

Accessing and Using Archive Data on the File Server

Data files, most documentation, and other files (for example, SAS, SPSS, and Stata programs) are housed on the CISER file server. Some data require familiarity with a statistical software package that can read and manipulate raw data files or software-specific datasets.

Cornell users can use or obtain data archive holdings in the following ways:

From CISER's research servers (for those having CISER computing accounts):

Social science faculty, students, and research staff are eligible for accounts on our computing environment. CISER's computing user's manual has helpful information on using the system.

The archive's holdings catalog lists the directory locations and files associated with each study.  See below for an example of how this information appears in the catalog.

Once you log on to a research server, archive files are located on the V:\ drive. For example, these files make up the World Values Surveys (CISER codebook SIND-071):

V:\sind\071\da2790     data file
V:\sind\071\cb2790     codebook
V:\sind\071\sp2790     SPSS program
V:\sind\071\sa2790     SAS program

Usually, you don't have to copy archive files to your own user space or a CISER research server to use them. Most documentation files can be viewed from their original location using Notepad, WordPad, or Adobe Acrobat Reader. You can also read raw data files from their original location within your program: use the INFILE statement in SAS, the FILE HANDLE command with its /NAME= subcommand in SPSS, or the infix or infile command in Stata (the use command opens Stata-format datasets). Here is a simple SAS program that extracts three variables from the World Values Survey data file:

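/* The libref mylib must already be assigned with a LIBNAME statement. */
data mylib.world;
   /* Read the raw data file directly from the archive drive. */
   infile 'v:\sind\071\da2790' lrecl=352;
   /* Column positions of the three variables to extract. */
   input
      survey   1
      country  2-3
      religion 214-215;
run;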
 
SAS, SPSS, and Stata input programs can be copied to your own user space and edited, or you can cut and paste the needed sections as you write your program. 
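
The SAS program above writes to the libref mylib, which has to be assigned before the DATA step runs. The following is a minimal sketch of a complete job, not CISER's recommended setup. It assumes a hypothetical output folder, U:\myproject, in your own user space, and the fileref name wvs is likewise just an illustration. The archive path, record length, and column positions are taken from the example above.

```sas
/* Assumption: U:\myproject is a hypothetical folder in your own user space. */
libname mylib 'U:\myproject';

/* Point a fileref at the archive copy; the file is read in place, not copied. */
filename wvs 'v:\sind\071\da2790';

data mylib.world;
   infile wvs lrecl=352;
   input survey   1
         country  2-3
         religion 214-215;
run;

/* Confirm that the three variables were created. */
proc contents data=mylib.world;
run;

/* Inspect the distribution of one of the extracted variables. */
proc freq data=mylib.world;
   tables country;
run;
```

Keeping the archive path in a FILENAME statement means that only one line changes if you later point the same program at a different study.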

Downloads from catalog links (all Cornell users):

Current Cornell faculty, staff, and students can download most archive files to their own machines from links in our online catalog. Files are downloaded in ZIP compressed format. Use WinZip, StuffIt Expander, 7-Zip, or a similar utility to open them.
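
As an alternative to a separate unzip utility, SAS 9.4 and later can read a member of a ZIP archive directly through the FILENAME ZIP access method. The sketch below assumes a hypothetical download location, C:\Downloads\cpsnov03.zip, and assumes the archive contains a member named cpsnov03.dat (the data file name used in the catalog example on this page). Neither the ZIP file name nor its contents are confirmed by the catalog, so adjust both to match what you actually downloaded.

```sas
/* Assumption: hypothetical path to a ZIP file downloaded from the catalog. */
filename inzip zip 'C:\Downloads\cpsnov03.zip';

/* List the members of the ZIP file so you can see what it contains. */
data zipmembers(keep=memname);
   length memname $200;
   fid = dopen("inzip");
   if fid = 0 then stop;          /* the ZIP file could not be opened */
   do i = 1 to dnum(fid);
      memname = dread(fid, i);
      output;
   end;
   rc = dclose(fid);
run;

proc print data=zipmembers;
run;

/* Assumption: cpsnov03.dat is a member of the ZIP file.                 */
/* Echo the first five records to the log without extracting the file.  */
data _null_;
   infile inzip(cpsnov03.dat) obs=5;
   input;
   put _infile_;
run;
```

On earlier SAS releases, or if you prefer to work with the extracted file, unzip it first and point an ordinary INFILE statement at the result.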

The example below shows what the catalog file information looks like; its links are not active here. In the catalog, the information hyperlink icon takes you to the Obtaining Data Archive Files from the CISER Catalog page, which describes this service in more detail.

 

Current Population Survey, November 2003: Tobacco Use Supplement

U.S. Bureau of the Census  -- June 2006 -- Washington: The Bureau, 2006 [producer].   Note: Also known as the Tobacco Use Special Cessation Supplement. Co-sponsored by the National Cancer Institute and the Centers for Disease Control and Prevention.   Codebook: CPH-011(2003).

File Information: information hyperlink

Type of file   Directory \ File Name                Records    Size / Size Zipped
Codebook       V:\cph\011\cps_febjunnov_03.pdf      n/a        3 MB / 2 MB
File Layout    V:\cph\011\cpsnov03.txt              6,128      168 KB / 32 KB
Data           V:\cph\011\cpsnov03.dat              156,869    217 MB / 19 MB

Finding Printed Documentation and Reports

The archive has printed documentation for most datasets: codebooks, data dictionaries, technical reports, and questionnaires. These are arranged according to an in-house subject scheme. You can browse our shelves, and you can use the online catalog to find documentation for which we have no print equivalent. Archive copies are intended for on-site use only. You might also find copies at Cornell University Library locations or on the Internet.

Archive Data Files in Other Formats

We own selected datasets on CD-ROMs or DVDs. You can use the archive's catalog to find these.  The following example illustrates a study owned on CD-ROM:

County and City Data Book 2000

U.S. Bureau of the Census. Washington, DC : The Bureau, 2003 [producer]. Washington, DC: The Bureau, 2003 [distributor]. Files on CDROM#: 773.

Some may be borrowed for use outside the archive; others must be used in the archive on a public machine. Please ask staff for help with locating these items.