CISER Data & Reproduction Archive Preservation and Storage Policy

Policy Volume: DA
Responsible Executive: Senior Data Librarian
Responsible Office: Cornell Center for Social Sciences
Revised: 2014-04-03; 2020-11-05

POLICY STATEMENT

The data preservation function is integrated into the operations and planning of CCSS Research Support and throughout the management stages of the research data lifecycle in order to support Social Science and Economic research at Cornell University.

REASON FOR POLICY

The fundamental purpose of the CISER Data & Reproduction Archive is to select, preserve, and make available for use primary and secondary data, documentation, and metadata in discipline-recognized digital formats that remain suitable for research in perpetuity. The data preservation and storage policy is guided by a variety of community-driven standards (e.g., the Open Archival Information System (OAIS) reference model, Trustworthy Repositories Audit & Certification (TRAC), CoreTrustSeal (CTS), the Data Documentation Initiative (DDI), and the FAIR Data Principles) that represent an international body of knowledge and expertise pertaining to various issues within digital preservation.

POLICY GUIDELINES

These guidelines address the effective implementation of procedures for the preservation of CCSS’s digital collections within the context of the CISER Data & Reproduction Archive Collection Policy. CCSS reserves the right to review the scholarly and historical value of, and user accessibility into, the data preservation characteristics.

Data Integrity

Upon receipt of new digital content, the Archive staff process the data and documentation, verify that confidentiality concerns are addressed, fix errors if necessary in collaboration with the data producer, convert data formats, and generate a checksum. The metadata pertaining to each data file is stored in a SQL database. (A backup of the SQL database is taken every evening and is retained for a finite period.) Provenance notes, which relate back to the original deposited version, are maintained as part of the metadata for any alterations made in the preservation and dissemination versions.
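
As an illustration of the ingest step described above, the following minimal sketch computes a checksum for a deposited file and records it with the file's metadata. It assumes SHA-256 as the hash algorithm and uses Python's built-in sqlite3 module as a stand-in for the SQL database; the table name, column names, and function names are hypothetical and do not describe the Archive's actual schema or tooling.

```python
import hashlib
import sqlite3
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_file(conn, path):
    """Store a file's path, size, and checksum in a hypothetical metadata table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS file_metadata ("
        "path TEXT PRIMARY KEY, size_bytes INTEGER, sha256 TEXT)"
    )
    checksum = sha256_of(path)
    conn.execute(
        "INSERT OR REPLACE INTO file_metadata VALUES (?, ?, ?)",
        (str(path), Path(path).stat().st_size, checksum),
    )
    conn.commit()
    return checksum
```

Reading the file in chunks keeps memory use constant regardless of file size, which matters for large data files.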

To ensure that the digital content remains identical and accessible, automated tasks are run to verify checksums. The results are compared to the metadata held within the SQL database to validate data integrity. If degradation of any digital content is detected, CCSS will endeavor to reinstate the original version from a backup copy.
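
The verification task described above can be sketched as follows. This is a minimal example under stated assumptions, not the Archive's actual procedure: it assumes SHA-256 checksums, and the `expected` mapping stands in for the values that would be read from the SQL database. The function names and status labels are hypothetical.

```python
import hashlib
from pathlib import Path


def file_checksum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_holdings(expected):
    """Compare each file's current checksum with its recorded value.

    `expected` maps file paths to recorded SHA-256 hex digests
    (an assumed stand-in for a query against the metadata database).
    Returns a dict mapping each path to "ok", "mismatch", or "missing".
    """
    results = {}
    for path, recorded in expected.items():
        if not Path(path).is_file():
            results[path] = "missing"
        elif file_checksum(path) == recorded:
            results[path] = "ok"
        else:
            results[path] = "mismatch"
    return results
```

In this sketch, any file reported as "mismatch" or "missing" would be a candidate for restoration from a backup copy.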

Data Normalization

Evaluation of new content types and software/format obsolescence is an ongoing process. It is expected that normalizing the CISER Data & Reproduction Archive collection by migrating to updated content types when new formats become widely available will occur seamlessly. When new versions of data files are created, either through migration to new file formats or through the creation of new file formats for dissemination, the original files are retained alongside them. Version control information is stored as part of the metadata, as referenced in the CISER Data and Reproduction Archive Versioning Policy.

Management of Storage Infrastructure

The preservation of the CISER Data & Reproduction Archive is dependent upon CCSS’s storage infrastructure. Thus, management of the storage infrastructure is designed to accommodate scalability, reliability, and sustainability, in accordance with quality control specifications and security regulations. In light of increasing user demand and changing technologies, CCSS staff routinely monitor technical developments and evaluate potential archival solutions that will both streamline and enhance CCSS data preservation practices.

Adequate storage capacity for all CISER Data & Reproduction Archive holdings is maintained. In addition, unlimited capacity from external media is available. The disk storage maintains a RAID 6 configuration, and all infrastructure is protected by uninterruptible power supplies (UPS).

All data are backed up on a daily basis via the University’s offering of EZ-backup, which also provides off-site storage. EZ-backup makes use of IBM’s Tivoli Storage Manager.

Security

CCSS is committed to taking all necessary precautions to ensure the physical safety and security of the CISER Data & Reproduction Archive holdings that it preserves. The storage infrastructure is housed in the University data center. The data center features uninterruptible power supplies (UPS), fire prevention and protection systems, physical intruder prevention and detection systems, and environmental control systems. In addition, the server racks that house CCSS’s disk storage are equipped with unique keys.

Policy Review Process: CCSS will review these policies every three years in conjunction with the CoreTrustSeal certification process or any future certification process.

Related Documents