Data Transfer for CHESS Users

Data collected at most CHESS beamlines are written to a centralized data acquisition and storage system known as CHESS-DAQ. This system consists of a dedicated high-speed network that connects the experimental stations to hundreds of terabytes of enterprise-class redundant disk arrays, as well as an offsite magnetic tape library for long-term archival.


Data Retention

In general, CHESS raw data is kept readily available on disk for two runs after data collection ("current" and "previous"). After two runs, raw data is removed from disk.

After data has been removed from disk, you may ask for it to be restored from our tape archives by submitting a CLASSE-IT service request. Simply email service-classe@cornell.edu with your run number (e.g. 2022-2) and BTR (e.g. pi-3456-a) -- consult with your staff scientist if you do not have this information. Please allow 1-2 days for the restore to finish (depending on the size of your dataset).
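
As an illustration only, a restore request could be drafted with Python's standard email library. The recipient address and the example run number and BTR come from the text above. The function name, subject line, and body wording are hypothetical, and CLASSE-IT does not prescribe any particular format.

```python
from email.message import EmailMessage

# Address given in the text above for CLASSE-IT service requests.
SERVICE_ADDRESS = "service-classe@cornell.edu"


def build_restore_request(sender, run_number, btr):
    """Draft (but do not send) a tape-restore request.

    The subject and body wording are illustrative assumptions,
    not a format required by CLASSE-IT.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = SERVICE_ADDRESS
    msg["Subject"] = f"Restore CHESS raw data from tape: run {run_number}, BTR {btr}"
    msg.set_content(
        "Please restore the following CHESS raw data from the tape archive.\n"
        f"Run number: {run_number}\n"
        f"BTR: {btr}\n"
    )
    return msg


# Example values taken from the text above; the sender is a placeholder.
draft = build_restore_request("user@example.edu", "2022-2", "pi-3456-a")
print(draft)
```

The draft is only printed here. Sending it would require your own mail client or SMTP server.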

To conserve disk space for restored raw data, whenever disk usage exceeds 80%, files that have not been accessed in the past 30 days will be removed, from oldest to newest, until disk usage falls below 70%.
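
The cleanup policy above can be sketched as a small selection function. This is a minimal model under stated assumptions, not the actual CHESS cleanup tool: the names are hypothetical, and "oldest to newest" is assumed to mean ordering by last access time.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    path: str
    size_bytes: int
    days_since_access: float


def select_files_to_remove(files, capacity_bytes, used_bytes,
                           high_water=0.80, low_water=0.70, min_idle_days=30):
    """Return the paths a cleanup pass would remove.

    Assumption: "oldest" means least recently accessed.
    """
    # Cleanup is triggered only once usage exceeds the high-water mark (80%).
    if used_bytes <= high_water * capacity_bytes:
        return []

    # Only files idle for at least 30 days are eligible; oldest access first.
    candidates = sorted(
        (f for f in files if f.days_since_access >= min_idle_days),
        key=lambda f: f.days_since_access,
        reverse=True,
    )

    removed = []
    for f in candidates:
        # Stop as soon as usage has fallen below the low-water mark (70%).
        if used_bytes < low_water * capacity_bytes:
            break
        removed.append(f.path)
        used_bytes -= f.size_bytes
    return removed
```

Note that a recently accessed file is never selected, so in this model usage can stay above 70% if too few files are eligible.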

Also, if you have a CLASSE account that was automatically created or renewed for a specific beamtime, that account expires three weeks after the end of the CHESS run (the expiration date is given in your account notification email). To request an extension of your CLASSE account for data retrieval, please email service-classe@cornell.edu.

Remote Data Access

CHESS users are encouraged to download their data directly from the CHESS-DAQ to their own computers. This data transfer can be done while onsite at CHESS (using one's laptop) or back at one's home institution.

For both access methods below, CHESS users will need to obtain a CLASSE account. Please contact the CHESS Users' Office or a staff scientist for assistance.

Method 1 (preferred): Globus

Globus (https://www.globus.org) is our recommended data transfer tool because it is optimized for large data volumes, and it works on any platform. To use it, users simply install Globus Connect Personal (free software) on their computers and then log in to a CLASSE Globus endpoint with their CLASSE password.

Please click here for complete instructions.

  • CHESS users are strongly advised to test Globus data transfers before arriving onsite, using the CHESS Test endpoint.
  • CHESS users may access their data during and after their beamtime using the CHESS Raw and CHESS Aux endpoints.
  • Please report any problems by opening a CLASSE-IT Service Request.

Method 2: Remote Login

CHESS users with a CLASSE account can log into CLASSE's general-purpose Linux login node, lnx201.classe.cornell.edu:
  • Linux and Mac: please see RemoteLinux for instructions on using SSH, SFTP, and SCP.
  • Windows: we recommend the SCP client WinSCP, for which we have instructions at SFTP-SCP.

After logging into lnx201.classe.cornell.edu, CHESS users can find their files in these locations:
  • Data for the current run: /nfs/chess/raw/current/[station]/[beam_time_run_id]/.
  • Data from previous runs: /nfs/chess/raw/[run]/[station]/[beam_time_run_id]/.
  • If a particular dataset no longer appears on disk, it can be retrieved from archival tape storage by submitting a ServiceRequest.
  • Sample files for testing are available at /nfs/chess/raw/test-download/, which contains two directories:
    • small contains three small text files
    • large contains one binary file with a size of 2 GB, which may be used to measure transfer speed.
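
The directory layout above can be turned into a download command with a short helper. This is a sketch under stated assumptions: the function names, the station name "id3a", and the user name "jdoe" are hypothetical placeholders, and the identifier "pi-3456-a" is used only as an example value. The scp command is built but not executed.

```python
import posixpath

# Host name and root directory as given in the text above.
LOGIN_NODE = "lnx201.classe.cornell.edu"
RAW_ROOT = "/nfs/chess/raw"


def raw_data_path(station, beam_time_run_id, run=None):
    """Build the raw-data directory for a station and beam time.

    With run=None the path points at the current run; otherwise `run`
    is a run name such as "2022-2".
    """
    run_dir = "current" if run is None else run
    return posixpath.join(RAW_ROOT, run_dir, station, beam_time_run_id) + "/"


def scp_command(user, remote_path, local_dir="."):
    """Return an scp argument list that copies a remote directory recursively."""
    return ["scp", "-r", f"{user}@{LOGIN_NODE}:{remote_path}", local_dir]


# Hypothetical example: station "id3a", user "jdoe".
print(raw_data_path("id3a", "pi-3456-a"))
print(raw_data_path("id3a", "pi-3456-a", run="2022-2"))
print(" ".join(scp_command("jdoe", "/nfs/chess/raw/test-download/small/")))
```

The last line targets the small test directory, which is a convenient way to confirm that your login works before copying real data.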


Onsite Data Access

Although it is not recommended as a general solution, CHESS users may also transfer their data to a portable USB device while onsite. Please see DataTransferRecs for important considerations when selecting a USB device. The computers located at each beamline may be used for this purpose. Please consult with the staff scientists for instructions.

If the beamline computers are not available, there are public terminals (Windows and Linux) in Wilson 315 (instructions here).

As a last resort, users may connect their systems to the CHESS Public subnet and access the DAQ directly. Please see DAQClientConfiguration for instructions.


Mail

If all else fails, we can ship a hard drive containing the data that was collected to the user's home institution. The user is responsible for providing the hard drive and bearing the shipping costs. Please contact the CHESS Users' Office to make arrangements.


Getting Additional Help

If you have any computing questions, please open a CLASSE-IT Service Request by emailing service-classe@cornell.edu. For help with central Cornell services, contact IT@Cornell.


Topic revision: 16 Nov 2022, WernerSun