CLASSE Computing Outages: 22-Jul-2020 to 27-Jul-2020
This week, a series of scheduled power outages at Wilson Lab will disrupt CLASSE computing services both onsite and offsite. We expect to have most services restored by next Monday, 27-Jul-2020, although some recovery efforts might extend further into the week. Here is a summary timeline, along with recommended actions for users:
| Wed 22-Jul || 10:00 AM || Compute Farm capacity reduced, batch queues disabled
| Fri 24-Jul || Before 5:00 PM || Please save your work and log out
| Fri 24-Jul || 5:00 PM || Most CLASSE services shut down, including Compute Farm
| Sat 25-Jul || 6:00 AM || CESR and CHESS control systems shut down
| Sat 25-Jul || Late night || CESR and CHESS control systems restored
| Mon 27-Jul || Early morning || Most CLASSE services restored; reboot your computer if necessary
Please see below for more details. Also, you may subscribe to CLASSE-IT-NEWS-L to receive email announcements during these outages.
Impact on Compute Farm
In preparation for the Wilson Lab East Module power outage on Friday, 24-Jul-2020, most CLASSE Compute Farm batch queues will be disabled two days beforehand, starting Wednesday, 22-Jul-2020 at 10:00 AM. Please note that interactive queues are excluded from this two-day pre-outage pause; jobs running in these queues will instead be terminated early in the morning on Friday, 24-Jul-2020, when the East Module power is cut.
Any terminated jobs can be resubmitted immediately for continued processing during business hours on Friday, 24-Jul-2020. The Compute Farm will remain operational on Friday thanks to a new farm node that has been deployed in Wilson 221. However, farm capacity will be severely degraded, so you may need to wait for resources to become available before your job is scheduled.
All remaining jobs will be terminated on Friday, 24-Jul-2020 at 5:00 PM, when most CLASSE computing services will be shut down ahead of the Wilson Lab main building power outage on Saturday, 25-Jul-2020. We expect the Compute Farm to be restored to full capacity by Monday, 27-Jul-2020.
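As a rough aid for deciding whether to resubmit a job on Friday, the check below tests whether a job would finish before the 5:00 PM shutdown. It is only a sketch: the function name `finishes_before_shutdown` is hypothetical, times are treated as local Wilson Lab time, and the runtime estimate is your own. It does not account for time spent waiting in the queue, which may be significant while farm capacity is reduced.

```python
from datetime import datetime, timedelta

# Shutdown time taken from the timeline in this notice (local Wilson Lab time).
FARM_SHUTDOWN = datetime(2020, 7, 24, 17, 0)  # Fri 24-Jul-2020, 5:00 PM


def finishes_before_shutdown(start: datetime, runtime: timedelta) -> bool:
    """Return True if a job starting at `start` would end by the shutdown time.

    Hypothetical helper: `start` is when the job actually begins running,
    not when it is submitted, so add any expected queue wait yourself.
    """
    return start + runtime <= FARM_SHUTDOWN


if __name__ == "__main__":
    friday_9am = datetime(2020, 7, 24, 9, 0)
    # A 6-hour job starting at 9:00 AM ends at 3:00 PM, before the shutdown.
    print(finishes_before_shutdown(friday_9am, timedelta(hours=6)))
    # A 10-hour job starting at 9:00 AM would run past 5:00 PM.
    print(finishes_before_shutdown(friday_9am, timedelta(hours=10)))
```

Jobs that cannot finish in time are better held until the Compute Farm is back at full capacity.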
Impact on General Computing
In preparation for the Wilson Lab main building power outage on Saturday, 25-Jul-2020, most CLASSE-IT infrastructure will be shut down the evening before, on Friday, 24-Jul-2020 at 5:00 PM. Please save your work to Samba (central filesystems) and log out of your computer before this time. We recommend leaving end-user computers powered on, to avoid needing manual reboots when power is restored.
During the Wilson Lab main building power outage, the following CLASSE-IT services will be unreachable from anywhere onsite or offsite:
- Remote logins via ScreenConnect, X2Go, ssh, etc.
- CLASSE VPN and other CLASSE networks
- Central filesystems, including Samba and Globus
- Web services: wiki, Indico, Timesheet, CLASSE / CHESS / CBB websites
- Network license servers (Matlab, Autodesk Vault, Cadence, etc.)
- Compute Farm
Also, within Wilson Lab, Cornell wi-fi will be unavailable while power is down. Please note that some end-user computers might not lose power, but they will be unusable without central services (authentication, networking, central filesystems). Similarly, CLASSE software that is locally installed on laptops or home systems will not run if it depends on our network license servers (see above).
CLASSE-IT will begin the recovery process on Saturday evening, immediately after the power outage ends, but services will not be fully restored until Monday, 27-Jul-2020. If you have any questions or concerns, please contact us via a ServiceRequest.
CESR and CHESS Control Systems
In order to minimize downtime for the CESR and CHESS control systems, we will not shut them down until right before the Wilson Lab main building power outage, on Saturday, 25-Jul-2020 at 6:00 AM. They will also be the first systems to be restored after power returns Saturday evening.
After the Outage
Although we expect most central CLASSE-IT services to be restored by Monday, 27-Jul-2020, individual end-user computers might need additional attention. If you have problems with your workstation or computing resources on Monday, please:
- First, try rebooting your computer, if possible.
- Then, submit a ServiceRequest if the problem remains.
General network and server maintenance will occur every Tuesday from 12 noon to 2:00 PM.
The CLASSE-IT group will always announce any expected disruptions in our NewsLetter and via CLASSE-IT-NEWS-L, but with the size and complexity of our network there is always the potential for something to go wrong. We will do our best to contain all network maintenance and planned outages to Tuesdays from 12 noon to 2:00 PM.
Unless other arrangements have been made, CLASSE-managed Windows systems may be updated and rebooted on Tuesday morning at 2:00 AM, so please avoid critical or lengthy operations at that time. For more details, please see SystemExpectations.
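If you schedule long-running work from a script, a small guard like the one below can help you avoid the regular Tuesday maintenance window. This is a minimal sketch, not a CLASSE-IT tool: the function name `in_maintenance_window` is hypothetical, and the time passed in is assumed to be local Wilson Lab time.

```python
from datetime import datetime, time

# Regular maintenance window from this notice: Tuesdays, 12 noon to 2:00 PM.
WINDOW_START = time(12, 0)
WINDOW_END = time(14, 0)
TUESDAY = 1  # datetime.weekday() counts Monday as 0


def in_maintenance_window(t: datetime) -> bool:
    """Return True if `t` falls within the Tuesday 12 noon to 2:00 PM window."""
    return t.weekday() == TUESDAY and WINDOW_START <= t.time() < WINDOW_END


if __name__ == "__main__":
    # 28-Jul-2020 is the Tuesday after the outage described in this notice.
    print(in_maintenance_window(datetime(2020, 7, 28, 12, 30)))
    print(in_maintenance_window(datetime(2020, 7, 28, 14, 0)))
```

The same pattern could be extended to the Tuesday 2:00 AM Windows update time, but that is a single point in time with no stated duration, so it is left out here.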
Questions or problems? Submit a service request.