The whole ORTOLANG infrastructure is hosted at INIST, which also houses other large-scale national infrastructures. A dedicated team of system engineers is in charge of IT system security and uses tools such as Shinken for failure monitoring and notification.
Our repository software architecture relies on multiple VMware virtual machines. These machines are provisioned using Puppet models and can be rebuilt from scratch in case of failure. The repository data is backed up daily.
INIST’s premises include 2 machine rooms of 100 square meters each. This two-room configuration makes it possible to spread the equipment (physical servers, switches) across both rooms, so the ORTOLANG platform hardware is distributed between the 2 rooms and interconnected.
The power supply is provided by 2 inverters operating redundantly and connected to a coupling cabinet. A single inverter is capable of supplying the entire INIST site. Batteries can provide power for 20 minutes in the event of a power cut. In addition, a diesel generator set starts automatically within a few minutes and serves as an auxiliary power source with several days of autonomy.
There are 2 air conditioners in each room, and a single air conditioner is enough to cool a room. Chilled water is produced by 2 chillers located in technical premises outside the computer building. A single chiller can produce enough chilled water to cool the computer rooms and the inverter room.
A central detection and automatic inert-gas extinguishing system provides fire safety, in addition to the machine-room layout, which is designed to provide fire protection for more than one hour. The four rooms (2 machine rooms, inverter room, console room) are monitored separately with multiple sensors. The extinguishing gas is of the ARGO 55 type, an inert gas mixture composed of 50% nitrogen and 50% argon. CO2 extinguishers are also installed in the rooms.
Fire drills are carried out regularly, several times a year.
Access is systematically controlled at the entrance to INIST campus buildings.
In addition, there are reinforced access checks to the machine rooms (restricted to INIST IT technical staff and senior management).
A security officer is permanently present on the site outside working hours. Intrusion, technical, and fire alarms are centralized and transmitted to this officer, who can therefore react very promptly. A BMS (Building Management System) also makes it possible to monitor the various operating situations.
Three backup solutions are currently in use:
- Tape backups with HP Data Protector software coupled with two Quantum tape libraries, a Scalar i6000 (LTO-7) and a Scalar i80 (LTO-6);
- Disk backups with Veeam Backup software;
- Snapshots (instant captures of stored data) on the primary storage arrays.
For Linux application environments such as ORTOLANG, 2 types of backup (Data Protector and Veeam) are performed, both following the same principle: one full backup per week and one incremental backup on each of the other days, with a 6-week retention period.
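The schedule described above can be illustrated with a minimal Python sketch. It is not the Data Protector or Veeam configuration itself. The day of the weekly full backup (Sunday here), the function names, and the reading of "6-week retention" as "a backup expires once it is 42 days old" are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumption: the weekly full backup runs on Sunday (Monday is 0 in Python).
# The source does not say which day is used.
FULL_BACKUP_WEEKDAY = 6

# Retention period stated in the text: 6 weeks.
RETENTION = timedelta(weeks=6)


def backup_type(day: date) -> str:
    """Return 'full' on the weekly full-backup day, 'incremental' otherwise."""
    return "full" if day.weekday() == FULL_BACKUP_WEEKDAY else "incremental"


def is_retained(backup_day: date, today: date) -> bool:
    """A backup is kept while it is less than 6 weeks old (assumed reading)."""
    age = today - backup_day
    return timedelta(0) <= age < RETENTION


def retained_backups(today: date) -> list[tuple[date, str]]:
    """List every backup still within retention, oldest first."""
    days = [today - timedelta(days=n) for n in range(RETENTION.days)]
    return [(d, backup_type(d)) for d in reversed(days)]


if __name__ == "__main__":
    today = date(2024, 1, 7)
    kept = retained_backups(today)
    fulls = sum(1 for _, kind in kept if kind == "full")
    print(f"{len(kept)} backups retained, {fulls} of them full")
```

Under these assumptions, any 6-week window contains 6 full backups and 36 incremental ones. Real backup products usually also keep an incremental restorable by retaining the full backup it depends on, which this sketch does not model.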
Veeam backups are supported by Quantum (DXi) appliances, which allow on-the-fly compression and deduplication.
All data is hosted at INIST in both machine rooms. Offline archives (tapes) can be created for storage in fireproof vaults, one of which is located in a separate building. Hosting a duplicate copy of the data at a remote site is one of the possible developments planned for the system.
We successfully tested two crash-recovery strategies before the service was launched:
- Restoring an entire ORTOLANG infrastructure from backups on empty physical servers.
- Restoring an entire ORTOLANG infrastructure by rebuilding the entire application with its deployment tools (a skill transfer between the development team and INIST was organized to carry out this operation).
Continuous improvement policy
INIST has been committed to continuous improvement for many years, although it has not yet gone as far as certifying its data center. It has called upon specialized companies as part of this effort, for example:
- A study for a disaster recovery plan (PRAI) by the Ares firm in 2008,
- A safety audit by Ernst & Young in 2018.