For example, move unused data from fast storage systems (disks) to “glacier-like” locations (sites providing tape). As a complementary functionality, a smart engine should infer when data are becoming “hot” again and move them back to fast storage. Note: this functionality should be available at the infrastructure level, based on inter-site data movement, not only as intra-site data placement.
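A minimal sketch of such a tiering logic is given below, assuming hypothetical QoS labels, thresholds and a `move_replica` callback; it only illustrates the kind of decision a smart engine could take, not an actual XDC implementation.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Hypothetical QoS classes standing in for fast (disk) and "glacier-like" (tape) storage.
FAST, GLACIER = "disk", "tape"

@dataclass
class Replica:
    name: str
    qos: str
    last_access: datetime
    accesses_last_week: int

def target_qos(r: Replica, now: datetime) -> str:
    """Decide where a replica should live based on its recent access pattern."""
    if r.qos == FAST and now - r.last_access > timedelta(days=90):
        return GLACIER            # cold data: demote to a tape-backed site
    if r.qos == GLACIER and r.accesses_last_week >= 3:
        return FAST               # data becoming "hot" again: recall to fast storage
    return r.qos                  # no change

def apply_policy(replicas, now, move_replica):
    """Issue inter-site transfer requests for replicas that need a QoS change."""
    for r in replicas:
        qos = target_qos(r, now)
        if qos != r.qos:
            move_replica(r.name, r.qos, qos)   # e.g. schedule a third-party copy

# Example usage with a stub transfer function.
if __name__ == "__main__":
    now = datetime(2019, 1, 1)
    replicas = [
        Replica("run2018/raw-001", FAST, now - timedelta(days=200), 0),
        Replica("run2015/aod-042", GLACIER, now - timedelta(days=2), 5),
    ]
    apply_policy(replicas, now, lambda n, src, dst: print(f"move {n}: {src} -> {dst}"))
```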
The WLCG (Worldwide LHC Computing Grid) community counts thousands of researchers distributed worldwide and currently maintains the distributed computing infrastructure for the LHC (Large Hadron Collider) experiments at CERN. The LHC computing models are based on advanced frameworks built on top of common low-level services to add functionality and to limit the end-users’ direct interaction with the underlying infrastructure.
The WLCG is, in fact, composed of at least four main distinct communities, one per experiment: ALICE, ATLAS, CMS and LHCb. Moreover, in several sites the same infrastructure is exploited by many other High Energy Physics (and astrophysics) Virtual Organizations at national and international level, e.g. VIRGO, BELLE, NA62, AMS, CTA, JUNO, ILC, COMPASS, PADME and DAMPE, just to name a few.
This leads to a highly heterogeneous ecosystem of HEP user communities injecting requirements into the same infrastructure.
WLCG is interested in the following functionalities implemented by XDC:
Provide smart caching mechanisms to support the extension of a site to remote locations and to provide alternative models for large data centers. Data stored in the original site should be accessible in a transparent way from the remote location.
The caching mechanism should guarantee that data are accessed transparently from any location without the need to explicitly copy them to the client location.
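As an illustration, the sketch below shows the read-through behaviour such a cache could expose to clients; the cache directory and the `fetch_from_origin` stub are hypothetical, standing in for a real remote protocol such as HTTP or XRootD.

```python
from pathlib import Path

# Hypothetical read-through cache: the client always asks the local cache;
# on a miss, the cache fetches the file from the origin site transparently.
CACHE_DIR = Path("/tmp/xdc-cache")

def fetch_from_origin(logical_name: str) -> bytes:
    """Stand-in for a remote read from the origin site (e.g. over HTTP or XRootD)."""
    return f"payload of {logical_name}".encode()

def open_cached(logical_name: str) -> bytes:
    """Return the file content, populating the cache on first access."""
    local = CACHE_DIR / logical_name.strip("/")
    if local.exists():
        return local.read_bytes()            # cache hit: served locally
    data = fetch_from_origin(logical_name)   # cache miss: read from the origin site
    local.parent.mkdir(parents=True, exist_ok=True)
    local.write_bytes(data)                  # keep a copy for later accesses
    return data

# The client code is identical whether the data are local or remote:
print(open_cached("/store/run2018/raw-001.root")[:20])
```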
A few examples of Data Management based on Quality of Service and Data Lifecycle Management:
- The user can specify the number of replicas and the QoS associated with each of them, e.g. one on fast storage (SSD-based disks) and two on tape, in three different locations. The system should automatically keep that policy satisfied over time (see the sketch after this list).
- The user can specify that certain datasets always have a mirror, with the replica status checked in real time or near-real time.
- The user can specify that a number of replicas are created and that they have to be accessible with different protocols (e.g. HTTP, XRootD, SRM).
- The user can specify transitions between QoS classes and/or changes in access controls based on data age (e.g. quarantine periods, moving old data to tape).
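The sketch below illustrates how policies like those above could be expressed declaratively and checked against the current replica layout; the policy fields, site names and the `violations` helper are hypothetical and not part of any existing XDC interface.

```python
from datetime import datetime, timedelta

# Hypothetical declarative policy covering the cases above: replica counts per
# QoS class, required access protocols, and an age-based transition to tape.
policy = {
    "dataset": "/higgs/2018/aod",
    "replicas": [{"qos": "disk", "count": 1}, {"qos": "tape", "count": 2}],
    "distinct_sites": 3,
    "protocols": ["https", "xrootd", "srm"],
    "age_rules": [{"older_than_days": 365, "move_to": "tape"}],
}

def violations(policy, replicas, now):
    """Compare the current replica layout against the declared policy."""
    found = []
    for want in policy["replicas"]:
        have = sum(1 for r in replicas if r["qos"] == want["qos"])
        if have < want["count"]:
            found.append(f"need {want['count']} {want['qos']} replicas, have {have}")
    if len({r["site"] for r in replicas}) < policy["distinct_sites"]:
        found.append("replicas are not spread over enough sites")
    for rule in policy["age_rules"]:
        age = timedelta(days=rule["older_than_days"])
        for r in replicas:
            if now - r["created"] > age and r["qos"] != rule["move_to"]:
                found.append(f"replica at {r['site']} should move to {rule['move_to']}")
    return found

# Example: one disk replica and one tape replica, both older than a year.
now = datetime(2019, 1, 1)
replicas = [
    {"site": "CNAF", "qos": "disk", "created": now - timedelta(days=400)},
    {"site": "DESY", "qos": "tape", "created": now - timedelta(days=400)},
]
print(violations(policy, replicas, now))
```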