Abstract
| The Italian National Institute for Nuclear Physics (INFN) has a long history of designing and implementing large-scale computing infrastructures and applications.
INFN has invested heavily over the past ten years in developing solutions to enable, optimise and simplify transparent access to a multi-site federated Cloud infrastructure. A primary goal of this effort is to provide a generic model that allows INFN and other users to access resources in a fair and simple manner, regardless of the complexity of their requirements, of their proximity to a powerful computing centre, or of their ability to administer advanced resources such as those offering GPUs. The ultimate objective is to shorten both the “time-to-market” and the learning curve for deploying, managing, and utilising computing services on a federated cloud system.
For this purpose, INFN Cloud provides a rich set of compute and storage services that can be automatically deployed on geographically distributed sites in an easy and transparent way.
One of the services most frequently requested by members of different scientific communities is based on Jupyter notebooks. We have therefore been adapting the standard JupyterHub setup to provide a flexible and extensible multi-user service with some key integrations. First, authentication is based on OpenID Connect, while authorization relies on OAuth attributes (such as the user’s subject and groups) to grant admin or regular permissions. JupyterLab instances are spawned in containers, which may start from custom images that encapsulate the libraries users need (e.g. experiment software or big data analytics tools). All containers mount two different types of storage space: a local area, where data is stored on the node filesystem, and a remote storage area, which provides access to the INFN Cloud Object Storage through a POSIX interface. Files (notebooks, data, etc.) saved in the local storage area persist only as long as the node hosting the notebook servers is up and running, whereas data saved in the cloud area can be accessed at any time, either through the notebook or through the web interface of the INFN Cloud Object Storage service.
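The attribute-based authorization described above can be illustrated with a minimal Python sketch. It is not the INFN Cloud implementation: the `AccessPolicy` class, the `resolve_role` function, the group names and the name of the groups claim are all hypothetical, chosen only to show how a subject and a list of groups might be mapped to admin or regular permissions.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class AccessPolicy:
    """Hypothetical policy: which groups grant which role."""
    # Group names are placeholders; real values depend on the identity provider.
    admin_groups: Set[str] = field(default_factory=lambda: {"admins"})
    user_groups: Set[str] = field(default_factory=lambda: {"users"})
    # Name of the claim carrying group membership (an assumption).
    groups_claim: str = "groups"


def resolve_role(claims: dict, policy: AccessPolicy) -> Optional[str]:
    """Return "admin", "user", or None (access denied) for a set of claims."""
    # A token without a subject cannot be tied to a user, so it is rejected.
    if not claims.get("sub"):
        return None
    groups = set(claims.get(policy.groups_claim, []))
    # Admin membership takes precedence over regular membership.
    if groups & policy.admin_groups:
        return "admin"
    if groups & policy.user_groups:
        return "user"
    return None
```

For instance, `resolve_role({"sub": "abc", "groups": ["users"]}, AccessPolicy())` would yield a regular user, while claims with no matching group would be denied.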
The usage of GPUs is also supported for running compute-intensive workloads. The automated configuration has also been tested with partitioned A100 GPUs: in this case, each notebook container gets an available partition of the GPU.
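The idea that each notebook container receives an available GPU partition can be sketched as a simple pool allocator. The abstract does not describe the actual assignment mechanism, so the `PartitionPool` class and the partition identifiers below are purely illustrative assumptions.

```python
class PartitionPool:
    """Hypothetical pool handing out one free GPU partition per container."""

    def __init__(self, partition_ids):
        self._free = list(partition_ids)  # partitions not yet assigned
        self._assigned = {}               # container name -> partition id

    def acquire(self, container):
        """Return the container's partition, assigning a free one if needed."""
        if container in self._assigned:
            return self._assigned[container]
        if not self._free:
            return None  # every partition is in use
        partition = self._free.pop(0)
        self._assigned[container] = partition
        return partition

    def release(self, container):
        """Return the container's partition to the pool when it stops."""
        partition = self._assigned.pop(container, None)
        if partition is not None:
            self._free.append(partition)
```

In this sketch a container that asks twice gets the same partition back, and a request made while the pool is exhausted returns `None` rather than sharing a partition.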
This contribution will provide details about the implementation of the service and some example use-cases running on INFN Cloud. |