Working Environment at CC-IN2P3

Members of LSST-France use the computing resources and services provided by the IN2P3 computing center (CC-IN2P3). You are invited to read the CC-IN2P3 user documentation at:

https://doc.cc.in2p3.fr

In these pages you will find details of the CC-IN2P3 working environment relevant for your activities as a member of the LSST-France community.

Overview

The CC-IN2P3 computing environment is made up of several components that you can use:

  • a Login Farm, where you can connect and do interactive work

  • a Batch Farm, where you can submit jobs for asynchronous execution in CPU- or GPU-equipped compute nodes

  • several Data Storage and File Systems, where you can access data and software, and store your own

Below you can find more details on each of those components.

How to Get Help

You can get help by asking other LSST-France members via instant messaging (see Instant messaging).

In addition, for questions related to CC-IN2P3 operations, passwords, account support, group membership, storage quotas, etc. you can contact the help desk at:

https://support.cc.in2p3.fr

Important

⚠️ The credentials for contacting the help desk are specific to the ticketing tool and are not the same ones you use for logging into the Login Farm. The first time you contact the help desk, you need to create your account by clicking on the link “Register as a new customer” at the bottom of the splash screen, as shown below:

[Image: splash screen of the ticketing tool, showing the “Register as a new customer” link (ticketing-splash.jpg)]

Account Setup

To use any of the computing resources allocated to LSST-France at CC-IN2P3 you need an individual account.

If you don’t have an account yet, please follow these instructions to apply for one. You will be directed to an application form in CC-IN2P3’s Identity Management Portal, where you can submit your request.

Account creation is a multi-step process. After providing some details about yourself which will be associated with your account (e.g. your given and family names, your institutional e-mail address, etc.), you will be asked in step 2 to select your current home institution. If you are affiliated with a non-French research organization, your institution may not appear in the drop-down menu labelled “Research Structure”. If that is the case, please make sure you tick the box “My research structure does not appear in the list”, as shown in the image below. This will ensure your request is routed to the individuals who have the authority to approve the creation of your account as an external collaborator.

Once your request is approved, your account will be created and you will be notified.

Once you have an account, you need to apply for it to become a member of the lsst group via the Identity Management Portal. After logging in to the portal with your individual credentials, request that your account be attached to the lsst collaboration, as shown below:

Your request will be pending formal approval by an LSST-France representative with the appropriate authority, as shown in the image below. When your account has joined the lsst group you will be notified.

If you work on several science projects, your account may be configured as a member of several groups, including lsst. In some situations you may need to switch to a different primary group, either permanently or for the duration of a single work session. You can find details on how to switch groups in the FAQ section of the generic end-user documentation.
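
For example, if lsst is not the primary group of your current session, the standard Unix command newgrp starts a new shell with lsst as the primary group. This is only a generic illustration; CC-IN2P3 may provide its own mechanism, so please refer to the FAQ mentioned above:

newgrp lsst    # start a new shell whose primary group is lsst
id -gn         # check the primary group of the current session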

Detailed information on the administrative organization of CC-IN2P3 user accounts in collaborations can be found in the generic end-user documentation, where you can also find details on the roles and responsibilities of collaboration representatives.

Important

⚠️ As a user of CC-IN2P3 computing center services you are expected to comply with its charters. Please make sure you have read them and understand your rights and responsibilities before using those services.

Operations Status

You can get a live overview of the status of CC-IN2P3 computing services at:

https://cctools.in2p3.fr/services_status

Scheduled maintenance

There are four scheduled maintenance periods per year (about one per quarter), each typically lasting 12 hours or less. The calendar of those periods is established one year in advance. To be informed when those periods are scheduled, configure your ICS-compatible calendar application to subscribe to the feed:

https://zimbra.in2p3.fr/home/usr6402@cc.in2p3.fr/MAINTENANCE%20CC.ics

Alternatively, you can regularly download this file into your calendar application. More details about scheduled outages are available here.

During each maintenance period a subset of the computing services is impacted. To be informed about the details of each outage, scheduled or otherwise, you may want to subscribe to the very low-traffic mailing list USERS-CC-L by sending an e-mail to listserv@in2p3.fr with the following message body:

subscribe USERS-CC-L Your_Given_Name Your_Last_Name

You won’t be able to post to this list but will receive all announcements sent by the team in charge of CC-IN2P3 support and operations.

(See also the Monitoring and Dashboards section.)

Login Farm

The login farm is composed of a set of computers where you can connect via ssh using your individual user credentials. All the computers in this farm can access the data in your $HOME or in any other storage space (see Data Storage and File Systems).

You can use the computers in the login farm to perform interactive work (e.g. editing files, writing documents, compiling code, etc.) and to submit jobs to the Batch Farm.

The computers in the login farm used by LSST-France members run the Red Hat Enterprise Linux distribution of the Linux operating system. To connect to a computer in the login farm, use the generic host name cca.in2p3.fr (technically, it is a DNS alias), for instance:

ssh yourlogin@cca.in2p3.fr

A connection request made through that generic host name is dynamically directed to a concrete host (say, cca001.in2p3.fr), depending on several parameters at the time of your request, such as the number of connected users and the load of each host in the farm. Therefore, you are encouraged to use the generic host name instead of the name of a concrete host.

A separate, smaller set of computers for interactive work can be reached via the alias ccahm.in2p3.fr (hm stands for high memory). These computers have a different hardware configuration from the machines behind the alias cca.in2p3.fr; in particular, they have a significantly larger amount of RAM. They are intended for interactive work which requires more memory than is typically available on the hosts reachable through the alias cca.in2p3.fr. Please bear in mind that all the machines in the login farm, including the high-memory machines, are shared with other users of CC-IN2P3.

To connect to those high-memory hosts do:

ssh yourlogin@ccahm.in2p3.fr

See How to customize your SSH client for detailed information on how to configure your SSH client for secure, passwordless connection to the login farm, and How to execute LSST-enabled JupyterLab notebooks for details on executing notebooks in the login farm.
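
As an illustration, a minimal entry in your ~/.ssh/config could look like the sketch below; the alias cca and the user name yourlogin are placeholders, and the how-to above describes the recommended, complete configuration:

# ~/.ssh/config -- illustrative entry, adapt it to your own setup
Host cca
    HostName cca.in2p3.fr
    User yourlogin

With such an entry in place, the connection command shortens to ssh cca.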

Batch Farm

The batch farm used by members of LSST-France is composed of a few hundred hosts devoted to compute- and data-intensive tasks executed mostly asynchronously. Those compute nodes are managed by Slurm, a cluster management and job scheduling system, and run the same operating system as the hosts in the Login Farm.

As a user of this system, the Slurm commands below may be useful for you:

  • sinfo(1): to get information about Slurm nodes and partitions

  • sbatch(1): to submit a batch script for execution in a compute node

  • srun(1): to run parallel jobs

  • squeue(1): to get information about jobs located in the Slurm scheduling queue

  • scancel(1): to signal or cancel Slurm jobs, job arrays or job steps

  • slurm.conf(5): to get detailed information on the Slurm cluster configuration

To retrieve specific information about one of those commands, connect to the Login Farm and do:

man 1 sinfo

The version of Slurm can be retrieved by using the command:

sinfo --version
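
For instance, to follow your own jobs once they are submitted, you can use commands like the ones below (the job identifier is illustrative):

squeue -u $USER    # list your pending and running jobs
scancel 123456     # cancel one of your jobs, identified by its job id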

The compute nodes managed by Slurm are grouped into partitions, logical groups of nodes (see this Slurm Quick Start Guide). The partitions currently defined in the farm can be retrieved via the command sinfo, and you can also browse them here. In particular, the partition lsst is devoted to executing jobs submitted by members of the group lsst. To use this partition, we recommend submitting your jobs with a command similar to:

sbatch --partition=lsst,htc  my_script.sh

or, alternatively, specifying the partitions in your script:

#!/bin/bash

#SBATCH  --partition=lsst,htc
...

and then submitting your job with:

sbatch my_script.sh

This mechanism tells Slurm to prefer the compute nodes in the lsst partition for executing your job and, if there is no capacity available in that partition, to use the htc partition.
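
For reference, a more complete batch script could look like the sketch below; the job name, resource requests and payload command are placeholders to adapt to your own workload:

#!/bin/bash

#SBATCH --partition=lsst,htc      # prefer the lsst partition, fall back to htc
#SBATCH --job-name=my-analysis    # illustrative job name
#SBATCH --time=02:00:00           # requested wall-clock time
#SBATCH --mem=4G                  # requested memory
#SBATCH --output=%x-%j.log        # log file named after the job name and the job id

# illustrative payload: replace with your own commands
srun ./my_program --input my_data.txt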

You may find it instructive to read an overview of CC-IN2P3’s computing platform as well as some job submission examples.

Some of these Slurm tutorials, as well as the man pages of the Slurm commands, may be useful for you as well.

Data Storage and File Systems

As a user of CC-IN2P3, you can use several storage areas for your data. Selecting one of those areas for your needs depends on several criteria, such as the intended use of the data, their lifetime, desired accessibility, etc.

Please make sure you also read How to decide where to store my data to get familiar with our recommended practices for data storage.

Home directory: $HOME

Your account is configured with a home directory pointed to by the environment variable $HOME. That storage space is accessed via the NFS protocol and has a quota of a few tens of gigabytes.

You are strongly encouraged not to hardcode the path to your home directory and instead to use $HOME (or, alternatively, ~) in your scripts, including your batch jobs and shell profiles.
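
For example, a script fragment like the one below (the directory and file names are illustrative) keeps working regardless of the actual location of your home directory:

output_dir="$HOME/results"    # portable reference to a directory in your home
mkdir -p "$output_dir"
cp my_results.txt "$output_dir/"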

Your home directory is accessible in read/write mode from all the hosts in both the Login Farm and the Batch Farm. In other words, a file created in your home directory during an interactive session will be visible to your jobs executing on a compute node, using the same file path.

Files you store under your home directory are backed up, so they may be retrieved in case of accidental deletion. There is no purge policy associated with your home directory, but the storage quota is relatively low.

To manage the access rights to your files and directories with fine granularity, you can use the commands nfs4_getfacl(1) and nfs4_setfacl(1) which allow you to get and set access control entries, respectively. An introduction to how NFS access control lists work can be found using the command

man 5 nfs4_acl

Additional details on how to set access control lists to your data are available in the Manage ACLs section of the user documentation.
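
As an illustration, the commands below display the access control list of a directory and grant read access to a colleague. The directory name and the principal colleague@in2p3.fr are hypothetical; the exact principal syntax in use at CC-IN2P3 is described in the Manage ACLs section:

nfs4_getfacl "$HOME/shared"                                # display the current access control list
nfs4_setfacl -a A::colleague@in2p3.fr:RX "$HOME/shared"    # add an entry allowing read and directory traversal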

Intended use: source code, executables, configuration files, documents, etc. Optimized for small- to medium-sized files. It is not intended for storing the output of your application runs.

Shared group area: /pbs/throng/lsst

This storage space, which has a higher quota than $HOME, is intended for storing and sharing small- to medium-sized files (e.g. software, code, scripts, documents, etc.) with other members of the group, but it is not intended for large datasets. Although the namespace and usage of this area are managed, you have your own directory under /pbs/throng/lsst/users. The name of your subdirectory is identical to your user name (i.e. the output of the whoami command).
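
For instance, you can check your personal subdirectory and create a directory there to share material with the group (the subdirectory name below is illustrative):

ls -ld /pbs/throng/lsst/users/$(whoami)                     # your personal area in the shared group space
mkdir -p /pbs/throng/lsst/users/$(whoami)/shared-scripts    # illustrative subdirectory for shared material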

There is no purge policy associated with this area and it is backed up, so it may be possible to recover files after accidental deletion. This area is visible from hosts in both the Login Farm and the Batch Farm. The quota for this space is a few terabytes for the entire group and can be adjusted to meet usage needs.

You can manage the access rights to your files and directories under /pbs/throng/lsst by using NFS access control lists, in the same way as for your $HOME. Use the commands nfs4_getfacl(1) and nfs4_setfacl(1) to get and set access control entries, respectively. An introduction to how NFS access control lists work can be found via man 5 nfs4_acl. Additional details on how to set access control lists on your data are available in the Manage ACLs section of the user documentation.

Intended use: source code, executables, configuration files, documents, etc., in particular if you want to share them with other members of the group. Optimized for small- to medium-sized files. This space is not intended for storing datasets produced by your application runs.

Shared group area (large datasets): /sps/lsst

This is a high-performance, high-volume storage area (managed by the Ceph file system) used for storing large volumes of data, such as image datasets and catalogs (see Datasets: /sps/lsst/datasets). Like your home directory, it is accessible from nodes in both the Login Farm and the Batch Farm using the same path.

Warning

⚠️ Unlike your $HOME directory, /sps/lsst is not backed up. Although it is configured to be resilient to several kinds of hardware failure, if you accidentally remove data from this area it is very unlikely that the CC-IN2P3 helpdesk will be able to recover it.

The quota of this area for the group is in the hundreds of terabytes and is regularly adjusted to meet usage needs.

Intended use: data files, large data sets, both for your individual usage and for usage by other members of the group (see Individual user data: /sps/lsst/users and Topical group data: /sps/lsst/groups). It is optimized for high performance I/O of medium- to large-sized files. This space is ideal for storing the output of your application runs.

Individual user data: /sps/lsst/users

You can store data for your individual use in a subdirectory of /sps/lsst/users named after your login name (i.e. the output of the execution of the whoami command or the contents of the $USER environment variable). To retrieve the quota and utilisation of your individual area use the command:

/usr/bin/spsquotalist /sps/lsst/users/$USER

You can use your individual storage area under /sps/lsst/users to share data with your collaborators (see How to share data with your collaborators). For further details about this storage area please see How to decide where to store my data.

Topical group data: /sps/lsst/groups

Data of interest for a topical group composed of members of the lsst group can be stored in this area. See How to decide where to store my data for details.

Datasets: /sps/lsst/datasets

This storage area contains data (e.g. images, reference catalogs, calibration data, simulated data, tabular data) typically used as inputs for data processing or analysis tasks performed at CC-IN2P3 by members of the group. The permissions of those datasets are set read-only for most members of the group.

Data products: /sps/lsst/dataproducts

This storage area contains data produced or being produced by organized campaigns performed at CC-IN2P3. A few production accounts have write permissions for this area.

Interactive working area: /scratch

This storage area is useful for storing data while working in interactive sessions in the machines in the Login Farm. It is visible only from the hosts in the Login Farm: your jobs running in the Batch Farm won’t be able to access files stored under /scratch.

Our recommended practice is to create a subdirectory for your own needs named after your login name (i.e. the output of the whoami command), for instance via the command below:

mkdir -p /scratch/$(whoami)

Please note that this area is not intended for permanently storing data: files in this area are regularly purged based on usage criteria and space availability, although they typically persist beyond a single interactive work session.

Batch job working area: $TMPDIR

Jobs submitted for execution in the Batch Farm can use a temporary storage area. This area is local to the compute node where your job executes (i.e. it is allocated on the compute node’s local disks) and is volatile: the files and directories you create there will disappear when your job finishes its execution.

Our recommended practice is to use this compute node’s local storage area for storing files produced by your job. As your job’s execution progresses, and before it finishes, you can copy the data you want to keep to other storage areas intended for permanent storage. Similarly, if your job needs to repeatedly read some files, you may want to consider copying them into the job’s temporary storage and processing them locally.

The path to this area may be unique for each job and is configured by the batch system which manages the execution of your job: use the value of the environment variable $TMPDIR to access it.
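
The sketch below illustrates that pattern in a batch script; the input, output and program names are placeholders:

#!/bin/bash

#SBATCH --partition=lsst,htc

# copy the input data to the node-local, job-private scratch area
cp /sps/lsst/users/$USER/input.fits "$TMPDIR/"
cd "$TMPDIR"

# illustrative processing step, assumed to be available in your $PATH
my_program input.fits -o output.fits

# copy back only the results worth keeping before the job ends
mkdir -p /sps/lsst/users/$USER/results
cp output.fits /sps/lsst/users/$USER/results/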

Archival storage

CC-IN2P3 uses HPSS for data archival purposes. Data in this area are physically stored on magnetic tape cartridges in an automated tape library. You can find more information in the Mass storage section of the CC-IN2P3 User Documentation.

Summary: overview of available storage areas

Overview of the storage areas available for LSST users.

Storage area       Visibility              File system type         Backed-up   Quota                      Purge policy
$HOME              login and batch farms   NFS                      yes         a few tens of GB           no
/pbs/throng/lsst   login and batch farms   NFS                      yes         a few TB                   no
/sps/lsst          login and batch farms   CephFS                   no          hundreds of TB             no
/scratch           login farm only         NFS                      no          N/A                        yes
$TMPDIR            compute node only       local (typically XFS)    no          a few tens of GB per job   yes

See also the Monitoring and Dashboards section for getting additional information about those areas.

Software

Software deployed by CC-IN2P3 staff

General-purpose software tools such as file editors, typesetting tools, compilers, interpreters, source code management tools, etc. are available for you to use on hosts in both the Login Farm and the Batch Farm. They are installed and maintained by CC-IN2P3 staff.

You will find more information on what packages are available, and how to activate a specific version of one of them, in the software section of the CC-IN2P3 user documentation.

Software deployed by members of the lsst group

Software deployments under the responsibility of members of the lsst group are available under

/pbs/throng/lsst/software

There you will find a subdirectory for each available software package or software environment. Generally speaking, the owner of each directory is the maintainer of the deployment unless stated otherwise in the README file inside each directory.

Your contribution is welcome if you are willing to install, and commit to maintaining, software packages that other members of the group can rely on. Before proceeding, you are encouraged to discuss your intentions with other members and ask for guidance via Instant messaging.

DESC software environment

The LSST Dark Energy Science Collaboration (DESC) software environment is installed under

/pbs/throng/lsst/software/desc

See How to activate the DESC software environment for details on how to use it.

LSST science pipelines

The LSST science pipelines are installed under

/cvmfs/sw.lsst.eu

See How to use the LSST Science Pipelines for a description of what you can find in that directory and details on how to use the software.
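
For example, activating a given release of the pipelines from that directory typically looks like the sketch below; the platform directory and the release tag are placeholders, so please refer to the how-to above for the actual layout and the currently recommended versions:

# illustrative paths: adapt the platform directory and release tag to what is actually deployed
source /cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/w_20XX_YY/loadLSST.bash
setup lsst_distrib    # make the science pipelines available in the current shell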

Monitoring and Dashboards

Several tools are provided for monitoring your individual activity and the activity of the members of the LSST community at CC-IN2P3.

User Portal

You can get an overall view of your individual activity by visiting

https://portail.cc.in2p3.fr/?langage=en

You will be presented with a login page like the one below:

[Image: CC-IN2P3 single sign-on login page (sso.png)]

Once authenticated with your individual credentials (the same user name and password that you use for logging in to the Login Farm), you will be presented with information about your own batch jobs and those of your group, the storage capacity you are using, the computing resources requested by the groups you belong to, how to contact the help desk, and pointers to several other services provided by CC-IN2P3 that you, as a member of LSST, can use (e.g. Indico, GitLab, Atrium, etc.).

Operations Dashboard

You can get an overall view of the computing activity generated at CC-IN2P3 by members of LSST by visiting

https://mon.lsst.eu

You will be presented with a dialog similar to:

[Image: sign-in dialog of the operations dashboard (signin.png)]

Click on the Sign in with OAuth button in that dialog and you will be taken to the single sign-on dialog, where you can provide your credentials (the same user name and password that you use for logging in to the Login Farm). If you have logged in recently enough, you may not be asked to provide your credentials again.

Once authenticated, you will be presented with an interactive dashboard like the one below:

[Image: interactive dashboard at mon.lsst.eu (mon-lsst-eu.png)]

You will be able to drill down and get specific real-time information about the activities related to storage, batch processing, data exchange and the catalog database.

Storage usage

In this section you can find ways to get information about the usage of storage areas used by LSST.

Storage usage /sps/lsst

The table below presents the locations where you can find information on the storage used per individual in the area under /sps/lsst:

Storage area   Usage details…
/sps/lsst      …per group member
               …per top-level directory

This information is refreshed once per day: see the top right corner of the target page for the time it was last updated.

The image below presents the storage allocated vs. the storage used, for several ranges of file ages, for files under /sps/lsst. Click on the image to explore other visualizations via the pull-down menus at the top of the target page: you will be able to select the specific metric and the timeframe you want to get information about.

https://ccspsmon.in2p3.fr/filestats/lsst-sfre-yearly.png

Evolution of storage allocation and usage for /sps/lsst. Click on the image to get more details.

Storage usage /pbs/throng/lsst

The figure below shows the allocated (blue) vs used (green) storage under /pbs/throng/lsst. Click on the image to get more detailed information.

http://cctools.in2p3.fr/mrtguser/mrtguser/lsst/lsst_pbs-day.png