Working Environment at CC-IN2P3

LSST-France uses the computing resources and services provided by the IN2P3 computing center (CC-IN2P3). You are invited to read the CC-IN2P3 user documentation at:

https://doc.cc.in2p3.fr

Here you will find an overview of the CC-IN2P3 working environment, relevant for your activities as a member of the LSST-France community.

Overview

The CC-IN2P3 computing environment is made up of several components that you can use:

  • a login farm, where you can connect and do interactive work
  • a batch farm, where you can submit jobs for asynchronous execution in CPU- or GPU-equipped compute nodes
  • several data storage services, where you can access data and software, and store your own

Below you can find more details on each of those components.

How to Get Help

You can get help by asking other LSST-France members via instant messaging (see Instant messaging). In addition, for questions related to computer operations, passwords, account support, group membership, storage quotas, etc. you can contact the CC-IN2P3 help desk at:

https://cc-usersupport.in2p3.fr

Account Setup

To use any of the computing resources allocated to LSST-France at CC-IN2P3 you need an individual account.

If you don’t have an account yet, please follow these instructions to apply for one.

If you already have an account and want to be a member of the group lsst, one of the LSST-France representatives with authority to approve your membership must formally request the modification via a ticket. To get information on who the current representatives for the lsst group are, connect to the Login Farm and type the command:

$ laboinfo -g lsst -p compte

See How to Get Help for details on how those representatives must submit their request. Information on the administrative organization of user accounts at CC-IN2P3 in laboratories and groups can be found in the generic end-user documentation, where you can also find details on the roles and responsibilities of those representatives.

Important

As a user of CC-IN2P3 computing center services you are expected to comply with its charter.

Operations Status

You can get a live overview of the status of CC-IN2P3 computing services at:

http://cctools.in2p3.fr/services_status

There are 4 scheduled maintenance periods per year (about one per quarter), each one of them typically lasting 24 hours or less. The calendar of those periods is established one year in advance. To be informed when those periods are scheduled, configure your ICS-compatible calendar application to subscribe to the feed:

https://zimbra.in2p3.fr/home/usr6402@cc.in2p3.fr/MAINTENANCE%20CC.ics

Alternatively, you can regularly download this file into your calendar application. More details about scheduled outages can be found here.
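
If you prefer, you can also fetch the calendar file manually from the command line and import it into your calendar application. A minimal sketch, assuming curl is available on your machine:

# Download the maintenance calendar (ICS) for manual import
$ curl -o maintenance-cc.ics \
    'https://zimbra.in2p3.fr/home/usr6402@cc.in2p3.fr/MAINTENANCE%20CC.ics'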

During each maintenance period, a potentially different subset of the computing services is impacted. To be informed about the details of each outage, scheduled or otherwise, you may want to subscribe to the very low-traffic mailing list USERS-CC-L by sending an e-mail to listserv@in2p3.fr with the following message body:

subscribe USERS-CC-L Your_Given_Name Your_Last_Name

You won’t be able to post to this list but will receive all announcements sent by the people in charge of CC-IN2P3 support and operations.

(See also the Monitoring and Dashboards section.)

Login Farm

The login farm is composed of a set of computers where you can connect via ssh using your individual user credentials. All the computers in this farm can access the data in your $HOME or in any other storage space (see Data Storage and File Systems).

You can use the computers in the login farm to perform interactive work (e.g. editing files, writing documents, compiling code, etc.) and to submit jobs to the batch farm (see Batch Farm).

The computers in the login farm used by LSST-France members run the CentOS 7 distribution of the Linux operating system. To connect to a computer in the login farm, use the generic host name cca.in2p3.fr (technically, it is a DNS alias), for instance:

$ ssh cca.in2p3.fr

Your connection request made through that generic host name will be dynamically directed to a concrete host (say cca001.in2p3.fr), depending on parameters such as the number of connected users and the load of each host in the farm at the time of your request. You are therefore encouraged to use the generic host name instead of the name of a specific host.

See the Customizing your SSH client tutorial for detailed information on how to configure your SSH client for secure, passwordless connection to the login farm.
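
For illustration only (the tutorial above is the authoritative reference), an entry in your ~/.ssh/config could look like the sketch below. The GSSAPI options assume you authenticate with Kerberos tickets; the alias name cca and the user name your_login are placeholders to adapt:

# Excerpt of ~/.ssh/config -- a sketch, not the official recommendation
Host cca
    HostName cca.in2p3.fr
    User your_login
    # Assumes Kerberos/GSSAPI authentication; omit if you use another method
    GSSAPIAuthentication yes
    GSSAPIDelegateCredentials yes

With such an entry in place, ssh cca connects you to the login farm.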

You may also be interested in learning about Executing LSST-enabled JupyterLab Notebooks.

Batch Farm

The batch farm is composed of several hundred hosts devoted to compute- and data-intensive tasks executed mostly asynchronously.

CC-IN2P3 uses the Univa Grid Engine batch system. Some of the Grid Engine commands you may find useful are:

  • qsub: to submit a batch job to Univa Grid Engine
  • qlogin: to start an interactive login session
  • qstat: to show the status of jobs and queues
  • qacct: to report and account for Univa Grid Engine usage
  • qdel: to delete jobs from queues
  • qhold: to hold back jobs from execution
  • qrls: to release jobs from previous hold states
  • qalter: to modify a pending or running batch job
  • qresub: to submit a copy of an existing job
  • qmod: to modify a queue, running job or job class
  • qquota: to show current usage of resource quotas
  • qhost: to show the status of hosts, queues, jobs

To retrieve specific information about each one of them, connect to the Login Farm and then do:

$ man qsub

In addition, you may find it useful to read an overview of the computing platform and more detailed information on how to submit jobs.
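
As a minimal illustration (the pages linked above are the authoritative reference), a batch job can be described in a small submission script and handed to qsub. The job name, log location and run-time limit below are placeholders to adapt to your needs and to the resources actually configured at CC-IN2P3:

#!/bin/bash
# my_job.sh -- a sketch of a Grid Engine submission script
#$ -N my_first_job           # job name (placeholder)
#$ -j y                      # merge standard error into standard output
#$ -o $HOME/                 # directory where the job's log will be written
#$ -l h_rt=01:00:00          # requested wall-clock time: 1 hour

echo "Running on $(hostname) at $(date)"

You would then submit the job and follow its status with:

$ qsub my_job.sh
$ qstat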

The batch farm is configured using execution queues, each with its own characteristics. You may also want to browse the hardware configuration and operating system of the compute nodes in the farm.

Unfortunately, because of unfathomable restrictions imposed by Univa, we are not allowed to make the end-user documentation available via a public web page. Therefore, to access the documentation of the currently deployed version of Univa Grid Engine, you first need to connect to the login farm and then look for the available documents in the directory /opt/sge/doc.

Finally, to know which version of GridEngine is currently in production, do:

$ cat /opt/sge/uge-version

Data Storage and File Systems

As a user of CC-IN2P3, you can use several storage areas for your data. Selecting one of those areas for your needs depends on several criteria such as the intended use of the data, their lifetime, desired accessibility, etc.

Home directory: $HOME

Your account is configured with a home directory pointed to by the environment variable $HOME. That storage space is managed by NFS and has a quota of a few tens of gigabytes.

You are strongly encouraged to not hardcode the path to your home directory and instead use $HOME (or, alternatively ~) in your scripts, including your batch jobs and shell profiles.

Your home directory is accessible in read/write mode from all the hosts in both the Login Farm and the Batch Farm. In other words, a file created in your home directory during an interactive session will be visible to your jobs when they execute on a compute node, using the same file path.

Files you store under your home directory are backed up, so they may be retrieved in case of accidental deletion. There is no purge policy associated with your home directory, but the storage quota is somewhat low.

To manage the access rights to your files and directories with fine granularity, you can use the commands nfs4_getfacl(1) and nfs4_setfacl(1) which allow you to get and set access control entries, respectively. An introduction to how NFS access control lists work can be found using the command

$ man 5 nfs4_acl

Additional details on how to set access control lists to your data are available in the Manage ACLs section of the user documentation.
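
For instance, granting a colleague read access to a directory under your $HOME could look like the sketch below; the user name jdoe and the domain in2p3.fr are placeholders, and the exact principal format to use is described in the man pages and in the Manage ACLs section:

# Grant the (hypothetical) user jdoe read and traverse access, then verify
$ nfs4_setfacl -a "A::jdoe@in2p3.fr:RX" $HOME/shared_dir
$ nfs4_getfacl $HOME/shared_dir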

Intended use: source code, executables, configuration files, documents, etc. Optimized for small- to medium-sized files. It is not intended for storing the output of your application runs.

Shared group area: /pbs/throng/lsst

This storage space, with a higher quota than $HOME, is intended for storing and sharing small- to medium-sized files (e.g. software, code, scripts, documents, etc.) but not for large datasets. Although the namespace and usage of this area are controlled, you have your own directory under /pbs/throng/lsst/users. The name of your subdirectory is identical to your user name (i.e. the output of the whoami command).

There is no purge policy associated with this area and it is backed up, so it may be possible to recover files after accidental deletion. This area is visible from hosts in both the Login Farm and the Batch Farm. The quota for this space is measured in hundreds of gigabytes for the entire group and can be adjusted to meet usage needs.

You can manage the access rights to your files and directories under /pbs/throng/lsst by using NFS access control lists, in the same way as for your $HOME. Use the commands nfs4_getfacl(1) and nfs4_setfacl(1) to get and set access control entries, respectively. An introduction to how NFS access control lists work can be found via man 5 nfs4_acl. Additional details on how to set access control lists on your data are available in the Manage ACLs section of the user documentation.

Intended use: source code, executables, configuration files, documents, etc., in particular if you want to share them with other members of the group. Optimized for small- to medium-sized files. This space is not intended for storing the output of your application runs.

Shared group area (large datasets): /sps/lsst

This is a high performance, high volume storage area (managed by GPFS) used for storing large volumes of data, such as image datasets and catalogs (see Datasets). Like your home directory, it is accessible from nodes in both the login and batch farms using the same path.

Warning

Unlike your $HOME directory, /sps/lsst is not backed up. Although it is configured to be resilient to several kinds of hardware failures, if you accidentally remove some data from this area, it is very unlikely that the CC-IN2P3 help desk can do anything to retrieve it.

The quota of this area for the group is measured in the hundreds of terabytes and is adjusted to meet usage needs.

Intended use: data files and large datasets, both for your individual usage and for usage by other members of the group. Optimized for high-performance I/O of medium- to large-sized files. This space is ideal for storing the output of your application runs. Recommended practice for storing your individual data is to use the existing subdirectory under /sps/lsst/users named after your login name (i.e. the output of the whoami command). Files intended for use by the whole group are stored in other subdirectories.

You can use your individual storage area under /sps/lsst/users to share data with your collaborators (see How to share data with your collaborators).
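
For example, a minimal sketch of copying a run's output into your individual area (my_output.fits is just an illustrative file name; the per-user directory already exists, as noted above):

$ cp my_output.fits /sps/lsst/users/$(whoami)/
$ ls -l /sps/lsst/users/$(whoami)/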

Interactive working area: /scratch

This storage area is useful for storing data while working in interactive sessions on the machines in the Login Farm. It is visible only from the hosts in the login farm: your batch jobs won't be able to access files stored in this area.

Recommended practice is to create a subdirectory for your own needs named after your login name (i.e. the output of the whoami command). Please note that this area is not intended for permanent data storage: files in this area are regularly purged based on usage criteria and space availability, but they typically last beyond a single interactive work session.
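
For instance, a sketch of following that practice:

# Create your personal subdirectory under /scratch (if not already there)
$ mkdir -p /scratch/$(whoami)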

Batch job working area: $TMPDIR

Jobs submitted for execution in the Batch Farm can use a temporary storage area. This area is local to the compute node where your job executes (i.e. it is allocated on the compute node's local disks) and is volatile: the files and directories you create there will disappear when your job finishes its execution.

Recommended practice is to use this node-local storage area for storing files produced by your job. As your job's execution progresses, and before it finishes, you can copy the data you want to keep to other storage areas intended for permanent storage. Similarly, if your job needs to repeatedly read some files, consider copying those files into your job's temporary storage area and processing them locally.

The path to this area is unique for each job and is prepared by the batch system. Recommended practice is to access it via the environment variable $TMPDIR.
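
Inside a job script, that pattern could be sketched as follows; the paths under /sps/lsst and the process command are illustrative only:

# Stage input data into the node-local, job-private area ...
cp /sps/lsst/users/$(whoami)/input.dat "$TMPDIR"/
cd "$TMPDIR"
# ... work locally (placeholder for your actual application) ...
./process input.dat -o result.dat
# ... and copy the results you want to keep to permanent storage
cp result.dat /sps/lsst/users/$(whoami)/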

Archival storage

CC-IN2P3 uses HPSS for data archival purposes. Data on this area are physically stored on magnetic tape cartridges in an automated tape library. You can find more information in the Mass storage section of the CC-IN2P3 User Documentation.

Summary: overview of available storage areas

Overview of the storage areas available for LSST users.
Storage area     | Visibility            | File system type                  | Backed-up | Quota                    | Purge policy
$HOME            | login and batch farms | NFS                               | yes       | a few tens of GB         | no
/pbs/throng/lsst | login and batch farms | NFS                               | yes       | a few hundreds of GB     | no
/sps/lsst        | login and batch farms | GPFS                              | no        | hundreds of TB           | no
/scratch         | login farm only       | NFS                               | no        | N/A                      | yes
$TMPDIR          | compute node only     | local file system (typically XFS) | no        | a few tens of GB per job | yes

(See also the Monitoring and Dashboards section.)

Software

General purpose software tools such as file editors, compilers, interpreters, source code management tools, etc. are available for you to use on hosts in both the Login Farm and the Batch Farm. You can find more information on what software is available, and how to activate a specific version of one of the available packages, in the software section of the CC-IN2P3 user documentation.

Below you will find information on where to find, and how to use, software specific to LSST.

Location

LSST-specific software, including the LSST software pipelines, colloquially known as the stack, is available on hosts at CC-IN2P3 under the path:

/cvmfs/sw.lsst.eu

There you will find a subdirectory for each supported platform (e.g. linux-x86_64 and darwin-x86_64), and for each platform there are LSST package distributions, for instance lsst_distrib and lsst_sims. To execute the LSST software on CC-IN2P3 nodes, you may want to use the software available under /cvmfs/sw.lsst.eu/linux-x86_64, which is the one built for execution on nodes running Linux.

lsst_distrib

This is the name of the distribution of the LSST software which contains the science pipelines, also known as the stack.

In the directory /cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib you will find several releases, both stable and weekly. Stable release names start with v (e.g. v19.0.0) and weekly release names start with w (e.g. w_2020_18). All the available releases are configured for Python 3 only.
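
For instance, you can list the currently installed releases directly:

$ ls /cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib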

Each of the installed releases is independent, self-consistent and almost self-contained: the Python interpreter, the set of Python packages and shared libraries each release depends on are all contained in its installation directory. However, tools such as the C++ runtime library or the git command are included only for releases w_2020_18 or later.

For your convenience, some Python packages not required by the LSST software framework are also included in each installed release. They are installed via the conda command, which means that the installed versions of those packages are the ones available via the mechanisms used by conda. For instance, packages such as Jupyter and its dependencies are included. See Quick start guide for getting the list of installed Python packages for a given release.

The LSST software is built and deployed following the official instructions, on a host running the CentOS 7 operating system, using the version of the C/C++ compiler recommended by the project for each particular release; the recommended compiler may change from release to release.

Warning

Only a limited set of weekly releases is available at any given moment, typically those published by the project over the last 3 months. This means that older weekly releases are removed without prior notice. You are therefore strongly encouraged to use a recent release for your own needs. To know what modifications a given release includes, you can consult the Rubin Science Pipelines Changelog.

lsst_sims

This is the distribution which includes the software packages for producing simulated LSST images. The available releases of this distribution can be found under the directory /cvmfs/sw.lsst.eu/linux-x86_64/lsst_sims.

As is the case for lsst_distrib, each installed release of lsst_sims is independent and as self-contained as possible. The naming scheme for each release of lsst_sims follows the project's naming convention and is therefore different from that of lsst_distrib, but it should be self-explanatory.

Tip

On your personal computer or virtual machine you can use exactly the same LSST software packages as you use when working on CC-IN2P3 nodes, which may be useful for reproducibility purposes. You can find more details on how to configure your personal computer to do so at sw.lsst.eu.

Quick start guide

To use a given release of the software in lsst_distrib, you must first log in to a host in the CC-IN2P3 login farm running CentOS 7:

$ ssh cca.in2p3.fr

and then, assuming you want to use w_2020_18 and your shell is bash, do:

$ source /cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/w_2020_18/loadLSST.bash

This will bootstrap your environment to use the specified release. As a result of executing this command, some environment variables are extended or initialized, such as PATH, PYTHONPATH, LD_LIBRARY_PATH and EUPS_PATH.

The LSST software uses EUPS for managing the set of packages which are part of a given release. EUPS allows you to select the packages you need to use in a work session. For instance, to use the command line tasks for processing CFHT images, you can do:

$ setup obs_cfht
$ setup pipe_tasks

After these steps, your working environment is modified so that you can use the command line tasks (e.g. ingestImages.py, processCcd.py, etc.) and import Python modules in your own programs (e.g. import lsst.daf.persistence).
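
To check which products are currently set up in your session, you can ask EUPS directly (a quick sketch; -s is the short form of --setup):

$ eups list -s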

If you need to work with a different release of the stack, say w_2020_20, you must create a new terminal session and bootstrap your environment with the desired release. For instance:

# In a new terminal session
$ source /cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/w_2020_20/loadLSST.bash
$ setup obs_cfht
$ setup pipe_tasks
$ processCcd.py --help

To retrieve the list of Python packages included in release w_2020_20, do:

$ source /cvmfs/sw.lsst.eu/linux-x86_64/lsst_distrib/w_2020_20/loadLSST.bash
$ conda list

Usage of lsst_sims is analogous to that of lsst_distrib. Please refer to the Tutorials Overview section for more detailed information on how to use the LSST stack.

Datasets

Images

Image datasets are available under

/sps/lsst/datasets

For instance, this is the structure of the hsc dataset:

$ tree -L 2 /sps/lsst/datasets/hsc
/sps/lsst/datasets/hsc
├── calib
│   ├── 20160419
│   └── 20170105
└── raw
    ├── README.txt
    └── ssp_pdr1

5 directories, 1 file

There is usually a large number of files in each directory, so using the ls command to inspect their contents may take a few minutes. For your convenience, a README.txt file is available with some context information about the provenance of the image dataset and, in particular, the list of files in the directory as produced by the ls -al command.
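
For example, to consult that file for the hsc dataset shown above (its exact location within other datasets may differ):

$ less /sps/lsst/datasets/hsc/raw/README.txt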

Reference catalogs

Reference catalogs can be found under:

/sps/lsst/datasets/refcats

For instance, at the time of this writing, the contents of that directory are:

$ tree -L 2 /sps/lsst/datasets/refcats
/sps/lsst/datasets/refcats
|-- gaia
|   `-- gaia_DR1_v1
|-- htm_baseline
|   |-- gaia -> ../gaia/gaia_DR1_v1
|   |-- gaia_DR1_v1 -> ../gaia/gaia_DR1_v1
|   |-- pan-starrs -> ../pan-starrs/ps1_pv3_3pi_20170110
|   |-- ps1_pv3_3pi_20170110 -> ../pan-starrs/ps1_pv3_3pi_20170110
|   |-- sdss -> ../sdss/sdss-dr9-fink-v5b
|   `-- sdss-dr9-fink-v5b -> ../sdss/sdss-dr9-fink-v5b
|-- pan-starrs
|   `-- ps1_pv3_3pi_20170110
`-- sdss
    `-- sdss-dr9-fink-v5b

13 directories, 0 files

Monitoring and Dashboards

Several tools are provided for monitoring your individual activity and the activity of the members of LSST at CC-IN2P3.

User Portal

You can get an overall view of your individual activity by visiting

https://portail.cc.in2p3.fr/?langage=en

You will be presented with a login page like the one below:

[Image: portal.png]

Once authenticated with your individual credentials (the same user name and password that you use for logging on to the Login Farm), you will be presented with information about your own batch jobs and those of your group, the storage capacity you are using, the computing resources requested by the groups you belong to, how to contact the help desk, and pointers to several other services provided by CC-IN2P3 that you, as a member of LSST, can use (e.g. Indico, gitlab, Atrium, etc.).

Operations Dashboard

You can get an overall view of the computing activity generated at CC-IN2P3 by members of LSST by visiting

https://mon.lsst.eu

You will be presented with a dialog similar to:

[Image: signin.png]

Make sure you select “CCIN2P3/CNRS User” in the drop-down menu and then type in your individual credentials, that is, the same user name and password that you use for logging on to the Login Farm.

Alternatively, if you have a personal certificate issued by one of the CNRS certification authorities, you will be presented with a dialog box containing a button which allows you to identify yourself using your personal certificate.

Once authenticated, you will be presented with an interactive dashboard like the one below:

[Image: mon-lsst-eu.png]

You will be able to drill down and get specific real-time information about the activities related to storage, batch processing, data exchange and the catalog database.

Storage usage

This section describes where you can find usage information for the storage areas used by LSST.

Storage usage: /sps/lsst and /sps/lssttest

The table below presents the locations where you can find information about the storage used per individual for the areas under /sps/lsst and /sps/lssttest:

Storage area  | Usage details
/sps/lsst     | per group member, per top-level directory
/sps/lssttest | per group member, per top-level directory

This information is refreshed daily: see the top right corner of the target page for when it was last updated.

The images below present the storage allocated vs the storage used for several ranges of file ages, for both /sps/lsst and /sps/lssttest. Click on one of the images below to explore other visualizations via the pull-down menus at the top of the target page: you will be able to select the specific metric and the timeframe you want to get information about.

https://ccspsmon.in2p3.fr/filestats/lsst-sfre-yearly.png

Evolution of storage allocation and usage for /sps/lsst. Click on the image to get more details.

https://ccspsmon.in2p3.fr/filestats/lssttest-sfre-yearly.png

Evolution of storage allocation and usage for /sps/lssttest. Click on the image to get more details.

Storage usage: /pbs/throng/lsst

The figure below shows the allocated (blue) vs used (green) storage under /pbs/throng/lsst. Click on the image to get more detailed information.

http://cctools.in2p3.fr/mrtguser/mrtguser/lsst/lsst_pbs-day.png