BwForCluster MLS&WISO Production Filesystems

From bwHPC Wiki

1 File Systems

There are two separate storage systems, one for $HOME and one for workspaces. Both use the parallel file system BeeGFS. Additionally, each compute node provides high-speed temporary storage on the node-local solid state disk via the $TMPDIR environment variable.

             $HOME       Workspaces           $TMPDIR
Visibility   global      global               node local
Lifetime     permanent   workspace lifetime   batch job walltime
Capacity     36 TB       384 TB               128 GB per node (9 TB per fat node)
Quotas       100 GB      none                 none
Backup       no          no                   no
  • global: all nodes access the same file system.
  • node local: each node has its own file system.
  • permanent: files are stored permanently.
  • workspace lifetime: files are removed at end of workspace lifetime.
  • batch job walltime: files are removed at end of the batch job.

1.1 $HOME

Home directories are meant for permanent storage of files that are used repeatedly, such as source code, configuration files, and executable programs. There is currently no backup for the home directory. The disk space per user is limited to 100 GB. The used disk space is displayed with the command:

homequotainfo

1.2 Workspaces

Workspace tools can be used to get temporary space for larger amounts of data necessary for or produced by running jobs. To create a workspace you need to supply a name for the workspace and a lifetime in days. The maximum lifetime is 90 days. It is possible to extend the lifetime when needed.

Command              Action
ws_allocate foo 10   Allocate a workspace named foo for 10 days.
ws_list -a           List all your workspaces.
ws_find foo          Get absolute path of workspace foo.
ws_extend foo 5      Extend lifetime of workspace foo by 5 days from now.
ws_release foo       Manually erase your workspace foo. Please remove content first.
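
The workspace commands can be combined in a small helper that reuses an existing workspace or allocates a new one. The following is a minimal sketch, not an official tool: the function name ensure_workspace is made up for illustration, and the sketch assumes that ws_find prints nothing when the workspace does not exist.

```shell
#!/bin/bash
# Sketch: print the path of a workspace, allocating it first if needed.
# ensure_workspace is a hypothetical helper; ws_find and ws_allocate are
# the workspace commands described in this section.
# Assumption: ws_find prints nothing if the workspace does not exist.

ensure_workspace() {
    local name="$1" days="$2" path

    # Look for an existing workspace with this name.
    path="$(ws_find "$name" 2>/dev/null)"

    if [ -z "$path" ]; then
        # Not found: allocate it, then ask for its absolute path.
        ws_allocate "$name" "$days" >/dev/null || return 1
        path="$(ws_find "$name")"
    fi

    [ -n "$path" ] || return 1
    printf '%s\n' "$path"
}

# Example use (values are placeholders):
#   WORKSPACE="$(ensure_workspace foo 10)" && cd "$WORKSPACE"
```

Looking the path up with ws_find instead of hard-coding it keeps scripts independent of where the workspace filesystem is mounted.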

If you plan to produce or copy large amounts of data in workspaces, please check the available disk space first. The used and free disk space on the workspace filesystem is displayed with the command:

workquotainfo

1.3 $TMPDIR

The variable $TMPDIR points to the local disk space on the compute nodes. Each node has its own $TMPDIR on a local SSD; as listed in the table above, the capacity is 128 GB per node and 9 TB per fat node. The data in $TMPDIR become unavailable as soon as the job has finished.
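
Because $TMPDIR is node local and is cleaned up after the job, a common pattern is to copy input data to $TMPDIR at the start of a job, compute there, and copy the results back to a workspace before the job ends. The following is a hedged sketch of that pattern: the function name run_in_tmpdir and the convention that the command writes its output into a results/ subdirectory are assumptions made for this example, and no scheduler directives are shown.

```shell
#!/bin/bash
# Sketch: stage input to node-local $TMPDIR, run a command there,
# and copy the results back to a global directory (e.g. a workspace).
# run_in_tmpdir and the results/ convention are hypothetical.
#
# Usage: run_in_tmpdir INPUT_DIR RESULT_DIR COMMAND [ARGS...]

run_in_tmpdir() {
    local input_dir="$1" result_dir="$2"
    shift 2

    if [ -z "${TMPDIR:-}" ]; then
        echo "TMPDIR is not set" >&2
        return 1
    fi

    # Private scratch directory on the node-local disk.
    local scratch
    scratch="$(mktemp -d "$TMPDIR/scratch.XXXXXX")" || return 1
    mkdir -p "$scratch/results" "$result_dir" || return 1

    # Stage input data (including hidden files) into the scratch directory.
    cp -r "$input_dir"/. "$scratch"/ || return 1

    # Run the command inside the scratch directory.
    ( cd "$scratch" && "$@" )
    local status=$?

    # Copy results back before the job ends and $TMPDIR is cleaned up.
    cp -r "$scratch/results"/. "$result_dir"/ || return 1
    return "$status"
}
```

In a jobscript, INPUT_DIR and RESULT_DIR would typically be directories inside a workspace, for example below the path printed by ws_find.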

2 Access to SDS@hd

It is possible to access your storage space on SDS@hd directly on the bwForCluster MLS&WISO. You can access your SDS@hd directory with a valid Kerberos ticket on all compute nodes except standard nodes. Kerberos tickets are obtained and renewed on the data mover nodes data1 and data2. Before a Kerberos ticket expires, a notification is sent by e-mail.

2.1 Direct access in compute jobs

  • Log in to a data mover node, e.g. ssh data1 (passwordless)
  • Fetch a Kerberos ticket with command: kinit (use your SDS@hd service password)
  • Prepare your jobscript to use the directory /mnt/sds-hd/<your-sv-acronym>
  • Submit your job from the login node
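
Since access depends on a valid Kerberos ticket, it can help to let the jobscript check the SDS@hd directory before starting any work. The following is a minimal sketch under that assumption: the function name check_sds_dir is hypothetical, and the path keeps the placeholder <your-sv-acronym>, which has to be replaced by your own value.

```shell
#!/bin/bash
# Sketch: fail early in a jobscript if the SDS@hd directory is not
# accessible, e.g. because no valid Kerberos ticket is available.
# check_sds_dir is a hypothetical helper, not a cluster command.

check_sds_dir() {
    local dir="$1"

    # The directory must exist and be readable and searchable.
    if [ -d "$dir" ] && [ -r "$dir" ] && [ -x "$dir" ]; then
        return 0
    fi

    echo "Cannot access $dir: check your Kerberos ticket (kinit on data1 or data2)" >&2
    return 1
}

# Example use in a jobscript (replace the placeholder first):
#   check_sds_dir "/mnt/sds-hd/<your-sv-acronym>" || exit 1
```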

2.2 Copying data on data mover node

Certain workflows or I/O patterns may require the transfer of data from SDS@hd to a workspace on the cluster before submitting jobs. Data transfers are possible on the data mover nodes data1 and data2:

  • Log in to a data mover node with command: ssh data1 (passwordless)
  • Fetch a Kerberos ticket with command: kinit (use your SDS@hd service password)
  • Find your SDS@hd directory in /mnt/sds-hd/
  • Copy data between your SDS@hd directory and your workspaces
  • Destroy your Kerberos ticket with command: kdestroy
  • Log out of data1 for further work on the cluster
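
The steps above can be sketched as a small helper to be run on data1 or data2 after logging in. This is an illustration only: the function name sds_to_workspace is made up, cp -r is just one way to copy the data, and kinit will prompt interactively for the SDS@hd service password.

```shell
#!/bin/bash
# Sketch: copy a directory from SDS@hd into a workspace on a data mover
# node, then destroy the Kerberos ticket again.
# sds_to_workspace is a hypothetical helper; kinit, kdestroy and ws_find
# are the commands described on this page.
#
# Usage: sds_to_workspace SDS_DIR WORKSPACE_NAME

sds_to_workspace() {
    local sds_dir="$1" ws_name="$2" ws_path

    # Resolve the absolute path of the target workspace.
    ws_path="$(ws_find "$ws_name")" || return 1
    [ -n "$ws_path" ] || return 1

    # Fetch a Kerberos ticket (asks for the SDS@hd service password).
    kinit || return 1

    # Copy the directory contents, including hidden files.
    cp -r "$sds_dir"/. "$ws_path"/
    local status=$?

    # Destroy the ticket even if the copy failed.
    kdestroy
    return "$status"
}

# Example use (replace the placeholder first):
#   sds_to_workspace "/mnt/sds-hd/<your-sv-acronym>" foo
```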