BwHPC BPG Data Management
1 Local File Systems
In addition to computing capacity, the bwHPC clusters are equipped with parallel file systems. For local data management it is important to distinguish whether data is frequently used and persistent, or whether quick access during a job's lifetime is decisive.
Each registered user is provided with a $HOME directory in the parallel file system. Files stored in this directory are secured by regular backups, but quick access from the compute nodes is not possible. For data that is read or written during a job's lifetime, additional temporary storage without backup is available. Since the implementation varies between the bwHPC clusters, please visit the sites of bwUniCluster or bwForCluster JUSTUS for details.
|Directory|Characteristics|Kind of Data|
|---|---|---|
|$HOME|with backup, limited, global file system|software packages, configuration files, important results, ...|
|Workspaces, $WORK, ...|quick access, limited, temporary, global file system|input/output files|
|$TMPDIR, $TMP|local file system, temporary, limited to the batch job's lifetime|intermediate results|
As a matter of principle, the following rule should be observed: Do not compute in $HOME!
Disk space, like all HPC resources, is limited. If local disk space is not sufficient, external storage services like bwFileStorage can be used.
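The rule above can be sketched as a job-script body: stage input to node-local scratch ($TMPDIR), compute there, and copy only the final results back. File names and the computation step are placeholders; $TMPDIR falls back to /tmp here so the sketch also runs outside a batch job.

```shell
RESULTS="$PWD"                             # where final results should land (stands in for a $HOME subdirectory)
SCRATCH="${TMPDIR:-/tmp}/job_$$"           # node-local scratch; $TMPDIR is set by the batch system
mkdir -p "$SCRATCH"
echo "demo input" > "$SCRATCH/input.dat"   # in a real job: cp "$HOME/input.dat" "$SCRATCH/"
cd "$SCRATCH"
tr a-z A-Z < input.dat > output.dat        # stand-in for the actual computation
cp output.dat "$RESULTS/"                  # copy only the final results back
cd "$RESULTS"
rm -rf "$SCRATCH"                          # clean up local scratch
```

In an actual batch job the same pattern sits between the scheduler header and the job's end; intermediate files never touch $HOME.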
2 External Storage
Every user of the bwHPC clusters can use the storage service bwFileStorage. Since authentication and authorization are implemented via bwIDM mechanisms, group memberships on the bwHPC clusters and on bwFileStorage are identical.
In general, data transfer between bwFileStorage and the bwHPC clusters can be performed with customary transfer tools like scp, sftp or rsync. On bwUniCluster, bwFileStorage is prototypically mounted via dedicated hardware, so-called data mover nodes, and data transfer is executed remotely from the login nodes via the user interface rdata.
bwFileStorage serves not only as central storage between the bwHPC systems but also between local workstations and the bwHPC clusters. On workstations, bwFileStorage can be mounted via tools like sshfs or cifs (the latter only on KIT workstations), so pre- and post-processing of computing data can be performed locally.
3 Data Transfer
Transferring a few large files achieves higher throughput than transferring many small ones. It is therefore recommended to collect files into a compressed archive with tools like zip, tar, xz or others before transfer.
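For example, many small output files can be bundled into a single compressed archive before transfer (directory and file names here are illustrative):

```shell
# Collect many small files into one compressed archive before transfer.
mkdir -p outputs
for i in 1 2 3; do echo "data $i" > "outputs/part_$i.txt"; done
tar czf outputs.tar.gz outputs/        # one large file transfers faster than many small ones
tar tzf outputs.tar.gz                 # list archive contents to verify
```

After the transfer, the archive is unpacked on the target system with `tar xzf outputs.tar.gz`.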
3.1 Transfer Tools
|Type|Software|Remark|
|---|---|---|
|Command line tool|scp|throughput < 150 MB/s (depending on cipher)|
|Command line tool|rdata|throughput up to 350-400 MB/s|
|Client|WinSCP|based on SCP/SFTP, Windows only|
|Client|FileZilla|based on SFTP, availability depending on workstation's OS|
An extended list of tools can be found here.
|Cluster|Host name|
|---|---|
|bwForCluster MLS&WISO Production|bwfor.cluster.uni-mannheim.de|
|bwForCluster MLS&WISO Production|bwforcluster.bwservices.uni-heidelberg.de|