BwHPC BPG Data Management

1 Local File Systems

In addition to computing capacity, the bwHPC clusters are equipped with parallel file systems. For local data management, it is important to distinguish between data that is used frequently and must persist, and data for which fast access during a job's lifetime is decisive.

Each registered user is provided with a $HOME directory in the parallel file system. Files stored in this directory are secured by regular backups, but fast access from the compute nodes is not possible. For data that is read or written during a job's lifetime, additional storage without backup is made available temporarily. Since the implementation varies between the bwHPC clusters, please consult the pages of bwUniCluster or bwForCluster JUSTUS for details.

Directory              | Characteristics                                                | Kind of Data
$HOME                  | with backup, limited, global file system                       | software packages, configuration files, important results, ...
Workspaces, $WORK, ... | quick access, limited, temporary, global file system           | input/output files
$TMPDIR, $TMP          | local file system, temporarily limited to batch job's lifetime | intermediate results

As a matter of principle, the following rule should be observed: Do not compute in $HOME!
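
The rule can be illustrated with a minimal Python sketch. It assumes that the batch system sets $TMPDIR or $TMP for the job, as listed in the table above; the function names `scratch_dir` and `run_in_scratch` and the line-counting "computation" are hypothetical placeholders, not part of any bwHPC tooling. Input is staged into the job-local scratch area, the work happens there, and only the final result is copied back to a backed-up location such as $HOME.

```python
import os
import shutil
import tempfile
from pathlib import Path


def scratch_dir() -> Path:
    """Return the job-local scratch directory.

    Prefers $TMPDIR, then $TMP; falls back to the system default
    temporary directory when neither variable is set.
    """
    for var in ("TMPDIR", "TMP"):
        value = os.environ.get(var)
        if value:
            return Path(value)
    return Path(tempfile.gettempdir())


def run_in_scratch(input_file: Path, results_dir: Path) -> Path:
    """Stage input into scratch, compute there, copy the result back."""
    # Private working directory inside the scratch area.
    workdir = Path(tempfile.mkdtemp(prefix="job_", dir=str(scratch_dir())))
    try:
        staged = Path(shutil.copy2(input_file, workdir))

        # Placeholder computation (hypothetical): count the input lines.
        result = workdir / "result.txt"
        line_count = len(staged.read_text().splitlines())
        result.write_text(f"{line_count}\n")

        # Only the final result goes back to persistent storage.
        results_dir.mkdir(parents=True, exist_ok=True)
        return Path(shutil.copy2(result, results_dir))
    finally:
        # Scratch space is temporary; remove intermediate files.
        shutil.rmtree(workdir, ignore_errors=True)
```

A call such as `run_in_scratch(Path("input.txt"), Path.home() / "results")` would leave only `result.txt` in the results directory, while all intermediate files stay in scratch and are removed afterwards.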

Like all HPC resources, disk space is limited. If disk space is not sufficient, external storage services such as bwFileStorage can be used.

2 Data Transfer

Transferring a few large files achieves higher throughput than transferring many small files. It is therefore recommended to collect files into a compressed archive with tools such as zip, tar, xz or others before transfer.
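
As one possible way to follow this recommendation, the sketch below packs a directory of many small files into a single xz-compressed tar archive using Python's standard `tarfile` module. The function name `pack_directory` is a hypothetical helper chosen for illustration; the command line tools named above achieve the same result.

```python
import tarfile
from pathlib import Path


def pack_directory(source_dir: Path, archive_path: Path) -> int:
    """Pack source_dir into an xz-compressed tar archive.

    Returns the number of regular files stored in the archive.
    """
    with tarfile.open(archive_path, mode="w:xz") as archive:
        # arcname stores paths relative to the directory name
        # instead of the full path on the source machine.
        archive.add(source_dir, arcname=source_dir.name)

    # Re-open the archive to confirm what was actually written.
    with tarfile.open(archive_path, mode="r:xz") as archive:
        return sum(1 for member in archive.getmembers() if member.isfile())
```

For example, `pack_directory(Path("results"), Path("results.tar.xz"))` produces one file to transfer instead of every file under `results` individually.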

2.1 Transfer Tools

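Column groups for the "+" marks:
  Executable on:         Local°, bwUniCluster, bwForCluster, bwFileStorage
  Data transfer from/to: www, bwHPC cluster, bwFileStorage

Type              | Software  | Remark                                      | Executable on / Data transfer from/to
Command line tool | scp       | Throughput < 150 MB/s (depending on cipher) | + + + + + +
Command line tool | sftp      |                                             | + + + + + +
Command line tool | rsync     |                                             | + + + + + +
Command line tool | rdata     | Throughput to 350-400 MB/s                  | + +
Command line tool | wget      | Download only                               | + + + + + +
Client            | WinSCP    | based on SCP/SFTP, Windows only             | + + +
Client            | FileZilla | based on SFTP                               | + + +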

° Depending on workstation's OS.

An extended list of tools can be found here.
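
To show how one of the command line tools might be driven from a script, the following sketch builds an rsync call that uploads a single archive over SSH. The helper names `rsync_command` and `upload` are hypothetical, and the user name, host name and remote directory in the usage note are placeholders, since the actual host names depend on the cluster. By default the command is only printed, not executed.

```python
import shlex
import subprocess
from pathlib import Path


def rsync_command(archive: Path, user: str, host: str, remote_dir: str) -> list[str]:
    """Build an rsync call that uploads one archive over SSH."""
    return [
        "rsync",
        "--archive",   # preserve timestamps and permissions
        "--partial",   # keep partially transferred files so a retry can resume
        "--progress",  # show transfer progress
        str(archive),
        f"{user}@{host}:{remote_dir}/",
    ]


def upload(archive: Path, user: str, host: str, remote_dir: str,
           dry_run: bool = True) -> str:
    """Print the command; run it only when dry_run is False."""
    command = rsync_command(archive, user, host, remote_dir)
    printable = shlex.join(command)
    print(printable)
    if not dry_run:
        # check=True raises CalledProcessError if rsync exits with an error.
        subprocess.run(command, check=True)
    return printable
```

With placeholder values, `upload(Path("results.tar.xz"), "alice", "cluster.example.org", "/scratch/alice")` prints the rsync command for inspection; passing `dry_run=False` would execute it.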

2.2 Hosts

System Host
bwForCluster JUSTUS
bwForCluster MLS&WISO Production
bwForCluster MLS&WISO Production
bwFileStorage (SSH)