
MAP

Description | Content
module load | devel/ddt
Availability | bwUniCluster, BwForCluster_Chemistry
License | Floating or server locked licences for developers and users (see Allinea DDT licensing)
Citing | n/a
Links | Allinea MAP Homepage
Graphical Interface | Yes


1 Introduction

Allinea MAP is a profiling tool for C, C++ and F90 code and part of the Allinea Forge toolkit.
Allinea MAP provides the information a developer needs to identify and remove bottlenecks in a program. This helps to develop software which is faster and more scalable.

Time charts in the main window give a quick overview of memory usage, CPU vectorization, I/O, MPI and threading. MAP can identify the functions and source lines which consume most of the run time. An analysis of the CPU instructions shows where vectorization is used or required. MAP is also capable of detecting issues concerning the memory or I/O performance of an application, so that they can be addressed.
MAP also supports the analysis of programs using MPI or OpenMP. This helps to recognize problems with calculations or MPI communication as well as synchronisation performance problems with OpenMP or pthreads.
Profiling data can be stored for later use, so that different versions can be compared at any time during development. The typical overhead for analysis is said to be less than 5% and the analysis requires no additional instrumentation software. The Allinea MAP tool offers an intuitive yet powerful GUI to provide all these features.

See more information about MAP.

2 Versions and Availability

A list of versions currently available on all bwHPC-C5-Clusters can be obtained from the Cluster Information System (CIS).

On the command line interface of any bwHPC cluster you'll get a list of available versions by using the command
'module avail devel/ddt'.

$ : bwUniCluster
$ module avail devel/ddt
------------------------ /opt/bwhpc/common/modulefiles -------------------------
devel/ddt/4.2.1 devel/ddt/5.0.1


3 Loading

One way to create a profile with Allinea MAP is to generate the profiling data file first and load it with the tool afterwards.
'map -profile [program-name]'
This call analyses the given program, writes the profile result to a so-called map file and then quits. In this mode MAP does not use the graphical interface.

An existing map file can be read in using the Allinea MAP tool.
'map [map-name]'

Another possibility is to pass the program to be analysed directly as an argument to the tool.
'map [program-name]'
This call combines both steps. First the program is analysed and a map file is created. Then this profile file is read in and the graphical interface is opened.

If a program using MPI communication is to be examined, the call via mpirun can be omitted in the latest versions. Allinea MAP recognizes that MPI is used and handles the call accordingly.
Use 'map -profile -n [number] [program-name]' or 'map -n [number] [program-name]' instead of 'map mpirun -n [number] [program-name]'.

[program-name] is the name of the program that you want to profile, [map-name] is the name of an existing map file and [number] is the number of MPI processes.

If the programs are compiled with debugging options (e.g. -g for the gcc compiler), MAP is able to connect the generated profiling information with the debugging information. For this Allinea MAP does not need the source code. It interacts directly with the executable.

To use Allinea MAP it is necessary to load the required module first. Display the installed versions with the 'avail' parameter of the module command and then load the desired version. In the samples below version 4.2.1 is used, because this version is currently available on all bwHPC-C5-Clusters. The GUI might be different in other versions.

The following program to calculate pi is used as a sample. One version uses MPI, the other doesn't.
Pi calculation: pi.c

#include <stdlib.h>
#include <stdio.h>
#include <math.h>

int main(int argc, char* argv[])
{
        const double PI24 = 3.141592653589793238462643;
        int samples_n, c, hits;
        double x, y;

        if(argc != 2)
        {
                fprintf(stdout, "Expected exactly one argument: the number of samples.\n");
                fflush(stdout);
                return 1;
        }
        sscanf(argv[1], "%d", &samples_n);

        int n = samples_n;
        hits = 0;

        for(c = 0; c < n; c++)
        {
                x = ((double) random()) / ((double) RAND_MAX);
                y = ((double) random()) / ((double) RAND_MAX);
                if( ((x*x) + (y*y)) <= 1) hits++;
        }

        int hits_complete = hits;
        /* Multiply in floating point: hits_complete*4 overflows int for large sample counts. */
        double pi = (4.0 * hits_complete) / ((double) samples_n);

        fprintf(stdout, "Pi is approximately %18.16f\n", pi);
        fprintf(stdout, "Error is %18.16f\n", fabs(pi-PI24));

        return 0;
}

Pi calculation with MPI: pi_mpi.c

#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include "mpi.h"

int main(int argc, char* argv[])
{
        const double PI24 = 3.141592653589793238462643;
        int samples_n, n, c, hits, id, num_procs;
        double x, y;

        MPI_Init(&argc, &argv);
        MPI_Comm_size(MPI_COMM_WORLD, &num_procs);
        MPI_Comm_rank(MPI_COMM_WORLD, &id);

        if(argc != 2)
        {
                if(id == 0)
                {
                        fprintf(stdout, "Expected exactly one argument: the number of samples.\n");
                        fflush(stdout);
                }
                MPI_Abort(MPI_COMM_WORLD, 1);
        }
        sscanf(argv[1], "%d", &samples_n);

        MPI_Bcast(&samples_n, 1, MPI_INT, 0, MPI_COMM_WORLD);

        n = samples_n / num_procs;
        if(id == num_procs - 1) n += samples_n % num_procs;

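        /* Seed each rank differently; otherwise all ranks draw the same points. */
        srandom((unsigned int) id + 1);
        hits = 0;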
        for(c = 0; c < n; c++)
        {
                x = ((double) random()) / ((double) RAND_MAX);
                y = ((double) random()) / ((double) RAND_MAX);
                if( ((x*x) + (y*y)) <= 1.0) hits++;
        }

        int hits_complete = 0;
        MPI_Reduce(&hits, &hits_complete, 1, MPI_INT, MPI_SUM, 0, MPI_COMM_WORLD);
        /* Multiply in floating point: hits_complete*4 overflows int for large sample counts. */
        double pi = (4.0 * hits_complete) / ((double) samples_n);

        if(id == 0)
        {
                fprintf(stdout, "Pi is approximately %18.16f\n", pi);
                fprintf(stdout, "Error is %18.16f\n", fabs(pi-PI24));
        }

        MPI_Finalize();
        return 0;
}

Loading the necessary modules and compiling the samples:

# Loading the Allinea Forge toolkit. 
$ module avail devel/ddt
[...]
devel/ddt/4.2.1
$ module load devel/ddt/4.2.1
[...]
# For MPI programs.
$ module avail mpi/openmpi
[...] mpi/openmpi/1.10-gnu-5.2 [...]
$ module load mpi/openmpi/1.10-gnu-5.2
# Compiling the sample programs (-g adds debugging information, libraries such as -lm go after the source file)
$ gcc -g -o pi pi.c -lm
# The MPI version
$ mpicc -g -o pi_mpi pi_mpi.c -lm


One can now use one of the above possibilities to analyse the program.

$ map pi 1234567890
# or e. g.
$ map -n 14 pi_mpi 1234567890


3.1 GUI

3.1.1 Start Window

Map startwin.jpg
When Allinea MAP is started without an option or with a program name as parameter, this window opens. If no option was provided, the GUI is used to choose whether to read in a map file or to initiate a program call.
If a program call was selected or a program name was provided as parameter, the GUI offers options to be set or checked - as shown in this picture. This also includes parameters for MPI, OpenMP or the submission to a queue.

3.1.2 Main Window

Map mainwin.png
The main window opens after program analysis and the creation of a map file or when a map file was provided as parameter. This window has three main segments.
Map mainwin overview.jpg
The upper segment displays the program analysis as time charts. By default it shows memory usage, MPI calls and CPU floating-point instructions over time.
Map mainwin presets.jpg
There are several presets, which combine different settings. The described setting corresponds to the preset 'default'. To access the different presets, right click above the time charts. Every preset has additional information about a certain area of the time line.
With a right click the current time charts can be adjusted. It is possible to add or remove time charts by selecting the favoured chart.
Map mainwin memory.jpg
The preset 'Memory' shows additional information about memory access of the analysed program.
Map mainwin disk.jpg
The I/O preset presents an overview of read and write activities. I/O is a common cause of bottlenecks in a program. In the screenshot one can see that the samples did not load or store any kind of data.
Map mainwin cpu.jpg
Whereas the preset 'CPU Time' displays a summary showing how busy a CPU was at which point in time...
Map mainwin ins.jpg
...the 'CPU Instructions' preset depicts an overview based on instruction types, for example how many integer instructions were performed by the CPU at any moment.
Map mainwin mpi.jpg
Using the preset 'MPI' one can see the duration and number of MPI calls amongst other information concerning MPI.
Map mainwin source.jpg
If the analysed program is compiled with debugging information the middle section of the main window can be used to walk through the program code. At the start MAP highlights the line of code in which the program spent the most time.
Map mainwin function.jpg
Similar information can be seen in the lower segment of the main window in the tab 'Parallel Stack View'. This segment doesn't show the full source code. Instead the focus is on the time used by the program's functions. These are sorted by the amount of time they used. The function containing the code line in which the program spent the most time is listed first. The others follow in descending order.
Map mainwin output.jpg
This tab shows the output of the analysed program.