juropa.fz-juelich.de (Intel Xeon, Infiniband, MKL)

Information about the system can be found at https://www.fz-juelich.de/jsc/juropa.

NumPy is installed system-wide, so a separate installation is not needed.
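To check which NumPy the system Python provides, a quick sanity check is:

# Print the version and location of the system-wide NumPy installation.
import numpy
print(numpy.__version__)
print(numpy.__file__)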

Building GPAW with gcc

Build GPAW using gcc with the configuration file customize_juropa_gcc.py:

scalapack = True

library_dirs += ['/opt/intel/Compiler/11.0/074/mkl/lib/em64t']
libraries = ['mkl_intel_lp64', 'mkl_sequential', 'mkl_core',
             'mkl_lapack',
             'mkl_scalapack_lp64', 'mkl_blacs_intelmpi_lp64',
             'pthread']

define_macros += [('GPAW_NO_UNDERSCORE_CBLACS', '1')]
define_macros += [('GPAW_NO_UNDERSCORE_CSCALAPACK', '1')]
define_macros += [("GPAW_ASYNC",1)]

and by executing:

module unload parastation/intel
module load parastation/gcc

python setup.py install --prefix='' --home=MY_INSTALLATION_DIR
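To verify the installation, a minimal sketch (run with the freshly built gpaw-python, with MY_INSTALLATION_DIR on PYTHONPATH) is:

# Check that GPAW is imported from the new installation directory
# and that the parallel interface is available.
import gpaw
import gpaw.mpi as mpi
print(gpaw.__file__)
print('MPI tasks:', mpi.world.size)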

Building GPAW with Intel compiler

Use the compiler wrapper file icc.py:

#!/usr/bin/python
"""icc.py is a wrapper for the Intel compiler,
   converting/removing incompatible gcc args.   """

import sys
from subprocess import call

args2change = {"-fno-strict-aliasing":"",
               "-fmessage-length=0":"",
               "-Wall":"",
               "-std=c99":"-qlanglvl=extc99",
               "-fPIC":"",
               "-g":"",
               "-D_FORTIFY_SOURCE=2":"",
               "-DNDEBUG":"",
               "-UNDEBUG":"",
               "-pthread":"",
               "-shared":"-qmkshrobj",
               "-Xlinker":"",
               "-export-dynamic":"",
               "-Wstrict-prototypes":"",
               "-dynamic":"",
               "-O3":"",
               "-O3":"",
               "-O2":"",
               "-O1":""}

fragile_files = ["test.c"]

cmd = ""
fragile = False
for arg in sys.argv[1:]:
    cmd += " "
    t = arg.strip()
    if t in fragile_files:
        fragile = True
    if t in args2change:
        cmd += args2change[t]
    else:
        cmd += arg

flags = "-w -O3 -std=c99"
cmd = f"mpicc {flags} {cmd}"

call(cmd, shell=True)

Internal libxc

Before svn revision 10429, libxc was internal; the corresponding configuration file is customize_juropa_icc.py:

compiler = './icc.py'
mpicompiler = './icc.py'
mpilinker = 'MPICH_CC=gcc mpicc'

scalapack = True

library_dirs += ['/opt/intel/Compiler/11.0/074/mkl/lib/em64t']
libraries = ['mkl_intel_lp64', 'mkl_sequential', 'mkl_core',
             'mkl_lapack',
             'mkl_scalapack_lp64', 'mkl_blacs_intelmpi_lp64',
             'pthread']

define_macros += [('GPAW_NO_UNDERSCORE_CBLACS', '1')]
define_macros += [('GPAW_NO_UNDERSCORE_CSCALAPACK', '1')]
define_macros += [("GPAW_ASYNC",1)]

External libxc

After svn revision 10429, libxc has to be included as an external library (see also the libxc web site). To install libxc, we assume that MYLIBXCDIR is set to the directory where you want to install it:

$ cd $MYLIBXCDIR
$ wget http://www.tddft.org/programs/octopus/down.php?file=libxc/libxc-2.0.2.tar.gz -O libxc-2.0.2.tar.gz
$ tar -xzvf libxc-2.0.2.tar.gz
$ cd libxc-2.0.2/
$ mkdir install
$ ./configure CFLAGS="-fPIC" --prefix=$PWD/install --enable-shared
$ make | tee make.log
$ make install

This installs the libraries to $MYLIBXCDIR/libxc-2.0.2/install/lib and the C header files to $MYLIBXCDIR/libxc-2.0.2/install/include.
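As a quick check that the shared library was built correctly, it can be loaded with ctypes (a sketch, assuming MYLIBXCDIR is exported in the environment):

# Try to dlopen the freshly built libxc shared library; the path follows
# the $MYLIBXCDIR layout described above.
import ctypes
import os

libdir = os.path.join(os.environ['MYLIBXCDIR'], 'libxc-2.0.2', 'install', 'lib')
ctypes.CDLL(os.path.join(libdir, 'libxc.so'))
print('libxc loaded from', libdir)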

We have to modify the file customize.py according to customize_juropa_icc_libxc.py:

compiler = './icc.py'
mpicompiler = './icc.py'
mpilinker = 'MPICH_CC=gcc mpicc'

scalapack = True

library_dirs += ['/opt/intel/Compiler/11.0/074/mkl/lib/em64t']
libraries = ['mkl_intel_lp64', 'mkl_sequential', 'mkl_core',
             'mkl_lapack',
             'mkl_scalapack_lp64', 'mkl_blacs_intelmpi_lp64',
             'pthread']

libraries += ['xc']
# change this to your installation directory
LIBXCDIR='/lustre/jhome5/hfr04/hfr047/gridpaw/libxc-2.0.2/install/'
library_dirs += [LIBXCDIR + 'lib']
include_dirs += [LIBXCDIR + 'include']

define_macros += [('GPAW_NO_UNDERSCORE_CBLACS', '1')]
define_macros += [('GPAW_NO_UNDERSCORE_CSCALAPACK', '1')]
define_macros += [("GPAW_ASYNC",1)]

Note that the external libxc library has to be found at runtime, which requires setting:

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$MYLIBXCDIR/libxc-2.0.2/install/lib
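A small sketch to confirm from Python that the directory is actually picked up (again assuming MYLIBXCDIR is exported):

# Check that the libxc lib directory is on LD_LIBRARY_PATH.
import os

libdir = os.path.join(os.environ['MYLIBXCDIR'], 'libxc-2.0.2', 'install', 'lib')
paths = os.environ.get('LD_LIBRARY_PATH', '').split(':')
print('libxc on LD_LIBRARY_PATH:', libdir in paths)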

Compiling

Now the default parastation/intel module is used, so execute only:

python setup.py install --prefix='' --home=MY_INSTALLATION_DIR
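After the libxc-enabled build, a short smoke test (a sketch, run with the new gpaw-python and with LD_LIBRARY_PATH set as described above) is to import GPAW and construct an exchange-correlation functional:

# With the external-libxc build, the _gpaw extension is linked against
# libxc, so this only works if libxc.so is found at runtime.
from gpaw.xc import XC

xc = XC('PBE')
print(xc.name)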

Execution

General execution instructions can be found at https://www.fz-juelich.de/jsc/juropa/usage/quick-intro.

Example batch job script for GPAW (512 cores, 30 minutes):

#!/bin/bash -x
#MSUB -l nodes=64:ppn=8
#MSUB -l walltime=0:30:00

cd $PBS_O_WORKDIR
export PYTHONPATH="MY_INSTALLATION_DIR/ase/lib64/python"
export PYTHONPATH="$PYTHONPATH":"MY_INSTALLATION_DIR/gpaw/svn/lib64/python"
export GPAW_SETUP_PATH=SETUP_DIR/gpaw-setups-0.5.3574
export GPAW_PYTHON=MY_INSTALLATION_DIR/bin/gpaw-python

export PSP_ONDEMAND=1

mpiexec -np 512 -x $GPAW_PYTHON my_input.py

Note that the -x flag for mpiexec is needed to export the environment variables to the MPI tasks. The environment variable PSP_ONDEMAND can decrease the running time by almost a factor of two at large process counts!
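Here my_input.py is a placeholder for your actual GPAW script; a minimal example (a sketch, not tuned for 512 cores) could be:

# Minimal GPAW calculation, used here only as a stand-in for my_input.py.
from ase import Atoms
from gpaw import GPAW

atoms = Atoms('H2', positions=[(0, 0, 0), (0, 0, 0.74)])
atoms.center(vacuum=3.0)
atoms.set_calculator(GPAW(h=0.2, txt='h2.txt'))
print(atoms.get_potential_energy())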

Job scripts can also be written using:

gpaw-runscript -h

Simultaneous Multi-Threading

SMT can be used to virtually double the number of cores per node. However, a test case did not show any improvement in performance:

#cores    t [s]    SMT    date
64        2484     no     9.5.2011
64        2438     no     16.5.2011
128       1081     no     16.5.2011
64        4812     yes    16.5.2011
128       2077     yes    16.5.2011

SMT can be switched on in gpaw-runscript via:

gpaw-runscript -s