SUNCAT
(NOTE: With MKL 10.3 we have seen hangs in the early MPI reduce calls for a small number of calculations. Until I have understood this, I am backing out to MKL 10.2.)
At SLAC we compiled GPAW for RHEL5 x86_64 on Intel Xeon 5650 CPUs with the Intel compilers and MKL. This improved the 8-core performance benchmark by 13% compared to the opencc/ACML approach.
| Package | Version |
|---|---|
| python | 2.4 |
| gpaw | 0.8.0.7419 |
| ase | 3.5.0.1919 |
| numpy | 1.4.1 |
| openmpi | 1.4.3 |
| mkl | 10.3 |
| intel compilers | 11.1 (includes mkl 10.2 by default) |
openmpi
openmpi was built with the Intel compilers as follows:
$ ./configure --prefix=/nfs/slac/g/suncatfs/sw/gpawv15/install CC=icc CXX=icpc F77=ifort FC=ifort
$ make
$ make install
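As a quick sanity check (a sketch, assuming the install prefix from the configure line above), you can verify that the resulting wrappers really sit on top of the Intel compilers:
$ /nfs/slac/g/suncatfs/sw/gpawv15/install/bin/mpicc --showme   # should print an icc command line
$ /nfs/slac/g/suncatfs/sw/gpawv15/install/bin/ompi_info | grep -i compiler   # should report icc/icpc/ifort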
numpy
Build in the usual fashion. At the moment we use the default GNU compilers for numpy, since the gpaw performance benchmark drops by 3% when numpy is built with icc/mkl/dotblas, for reasons that are not understood. Also, some gpaw self-tests start to fail with that combination.
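For reference, a minimal sketch of such a plain GNU build (the prefix matches the numpy paths used elsewhere on this page; the source directory name is illustrative):
$ cd numpy-1.4.1
$ python setup.py build
$ python setup.py install --prefix=/nfs/slac/g/suncatfs/sw/external/numpy/1.4.1/install
$ python -c "import numpy; numpy.show_config()"   # no mkl entries should appear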
gpaw
For this we use customize_mkl10.3.py:
scalapack = False
compiler = 'icc'
libraries = ['mkl_rt', 'pthread', 'm']
library_dirs = ['/nfs/slac/g/suncatfs/sw/external/intel11.1/openmpi/1.4.3/install/lib',
                '/afs/slac/package/intel_tools/2011u8/mkl/lib/intel64/']
include_dirs += ['/nfs/slac/g/suncatfs/sw/external/numpy/1.4.1/install/lib64/python2.4/site-packages/numpy/core/include']
extra_link_args += ['-fPIC']
extra_compile_args = ['-I/afs/slac/package/intel_tools/2011u8/mkl/include',
                      '-xHOST', '-O1', '-ipo', '-no-prec-div', '-static',
                      '-std=c99', '-fPIC']
mpicompiler = 'mpicc'
mpilinker = mpicompiler
Note that this customize.py works only with MKL version 10.3, which supports simplified linking through the single mkl_rt runtime library.
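A hedged sketch of the corresponding gpaw build step (we assume the file is copied to customize.py, which setup.py reads; the source directory name is illustrative, and the install option should produce the lib64/python layout referenced in the environment settings below):
$ cd gpaw-0.8.0.7419
$ cp customize_mkl10.3.py customize.py
$ python setup.py build_ext
$ python setup.py install --home=${GPAW_HOME}/install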
The environment settings (valid at SUNCAT) needed to link and run:
#!/bin/bash
export EXTERNALDIR=/nfs/slac/g/suncatfs/sw/external
export NUMPYDIR=${EXTERNALDIR}/numpy/1.4.1/install/lib64/python2.4/site-packages
export SCIPYDIR=${EXTERNALDIR}/scipy/0.7.0/install/lib64/python2.4/site-packages
export ASEBASE=${EXTERNALDIR}/ase/3.5.0.1919/install
export ASEDIR=${ASEBASE}/lib/python2.4/site-packages
export INTELDIR=/afs/slac/package/intel_tools/2011u8
export MKLDIR=${INTELDIR}/mkl/lib/intel64
export OPENMPIDIR=${EXTERNALDIR}/intel11.1/openmpi/1.4.3/install
export MKL_THREADING_LAYER=MKL_THREADING_SEQUENTIAL
export OMP_NUM_THREADS=1
export INSTALLDIR=${GPAW_HOME}/install
export PYTHONPATH=${ASEDIR}:${SCIPYDIR}:${NUMPYDIR}:${INSTALLDIR}/lib64/python
export PATH=/bin:/usr/bin:${OPENMPIDIR}/bin:${INTELDIR}/bin:${INSTALLDIR}/bin:${ASEBASE}/bin
export LD_LIBRARY_PATH=${INSTALLDIR}/lib:${MKLDIR}:${INTELDIR}/lib/intel64:${OPENMPIDIR}/lib:${MKLDIR}/../32
export GPAW_SETUP_PATH=${EXTERNALDIR}/gpaw-setups-0.6.6300
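With these settings sourced, a quick parallel smoke test looks like this (a sketch; the script name env.sh is just an example for the block above):
$ source env.sh
$ mpirun -np 8 gpaw-python -c "from gpaw.mpi import world; print world.rank, world.size"   # should report size 8 on every rank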
MKL 10.2 Notes
For historical reasons, we also include the customize.py for MKL 10.2:
scalapack = False
compiler = 'icc'
libraries = ['mkl_intel_lp64', 'mkl_sequential', 'mkl_cdft_core', 'mkl_core',
             'pthread', 'm']
library_dirs = ['/nfs/slac/g/suncatfs/sw/external/intel11.1/openmpi/1.4.3/install/lib',
                '/afs/slac/package/intel_tools/compiler11.1/mkl/lib/em64t/']
include_dirs += ['/nfs/slac/g/suncatfs/sw/external/numpy/1.4.1/install/lib64/python2.4/site-packages/numpy/core/include']
extra_link_args += ['-fPIC']
extra_compile_args = ['-I/afs/slac/package/intel_tools/compiler11.1/mkl/include',
                      '-xHOST', '-O1', '-ipo', '-no-prec-div', '-static',
                      '-std=c99', '-fPIC']
define_macros = [('GPAW_NO_UNDERSCORE_CBLACS', '1'),
                 ('GPAW_NO_UNDERSCORE_CSCALAPACK', '1')]
mpicompiler = 'mpicc'
mpilinker = mpicompiler
This older version requires a fairly bad hack to make it work in all cases:
$ setenv LD_PRELOAD libmkl_core.so:libmkl_sequential.so
I believe this is because python uses “dlopen” for shared libraries, which has trouble with the circular dependencies present in MKL 10.2.
This hack can cause (ignorable) error messages from unrelated commands such as “ping”, because such setuid binaries disallow LD_PRELOAD for security reasons.
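For bash users, a sketch of the equivalent workaround, plus a way to check that the MKL libraries resolve (the path to the _gpaw.so extension depends on your install layout):
$ export LD_PRELOAD=libmkl_core.so:libmkl_sequential.so
$ ldd ${INSTALLDIR}/lib64/python/_gpaw.so | grep mkl   # every MKL entry should resolve, none "not found"
$ unset LD_PRELOAD   # avoids the harmless warnings from setuid commands such as ping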