cctbx.xray.structure_factors.from_scatterers_direct_parallel module
- class cctbx.xray.structure_factors.from_scatterers_direct_parallel.direct_summation_cuda_platform(float_t='double')
Bases: direct_summation_simple
- validate_platform_resources()
- class cctbx.xray.structure_factors.from_scatterers_direct_parallel.direct_summation_simple
Bases: object
- use_alt_parallel = True
- class cctbx.xray.structure_factors.from_scatterers_direct_parallel.fcalc_container(fcalc)
Bases: object
- f_calc()
- class cctbx.xray.structure_factors.from_scatterers_direct_parallel.pprocess(instance, algorithm, verbose=False)
Bases: object
Class to represent a worker process
Software requirements for running direct summation on CUDA parallel processing units:
numpy package (http://numpy.scipy.org), version 1.5.1 OK
pycuda package, version 2011.1 OK
Suggested install scheme for pycuda on Linux (not necessarily exact):
../base/bin/python configure.py --cuda-root=/usr/common/usg/cuda/3.2 \
  --cudadrv-lib-dir=/usr/common/usg/nvidia-driver-util/3.2/lib64 \
  --boost-inc-dir=$HOME/boostbuild/include \
  --boost-lib-dir=$HOME/boostbuild/lib \
  --boost-python-libname=boost_python \
  --boost-thread-libname=boost_thread
make install
gcc 4.4.2 or higher is required for the Linux build of pycuda 2011.1.
boost_adaptbx.boost.python is required for pycuda; the cctbx-installed version is probably OK but has not been tested. Tests were performed with a separately installed boost:
cd boost_1_45_0
./bootstrap.sh --prefix=$HOME/boostbuild --libdir=$HOME/boostbuild/lib \
  --with-python=$HOME/build/base/bin/python \
  --with-libraries=signals,thread,python
./bjam variant=release link=shared install
A separately installed CUDA 3.2 is required for pycuda 2011.1 (http://developer.nvidia.com/object/cuda_3_2_downloads.html)
- Suggested hardware as tested:
Nvidia Tesla C2050 (Fermi, compute capability 2.0): 225-fold performance improvement over CPU.
Nvidia Tesla C1060 (compute capability 1.3): 24-fold performance improvement.
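Before attempting a CUDA run, it can help to confirm which of the Python packages named in the software requirements above are actually installed. The following is a minimal sketch using only the standard library; the helper names installed_version and report_requirements are hypothetical and are not part of cctbx. It only reports versions and does not check them against the versions quoted above.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version string of a package, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


def report_requirements(packages=("numpy", "pycuda")):
    """Map each package name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in packages}


if __name__ == "__main__":
    # Print one line per required package.
    for name, version in report_requirements().items():
        print(name, version if version is not None else "not installed")
```

A None entry means the package is not visible to the interpreter that ran the check, so the check should be run with the same Python that cctbx uses.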
- prepare_gaussians_symmetries_cell(algorithm)
- prepare_miller_arrays_for_cuda(algorithm, verbose=False)
The number of Miller indices and the number of atoms must each be an exact multiple of BLOCKSIZE (32), so both arrays are zero-padded up to the next multiple.
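The padding rule can be illustrated with a short, self-contained sketch. This is not the cctbx implementation: the helper names padded_length and zero_pad are hypothetical, plain Python lists stand in for the arrays, and the only value taken from the documentation is the block size of 32.

```python
BLOCKSIZE = 32  # block size quoted in the docstring above


def padded_length(n, blocksize=BLOCKSIZE):
    """Smallest multiple of blocksize that is greater than or equal to n."""
    return ((n + blocksize - 1) // blocksize) * blocksize


def zero_pad(items, zero, blocksize=BLOCKSIZE):
    """Return a new list extended with copies of `zero` up to a block multiple."""
    target = padded_length(len(items), blocksize)
    return list(items) + [zero] * (target - len(items))


# 40 hypothetical Miller indices (h, 0, 0); real data would come from a Miller array.
miller_indices = [(h, 0, 0) for h in range(1, 41)]
padded = zero_pad(miller_indices, (0, 0, 0))
print(len(miller_indices), len(padded))  # prints "40 64"
```

A length that is already a multiple of 32 is left unchanged, so padding adds at most 31 entries per array.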
- prepare_scattering_sites_for_cuda(algorithm, verbose=False)
- print_diagnostics()
- validate_the_inputs(manager, cuda, algorithm)