Parallel Processing and the Madrigal Database
The Madrigal database is used throughout the space science community to distribute data on Earth's upper atmosphere. Built into Madrigal is a derivation engine that can calculate many additional scientific parameters beyond those stored in the underlying data files. This extra dynamic computation, however, can slow the database's response when a user requests a large data set with many derived parameters. The derivation engine presently runs serially, but its underlying algorithm appears to lend itself to parallelization. The goal of this project is to investigate possible improvements to Madrigal's performance using an NVIDIA CUDA GPU. Because the Madrigal derivation engine is built with both Python and C, parallelization may be done either at the C level or at the Python level using PyCUDA.
Questions that might be answered:
What kind of speed benefits can be expected given a particular derivation method and CUDA hardware configuration?
How can Fortran derivation methods be handled?
Should the Madrigal derivation engine be reorganized to allow Python at the CUDA level?
Can a Madrigal installation take advantage of CUDA automatically?
Task list
Learn Python
Learn Eclipse/Subversion
Learn CUDA
Learn PyCUDA
Learn and install Madrigal
Simulate derivation methods using PyCUDA
Develop timing table for all derivation methods
Study ways to integrate CUDA into Madrigal:
at the C level
at the Python level
Notes
Subversion branch for CUDA development in OpenMadrigalSVN/branches/cuda created 2011-06-01 from trunk
Notes on Fermi:
location of pycuda: /usr/local/midas/baseline/2010/pycuda
location of cuda: /usr/local/midas/cuda
command to set python to use pycuda libraries: source /usr/local/midas/baseline/2010/setup.sh
Notes on CUDA:
When nvcc compiles CUDA code, it automatically #includes a number of CUDA header files; the math headers are among them.
All timing calls need to be made on the host before the computation is handed off to the GPU.
#defines work the same in CUDA as in C. When using #include, however, the header must be one supported by the CUDA compiler, not just the normal C compiler.
CUDA does not support the NAN float type.
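Since CUDA does not support the NAN float value, one possible workaround is to map NaN to a finite sentinel on the host before transferring data to the GPU, and to restore NaN in the results copied back. The following is a minimal sketch assuming float32 data; the sentinel value and function names are hypothetical, and Madrigal's actual missing-value convention would need to be checked.

```python
import numpy as np

# Hypothetical sentinel for missing data; Madrigal's actual missing-value
# convention may differ and would need to be confirmed.
MISSING = np.float32(1.0e38)

def encode_missing(data):
    """Replace NaN with a finite sentinel before copying an array to the GPU."""
    out = np.asarray(data, dtype=np.float32).copy()
    out[np.isnan(out)] = MISSING
    return out

def decode_missing(data):
    """Restore NaN in results copied back from the GPU."""
    out = np.asarray(data, dtype=np.float32).copy()
    out[out == MISSING] = np.nan
    return out
```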
pyCUDA Timing Tables:
A simple average and difference calculation, run using pyCUDA on a GPU as well as plain Python on a CPU:

Average and Difference

Number of Computations (10^n) | GPU Computation Time (sec) | CPU Computation Time (sec) |
---|---|---|
7 | 0.181 | 57.7 |
6 | 0.0160 | 5.85 |
5 | 0.00220 | 0.578 |
4 | 0.000950 | 0.0576 |
3 | 0.000907 | 0.00565 |
2 | 0.00863 | 0.000943 |
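The average-and-difference benchmark above could be sketched as follows. This is a minimal illustration, not the actual benchmark code: the kernel and function names are hypothetical, the GPU path requires PyCUDA and a CUDA-capable device, and the NumPy function stands in for the CPU reference.

```python
import numpy as np

# CUDA C source for an element-wise average-and-difference kernel
KERNEL_SOURCE = r"""
__global__ void avg_diff(float *a, float *b, float *avg, float *diff, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        avg[i]  = (a[i] + b[i]) / 2.0f;
        diff[i] = a[i] - b[i];
    }
}
"""

def avg_diff_cpu(a, b):
    """CPU reference: vectorized NumPy version of the same computation."""
    return (a + b) / 2.0, a - b

def avg_diff_gpu(a, b):
    """GPU version; requires PyCUDA and a CUDA-capable device."""
    import pycuda.autoinit                      # noqa: F401 (creates the context)
    import pycuda.driver as drv
    from pycuda.compiler import SourceModule

    kernel = SourceModule(KERNEL_SOURCE).get_function("avg_diff")
    avg = np.empty_like(a)
    diff = np.empty_like(a)
    block = 256
    grid = (a.size + block - 1) // block
    kernel(drv.In(a), drv.In(b), drv.Out(avg), drv.Out(diff), np.int32(a.size),
           block=(block, 1, 1), grid=(grid, 1))
    return avg, diff
```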
A more complex method from the Madrigal engine, convrt, originally written in Fortran (convrt.f):

convrt

Number of Computations (10^n) | pyCUDA Computation Time (sec) | Fortran Computation Time (sec) |
---|---|---|
7 | 0.481 | 19.5 |
6 | 0.0463 | 1.95 |
5 | 0.00552 | 0.194 |
4 | 0.00144 | 0.0195 |
3 | 0.00110 | 0.00195 |
2 | 0.00160 | 0.000340 |
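A timing harness along the following lines could produce tables like the two above. It is only a sketch: the function names are illustrative, and (per the CUDA notes) the clock is read on the host before and after the computation is handed off.

```python
import time
import numpy as np

def average_difference(a, b):
    """The simple benchmark computation from the first table (CPU version)."""
    return (a + b) / 2.0, a - b

def time_method(method, exponent):
    """Run `method` once on 10**exponent random float32 inputs and
    return the elapsed wall-clock time in seconds."""
    n = 10 ** exponent
    a = np.random.rand(n).astype(np.float32)
    b = np.random.rand(n).astype(np.float32)
    start = time.perf_counter()   # host-side clock, read before any handoff
    method(a, b)
    return time.perf_counter() - start

if __name__ == "__main__":
    for exp in range(2, 7):
        print(f"10^{exp}: {time_method(average_difference, exp):.6f} s")
```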