Introduction
CUDASW++ is bioinformatics software for Smith-Waterman protein database searches. It takes advantage of the massively parallel CUDA architecture of NVIDIA Tesla GPUs to perform sequence searches 10x-50x faster than NCBI BLAST. In this algorithm, we exploit the SIMT (Single Instruction, Multiple Thread) and virtualized SIMD (Single Instruction, Multiple Data) abstractions to achieve high speed. The algorithm has been fully tested on Tesla C1060, Tesla C2050, GeForce GTX 280 and GTX 295 graphics cards, and has been incorporated into the NVIDIA Tesla Bio Workbench.
Downloads
Citation
- Yongchao Liu, Douglas L. Maskell, Bertil Schmidt: "CUDASW++: optimizing Smith-Waterman sequence database searches for CUDA-enabled graphics processing units". BMC Research Notes, 2009, 2:73
- Yongchao Liu, Bertil Schmidt, Douglas L. Maskell: "CUDASW++2.0: enhanced Smith-Waterman protein database search on CUDA-enabled GPUs based on SIMT and virtualized SIMD abstractions". BMC Research Notes, 2010, 3:93
Usage
Preparation
- One or more CUDA-enabled GPUs with compute capability 1.2 or higher.
- CUDA Toolkit and SDK, version 2.0 or higher.
Downloading and compiling
- Download and unzip the source code.
- Run "make" to compile the source code; this generates an executable binary named "cudasw".
- When using a Fermi GPU with compute capability 2.0, change the "-arch" option of nvcc to "-arch sm_20"; for compute capability 1.3, use "-arch sm_13"; for 1.2, use "-arch sm_12".
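The mapping from compute capability to the nvcc "-arch" value can be sketched as a small lookup. This is only an illustration: the helper name nvcc_arch_option is hypothetical and not part of CUDASW++, and it covers only the three compute capabilities named in the build notes.

```python
# Hypothetical helper, not part of CUDASW++: maps a compute capability
# string to the nvcc "-arch" option quoted in the build notes.
ARCH_BY_CAPABILITY = {
    "1.2": "sm_12",
    "1.3": "sm_13",
    "2.0": "sm_20",  # Fermi
}


def nvcc_arch_option(compute_capability):
    """Return the "-arch ..." option for a compute capability such as "2.0"."""
    try:
        return "-arch " + ARCH_BY_CAPABILITY[compute_capability]
    except KeyError:
        raise ValueError(
            "no -arch value is listed for compute capability "
            + repr(compute_capability)
        )


print(nvcc_arch_option("2.0"))
```

Capabilities outside the table raise an error rather than guessing a value, since the notes give no mapping for them.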
Typical Usage
- Run ./cudasw, ./cudasw -? or ./cudasw -help to list all supported parameters.
- ./cudasw -mod simt -query query_file -db db_file -mat blosum62
- ./cudasw -mod simd -query query_file -db db_file -mat blosum45
- ./cudasw -query query_file -db db_file -mat blosum62 -gapo 20 -gape 2
- ./cudasw -mod simd -query query_file -db db_file -gapo 20 -gape 2
- ./cudasw -query query_file -db db_file -use_single 0
- ./cudasw -query query_file -db db_file -use_single 1
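When searches are launched from a script, the command lines above can be assembled programmatically. The sketch below is an assumption-laden convenience wrapper, not part of CUDASW++: the function build_cudasw_args and its keyword names are hypothetical, and it uses only the options that appear in the examples above (-mod, -query, -db, -mat, -gapo, -gape, -use_single).

```python
# Hypothetical wrapper, not part of CUDASW++. It only assembles an
# argument list from the options shown in the usage examples.
SUPPORTED_MODELS = ("simt", "simd")
SUPPORTED_MATRICES = ("blosum45", "blosum50", "blosum62", "blosum80")


def build_cudasw_args(query_file, db_file, mod=None, mat=None,
                      gapo=None, gape=None, use_single=None,
                      binary="./cudasw"):
    """Return the cudasw command line as a list of strings."""
    if mod is not None and mod not in SUPPORTED_MODELS:
        raise ValueError("unknown model: " + repr(mod))
    # Stricter than the tool itself: this sketch rejects an unknown
    # matrix name instead of passing it through.
    if mat is not None and mat not in SUPPORTED_MATRICES:
        raise ValueError("unknown scoring matrix: " + repr(mat))

    args = [binary]
    if mod is not None:
        args += ["-mod", mod]
    args += ["-query", query_file, "-db", db_file]
    # Options left as None are omitted so the tool's defaults apply.
    for flag, value in (("-mat", mat), ("-gapo", gapo),
                        ("-gape", gape), ("-use_single", use_single)):
        if value is not None:
            args += [flag, str(value)]
    return args


print(" ".join(build_cudasw_args("query_file", "db_file",
                                 mod="simd", mat="blosum45")))
```

On a machine where cudasw has been built, the returned list could be passed to subprocess.run(args, check=True) from the Python standard library.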
Important notes
- query_file and db_file are FASTA-format files; each may contain many query or subject sequences, respectively (recommended).
- Two models are supported: simt and simd. The simt model uses the optimized SIMT Smith-Waterman algorithm, which is independent of the scoring scheme used. The simd model uses the partitioned vectorized Smith-Waterman algorithm, which is somewhat sensitive to the scoring scheme used.
- Supported scoring matrix names: blosum45, blosum50, blosum62 and blosum80. If the scoring matrix is not specified or not supported, blosum62 is used by default.
- The default gap open penalty is 10 and the default gap extension penalty is 2.
- When using a single GPU (option -use_single), you can specify the index of the single GPU used for the computation. The index starts from 0.
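To make the roles of the FASTA input and the gap penalties concrete, here is a minimal CPU reference sketch of a Smith-Waterman database search with affine gaps. It is not the SIMT or SIMD GPU kernel used by CUDASW++. Several things are assumptions for illustration: the function names (parse_fasta, sw_score, search) are hypothetical, a simple match/mismatch score stands in for the BLOSUM matrices to keep the example short, and a gap of length k is assumed to cost gap_open + k * gap_extend, which may differ from the convention CUDASW++ uses.

```python
NEG_INF = float("-inf")


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def sw_score(query, subject, match=5, mismatch=-4, gap_open=10, gap_extend=2):
    """Optimal local alignment score with affine gap penalties.

    match/mismatch are a stand-in for a BLOSUM matrix (assumption).
    A gap of length k costs gap_open + k * gap_extend (assumption).
    """
    n = len(subject)
    h_prev = [0] * (n + 1)        # best scores for the previous query row
    f_prev = [NEG_INF] * (n + 1)  # previous row, alignments ending in a vertical gap
    best = 0
    for q in query:
        h_cur = [0] * (n + 1)
        f_cur = [NEG_INF] * (n + 1)
        e = NEG_INF               # current row, alignments ending in a horizontal gap
        for j in range(1, n + 1):
            e = max(e - gap_extend, h_cur[j - 1] - gap_open - gap_extend)
            f_cur[j] = max(f_prev[j] - gap_extend,
                           h_prev[j] - gap_open - gap_extend)
            diag = h_prev[j - 1] + (match if q == subject[j - 1] else mismatch)
            h_cur[j] = max(0, diag, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


def search(query, fasta_text, **scoring):
    """Score the query against every database sequence, best hits first."""
    hits = [(sw_score(query, seq, **scoring), name)
            for name, seq in parse_fasta(fasta_text)]
    return sorted(hits, reverse=True)


database = ">s1\nACDEFWGHIKL\n>s2\nMMMM\n>s3\nACDEF\nGHIKL\n"
for score, name in search("ACDEFGHIKL", database):
    print(name, score)
```

The search keeps only two rows of the dynamic-programming matrix at a time, so memory grows with the subject length rather than with the product of both lengths.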
Contact
If you have any questions or suggestions for improvement, please contact Liu Yongchao (liuy@uni-mainz.de; liuy0039@ntu.edu.sg).