This is Luigi Leung's math.cmu page. This page is for:

21-765. Introduction to Parallel Computing and Scientific Computation


Last update: 5/21/2013
Project folder
or use git clone git:// ~/TARGET_DIR
and navigate to the "project/" directory

week7 - Design a modular implementation of the software

Option 1:
Modify the existing C code in Alfy to run in parallel, keeping the parallel part modular so that any new version of Alfy can easily be converted.
Option 2:
Use tee to parallelize Alfy. When a new version of Alfy is released, changing one shell script variable (the version number) makes the new version run in parallel.
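
A minimal sketch of the version-variable idea in Option 2, written in Python rather than as a shell script and without tee. Everything named here is an assumption for illustration: the version string, the "alfy-<version>" binary naming scheme, the input file names, and the idea of passing input files as plain arguments are all hypothetical and not taken from Alfy itself.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical version string: the single value to change for a new release.
ALFY_VERSION = "1.5"


def split_round_robin(items, n_parts):
    """Deal the input items into n_parts work lists."""
    return [items[i::n_parts] for i in range(n_parts)]


def build_commands(version, chunks):
    """Build one command line per non-empty chunk of inputs."""
    binary = f"alfy-{version}"  # hypothetical naming scheme for the executable
    return [[binary] + chunk for chunk in chunks if chunk]


def run_all(commands, runner=subprocess.run):
    """Run every command concurrently and collect the results in order."""
    with ThreadPoolExecutor(max_workers=max(1, len(commands))) as pool:
        return list(pool.map(runner, commands))


# Example with hypothetical file names: two workers share three inputs.
chunks = split_round_robin(["a.fasta", "b.fasta", "c.fasta"], 2)
commands = build_commands(ALFY_VERSION, chunks)
```

Only ALFY_VERSION changes when a new release appears; the splitting and dispatch code stays the same.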

More information on the existing software, Alfy:
Alignment-free detection of local similarity among viral and bacterial genomes. (2011)

week5 - Isolate class of algorithms, identify new research, pick algorithm

Multiple sequence alignment:
progressive alignment (e.g. Clustal, MAFFT)
iterative method (e.g. MUSCLE)
Tree construction:
hierarchical clustering method (e.g. neighbor-joining, UPGMA)

New research:
Multiple sequence alignment:
Clustal Omega is a parallel version of the Clustal alignment algorithm. (2010)
Tree construction:
Phylogenetic tree construction using stochastic optimization and clustering. (2006)

Multiple sequence alignment:
progressive alignment
Tree construction:
hierarchical clustering method
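
As an illustration of the hierarchical clustering approach to tree construction, here is a minimal UPGMA sketch. It is a serial toy version under stated assumptions, not the project's implementation: the function name, the nested-tuple tree format, and the example distances are all made up for illustration.

```python
def upgma(names, matrix):
    """Build a UPGMA tree from a symmetric distance matrix.

    Returns nested tuples (left, right, height), where height is half the
    distance between the two clusters that were merged.
    """
    n = len(names)
    # Active clusters: id -> (subtree, number of leaves).
    clusters = {i: (name, 1) for i, name in enumerate(names)}
    # Pairwise distances keyed by (smaller id, larger id).
    dist = {(i, j): matrix[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        # Pick the closest pair of active clusters.
        (i, j), dij = min(dist.items(), key=lambda kv: kv[1])
        (tree_i, size_i) = clusters.pop(i)
        (tree_j, size_j) = clusters.pop(j)
        del dist[(i, j)]
        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances.
        for k in clusters:
            dik = dist.pop((min(i, k), max(i, k)))
            djk = dist.pop((min(j, k), max(j, k)))
            dist[(k, next_id)] = (size_i * dik + size_j * djk) / (size_i + size_j)
        clusters[next_id] = ((tree_i, tree_j, dij / 2), size_i + size_j)
        next_id += 1
    (tree, _size), = clusters.values()
    return tree


# Made-up distances: A and B are close, C is far from both.
example_tree = upgma(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]])
```

The search for the closest pair and the distance updates are the steps that dominate the running time, so they are the natural candidates for parallelization.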

week3 - Clearly establish the problem

In biology, running genetic sequence alignments and building phylogenetic trees take a long time.
For example, when investigating variations of a particular gene (Pyl) among closely related organisms (20x species of Archaea), building one phylogenetic tree per gene variation per tree construction "substitution model" method takes over one hour on an Intel Core 2 Duo 2.4 GHz CPU.

Parallel algorithm for multiple sequence alignment (MSA).
Parallel algorithm for phylogenetic tree building.



q7. For a computer with an IP of and netmask of,
the network address is and the broadcast address is

Network address:
convert the IP address and netmask to binary, bitwise AND them, and convert the result back to dotted decimal.
Broadcast address:
bitwise OR the binary network address with the inverted (bitwise NOT) binary netmask.
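
The two steps above can be sketched in Python as follows. The addresses used in the example are hypothetical values chosen for illustration; they are not the ones from the original question.

```python
import ipaddress


def network_and_broadcast(ip, netmask):
    """Return (network address, broadcast address) as dotted-decimal strings."""
    ip_int = int(ipaddress.IPv4Address(ip))
    mask_int = int(ipaddress.IPv4Address(netmask))
    # Network address: bitwise AND of the IP address and the netmask.
    network = ip_int & mask_int
    # Broadcast address: network address OR the inverted netmask,
    # masked to 32 bits because Python integers are unbounded.
    broadcast = network | (~mask_int & 0xFFFFFFFF)
    return str(ipaddress.IPv4Address(network)), str(ipaddress.IPv4Address(broadcast))


# Hypothetical example: 192.168.1.10 with netmask 255.255.255.0
# gives network 192.168.1.0 and broadcast 192.168.1.255.
example = network_and_broadcast("192.168.1.10", "255.255.255.0")
```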

q9. Running lsof with the -i option lists open network connections, which shows the ports opened by network services. In the "NAME" column, the number after the ":" symbol is the port number.
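
A small sketch of pulling the port number out of an lsof NAME field. It assumes numeric ports, which lsof prints when run with -P (without it, known ports appear as service names such as "ssh"). The function name and the sample strings are made up for illustration.

```python
import re


def local_port(name_field):
    """Return the local port from an lsof NAME field, or None if not numeric."""
    # For established connections the field is "local->remote"; keep the local part.
    local = name_field.split("->")[0]
    # The port is the run of digits after a ":" at the end of the address.
    match = re.search(r":(\d+)(?:\s|$)", local)
    return int(match.group(1)) if match else None


# Made-up NAME values in the shape lsof -i -P prints them.
listening = local_port("*:22 (LISTEN)")
established = local_port("10.0.0.5:51234->10.0.0.9:443 (ESTABLISHED)")
```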


q1. The Pentium D CPU uses an LGA 775 socket, and a motherboard that can use this CPU is the Intel S3120SHLC Server Motherboard. It is $225 on

q2. A CPU that will work with the motherboard is the Intel E7500. It uses an LGA 775 socket and is $75 on

q3. The motherboard uses DDR2 RAM and supports up to 8GB of ECC Fully Buffered memory. Crucial 2x4GB DDR2 FB-DIMM Server Memory Model can be used and is $255.

q4. The motherboard has PCI-E x8 and x16 slots. An Nvidia GPU that can work is the Nvidia Tesla C2075. It uses PCI-E 2.0 x16 and its monitor connector is DVI. It is $2000 on

q5. The criteria for a SATA hard drive are reliability and speed. The Crucial M500 is a good choice. It uses a SATA connector.

q6. The motherboard is ATX size. A case that fits the ATX form factor is the Lian Li PC-K65. It is $60 on Newegg. A power supply to go with it is the SeaSonic X650. It is efficient, with an "80 Plus" Gold certification, and is $110 on Newegg.

q7. VGA and DVI differ in the shape of the connector and in the signal: VGA is analog and DVI is digital. The Nvidia C2075 graphics card requires a monitor with a DVI input, and the Dell U2713HM accepts DVI. It is a 27" monitor and is $700.

q8. The rest of the components, such as the mouse and keyboard, can use USB connectors, and the CD drive can use a SATA connector.


