Introduction to Parallel Computing and Scientific Computation

21-765. Introduction to Parallel Computing and Scientific Computation

Location: POS 147 Time: Fri 2:00-3:50pm
First Lecture: Jan 17, 2025

Projects due: graduating students - May 4, 2025 ; non-graduating students - May 9, 2025

Level: Introductory Coverage: General

Instructor: Florin B. Manolache
Objectives Slides, Homework

e-mail: florin@andrew.cmu.edu
General Considerations Projects

Office time: Wean Hall, Room 6218, 9am-3pm

Curriculum Bibliography

Course website: www.math.cmu.edu/~florin/M21-765/
Credit Administrativia

The objectives of this course are:

to develop structural intuition of how the hardware and the software work together, starting from simple systems to complex shared resource architectures;
to provide guidelines about how to write and document a software package;
to familiarize the audience with the main parallel programming techniques and the common software packages/libraries.

General Considerations

The course is intended to be self-consistent, no prior computer skills being required. However, familiarity with the C programming language and Unix command line should give the student more time to concentrate on the core issues of the course, as hardware structure, operating system and networking insights, numerical methods.

The main idea of the course is to give the student a hands-on experience of writing a simple software package that eventually can be implemented on a parallel computer architecture. All the steps and components of the process (defining the problem, numerical algorithms, program design, coding, different levels of documentation) are treated at a basic level. Everything is done in the context of a structured vision of the computing environment.

The typical programming environment makes the computer hardware and operating system transparent to the user. In contrast, each program intended for efficient parallel execution must consider the custom physical and logical communication topology of the processors in a parallel system. The course gives a general image over the entire range of issues that a developer should consider when designing a parallel algorithm, from principles to details. The knowledge provided by the course should be enough to help the audience decide what's the most appropriate technique to approach a problem on a given computer architecture. However, the development of an efficient algorithm will require a lot of additional study, practice, and experimental work.

The examples, exercises, and projects were determined by the computers and software available for practice. The following were preferred: the C language, the x86_64 hardware platform, and the Linux operating system. However, the presentation will be kept at a very general level such that the student is prepared for any real parallel computing environment. The individual study and the midterm project are based on Python.

The course contains three parts:

The first part makes the connection between real life and the computer world.

Module 1: software package structure, design, development, and maintenance concerns.
Module 2: parallel computing basic concepts and programming techniques: SMP, MPI, domain/data decomposition, deadlocks, hybrid programming.
Module 3: tools for programming and cluster management: git, remote access/key management, schedulers.
Module 4: how to transform a real life problem into a sequential computer algorithm, with reference to basic numerical methods.

The second part provides the background needed to understand how computer systems work.

Module 5: the layered model of the computer hardware basics.
Module 6: a model of structural information organization with applications to filesystems and storage.
Module 7: a typical operating system, user interfaces, shell, process communications, user level issues.
Module 8: programming notions with applications to the C language, libraries, compilers, debuggers.
Module 9: describes computer networks, topology, and layered communication protocols.

The third part explores the performance computing world.

Module 10: how to take advantage of multiple cores (SMP) through multi-threading and OpenMP.
Module 11: the MPI standard, several common implementation, additional library issues.
Module 12: the PETSC library, an interesting application of MPI for real life simulations.
Module 13: GPU computing: CUDA and OpenACC.
Module 14: modern developments: Big Data (Spark), Artificial Intelligence (Decision Trees, Neural Networks/Tensorflow).

Credit:

Grading is based on three components:

Class attendance (30%)
Merits of the final project (50%)
Midterm take-home test and/or in-class short quizzes (20%)

Students will need to pick a final project no later than the third week of the semester and to deliver milestones every other week. A project consists in developing a (simple) software package or module that has a defined practical purpose. More details and alternatives are here. Students are welcome to discuss with the instructor projects close to their scientific interests, or pick one of the offered projects.

Merits of the final project considered for grading are:

how well is the program structured such that it will allow easy further development, easy debugging (diagram of modules and APIs)
the quality (and not the length) of the documentation
how functional, efficient, and/or innovative is the numerical algorithm (tested on the provided examples)

Homework will be assigned after most lectures. Submitting solutions is not mandatory, but can add a bonus for the final grade. The purpose of the homework is to help develop practical skills and get used with the computing environment. The homework is recommended for students auditing the lectures as well. Students are encouraged to submit solutions containing interesting approaches or comments.

Administrativia

Students are welcome to participate for credit or for fun. Unregistered students should express their interest by e-mail to florin@andrew.cmu.edu or in person (Wean Hall, Room 6218) at any time before second day of school of the Spring Semester.

Third party websites used during the class (please create accounts if you don't already have):

LinkedIn Learning (free access for CMU affiliates): https://www.linkedin.com/learning/
ACCESS portal to Pittsburgh Supercomputing Center resources: https://access-ci.org/

I'm always available for consultations and for discussions regarding the projects and the curriculum.

Instructor: Florin B. Manolache	Objectives	Slides, Homework
e-mail: florin@andrew.cmu.edu	General Considerations	Projects
Office time: Wean Hall, Room 6218, 9am-3pm	Curriculum	Bibliography
Course website: www.math.cmu.edu/~florin/M21-765/	Credit	Administrativia