Performance Tools for Holistic HPC Workflows



Status: Completed
Mentor Name: Karen L. Karavanic
Mentor's XSEDE Affiliation: Portland State faculty, former EMPOWER mentor
Mentor Has Been in XSEDE Community: 4-5 years
Project Title: Performance Tools for Holistic HPC Workflows
Summary: Participate in PPerfLab research on developing performance tools for medium- to large-scale HPC workflows. We are developing monitoring tools that give developers useful feedback to guide them in addressing performance issues in their code, particularly issues related to workflows (applications composed of several separate applications, libraries, and/or platforms) and data movement. This development requires testing a variety of codes and gathering performance measurements with a number of different performance tools.
Job Description: This work entails conducting experiments with MPI- and OpenACC-based codes, possibly in combination with Python scripts and a visualization tool. The student will learn and use a variety of tools to evaluate runtime performance, with a particular focus on data movement and storage. The student will receive training in using a shared cluster environment, the Lustre parallel file system, and a variety of development tools, plus additional training as needed in writing simple MPI- and OpenACC-based codes. One application of focus will be a drought prediction code developed at Portland State in a collaborative research project.
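As a rough illustration of the kind of simple MPI-based code and data-movement timing involved, a minimal sketch in C is shown below. It is not part of any PPerfLab or project code; the message size and output are illustrative assumptions only.

/* Illustrative warm-up exercise (hypothetical): time a simple MPI message
   exchange between two ranks using MPI_Wtime. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define COUNT (1 << 20)   /* 1M doubles, roughly 8 MB per message */

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    double *buf = malloc(COUNT * sizeof(double));
    for (int i = 0; i < COUNT; i++) buf[i] = (double)i;

    MPI_Barrier(MPI_COMM_WORLD);          /* line up ranks before timing */
    double t0 = MPI_Wtime();

    if (size >= 2) {
        if (rank == 0)
            MPI_Send(buf, COUNT, MPI_DOUBLE, 1, 0, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(buf, COUNT, MPI_DOUBLE, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
    }

    double t1 = MPI_Wtime();
    if (rank <= 1)
        printf("rank %d: transfer section took %f s\n", rank, t1 - t0);

    free(buf);
    MPI_Finalize();
    return 0;
}

Building and running a small sketch like this on two ranks (for example with mpicc and mpirun) is the sort of hands-on exercise typically used before moving on to full performance tools and larger codes.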
Computational Resources (Use of XSEDE Resources): This project will use time on Linux clusters, including the PSU Coeus cluster and machines in the PI's laboratory. Our lab has only a small, older 16-node Linux cluster with a "mini-Lustre" installation for initial learning. We hope to use SDSC's Comet or a similar XSEDE resource so the student can learn to run science codes at medium scale using SLURM or a similar resource manager and state-of-the-art performance tools.
Contribution to Community: This project will provide direct HPC training and experience to two undergraduate computer science students. They will be part of our PPerfLab research group and will be mentored by both the PI and the group's Ph.D. students. We will encourage them to present their work in a poster.
Position Type: Apprentice
Training Plan: The students are encouraged to take Parallel Programming or Introduction to Performance in the Winter quarter; if that is possible, they will already have some exposure to MPI and the basics of parallel computing and performance. Our first step will be to give them hands-on practice running a set of codes we provide with several measurement tools; this will involve a learning period for the performance tools. Once they are past that learning curve, they will do actual runs to collect data we need for our research, first on our own facilities and then, if they have progressed quickly, on a larger remote facility. They will be welcome to contribute ideas for new tool features we might develop. The PPerfLab graduate students will participate in mentoring and training the undergraduates.
Student Prerequisites/Conditions/Qualifications: Students must have completed a course in operating systems (at PSU this is CS 201 and CS 333) and be able to program in C/C++ in a Linux environment. They must have good written and oral English communication skills in order to participate in research group meetings and prepare a poster on their work. Ideally, the students will have taken either Parallel Programming or Introduction to Performance.
Duration: Quarter
Start Date: 03/29/2021
End Date: 06/11/2021
