Parallel Cache Oblivious Data Structures and Algorithms
Summary
Students will be working on building a library of cache-oblivious data structures and measuring the performance under different workloads. We will first implement serial versions of the algorithms, and then implement the parallel version of several known cache oblivious data structures and algorithms.
Job Description
Student Duties: Students will be responsible for working a minimum of 12 weeks (a full semester) for 30 hours a week. Students will be meeting with me once a week for a weekly check-in for feedback, demonstration of progress, ask questions, and make forward progress on the next guided training tasks.
There are two scientific applications for this proposal: 1. Scientific Application: Students will be working with me to learn and understand how cache-oblivious data structures work in C/C++. 2. Scientific Application: After implementing the algorithms, we will be writing our results and submitting for workshops for publication.
Computational Resources
Students will make use of XSEDE trainings available online [1]. Additionally, an Education Allocation request will be used for these students to perform work primarily at the end of their training [2].
The training plan is to bring in students(ideally in pairs of 2) who are currently sophomores/junior and have taken a Computer Systems course using C/C++. Students need not have any previous research experience, but generally will have experience using threads(e.g. pthreads) and have taken an algorithms course.
[1] (2 weeks) Students will first work through understanding the basics of Cache-Oblvious Algorithms and Data structures from: http://erikdemaine.org/papers/BRICS2002/paper.pdf [2] (2 weeks) Students will then work through select lectures and exercises on caches from here: https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-172-performance-engineering-of-software-systems-fall-2010/video-lectures/ [3] (1 week) Students will then learn the basics of profiling [4] (2 weeks) Next students will implement a few data structures and algorithms, and then [5] (4 weeks) Students will work to find good real world benchmarks, mining github repositories for benchmarks that suffer from false-sharing performance related problems. [6] The remaining time will be writing up and polishing collected results.
As a reference text, we will be using Computer Architecture: A Quantitative Approach (Specifically the sections on Memory Hierarchy)
Before using XSEDE resources, we will take advantage of Northeastern University's Discovery Cluster to run preliminary experiments.