Large Scale Genetic Programming for Image Understanding
Summary
We are looking for students interested in joining our research team to help develop software that scales a prototype Simple Evolutionary Exploration (SEE) library to large-scale computing systems. This exploration will look into leveraging high-performance computing resources (XSEDE), HTCondor, and cloud resources. The long-term goal of the project is to build an image annotation system that works in "real time" with researchers to explore the algorithm space for solutions to scientific image understanding problems.
Job Description
Students will work with a research team to help scale a prototype software library to run on very large systems. The current prototype is written in Python and searches the "algorithm space" of image segmentation algorithms using single-sample learning. The search space is very large and highly non-differentiable, so standard optimization techniques do not apply. However, the software uses Genetic Programming, which is pleasantly parallel because each candidate can be evaluated independently, so the algorithm can easily leverage large-scale systems. In this research, we will explore methods to scale the system using a continuous and dynamic master/worker model.
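To make the master/worker idea concrete, here is a minimal sketch and not the SEE library's implementation. It assumes a toy evolutionary search (mutation only, not full genetic programming) in which a master loop keeps a fixed number of workers busy and folds each result into a small pool of best candidates as soon as it arrives. The names `toy_fitness`, `mutate`, and `steady_state_search` are hypothetical, and the fitness function is a stand-in for scoring a segmentation algorithm against a sample image.

```python
import random
from concurrent.futures import FIRST_COMPLETED, ProcessPoolExecutor, wait


def toy_fitness(genome):
    """Hypothetical stand-in for scoring a candidate; lower is better."""
    return sum((g - 0.5) ** 2 for g in genome)


def mutate(genome, rng, scale=0.1):
    """Return a copy of the genome with Gaussian noise added to each gene."""
    return [g + rng.gauss(0.0, scale) for g in genome]


def steady_state_search(executor, fitness, n_genes=4, n_workers=4,
                        max_evals=200, pool_size=10, seed=0):
    """Master loop: keep workers busy and fold results in as they arrive."""
    rng = random.Random(seed)
    best = []      # (score, genome) pairs, sorted, at most pool_size long
    pending = {}   # future -> genome currently being evaluated
    submitted = 0

    def submit_one():
        nonlocal submitted
        if best and rng.random() < 0.8:
            # Usually mutate one of the best candidates seen so far.
            genome = mutate(rng.choice(best)[1], rng)
        else:
            # Otherwise explore with a fresh random candidate.
            genome = [rng.random() for _ in range(n_genes)]
        pending[executor.submit(fitness, genome)] = genome
        submitted += 1

    for _ in range(min(n_workers, max_evals)):
        submit_one()

    while pending:
        # Wake up as soon as any worker finishes; no generation barrier.
        done, _ = wait(list(pending), return_when=FIRST_COMPLETED)
        for future in done:
            genome = pending.pop(future)
            best.append((future.result(), genome))
        best.sort(key=lambda pair: pair[0])
        del best[pool_size:]
        while submitted < max_evals and len(pending) < n_workers:
            submit_one()
    return best[0]


if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as pool:
        score, genome = steady_state_search(pool, toy_fitness)
    print(f"best score {score:.4f}")
```

Because the master only needs an object with a `submit` method that returns a future, the same loop could in principle sit on top of a different backend; which backends suit HPC, HTCondor, or cloud systems is part of what the project will explore.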
Computational Resources
As a proof of concept, we will likely start this research on a standard HPC system that can leverage job arrays (or similar pleasantly parallel methods). We then plan to move to an HTCondor cluster (Open Science Grid) and may also explore cloud-based resources.
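As an illustration of the job-array approach, the sketch below assumes a Slurm scheduler, which exposes each array task's index in the `SLURM_ARRAY_TASK_ID` environment variable; other schedulers use different variable names. Each task uses its index as a random seed and runs one independent search. The functions `task_seed` and `run_independent_search` are hypothetical, and the search itself is a toy stand-in.

```python
import os
import random


def task_seed(env=None, default=0):
    """Read the array index the scheduler assigned to this process."""
    env = os.environ if env is None else env
    return int(env.get("SLURM_ARRAY_TASK_ID", default))


def run_independent_search(seed, n_trials=100):
    """Hypothetical stand-in for one search run; returns the best score found."""
    rng = random.Random(seed)
    return min((rng.random() - 0.5) ** 2 for _ in range(n_trials))


if __name__ == "__main__":
    seed = task_seed()
    print(f"task {seed}: best score {run_independent_search(seed):.6f}")
```

Under these assumptions, a submission such as `sbatch --array=0-99` would launch one hundred such tasks with no communication between them, which is the simplest form of pleasantly parallel execution.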
Contribution to Community
Position Type
Apprentice
Training Plan
I have over a decade of experience working on HPC resources (including XSEDE) and two decades of experience working with undergraduate researchers. I also teach courses on parallel programming. I expect this project to focus more on pleasantly parallel methods. The student will start by getting to know the SEE library and running experiments on a single processor. I will then introduce the student to master/worker models, and we will build prototypes on standard HPC systems using multiple single-core processes. Once we have a feel for what is possible, we will extend the software to have a flexible workflow and test it on a variety of large-scale systems, including HTCondor resources available through Open Science Grid, and also look into workflows that leverage cloud resources.
Student Prerequisites/Conditions/Qualifications
Although no prior work experience is required, some knowledge of computer programming (primarily Python) or scientific computing is expected. Ideally, applicants will also have experience in one or more of the following: scientific image understanding, using HPC systems, hacking or tinkering.
The research is gaining a lot of momentum and there are many opportunities. I would also happily consider Learners and Apprentices, and would take on more than one student (assuming this aligns with the goals of the program).