Large Scale Genetic Programming for Image Understanding
Summary
We are looking for students interested in joining our research team to help develop software that scales a prototype Simple Evolutionary Exploration (SEE) library to large-scale computing systems. This exploration will look into leveraging High Performance Computing resources (XSEDE), HTCondor, and cloud resources. The goal of the project is to build an image annotation system that works in "real time" with researchers to explore the algorithm space for solutions to scientific image understanding problems.
Job Description
Students will work with a research team to continue scaling a prototype software library to run on very large systems. The current prototype is written in Python and searches the "algorithm space" of image segmentation algorithms using single sample learning. The search space is very large and highly non-differentiable, so standard optimization techniques do not apply. However, the software uses Genetic Programming, which is pleasantly parallel, so the algorithm can readily leverage large-scale systems.
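To make the "pleasantly parallel" point concrete, here is a minimal sketch of an evolutionary search loop. It is not the SEE library's API: it is a simple genetic algorithm over fixed-length integer lists standing in for the genetic programming search, and the names `fitness`, `TARGET`, `GENE_RANGE`, and `evolve` are all hypothetical. The only assumption that matters is that each fitness evaluation is independent of the others, so the evaluation step can be handed to any drop-in replacement for `map`.

```python
import random

# Hypothetical stand-in for a segmentation "algorithm space": each individual
# is a list of integer genes (e.g. an algorithm choice plus its parameters).
GENE_RANGE = 16
GENOME_LENGTH = 6
TARGET = [3, 14, 1, 5, 9, 2]  # toy "best" settings, invented for illustration


def fitness(individual):
    """Toy error score (lower is better). A real run would segment an image
    and compare the result against the single annotated sample."""
    return sum(abs(g - t) for g, t in zip(individual, TARGET))


def mutate(individual, rng, rate=0.2):
    """Return a copy with each gene randomly replaced with probability `rate`."""
    return [rng.randrange(GENE_RANGE) if rng.random() < rate else g
            for g in individual]


def crossover(a, b, rng):
    """Single-point crossover of two parents."""
    cut = rng.randrange(1, GENOME_LENGTH)
    return a[:cut] + b[cut:]


def evolve(generations=40, pop_size=30, seed=0, map_fn=map):
    """Run a simple elitist evolutionary search.

    `map_fn` is the only parallel hook: fitness evaluations are independent,
    so any replacement for `map` can distribute them across workers.
    """
    rng = random.Random(seed)
    population = [[rng.randrange(GENE_RANGE) for _ in range(GENOME_LENGTH)]
                  for _ in range(pop_size)]
    best_history = []
    for _ in range(generations):
        # The independent (and, in practice, expensive) step.
        scores = list(map_fn(fitness, population))
        ranked = [ind for _, ind in
                  sorted(zip(scores, population), key=lambda pair: pair[0])]
        best_history.append(min(scores))
        # Keep the best fifth unchanged, refill the rest with offspring.
        elite = ranked[:pop_size // 5]
        children = [
            mutate(crossover(rng.choice(elite), rng.choice(elite), rng), rng)
            for _ in range(pop_size - len(elite))
        ]
        population = elite + children
    best = min(population, key=fitness)
    return best, fitness(best), best_history
```

Calling `evolve()` runs serially. Passing `map_fn=pool.map` from a `multiprocessing.Pool` (created under an `if __name__ == "__main__":` guard) would spread the fitness evaluations across local cores without changing the loop itself.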
Computational Resources
The current prototype has been run on a local Tier-III XSEDE resource and multiple cloud resources (Azure, Google Cloud, and AWS). We hope to further test the system on these resources and, specifically, to build an interface that works with HTCondor and possibly XSEDE cloud resources such as Jetstream.
Contribution to Community
The tools we are developing should have a broad impact on scientific image understanding. Our hope is to build a web-based gateway that will allow researchers to annotate their images on the front end while the tool searches for an automated solution on the back end.
Position Type
Learner
Training Plan
I have over a decade of experience working on HPC resources (including XSEDE) and two decades of experience working with undergraduate researchers. I also teach courses on parallel programming. I expect this project to focus more on pleasantly parallel methods. The student will start by getting to know the SEE library and running experiments on a single processor. I will then introduce the student to master/worker models, and we will build prototypes using standard HPC systems with multiple single-core processes. Once we have a feel for what is possible, we will extend the software to have a flexible workflow and test it on a variety of large-scale systems, including HTCondor resources available through the Open Science Grid, and also look into workflows that leverage cloud resources.
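As a rough illustration of the master/worker model mentioned above, the sketch below has a master submit independent tasks to a pool of workers and collect the results as they finish. It uses only Python's standard `concurrent.futures` module. The task function `evaluate_candidate` and its toy score are hypothetical placeholders, not part of the SEE library.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def evaluate_candidate(candidate):
    """Worker task: score one candidate (hypothetical toy score, lower is better)."""
    return sum((x - 7) ** 2 for x in candidate)


def run_master(candidates, executor):
    """Master: hand each candidate to the executor's workers, gather
    (score, candidate) pairs as workers finish, and return them best-first."""
    futures = {executor.submit(evaluate_candidate, c): c for c in candidates}
    results = []
    for future in as_completed(futures):
        results.append((future.result(), futures[future]))
    return sorted(results)


def demo():
    candidates = [(1, 2, 3), (7, 7, 7), (6, 8, 7), (0, 0, 0)]
    # Threads keep this demo portable; see the note below for processes.
    with ThreadPoolExecutor(max_workers=2) as executor:
        return run_master(candidates, executor)
```

Because `run_master` only depends on the executor interface, replacing `ThreadPoolExecutor` with `ProcessPoolExecutor` (inside an `if __name__ == "__main__":` guard) would give separate single-core worker processes. The same master/worker shape carries over to job schedulers, where the "workers" are submitted jobs rather than local processes.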
Student Prerequisites/Conditions/Qualifications
Although no prior work experience is required, some knowledge of computer programming (primarily Python) or scientific computing is expected. Ideally, applicants will also have experience in one or more of the following: scientific image understanding, using HPC systems, hacking or tinkering. The research is gaining a lot of momentum and there are lots of opportunities.