Novel Randomized Algorithms for Large-Scale Matrix Completion
Summary
This project targets large-scale matrix completion, one of the most challenging problems in big data analytics. In particular, we will design scalable and efficient randomized algorithms that allow matrix completion to handle large matrices, and we will implement these algorithms on modern parallel and distributed computing systems. The key goal of this project is to support the training of undergraduate students.
Job Description
The students will contribute to the project by investigating randomized techniques to improve the scalability and efficiency of matrix completion algorithms. They will also implement and optimize the proposed algorithms in big data processing frameworks such as Apache Spark. We expect the theoretical results of this project to be examined and validated by numerical experiments on a set of benchmark datasets collected from a variety of applications.
Computational Resources
XSEDE Bridges' Hadoop and Spark resources
Contribution to Community
Position Type
Apprentice
Training Plan
I will offer a Machine Learning course in the spring quarter of 2018. This course will equip students with the necessary knowledge and skills in machine learning and matrix operations. During this project, I will work closely with the students and hold weekly meetings with them to guide their research progress.
Student Prerequisites/Conditions/Qualifications
Must be an undergraduate student at California State Polytechnic University, Pomona.
Must have a good understanding of linear algebra and good programming skills in Scala, Java, or Python.