Big Data Analytics for Sustainable Transportation Research
Summary
This project involves analyzing large transportation datasets (origin-destination matrices and GPS traces of alternative fuel vehicles and emerging mobility services) and characterizing representative travel profiles across US regions. Students will have the opportunity to fit and interpret statistical and machine learning models that project transportation fuel or electricity use in the United States.
Job Description
The student will work through several provided resources, such as Jupyter notebooks that demonstrate transportation data analytics applications, cloud computing resources, and theoretical material on statistical and machine learning models. The student will contribute to the analysis of emerging mobility and alternative fuel transportation data by cleaning, merging, analyzing, and modeling the data and interpreting the results, in order to provide insights into and predictions of the transportation energy use of emerging mobility services.
Computational Resources
XSEDE allocation (application currently in preparation) and access to a personal server
Contribution to Community
This project will introduce students interested in transportation systems research to XSEDE resources and will enable interaction with domain experts in sustainable transportation engineering, an emerging application area for big-data analytics.
Position Type
Apprentice
Training Plan
The student will be provided with multiple Jupyter notebooks containing Python training material as an introduction to transportation data analytics. An introduction to Jetstream will also be provided so that the student can use virtual machines for both training and research purposes. Through weekly interactions with their supervisor, the student will also strengthen their theoretical knowledge of statistical modeling techniques, focusing in particular on linear and logistic regression models, decision trees, random forests, hierarchical clustering algorithms, and neural networks, for the characterization and prediction of transportation energy use with interpretable models.
Student Prerequisites/Conditions/Qualifications
Basic data analysis skills, basic coding skills in Python, and basic knowledge of statistical modeling or econometrics