Please note that XSEDE EMPOWER ended with the conclusion of the XSEDE project on August 31, 2022.
The following information is provided for archival purposes only.
This page provides a listing of positions that are
under review for the next iteration of the program, as well as those that have been approved for current and
previous iterations of the program. As part of the review and approval process, accepted student
applicants are associated with an approved position.
If you are intending to mentor a student, you should create a position. If you also have a student
in mind with whom you want to work, that student should submit an application. If you intend to
mentor multiple students, please create a separate position for each student.
Mental health risk analysis and mitigation have been a challenge for public health researchers. Recently available large data sets and machine learning offer new opportunities to tackle this challenge. We have gathered a large data set with hundreds of community characteristics at the county and ZIP code levels. We have also developed graph learning methods to better understand the data. In addition, we are working on visualization techniques to improve the understanding of high-dimensional data. The goal is to develop a model to delineate the environmental and social determinants relevant to suicide prevention.
We are looking for students interested in joining our research team to help develop software to scale a prototype Simple Evolutionary Exploration (SEE) library to utilize large-scale computing systems. The search space involved in this research is extremely large and requires massive computing resources. This research will look into leveraging High Performance Computing Resources (XSEDE), HTCondor and Cloud Resources. The long-term goal of the project is to build an image annotation system that works in "real time" with the researchers to explore the algorithm space for solutions to scientific image understanding problems.
The project will develop the frontend and backend of a React-based GUI for cryoEM data processing. The student will also design and implement different data visualization elements that help the understanding of cryoEM data processing results. As part of the COSMIC2 (cosmic-cryoem.org) project, it will also involve the deployment of the GUI on the public science gateway. The student should have experience with React and Linux environments. Experience with Python is beneficial too.
We aim to employ the multi-core architecture and high-RAM capacity of the Bridges-2 RM resource to parallelize the processing, slicing, and re-stacking of our credit card transaction dataset, containing more than 11 billion transactions. A student would help the project move forward while gaining supercomputing technical skills and mentorship.
The main goal of this project is to develop and evaluate a deep learning pipeline for reconstructing particle trajectories. These algorithms can then be used by the scientists at CERN to predict the tracks of the particles in the Large Hadron Collider based on their trajectories. With the ever-growing data from scientific experiments, it is imperative to have automatic ways to analyze that data. Specifically, we work with deep learning models including graph neural networks (GNNs) and multilayer perceptrons (MLPs) and compare them in terms of accuracy and computing time. The integration of the inference pipeline with the ACTS framework will allow scientists to perform a large set of experiments.
Sub-grid models for flowing dispersions are essential for the high-fidelity simulations of cough droplets leaving your mouth, bubble clouds cavitating near ship propellers, and more. Still, these sub-grid models are complex with high arithmetic intensity, dominating the cost of their associated simulations. We will extend a current platform of quadrature-based moment methods to operate on GPUs and interact with large-scale CFD codes.
This project entails analytics of big transportation datasets (origin-destination matrices and GPS traces of alternative fuel vehicles and emerging mobility services) and the characterization of representative travel profiles across US regions. The students will be offered the opportunity to fit statistical and machine learning models that project transportation fuel or electricity use in the United States and interpret them.
The position involves using common HPC tools to collect performance metrics, such as MPI communications, accesses of memory/cache hierarchies, I/O across different storage devices, on a variety of modern HPC platforms for a large number of commonly used scientific applications in different scientific domains. The collected datasets and metrics can serve as benchmarks and guides for scaling and optimizations of the applications and for the optimal configurations of computational resources, for general scientific user communities, and potentially for the acquisition of resources at campus data centers and for better informed development and configurations of hardware for the optimal uses of scientific applications.
This project focuses on developing and testing complex stock analysis and prediction models utilizing state-of-the-art machine learning algorithms. While stock analysis and prediction may not be inherently intensive, this project seeks to expand upon presently accepted techniques in order to develop new and improved methods. The goal of this project is to integrate several data sources (such as Google Trends, Tweets, general stock data, etc.) and conduct stock predictions with a large volume of data being fed into the models. We also seek to expand this project so that we can conduct the analysis and predictions of our stock information in near-real time, offering a rapidly deployable prediction model. The results of this project will be shared in journals and conferences in order to add to modern techniques and approaches to time series analysis and predictive modeling.
The COVID-19 pandemic is one of the most severe global pandemics, leading to the need to rapidly deploy a useful vaccine strategy in order to alleviate the drastic measures taken to protect the public through lockdowns, etc. Vaccinations are the most effective prevention technique and can hugely protect vulnerable populations and regions. Given the shortage of vaccines and the limited speed at which they can be administered to populations, determining which populations should be vaccinated first is an urgent problem. Our project will utilize this as the central theme to attempt to solve this problem with artificial intelligence. One potential dataset is human movements covering millions of people, which can largely influence disease transmission and vaccine distribution.
The intern will assist the PPerfLab research group at Portland State as they develop state-of-the-art performance measurement tools for HPC platforms. We are developing a runtime detection framework for file system interference to allow performance tools to determine a correct diagnosis in the case where an unrelated job is causing the performance bottleneck.
The ongoing COVID-19 pandemic is a global public health emergency requiring urgent development of highly efficacious vaccines. While concentrated research efforts are underway to develop antibody-based vaccines that would neutralize SARS-CoV-2, and several first-generation vaccine candidates are currently in Phase III clinical trials or have received emergency use authorization, it is forecasted that COVID-19 will become an endemic disease requiring second-generation vaccines. The SARS-CoV-2 surface Spike (S) glycoprotein represents a prime target for vaccine development because antibodies that block viral attachment and entry, i.e. neutralizing antibodies, bind almost exclusively to the receptor binding domain (RBD). We have developed computational models for a large subset of S proteins associated with SARS-CoV-2 (with available structures in the Protein Data Bank), implemented through coarse-grained elastic network models and normal mode analysis. We then analyzed local protein domain dynamics of the S protein systems and their thermal stability (via a novel deep learning model) to characterize structural and dynamical variability among them. These results were compared against existing experimental data and used to elucidate the impact and mechanisms of SARS-CoV-2 S protein mutations and their associated antibody binding behavior. We constructed a SARS-CoV-2 antigenic map and offered predictions about the neutralization capabilities of antibody and S mutant combinations based on protein dynamic signatures. We then compared SARS-CoV-2 S protein dynamics to SARS-CoV and MERS-CoV S proteins to investigate differing antibody binding and cellular fusion mechanisms that may explain the high transmissibility of SARS-CoV-2. 
Our results provide insights into the dynamics-driven mechanisms of immunogenicity associated with coronavirus S proteins, and present a new approach to characterize and screen potential mutant candidates for immunogen design, as well as to characterize emerging natural variants that may escape vaccine-induced antibody responses. In the proposed work, we will use a combination of fully atomistic molecular dynamics simulations and elastic network based coarse-graining approaches to characterize emerging S protein variants to deduce potential dynamic mechanisms of immune escape based on our existing framework.
Machine Learning (ML), and in particular Neural Networks (NNs), are currently being used for image/video processing, speech recognition and other tasks. The goal of a supervised NN is to classify raw input data according to the patterns learned from an input training set. Training and validation of NNs are very computationally intensive. In this project we want to develop an NN infrastructure to accelerate model training, specifically the tuning of hyper-parameters, and model inference or prediction using distributed systems techniques. With a single set of training data, our application will run different classifiers on different servers, each running models with tweaked hyper-parameters. To give more control over the automation process, the degree to which these hyper-parameters are tweaked can be set by the user prior to running. To make our implementation robust to common distributed system failures (servers going down, loss of communication among some nodes, and others), we can use a heartbeat/gossip-style protocol for failure detection and recovery.
Maria Pantoja
California Polytechnic State University-San Luis Obispo
We are developing an in-house code for modeling cancer cells in which the nucleus and cytoskeleton will be included. We want to study cell deformation during adhesion to other cell types or interaction with the extracellular matrix.
The student will work on our XSEDE project on computational studies of the origin of optical rotation:
(1) understand the pH dependence of the optical rotation of Vitamin C;
(2) understand the role of helical arrangement in the optical rotation using H2-helical chains.
In this position, the student will study strongly bound mixed transition metal clusters as models for real-world heterogeneous catalytic systems. Specifically, we will investigate mixed palladium-copper clusters, searching the cluster potential energy surfaces for local and global minima and exploring the effect of cluster size and dopant concentration on different cluster properties. The results from this investigation will then be compared with related cluster systems.
This project seeks to understand how eukaryotic organisms use antimicrobial peptides to defend themselves against bacterial infections. Specifically, you would simulate and analyze how the antimicrobial peptides interact with bacterial membranes using the latest software developed at the University of Illinois.
Ongoing experimental work has shown that the plant signaling protein PLAFP specifically binds to phosphatidic acid (PA) headgroups to transport these signaling lipids through the plant phloem. Prior simulation work has shown how phosphatidic acid binds to PLAFP. Our research goal here is to carry out alchemical free energy calculations to determine the free energy of binding for phosphatidic acid relative to other phospholipids.
One of the most intriguing aspects of biological development is the transition from a differentiated oocyte to a completely totipotent embryo. Remarkably, this oocyte-to-embryo transition (OET) occurs in the absence of new transcription. Instead, gene regulation across these critical developmental stages relies on unique post-transcriptional mechanisms. This project seeks to investigate the role of alternative polyadenylation in regulating mRNA stability across the dynamic transition from oocyte to embryo in mammals, using cutting edge bioinformatic tools for analyzing big data from RNA-seq.
The use of excess thirty-day readmission rates as an indicator of poor hospital service quality has been widely embraced, but it is unclear if patients' characteristics play important roles in readmissions. Our analysis will reveal the effects of patients' characteristics on health outcomes by leveraging a large-scale cancer patient dataset. This study will enhance our understanding of factors that can influence readmissions and help healthcare programs better define or measure readmission activity that can bring large impacts on hospitals and patients.
Students will investigate the Rayleigh-Taylor instability (RTI), which occurs at the interface between two fluids subjected to acceleration, where the heavy fluid sits on top of the light fluid. The current scientific literature has extensively investigated the RTI under the incompressible assumption; however, in many applications, such as high-energy-density processes like inertial confinement fusion (ICF) and supernova formation, the incompressible assumption is no longer valid. In this study, a new set of simulations will be performed to investigate the compressibility effects on two-dimensional (2D) RTI with two species with large density differences.
We are recruiting a student to advance a novel deep learning (and other ML) framework for computational fragment-based drug design. We work with chemistry collaborators to develop and evaluate a reinforcement learning-based model using HPC to generate novel drug leads based on protein structure and computational docking analysis. The student will optimize and advance the modeling framework for clinically important drug targets.
In this position, the student will work with me to develop and test computational labs for a proposed computational chemistry course at USM. Since no such course is taught at USM, I envision having lab components for each lecture, similar to the workshop. This position is a continuation of the fall 2021 one.
This project expands upon previous work to study the substrate specificity for two related transport proteins within the small multidrug resistance transport family. Having already characterized the substrate-free states, we will be investigating how tetraphenylphosphonium and guanidinium dynamics are different within these transport proteins.
Almost every organism studied to date has a microbiome, the collection of bacteria, viruses, and tiny plants and animals that live in and on other creatures (the "hosts"). Though these microbiota can have important interactions with the host, such as helping them fight disease or digest vitamins, they are poorly understood in most species. The primary goal of this project is to better understand the microbiomes of threatened wildlife species such as turtles and frogs via bioinformatic explorations of vast amounts of DNA sequence data. Using high-powered computing resources, these analyses can teach us much about the microbial communities that inhabit these species.
Student will learn to perform all-atom molecular dynamics simulations and participate in the development of computational methods to investigate processes of the viral life cycle.
The open-source TARDIS supernova simulation code is written in Python, and part of it is accelerated with the Numba just-in-time (JIT) compiler framework. The other part of TARDIS relies extensively on pandas table operations. The project for the spring semester will be to profile the current pandas-based plasma code and research whether frameworks like dask could be used to parallelize and speed up the plasma calculation part. The student will gain insights into code performance analysis, modern frameworks like dask, and open-source science code development (using modern practices such as version control, continuous integration, code review).
This project aims to support the mentor's field-theoretic study of the thermodynamics of polymer blends in dense nanoparticle packings, which is supported by the mentor's NSF GRFP grant and NSF CBET-1933704, and is the continuation of an XSEDE EMPOWER Fall 2021 project. The effect of nanoconfinement and polymer-nanoparticle interactions on the behavior of polymer blends in highly filled nanocomposites is poorly understood, in part because the experimental preparation of these systems proved difficult prior to the development of the capillary rise infiltration (CaRI) method in the lab of the mentor's co-advisor. Molecular dynamics simulations in LAMMPS are an alternative to field-theoretic simulations that can shed light on the phase separation dynamics and transport properties of these composites, and will make up the bulk of the student's work.
Today, the U.S. spends five times more per capita on health care than countries with similar life expectancies and costs are still rising. Despite this investment, the U.S. has one of the highest rates of maternal mortality and morbidity. There is a significant disparity in maternal outcomes for women of color, particularly for Black and African American women and Indigenous Alaskan Native women. However, Indigenous Hawaiian Native women have one of the lowest rates of adverse maternal outcomes. In 2017, approximately 810 women died every day from preventable causes related to pregnancy and childbirth. This risk has led to a growing body of research to investigate race as it pertains to maternal risk. Skilled care before, during and after childbirth can save the lives of women and newborns, and barriers that limit access to quality maternal health services must be identified and addressed at both health system and societal levels. This project will use machine learning and advanced computational analytics to examine 10 years of birth and death data in the U.S. provided through the public repository curated by the National Vital Statistics System. The goal of this project is to highlight any possible protective factors present for Native Hawaiian women in an effort to further inform known problems in maternal health.
The U.S. has one of the highest rates of maternal mortality and morbidity. There is a significant disparity in maternal outcomes for women of color, particularly for Black and African American women and Indigenous Alaskan Native women. Traditional data collection is limited to clinical encounters and labor/delivery. This data, however, does not provide a sufficient view into factors that may contribute to adverse outcomes in maternal health. This project will use machine learning and advanced computational analytics to examine 10 years of birth and death data in the U.S. provided through the public repository curated by the National Vital Statistics System. Additionally, interviews, news articles and obituary data collected through ProPublica's "Lost Mothers" project will be used to integrate with traditional electronic health records data to examine the value of information collected from this mixed methods analysis.
The City of Tulsa has challenged the ORU Data Science Team to analyze several years of utility bills and eviction actions and to develop predictive models for residential eviction likelihood, so that a potential intervention can be triggered to avoid actual eviction where possible. Existing attempts to address this problem have not resulted in satisfactory predictability.
Membrane proteins are proteins present in the cell membrane and are essential for several reasons: they are responsible for protecting cells and keeping them healthy, they are drug targets, and ion channels in membranes are critical to the nervous system. Because the presence of lipid layers makes it difficult to analyze their structures experimentally, in-silico analysis has emerged as a promising tool. However, unlike soluble proteins, membrane protein design remains challenging due to the presence of a complex and diverse lipid layer. The goal of this work is to build an implicit model that can include the electrostatic interaction due to the membrane environment for different lipid layers.
Besides the lanthanides, fission of uranium or plutonium by neutron irradiation in a nuclear reactor also generates highly radioactive "minor actinides" (MAs), including Np, Am, and Cm, that constitute the main long-term radiation hazards. To reduce the cost of the long-term disposal of nuclear wastes, the MAs are separated from the lanthanides. The chemical similarity between trivalent actinides and lanthanides makes the partitioning extremely challenging. To understand the underlying principles that govern the selective separation of actinides from lanthanides, this project will carry out ab initio calculations to obtain the optimal structures, energetics, and electronic properties of the trivalent actinide and lanthanide complexes with bis(methyl)dithiophosphinic acid, (CH3)2PS(H)S.
The goal of this project is to develop a conceptual framework that encompasses scalable provenance data analysis tools, predictive models using machine learning and optimization techniques to investigate causes and outcomes pertaining to loss of scientific computing integrity.
This project involves completing density functional theory (DFT) calculations to determine transition states for the keto-enol tautomerization reaction of acetophenone on Pt(111). In the literature, there are conflicting reports of the mechanism, and DFT calculations should provide some insight. The results will be influential in understanding asymmetric hydrogenation reactions on non-chiral surfaces.
Diamond is an ultra-wide bandgap (UWBG) semiconductor with the highest breakdown field and carrier mobility, making it an attractive material of choice for next-generation high-speed and high-power electronic and RF device applications. Recent studies suggest using a thin layer of 2D materials as the gate dielectric layer to mitigate the limitations of oxide-based acceptor layers. This work will explore the electronic properties of BC3, a 2D material with a band gap, as an acceptor layer.
Mental health risk analysis and mitigation have been a challenge for public health researchers. Recently available large data sets and machine learning offer new opportunities to tackle this challenge. We have gathered a large data set with hundreds of community characteristics at the county and ZIP code levels. We have also developed graph learning methods to better understand the data. In addition, we are working on visualization techniques to improve the understanding of high-dimensional data. The goal is to develop a model to delineate the environmental and social determinants relevant to suicide prevention.
Collaborators have experimentally devised a sustainable bio-sourced alternative material that can reduce the need for petroleum-based asphalt in pavement. We have estimated model compositions (via Reverse Monte Carlo computations) that can represent this bio-based multicomponent system in molecular simulations. In this project, the student will continue conducting and interpreting fully atomistic molecular dynamics simulations in order to infer how the presence and activity of different molecule types impact the predicted mechanical properties.
We will use cluster expansions, constructed based on density functional theory (DFT) calculations, to investigate the phase diagram upon intercalation of V2O5, a potential cathode material for non-Li-ion batteries, i.e., batteries that use ions such as Na, Mg, Ca, etc. Cluster expansions are required to be able to run large-scale Monte Carlo simulations. These will be used to introduce temperature to the DFT results, so that phase diagrams for different polymorphs and intercalant concentrations can be obtained. This will also allow for voltage profiles to be calculated.
This is a continuation of the project approved during Fall-2021. It entails the understanding of how the primary structure of knotted proteins folds and how they form knots into a unique configuration.
The ultimate objective of this project is to understand the physics of Rayleigh-Taylor unstable flames. Put simply, a flame is a thin, spatial zone which burns fuel into ashes. If the flame burns upwards in a gravitational field, cold fingers of dense fuel will sink and hot buoyant bubbles of ash will rise, stretching the flame front and increasing the flame speed. This enhanced burning not only drives the explosions of Type Ia supernovae, but also can be harnessed to improve the efficiency of gas-turbine aircraft engines.
Students assist mathematics, computer science and physics faculty in visualizations and analyses of molecular dynamics simulations and density functional theory calculations. The two nano-scale applications are tracking acoustically-controlled defect transitions and simulating thin-film materials to aid photovoltaic technologies. Students adapt code developed by previous student and faculty researchers, and test them on a local Linux cluster before analysis of large-scale production runs on OSC's Owens cluster and DFT calculations using VASP on the University of Toledo's Antec3.
Knee-joint replacement is a procedure that replaces an injured joint with an artificial one, or prosthesis, to mimic the function of a knee. The replacement should be customized considering the patient's age, weight, activity level, and overall health. 3D printing, or additive manufacturing (AM), offers the advantage of producing parts with intricate shapes and geometries for patient-specific biomedical implants. However, AM introduces undesirable residual stresses, porosity, and the challenge of maintaining dimensional stability. This project aims to use numerical simulations to study the shape distortions and mechanical function of 3D-printed knee-joint replacements.
The primary goal of this project is to investigate the surface chemistry and the mechanical and ion-transport properties of several Li:metal and Li:metalloid interfaces using Density Functional Theory (DFT), for applications as Li anodes in Li-ion batteries. This research will result in a comprehensive summary of the binding energies of Li, Na, and graphite with other metallic and non-metallic thin films. Besides performing a systematic study of each system using the PBE functional, we will also use the hybrid functional PBE0 to further increase the accuracy of the results, which have not been reported thus far. In addition to known interfaces, the study will include novel systems that have not yet been studied using either PBE or PBE0. This investigation will add new knowledge to the field of Li-ion batteries, which experimentalists may use as a guide to fabricate high energy-density Li-ion batteries in the future and to extend the cycle life of the anodes.
This project seeks to predict the amount and the geographic distribution of genetic diversity within species of birds and amphibians of the Atlantic Forest by modeling where in the landscape species will encounter obstacles to gene flow that promote genetic differentiation. For that, we will generate models of genetic diversity and divergence from predictor environmental and ecological variables that have the potential to explain the location of barriers to gene flow in the ecosystem. Here, we propose to use machine learning (ML) methods that incorporate climatic and species-specific life history data to this end. The ML framework should allow us to extensively navigate the large amount of available genetic data for these groups and to infer mechanistic relationships between predictor and response variables.
This study aims to understand the structural variations of psoriasin by providing a comparative analysis of the chemical and biophysical properties for the two structures of psoriasin in apo and two different metalated forms (low and high zinc structures) at a molecular level by running molecular dynamics simulations using NAMD.
In this project, we investigate the structural, electronic and optical properties of 2D materials based on Density Functional Theory (DFT). The electronic and optical properties can be extended using many-body approaches such as the GW method. In addition, we will study the effect of layer thickness and single-layer properties. Both non-magnetic layered materials, such as transition metal dichalcogenides (TMDs), and magnetic layered materials, such as transition metal halides, will be studied. The cleavage energy will be calculated to indicate the cleavability into a single layer or a few layers using a mechanical exfoliation technique similar to that used for graphene or other two-dimensional materials. The detailed atomic, electronic and optical properties will be studied.
Bio-image data analysis and search engine development to understand the biofilm formation using 3D reconstruction, modeling and printing. Integrate image datasets from multiple sources to understand the 3D structure of biofilm formation.
We are looking for students interested in joining our research team to help develop software to scale a prototype Simple Evolutionary Exploration (SEE) library to utilize large-scale computing systems. The search space involved in this research is extremely large and requires massive computing resources. This research will look into leveraging High Performance Computing Resources (XSEDE), HTCondor and Cloud Resources. The long-term goal of the project is to build an image annotation system that works in "real time" with the researchers to explore the algorithm space for solutions to scientific image understanding problems.
Students will investigate the Rayleigh-Taylor instability (RTI), which occurs at the interface between two fluids subjected to acceleration, where the heavy fluid sits on top of the light fluid. The current scientific literature has extensively investigated the RTI under the incompressible assumption; however, in many applications, such as high-energy-density processes like inertial confinement fusion (ICF) and supernova formation, the incompressible assumption is no longer valid. In this study, a new set of simulations will be performed to investigate the compressibility effects on two-dimensional (2D) RTI with two species with large density differences.
Knee-joint replacement is a procedure that replaces an injured joint with an artificial one, or prosthesis, to mimic the function of a knee. The replacement should be customized considering the patient's age, weight, activity level, and overall health. 3D printing, or additive manufacturing (AM), offers the advantage of producing parts with intricate shapes and geometries for patient-specific biomedical implants. However, AM introduces undesirable residual stresses, porosity, and the challenge of maintaining dimensional stability. This project aims to use numerical simulations to study the shape distortions and mechanical function of 3D-printed knee-joint replacements.