NCSI

   

High-Throughput Investigation of Organic Molecular Crystal Energy Frameworks and Intermolecular Orbital Overlap.



Status: Completed
Mentor Name: Bohdan Schatschneider
Mentor's XSEDE Affiliation: Previous research allocation recipient and currently applying for more time
Mentor Has Been in XSEDE Community: 4-5 years
Project Title: High-Throughput Investigation of Organic Molecular Crystal Energy Frameworks and Intermolecular Orbital Overlap.
Summary: This position entails two tasks: 1) calculating the energy frameworks of ~1000 organic molecular crystals using the Hirshfeld surface software CrystalExplorer, and 2) calculating the intermolecular orbital overlap of neighboring molecules in the crystal lattice using fragment orbital density functional theory (FO-DFT). The work requires coordinating two undergraduate researchers (UGRs), who will run the calculations and assemble the data in one file for later analysis using machine learning algorithms.
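The "assemble the data in one file" step can be pictured as a join on a shared crystal identifier. The following is a minimal sketch in Python, chosen only for illustration; the listing itself names C++ for the project's automation scripts. The field names `crystal_id`, `framework_energy`, and `orbital_overlap`, the identifiers, and all numeric values are hypothetical placeholders, not project data or the project's actual file format.

```python
import csv
import io


def merge_results(framework_rows, overlap_rows, key="crystal_id"):
    """Join two lists of per-crystal records on a shared identifier.

    Only crystals present in BOTH result sets are kept, so a failed
    calculation in either workflow drops that crystal from the table.
    """
    overlap_by_id = {row[key]: row for row in overlap_rows}
    merged = []
    for row in framework_rows:
        other = overlap_by_id.get(row[key])
        if other is not None:
            merged.append({**row, **other})
    return merged


def to_csv(rows):
    """Serialize merged records to CSV text with a header row."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0].keys()), lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical placeholder records (made-up values, not project results).
framework = [
    {"crystal_id": "XTAL001", "framework_energy": -95.2},
    {"crystal_id": "XTAL002", "framework_energy": -120.7},
]
overlap = [
    {"crystal_id": "XTAL001", "orbital_overlap": 0.012},
    {"crystal_id": "XTAL003", "orbital_overlap": 0.034},
]

merged = merge_results(framework, overlap)
csv_text = to_csv(merged)
print(csv_text)
```

Keeping only crystals that appear in both result sets is one possible design choice; a real workflow might instead keep every crystal and mark the missing values so that failed jobs can be rerun.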
Job Description: Two undergraduate researchers (UGRs) will write C++ scripts to automate the crystalline energy framework and intermolecular orbital overlap calculations. These C++ scripts will queue the jobs on XSEDE’s Open Science Grid and then automatically extract all of the necessary data from the resulting metadata. One UGR will focus on calculating the energy frameworks of ~1000 organic molecular crystals (OMCs) using the CrystalExplorer freeware package. The other UGR will perform high-throughput intermolecular orbital overlap calculations using FO-DFT within the NWChem freeware package on the same set of OMCs. Structural and electronic property information from both sets of calculations will then be used to develop quantitative structure-property relationships (QSPRs) using a variety of machine learning algorithms, such as ordinary least squares linear regression (OLS), artificial neural networks (ANNs), support vector machines (SVMs), and random forests (RFs), within the R-project software suite.
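Of the algorithms listed, ordinary least squares is the simplest to show in full. The sketch below fits a one-descriptor OLS line and reports its coefficient of determination. It is written in Python purely as an illustration; the listing specifies the R-project suite for the actual modeling. The function names and the synthetic data are hypothetical and do not come from the project.

```python
def fit_ols(x, y):
    """Fit y = slope * x + intercept by ordinary least squares.

    Uses the closed-form solution for a single descriptor:
    slope = S_xy / S_xx, intercept = mean(y) - slope * mean(x).
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    if s_xx == 0:
        raise ValueError("descriptor values must not all be identical")
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Coefficient of determination of the fitted line on (x, y)."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Synthetic data lying exactly on y = 2x + 1 (not project results).
descriptor = [0.0, 1.0, 2.0, 3.0]
target = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_ols(descriptor, target)
score = r_squared(descriptor, target, slope, intercept)
print(slope, intercept, score)
```

A QSPR built from many structural and electronic descriptors would need the multivariate form of the same fit, which is what a call such as R's `lm` provides.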
Computational Resources: We have applied for a startup allocation on the Open Science Grid resource to run the crystalline energy framework and intermolecular orbital overlap calculations. This startup allocation is also slated for use by another project running through the EMPOWER program, entitled “Establishing the Electronic Genome of Functionalized Polycyclic Aromatic Hydrocarbons (PAHs) using High-throughput DFT and Machine Learning”. If the two projects use up the 200,000 core allocation, we will apply for more time. In addition, we have a high-performance computing cluster here at CPP that we can use to complete the work.
Contribution to Community
Position Type: Apprentice
Training Plan: Both positions will require the UGRs to use the C++ scripts to perform the repeated calculations and data extraction associated with the high-throughput routines. The PI will work with the UGRs during the first week of the project so that they become familiar with running the scripts and extracting the data. Both UGRs will be involved in developing the QSPR models using the ML algorithms in R-Project. My collaborator, Dr. Carsten Lange, will train the two UGRs in the use of the R-Project software.
Student Prerequisites/Conditions/Qualifications
Duration: Summer
Start Date: 06/01/2019
End Date: 08/20/2019
