Javad Rahimikollu

Interpretable AI/ML

Machine Learning in Healthcare | Computational Biology PhD Candidate | Published in Science Translational Medicine & Nature Methods

Research Data Scientist with 10+ years of experience leading machine learning initiatives and predictive modeling in healthcare and biomedical domains.

Research Interests

My research focuses on developing and applying advanced computational methods to solve complex problems in healthcare and biology.

Statistical Machine Learning

Graphical models, generative models, deep learning, interpretable machine learning models, causality

Computational Biology

Multi-omics and multimodal integration, gene perturbation analysis, identifiable matrix factorization

Optimization

Linear and non-linear programming, convex optimization

Awards and Honors

Arts & Sciences Graduate Fellowship in Artificial Intelligence

University of Pittsburgh

2017-2018

Outstanding Teaching Assistant and Instructor Award

West Virginia University

2016

Research Projects

Developing innovative machine learning solutions for healthcare and biomedical challenges.

SLIDE: Significant Latent Factor Interaction Discovery and Exploration

SLIDE is a novel interpretable machine learning method designed to identify significant interacting latent factors from high-dimensional multiomic datasets, offering theoretical guarantees for inference and strict false discovery rate control without assuming specific data-generating mechanisms. Applied to single-cell and spatial omics, SLIDE outperforms existing methods in both predictive performance and biological interpretability, enabling deeper insights into molecular, cellular, and organismal phenotypes.

Machine Learning, Python, Healthcare Analytics

Multi-modal Integration of Genetic, Cellular, Molecular, and Protein Interaction Data: Discovering Distinct RA Endotypes

A novel multi-scale framework that couples network-based genome-wide association studies (GWAS) to functional genomic data to uncover network modules distinguishing CCP+ and CCP- rheumatoid arthritis. Using the RACER cohort of 555 CCP+/RF+ and 384 CCP-/RF+ RA patients, we uncovered significant differences in heritability and identified 14 putative gene modules explaining genetic differences between these disease groups. Our findings demonstrate the utility of network-based approaches in elucidating the complex genetic landscape of RA, offering new insights into differential genetic risk factors and paving the way for more personalized therapeutic strategies.

Network Analysis, GWAS, Multi-omics Integration

Preprint

Publications

Research published in high-impact journals including Science Translational Medicine and Nature Methods.

Nature Methods

SLIDE: Significant Latent Factor Interaction Discovery and Exploration across biological domains

Rahimikollu J et al. (2024)

Modern multiomic technologies can generate deep multiscale profiles. However, differences in data modalities, multicollinearity of the data, and large numbers of irrelevant features make analyses and integration of high-dimensional omic datasets challenging. Here we present Significant Latent Factor Interaction Discovery and Exploration (SLIDE), a first-in-class interpretable machine learning technique for identifying significant interacting latent factors underlying outcomes of interest from high-dimensional omic datasets. SLIDE makes no assumptions regarding data-generating mechanisms, comes with theoretical guarantees regarding identifiability of the latent factors/corresponding inference, and has rigorous false discovery rate control. Using SLIDE on single-cell and spatial omic datasets, we uncovered significant interacting latent factors underlying a range of molecular, cellular and organismal phenotypes. SLIDE outperforms/performs at least as well as a wide range of state-of-the-art approaches, including other latent factor approaches. More importantly, it provides biological inference beyond prediction that other methods do not afford. Thus, SLIDE is a versatile engine for biological discovery from modern multiomic datasets.

Science Translational Medicine

Deep humoral profiling coupled to interpretable machine learning unveils diagnostic markers and pathophysiology of schistosomiasis

Saha A, Chakraborty T, Rahimikollu J et al. (2024)

Schistosomiasis, a highly prevalent parasitic disease, affects more than 200 million people worldwide. Current diagnostics based on parasite egg detection in stool detect infection only at a late stage, and current antibody-based tests cannot distinguish past from current infection. Here, we developed and used a multiplexed antibody profiling platform to obtain a comprehensive repertoire of antihelminth humoral profiles including isotype, subclass, Fc receptor (FcR) binding, and glycosylation profiles of antigen-specific antibodies. Using Essential Regression (ER) and SLIDE, interpretable machine learning methods, we identified latent factors (context-specific groups) that move beyond biomarkers and provide insights into the pathophysiology of different stages of schistosome infection. By comparing profiles of infected and healthy individuals, we identified modules with unique humoral signatures of active disease, including hallmark signatures of parasitic infection such as elevated immunoglobulin G4 (IgG4). However, we also captured previously uncharacterized humoral responses including elevated FcR binding and specific antibody glycoforms in patients with active infection, helping distinguish them from those without active infection but with equivalent antibody titers. This signature was validated in an independent cohort. Our approach also uncovered two distinct endotypes, nonpatent infection and prior infection, in those who were not actively infected. Higher amounts of IgG1 and FcR1/FcR3A binding were also found to be likely protective of the transition from nonpatent to active infection. Overall, we unveiled markers for antibody-based diagnostics and latent factors underlying the pathogenesis of schistosome infection. Our results suggest that selective antigen targeting could be useful in early detection, thus controlling infection severity.

Patterns (Cell Press)

Rahimikollu J, Das J. (2022)

Review of methodologies for integrating diverse biomedical data types. This paper provides a comprehensive overview of current approaches and future directions in multi-omics integration.

medRxiv (In Revision)

Multi-modal integration of protein interactomes with genomic and molecular data discovers distinct RA endotypes

Rahimikollu J et al. (2025)

Preprint posted August 5, 2025; manuscript in revision at Arthritis & Rheumatology.

Bioinformatics

DataRemix: A universal data transformation for optimal inference from gene expression datasets

Mao W, Rahimikollu J et al. (2021)

RNA-seq technology provides unprecedented power in the assessment of transcript abundance and can be used to perform a variety of downstream tasks, such as inference of gene-correlation networks and eQTL discovery.

Pediatric Nephrology

Predictors of patency for arteriovenous fistulae and grafts in pediatric hemodialysis patients

Onder AM, ..., Rahimikollu J et al. (2019)

Identification of factors that predict patency of arteriovenous fistulae and grafts in pediatric hemodialysis patients. This work was conducted in collaboration with the Midwest Pediatric Nephrology Consortium.

Pediatric Nephrology

Predictors of time to first cannulation for arteriovenous fistula in pediatric hemodialysis patients

Onder AM, ..., Rahimikollu J et al. (2020)

Midwest Pediatric Nephrology Consortium study on identifying predictive factors for arteriovenous fistula cannulation timing.

Energy

Towards a cleaner future: A novel approach to enhance energy efficiency in the US manufacturing industry with fuzzy logic-TOPSIS

Abolhassani A, Rahimikollu J, Gopalakrishnan B (2025)

Novel approach to enhance energy efficiency in the US manufacturing industry using fuzzy logic and TOPSIS methodology.

Advances in Water Resources

Hydroclimatic sustainability assessment of changing climate on cholera in the Ganges-Brahmaputra basin

Nasr-Azadani F, ..., Rahimikollu J et al. (2017)

Assessment of climate change impacts on cholera prevalence in the Ganges-Brahmaputra basin using hydroclimatic modeling.

Applied Mathematical Modelling

Comparison of first aggregation and last aggregation in fuzzy group TOPSIS

Roghanian E, Rahimi J, Ansari A (2010)

Comparative analysis of aggregation methods in fuzzy group decision-making using TOPSIS methodology.

Skills & Expertise

Technical and domain expertise in data science, machine learning, and healthcare analytics.

Machine Learning & AI

Predictive Modeling 95%

Statistical Learning 90%

Natural Language Processing 85%

Deep Learning 80%

Feature Engineering 90%

Programming & Development

Python 95%

R 90%

SQL 85%

MATLAB 80%

C++ 75%

Data Engineering & Analytics

Data Visualization 90%

ETL Pipeline Development 85%

Large-scale Data Processing 80%

Data Integration 90%

Database Management 80%

Healthcare Domain Knowledge

Electronic Medical Records 90%

Biomedical Data Analysis 95%

Clinical Outcome Prediction 90%

Multi-omics Data Analysis 85%

Healthcare Systems 80%

Leadership & Communication

Team Leadership 85%

Technical Mentorship 90%

Stakeholder Communication 85%

Project Management 80%

Research Publication 95%

Education & Experience

Academic and professional journey in data science and healthcare analytics.

Ph.D., Computational Biology

Carnegie Mellon University & University of Pittsburgh

Graduate Research Assistant, Department of Immunology

University of Pittsburgh

M.S., Statistics

West Virginia University, Morgantown, WV

2015 - 2017

M.S., Industrial Engineering

West Virginia University, Morgantown, WV

Graduate Teaching Assistant and Instructor, Department of Statistics

West Virginia University

Graduate Teaching Assistant, Industrial Engineering

West Virginia University

Assistant Project Manager

Aetna Insurance Company, Hartford, Connecticut

B.S., Industrial Engineering

Khaje Nasir University, Tehran, Iran

Contact

Interested in collaboration or have questions about my research? Get in touch.

javad@pitt.edu

Pittsburgh, PA

Available after Jan 30th