Research & Development

My research focuses on applying machine learning to genomics and developing bioinformatics solutions.

For a complete list of publications, visit my Google Scholar profile.

Minimal Residual Disease (MRD) detection
Roche Sequencing Solutions, 2025
Deep Learning, Python, C++, Variant Calling

Development of a deep learning-based MRD detection system

Genotype imputation
Roche Sequencing Solutions, 2024
Autoencoder, Deep Learning, Variant Calling, Imputation, Python

Development of a deep learning-based genotype imputation system

Germline Filtering and TMB
Roche Sequencing Solutions, 2024
Machine Learning, Germline Filtering, TMB, Variant Calling, Python, C++

Identification and filtering of germline mutations for accurate tumor mutational burden (TMB) calculation
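To illustrate why germline filtering matters for TMB, here is a minimal sketch. It assumes TMB is reported as somatic mutations per megabase of covered sequence, and it uses a simple population allele-frequency cutoff as a stand-in for the machine-learning filter. The `Variant` class, the `population_af` field, and the 0.001 threshold are hypothetical and are not taken from the project.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    population_af: float  # allele frequency in a population database (hypothetical field)


def filter_germline(variants, max_population_af=0.001):
    """Keep variants rare enough in the population to be plausibly somatic.

    A rule-based stand-in for a learned germline classifier.
    """
    return [v for v in variants if v.population_af <= max_population_af]


def tumor_mutational_burden(somatic_variants, covered_bases):
    """Return TMB as mutations per megabase of sequenced territory."""
    return len(somatic_variants) / (covered_bases / 1_000_000)


# Made-up calls for illustration only.
calls = [
    Variant("TP53", 0.0),
    Variant("KRAS", 0.0),
    Variant("BRCA2", 0.12),   # common in the population, treated as germline
    Variant("EGFR", 0.0002),
]
somatic = filter_germline(calls)
tmb = tumor_mutational_burden(somatic, covered_bases=1_500_000)
print(f"{len(somatic)} somatic calls, TMB = {tmb:.1f} mutations/Mb")
```

Any germline variant that slips through the filter is counted as somatic and inflates the numerator, which is why the filtering step directly affects TMB accuracy.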

Clustering and consensus generation of sequencing data
Roche Sequencing Solutions, 2023
Sequencing, Algorithm, Python, C++

Clustering and consensus generation of sequencing data

Deduplication of sequencing data
Roche Sequencing Solutions, 2021-2022
Sequencing, Algorithm, Java

Deduplication of sequencing data

DeeplyEssential: Essential gene prediction in bacterial genomes
University of California, Riverside, 2020
Deep Learning, Essential Genes, Genomics

Essential gene prediction in bacterial genomes

The genome of Cowpea (Vigna unguiculata [L.] Walp.)
University of California, Riverside, 2019
Genome Assembly, Plant Genomes, Data Analysis

The genome of Cowpea (Vigna unguiculata [L.] Walp.)

mClass: Cancer type classification with somatic point mutation data
University of California, Riverside, 2019
Machine Learning, Genomics, Python

A multi-class classification approach for identifying cancer types using somatic mutation profiles

Miscellaneous Projects

Other notable projects and contributions in algorithm development and machine learning.

Multimodal Deep Learning for Patient Survival Prediction
Roche Digital Pathology, 2021
Deep Learning, Python

Development of a multimodal deep-learning model for patient survival prediction

  • Worked on a multimodal deep-learning model for patient survival prediction as part of a digital pathology team
  • The model was trained on H&E images, patients' clinical data, and gene expression data
  • It combines a convolutional neural network for the image data with a multilayer perceptron for the clinical and mRNA data
  • The project won several internal awards
  • A patent application for the project is pending
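The two-branch design described above can be sketched in PyTorch as follows. This is only an illustrative outline, not the project's actual model: the class name `MultimodalSurvivalNet`, the layer sizes, the fusion by concatenation, and the single risk-score output are all assumptions.

```python
import torch
from torch import nn


class MultimodalSurvivalNet(nn.Module):
    """Hypothetical late-fusion model: CNN for image tiles, MLP for tabular data."""

    def __init__(self, n_clinical: int, n_genes: int, embed_dim: int = 32):
        super().__init__()
        # Small CNN branch for RGB H&E tiles (layer sizes are illustrative).
        self.image_branch = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # -> (batch, 16, 1, 1) for any tile size
            nn.Flatten(),             # -> (batch, 16)
            nn.Linear(16, embed_dim),
        )
        # MLP branch for concatenated clinical and mRNA expression features.
        self.tabular_branch = nn.Sequential(
            nn.Linear(n_clinical + n_genes, 64),
            nn.ReLU(),
            nn.Linear(64, embed_dim),
        )
        # Fusion head maps the joined embeddings to one risk score per patient.
        self.head = nn.Sequential(
            nn.ReLU(),
            nn.Linear(2 * embed_dim, 1),
        )

    def forward(self, image, clinical, expression):
        img = self.image_branch(image)
        tab = self.tabular_branch(torch.cat([clinical, expression], dim=1))
        return self.head(torch.cat([img, tab], dim=1)).squeeze(1)


# Random tensors stand in for a batch of 4 patients.
model = MultimodalSurvivalNet(n_clinical=5, n_genes=100)
risk = model(torch.randn(4, 3, 64, 64), torch.randn(4, 5), torch.randn(4, 100))
print(risk.shape)  # one score per patient in the batch
```

Concatenating the branch embeddings before a shared head is one common way to fuse modalities; the source does not say which fusion strategy or survival loss the project used.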

Virtual Pathologist using LLM
Roche Digital Pathology, 2023
LLM, HuggingFace, Python

Development of a Virtual Pathologist using Large Language Models

  • Teamed up with the digital pathology group on a Virtual Pathologist project
  • Uses a large language model (LLM) to generate explanations of pathology images
  • Our team won an internal award for this project

Patents

  • US Patent: In Progress (2024)

RAG system for internal documentation
Roche Sequencing Solutions, 2024
LLM, LangChain, RAG, Python

Development of a RAG system for internal documentation

  • Developed a retrieval-augmented generation (RAG) system for internal documentation
  • The system uses a large language model (LLM) to generate explanations of the internal documentation
  • Ongoing work on generating test cases for project technical requirements and product requirements
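
The retrieve-then-generate pattern behind a RAG system can be sketched with the standard library alone. This is a simplified illustration, not the project's implementation: it swaps in bag-of-words cosine similarity for the embedding search a LangChain pipeline would typically use, the sample documents are made up, and the final LLM call is left out. All function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, documents, k=2):
    """Return the k documents most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        documents, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


def build_prompt(question, documents, k=2):
    """Assemble the prompt that would be sent to the LLM."""
    context = "\n".join(f"- {d}" for d in retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


# Made-up documentation snippets for illustration only.
docs = [
    "The alignment service reads FASTQ files and writes BAM output.",
    "Release builds are tagged on the main branch every two weeks.",
    "Variant calling requires a BAM file and a reference genome.",
]
question = "What input does variant calling require?"
prompt = build_prompt(question, docs, k=1)
print(prompt)
```

In a real system the prompt would be passed to the LLM, and retrieval would usually run over embedded chunks in a vector store rather than raw token counts.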