Benchmark and evaluation systems for VLM research
Built evaluation paths that keep model comparisons useful for curation decisions and model iteration.
I work across multimodal pretraining, VLM evaluation, distributed training, and research infrastructure. Recent projects span data curation systems, vLLM eval paths, multi-node training, and agentic tooling that shortens experimental loops.
The through-line is fast feedback from data decisions to trustworthy model comparisons.
Three lines of systems work currently carry the research loop from benchmark design through training and deployment.
Built evaluation paths that keep model comparisons trustworthy for both curation decisions and model iteration.
Built ingestion and export paths that make large multimodal corpora easier to train on and inspect.
Added vLLM evaluation support and hardened multi-node launch and checkpoint behavior for faster experimental turnaround.
Two public artifacts still reflect the same research-engineering direction as the current work.