Hello, I'm Yash

performance engineering
ai inference
database systems

When I'm not optimizing systems, you'll find me exploring new films, strategizing over board games, or planning my next trekking adventure.

Databricks

Mountain View

2025 - Present

Firebolt

Seattle

Summer 2024

Carnegie Mellon University

Pittsburgh

2023 - 2025

Quadeye

Gurgaon

2020 - 2023

Samsung Research

Bangalore

Summer 2019

Indian Institute of Technology

Guwahati

2016 - 2020

Sri Sathya Sai Vidya Vihar

Indore

2002 - 2016

Work Experience

Feb 2025 - Present
Mountain View, USA

Software Engineer @ Databricks

  • Developed networking optimizations to reduce latency in Unity Catalog metadata operations across distributed cloud environments
  • Implemented advanced caching strategies for frequently accessed catalog metadata, improving query performance
Scala, Distributed Systems, Cloud Computing, Performance Optimization

Sep 2020 - Jun 2023
Gurgaon, India

Systems Engineer @ Quadeye

  • Worked on a low-latency trading system for high-frequency trading strategies in C++17
  • Implemented order entry and market data feed systems for exchanges such as the National Stock Exchange and the Chicago Mercantile Exchange, using io_uring, eBPF, and kernel bypass on Exablaze NICs
  • Designed and implemented a geographically distributed trading system that executes a unified portfolio strategy across continents and exchanges, utilizing communication channels such as 10Gb Ethernet, microwave, and shared memory
C++17, Low Latency Systems, eBPF, Kernel Bypass, Financial Markets

Internships

Summer 2024
Kirkland, WA

Software Engineering Intern @ Firebolt

  • Enhanced OLAP cloud warehouse performance by implementing asynchronous SSD scanning with io_uring, significantly increasing concurrent query capacity
  • Improved C++ system efficiency through rigorous performance benchmarking with Google Benchmark microbenchmarks, using flamegraph analysis to target improvements
C++, ClickHouse, io_uring, SSD Optimization, Async Programming

Summer 2019
Bangalore, India

Software Engineering Intern @ Samsung Research

  • Developed an ONNX-to-HVX model conversion package for Samsung Neural SDK, optimizing neural networks for Qualcomm AI processors
  • Optimized model loading in Qualcomm Hexagon SDK, reducing on-device AI inference time by 20%
ONNX, Qualcomm Hexagon, HVX, Mobile AI

Education

2023 - 2025
Pittsburgh, USA

Carnegie Mellon University

Master of Science in Computer Systems

Deep Learning + Databases + Operating Systems

  • Publications in conferences with Prof. Justine Sherry
  • TA for 15-445 Database Systems with Prof. Andy Pavlo
2016 - 2020
Guwahati, India

Indian Institute of Technology

Bachelor of Technology in Mathematics and Computing

Mathematics + Computer Science + Finance

  • Secured rank 982 out of 1 million in IIT JEE Advanced
  • Awarded full tuition waiver for academic excellence
2002 - 2016
Indore, India

Sri Sathya Sai Vidya Vihar

Elementary to High School

Reading + Writing + Arithmetic

  • Successfully learned the alphabet without eating the crayons
  • Graduated from counting on fingers to actual mathematics

Publications

ACM HotNets '23
Cambridge, USA

How I Learned to Stop Worrying About CCA Contention

Lloyd Brown, Yash Kothari, Akshay Narayan, Arvind Krishnamurthy, Aurojit Panda, Justine Sherry, Scott Shenker

The paper argues that congestion control algorithm (CCA) contention between flows no longer dominates bandwidth allocation in today's Internet, due to operator policies and application limitations.

ACM SIGCOMM '24
Sydney, Australia

CCAnalyzer: An Efficient and Nearly-Passive Congestion Control Classifier

Ranysha Ware, Adithya Abraham Philip, Nicholas Hungria, Yash Kothari, Justine Sherry, Srinivasan Seshan

The paper presents a novel CCA classifier that uses bottleneck queue occupancy traces and dynamic time warping to accurately identify CCAs with better efficiency and interpretability.

ACM IMC '24
Madrid, Spain

Reverse-Engineering Congestion Control Algorithm Behavior

Margarida Ferreira, Ranysha Ware, Yash Kothari, Inês Lynce, Ruben Martins, Akshay Narayan, Justine Sherry

The paper presents a program synthesis pipeline that reverse-engineers CCAs from packet traces by formulating synthesis as an optimization problem.