Dishani Lahiri

I am a second-year MS in Computer Vision (MSCV) student in the Robotics Institute at Carnegie Mellon University. I work on computer vision, natural language processing, and machine learning.

At CMU, I work on 3D reconstruction, scene understanding, and fine-tuning large language models for personalized, domain-specific use cases. I am currently advised by Prof. Kris Kitani, building a low-power visual-inertial odometry system for Aria AR glasses that also works reliably in unseen environments.

During my summer internship at Slingshot AI, I worked in a fast-paced environment with high code-quality standards, which strengthened my research, software engineering, and product skills. I worked on optimizing the fine-tuning of personalized text-to-image models (you can see my results on the home page), improving the results of generative aging models, and fine-tuning LLaMA2-7B for personalized text style transfer (paper coming soon).

I developed an interest in diffusion models and currently aim to work on text-to-video models.

Previously, I worked on impactful and profitable projects at Samsung R&D Institute, Bangalore. At Samsung, I was a key innovator in the development and deployment of AI Night mode in the Samsung flagship series and the Expert RAW application.

I completed my undergraduate studies in ECE at DTU in 2019. My Bachelor's thesis on Neural Caption Generator was advised by Prof. S. Indu, ex-Head of Department, ECE, DTU. Owing to my interest in human activity recognition, I also worked with Prof. D.K. Vishwakarma.

Email  /  CV  /  Bio  /  Google Scholar  /  LinkedIn  /  Github

CMU
MS in Computer Vision
Aug. 22 - Dec. 23

Slingshot AI
ML Research Intern
May 23 - Aug. 23

Samsung R&D Institute
Senior CV Engineer
June 19 - July 22
Software Engineer Intern
May 18 - July 18

DTU, Delhi
B.Tech. ECE
Aug. 15 - May 19

Projects & Publications

I'm interested in computer vision, natural language processing, and machine learning, especially in building personalized multi-modal solutions for edge devices.

S2RF: Semantically Stylized Radiance Fields
Dishani Lahiri*, Neeraj Panse*, Moneish Kumar*
ICCV 2023 Workshop on AI for 3D Content Creation
paper | code | webpage

We present a method for transferring style from arbitrary image(s) to object(s) within a 3D scene. Our primary objective is to offer more control in 3D scene stylization, facilitating the creation of customizable, stylized scene images from arbitrary viewpoints. To achieve this, we propose a novel approach that incorporates a nearest neighborhood-based loss, allowing for flexible 3D scene reconstruction while effectively capturing intricate style details and ensuring multi-view consistency.
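
As a rough illustration of what a nearest-neighbor-based style loss can look like, the sketch below matches each rendered feature vector to its closest style feature and averages the resulting distances. It is not the S2RF implementation: the function name `nearest_neighbor_style_loss`, the choice of cosine distance, and the random toy features are all assumptions made for illustration.

```python
import numpy as np


def nearest_neighbor_style_loss(rendered_feats, style_feats, eps=1e-8):
    """Mean cosine distance from each rendered feature to its nearest style feature.

    rendered_feats: (N, D) array of features from a rendered view (hypothetical input).
    style_feats:    (M, D) array of features from the style image (hypothetical input).
    """
    # Normalize rows so that a dot product equals cosine similarity.
    r = rendered_feats / (np.linalg.norm(rendered_feats, axis=1, keepdims=True) + eps)
    s = style_feats / (np.linalg.norm(style_feats, axis=1, keepdims=True) + eps)
    cos_dist = 1.0 - r @ s.T  # shape (N, M)
    # For every rendered feature keep only its nearest style feature, then average.
    return float(cos_dist.min(axis=1).mean())


# Toy data standing in for real image features (assumption: 16-dim features).
rng = np.random.default_rng(0)
style = rng.normal(size=(50, 16))
rendered_same = style[:10] * 2.0           # same directions as style, different scale
rendered_rand = rng.normal(size=(10, 16))  # unrelated features

loss_same = nearest_neighbor_style_loss(rendered_same, style)
loss_rand = nearest_neighbor_style_loss(rendered_rand, style)
```

Because each rendered feature is compared only with its single nearest style feature, the loss is near zero when the rendered features point in the same directions as some style features, regardless of scale, and larger for unrelated features.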

Abnormal human action recognition using average energy images
Dishani Lahiri*, Chhavi Dhiman, Dinesh Kumar Vishwakarma
IEEE Conference on Information and Communication Technology (CICT), 2017
paper

We propose a solution for detecting abnormal human actions in images, using Histogram of Oriented Gradients (HoG) as the feature descriptor, Principal Component Analysis (PCA) for dimensionality reduction, and a Support Vector Machine (SVM) as the classifier. We also release a dataset of abnormal human activities covering fainting, headache, and chest pain.

Teaching Experience
  • Advanced Computer Vision, CMU (TA) | Instructor: Prof. David Held | Fall 2023
    This is a new PhD-level course in which I prepare and improve the assignments, maintain the course website, hold office hours, and help students with the theory and code behind the concepts covered in the course.
  • Machine Learning, CMU (TA) | Instructor: Prof. Matt Gormley | Spring 2023
    I prepared and suggested exam and assignment problems and course material to make the course more effective, and held recitations and office hours for students.
Awards and Recognition
  • Winner (most creative use of GitHub), HackCMU : Awarded for our project, How Do I Look?, which uses image-to-text models and Large Language Models to suggest outfits based on the event
  • Samsung Excellence Award (earlier Samsung Citizen Award), Advanced Development Category : Company-wide award recognizing major contributions to Night Mode R&D for the S21 flagship series
  • Standout Performer in Advanced R&D Work : Selected as 1 out of 100 people in the Camera Systems Group to receive this award for consistently exceptional efforts in research and implementation
  • Samsung Citizen Award, Group Excellence Category : Company-wide group award recognizing major contributions to the development of camera use cases on the A71-5G device, the first device with the SM7250 chipset
  • Standout Performer in Advanced R&D Work : Selected as 1 out of 100 people in the Camera Systems Group to receive this award for consistently exceptional efforts in research and implementation
  • 1H-2020 Project Incentives : Selected as 1 of 2 out of 100 people in the Camera Systems Group to receive the incentive in recognition of exceptional performance in critical projects
  • Appreciation letter from HRD Ministry of India : For scoring in the top 0.1 percentile in the Class 12 CBSE examination. The HRD Ministry is the Government of India body that formulates the National Policy on Education