We present Finder, a novel approach to the multi-object search problem that leverages vision language models (VLMs) to efficiently locate multiple objects in diverse unknown environments. Our method combines semantic mapping with spatio-probabilistic reasoning and adaptive planning, improving object recognition and scene understanding through VLMs.
OLiVia-Nav: An Online Lifelong Vision Language Approach for Mobile Robot Social Navigation
We introduce OLiVia-Nav, an online lifelong vision language architecture for mobile robot social navigation. By leveraging large vision-language models (VLMs) and a novel distillation process called SC-CLIP, OLiVia-Nav efficiently encodes social and environmental contexts, adapting to dynamic human environments.
4CNet: A Diffusion Approach to Map Prediction for Decentralized Multi-Robot Exploration
Daniel Choi (Acknowledged) IEEE Transactions on Robotics 2024, (Pending) arXiv / Paper / Video
We present a novel map prediction method for robot exploration, the Confidence-Aware Contrastive Conditional Consistency Model (4CNet), which predicts (foresees) unknown spatial configurations in unknown, unstructured multi-robot environments with irregularly shaped obstacles.
Trajectory Prediction and LLM Reward Tuning for Robot Social Navigation with Deep Reinforcement Learning
We present a mobile robot social navigation system that combines trajectory prediction with deep reinforcement learning and Large Language Model (LLM) reward tuning in the Omniverse Isaac Gym Environment (OIGE).
Other Projects
These include coursework, side projects, and unpublished research.
ESP32 Fitness Tracker
UofT MIE438: Microprocessors and Embedded Microcontrollers
2023-03-14
Paper / Video / Code
We developed and built a wireless, wearable fitness tracker capable of monitoring user steps and heart rate, aiding individuals in achieving their health and fitness targets.
The camera uses a detection model trained from scratch, which involved working closely with image augmentation, classification, loss metrics, and regression techniques.