LDTrack: Dynamic People Tracking by Service Robots using Diffusion Models Angus Fung,
Beno Benhabib,
Goldie Nejat International Journal of Computer Vision, 2024
Paper
We present a novel people tracking architecture for mobile service robots using conditional latent diffusion models, which we name Latent Diffusion Track (LDTrack), to solve the robotic problem of tracking multiple dynamic people under intraclass variations.
We introduce X-Nav, a novel framework for cross-embodiment navigation where a single unified policy can be deployed across various embodiments for both wheeled and quadrupedal robots.
We present MLLM-Search, a novel multimodal language model approach to address the robotic person search problem under event-driven scenarios with incomplete or unavailable user schedules. Our method introduces zero-shot person search using language models for spatial reasoning, a novel visual prompting method generating topological graphs with semantic labels, and an MLLM-based search planner combining region and waypoint planning through our spatial chain-of-thought (SCoT) prompting method.
We introduce a novel Hand-drawn Map Navigation (HAMNav) architecture that leverages pre-trained vision language models for robot navigation across diverse environments, hand-drawing styles, and robot embodiments, even in the presence of map inaccuracies.
Find Everything: A General Vision Language Model Approach to Multi-Object Search Angus Fung,
Daniel Choi,
Haitong Wang,
Aaron Hao Tan CoRL Workshop: Language and Robot Learning, 2024
Under Review at ICRA, 2025
Project Page /
Paper
We present Finder, a novel approach to the multi-object search problem that leverages vision language models (VLMs) to efficiently locate multiple objects in diverse unknown environments. Our method combines semantic mapping with spatio-probabilistic reasoning and adaptive planning, improving object recognition and scene understanding through VLMs.
Robots Autonomously Detecting People: A Multimodal Deep Contrastive Learning Method Robust to Intraclass Variations Angus Fung,
Beno Benhabib,
Goldie Nejat IEEE Robotics and Automation Letters + IROS, 2023
Paper /
Talk /
Abstract /
Poster
We present a novel multimodal person detection architecture to address the mobile robot problem of person detection under intraclass variations (e.g. partial occlusion, varying illumination, pose deformation) by introducing our Temporal Invariant Multimodal Contrastive Learning (TimCLR) method.
Robots Understanding Contextual Information in Human-Centered Environments using Weakly Supervised Mask Data Distillation Daniel Dworakowski,
Angus Fung,
Goldie Nejat International Journal of Computer Vision (IJCV), 2022
Paper
We present the novel Weakly Supervised Mask Data Distillation architecture for autonomously generating pseudo segmentation labels.
A Multi-Robot Person Search System for Finding Multiple Dynamic Users in Human-Centered Environments Sharaf C Mohamed,
Angus Fung,
Goldie Nejat IEEE Transactions on Cybernetics, 2022
Paper /
Video
We present a novel multi-robot person search system to generate search plans for multi-robot teams to find multiple dynamic users before a deadline.
AC/DCC : Accurate Calibration of Dynamic Camera Clusters for Visual SLAM Jason Rebello,
Angus Fung,
Steven Waslander IEEE International Conference on Robotics and Automation (ICRA), 2020
Paper
We present a method to calibrate the time-varying extrinsic transformation between any number of cameras, which achieves measurement excitation over the entire configuration space of the mechanism, resulting in a more accurate calibration.
Using Deep Learning to Find Victims in Unknown Cluttered Urban Search and Rescue Environments Angus Fung,
Beno Benhabib,
Goldie Nejat Springer Nature, 2020
Paper
We investigate the first use of deep networks for victim identification in Urban Search and Rescue (USAR), for cases of partial occlusion and varying illumination, on an RGB-D dataset obtained by a mobile robot navigating cluttered USAR-like environments.
An all-in-one service built into iMessage aimed at lowering the barrier of entry for LLMs and Generative AI.
Launched multi-modal conversations with a proprietary model 4 months before GPT-4V.
Gained significant traction with thousands of monthly active users.
Recognition
2024: Doctoral Completion Award ($4k)
2024: LocalHost Fellowship ($3k)
2024: Microsoft Startup Hub Program ($150k)
2023: Ontario Graduate Scholarship - University of Toronto ($15k)
2022: Rimrott Memorial Graduate Scholarship - University of Toronto ($4k)
2021: RO-MAN Roboethics Competition, McGill University - 1st Place ($1k)
2021: University of Toronto MIE Fellowship ($14k)
2020: Queen Elizabeth II Graduate Scholarship - University of Toronto ($15k)
2020: University of Toronto MIE Fellowship ($14k)
2019: University of Toronto MIE Fellowship ($14k)
2019: Healthcare Robotics NSERC Fellowship ($10k)
2014-2018: Dean's Honour List
2014: Delta Tau Delta Award ($3k)
2014: University of Toronto Scholars (Academic Excellence) ($7.5k)
2014: University of Toronto Scholar ($5k)
2013: ARCT Diploma - Piano Performance
2013: ARCT Diploma - Organ Performance
Teaching
2024F: ROB501: Computer Vision for Robotics, TA, University of Toronto
2024W: MIE443: Mechatronics Systems: Design & Integration, Head Tutorial TA, University of Toronto
2023F: MIE443: Mechatronics Systems: Design & Integration, Head Tutorial TA, University of Toronto
2022F: ROB501: Computer Vision for Robotics, TA, University of Toronto
2022W: MIE443: Mechatronics Systems: Design & Integration, Head Tutorial TA, University of Toronto
2021W: MIE443: Mechatronics Systems: Design & Integration, Head Tutorial TA, University of Toronto
2020W: MIE443: Mechatronics Systems: Design & Integration, Head Tutorial TA, University of Toronto