Angus Fung

I am currently building robotic lamps at Syncere.

Previously, I completed my Ph.D. at the University of Toronto Robotics Institute, where I worked on robot perception and control.

Before that, I worked on learning algorithms at the Vector Institute, where I was advised by Jimmy Ba.

Outside of research, I am a church organist and hold ARCT Diplomas in Piano and Organ Performance from the Royal Conservatory of Music.

Updated: 04/26

Email  /  Google Scholar  /  Github

Research
MLLM-Search: A Zero-Shot Approach to Finding People using Multimodal Large Language Models
Angus Fung, Aaron Hao Tan, Haitong Wang, Beno Benhabib, Goldie Nejat
Robotics (MDPI), 2025
Paper / Video

We present MLLM-Search, a novel multimodal large language model approach to the robotic person search problem in event-driven scenarios where user schedules are incomplete or unavailable. Our method introduces zero-shot person search that uses language models for spatial reasoning, a novel visual prompting method that generates topological graphs with semantic labels, and an MLLM-based search planner that combines region and waypoint planning through our spatial chain-of-thought (SCoT) prompting method.

LDTrack: Dynamic People Tracking by Service Robots using Diffusion Models
Angus Fung, Beno Benhabib, Goldie Nejat
International Journal of Computer Vision, 2024
Paper

We present Latent Diffusion Track (LDTrack), a novel people-tracking architecture for mobile service robots that uses conditional latent diffusion models to solve the robotic problem of tracking multiple dynamic people under intraclass variations.

X-Nav: Learning End-to-End Cross-Embodiment Navigation for Mobile Robots
Haitong Wang, Aaron Hao Tan, Angus Fung, Goldie Nejat
Under Review at RAL, 2025
Paper / Website

We introduce X-Nav, a novel framework for cross-embodiment navigation in which a single unified policy can be deployed across various embodiments, including both wheeled and quadrupedal robots.

Mobile Robot Navigation Using Hand-Drawn Maps: A Vision Language Model Approach
Aaron Hao Tan, Angus Fung, Haitong Wang, Goldie Nejat
IEEE Robotics and Automation Letters, 2025 + ICRA 2026
Paper / Video

We introduce a novel Hand-drawn Map Navigation (HAMNav) architecture that leverages pre-trained vision language models for robot navigation across diverse environments, hand-drawing styles, and robot embodiments, even in the presence of map inaccuracies.

Find Everything: A General Vision Language Model Approach to Multi-Object Search
Angus Fung, Daniel Choi, Haitong Wang, Aaron Hao Tan
CoRL Workshop: Language and Robot Learning, 2024
Under Review at ICRA, 2025
Project Page / Paper

We present Finder, a novel approach to the multi-object search problem that leverages vision language models (VLMs) to efficiently locate multiple objects in diverse unknown environments. Our method combines semantic mapping with spatio-probabilistic reasoning and adaptive planning, improving object recognition and scene understanding through VLMs.

Robots Autonomously Detecting People: A Multimodal Deep Contrastive Learning Method Robust to Intraclass Variations
Angus Fung, Beno Benhabib, Goldie Nejat
IEEE Robotics and Automation Letters + IROS, 2023
Paper / Talk / Abstract / Poster

We present Temporal Invariant Multimodal Contrastive Learning (TimCLR), a novel multimodal person detection architecture that addresses the mobile robot problem of detecting people under intraclass variations (e.g., partial occlusion, varying illumination, pose deformation).

Robots Understanding Contextual Information in Human-Centered Environments using Weakly Supervised Mask Data Distillation
Daniel Dworakowski, Angus Fung, Goldie Nejat
International Journal of Computer Vision (IJCV), 2022
Paper

We present the novel Weakly Supervised Mask Data Distillation architecture for autonomously generating pseudo segmentation labels.

A Multi-Robot Person Search System for Finding Multiple Dynamic Users in Human-Centered Environments
Sharaf C Mohamed, Angus Fung, Goldie Nejat
IEEE Transactions on Cybernetics, 2022
Paper / Video

We present a novel multi-robot person search system to generate search plans for multi-robot teams to find multiple dynamic users before a deadline.

AC/DCC: Accurate Calibration of Dynamic Camera Clusters for Visual SLAM
Jason Rebello, Angus Fung, Steven Waslander
IEEE International Conference on Robotics and Automation (ICRA), 2020
Paper

We present a method that calibrates the time-varying extrinsic transformation between any number of cameras and achieves measurement excitation over the entire configuration space of the mechanism, resulting in a more accurate calibration.

Using Deep Learning to Find Victims in Unknown Cluttered Urban Search and Rescue Environments
Angus Fung, Beno Benhabib, Goldie Nejat
Springer Nature, 2020
Paper

We investigate the first use of deep networks for victim identification in Urban Search and Rescue (USAR), under partial occlusion and varying illumination, on an RGB-D dataset collected by a mobile robot navigating cluttered USAR-like environments.