Justin (Zhaocong) Yuan

I am a MASc student in Reinforcement Learning (RL) and Robotics at the University of Toronto, supervised by Prof. Angela Schoellig at the Dynamic Systems Lab (DSL), also part of the Vector Institute and the UofT Robotics Institute.

I received my BASc degree in Engineering Science (Robotics) from UofT. Before joining DSL, I interned on the Apple Siri team in Seattle and at the Nvidia Toronto AI Lab led by Prof. Sanja Fidler. I also spent time as a research student at the Data-Driven Decision Making Lab led by Prof. Scott Sanner.

I'm generally interested in machine learning, reinforcement learning, and robotics. My current research focus is on safe RL and transfer learning in RL (specifically in Sim-to-Real applications).

Email  /  CV  /  LinkedIn  /  Google Scholar  /  GitHub


News & Activities

  • New: IROS 2022 presentation of the safe-control-gym paper.
  • Oct 2022, Paper accepted to NeurIPS 2022 Workshops (Distribution Shifts: Connecting Methods and Applications & Progress and Challenges in Building Trustworthy Embodied AI).
    DistShift / TEA
  • Aug 2022, Vector AI Engineering Blog by Catherine Glossop on using safe-control-gym to benchmark RL robustness.
  • May 2022, ICRA 2022 Workshop on Releasing Robots into the Wild: Simulations, Benchmarks, and Deployment (co-organizer).
    website / videos
  • Apr 2022, UofT AER1517 guest lecture (co-speaker).
  • Mar 2022, Vector Institute Industry Workshop (co-speaker).
  • Dec 2021, NeurIPS 2021 Workshop on Deployable Decision Making in Embodied Systems (committee team).
    website / videos (from NeurIPS page)
  • Nov 2021, UCSD guest lecture on Safe Learning in Robotics (co-speaker).
  • Sep 2021, IROS 2021 Workshop on Safe Real-World Robot Autonomy (live session team, volunteer).
    website / videos
Education
MASc in Aerospace Science and Engineering     Sept 2020 - Nov 2022

University of Toronto (Supervisor: Angela P. Schoellig)

BASc in Engineering Science (Robotics)     Sept 2015 - Apr 2020

University of Toronto (Supervisor: Sanja Fidler), graduated with High Honours

Work Experience
Nvidia Toronto AI Lab           Sept 2018 - Sept 2019

I worked as a deep learning intern and focused on synthetic data generation for computer vision tasks in autonomous driving. I also worked on trajectory prediction, graph neural networks, and distribution matching on videos.

Apple Siri NLU Team (Seattle)           May 2018 - Sept 2018

I worked on learning-to-rank problems for the Siri pipeline, experimenting with both traditional supervised learning techniques and deep learning sequence models.

Data-Driven Decision Making Lab (UofT)           May 2017 - Sept 2017

I worked as a research intern and focused on image classification and text classification using deep learning models such as CNNs, RNNs, and attention networks.

Research and Publications
Characterising the Robustness of Reinforcement Learning for Continuous Control using Disturbance Injection
Catherine Glossop, Jacopo Panerati, Amrit Krishnan, Zhaocong Yuan, Angela P. Schoellig
NeurIPS Workshops (DistShift & TEA), 2022
paper / arXiv / blog / bibtex

We leverage safe-control-gym to benchmark robust RL methods (specifically robust adversarial RL) against common disturbances in a robotics setting.

Safe-Control-Gym: A Unified Benchmark Suite for Safe Learning-Based Control and Reinforcement Learning in Robotics
Zhaocong Yuan, Adam W. Hall, Siqi Zhou, Lukas Brunke, Melissa Greeff, Jacopo Panerati, Angela P. Schoellig
IROS & RA-L, 2022
arXiv / code / bibtex

We propose safe-control-gym as a benchmark suite for safe learning in robotics. It implements several PyBullet-based benchmark environments, along with control algorithms from traditional control, safe-learning control, safe RL, and robust RL.

[NEW] RL transfer learning baselines have recently been added.

Safe Learning in Robotics: From Learning-Based Control to Safe Reinforcement Learning
Lukas Brunke, Melissa Greeff, Adam W. Hall, Zhaocong Yuan, Siqi Zhou, Jacopo Panerati, Angela P. Schoellig
Annual Review of Control, Robotics, and Autonomous Systems, 2022
project / paper / arXiv / code / bibtex

We conduct an extensive review of safe-learning-based methods in robotics and provide a formulation of safe-learning-control that bridges control theory and reinforcement learning. We also show safe control examples to highlight the need for a safety benchmark.

Meta-Sim: Learning to Generate Synthetic Datasets
Amlan Kar, Aayush Prakash, Ming-Yu Liu, Eric Cameracci, Justin Yuan, Matt Rusiniak, David Acuna, Antonio Torralba, Sanja Fidler
ICCV, 2019 (Oral)
project / paper / arXiv / code / bibtex

We propose Meta-Sim, which learns a generative model of synthetic scenes with the help of a graphics engine. It minimizes the distribution gap between synthetic images and real images. Experiments show that Meta-Sim can greatly improve scene quality and help in downstream task training.

Other Projects
Benchmarking Reinforcement Learning for Safe Robotics: Constraints, Robustness and Transfer
Zhaocong Yuan, Angela P. Schoellig (supervisor)
Graduate Thesis, UofT, 2022
doc (upcoming) / code

We provide the full description of safe-control-gym and use it to perform in-depth benchmarks of RL baselines on three aspects of safety in robot control: constraints, robustness, and transfer. We also propose useful practices for designing safe agents by looking at their respective ablations.

Emergent Communication Behaviors in Multi-Agent Systems
Zhaocong Yuan, Sanja Fidler (supervisor)
Undergraduate Thesis, UofT, 2020
code (agent) / code (environment)

We investigate emergent behaviors in multi-agent reinforcement learning (MARL) using the OpenAI MPE environment. Our experiments show that meaningful team collaboration and communication protocols can be learned.

More About Me

I am especially interested in the intersection of machine learning and simulation technologies, with notable applications such as gaming and robotics (a possible influence from the Matrix Trilogy). Besides doing research and programming, I also enjoy reading, pop music, and anime in my free time. My go-to weekend relaxation is often a new chapter of One Piece, plus looping through some top music charts.

Many thanks to Jon Barron's awesome template!
Other template references: Abhishek Kar, Tingwu Wang, Stuart Geiger