I am a Researcher at Shanghai AI Laboratory.
I completed my PhD in Computer Science at the State University of New York (SUNY),
Binghamton, in May 2024.
My PhD studies were supported for 4.5 years by grants from the Ford Motor Company.
I also received the Academic Excellence in Computer Science (PhD) award at Binghamton University.
During my master's program, I was supervised by Professor Chao Chen.
I received my M.S. in Computer Science in 2019 and my B.S. in Mechanical
Engineering in 2016 from Chongqing University, China.
My master's thesis, which was awarded the Outstanding Thesis of Chongqing City, is available via
this link.
I am seeking interns for Embodied AI and Robotics! Feel
free to contact me at yding25@binghamton.edu.
Research Direction
My current research interests include:
Spatial Intelligence for Robotics: Empowering Robots to Understand the Real World
Skill Learning for Robotics: Enabling Robots to Transform the Real World
I place particular emphasis on applying both to mobile manipulators (MoMa).
Robot Family
BestMan X reflects my vision for these robots to be the best assistants
for humans.
Robotic Tool
Because my team works with multiple robots, we are developing an open-source robotic tool
called BestMan. The tool supports development
both in simulation and on real machines. By providing a unified framework, BestMan
speeds up development and saves researchers significant time. (Note: BestMan
is still under construction.)
This project encompasses various sub-projects (selected):
It is a comprehensive “BestMan” world, encompassing open data, code, simulators, hardware, and my aspirations. For more information, please visit
[Link].
To address these challenges, we develop the BestMan platform based on the PyBullet
simulator, with the following key contributions:
1) Integrated Multilevel Skill Chain to Address Multilevel Technical Complexity;
2) Highly Modular Design for Expandability and Algorithm Integration;
3) Unified Interfaces for Simulation and Real Devices to Address Interface Heterogeneity;
4) Decoupling Software from Hardware to Address Hardware Diversity.
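The idea behind contributions 3) and 4) can be illustrated with a minimal sketch. It is not BestMan's actual API: the names `ArmBackend`, `SimulatedArm`, `RealArm`, and `go_home` are hypothetical, and both backends are stand-ins (a real simulator backend would presumably wrap PyBullet calls, and a real hardware backend would wrap a vendor driver). The point is only that skill-level code is written once against a shared interface.

```python
from abc import ABC, abstractmethod
from typing import List, Sequence


class ArmBackend(ABC):
    """Hypothetical hardware-agnostic arm interface (not BestMan's real API)."""

    @abstractmethod
    def get_joint_positions(self) -> List[float]:
        """Return the current joint positions."""

    @abstractmethod
    def move_to_joint_positions(self, target: Sequence[float]) -> None:
        """Command the arm to the given joint positions."""


class SimulatedArm(ArmBackend):
    """Stand-in for a simulator backend; it only stores the joint state."""

    def __init__(self, num_joints: int) -> None:
        self._q = [0.0] * num_joints

    def get_joint_positions(self) -> List[float]:
        return list(self._q)

    def move_to_joint_positions(self, target: Sequence[float]) -> None:
        if len(target) != len(self._q):
            raise ValueError("target length must match the number of joints")
        self._q = [float(x) for x in target]


class RealArm(ArmBackend):
    """Stand-in for a hardware driver; it records commands instead of sending them."""

    def __init__(self, num_joints: int) -> None:
        self._q = [0.0] * num_joints
        self.sent_commands: List[List[float]] = []

    def get_joint_positions(self) -> List[float]:
        return list(self._q)

    def move_to_joint_positions(self, target: Sequence[float]) -> None:
        if len(target) != len(self._q):
            raise ValueError("target length must match the number of joints")
        command = [float(x) for x in target]
        self.sent_commands.append(command)  # a real driver would transmit this
        self._q = command


def go_home(arm: ArmBackend) -> List[float]:
    """Skill-level code that depends only on the shared interface."""
    home = [0.0] * len(arm.get_joint_positions())
    arm.move_to_joint_positions(home)
    return arm.get_joint_positions()
```

Under this design, `go_home(SimulatedArm(7))` and `go_home(RealArm(7))` run the same skill code, so moving from simulation to a physical robot would only require swapping the backend object.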
In this work, we introduce Fast-UMI, an interface-mediated manipulation system comprising two key
components: a handheld device operated by humans for data collection and a robot-mounted device used
during policy inference.
This system offers an efficient and user-friendly tool for robotic learning data acquisition.
This paper presents AlignBot, a novel framework designed to optimize VLM-powered customized task
planning for household robots by effectively aligning with user reminders.
AlignBot employs a fine-tuned LLaVA-7B model that functions as an adapter for GPT-4o. This adapter
model internalizes diverse forms of user reminders (such as personalized preferences, corrective
guidance, and contextual assistance) into structured instruction-formatted cues that prompt GPT-4o to
generate customized task plans.
Mobile manipulators always need to determine feasible base positions prior to carrying out
navigation-manipulation tasks.
Real-world environments are often cluttered with furniture, obstacles, and dozens of other
objects, which makes computing base positions efficiently a challenge.
In this work, we introduce a framework named MoMa-Pos to address this issue.
LLM-GROP is a method that uses prompting to extract commonsense knowledge about object configurations
from a large language model and instantiates them with a task and motion planner, allowing for
successful and efficient multi-object rearrangement in various environments using a mobile
manipulator.
In this research, we propose ORLA*, which leverages delayed (lazy) evaluation in searching for a
high-quality object pick and place sequence that considers both end-effector and mobile robot base
travel.
The paper introduces a new algorithm (COWP)
that uses task-oriented common sense
extracted from Large Language Models to help
robots handle unforeseen situations and
complete complex tasks in an open world,
with better success rates than previous
algorithms.
The paper presents a new robot planning
algorithm, TMOC, which can handle complex
real-world scenarios without prior knowledge
of object properties by learning them
through a physics engine, outperforming
existing algorithms.
Autonomous vehicles need to balance
efficiency and safety when planning tasks
and motions. The Task-Motion Planning for
Urban Driving (TMPUD) algorithm enables
communication between the task and motion
planners to achieve optimal performance.
DAVT is a mobile edge computing
solution for vehicle trajectory data
compression. It reduces data at the
source, lowering communication and storage
costs, by applying three compressors to the
distance, acceleration and velocity, and
time data parts. In the evaluation, it
outperforms the baseline methods.
This paper proposes an online trajectory
compression framework that uses SD-Matching
for GPS alignment and HCC for compression.
Its effectiveness and efficiency are
demonstrated on real-world datasets from
Beijing and in a deployment in Chongqing.
This paper presents an online trajectory
compression framework that reduces the storage,
communication, and computation issues caused
by massive and redundant vehicle trajectory
data. The framework consists of two phases:
online trajectory mapping, which uses the
Spatial-Directional Matching algorithm, and
trajectory compression, which uses the
Heading Change Compression algorithm. Both
were evaluated with real-world datasets from
Beijing and deployed in Chongqing, showing
higher accuracy and efficiency than
state-of-the-art algorithms.
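To give a feel for compression driven by heading changes, here is a simplified sketch. It is not the Heading Change Compression algorithm from the paper, which runs after the trajectory mapping phase. As assumptions for illustration, the sketch works on raw planar (x, y) points rather than GPS coordinates, and the function names and the 15-degree threshold are hypothetical. It keeps the endpoints plus every point where the direction of travel turns by more than the threshold.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def heading_deg(a: Point, b: Point) -> float:
    """Direction of travel from a to b, in degrees within [0, 360)."""
    return math.degrees(math.atan2(b[1] - a[1], b[0] - a[0])) % 360.0


def heading_change_deg(h1: float, h2: float) -> float:
    """Smallest absolute angle between two headings, within [0, 180]."""
    diff = abs(h1 - h2) % 360.0
    return min(diff, 360.0 - diff)


def compress_by_heading(points: Sequence[Point],
                        threshold_deg: float = 15.0) -> List[Point]:
    """Keep the endpoints and the points where the heading turns sharply.

    The incoming heading is measured from the last kept point, so a
    gradual curve still accumulates enough change to keep a point.
    """
    if len(points) <= 2:
        return list(points)
    kept = [points[0]]
    for i in range(1, len(points) - 1):
        incoming = heading_deg(kept[-1], points[i])
        outgoing = heading_deg(points[i], points[i + 1])
        if heading_change_deg(incoming, outgoing) > threshold_deg:
            kept.append(points[i])
    kept.append(points[-1])
    return kept
```

For an L-shaped path such as `[(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]`, this sketch keeps only the start, the corner at `(2, 0)`, and the end, because the points along each straight leg carry no new heading information.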
This paper proposes a fuel consumption model
based on GPS trajectory and OBD-II data,
which can estimate the fuel usage of driving
paths and help drivers choose fuel-efficient
routes to reduce greenhouse gas and
pollutant emissions.
SD-Matching is a three-stage algorithm
that improves the accuracy and speed of
online map-matching by incorporating
vehicle heading direction data.
Greenhouse gas emissions from vehicles in
modern cities are a significant problem.
Recommending fuel-efficient routes to
drivers through a personalized fuel
consumption model can help alleviate this
issue, as demonstrated by the successful
implementation of GreenPlanner in Beijing.
It achieved a mean fuel consumption error
of less than 7% and average fuel savings
of 20% on suggested routes.