Partially observable deep reinforcement learning for multi-agent strategy optimization of human-robot collaborative disassembly: A case of retired electric vehicle battery
Robotics and Computer-Integrated Manufacturing (IF 10.4), Pub Date: 2024-04-20, DOI: 10.1016/j.rcim.2024.102775
Jiaxu Gao, Guoxian Wang, Jinhua Xiao, Pai Zheng, Eujin Pei

The rapid growth of the electric vehicle (EV) industry has driven a corresponding surge in the consumption of EV batteries, yet recycling and disassembling these batteries remains labor-intensive and inefficient. Human-robot collaboration (HRC) is a promising way to improve the efficiency and safety of EV battery disassembly. However, because of the uncertainty in retired EV battery disassembly and the inefficiency of existing disassembly sequences, the task is difficult to accomplish fully through HRC. The collaborative disassembly of EV batteries by humans and robots can be conceptualized as agents interacting with and learning from the environment, and modeled as a multi-agent Markov game. This paper addresses the challenge of HRC in EV battery disassembly by recognizing two attributes of the disassembly scenario: partial observability and non-smoothness. A partially observable multi-agent reinforcement learning environment is constructed that incorporates the structure of the EV battery and the disassembly task. Building on the QMIX architecture (a value-based multi-agent deep reinforcement learning algorithm), the framework is extended into the QMIX-HRC algorithm, designed specifically for the sequence problem in human-robot collaborative disassembly of EV batteries. The optimization yields a task sequence that offers the maximal global co-benefit found during the exploration iterations, reducing labor costs and improving collaborative efficiency. The viability of the QMIX-HRC disassembly strategy is verified through the final disassembly sequence of a simulated battery pack at a real human-robot collaborative disassembly station.
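
The abstract names QMIX as the base architecture but gives no implementation detail. As a rough illustration of the general QMIX idea only (not the authors' QMIX-HRC), the sketch below mixes per-agent Q-values into a joint value using state-conditioned, non-negative weights, so the joint value never decreases when any single agent's Q-value increases. The class name `MonotonicMixer`, the linear hypernetworks, the random untrained weights, and the 4-dimensional state are all assumptions made for this example.

```python
import numpy as np


def elu(x):
    """Exponential linear unit; np.minimum avoids overflow for large x."""
    return np.where(x > 0, x, np.exp(np.minimum(x, 0.0)) - 1.0)


class MonotonicMixer:
    """Toy QMIX-style mixer (hypothetical, untrained).

    Mixing weights are generated from the global state by linear
    "hypernetworks" and passed through abs(), so Q_tot is
    non-decreasing in every agent's Q-value.
    """

    def __init__(self, n_agents, state_dim, embed_dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.n_agents = n_agents
        self.embed_dim = embed_dim
        # Random weights stand in for trained hypernetwork parameters.
        self.hyper_w1 = rng.normal(size=(state_dim, n_agents * embed_dim))
        self.hyper_b1 = rng.normal(size=(state_dim, embed_dim))
        self.hyper_w2 = rng.normal(size=(state_dim, embed_dim))
        self.hyper_b2 = rng.normal(size=(state_dim,))

    def q_tot(self, agent_qs, state):
        # abs() keeps the mixing weights non-negative (monotonicity constraint).
        w1 = np.abs(state @ self.hyper_w1).reshape(self.n_agents, self.embed_dim)
        b1 = state @ self.hyper_b1
        hidden = elu(agent_qs @ w1 + b1)
        w2 = np.abs(state @ self.hyper_w2)
        b2 = state @ self.hyper_b2  # biases are unconstrained in sign
        return float(hidden @ w2 + b2)


# Two agents (for example, one human and one robot) and a made-up global state.
mixer = MonotonicMixer(n_agents=2, state_dim=4)
state = np.array([1.0, 0.0, 1.0, 0.5])
low = mixer.q_tot(np.array([0.2, 0.1]), state)
high = mixer.q_tot(np.array([0.9, 0.1]), state)  # only the first agent's Q rises
print(high >= low)
```

Because the weights are non-negative and ELU is increasing, each agent can pick its own greedy action from local observations while remaining consistent with the greedy joint action, which is what makes this kind of mixer a natural fit for partially observable settings.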

Updated: 2024-04-20