Haisor: Human-aware Indoor Scene Optimization via Deep Reinforcement Learning, ACM Transactions on Graphics


ACM Transactions on Graphics (IF 6.2), Pub Date: 2024-01-03, DOI: 10.1145/3632947
Jia-Mu Sun, Jie Yang, Kaichun Mo, Yu-Kun Lai, Leonidas Guibas, Lin Gao


3D scene synthesis benefits many real-world applications. Most scene generators focus on making indoor scenes plausible by learning from training data and leveraging extra constraints such as adjacency and symmetry. Although the generated 3D scenes are mostly plausible, with visually realistic layouts, they can be functionally unsuitable for human users who need to navigate the room and interact with furniture. Our key observation is that human activity plays a critical role and that sufficient free space is essential for human-scene interactions. This is exactly where many existing synthesized scenes fail: the seemingly correct layouts are often not fit for living. To tackle this, we present Haisor, a human-aware optimization framework for 3D indoor scene arrangement via reinforcement learning, which aims to find an action sequence that optimizes the indoor scene layout automatically. Based on a hierarchical scene graph representation, an optimal action sequence is predicted and performed via Deep Q-Learning with Monte Carlo Tree Search (MCTS), where MCTS is our key component for searching for the optimal solution over long-term sequences and a large action space. Multiple human-aware rewards are designed as our core criteria of human-scene interaction and guide the choice of the next action. Our framework is optimized end-to-end, taking as input indoor scenes with part-level furniture layouts that include part mobility information. Furthermore, our methodology is extensible and allows different reward designs to be used to achieve personalized indoor scene synthesis. Extensive experiments demonstrate that our approach optimizes the layout of 3D indoor scenes in a human-aware manner, producing results that are more realistic and plausible than the original output of state-of-the-art generators, and that it selects better actions than alternative baselines.
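To make the idea of "finding an action sequence that improves a layout" concrete, the following is a minimal sketch of plain tabular Q-learning on a toy problem. It is not the authors' method: the abstract describes Deep Q-Learning combined with MCTS over a hierarchical scene graph, none of which is reproduced here. The 1-D room, the single piece of furniture, the reward values, and all names (`N_CELLS`, `step`, `train`, `greedy_plan`) are assumptions made up for illustration only.

```python
import random

# Hypothetical toy setting (not from the paper): a 1-D room with cells 0..5.
# A door sits at cell 0, and one piece of furniture should end up at the far
# wall so that the space near the door stays free for a person to walk in.
N_CELLS = 6
ACTIONS = (-1, +1)      # move the furniture one cell left or right
GOAL = N_CELLS - 1      # far wall


def step(pos, action):
    """Apply one move, clamp to the room, and return (new_pos, reward)."""
    new_pos = min(max(pos + action, 0), N_CELLS - 1)
    # Toy stand-in for a "human-aware" reward: a bonus once the doorway
    # area is fully cleared, and a small cost for every move otherwise.
    reward = 1.0 if new_pos == GOAL else -0.1
    return new_pos, reward


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_CELLS)]   # q[state][action_index]
    for _ in range(episodes):
        pos = rng.randrange(N_CELLS - 1)       # any start except the goal
        for _ in range(20):                    # cap the episode length
            if rng.random() < epsilon:
                a = rng.randrange(2)           # explore
            else:
                a = max((0, 1), key=lambda i: q[pos][i])   # exploit
            new_pos, r = step(pos, ACTIONS[a])
            done = new_pos == GOAL
            # Standard Q-learning target: reward plus discounted best next value.
            target = r if done else r + gamma * max(q[new_pos])
            q[pos][a] += alpha * (target - q[pos][a])
            pos = new_pos
            if done:
                break
    return q


def greedy_plan(q, start, max_steps=20):
    """Read an action sequence off the learned table by acting greedily."""
    pos, plan = start, []
    while pos != GOAL and len(plan) < max_steps:
        a = max((0, 1), key=lambda i: q[pos][i])
        plan.append(ACTIONS[a])
        pos, _ = step(pos, ACTIONS[a])
    return plan


if __name__ == "__main__":
    q_table = train()
    print(greedy_plan(q_table, start=0))
```

In this sketch the learned table plays the role that a Q-network would play in a deep variant, and `greedy_plan` is a one-step greedy rollout. A tree search such as MCTS would instead look several actions ahead before committing to each move, which matters when the action space is large and rewards are delayed.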


Updated: 2024-01-04
