Abstract
We study the problem of learning online packing skills for irregular 3D shapes, which is arguably the most challenging setting of bin packing problems. The goal is to consecutively move a sequence of 3D objects with arbitrary shapes into a designated container with only partial observations of the object sequence. We take physical realizability into account, which involves both the physics dynamics and the constraints of a placement. The packing policy should understand the 3D geometry of the object to be packed and make effective decisions to accommodate it in the container in a physically realizable way. We propose a Reinforcement Learning (RL) pipeline to learn the policy. The complex irregular geometry and imperfect object placement together lead to a huge solution space, and direct training in such a space is prohibitively data intensive. We instead propose a theoretically provable method for candidate action generation that reduces the action space of RL and the learning burden. A parameterized policy is then learned to select the best placement from the candidates. Equipped with an efficient method of asynchronous RL acceleration and a data preparation process for simulation-ready training sequences, a mature packing policy can be trained in a physics-based environment within 48 hours. Through extensive evaluation on a variety of real-life shape datasets and comparisons with state-of-the-art baselines, we demonstrate that our method outperforms the best-performing baseline on all datasets by at least 12.8% in terms of packing utility. We also release our datasets and source code to support further research in this direction.
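The two-stage idea described above (first shrink the action space to a small set of candidate placements, then let a policy choose among them) can be illustrated with a toy sketch. This is not the paper's method: the heightmap container, the axis-aligned rectangular footprint, and the names `generate_candidates`, `select_placement`, `place`, and `score_fn` are all assumptions made for illustration, and `score_fn` is a hand-written stand-in for a learned policy.

```python
import numpy as np

def generate_candidates(heightmap, footprint, container_height, item_height,
                        max_candidates=8):
    """Hypothetical candidate generator (illustrative only).

    Enumerates grid positions where the footprint fits inside the container
    and keeps the few placements with the lowest resting height.
    """
    rows, cols = heightmap.shape
    h, w = footprint
    candidates = []
    for i in range(rows - h + 1):
        for j in range(cols - w + 1):
            # The item rests on the tallest cell beneath its footprint.
            rest = float(heightmap[i:i + h, j:j + w].max())
            if rest + item_height <= container_height:
                candidates.append((rest, i, j))
    candidates.sort()  # lowest resting height first, ties broken by position
    return candidates[:max_candidates]

def select_placement(candidates, score_fn):
    """Return the highest-scoring candidate, or None if nothing fits.

    score_fn is a placeholder for a learned policy's scoring of candidates.
    """
    if not candidates:
        return None
    return max(candidates, key=score_fn)

def place(heightmap, placement, footprint, item_height):
    """Return a new heightmap with the item added at the chosen placement."""
    rest, i, j = placement
    h, w = footprint
    updated = heightmap.copy()
    updated[i:i + h, j:j + w] = rest + item_height
    return updated

# Usage: pack one 2x2x1 item into an empty 4x4 container of height 2.
container = np.zeros((4, 4))
cands = generate_candidates(container, (2, 2), 2.0, 1.0)
best = select_placement(cands, lambda c: -c[0] - 0.1 * (c[1] + c[2]))
container = place(container, best, (2, 2), 1.0)
```

The point of the split is that the selector only ever ranks a handful of candidates instead of every possible pose, which is the kind of action-space reduction the abstract refers to; physical stability checks and irregular geometry are deliberately left out of this sketch.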
Index Terms
- Learning Physically Realizable Skills for Online Packing of General 3D Shapes