Abstract
In this paper, we propose PIE, a physics-inspired contrastive learning paradigm for low-light enhancement (LLE). PIE primarily addresses three issues: (i) Existing learning-based methods often train an LLE model with strictly pixel-aligned image pairs; we eliminate the need for pixel-correspondence paired training data and instead train with unpaired images. (ii) Existing methods either disregard negative samples or generate them inadequately; we incorporate physics-inspired contrastive learning into LLE and design the Bag of Curves (BoC) method to generate more reasonable negative samples that closely adhere to the underlying physical imaging principle. (iii) Existing methods rely on semantic ground truths; we propose an unsupervised regional segmentation module that ensures regional brightness consistency while eliminating this dependency. Overall, PIE effectively learns from unpaired positive/negative samples and smoothly realizes non-semantic regional enhancement, clearly distinguishing it from existing LLE efforts. Beyond the novel architecture of PIE, we explore its gains on downstream tasks such as semantic segmentation and face detection. Training on readily available open data and extensive experiments demonstrate that our method surpasses state-of-the-art LLE models on six independent cross-scene datasets. PIE runs fast with reasonable GFLOPs at test time, making it easy to use on mobile devices. Code available
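The two ideas at the heart of the abstract, curve-based negative-sample generation and contrastive learning against those negatives, can be illustrated with a minimal sketch. This is a hypothetical toy illustration, not the paper's implementation: the actual Bag of Curves family, its parameters, and PIE's loss are defined in the paper; here we stand in for them with simple gamma curves (a basic physical imaging model) and a pixel-space triplet-style margin loss.

```python
import numpy as np

def bag_of_curves_negatives(img, gammas=(0.4, 0.6, 1.8, 2.5)):
    """Generate over-/under-exposed negative samples by applying
    physically motivated gamma curves to a [0, 1]-normalized image.
    (Toy stand-in for the Bag of Curves; gamma < 1 brightens,
    gamma > 1 darkens.)"""
    img = np.clip(img, 0.0, 1.0)
    return [img ** g for g in gammas]

def contrastive_margin_loss(anchor, positive, negatives, margin=0.5):
    """Triplet-style loss in pixel space: pull the enhanced output
    (anchor) toward a well-lit positive and push it away from the
    curve-generated negatives."""
    d_pos = np.mean((anchor - positive) ** 2)
    loss = 0.0
    for neg in negatives:
        d_neg = np.mean((anchor - neg) ** 2)
        loss += max(0.0, d_pos - d_neg + margin)
    return loss / len(negatives)

# Usage: a flat mid-gray "image" yields brighter and darker negatives.
img = np.full((4, 4), 0.5)
negatives = bag_of_curves_negatives(img)
loss = contrastive_margin_loss(img, img, negatives)
```

In PIE itself the distances are computed on learned feature representations rather than raw pixels, and the negatives follow the imaging model rather than plain gamma curves; the sketch only shows why curve-generated exposures make natural "hard" negatives for unpaired training.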
Acknowledgements
This work was partly supported by NSFC (Grant Nos. 62272229, 62076124, 62222605), the National Key R&D Program of China (2020AAA0107000), the Natural Science Foundation of Jiangsu Province (Grant Nos. BK20222012, BK20211517), and Shenzhen Science and Technology Program JCYJ20230807142001004. The authors would like to thank all the anonymous reviewers for their constructive comments.
Additional information
Communicated by Chongyi Li.
About this article
Cite this article
Liang, D., Xu, Z., Li, L. et al. PIE: Physics-Inspired Low-Light Enhancement. Int J Comput Vis (2024). https://doi.org/10.1007/s11263-024-01995-y
Received:
Accepted:
Published: