Three-dimensional object detection based on an L-shape model in autonomous motion systems

M. O. Chekanov; S. V. Kudryashova; O. S. Shipitko

doi:10.31857/S0235009225010078

Authors: Chekanov M.O.¹,², Kudryashova S.V.¹, Shipitko O.S.¹
Affiliations:
1. Kharkevich Institute for Information Transmission Problems of the Russian Academy of Sciences
2. EvoCargo LLC
Issue: Vol. 39, No. 1 (2025)
Pages: 66-78
Section: Technical vision
URL: https://rjonco.com/0235-0092/article/view/682134
DOI: https://doi.org/10.31857/S0235009225010078
EDN: https://elibrary.ru/UUMPXE
ID: 682134

Abstract

The ability of automated vehicles (AVs) to determine the position of objects in three-dimensional space plays a key role in motion planning. Algorithms that solve this problem are particularly difficult to implement for systems that rely only on monocular cameras, because depth estimation from a single image is a non-trivial task. Nevertheless, such systems are widespread owing to their relatively low cost and ease of use. In this paper, we propose a method that determines the position of vehicles (the most common type of object in urban scenes) as oriented bounding boxes in the bird's-eye view, based on an image obtained from a single monocular camera. The method consists of two steps. In the first step, the projection of the visible boundary of the vehicle onto the bird's-eye view is computed from 2D obstacle detections and roadway segmentation in the image. The resulting projection is assumed to represent noisy measurements of two orthogonal sides of the vehicle. In the second step, an oriented bounding box is constructed around the obtained projection. For this step, we propose a new bounding-box construction algorithm based on the L-shape model assumption. The algorithm was tested on a prepared real-world dataset. The proposed L-shape algorithm outperformed the best of the compared algorithms by 2.7% in terms of the Jaccard coefficient (Intersection over Union, IoU).
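To make the second step more concrete, the sketch below fits an oriented rectangle to bird's-eye-view points that are assumed to lie along two orthogonal sides of a vehicle. It is not the algorithm proposed in the paper, whose details are not given in this abstract. It is a generic search-based rectangle fit with a closeness criterion, a commonly used baseline for L-shape fitting. The function names `fit_l_shape_box` and `_nearer_edge_distances`, the 1° angle step, and the distance floor `d_min` are assumptions made for illustration only.

```python
import math


def _nearer_edge_distances(coords, lo, hi):
    """Distances to whichever of the two parallel edges is closer overall."""
    to_lo = [v - lo for v in coords]
    to_hi = [hi - v for v in coords]
    if sum(d * d for d in to_lo) <= sum(d * d for d in to_hi):
        return to_lo
    return to_hi


def fit_l_shape_box(points, angle_step_deg=1.0, d_min=0.01):
    """Fit an oriented rectangle to 2D points (hypothetical helper).

    Tries headings in [0, 90) degrees and keeps the one for which the
    points lie closest to two perpendicular edges of their bounding box.
    Returns (corners, theta): four (x, y) tuples and the heading in radians.
    """
    if not points:
        raise ValueError("at least one point is required")
    best_score, best_theta, best_bounds = -math.inf, 0.0, None
    for k in range(int(round(90.0 / angle_step_deg))):
        theta = math.radians(k * angle_step_deg)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        # Coordinates of the points in the candidate box frame.
        c1 = [x * cos_t + y * sin_t for x, y in points]
        c2 = [-x * sin_t + y * cos_t for x, y in points]
        lo1, hi1, lo2, hi2 = min(c1), max(c1), min(c2), max(c2)
        d1 = _nearer_edge_distances(c1, lo1, hi1)
        d2 = _nearer_edge_distances(c2, lo2, hi2)
        # Closeness criterion: points near an edge contribute a large term;
        # d_min caps the contribution of points lying exactly on an edge.
        score = sum(1.0 / max(min(a, b), d_min) for a, b in zip(d1, d2))
        if score > best_score:
            best_score, best_theta = score, theta
            best_bounds = (lo1, hi1, lo2, hi2)
    lo1, hi1, lo2, hi2 = best_bounds
    cos_t, sin_t = math.cos(best_theta), math.sin(best_theta)
    # Map the box corners from the box frame back to the input frame.
    corners = [
        (a * cos_t - b * sin_t, a * sin_t + b * cos_t)
        for a, b in ((lo1, lo2), (hi1, lo2), (hi1, hi2), (lo1, hi2))
    ]
    return corners, best_theta
```

Calling `fit_l_shape_box(points)` with a list of `(x, y)` tuples returns the four box corners and a heading in the range [0°, 90°). The criterion rewards headings at which most points hug two perpendicular edges, which matches the assumption that the projection samples two orthogonal sides of the vehicle. Measurement noise is handled only through the `d_min` floor in this sketch.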

Keywords

3D object detection, L-shape, monocular object detection, autonomous driving

About the authors

M. Chekanov

Kharkevich Institute for Information Transmission Problems of the Russian Academy of Sciences; EvoCargo LLC

Corresponding author.
Email: mikhail.chekanov@evocargo.com
Russian Federation, Bldg. 1, 19, Bolshoy Karetny All., Moscow, 127051; Bldg. 4, 9, Godovikov St., Moscow, 129085

S. Kudryashova

Kharkevich Institute for Information Transmission Problems of the Russian Academy of Sciences

Email: mikhail.chekanov@evocargo.com
Russian Federation, Bldg. 1, 19, Bolshoy Karetny All., Moscow, 127051

O. Shipitko

Kharkevich Institute for Information Transmission Problems of the Russian Academy of Sciences

Email: mikhail.chekanov@evocargo.com
Russian Federation, Bldg. 1, 19, Bolshoy Karetny All., Moscow, 127051

Supplementary files

Fig. 1. Classification of 3D object detection methods by the number of sensors used.
Fig. 2. Classification of 3D object detection methods by the type of model used.
Fig. 3. A naive approach to projecting a 2D bounding box into the top view.
Fig. 4. Incorrect projection of the truck. The area to the left of the truck is perceived as part of the object, which prevents the vehicle from driving around it.
Fig. 5. Projection of the 2D bounding box into the top view, taking carriageway segmentation into account.
Fig. 6. Example scenes from the collected dataset.
Fig. 7. Example of an incorrectly constructed bounding box.
Fig. 8. Examples of the output of the compared bounding-box construction algorithms.
Fig. 9. Average Jaccard coefficient as a function of the distance limit to the detected obstacles.
