Trajectory Prediction Algorithm: HiVT

Author: 职业培训 (Vocational Training) · Date: 2025-01-11 14:15:53

Hierarchical Vector Transformer (HiVT) for Multi-Agent Motion Prediction

HiVT is a multi-agent motion prediction method built on a hierarchical vector transformer. Vectorized approaches are good at capturing complex interactions in traffic scenes. However, existing motion prediction methods often overlook the symmetries of the problem and carry a high computational cost, which makes it hard to predict for many agents in real time without sacrificing accuracy.

HiVT decomposes multi-agent motion prediction into two stages: local context extraction and global interaction modeling. This hierarchy aggregates information at multiple scales, so the model can handle the large number of agents in a scene both effectively and efficiently. HiVT also uses a translation-invariant scene representation and rotation-invariant spatial learning modules, which help it extract features that are robust to shifts and rotations of the input.
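The invariance idea can be illustrated with a minimal sketch. It is not HiVT's code: `to_local_frame` and `rigid_transform` are hypothetical helpers, and the sketch assumes each agent has a known 2D position and heading angle. Points are expressed relative to an agent and rotated into that agent's heading, so shifting or rotating the whole scene leaves the local coordinates unchanged.

```python
import math

def to_local_frame(points, center, heading):
    """Express 2D points relative to `center`, rotated so `heading` lies along +x."""
    cos_h, sin_h = math.cos(heading), math.sin(heading)
    local = []
    for x, y in points:
        dx, dy = x - center[0], y - center[1]  # removes the global translation
        # Rotate the offset by -heading to remove the global rotation.
        local.append((cos_h * dx + sin_h * dy, -sin_h * dx + cos_h * dy))
    return local

def rigid_transform(points, angle, shift):
    """Rotate points by `angle` about the origin, then translate by `shift`."""
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [(cos_a * x - sin_a * y + shift[0], sin_a * x + cos_a * y + shift[1])
            for x, y in points]

# Hypothetical scene: one central agent and three neighbouring points.
center, heading = (10.0, 5.0), 0.3
neighbours = [(12.0, 6.0), (8.5, 4.0), (10.0, 9.0)]
local_before = to_local_frame(neighbours, center, heading)

# Move and rotate the whole scene; the agent's pose changes with it.
angle, shift = 1.1, (-40.0, 25.0)
moved_neighbours = rigid_transform(neighbours, angle, shift)
moved_center = rigid_transform([center], angle, shift)[0]
local_after = to_local_frame(moved_neighbours, moved_center, heading + angle)

# Largest coordinate difference between the two local views (floating-point noise only).
max_diff = max(abs(a - b)
               for p, q in zip(local_before, local_after)
               for a, b in zip(p, q))
```

Because the local view does not depend on where the scene sits in the global frame, a model that consumes only such local coordinates sees the same input for every shifted or rotated copy of a scene.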

HiVT models the relationships between entities in stages and represents all vectorized entities by relative positions. It comprises four components: Scene Representation, Local Encoder, Global Interaction, and Multimodal Future Decoder. Scene Representation extracts vectorized entities from the scene, including agent trajectories and map lane segments, and describes them with coordinates relative to a central agent. The Local Encoder models agent-agent interactions, temporal dependencies, and agent-lane relationships. Global Interaction passes information between the local regions. The Multimodal Future Decoder captures the multimodal nature of vehicle motion and predicts future trajectories for all agents.
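As a rough illustration of the Scene Representation step, the sketch below turns a trajectory and a lane centerline into relative vectors. The helper names `displacement_vectors` and `relative_position` are hypothetical, and using consecutive-point offsets is an assumption about one reasonable way to vectorize, not a statement of HiVT's exact feature layout.

```python
def displacement_vectors(points):
    """Turn an ordered polyline (trajectory or lane centerline) into consecutive offsets."""
    return [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(points, points[1:])]

def relative_position(point, reference):
    """Position of `point` measured from `reference` instead of the global origin."""
    return (point[0] - reference[0], point[1] - reference[1])

# Hypothetical inputs: three observed agent positions and a three-point lane centerline.
trajectory = [(2.0, 1.0), (3.0, 1.5), (4.5, 2.0)]
lane = [(0.0, 0.0), (5.0, 0.0), (10.0, 1.0)]

agent_vectors = displacement_vectors(trajectory)  # per-step motion of the agent
lane_vectors = displacement_vectors(lane)         # one vector per lane segment
# Start of each lane segment, measured from the agent's latest position.
lane_offsets = [relative_position(start, trajectory[-1]) for start in lane[:-1]]
```

None of the three outputs contains an absolute coordinate, which is what lets later stages work from relative geometry alone.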

During training, the model computes the error of every predicted trajectory for every agent, which yields an error matrix of agents by trajectories. For each agent, only the trajectory with the smallest error is used to compute the regression loss. That loss is the negative log-likelihood under a Laplace probability density function, which is smallest when the predicted position matches the ground truth. In addition, a cross-entropy loss is used for classification, where the selected trajectory has a target value of 1 and all other trajectories have a target value of 0.
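The training procedure above can be sketched for a single agent as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the per-trajectory error is assumed to be the displacement summed over time steps, and the regression term is averaged over time steps.

```python
import math

def laplace_nll(pred, scale, target):
    """Negative log-likelihood of `target` under independent Laplace(pred, scale) coordinates."""
    return sum(math.log(2.0 * b) + abs(t - p) / b
               for p, b, t in zip(pred, scale, target))

def winner_take_all_loss(pred_modes, scales, logits, target):
    """Return (best mode index, regression loss, classification loss) for one agent.

    pred_modes: K candidate trajectories, each a list of (x, y) points.
    scales:     Laplace scale parameters, same shape as pred_modes.
    logits:     K unnormalized mode scores.
    target:     ground-truth trajectory as a list of (x, y) points.
    """
    # One row of the error matrix: displacement error of each mode (assumed metric).
    errors = [sum(math.dist(p, t) for p, t in zip(mode, target))
              for mode in pred_modes]
    best = min(range(len(errors)), key=errors.__getitem__)

    # Regression: Laplace NLL on the best mode only, averaged over time steps.
    reg = sum(laplace_nll(p, b, t)
              for p, b, t in zip(pred_modes[best], scales[best], target)) / len(target)

    # Classification: cross-entropy against a one-hot target at `best`.
    top = max(logits)
    log_norm = top + math.log(sum(math.exp(l - top) for l in logits))
    cls = log_norm - logits[best]
    return best, reg, cls
```

Calling this once per agent and averaging the results would give a scene-level loss. Selecting only the best trajectory for regression keeps the other trajectories free to cover different plausible futures.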

According to the reported experiments, which include ablation studies, HiVT compares favorably with other models in parameter efficiency, inference speed, and prediction accuracy. This makes it an efficient option for multi-agent motion prediction in complex traffic scenarios.

Article URL: http://www.goggeous.com/20241227/1/933072

Source: 天狐定制
