Talk2BEV: Language-Enhanced Bird's Eye View Maps

Project Page | arXiv | Video

Vikrant Dewangan* ¹, Tushar Choudhary* ¹, Shivam Chandhok* ², Shubham Priyadarshan ¹, Anushka Jain ¹, Arun K. Singh ³, Siddharth Srivastava ⁴, Krishna Murthy Jatavallabhula † ⁵, K. Madhava Krishna † ¹

¹ International Institute of Information Technology Hyderabad, ² University of British Columbia, ³ University of Tartu, ⁴ TensorTour Inc, ⁵ MIT-CSAIL

* denotes equal contribution, † denotes equal advising

ICRA 2024


Abstract

We introduce Talk2BEV, a large vision-language model (LVLM) interface for bird's-eye view (BEV) maps commonly used in autonomous driving.

While existing perception systems for autonomous driving scenarios have largely focused on a pre-defined (closed) set of object categories and driving scenarios, Talk2BEV eliminates the need for BEV-specific training, relying instead on performant pre-trained LVLMs. This enables a single system to cater to a variety of autonomous driving tasks encompassing visual and spatial reasoning, predicting the intents of traffic actors, and decision-making based on visual cues.

We extensively evaluate Talk2BEV on a large number of scene understanding tasks that rely on both the ability to interpret free-form natural language queries and the ability to ground these queries in the visual context embedded into the language-enhanced BEV map. To enable further research in LVLMs for autonomous driving scenarios, we develop and release Talk2BEV-Bench, a benchmark encompassing 1000 human-annotated BEV scenarios, with more than 20,000 questions and ground-truth responses from the NuScenes dataset.

Data Preparation

Please download the NuScenes v1.0-trainval dataset. The Talk2BEV dataset has 2 parts, Talk2BEV-Base and Talk2BEV-Captions, which contain the base data (crops, perspective images, BEV area centroids) and the crop captions, respectively.

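Download Links

We provide 2 links to the Talk2BEV dataset below: Talk2BEV-Mini (captions only) and Talk2BEV-Full. The dataset is hosted on Google Drive. Please download the dataset and extract the files to the data folder.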

Name	Base	Captions	Bench	Link
Talk2BEV-Mini	✓	✗	✗	link
Talk2BEV-Full	✗	✗	✗	TODO

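If you want to generate the dataset from scratch, please follow the process described here. The format of each data part is described in format.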
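
The extraction step mentioned under Download Links can be sketched in Python as follows. This is a minimal sketch, not part of the Talk2BEV repository: the helper name `extract_dataset` and the archive file name in the usage comment are hypothetical, and the archive format of the Google Drive download is assumed to be one that `shutil.unpack_archive` recognizes (zip or tar).

```python
import shutil
from pathlib import Path


def extract_dataset(archive_path, data_dir="data"):
    """Unpack a downloaded archive into data_dir and list its top-level entries.

    Hypothetical helper: the real archive name and layout are not specified here.
    """
    target = Path(data_dir)
    target.mkdir(parents=True, exist_ok=True)
    # unpack_archive infers the format (zip, tar, tar.gz, ...) from the extension.
    shutil.unpack_archive(str(archive_path), str(target))
    return sorted(p.name for p in target.iterdir())


# Usage (file name is an assumption):
# extract_dataset("talk2bev-mini.zip", "data")
```

Listing the top-level entries after unpacking gives a quick check that the files ended up in the data folder rather than in a nested directory.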

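Evaluation

Talk2BEV is evaluated with 2 methods: MCQs (from Talk2BEV-Bench) and Spatial Operators. We use GPT-4 for our evaluation. Please follow the instructions in GPT-4 and initialize the API key and organization in your OS environment.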

ORGANIZATION=<your-organization>
API_KEY=<your-api-key>
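
Before running the evaluation scripts, it can help to confirm that both variables are actually set. The following is a minimal sketch of such a check. The variable names `ORGANIZATION` and `API_KEY` come from the instructions above; the helper `load_openai_credentials` is hypothetical, and how the evaluation scripts themselves read these values is not shown here.

```python
import os


def load_openai_credentials(env=None):
    """Return (organization, api_key), raising if either variable is missing.

    Hypothetical helper; `env` defaults to the process environment.
    """
    env = os.environ if env is None else env
    missing = [name for name in ("ORGANIZATION", "API_KEY") if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variable(s): " + ", ".join(missing))
    return env["ORGANIZATION"], env["API_KEY"]
```

Failing early with the name of the missing variable is easier to diagnose than an authentication error raised in the middle of an evaluation run.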

Evaluating - MCQs

To obtain the accuracy on the MCQs, run the following commands:

cd evaluation
python eval_mcq.py

This will print the accuracy on the MCQs.
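
For intuition, MCQ accuracy is the fraction of questions where the chosen option matches the ground-truth option. The sketch below illustrates that computation under assumptions: answers are option labels such as "A" to "D", and the function name `mcq_accuracy` is hypothetical. It is not the code in `eval_mcq.py`.

```python
def mcq_accuracy(predictions, answers):
    """Fraction of questions whose predicted option equals the ground truth.

    Assumes each entry is an option label such as "A"; comparison ignores
    case and surrounding whitespace.
    """
    if len(predictions) != len(answers):
        raise ValueError("predictions and answers must have the same length")
    if not answers:
        return 0.0
    correct = sum(
        p.strip().upper() == a.strip().upper()
        for p, a in zip(predictions, answers)
    )
    return correct / len(answers)


# Three of four options match, so the accuracy is 0.75.
print(mcq_accuracy(["A", "b", "C", "D"], ["A", "B", "D", "D"]))
```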

Evaluating Spatial Operators

To obtain the distance error for the spatial operators, run the following commands:

cd evaluation
python eval_spops.py
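
The exact metric computed by `eval_spops.py` is not described here. As one plausible reading of "distance error", the sketch below takes the mean Euclidean distance between predicted and ground-truth 2D points in the BEV plane. The function name `mean_distance_error`, the (x, y) point format, and the choice of the mean are all assumptions.

```python
import math


def mean_distance_error(predicted, ground_truth):
    """Mean Euclidean distance between paired (x, y) points.

    Assumed metric for illustration; the real script may differ.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("predicted and ground_truth must have the same length")
    if not predicted:
        return 0.0
    total = sum(math.dist(p, g) for p, g in zip(predicted, ground_truth))
    return total / len(predicted)


# One point is off by 5 units (a 3-4-5 triangle) and one is exact: mean is 2.5.
print(mean_distance_error([(0.0, 0.0), (3.0, 4.0)], [(3.0, 4.0), (3.0, 4.0)]))
```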

Click2Chat

We also allow free-form conversation with the BEV. Please follow the instructions in Click2Chat to chat with the BEV.

Talk2BEV-Bench

TO BE RELEASED

TODO

Spatial operators evaluation pipeline
Add links to BEV crops - Release Talk2BEV-Full
Release Talk2BEV-Bench


Additional Information

Version: 1.0.0
Type: Other source code
Updated: 2025-02-26
Size: 77.03MB
Source: GitHub
