[PD Disaggregation] Distinguish the pipelines for sending kv signal in different prefill #5514
Thanks for your contribution!
Pull request overview
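This PR lets multiple prefill (P) services in a PD-disaggregated deployment use distinct pipelines for sending KV signals. The previous implementation used `1024 + rank` as the message queue ID; it now reads the ID from the environment variable `INFERENCE_MSG_QUEUE_ID`, which defaults to 1.
Main changes:
- Removed the fixed, rank-based message queue ID calculation (`1024 + rank`)
- Introduced the environment variable `INFERENCE_MSG_QUEUE_ID` to configure the message queue ID dynamically
- Unified the message queue ID lookup logic in the XPU and GPU implementations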
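The environment-variable lookup described above could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper name `GetMsgQueueIdFromEnv` and the choice to fall back to the default when the value is not a valid integer are assumptions. Only the variable name `INFERENCE_MSG_QUEUE_ID` and the default of 1 come from the overview.

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper (not part of the PR): read the message queue ID from
// INFERENCE_MSG_QUEUE_ID, returning `fallback` when the variable is unset,
// empty, not a base-10 integer, or out of int range.
inline int GetMsgQueueIdFromEnv(int fallback = 1) {
  const char* raw = std::getenv("INFERENCE_MSG_QUEUE_ID");
  if (raw == nullptr || *raw == '\0') {
    return fallback;  // variable unset or empty
  }
  char* end = nullptr;
  errno = 0;
  const long value = std::strtol(raw, &end, 10);
  if (end == raw || *end != '\0') {
    return fallback;  // no digits parsed, or trailing garbage
  }
  if (errno == ERANGE || value < INT_MIN || value > INT_MAX) {
    return fallback;  // does not fit in an int
  }
  return static_cast<int>(value);
}
```

With a helper like this, each P service would be started with a different `INFERENCE_MSG_QUEUE_ID` value so that it resolves to its own queue.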
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 9 comments.
| File | Description |
|---|---|
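| custom_ops/xpu_ops/src/ops/remote_cache_kv_ipc.h | Changes the message queue ID from the computed `1024 + rank` to a value read from the environment variable `INFERENCE_MSG_QUEUE_ID`, defaulting to 1 |
| custom_ops/xpu_ops/src/ops/get_output.cc | Changes how the message queue ID is obtained in the same way, and fixes code indentation |
| custom_ops/gpu_ops/remote_cache_kv_ipc.h | Updates how the GPU version obtains the message queue ID, consistent with the XPU version |
| custom_ops/gpu_ops/get_output_ep.cc | Updates how the `GetOutputKVSignal` function obtains the message queue ID, and unifies code formatting |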
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## develop #5514 +/- ##
==========================================
Coverage ? 60.42%
==========================================
Files ? 329
Lines ? 41088
Branches ? 6262
==========================================
Hits ? 24828
Misses ? 14372
Partials ? 1888
Flags with carried forward coverage won't be shown.
custom_ops/gpu_ops/get_output_ep.cc
Outdated
    static struct msgdatakv msg_rcv;
    static key_t key = ftok("/opt/", msg_queue_id);
    static int msgid = msgget(key, IPC_CREAT | 0666);
    int msg_queue_id = 1;
Only this line is changed.
    static struct msgdatakv msg_rcv;
    static key_t key = ftok("/opt/", msg_queue_id);
    static int msgid = msgget(key, IPC_CREAT | 0666);
    int msg_queue_id = 1024;
Only this line is changed; everything else is formatting.
Motivation
Multiple P (prefill) services need distinct pipelines for sending KV signals.
Modifications
The pipeline is identified by `INFERENCE_MSG_QUEUE_ID`.
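To illustrate why a different ID yields a different pipeline, here is a minimal sketch of deriving a System V message queue from a path and an ID, mirroring the `ftok` and `msgget` calls visible in the review snippets. The helper names `MakeKvSignalKey` and `OpenKvSignalQueue` are hypothetical, and the path is a parameter here rather than the hard-coded `"/opt/"` used in the reviewed code.

```cpp
#include <sys/ipc.h>
#include <sys/msg.h>
#include <sys/types.h>

// Hypothetical helper (not part of the PR): derive a System V IPC key from an
// existing path and a queue ID. Returns (key_t)-1 if ftok fails, for example
// when the path does not exist.
inline key_t MakeKvSignalKey(const char* path, int msg_queue_id) {
  return ftok(path, msg_queue_id);
}

// Hypothetical helper (not part of the PR): open the queue for the given ID,
// creating it if needed. Returns the queue identifier, or -1 on failure.
inline int OpenKvSignalQueue(const char* path, int msg_queue_id) {
  const key_t key = MakeKvSignalKey(path, msg_queue_id);
  if (key == static_cast<key_t>(-1)) {
    return -1;
  }
  return msgget(key, IPC_CREAT | 0666);
}
```

POSIX specifies that `ftok` uses only the low-order 8 bits of its ID argument, so the values assigned to different P services should differ in those bits and should not be zero there.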
Usage or Command
Unchanged.
Accuracy Tests
Covered by unit tests.
Checklist
- Tag list: [`[FDConfig]`,`[APIServer]`,`[Engine]`,`[Scheduler]`,`[PD Disaggregation]`,`[Executor]`,`[Graph Optimization]`,`[Speculative Decoding]`,`[RL]`,`[Models]`,`[Quantization]`,`[Loader]`,`[OP]`,`[KVCache]`,`[DataProcessor]`,`[BugFix]`,`[Docs]`,`[CI]`,`[Optimization]`,`[Feature]`,`[Benchmark]`,`[Others]`,`[XPU]`,`[HPU]`,`[GCU]`,`[DCU]`,`[Iluvatar]`,`[Metax]`]
- `pre-commit` before commit.
- `release` branch, make sure the PR has been submitted to the `develop` branch, then cherry-pick it to the `release` branch with the `[Cherry-Pick]` PR tag.