Why is this problem so hard?
PyTorch (along with TensorFlow and JAX) does not publish CUDA builds on PyPI; it serves them from its own dedicated indexes (e.g. https://download.pytorch.org/whl/cu121). The same version number, torch==2.3.0, is a different file on different indexes:
- PyPI (default) → CPU build (Linux/macOS/Windows)
- PyTorch cu121 index → CUDA 12.1 build
- PyTorch cu118 index → CUDA 11.8 build
- PyTorch rocm index → AMD ROCm build
Your team has all of these at once: ① developers on Macs (must be CPU), ② a Linux training server with H100s (needs CUDA 12.1), ③ CI (runs unit tests on CPU). The old answer was three requirements.txt files or magic scripts in a Dockerfile. uv handles it in a single pyproject with named indexes + forking + markers.
Configuring named indexes
[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true  # only consulted when a package explicitly references this index

[[tool.uv.index]]
name = "pytorch-cu121"
url = "https://download.pytorch.org/whl/cu121"
explicit = true
explicit = true means "this index does not participate in default resolution": uv will not look here for unrelated packages like pandas.
Selecting a PyTorch variant per platform
[project]
dependencies = [
    "torch>=2.3",
    "torchvision",
]

[tool.uv.sources]
torch = [
    { index = "pytorch-cpu", marker = "sys_platform == 'darwin' or platform_machine != 'x86_64'" },
    { index = "pytorch-cu121", marker = "sys_platform == 'linux' and platform_machine == 'x86_64'" },
]
torchvision = [
    { index = "pytorch-cpu", marker = "sys_platform == 'darwin'" },
    { index = "pytorch-cu121", marker = "sys_platform == 'linux'" },
]
At resolution time uv "forks" the solve per platform and records each fork's solution in the same uv.lock.
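The marker-based dispatch can be sketched as a plain function. This is illustrative only: `pick_torch_index` and the `"pypi"` fallback name are hypothetical, not part of uv; the conditions simply mirror the PEP 508 markers in the `[tool.uv.sources]` table above.

```python
def pick_torch_index(sys_platform: str, platform_machine: str) -> str:
    """Mirror the markers in [tool.uv.sources]: which index serves torch?"""
    # marker: "sys_platform == 'linux' and platform_machine == 'x86_64'"
    if sys_platform == "linux" and platform_machine == "x86_64":
        return "pytorch-cu121"  # CUDA wheel index
    # marker: "sys_platform == 'darwin' or platform_machine != 'x86_64'"
    if sys_platform == "darwin" or platform_machine != "x86_64":
        return "pytorch-cpu"    # CPU wheel index
    return "pypi"               # anything else falls back to the default index

print(pick_torch_index("darwin", "arm64"))   # pytorch-cpu
print(pick_torch_index("linux", "x86_64"))   # pytorch-cu121
print(pick_torch_index("linux", "aarch64"))  # pytorch-cpu
```

Note that a Linux ARM box also lands on the CPU index here, because the first marker excludes non-x86_64 machines.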
Switching CUDA versions
To move from CUDA 11.8 to 12.1, change nothing but the index URL:
[[tool.uv.index]]
name = "pytorch-cu121"
url = "https://download.pytorch.org/whl/cu121"
explicit = true
# to switch CUDA versions, edit only the url above, then run `uv lock` to re-resolve
pytorch-cu121 requires the host's NVIDIA driver to support CUDA 12.1+ (check with nvidia-smi). With a driver that is too old, the cu121 torch installs without complaint, but torch.cuda.is_available() returns False. That is not a uv bug; it is a hardware/driver problem.
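A quick preflight check can catch this before a broken install. The sketch below parses the "CUDA Version" field that nvidia-smi prints in its header; the exact header format is an assumption about nvidia-smi's output, and `driver_cuda_version`/`cu121_supported` are hypothetical helper names.

```python
import re

REQUIRED = (12, 1)  # cu121 wheels need driver support for CUDA >= 12.1

def driver_cuda_version(smi_output: str):
    """Extract the 'CUDA Version' the driver reports, as a (major, minor) tuple."""
    m = re.search(r"CUDA Version:\s*(\d+)\.(\d+)", smi_output)
    return (int(m.group(1)), int(m.group(2))) if m else None

def cu121_supported(smi_output: str) -> bool:
    v = driver_cuda_version(smi_output)
    return v is not None and v >= REQUIRED

# Example against captured text; on a real machine, pass `nvidia-smi` output:
sample = "| NVIDIA-SMI 535.104.05  Driver Version: 535.104.05  CUDA Version: 12.2 |"
print(cu121_supported(sample))  # True
```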
Real-world case: supporting Mac + Linux GPU + CI
[project]
name = "my-ml"
requires-python = ">=3.12"
dependencies = [
    "torch>=2.3",
    "transformers>=4.40",
    "datasets",
    "accelerate",
]

[dependency-groups]
dev = ["pytest", "ruff"]
train = ["wandb", "deepspeed; sys_platform == 'linux'"]

[[tool.uv.index]]
name = "pytorch-cpu"
url = "https://download.pytorch.org/whl/cpu"
explicit = true

[[tool.uv.index]]
name = "pytorch-cu121"
url = "https://download.pytorch.org/whl/cu121"
explicit = true

[tool.uv.sources]
torch = [
    { index = "pytorch-cpu", marker = "sys_platform == 'darwin'" },
    { index = "pytorch-cu121", marker = "sys_platform == 'linux'" },
]

[tool.uv]
conflicts = [
    [{ group = "train" }, { group = "dev" }],  # example only: the two groups are mutually exclusive
]
Private PyPI / Artifactory / Nexus
For a company-internal index:
[[tool.uv.index]]
name = "internal"
url = "https://pypi.internal.mycorp.com/simple"
default = true  # default index (replaces PyPI)
# authenticate via environment variables:
# UV_INDEX_INTERNAL_USERNAME=...
# UV_INDEX_INTERNAL_PASSWORD=...

[[tool.uv.index]]
name = "pypi"
url = "https://pypi.org/simple"  # public PyPI as a fallback

A full URL with embedded credentials (not recommended; it ends up in the lock file):
url = "https://<user>:<token>@pypi.internal.mycorp.com/simple"
Recommended practice: keep credentials out of the URL and inject them via environment variables. uv looks them up by the pattern UV_INDEX_<NAME_UPPERCASE>_USERNAME/PASSWORD.
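The mapping from index name to environment variable is mechanical; the sketch below shows it. The normalization (uppercase, hyphens to underscores) is my assumption about how uv derives the names, and `credential_env_vars` is a hypothetical helper, not a uv API.

```python
def credential_env_vars(index_name: str):
    """Derive the UV_INDEX_* credential variable names for a named index."""
    # assumed normalization: uppercase, '-' -> '_'
    key = index_name.upper().replace("-", "_")
    return (f"UV_INDEX_{key}_USERNAME", f"UV_INDEX_{key}_PASSWORD")

print(credential_env_vars("internal"))
# ('UV_INDEX_INTERNAL_USERNAME', 'UV_INDEX_INTERNAL_PASSWORD')
```

In CI, you would export these variables from the secret store before running `uv sync`.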
Mirrors in China
[[tool.uv.index]]
name = "tuna"
url = "https://pypi.tuna.tsinghua.edu.cn/simple"
default = true
Or set this globally in ~/.config/uv/uv.toml (a personal-machine preference that stays out of the repository).
keyring integration (recommended for enterprises)
Let uv fetch private-registry credentials from the system keyring instead of environment variables:
[tool.uv]
keyring-provider = "subprocess"  # shell out to the keyring CLI

# store the credential once:
keyring set https://pypi.internal.mycorp.com/simple myuser
# afterwards `uv sync` fetches the password from the keyring automatically
Platform constraints for TensorFlow / JAX
[project]
dependencies = [
    # TensorFlow: Apple Silicon Macs install tensorflow-macos; every other platform installs tensorflow
    "tensorflow; sys_platform != 'darwin' or platform_machine != 'arm64'",
    "tensorflow-macos; sys_platform == 'darwin' and platform_machine == 'arm64'",
    "tensorflow-metal; sys_platform == 'darwin' and platform_machine == 'arm64'",
    # JAX: the CPU build works everywhere; GPU builds are installed separately
    "jax",
    "jaxlib",
]
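The TensorFlow markers above partition platforms the same way the torch markers did; the following sketch mirrors them so the dispatch is easy to eyeball. `tensorflow_packages` is a hypothetical illustration, not an API of any library.

```python
def tensorflow_packages(sys_platform: str, platform_machine: str):
    """Mirror the TensorFlow markers: which packages does a platform install?"""
    if sys_platform == "darwin" and platform_machine == "arm64":
        # Apple Silicon: macOS build plus the Metal GPU plugin
        return ["tensorflow-macos", "tensorflow-metal"]
    # everything else matches "sys_platform != 'darwin' or platform_machine != 'arm64'"
    return ["tensorflow"]

print(tensorflow_packages("darwin", "arm64"))  # ['tensorflow-macos', 'tensorflow-metal']
print(tensorflow_packages("linux", "x86_64"))  # ['tensorflow']
```

The two markers are exact logical complements, so every platform installs exactly one TensorFlow base package.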
Debugging tips
# watch the resolution process (verbose)
uv lock -v
# see which wheel a package actually installed
uv pip show torch
# force re-download, bypassing the cache
uv sync --refresh
# preview with dry-run, without touching the environment
uv sync --dry-run
Multi-platform PyTorch dependencies for AI projects finally have a clean solution: named indexes plus marker-based dispatch under [tool.uv.sources]. For private registries, inject credentials via environment variables or the keyring; never commit tokens to the repository. One pyproject, one uv.lock, and every platform on the team stays in sync. The final chapter is hands-on: CI/CD and Docker.