AI Development Platform ModelArts - Disaggregated Deployment of Inference Services: Step 4 Build the Inference Image

Time: 2025-02-25 19:54:48

Step 4 Build the Inference Image

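Extract the AscendCloud package, then extract the inference code package AscendCloud-LLM-6.3.911-xxx.zip and the operator package AscendCloud-OPP-6.3.911-xxx.zip found inside it, and run the build_image.sh script to build the inference image. The build runs git clone over the internet, so make sure the machine can access the public network.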
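Because the build depends on git clone, it may help to confirm that git can reach a remote repository before starting. The following is a minimal sketch, not part of the official procedure. The function name remote_reachable is hypothetical, and this page does not list the repositories that build_image.sh clones, so the URL to test is your choice.

```shell
#!/bin/sh
# remote_reachable URL: succeed only if git can list the branches at URL.
# git ls-remote queries the remote without cloning anything.
# Returns non-zero if git is not installed or the remote cannot be reached.
remote_reachable() {
    git ls-remote --heads "$1" >/dev/null 2>&1
}
```

For example, `remote_reachable https://gitee.com/ascend/vision.git || echo "no access to gitee.com"` checks the repository used later on this page for torchvision_npu.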

unzip AscendCloud-*.zip -d ./AscendCloud && \
cd ./AscendCloud && \
unzip AscendCloud-OPP-*.zip && \
unzip AscendCloud-OPP-*-torch-2.1.0*.zip -d ./AscendCloud-OPP && \
cd .. && \
unzip ./AscendCloud/AscendCloud-LLM-*.zip -d ./AscendCloud/AscendCloud-LLM && \
cd ./AscendCloud/AscendCloud-LLM/llm_inference/ascend_vllm/ && \
sh build_image.sh --base-image=${base_image} --image-name=${image_name}

Parameters:

${base_image}: address of the base image.
${image_name}: name of the inference image. You can choose any name.
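As a minimal sketch of how the two parameters might be supplied, the snippet below sets them as shell variables and refuses to continue if either is empty. The image address and name shown are placeholders (assumptions), not values from this guide, and require_var is a hypothetical helper. The snippet only prints the resulting command so that it can be reviewed before it is run.

```shell
#!/bin/sh
# Placeholder values (assumptions): replace with your base image address and chosen name.
base_image="${base_image:-example.registry/base/image:tag}"
image_name="${image_name:-my-ascend-vllm:latest}"

# require_var NAME VALUE: fail with a message on stderr when VALUE is empty.
require_var() {
    if [ -z "$2" ]; then
        echo "error: $1 is not set" >&2
        return 1
    fi
}

# Print the build command only when both parameters are non-empty.
require_var base_image "$base_image" && require_var image_name "$image_name" \
    && echo "sh build_image.sh --base-image=${base_image} --image-name=${image_name}"
```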

When the script finishes, the image required for inference is generated.
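One way to confirm that the image was generated is to look it up locally. The sketch below assumes the image is built with Docker and is visible to the docker CLI; this page does not state which container tool build_image.sh uses, so adjust accordingly. The helper names image_exists and report_image are hypothetical.

```shell
#!/bin/sh
# image_exists IMAGE: succeed only if a local image with this name is found.
# Assumes the Docker CLI; returns non-zero if docker is missing or the image is absent.
image_exists() {
    docker image inspect "$1" >/dev/null 2>&1
}

# report_image IMAGE: print a one-line status for the given image name.
report_image() {
    if image_exists "$1"; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}
```

Calling `report_image "${image_name}"` after the build reports whether the image named by ${image_name} is present.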

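If inference needs NPU-accelerated image preprocessing, install torchvision_npu. The installation commands can be added to the image build script, as follows: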

git clone https://gitee.com/ascend/vision.git vision_npu
cd vision_npu
git checkout v0.16.0-6.0.rc3
# Install dependencies
pip3 install -r requirement.txt
# Build the wheel package
python setup.py bdist_wheel
# Install
cd dist
pip install torchvision_npu-0.16.*.whl
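After installation, a quick import check can show whether the package is usable in the image. This is a sketch under assumptions: the helper module_importable and the PYTHON override variable are hypothetical, and importing torchvision_npu is expected to also require the PyTorch and NPU runtime packages already present in the image.

```shell
#!/bin/sh
# module_importable MODULE: succeed only if the Python interpreter can import MODULE.
# PYTHON is a hypothetical override variable; python3 is assumed by default.
module_importable() {
    "${PYTHON:-python3}" -c "import $1" >/dev/null 2>&1
}
```

For example, `module_importable torchvision_npu && echo "torchvision_npu is importable"` prints the message only when the import succeeds.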
