AI Development Platform ModelArts - Running Inference Tasks with Snt9B on a Lite Cluster Resource Pool: Procedure
Procedure
- Pull the image. The test image is bert_pretrain_mindspore:v1, which already contains the test data and code.
docker pull swr.cn-southwest-2.myhuaweicloud.com/os-public-repo/bert_pretrain_mindspore:v1
docker tag swr.cn-southwest-2.myhuaweicloud.com/os-public-repo/bert_pretrain_mindspore:v1 bert_pretrain_mindspore:v1
- Create a config.yaml file on the host.
The config.yaml file configures the pod. This example starts the pod with the sleep command so that you can enter the pod for debugging. You can also change command to the startup command of your task (for example, "python inference.py"); the task then runs once the container has started.
The content of config.yaml is as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: yourapp
  labels:
    app: infers
spec:
  replicas: 1
  selector:
    matchLabels:
      app: infers
  template:
    metadata:
      labels:
        app: infers
    spec:
      schedulerName: volcano
      nodeSelector:
        accelerator/huawei-npu: ascend-1980
      containers:
      - image: bert_pretrain_mindspore:v1   # Inference image name
        imagePullPolicy: IfNotPresent
        name: mindspore
        command:
        - "sleep"
        - "1000000000000000000"
        resources:
          requests:
            huawei.com/ascend-1980: "1"   # Number of required NPUs; keep the key unchanged. The maximum value is 16. You can add lines below to configure resources such as memory and CPU.
          limits:
            huawei.com/ascend-1980: "1"   # NPU limit; keep the key unchanged. The value must be consistent with that in requests.
        volumeMounts:
        - name: ascend-driver   # Driver mount; do not change
          mountPath: /usr/local/Ascend/driver
        - name: ascend-add-ons   # Driver mount; do not change
          mountPath: /usr/local/Ascend/add-ons
        - name: hccn   # Driver hccn configuration; do not change
          mountPath: /etc/hccn.conf
        - name: npu-smi   # npu-smi
          mountPath: /usr/local/sbin/npu-smi
        - name: localtime   # The container time must be the same as the host time.
          mountPath: /etc/localtime
      volumes:
      - name: ascend-driver
        hostPath:
          path: /usr/local/Ascend/driver
      - name: ascend-add-ons
        hostPath:
          path: /usr/local/Ascend/add-ons
      - name: hccn
        hostPath:
          path: /etc/hccn.conf
      - name: npu-smi
        hostPath:
          path: /usr/local/sbin/npu-smi
      - name: localtime
        hostPath:
          path: /etc/localtime
- Create the pod from config.yaml.
kubectl apply -f config.yaml
- Check whether the pod has started by running the following command. A status of "1/1 Running" indicates that the pod started successfully.
kubectl get pod -A
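The status check above can also be scripted. The following is a minimal sketch, assuming kubectl is on the PATH and that its JSON output (`kubectl get pod -A -o json`) is used instead of the table view. The helper names `ready_pods` and `fetch_pod_list` are hypothetical and are not part of ModelArts or kubectl.

```python
import json
import subprocess


def ready_pods(pod_list):
    """Map "namespace/name" to True when the pod is Running and all containers are ready.

    pod_list is the parsed JSON document printed by `kubectl get pod -o json`.
    """
    result = {}
    for pod in pod_list.get("items", []):
        meta = pod.get("metadata", {})
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        # A pod with no reported container statuses is treated as not ready.
        all_ready = bool(containers) and all(c.get("ready", False) for c in containers)
        key = "{}/{}".format(meta.get("namespace", "default"), meta.get("name", ""))
        result[key] = status.get("phase") == "Running" and all_ready
    return result


def fetch_pod_list():
    """Run kubectl (assumed to be configured for the cluster) and parse its JSON output."""
    completed = subprocess.run(
        ["kubectl", "get", "pod", "-A", "-o", "json"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(completed.stdout)
```

On a host with cluster access, `ready_pods(fetch_pod_list())` returns one entry per pod; the entry for the pod created from config.yaml should be True once it shows "1/1 Running".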
- Enter the container. Replace {pod_name} with the name of your pod (the name shown by get pod) and {namespace} with your namespace (default by default).
kubectl exec -it {pod_name} -n {namespace} -- bash
- Activate the conda environment.
su - ma-user   # Switch to the ma-user user
conda activate MindSpore   # Activate the MindSpore environment
- Create the test code test.py.
from flask import Flask, request
import json

app = Flask(__name__)


@app.route('/greet', methods=['POST'])
def say_hello_func():
    print("----------- in hello func ----------")
    data = json.loads(request.get_data(as_text=True))
    print(data)
    username = data['name']
    rsp_msg = 'Hello, {}!'.format(username)
    return json.dumps({"response": rsp_msg}, indent=4)


@app.route('/goodbye', methods=['GET'])
def say_goodbye_func():
    print("----------- in goodbye func ----------")
    return '\nGoodbye!\n'


@app.route('/', methods=['POST'])
def default_func():
    print("----------- in default func ----------")
    data = json.loads(request.get_data(as_text=True))
    return '\n called default func !\n {} \n'.format(str(data))


# host must be "0.0.0.0", port must be 8080
if __name__ == '__main__':
    app.run(host="0.0.0.0", port=8080)
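Run the code. As shown in the following figure, this deploys an online service, and this container acts as the server.
python test.py
Figure 2 Deploying the online service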
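- Open a new terminal in XShell and enter the container by following steps 5 to 7. This container acts as the client. Run the following commands to verify the three API endpoints of the custom image. If the output matches the following figure, the service was called successfully.
curl -X POST -H "Content-Type: application/json" --data '{"name":"Tom"}' 127.0.0.1:8080/
curl -X POST -H "Content-Type: application/json" --data '{"name":"Tom"}' 127.0.0.1:8080/greet
curl -X GET 127.0.0.1:8080/goodbye
Figure 3 Accessing the online service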
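The same three calls can be made from Python when curl is not available in the client container. This is a minimal sketch using only the standard library; the helper names `build_request` and `call` are hypothetical, and the address is the one used by the curl commands above.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080"  # same address as in the curl commands


def build_request(path, payload=None):
    """Build a JSON POST request, or a GET request when payload is None."""
    url = BASE_URL + path
    if payload is None:
        return urllib.request.Request(url, method="GET")
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def call(path, payload=None, timeout=5):
    """Send the request to the running test.py service and return the body as text."""
    with urllib.request.urlopen(build_request(path, payload), timeout=timeout) as rsp:
        return rsp.read().decode("utf-8")
```

With test.py running in the server container, `call("/", {"name": "Tom"})`, `call("/greet", {"name": "Tom"})`, and `call("/goodbye")` correspond to the three curl commands.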
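Use limits/requests to configure the CPU and memory sizes. A single Snt9B node is known to have 8 Snt9B cards plus 192 CPU cores and 1,536 GB of memory. Plan the values accordingly so that overly small CPU or memory limits do not prevent the task from running properly.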
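One way to plan these values is to give each pod a share of the node's CPU and memory proportional to the number of cards it requests. The sketch below works from the node specification stated above. The function name `proportional_resources`, the 10% headroom left for system processes, and the reading of the node memory as Gi units are all assumptions for illustration, not values from the ModelArts documentation.

```python
NODE_NPUS = 8          # Snt9B cards per node
NODE_CPU_CORES = 192   # CPU cores per node
NODE_MEMORY_GB = 1536  # memory per node


def proportional_resources(npu_count, reserve_percent=10):
    """Return a requests/limits mapping sized in proportion to the requested cards.

    reserve_percent is an assumed headroom kept free for the host; adjust as needed.
    """
    if not 1 <= npu_count <= NODE_NPUS:
        raise ValueError("npu_count must be between 1 and {}".format(NODE_NPUS))
    if not 0 <= reserve_percent < 100:
        raise ValueError("reserve_percent must be in [0, 100)")
    usable = 100 - reserve_percent
    # Integer arithmetic avoids floating-point rounding; results are rounded down.
    cpu = NODE_CPU_CORES * npu_count * usable // (NODE_NPUS * 100)
    memory = NODE_MEMORY_GB * npu_count * usable // (NODE_NPUS * 100)
    return {
        "huawei.com/ascend-1980": str(npu_count),  # key taken from config.yaml
        "cpu": str(cpu),
        "memory": "{}Gi".format(memory),           # assumption: node memory is in Gi
    }
```

For a one-card pod, the returned mapping can be copied into both the requests and limits sections of config.yaml, keeping the card count identical in the two sections as the file's comments require.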