AI Development Platform ModelArts - Using AWQ Quantization: Step1 Environment Preparation

Time: 2025-01-03 09:38:58

Step1 Environment Preparation

  1. Create a config.yaml file in the node custom directory ${node-path}:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: yourapp
      labels:
        app: infers
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: infers
      template:
        metadata:
          labels:
            app: infers
        spec:
          schedulerName: volcano
          nodeSelector:
            accelerator/huawei-npu: ascend-1980
          containers:
          - image: ${image_name}                  # inference image name
            imagePullPolicy: IfNotPresent
            name: ${container_name}
            securityContext:
              runAsUser: 0
            ports:
            - containerPort: 8080
            command: 
            - "sleep"
            - "1000000000000000000"
            resources:
              requests:
                huawei.com/ascend-1980: "8"             # number of cards requested; keep the key unchanged
              limits:
                huawei.com/ascend-1980: "8"             # card limit; keep the key unchanged
            volumeMounts:                             # mount paths inside the container
            - name: ascend-driver               # driver mount; leave unchanged
              mountPath: /usr/local/Ascend/driver
            - name: ascend-add-ons           # driver mount; leave unchanged
              mountPath: /usr/local/Ascend/add-ons
            - name: hccn                             # driver hccn configuration; leave unchanged
              mountPath: /etc/hccn.conf
            - name: localtime
              mountPath: /etc/localtime
            - name: npu-smi                             # npu-smi
              mountPath: /usr/local/sbin/npu-smi
            - name: model-path                       # model weights path
              mountPath: ${model-path}
            - name: node-path                       # node custom directory; contains the pod configuration file config.yaml
              mountPath: ${node-path}
          volumes:                                   # host paths on the physical machine
          - name: ascend-driver
            hostPath:
              path: /usr/local/Ascend/driver
          - name: ascend-add-ons
            hostPath:
              path: /usr/local/Ascend/add-ons
          - name: hccn
            hostPath:
              path: /etc/hccn.conf
          - name: localtime
            hostPath:
              path: /etc/localtime
          - name: npu-smi
            hostPath:
              path: /usr/local/sbin/npu-smi
          - name: model-path
            hostPath:
              path: ${model-path}
          - name: node-path
            hostPath:
              path: ${node-path}

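    Parameter description:

    • ${container_name}: the container name. You can choose any name here, for example ascend-vllm.
    • ${image_name}: the name of the inference image built in "Step3 Build the Inference Image".
    • ${node-path}: the node custom directory, which contains the pod configuration file config.yaml.
    • ${model-path}: the path of the model weights uploaded in "Step1 Upload Weight Files".
  2. Create a pod as described in "Step4 Create a Pod". The pod is used for model quantization in the following steps.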
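
The ${...} placeholders in config.yaml must be replaced with real values before the file can be used to create a pod. The following is a minimal sketch of one way to do that in Python. It is not part of the original procedure: the helper name `render`, the `PLACEHOLDER` pattern, and all sample values are hypothetical. A regular expression is used rather than `string.Template` because names such as `model-path` contain a hyphen, which the default `string.Template` pattern does not accept.

```python
import re

# Matches placeholders such as ${image_name} or ${model-path}.
PLACEHOLDER = re.compile(r"\$\{([A-Za-z0-9_-]+)\}")


def render(template: str, values: dict[str, str]) -> str:
    """Replace every ${name} in template with values[name].

    Raises KeyError if a placeholder has no value, so an incomplete
    config is never produced silently.
    """
    def substitute(match: re.Match) -> str:
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value supplied for ${{{name}}}")
        return values[name]

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    # Hypothetical sample values; replace them with your own.
    values = {
        "image_name": "my-registry/ascend-vllm:latest",
        "container_name": "ascend-vllm",
        "model-path": "/data/weights",
        "node-path": "/data/node",
    }
    snippet = (
        "- image: ${image_name}\n"
        "  name: ${container_name}\n"
        "  mountPath: ${model-path}\n"
    )
    print(render(snippet, values))
```

In practice, the full text of config.yaml would be read from ${node-path}, passed through `render`, and written back before the pod is created. Failing on a missing value is a deliberate choice here, since a manifest that still contains a literal ${...} is unlikely to be what was intended.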
support.huaweicloud.com/bestpractice-modelarts/modelarts_llm_infer_91030.html