
Continuing from the previous article, "AscendCL Quick Start: Model Inference (Part 1)":

2. Preparing the model's inputs and outputs

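The previous article introduced memory management and explained how to upload the data to be inferred to the Device. The image we transferred to the Device is still a raw byte stream, and a raw stream cannot be fed directly into a model. Before inference, the data has to be wrapped in the data structures the model expects.

All inputs of a model are abstracted as one "DataSet" object, and each individual input is abstracted as one "DataBuffer" object.

For example, suppose a model has two inputs: the first is a batch of images, and the second is the per-image information (data describing each image).

Step 1: For the first input (all the images), create a DataBuffer object.
Step 2: For the second input (the image information), create another DataBuffer object.
Step 3: Create a DataSet object.
Step 4: Put the two DataBuffer objects created in steps 1 and 2 into the DataSet object.

A model has exactly one input DataSet, which contains all of its inputs. When there are multiple inputs, each one is carried by its own DataBuffer.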

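Creating a DataBuffer: a DataBuffer is created directly from a memory pointer. In the parameter list of the interface below, data is the address of the memory that holds the data, and size is the size of that memory in bytes.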

aclDataBuffer *aclCreateDataBuffer(void *data, size_t size)

Creating a DataSet: unlike a DataBuffer, a DataSet can only be created empty, because one DataSet may contain multiple DataBuffers.

aclmdlDataset *aclmdlCreateDataset()

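Adding a DataBuffer to a DataSet: no input name is specified when a buffer is added, so if several DataBuffers go into the same DataSet, add them in the same order as the model's inputs.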

aclError aclmdlAddDatasetBuffer(aclmdlDataset *dataset, aclDataBuffer *dataBuffer)
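The relationship between the two objects can be pictured with a small standalone sketch. The `DataBuffer` and `DataSet` structs and the `BuildTwoInputDataSet` helper below are hypothetical stand-ins written for illustration, not the AscendCL types. They only mirror the idea that a buffer is a pointer plus a size, and that a dataset keeps its buffers in the order they were added.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT the AscendCL types.
// A DataBuffer describes one model input: where the bytes live and how many there are.
struct DataBuffer {
    void*       data;  // address of the memory holding this input
    std::size_t size;  // size of that memory in bytes
};

// A DataSet is an ordered collection of DataBuffers, one per model input.
struct DataSet {
    std::vector<DataBuffer> buffers;

    // Buffers carry no name, so the position in the vector is their only identity.
    void Add(const DataBuffer& buffer) { buffers.push_back(buffer); }

    const DataBuffer& At(std::size_t index) const {
        assert(index < buffers.size());
        return buffers[index];
    }
};

// Build the two-input DataSet from the example: images first, image information second.
DataSet BuildTwoInputDataSet(std::vector<unsigned char>& images,
                             std::vector<float>& imageInfo) {
    DataBuffer imageBuffer{images.data(), images.size()};
    DataBuffer infoBuffer{imageInfo.data(), imageInfo.size() * sizeof(float)};

    DataSet dataSet;           // created empty, then filled
    dataSet.Add(imageBuffer);  // becomes input 0
    dataSet.Add(infoBuffer);   // becomes input 1
    return dataSet;
}
```

Swapping the two `Add` calls would silently hand the image information to the model as input 0, which is why the order of insertion matters.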

Read the code below to get a feel for how these interfaces are called.

// INFO_LOG is assumed to be a printf-style logging macro defined elsewhere.
aclError test4() {
  INFO_LOG("Dataset_databuffer: start.");
  aclError ret = aclInit(nullptr);
  int32_t deviceId_ = 0;
  ret = aclrtSetDevice(deviceId_);

  // Read the whole image file into Host memory.
  string picturePath = "./dog1_1024_683.jpg";
  uint32_t pictureDataSize = 0;
  void *pictureHostData = nullptr;
  ifstream pictureFile(picturePath, ifstream::binary);
  pictureFile.seekg(0, pictureFile.end);
  pictureDataSize = pictureFile.tellg();
  pictureFile.seekg(0, pictureFile.beg);
  ret = aclrtMallocHost(&pictureHostData, pictureDataSize);
  pictureFile.read((char*)pictureHostData, pictureDataSize);
  pictureFile.close();

  // Copy the image from Host memory to Device memory.
  void *pictureDeviceData = nullptr;
  ret = aclrtMalloc(&pictureDeviceData, pictureDataSize, ACL_MEM_MALLOC_HUGE_FIRST);
  ret = aclrtMemcpy(pictureDeviceData, pictureDataSize, pictureHostData, pictureDataSize, ACL_MEMCPY_HOST_TO_DEVICE);

  // Wrap the Device memory in a DataBuffer and add it to an empty DataSet.
  aclDataBuffer *inputDataBuffer = aclCreateDataBuffer(pictureDeviceData, pictureDataSize);
  aclmdlDataset *inputDataSet = aclmdlCreateDataset();
  ret = aclmdlAddDatasetBuffer(inputDataSet, inputDataBuffer);
  INFO_LOG("AclmdlAddDatasetBuffer ret = %d.", ret);

  // Release everything that was created above.
  aclDestroyDataBuffer(inputDataBuffer);
  aclmdlDestroyDataset(inputDataSet);
  aclrtFree(pictureDeviceData);
  aclrtFreeHost(pictureHostData);
  ret = aclrtResetDevice(deviceId_);
  aclFinalize();
  INFO_LOG("Dataset_databuffer: end.");
  return ret;
}

// Notebook-style invocation; in a standalone program, call test4() from main().
test4();

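The output is organized the same way: one DataSet holding 1 to N DataBuffers. Once a model is fixed, the number of outputs and the amount of memory each one occupies are fully determined. For example, a classification network with 1000 classes typically produces one output of 1000 values, one confidence score per class, with the class label given by the position of the value in the output. AscendCL does not allocate output memory automatically during inference, so the output memory, DataBuffer, and DataSet must all be prepared before the inference interface is called. This is where the "model description" family of interfaces comes in. Consider the following interfaces: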

aclmdlDesc* aclmdlCreateDesc()

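This interface creates a "model description" object, which collects descriptive information about a model, that is, the model's metadata. Once we have an empty model description object, we need an interface that analyzes the model and stores its information in that object: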

aclError aclmdlGetDesc(aclmdlDesc *modelDesc, uint32_t modelId)

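This interface analyzes the model identified by modelId (recall that modelId is the ID returned when the model was loaded) and fills the modelDesc object with its description. With that, the analysis of the model is complete.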
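Because the number and size of the outputs are known from the description, the output memory can be prepared in a loop before inference runs. The sketch below shows that idea in isolation. `ModelDescSketch`, `OutputSlot`, and `PrepareOutputs` are hypothetical names invented for this illustration, and plain host memory stands in for the Device memory a real AscendCL program would allocate.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the metadata a model description exposes.
struct ModelDescSketch {
    std::vector<std::size_t> outputSizes;  // byte size of each output, in output order
};

// One pre-allocated output. Plain host memory is used here for illustration;
// a real program would allocate Device memory instead.
struct OutputSlot {
    std::vector<unsigned char> memory;
};

// Allocate one slot per model output, sized from the description,
// so that everything exists before inference is started.
std::vector<OutputSlot> PrepareOutputs(const ModelDescSketch& desc) {
    std::vector<OutputSlot> outputs;
    outputs.reserve(desc.outputSizes.size());
    for (std::size_t size : desc.outputSizes) {
        assert(size > 0);  // a zero-byte output would point to a bad description
        outputs.push_back(OutputSlot{std::vector<unsigned char>(size)});
    }
    return outputs;
}
```

A description with a single output of 1000 float values would yield one slot of `1000 * sizeof(float)` bytes.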

Read the code below to get a feel for how these interfaces are called.

// INFO_LOG is assumed to be a printf-style logging macro defined elsewhere.
aclError test5() {
  INFO_LOG("AclmdlCreateDesc: start.");
  aclError ret = aclInit(nullptr);
  int32_t deviceId_ = 0;
  ret = aclrtSetDevice(deviceId_);

  // Load the model; modelId identifies it in all later calls.
  const char *modelPath = "./googlenet.om";
  uint32_t modelId;
  ret = aclmdlLoadFromFile(modelPath, &modelId);
  INFO_LOG("Model Id = %u.", modelId);

  // Create an empty description object and fill it from the loaded model.
  aclmdlDesc *modelDesc = aclmdlCreateDesc();
  ret = aclmdlGetDesc(modelDesc, modelId);

  INFO_LOG("Function aclmdlGetNumInputs = %zu.", aclmdlGetNumInputs(modelDesc));
  INFO_LOG("Function aclmdlGetNumOutputs = %zu.", aclmdlGetNumOutputs(modelDesc));
  INFO_LOG("Function aclmdlGetInputSizeByIndex = %zu.", aclmdlGetInputSizeByIndex(modelDesc, 0));
  INFO_LOG("Function aclmdlGetOutputSizeByIndex = %zu.", aclmdlGetOutputSizeByIndex(modelDesc, 0));
  INFO_LOG("Function aclmdlGetInputNameByIndex = %s.", aclmdlGetInputNameByIndex(modelDesc, 0));
  INFO_LOG("Function aclmdlGetOutputNameByIndex = %s.", aclmdlGetOutputNameByIndex(modelDesc, 0));

  /*
  typedef enum {
    ACL_FORMAT_UNDEFINED = -1,
    ACL_FORMAT_NCHW = 0,
    ACL_FORMAT_NHWC = 1,
    ACL_FORMAT_ND = 2,
    ACL_FORMAT_NC1HWC0 = 3,
    ACL_FORMAT_FRACTAL_Z = 4,
    ACL_FORMAT_NC1HWC0_C04 = 12,
    ACL_FORMAT_FRACTAL_NZ = 29,
  } aclFormat;
  */
  INFO_LOG("Function aclmdlGetInputFormat = %d.", aclmdlGetInputFormat(modelDesc, 0));
  INFO_LOG("Function aclmdlGetOutputFormat = %d.", aclmdlGetOutputFormat(modelDesc, 0));

  /*
  typedef enum {
    ACL_DT_UNDEFINED = -1,  // unknown data type, the default value
    ACL_FLOAT = 0,
    ACL_FLOAT16 = 1,
    ACL_INT8 = 2,
    ACL_INT32 = 3,
    ACL_UINT8 = 4,
    ACL_INT16 = 6,
    ACL_UINT16 = 7,
    ACL_UINT32 = 8,
    ACL_INT64 = 9,
    ACL_UINT64 = 10,
    ACL_DOUBLE = 11,
    ACL_BOOL = 12,
  } aclDataType;
  */
  INFO_LOG("Function aclmdlGetInputDataType = %d.", aclmdlGetInputDataType(modelDesc, 0));
  INFO_LOG("Function aclmdlGetOutputDataType = %d.", aclmdlGetOutputDataType(modelDesc, 0));

  // Release the description, the model, and the device.
  ret = aclmdlDestroyDesc(modelDesc);
  aclmdlUnload(modelId);
  ret = aclrtResetDevice(deviceId_);
  aclFinalize();
  INFO_LOG("AclmdlCreateDesc: end.");
  return ret;
}

// Notebook-style invocation; in a standalone program, call test5() from main().
test5();
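The data type is logged above as a bare integer, which is hard to read. A minimal sketch of how those integers could be turned into names and element counts follows. The numeric values are copied from the aclDataType listing in the code comment above; the helper names `DataTypeName`, `DataTypeBytes`, and `ElementCount` are hypothetical, and the byte widths are inferred from the type names (treating ACL_BOOL as 1 byte is an assumption).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Map an aclDataType value (as printed with %d) to its name.
// Values are taken from the enum listing quoted in the article.
std::string DataTypeName(int value) {
    switch (value) {
        case -1: return "ACL_DT_UNDEFINED";
        case 0:  return "ACL_FLOAT";
        case 1:  return "ACL_FLOAT16";
        case 2:  return "ACL_INT8";
        case 3:  return "ACL_INT32";
        case 4:  return "ACL_UINT8";
        case 6:  return "ACL_INT16";
        case 7:  return "ACL_UINT16";
        case 8:  return "ACL_UINT32";
        case 9:  return "ACL_INT64";
        case 10: return "ACL_UINT64";
        case 11: return "ACL_DOUBLE";
        case 12: return "ACL_BOOL";
        default: return "(not in the listing)";
    }
}

// Bytes per element, inferred from the type names; returns 0 when unknown.
// Assumption: ACL_BOOL occupies 1 byte.
std::size_t DataTypeBytes(int value) {
    switch (value) {
        case 2: case 4: case 12:  return 1;
        case 1: case 6: case 7:   return 2;
        case 0: case 3: case 8:   return 4;
        case 9: case 10: case 11: return 8;
        default:                  return 0;
    }
}

// Number of elements in a buffer of byteSize bytes holding the given data type.
std::size_t ElementCount(std::size_t byteSize, int dataType) {
    const std::size_t elementBytes = DataTypeBytes(dataType);
    assert(elementBytes != 0);  // the caller must pass a type with a known width
    return byteSize / elementBytes;
}
```

For instance, an output of 4000 bytes whose data type prints as 0 (ACL_FLOAT) holds 1000 elements.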

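3. Running inference

With the preparation above finished, we now have the model's modelId, the input DataSet, and the output DataSet, so inference can be executed. Consider the following interface: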

aclError aclmdlExecute(uint32_t modelId, const aclmdlDataset *input, aclmdlDataset *output)
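Like most calls shown so far, this interface reports its outcome through its return value, so a pipeline can stop at the first step that fails instead of carrying on with invalid state. The following is a minimal sketch of that pattern, not part of AscendCL: `StatusSketch`, `kSuccessSketch`, `Step`, and `RunPipeline` are hypothetical names, and treating 0 as the success code is an assumption made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <functional>
#include <string>
#include <vector>

using StatusSketch = int;                   // hypothetical stand-in for aclError
constexpr StatusSketch kSuccessSketch = 0;  // assumption: 0 means success

// One named stage of the pipeline, e.g. "load model" or "execute".
struct Step {
    std::string name;
    std::function<StatusSketch()> run;
};

// Run the steps in order and stop at the first failure.
// Returns the index of the failing step, or -1 if every step succeeded.
int RunPipeline(const std::vector<Step>& steps) {
    for (std::size_t i = 0; i < steps.size(); ++i) {
        assert(steps[i].run);  // every step must have a callable attached
        const StatusSketch status = steps[i].run();
        if (status != kSuccessSketch) {
            std::fprintf(stderr, "%s failed, ret = %d\n", steps[i].name.c_str(), status);
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

In a real program each step would wrap one AscendCL call and return its status, so that a failed model load never reaches the execute step.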

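The code below ties together everything covered so far (runtime resource management, memory management and data transfer, and the interfaces introduced in this experiment) into one complete inference flow: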

#include <fstream>
#include <functional>
#include <map>
#include <string>
#include "acl/acl.h"
using namespace std;
// INFO_LOG is assumed to be a printf-style logging macro defined elsewhere.

int32_t deviceId_ = 0;
uint32_t modelId = 0;
size_t pictureDataSize = 0;
void *pictureHostData = nullptr;
void *pictureDeviceData = nullptr;
aclmdlDataset *inputDataSet = nullptr;
aclDataBuffer *inputDataBuffer = nullptr;
aclmdlDataset *outputDataSet = nullptr;
aclDataBuffer *outputDataBuffer = nullptr;
aclmdlDesc *modelDesc = nullptr;
size_t outputDataSize = 0;
void *outputDeviceData = nullptr;
void *outputHostData = nullptr;

aclError InitResource() {
  aclError ret = aclInit(nullptr);
  ret = aclrtSetDevice(deviceId_);
  INFO_LOG("InitResource success!");
  return ret;
}

// Read the whole input file into Host memory.
void ReadPictureTotHost(const char *picturePath) {
  string fileName = picturePath;
  ifstream binFile(fileName, ifstream::binary);
  binFile.seekg(0, binFile.end);
  pictureDataSize = binFile.tellg();
  binFile.seekg(0, binFile.beg);
  aclError ret = aclrtMallocHost(&pictureHostData, pictureDataSize);
  binFile.read((char*)pictureHostData, pictureDataSize);
  binFile.close();
  INFO_LOG("ReadPictureTotHost !");
}

void CopyDataFromHostToDevice() {
  aclError ret = aclrtMalloc(&pictureDeviceData, pictureDataSize, ACL_MEM_MALLOC_HUGE_FIRST);
  ret = aclrtMemcpy(pictureDeviceData, pictureDataSize, pictureHostData, pictureDataSize, ACL_MEMCPY_HOST_TO_DEVICE);
  INFO_LOG("CopyDataFromHostToDevice !");
}

void CreateModelInput() {
  inputDataSet = aclmdlCreateDataset();
  inputDataBuffer = aclCreateDataBuffer(pictureDeviceData, pictureDataSize);
  aclError ret = aclmdlAddDatasetBuffer(inputDataSet, inputDataBuffer);
  INFO_LOG("CreateModelInput!");
}

// Size the output from the model description; the model must already be loaded.
void CreateModelOutput() {
  modelDesc = aclmdlCreateDesc();
  aclError ret = aclmdlGetDesc(modelDesc, modelId);
  outputDataSet = aclmdlCreateDataset();
  outputDataSize = aclmdlGetOutputSizeByIndex(modelDesc, 0);
  ret = aclrtMalloc(&outputDeviceData, outputDataSize, ACL_MEM_MALLOC_HUGE_FIRST);
  outputDataBuffer = aclCreateDataBuffer(outputDeviceData, outputDataSize);
  ret = aclmdlAddDatasetBuffer(outputDataSet, outputDataBuffer);
  INFO_LOG("CreateModelOutput !");
}

void LoadPicture(const char *picturePath) {
  ReadPictureTotHost(picturePath);
  CopyDataFromHostToDevice();
  CreateModelInput();
  CreateModelOutput();
  INFO_LOG("LoadPicture !");
}

void LoadModel(const char *modelPath) {
  aclError ret = aclmdlLoadFromFile(modelPath, &modelId);
  INFO_LOG("LoadModel success !");
}

void Inference() {
  aclError ret = aclmdlExecute(modelId, inputDataSet, outputDataSet);
  INFO_LOG("Inference ret %d !", ret);
}

// Copy the output back to the Host and print the five highest scores.
void PrintResult() {
  aclError ret = aclrtMallocHost(&outputHostData, outputDataSize);
  ret = aclrtMemcpy(outputHostData, outputDataSize, outputDeviceData, outputDataSize, ACL_MEMCPY_DEVICE_TO_HOST);
  float *outFloatData = reinterpret_cast<float*>(outputHostData);

  // Keyed by score in descending order, so iteration starts at the best score.
  map<float, unsigned int, greater<float>> resultMap;
  for (unsigned int j = 0; j < outputDataSize / sizeof(float); ++j) {
    resultMap[*outFloatData] = j;
    outFloatData++;
  }

  int cnt = 0;
  for (auto it = resultMap.begin(); it != resultMap.end(); ++it) {
    if (++cnt > 5) {
      break;
    }
    INFO_LOG("Top %d: index[%u] value[%lf] ", cnt, it->second, it->first);
  }
}

void UnloadModel() {
  aclmdlDestroyDesc(modelDesc);
  aclmdlUnload(modelId);
  INFO_LOG("UnloadModel success !");
}

void UnloadPicture() {
  aclError ret = aclrtFreeHost(pictureHostData);
  pictureHostData = nullptr;
  ret = aclrtFree(pictureDeviceData);
  pictureDeviceData = nullptr;
  aclDestroyDataBuffer(inputDataBuffer);
  inputDataBuffer = nullptr;
  aclmdlDestroyDataset(inputDataSet);
  inputDataSet = nullptr;
  ret = aclrtFreeHost(outputHostData);
  outputHostData = nullptr;
  ret = aclrtFree(outputDeviceData);
  outputDeviceData = nullptr;
  aclDestroyDataBuffer(outputDataBuffer);
  outputDataBuffer = nullptr;
  aclmdlDestroyDataset(outputDataSet);
  outputDataSet = nullptr;
  INFO_LOG("UnloadPicture success !");
}

void DestroyResource() {
  aclError ret = aclrtResetDevice(deviceId_);
  aclFinalize();
  INFO_LOG("DestroyResource success !");
}

void mainTest() {
  const char *picturePath = "dog1_1024_683.bin";
  const char *modelPath = "resnet50.om";
  InitResource();
  LoadModel(modelPath);
  LoadPicture(picturePath);
  Inference();
  PrintResult();
  UnloadModel();
  UnloadPicture();
  DestroyResource();
  return;
}

// Notebook-style invocation; in a standalone program, call mainTest() from main().
mainTest();
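PrintResult above ranks the outputs by inserting each score into a map keyed by the score itself, which means two classes with exactly the same score overwrite each other. As an alternative, the sketch below ranks indices with `std::partial_sort` and keeps ties. This is a swapped-in technique rather than the approach used in the article's code, and `TopK` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Return the k highest-scoring (index, score) pairs, best first.
// Equal scores are both kept and ordered by ascending index.
std::vector<std::pair<std::size_t, float>> TopK(const float* scores,
                                                std::size_t count,
                                                std::size_t k) {
    assert(scores != nullptr || count == 0);
    k = std::min(k, count);

    // Sort indices rather than scores, so each score keeps its class index.
    std::vector<std::size_t> order(count);
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::partial_sort(order.begin(),
                      order.begin() + static_cast<std::ptrdiff_t>(k),
                      order.end(),
                      [scores](std::size_t a, std::size_t b) {
                          if (scores[a] != scores[b]) {
                              return scores[a] > scores[b];
                          }
                          return a < b;  // deterministic order for ties
                      });

    std::vector<std::pair<std::size_t, float>> result;
    result.reserve(k);
    for (std::size_t i = 0; i < k; ++i) {
        result.emplace_back(order[i], scores[order[i]]);
    }
    return result;
}
```

With the host-side float array from PrintResult, a call such as `TopK(outFloatData, outputDataSize / sizeof(float), 5)` would produce the five entries to print.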

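Looking back over the whole inference experiment, it breaks down into three stages: loading the model into memory, preparing the model's inputs and outputs (that is, building the required data structures), and finally running inference to obtain the result.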
