Previous article: AscendCL Quick Start: Model Inference (Part 1)
This article continues from the previous one.
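2. Preparing the model's inputs and outputs
The previous article introduced memory management and showed how to upload the data to be inferred to the Device. The image we transferred to the Device is still a raw byte stream, however, and a raw stream cannot be fed directly into a model. Before inference, the data has to be wrapped in the dedicated data structures the model expects.
All inputs of a model are abstracted as one "DataSet" object, and each individual input is abstracted as one "DataBuffer" object. Suppose, for example, that a model has two inputs: the first is a batch of images, and the second carries information about each image.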
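For such a model the preparation goes as follows:
(1) For the first input, all the images, create one DataBuffer object.
(2) For the second input, the image information, create another DataBuffer object.
(3) Create one DataSet object.
(4) Put the two DataBuffer objects created in steps 1 and 2 into the DataSet object.
A model has exactly one input DataSet, which contains all of its inputs; when there are several inputs, each one is carried by its own DataBuffer.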
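The relationship between the two objects can be sketched in plain C++ before looking at the real interfaces. This is only an illustration under stated assumptions: SketchDataBuffer, SketchDataset, and BuildTwoInputDataset are hypothetical stand-ins invented for this sketch, not AscendCL types or functions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for illustration only; these are NOT AscendCL types.
// A buffer describes one model input: where its bytes live and how many there are.
struct SketchDataBuffer {
    void *data;        // address of the memory holding this input
    std::size_t size;  // size of that memory in bytes
};

// A dataset is created empty and collects buffers in insertion order.
struct SketchDataset {
    std::vector<SketchDataBuffer> buffers;

    // Append a buffer and return its position, which is the input index it feeds.
    std::size_t Add(const SketchDataBuffer &buffer) {
        buffers.push_back(buffer);
        return buffers.size() - 1;
    }
};

// Build the two-input dataset from the example: images first, image info second.
SketchDataset BuildTwoInputDataset(void *images, std::size_t imagesSize,
                                   void *info, std::size_t infoSize) {
    SketchDataset dataset;              // step 3: start from an empty dataset
    dataset.Add({images, imagesSize});  // steps 1 and 4: becomes input 0
    dataset.Add({info, infoSize});      // steps 2 and 4: becomes input 1
    return dataset;
}
```

Because nothing but the position identifies a buffer in this sketch, swapping the two Add calls would silently feed the image information to the image input.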
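Creating a DataBuffer: a DataBuffer is created directly from a memory pointer.
aclDataBuffer *aclCreateDataBuffer(void *data, size_t size)
In this interface's parameter list, data is the address of the memory that holds the data, and size is the size of that memory.
Creating a DataSet: unlike a DataBuffer, a DataSet can only be created empty, because one DataSet may contain several buffers.
aclmdlDataset *aclmdlCreateDataset()
Adding a DataBuffer to a DataSet: no input name is specified when a buffer is added, so if several buffers go into one DataSet, take care to add them in the order of the model's inputs.
aclError aclmdlAddDatasetBuffer(aclmdlDataset *dataset, aclDataBuffer *dataBuffer)
Read the code below to see how these interfaces are called.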
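// Assumes acl/acl.h, <fstream>, <string>, "using namespace std;" and the
// INFO_LOG macro used throughout this series are already in scope.
aclError test4() {
    INFO_LOG("Dataset_databuffer: start.");

    // Initialize AscendCL and select the device.
    aclError ret = aclInit(nullptr);
    int32_t deviceId_ = 0;
    ret = aclrtSetDevice(deviceId_);

    // Read the whole image file into Host memory.
    string picturePath = "./dog1_1024_683.jpg";
    uint32_t pictureDataSize = 0;
    void *pictureHostData = nullptr;
    ifstream pictureFile(picturePath, ifstream::binary);
    pictureFile.seekg(0, pictureFile.end);
    pictureDataSize = pictureFile.tellg();
    pictureFile.seekg(0, pictureFile.beg);
    ret = aclrtMallocHost(&pictureHostData, pictureDataSize);
    pictureFile.read((char*)pictureHostData, pictureDataSize);
    pictureFile.close();

    // Copy the image from Host to Device.
    void *pictureDeviceData = nullptr;
    ret = aclrtMalloc(&pictureDeviceData, pictureDataSize, ACL_MEM_MALLOC_HUGE_FIRST);
    ret = aclrtMemcpy(pictureDeviceData, pictureDataSize, pictureHostData, pictureDataSize, ACL_MEMCPY_HOST_TO_DEVICE);

    // Wrap the Device memory in a DataBuffer and add it to an empty DataSet.
    aclDataBuffer *inputDataBuffer = aclCreateDataBuffer(pictureDeviceData, pictureDataSize);
    aclmdlDataset *inputDataSet = aclmdlCreateDataset();
    ret = aclmdlAddDatasetBuffer(inputDataSet, inputDataBuffer);
    INFO_LOG("AclmdlAddDatasetBuffer ret = %d.", ret);

    // Release the data structures, the memory, and the runtime resources.
    aclDestroyDataBuffer(inputDataBuffer);
    aclmdlDestroyDataset(inputDataSet);
    aclrtFree(pictureDeviceData);
    aclrtFreeHost(pictureHostData);
    ret = aclrtResetDevice(deviceId_);
    aclFinalize();
    INFO_LOG("Dataset_databuffer: end.");
    return ret;
}
test4();

The output data structure is organized the same way: one DataSet holding 1 to N DataBuffers. Once the model is fixed, the number of outputs and the memory each one occupies are fully determined. For example, the output of a classification network with 1000 classes is 1000 groups of data, each consisting of a label and a confidence, 2000 values in total. AscendCL does not allocate output memory automatically during inference, so the output memory, the DataBuffers, and the DataSet must all be prepared before the inference interface is called. This is where the "model description" family of interfaces comes in. Consider the following interfaces: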
aclmdlDesc* aclmdlCreateDesc()
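This interface creates a "model description" object that collects the model's descriptive information, that is, the model's metadata. Once we have an empty description object, we need an interface that analyzes the model and stores what it finds in that object: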
aclError aclmdlGetDesc(aclmdlDesc *modelDesc, uint32_t modelId)
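This interface analyzes the model identified by modelId (remember what modelId is?) and fills the modelDesc object with its description. At that point the analysis of the model is complete.
Read the code below to see how these interfaces are called.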
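A filled-in description reports, for each output, a size in bytes and a data type, and together these give the number of elements in that output. The following is a minimal sketch of that arithmetic, assuming a local enum that mirrors some of the aclDataType values quoted in this article's listing. SketchDataType, ElementSize, and ElementCount are hypothetical names introduced here, not AscendCL APIs.

```cpp
#include <cstddef>

// Local mirror of a few aclDataType values quoted in this article's listing.
// Hypothetical stand-in so the sketch compiles without the AscendCL headers.
enum SketchDataType {
    SKETCH_DT_UNDEFINED = -1,
    SKETCH_FLOAT = 0,
    SKETCH_FLOAT16 = 1,
    SKETCH_INT8 = 2,
    SKETCH_INT32 = 3,
    SKETCH_UINT8 = 4,
    SKETCH_INT64 = 9,
    SKETCH_DOUBLE = 11,
};

// Bytes per element for the types above; 0 means "unknown, do not divide".
std::size_t ElementSize(SketchDataType type) {
    switch (type) {
        case SKETCH_FLOAT:   return 4;
        case SKETCH_FLOAT16: return 2;
        case SKETCH_INT8:    return 1;
        case SKETCH_INT32:   return 4;
        case SKETCH_UINT8:   return 1;
        case SKETCH_INT64:   return 8;
        case SKETCH_DOUBLE:  return 8;
        default:             return 0;
    }
}

// Number of elements in an output of byteSize bytes, or 0 if the type is unknown.
std::size_t ElementCount(std::size_t byteSize, SketchDataType type) {
    const std::size_t elementSize = ElementSize(type);
    return elementSize == 0 ? 0 : byteSize / elementSize;
}
```

In this sketch, an output of 4000 bytes with type SKETCH_FLOAT yields 1000 elements, which is the same division the later PrintResult code performs with sizeof(float).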
aclError test5() {
    INFO_LOG("AclmdlCreateDesc: start.");
    aclError ret = aclInit(nullptr);
    int32_t deviceId_ = 0;
    ret = aclrtSetDevice(deviceId_);

    // Load the offline model; modelId identifies it from now on.
    const char *modelPath = "./googlenet.om";
    uint32_t modelId;
    ret = aclmdlLoadFromFile(modelPath, &modelId);
    INFO_LOG("Model Id = %u.", modelId);

    // Create an empty description and fill it from the loaded model.
    aclmdlDesc *modelDesc = aclmdlCreateDesc();
    ret = aclmdlGetDesc(modelDesc, modelId);

    // Number, byte size, and name of the inputs and outputs.
    INFO_LOG("Function aclmdlGetNumInputs = %zu.", aclmdlGetNumInputs(modelDesc));
    INFO_LOG("Function aclmdlGetNumOutputs = %zu.", aclmdlGetNumOutputs(modelDesc));
    INFO_LOG("Function aclmdlGetInputSizeByIndex = %zu.", aclmdlGetInputSizeByIndex(modelDesc, 0));
    INFO_LOG("Function aclmdlGetOutputSizeByIndex = %zu.", aclmdlGetOutputSizeByIndex(modelDesc, 0));
    INFO_LOG("Function aclmdlGetInputNameByIndex = %s.", aclmdlGetInputNameByIndex(modelDesc, 0));
    INFO_LOG("Function aclmdlGetOutputNameByIndex = %s.", aclmdlGetOutputNameByIndex(modelDesc, 0));

    /*
    typedef enum {
        ACL_FORMAT_UNDEFINED = -1,
        ACL_FORMAT_NCHW = 0,
        ACL_FORMAT_NHWC = 1,
        ACL_FORMAT_ND = 2,
        ACL_FORMAT_NC1HWC0 = 3,
        ACL_FORMAT_FRACTAL_Z = 4,
        ACL_FORMAT_NC1HWC0_C04 = 12,
        ACL_FORMAT_FRACTAL_NZ = 29,
    } aclFormat;
    */
    INFO_LOG("Function aclmdlGetInputFormat = %d.", aclmdlGetInputFormat(modelDesc, 0));
    INFO_LOG("Function aclmdlGetOutputFormat = %d.", aclmdlGetOutputFormat(modelDesc, 0));

    /*
    typedef enum {
        ACL_DT_UNDEFINED = -1,  // unknown data type, the default value
        ACL_FLOAT = 0,
        ACL_FLOAT16 = 1,
        ACL_INT8 = 2,
        ACL_INT32 = 3,
        ACL_UINT8 = 4,
        ACL_INT16 = 6,
        ACL_UINT16 = 7,
        ACL_UINT32 = 8,
        ACL_INT64 = 9,
        ACL_UINT64 = 10,
        ACL_DOUBLE = 11,
        ACL_BOOL = 12,
    } aclDataType;
    */
    INFO_LOG("Function aclmdlGetInputDataType = %d.", aclmdlGetInputDataType(modelDesc, 0));
    INFO_LOG("Function aclmdlGetOutputDataType = %d.", aclmdlGetOutputDataType(modelDesc, 0));

    // Destroy the description, unload the model, and release the runtime.
    ret = aclmdlDestroyDesc(modelDesc);
    aclmdlUnload(modelId);
    ret = aclrtResetDevice(deviceId_);
    aclFinalize();
    INFO_LOG("AclmdlCreateDesc: end.");
    return ret;
}
test5();

3. Performing inference
With the preparation above, we now have the model's modelId, the input DataSet, and the output DataSet, so inference can be executed. Consider the following interface:
aclError aclmdlExecute(uint32_t modelId, const aclmdlDataset *input, aclmdlDataset *output)
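Read the code below. It strings together runtime resource management, memory management and data transfer, and the interfaces covered in this lab into one complete inference flow: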
// State shared by the steps below.
int32_t deviceId_ = 0;
uint32_t modelId = 0;
size_t pictureDataSize = 0;
void *pictureHostData = nullptr;
void *pictureDeviceData = nullptr;
aclmdlDataset *inputDataSet = nullptr;
aclDataBuffer *inputDataBuffer = nullptr;
aclmdlDataset *outputDataSet = nullptr;
aclDataBuffer *outputDataBuffer = nullptr;
aclmdlDesc *modelDesc = nullptr;
size_t outputDataSize = 0;
void *outputDeviceData = nullptr;
void *outputHostData = nullptr;

// Initialize AscendCL and select the device.
aclError InitResource() {
    aclError ret = aclInit(nullptr);
    ret = aclrtSetDevice(deviceId_);
    INFO_LOG("InitResource success!");
    return ret;
}

// Read the whole input file into Host memory.
void ReadPictureToHost(const char *picturePath) {
    string fileName = picturePath;
    ifstream binFile(fileName, ifstream::binary);
    binFile.seekg(0, binFile.end);
    pictureDataSize = binFile.tellg();
    binFile.seekg(0, binFile.beg);
    aclError ret = aclrtMallocHost(&pictureHostData, pictureDataSize);
    binFile.read((char*)pictureHostData, pictureDataSize);
    binFile.close();
    INFO_LOG("ReadPictureToHost !");
}

// Allocate Device memory and copy the input from Host to Device.
void CopyDataFromHostToDevice() {
    aclError ret = aclrtMalloc(&pictureDeviceData, pictureDataSize, ACL_MEM_MALLOC_HUGE_FIRST);
    ret = aclrtMemcpy(pictureDeviceData, pictureDataSize, pictureHostData, pictureDataSize, ACL_MEMCPY_HOST_TO_DEVICE);
    INFO_LOG("CopyDataFromHostToDevice !");
}

// Wrap the Device input memory in a DataBuffer inside the input DataSet.
void CreateModelInput() {
    inputDataSet = aclmdlCreateDataset();
    inputDataBuffer = aclCreateDataBuffer(pictureDeviceData, pictureDataSize);
    aclError ret = aclmdlAddDatasetBuffer(inputDataSet, inputDataBuffer);
    INFO_LOG("CreateModelInput!");
}

// Ask the model description for the output size, then allocate and wrap it.
void CreateModelOutput() {
    modelDesc = aclmdlCreateDesc();
    aclError ret = aclmdlGetDesc(modelDesc, modelId);
    outputDataSet = aclmdlCreateDataset();
    outputDataSize = aclmdlGetOutputSizeByIndex(modelDesc, 0);
    ret = aclrtMalloc(&outputDeviceData, outputDataSize, ACL_MEM_MALLOC_HUGE_FIRST);
    outputDataBuffer = aclCreateDataBuffer(outputDeviceData, outputDataSize);
    ret = aclmdlAddDatasetBuffer(outputDataSet, outputDataBuffer);
    INFO_LOG("CreateModelOutput !");
}

// Must run after LoadModel, because CreateModelOutput needs a valid modelId.
void LoadPicture(const char *picturePath) {
    ReadPictureToHost(picturePath);
    CopyDataFromHostToDevice();
    CreateModelInput();
    CreateModelOutput();
    INFO_LOG("LoadPicture !");
}

void LoadModel(const char *modelPath) {
    aclError ret = aclmdlLoadFromFile(modelPath, &modelId);
    INFO_LOG("LoadModel success !");
}

void Inference() {
    aclError ret = aclmdlExecute(modelId, inputDataSet, outputDataSet);
    INFO_LOG("Inference ret %d !", ret);
}

// Copy the output back to Host and print the five highest scores.
void PrintResult() {
    aclError ret = aclrtMallocHost(&outputHostData, outputDataSize);
    ret = aclrtMemcpy(outputHostData, outputDataSize, outputDeviceData, outputDataSize, ACL_MEMCPY_DEVICE_TO_HOST);
    float *outFloatData = reinterpret_cast<float*>(outputHostData);

    // Keyed by score in descending order; the value is the class index.
    map<float, unsigned int, greater<float>> resultMap;
    for (unsigned int j = 0; j < outputDataSize / sizeof(float); ++j) {
        resultMap[*outFloatData] = j;
        outFloatData++;
    }

    int cnt = 0;
    for (auto it = resultMap.begin(); it != resultMap.end(); ++it) {
        if (++cnt > 5) {
            break;
        }
        INFO_LOG("Top %d: index[%u] value[%lf] ", cnt, it->second, it->first);
    }
}

void UnloadModel() {
    aclmdlDestroyDesc(modelDesc);
    aclmdlUnload(modelId);
    INFO_LOG("UnloadModel success !");
}

// Free input and output memory and destroy their DataBuffers and DataSets.
void UnloadPicture() {
    aclError ret = aclrtFreeHost(pictureHostData);
    pictureHostData = nullptr;
    ret = aclrtFree(pictureDeviceData);
    pictureDeviceData = nullptr;
    aclDestroyDataBuffer(inputDataBuffer);
    inputDataBuffer = nullptr;
    aclmdlDestroyDataset(inputDataSet);
    inputDataSet = nullptr;
    ret = aclrtFreeHost(outputHostData);
    outputHostData = nullptr;
    ret = aclrtFree(outputDeviceData);
    outputDeviceData = nullptr;
    aclDestroyDataBuffer(outputDataBuffer);
    outputDataBuffer = nullptr;
    aclmdlDestroyDataset(outputDataSet);
    outputDataSet = nullptr;
    INFO_LOG("UnloadPicture success !");
}

void DestroyResource() {
    aclError ret = aclrtResetDevice(deviceId_);
    aclFinalize();
    INFO_LOG("DestroyResource success !");
}

void mainTest() {
    const char *picturePath = "dog1_1024_683.bin";
    const char *modelPath = "resnet50.om";
    InitResource();
    LoadModel(modelPath);
    LoadPicture(picturePath);
    Inference();
    PrintResult();
    UnloadModel();
    UnloadPicture();
    DestroyResource();
    return;
}
mainTest();

Looking back over the whole inference lab, the process consists of loading the model into memory, preparing the model's inputs and outputs together with their data structures, and finally running inference to obtain the result.
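One detail of PrintResult is worth a closer look. It collects the scores into a map keyed by the score value, so two classes with exactly the same score share a single entry and only one index survives. The following is a possible alternative for the top-5 step, sketched in standard C++ with std::partial_sort over the class indices. TopK is a hypothetical helper written for this illustration, not part of AscendCL or of the original sample.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <utility>
#include <vector>

// Return the k highest scores as (index, score) pairs, best first.
// Equal scores are all kept; ties are ordered by the lower index.
std::vector<std::pair<std::size_t, float>> TopK(const float *scores,
                                                std::size_t count,
                                                std::size_t k) {
    std::vector<std::size_t> order(count);
    std::iota(order.begin(), order.end(), std::size_t{0});  // 0, 1, ..., count - 1

    k = std::min(k, count);
    // Only the first k positions need to end up sorted.
    std::partial_sort(order.begin(), order.begin() + k, order.end(),
                      [scores](std::size_t a, std::size_t b) {
                          if (scores[a] != scores[b]) {
                              return scores[a] > scores[b];
                          }
                          return a < b;
                      });

    std::vector<std::pair<std::size_t, float>> result;
    result.reserve(k);
    for (std::size_t i = 0; i < k; ++i) {
        result.emplace_back(order[i], scores[order[i]]);
    }
    return result;
}
```

In the flow above, such a helper could be called on the Host copy of the output as TopK(outFloatData, outputDataSize / sizeof(float), 5) before outFloatData is advanced.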
Please credit the source when reposting: AscendCL Quick Start: Model Inference (Part 2) https://www.yhzz.com.cn/a/8772.html