[python][hostlib v0.4] Inference got no result with retrained model (MobileNetV1)
We have a retrained MobileNetV1 model. After converting it with the toolchain to `fw_info.bin` and `all_models.bin`, I followed the python example and `kdp_wrapper.py` in hostlib v0.4 to successfully init DME mode and load the converted model. However, although `kdp_dme_inference()` returns 0, `res_flag` is still `False` and `inf_size` is 0.
Here is how I init the device and load the model (it basically follows `kdp_dme_load_*_models` in `kdp_wrapper`):
```python
def load_model(dev_idx, model_dir):
    model_id = 0
    ret_size = 0
    data = (ctypes.c_char * FW_SIZE)()
    p_buf = (ctypes.c_char * MODEL_SIZE)()

    print("loading models to Kneron Device: ")
    n_len = api.read_file_to_buf(data, pjoin(model_dir, 'fw_info.bin'), FW_SIZE)
    if n_len <= 0:
        print("reading fw setup file failed: {}...\n".format(n_len))
        return -1
    dat_size = n_len
    print(dat_size)

    n_len = api.read_file_to_buf(p_buf, pjoin(model_dir, 'all_models.bin'), MODEL_SIZE)
    if n_len <= 0:
        print("reading model file failed: {}...\n".format(n_len))
        return -1
    buf_len = n_len
    model_size = n_len
    print(buf_len, model_size)

    print("starting DME mode ...\n")
    ret, ret_size = api.kdp_start_dme(
        dev_idx, model_size, data, dat_size, ret_size, p_buf, buf_len)
    if ret:
        print("could not set to DME mode:{}..\n".format(ret_size))
        return -1
    time.sleep(0.1)

    # dme configuration
    model_id = 1    # model id when compiling in toolchain
    output_num = 1  # number of output nodes for the model
    image_col = 128
    image_row = 128
    image_ch = 3
    image_format = (constants.IMAGE_FORMAT_SUB128 |
                    constants.NPU_FORMAT_RGB565 |
                    constants.IMAGE_FORMAT_RAW_OUTPUT |
                    constants.IMAGE_FORMAT_CHANGE_ASPECT_RATIO)
    dme_cfg = constants.KDPDMEConfig(model_id, output_num, image_col,
                                     image_row, image_ch, image_format)
    dat_size = ctypes.sizeof(dme_cfg)

    print("starting DME configure ...\n")
    ret, model_id = api.kdp_dme_configure(
        dev_idx, ctypes.cast(ctypes.byref(dme_cfg), ctypes.c_char_p),
        dat_size, model_id)
    if ret:
        print("could not set to DME configure mode..\n")
        return -1
    time.sleep(0.1)
```
Then inference:
```python
inf_size = 0
inf_res = (ctypes.c_char * 256000)()
res_flag = False
mode = 0
model_id = 1
status = 0

_ret, ssid, res_flag = api.kdp_dme_inference(
    dev_idx, img_buf, buf_len, inf_size, res_flag, inf_res, mode, model_id)
```
The values of `_ret`, `inf_size`, `status`, and `res_flag` are all zero or `False`.
Comments
Related issue:
Warnings when running `fpAnalyserCompilerIpevaluator_520.py` — Kneron Developer Forums
`mode = 0` in `kdp_dme_inference` means serial mode. In serial mode, if the return value `_ret` of the inference is 0, just use the API `kdp_dme_retrieve_res` with the `inf_res` pointer to get the inference result.
e.g.
api.kdp_dme_retrieve_res(dev_idx, 0, inf_size, inf_res)
Make sure the `model_id` you set to "1" matches the parameter you set in the toolchain. You can find the parameter "id" in the file "batch_input_params.json" inside the toolchain.
And we recommend you use the latest host_lib, which includes the latest fw binaries, instead of v0.4.
Thanks for the reply! I've migrated to v0.8 (because v0.9 doesn't have python bindings), and I still get no result. By the way, the host_lib download link in KL520 - Document Center (kneron.com) is still v0.4.
The only change I made is to add additional parameters to `KDPDMEConfig()`, and I did use the correct `model_id` (1). After calling `api.kdp_dme_retrieve_res()`, I print `inf_res` directly, and the values are all zero. This is my `fw_info.txt`:
Due to some reasons, here are a few operations that we recommend you to try.
Hi, I've run `update_fw` and regenerated the model with the correct model_id (1000), but I still get no result; the data in the `inf_res` buffer are all zeros.

As far as I know, since we're using a classification model (instead of detection), it should still produce some result even if the image pre-processing is wrong, is that correct?
Yes, if the inference process completes, there should be some result, whether it's right or wrong.
Would you mind providing your model binaries and python example code so we can check the issue?
This archive contains the original test model and the converted firmware binary we provided in another thread (Warnings when running `fpAnalyserCompilerIpevaluator_520.py` — Kneron Developer Forums), the python code (`dt42.py`) I use to test, and some test images (under `numeral_recognition_test_set/`).

I normally put `dt42.py` under `host_lib/python`, then run `python dt42.py <DIR-TO-FIRMWARE> <TEST-IMG>`, e.g., `python dt42.py ~/test_model_for_kneron/batch_compile ~/test_model_for_kneron/numeral_recognition_test_set/0.jpg`
I've tested on both native Ubuntu and a VM in Windows, and both yield the same result.
Because `inf_size` = 0, `kdp_dme_retrieve_res` could not get the correct length to receive the result data. Refer to the return values of `api.kdp_dme_inference` below (you can find this in `__init__.py` of `kdp_host_api`):
return ret, inf_size_p.value, res_flag_p.value
And in the call you used in dt42.py, `_ret, ssid, res_flag = api.kdp_dme_inference(dev_idx, ...)`, the second return value you named `ssid` is actually `inf_size`. Pass it into `kdp_dme_retrieve_res` and you will get the corresponding answers:
api.kdp_dme_retrieve_res(dev_idx, 0, ssid, inf_res)
I must have misread the API at some point; now I can retrieve the inference result, thanks for the help!
However, I still have an additional question. With the above test model, the shape of the output layer is `float32[1, 3]`, which I presume should provide 4 x 3 = 12 bytes of data, but the inference size the API returns is 72. Is there anything I missed?
Due to the NPU structure, the width dimension is rounded up to 16 bytes, and there are 6 floating-point (4-byte) header values at the front of the output array. In your case, the total length is:

6 * 4 + 3 * round_up16(1) = 72
Please refer to the function `get_detection_res()` in kdp_wrapper.py for how to obtain the exact output data of your model.
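The 72-byte figure above can be reproduced with a small helper. This is just a sketch of the formula in this thread; `round_up_16` and `dme_raw_output_size` are illustrative names, not host_lib APIs:

```python
def round_up_16(n: int) -> int:
    # NPU output widths are padded up to the next multiple of 16 bytes
    return (n + 15) // 16 * 16

def dme_raw_output_size(num_values: int, width_bytes: int,
                        header_floats: int = 6) -> int:
    # header_floats 4-byte values come first, then each output value's
    # width dimension is padded to a 16-byte boundary
    return header_floats * 4 + num_values * round_up_16(width_bytes)

print(dme_raw_output_size(3, 1))  # float32[1, 3] output -> 72 bytes
```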
Thanks for the help, we can now retrieve the data and do some post-processing. However, the result is far from our original model's, and our data scientist is wondering whether the image conversion (such as `BGR2BGR565`) is necessary for the KL520 dongle.

If it is, should we apply the conversion to our training data set first so we can get more consistent results? Thank you!
If you're worried about data loss from the format conversion, you can try the format NPU_FORMAT_RGBA8888.
And to reduce quantization loss when the model is converted to a fixed-point structure, make sure there are over 100 images in the folder specified in input_params.json (the parameter "input_image_folder") and that all the images inside are related to the model's training data.
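To see concretely what the RGB565 path discards relative to 8-bit channels, here is the standard 888-to-565 packing. This is an illustrative sketch only, not the conversion host_lib performs internally:

```python
def rgb888_to_rgb565(r: int, g: int, b: int) -> int:
    # keep the top 5 bits of R and B, top 6 bits of G: 16 bits total
    return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

def rgb565_to_rgb888(p: int) -> tuple:
    # expand back; the low bits lost during packing cannot be recovered
    r = (p >> 11) & 0x1F
    g = (p >> 5) & 0x3F
    b = p & 0x1F
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))

print(hex(rgb888_to_rgb565(255, 255, 255)))  # -> 0xffff
print(rgb565_to_rgb888(rgb888_to_rgb565(200, 100, 50)))  # not exactly (200, 100, 50)
```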
I put training data in when optimizing the model, changed from `NPU_FORMAT_RGB565` to `NPU_FORMAT_RGBA8888`, and converted the color space accordingly when loading the image (`img = cv2.cvtColor(img, cv2.COLOR_BGR2RGBA)`); the results changed but are still not accurate.

Any suggestions for debugging this issue? Thank you!
Here is the format of RGBA8888; please check whether your data format is correct.
And may I ask what "img_preprocess_method" you set in the toolchain's input_params.json?
For img_preprocess_method, I use the default values from the toolchain manual:

And I'm not sure about the data format part. The input images are ordinary JPEGs:
After loading them into the program with OpenCV (`imread()`) and converting to the RGBA8888 color space with `cvtColor()`, is the format the main concern here?

And our data scientist's concern is: if the hardware is optimized for RGB565, should we train our model with images in the RGB565 color space?
The hardware uses RGBA8888 for 4 channels, but if you use the RGB565 input format, you will get better accuracy with a model trained on RGB565 images.
And for better accuracy, there are two ways you can try.
And may I ask whether the result you got is just not accurate enough, or totally incorrect?
I've changed the format constant to `0x0d`, and the results changed but are still far from accurate; the accuracy is less than 20%. I've also tested with and without the `CHANGE_ASPECT_RATIO` flag, and I think that's not the problem because the size of our images is the same as the model's input (128 x 128).

By the way, based on the document and example C code, I presume the final prediction scores are not normalized to [0, 1]; is that correct? I also get negative scores; is that expected?
The score range of the final prediction should be the same as your original model's.

And there is usually a "softmax" layer at the end of MobileNetV1, but the KL520 doesn't support "softmax", so I believe "softmax" was removed while you were using the toolchain. When you get output data from the KL520, please make sure to apply "softmax" (or any other layer you cut) yourself to get the correct final prediction.
I used `/workspace/libs/ONNX_Convertor/optimizer_scripts/onnx2onnx.py` to optimize the model, and in my case the optimizer script already cut out `Softmax` after the `Gemm` op (and added an `Add` no-op as the output):

Yes, that's what I'm getting at. If you want the same model result, please apply the softmax layer to the `inf_res` you get from `kdp_dme_retrieve_res()`; the prediction scale will be normalized after softmax is applied.
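The host-side softmax described above can be sketched in pure Python (the raw scores below are made-up example values, not output from the model in this thread):

```python
import math

def softmax(scores):
    # subtract the max before exponentiating for numerical stability
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# e.g. three raw class scores parsed out of inf_res (illustrative values)
raw = [2.1, -0.3, 0.7]
probs = softmax(raw)
print(probs)  # probabilities now sum to 1.0, negative scores become valid probabilities
```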