[python][hostlib v0.4] Inference got no result with retrained model (MobileNetV1)

We have a retrained MobileNetV1 model. After converting it with the toolchain to fw_info.bin and all_models.bin, I followed the Python example and kdp_wrapper.py in hostlib v0.4 and successfully initialized DME mode and loaded the converted model.

However, although kdp_dme_inference() returns 0, res_flag stays False and inf_size is 0.

Here is how I initialize the device and load the model (it basically follows kdp_dme_load_*_models in kdp_wrapper.py):

import ctypes
import time
from os.path import join as pjoin

import kdp_host_api as api      # host_lib Python bindings
from common import constants

# FW_SIZE / MODEL_SIZE are the buffer-size constants used by the host_lib examples

def load_model(dev_idx, model_dir):
    model_id = 0
    ret_size = 0
    data = (ctypes.c_char * FW_SIZE)()      # buffer for fw_info.bin
    p_buf = (ctypes.c_char * MODEL_SIZE)()  # buffer for all_models.bin

    print("loading models to Kneron Device: ")
    n_len = api.read_file_to_buf(data, pjoin(model_dir, 'fw_info.bin'), FW_SIZE)
    if n_len <= 0:
        print("reading fw setup file failed: {}...\n".format(n_len))
        return -1

    dat_size = n_len  # size of the fw_info data
    print(dat_size)

    n_len = api.read_file_to_buf(p_buf, pjoin(model_dir, 'all_models.bin'), MODEL_SIZE)
    if n_len <= 0:
        print("reading model file failed: {}...\n".format(n_len))
        return -1

    buf_len = n_len     # size of the model data
    model_size = n_len
    print(buf_len, model_size)

    print("starting DME mode ...\n")
    ret, ret_size = api.kdp_start_dme(
        dev_idx, model_size, data, dat_size, ret_size, p_buf, buf_len)
    if ret:
        print("could not set to DME mode:{}..\n".format(ret_size))
        return -1
    time.sleep(0.1)

    # dme configuration
    model_id = 1  # model id when compiling in toolchain
    output_num = 1     # number of output node for the model
    image_col = 128
    image_row = 128
    image_ch = 3
    image_format = (constants.IMAGE_FORMAT_SUB128 |
                    constants.NPU_FORMAT_RGB565 |
                    constants.IMAGE_FORMAT_RAW_OUTPUT |
                    constants.IMAGE_FORMAT_CHANGE_ASPECT_RATIO)

    dme_cfg = constants.KDPDMEConfig(model_id, output_num, image_col,
                                     image_row, image_ch, image_format)

    dat_size = ctypes.sizeof(dme_cfg)
    print("starting DME configure ...\n")
    ret, model_id = api.kdp_dme_configure(
        dev_idx, ctypes.cast(ctypes.byref(dme_cfg), ctypes.c_char_p), dat_size, model_id)
    if ret:
        print("could not set to DME configure mode..\n")
        return -1
    time.sleep(0.1)
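
For reference, the device is opened beforehand; a minimal sketch following the host_lib Python examples (the scan index and the model directory below are placeholders, not from my actual script):

    api.kdp_lib_init()                       # init the host library
    dev_idx = api.kdp_connect_usb_device(1)  # scan index of the USB dongle (assumed)
    api.kdp_lib_start()                      # start the host library
    load_model(dev_idx, 'path/to/batch_compile')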

Then inference:

    inf_size = 0
    inf_res = (ctypes.c_char * 256000)()  # buffer for the raw inference output
    res_flag = False
    mode = 0      # serial mode
    model_id = 1
    status = 0
    # img_buf / buf_len hold the preprocessed input image, loaded elsewhere
    _ret, ssid, res_flag = api.kdp_dme_inference(
        dev_idx, img_buf, buf_len, inf_size, res_flag, inf_res, mode, model_id)

The values of _ret, inf_size, status, and res_flag are all zero or False.

Comments

  • The mode = 0 for dme_inference means serial mode. In serial mode, if the return value of the inference, _ret, is 0, just use the API "kdp_dme_retrieve_res" with the pointer "inf_res" to get the inference result.

    e.g.

    api.kdp_dme_retrieve_res(dev_idx, 0, inf_size, inf_res)

    Make sure the model_id you set ("1") matches the parameter you set in the toolchain. You can find the parameter "id" in the file "batch_input_params.json" inside the toolchain.


    And we recommend using the latest host_lib, which includes the latest fw binaries, instead of v0.4.

  • Thanks for the reply! I've migrated to v0.8 (because v0.9 doesn't have Python bindings), and I still get no result. By the way, the host_lib download link in KL520 - Document Center (kneron.com) is still v0.4.

    The only change I made was to add the additional parameters to KDPDMEConfig(), and I did use the correct model_id (1). After calling api.kdp_dme_retrieve_res(), I printed inf_res directly, and the values are all zero.

    This is my fw_info.txt:

    Total [1] models:
    [char_mobilenet_1]
        id: [1], version: [0x1]
        size: input [0x10000], output [0x220], buf [0x28000], cmd [0x221c], weight [0x4484b0], fw_code [0xa0]
        addr: input [0x60000000], output [0x60010000], buf [0x60010220], cmd [0x60038220], weight [0x6003a440], fw_code [0x604828f0]
    dram_addr_end [0x60482990], total bin size: [0x44a770]
    checksum: all_models.bin [0x20301310]
            0 [0x20301310],
    
  • Here are a few operations we recommend you try:

    1. Update the fw binaries in \host_lib__v0.8\app_binaries\tiny_yolo_v3: the fw and host_lib should match to ensure the function works, and we recommend using the default fw "tiny_yolo_v3". Please refer to the command "update_fw" or "update_app".
    2. Make sure the model dimensions are not larger than the input image parameters image_col = 128 and image_row = 128; the pre-processing function for the input image doesn't support upscaling.
    3. Please change the model_id to 1000 and regenerate your model binaries with the toolchain; model_ids 1~999 are reserved by Kneron, and the firmware sometimes applies special processing to each reserved id.
  • Hi, I've run update_fw and regenerated the model with the correct model_id (1000), but I still get no result; the data in the inf_res buffer are all zeros.

    As far as I know, since we're using a classification model (instead of detection), it should still produce some result even if the image pre-processing is wrong, is that correct?

  • Yes, if the inference process completes, there should be some results, whether they're right or wrong.

    Would you mind providing your model binaries and Python example code so we can check the issue?

  • This archive contains the original test model and the converted firmware binary we provided in another thread (Warnings when running `fpAnalyserCompilerIpevaluator_520.py` — Kneron Developer Forums), the Python code (dt42.py) I use to test, and some test images (under numeral_recognition_test_set/).

    I normally put dt42.py under host_lib/python, then run python dt42.py <DIR-TO-FIRMWARE> <TEST-IMG>, e.g., python dt42.py ~/test_model_for_kneron/batch_compile ~/test_model_for_kneron/numeral_recognition_test_set/0.jpg

    I've tested on both native Ubuntu and a VM in Windows, and they yield the same result.

  • Due to the "inf_size" = 0, kdp_dme_retrieve_res could not get correct length to receive result data. Refer to the return value of api.kdp_dme_inference below: ( You can find it in __init__.py of kdp_host_api)

    return ret, inf_size_p.value, res_flag_p.value

    In the call you used in dt42.py, _ret, ssid, res_flag = api.kdp_dme_inference(dev_idx, ...), the returned inf_size is bound to your variable ssid. Pass it into kdp_dme_retrieve_res and you will get the corresponding result:

    api.kdp_dme_retrieve_res(dev_idx, 0, ssid, inf_res)
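
    In other words, the corrected serial-mode flow looks like this (a sketch; the variable names follow the snippets above):

        # the second return value IS inf_size, despite the name ssid used in dt42.py
        _ret, inf_size, res_flag = api.kdp_dme_inference(
            dev_idx, img_buf, buf_len, inf_size, res_flag, inf_res, mode, model_id)
        if _ret == 0:
            # pass the returned size so the retrieve call knows how many bytes to read
            api.kdp_dme_retrieve_res(dev_idx, 0, inf_size, inf_res)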

  • I must have misread the API at some point; now I can retrieve the inference result, thanks for the help!

    However, I still have an additional question. With the above test model, the shape of the output layer is `float32[1, 3]`, which I presume should provide 4 x 3 = 12 bytes of data, but the inference size the API returns is 72. Is there anything I missed?

  • Due to the NPU structure, the row width is rounded up to 16 bytes, and there are 6 floating-point (4-byte) header values at the front of the output array. In your case, the total length is

    6 * 4 + 3 * round_up16(1) = 24 + 3 * 16 = 72

    Please refer to the function "get_detection_res()" in kdp_wrapper.py to obtain the exact output data of your model.
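
    As a sketch of that arithmetic (round_up16 is a hypothetical helper here; get_detection_res() in kdp_wrapper.py is the authoritative parser):

        import struct

        def round_up16(n):
            # NPU rows are padded up to 16-byte alignment
            return (n + 15) // 16 * 16

        header_len = 6 * 4               # six 4-byte floating-point header values
        row_len = round_up16(1)          # width 1, padded to 16 bytes
        print(header_len + 3 * row_len)  # 3 output rows -> 72

        # assumption: each padded row starts with one float32 value; raw
        # fixed-point output may instead need scaling as in get_detection_res()
        scores = [struct.unpack_from('<f', inf_res, header_len + i * row_len)[0]
                  for i in range(3)]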

  • Thanks for the help, we can now retrieve the data and do some post-processing. However, the result is far from our original model's, and our data scientist is wondering whether the image conversion (such as BGR2BGR565) is necessary for the KL520 dongle.

    If it is, should we apply the conversion to our training data set first so we can get more consistent results? Thank you!

  • If you are worried about data loss from the format conversion, you can try the format NPU_FORMAT_RGBA8888.
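
    For example, the change on the host side would look roughly like this (a sketch based on the flags used earlier in this thread; the image path is a placeholder):

        import cv2
        from common import constants

        image_format = (constants.IMAGE_FORMAT_SUB128 |
                        constants.NPU_FORMAT_RGBA8888 |
                        constants.IMAGE_FORMAT_RAW_OUTPUT)
        # feed 4-channel RGBA pixels instead of RGB565
        img_path = 'test.jpg'  # placeholder
        img = cv2.cvtColor(cv2.imread(img_path), cv2.COLOR_BGR2RGBA)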

    And to reduce quantization loss when the model is converted to the fixed-point structure, make sure there are over 100 images in the folder specified in input_params.json (the parameter "input_image_folder"), and that all the images inside are related to the model's training.
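
    For illustration, the relevant entry in input_params.json would look like this (the folder path is a placeholder):

        "input_image_folder": "path/to/100_plus_training_images"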

  • I used training data when optimizing the model, changed from NPU_FORMAT_RGB565 to NPU_FORMAT_RGBA8888, and converted the color space accordingly when loading the image (img = cv2.cvtColor(img, cv2.COLOR_BGR2RGBA)); the results changed but are still not accurate.

    Any suggestions for debugging this issue? Thank you!

  • Here is the format of RGBA8888; please check whether your data format is correct.
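
    A sketch of the buffer the device would expect, assuming the conventional R,G,B,A per-pixel byte order (an assumption; see also the reversed-order note further down):

        import cv2

        img = cv2.imread('test.jpg')                  # placeholder path; BGR, H x W x 3
        rgba = cv2.cvtColor(img, cv2.COLOR_BGR2RGBA)  # H x W x 4, uint8
        buf = rgba.tobytes()                          # flat stream: R0 G0 B0 A0 R1 G1 B1 A1 ...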


    And may I ask what your setting of "img_preprocess_method" in input_params.json in the toolchain is?

  • For img_preprocess_method, I use the default values from the toolchain manual:

        "preprocess": {
            "img_preprocess_method": "kneron",
            "img_channel": "RGB",
            "radix": 8,
            "keep_aspect_ratio": true,
            "pad_mode": 1,
            "p_crop": {
                "crop_x": 0,
                "crop_y": 0,
                "crop_w": 0,
                "crop_h": 0
            }
        }
    

    And I'm not sure about the data format part. The input images are ordinary JPEGs:

    JPEG image data, JFIF standard 1.01, aspect ratio, density 1x1, segment length 16, baseline, precision 8, 128x128, components 3
    

    After loading them into the program with OpenCV (imread()) and converting to the RGBA8888 color space with cvtColor(), is the format the main concern here?

    And our data scientist's concern is: if the hardware is optimized for RGB565, should we train our model with images in the RGB565 color space?

  • The hardware structure uses RGBA8888 for 4 channels, but if you use the input format RGB565, you will get better accuracy with a model trained on RGB565 images.

    And for better accuracy, there are two things you can try:

    • Modify the parameter NPU_FORMAT_RGBA8888 in /common/constants.py to 0x0D instead of 0x00; sometimes the RGBA8888 format is in reverse byte order (see the sketch after this list).


    • Remove the parameter "constants.IMAGE_FORMAT_CHANGE_ASPECT_RATIO" from image_format to keep the inference image's aspect ratio.
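
    As a sketch, the first change is a one-line edit (the 0x00 and 0x0D values are the ones discussed in this thread):

        # host_lib/common/constants.py
        NPU_FORMAT_RGBA8888 = 0x0D  # was 0x00; 0x0D selects the reversed byte order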

    And may I ask whether the result you got is just not accurate enough, or totally incorrect?

  • I've changed the format constant to 0x0d; the results changed but are still far from accurate. The accuracy is less than 20%. I've also tested with and without the CHANGE_ASPECT_RATIO flag, and I think that's not the problem because the size of our images is the same as the model's input (128 x 128).

    By the way, based on the documentation and example C code, I presume the final prediction scores are not normalized to [0, 1]; is that correct? I also get negative scores; is that expected?

  • The score range of the final prediction should be the same as your original model's.

    And there is usually a "softmax" layer at the end of MobileNetV1, but the KL520 doesn't support "softmax", so I believe "softmax" was removed when you ran the toolchain. When you get output data from the KL520, please make sure to apply "softmax" (or any other layer you cut) to get the correct final prediction.
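
    For example, a minimal softmax over the retrieved scores (a sketch; it assumes the raw outputs have already been parsed into a float array):

        import numpy as np

        def softmax(x):
            # subtract the max for numerical stability
            e = np.exp(x - np.max(x))
            return e / e.sum()

        # scores: the 3 raw values parsed from inf_res (see the parsing sketch above)
        # probs = softmax(np.asarray(scores, dtype=np.float32))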

  • I used /workspace/libs/ONNX_Convertor/optimizer_scripts/onnx2onnx.py to optimize the model, and in my case the optimizer script had already cut out the Softmax after the Gemm op (and added an Add no-op as the output):

    [screenshot: the optimized ONNX graph, with Softmax removed after Gemm and an Add no-op as the output]

  • Yes, that's what I'm getting at. If you want to get the same model result, please compute the softmax layer over the inf_res from kdp_dme_retrieve_res(); the prediction scores will be normalized after applying softmax.

The discussion has been closed due to inactivity. To continue with the topic, please feel free to post a new discussion.