Failure to quantize the ONNX model

I am following the instructions in the link below:

https://doc.kneron.com/docs/#model_training/OpenMMLab/STDC/

While proceeding with Step 5, I encountered the following error:

>>> img_list = []
>>> for (dirpath, dirnames, filenames) in walk("/docker_mount/cityscapes_minitest"):
...   for f in filenames:
...     fullpath = os.path.join(dirpath, f)
...     image = Image.open(fullpath).convert("RGB")
...     img_data = np.array(image.resize((1024, 512), Image.BILINEAR)) / 256 - 0.5
...     img_data = np.transpose(img_data, (2, 0, 1))
...     img_data = np.expand_dims(img_data, axis=0)
...     img_list.append(img_data)
...
>>> bie_model_path = km.analysis({"input": img_list})
Failure for model "input/input" when running "kdp630/unimplemented feature"


Here is the Python code I entered:

import ktc
import numpy as np
import os
import onnx
from PIL import Image

# Load the exported model and run the toolchain's ONNX optimization pass.
onnx_path = '/docker_mount/latest.onnx'
m = onnx.load(onnx_path)
m = ktc.onnx_optimizer.onnx2onnx_flow(m)
onnx.save(m, 'latest.opt.onnx')

# npu (only) performance simulation
km = ktc.ModelConfig(32769, "0001", "630", onnx_model=m)
eval_result = km.evaluate()
print("\nNpu performance evaluation result:\n" + str(eval_result))

from os import walk

# Build the calibration image list for quantization.
img_list = []
for (dirpath, dirnames, filenames) in walk("/docker_mount/cityscapes_minitest"):
  for f in filenames:
    fullpath = os.path.join(dirpath, f)
    image = Image.open(fullpath).convert("RGB")
    # Resize to the model's input size and normalize to roughly [-0.5, 0.5).
    img_data = np.array(image.resize((1024, 512), Image.BILINEAR)) / 256 - 0.5
    img_data = np.transpose(img_data, (2, 0, 1))  # HWC -> CHW
    img_data = np.expand_dims(img_data, axis=0)   # add batch dimension
    img_list.append(img_data)

# Fixed-point analysis (quantization)
bie_model_path = km.analysis({"input": img_list})

print("\nFixed-point analysis done. Save bie model to '" + str(bie_model_path) + "'")

When I change the "630" in ktc.ModelConfig to "530" or "720", it works fine, but it fails with "630". Is the ONNX converter not available for KL630?
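
For reference, the comparison I ran is equivalent to the following sketch (a hypothetical loop; m and img_list are the same objects as in the script above, only the platform string changes, and this assumes the failure surfaces as a Python exception rather than only a printed log):

for platform in ("530", "630", "720"):
  try:
    km = ktc.ModelConfig(32769, "0001", platform, onnx_model=m)
    km.analysis({"input": img_list})
    print(platform + ": OK")
  except Exception as e:
    print(platform + ": failed - " + str(e))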

Comments

  • Hi,

    Could you also provide the following so we could try debugging? Thanks!

    - The original ONNX model and the optimized ONNX model
    - The toolchain version (please use the command cat /workspace/version.txt)
    - The log

  • Hello,

    As requested, please find the following items for debugging.

    - The original ONNX model (latest.onnx) and the optimized ONNX model (latest.opt.onnx)
    - The toolchain version: kneron/toolchain:v0.25.0_3
    - The log


    Please let me know if anything else is needed. Thank you!

  • Hi Nhleem,

    Thank you for providing the info and files. There was a bug in the toolchain related to KL630 that was fixed in v0.25.1. Could you update the Docker image ($ docker pull kneron/toolchain:latest) and try converting the model again? Thanks!

  • Hello,

    Thank you for your response and suggestion. However, I am still encountering the same issue. The model conversion for KL630 is still not working, while it works fine for KL730.

    I am attaching the logs for both cases:

    • The log for KL630 (failure)
    • The log for KL730 (success)

    Additionally, I have included the files that were generated after attempting to compile with KL630, despite the failure.

    Could you please take a look and advise on any potential solutions?

    Thank you for your continued support.

  • Hi,

    I've run the script, and KL630 also failed for me while KL720 worked. It seems the error is caused by the quantized GlobalAveragePool not fitting in KL630's nmem (on-chip memory); a quick way to check for such nodes is sketched below. We are still looking into a solution. Thank you for your patience!
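
    As an aside, if you want to confirm which GlobalAveragePool nodes the optimized model contains, a minimal sketch using the standard onnx package (not the Kneron toolchain API) would be:

    import onnx

    m = onnx.load('/docker_mount/latest.opt.onnx')
    # Print every GlobalAveragePool node in the graph.
    for node in m.graph.node:
      if node.op_type == "GlobalAveragePool":
        print(node.name, list(node.input), list(node.output))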

  • Hello,

    Thank you for the response. I appreciate your efforts in looking into this issue. I look forward to hearing from you once a solution has been found.

    Thank you again for your continued support.

  • Hi,

    Thank you for your patience as well. We've found the solution: in km.analysis, set the parameter datapath_bitwidth_mode to "all int8".

    An example code would be:

    bie_model_path = km.analysis(input_mapping, threads=4, datapath_bitwidth_mode="all int8")

    This quantizes GlobalAveragePool to 8-bit, which avoids the issue of not having enough nmem.
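
    Applied to the script above, the analysis step would look like this (a sketch; "input" is assumed to be the model's input node name, as in the original script, and threads=4 is taken from the example above):

    input_mapping = {"input": img_list}  # same calibration images as before
    bie_model_path = km.analysis(input_mapping, threads=4, datapath_bitwidth_mode="all int8")
    print("Saved bie model to '" + str(bie_model_path) + "'")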

  • The issue has been resolved. Thank you!

The discussion has been closed due to inactivity. To continue with the topic, please feel free to post a new discussion.