Migrating a quantization-aware trained model from TensorFlow

Hi,

We're interested in converting a TensorFlow model that has gone through quantization-aware training.

We export the model from TensorFlow and convert it to TFLite. Then, using the Kneron toolchain, we convert the TFLite model to ONNX and then to a NEF file.

Part of the Kneron toolchain involves quantization, which we're worried conflicts with our model's existing quantization. Is there a way to skip this quantization step?
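
(For reference, the quantization-aware-training and TFLite export path on the TensorFlow side looks roughly like the sketch below. It uses the TensorFlow Model Optimization Toolkit with a tiny stand-in model and random data so it runs end to end; it is not our exact pipeline.)

```python
import numpy as np
import tensorflow as tf
import tensorflow_model_optimization as tfmot

# Tiny stand-in model and data so the sketch runs end to end;
# the real model is different and comes from a TensorFlow checkpoint.
base_model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(8, 3, activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])
x = np.random.rand(64, 32, 32, 3).astype("float32")
y = np.random.randint(0, 10, size=(64,))

# Wrap the model with fake-quantization nodes and fine-tune (QAT).
qat_model = tfmot.quantization.keras.quantize_model(base_model)
qat_model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
qat_model.fit(x, y, epochs=1)

# Export the quantization-aware model to a quantized TFLite flatbuffer.
converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
with open("model.tflite", "wb") as f:
    f.write(converter.convert())
```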


Extra question:

We get Segmentation fault (core dumped) when using ktc.kneron_inference on the exported ONNX model, but not when running the same inference on the same model after it has been converted from ONNX to a NEF file.

This is important for accuracy tests during the various conversion stages.

Is there a way to debug this? Python just prints this error and then crashes.
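
(One generic way to get at least a traceback out of a hard crash like this is Python's built-in faulthandler; a minimal sketch below, where the ktc keyword names, the file path, and the input shape are assumptions based on the toolchain manual.)

```python
import faulthandler
faulthandler.enable()  # print a Python traceback if the process receives SIGSEGV

import numpy as np
import ktc

# Placeholder input: substitute the model's real input shape and name.
dummy = np.zeros((1, 300, 300, 3), dtype=np.float32)
out = ktc.kneron_inference([dummy],
                           onnx_file="/data1/model.onnx",
                           input_names=["input"])
print(out)
```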


Thanks!

Comments

  • @oded

    Hi oded,

    I'm not sure whether you optimized the ONNX with the onnx2onnx API after converting it. After the optimization is complete, you still need to perform quantization. If you are worried about accuracy, you can use our E2E simulator for accuracy testing.

    At present, the Kneron quantization step is mandatory: it is what makes the model fit our hardware architecture, so it cannot be skipped (the sketch at the end of this comment shows the rough flow).

    For your extra question, could you tell us which parameters you used when calling ktc.kneron_inference, and share a more detailed screenshot of the error message?

    If it is convenient, can you provide your model for us to test?

    Thanks!
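
    Rough sketch of the flow mentioned above (based on the toolchain manual; the argument names should be treated as assumptions here, and the model ID, version, input name, and calibration images are placeholders):

    ```python
    import numpy as np
    import onnx
    import ktc

    # 1. Optimize the converted ONNX with the onnx2onnx API.
    m = onnx.load("/data1/model.onnx")
    m = ktc.onnx_optimizer.onnx2onnx_flow(m)
    onnx.save(m, "/data1/model.opt.onnx")

    # 2. Kneron quantization (fixed-point analysis). This step is mandatory so
    #    the model matches the hardware; quant_inputs should be real preprocessed
    #    images, and the input name "input" is a placeholder.
    km = ktc.ModelConfig(32769, "0001", "720", onnx_model=m)
    quant_inputs = [np.random.rand(1, 300, 300, 3).astype("float32")
                    for _ in range(20)]
    bie_path = km.analysis({"input": quant_inputs})

    # 3. Compile the quantized model into a NEF file for the device.
    nef_path = ktc.compile([km])
    ```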

  • Hi, thanks for the reply.

    Attached is a TFLite model exported from a checkpoint downloaded from TensorFlow's model zoo. We cut the bottom layers, starting at the Reshape operators, to comply with the hardware's supported operators.

    Following the toolchain guide (http://doc.kneron.com/docs/#toolchain/manual/), we get the segmentation fault when running inference on the ONNX model and the bie model. The NEF model doesn't cause a crash.


  • @oded

    Hi Oded,

    I'd like to confirm something about your question. You said:

    "we get the segmentation error when running inference on the ONNX model and the bie model. The NEF model doesn't cause a crash."

    Do you mean that you successfully converted the TFLite model you provided to .onnx using the Kneron toolchain, then converted the ONNX to a .bie file, and converted the .bie to get the .nef?

    When running inference on the Kneron hardware architecture, we use the final generated .nef file.

  • Yes, we successfully converted TFLite -> ONNX -> bie -> NEF.

    But the toolchain manual shows running inference at multiple stages, including the ONNX and bie models, to measure accuracy at each stage. We get the segmentation fault when running inference on the ONNX/bie models, roughly as sketched below.
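
    Roughly what we run at each stage (a sketch; the keyword names follow the toolchain manual, and the file paths, input name, and input shape are placeholders):

    ```python
    import numpy as np
    import ktc

    # One preprocessed sample (placeholder shape and input name).
    sample = np.zeros((1, 300, 300, 3), dtype=np.float32)

    onnx_out = ktc.kneron_inference([sample], onnx_file="/data1/model.opt.onnx",
                                    input_names=["input"])
    bie_out = ktc.kneron_inference([sample], bie_file="/data1/model.bie",
                                   input_names=["input"])
    nef_out = ktc.kneron_inference([sample], nef_file="/data1/model.nef",
                                   input_names=["input"])

    # Compare the outputs stage by stage to see where accuracy drifts.
    print(np.abs(np.asarray(onnx_out[0]) - np.asarray(nef_out[0])).max())
    ```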

  • @oded

    Hi Oded,

    You can try putting your optimized ONNX model file under /data1/ (e.g. /data1/model.onnx) inside the Docker container; that should let ktc.kneron_inference run successfully and complete the E2E simulation.

  • After updating the Docker image to the latest version, the inference gave a more detailed error. The issue was that ktc.kneron_inference targets the 520 platform by default, while our bie model was built for the KL720. Adding platform=720 solved that issue (a rough snippet at the end of this post).

    We're still in the process of evaluating the accuracy at the various stages, given the information you provided about the mandatory Kneron quantization, which may conflict with our quantization-aware training.
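
    For anyone else hitting this, the working call for the KL720-targeted model ended up looking roughly like this (the file path, input name, and shape are placeholders):

    ```python
    import numpy as np
    import ktc

    sample = np.zeros((1, 300, 300, 3), dtype=np.float32)  # placeholder input

    # kneron_inference targets the 520 platform by default; our .bie/.nef were
    # built for the KL720, so the platform has to be passed explicitly.
    bie_out = ktc.kneron_inference([sample], bie_file="/data1/model.bie",
                                   input_names=["input"], platform=720)
    ```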

The discussion has been closed due to inactivity. To continue with the topic, please feel free to post a new discussion.