Radix/Scaling Clarification for Float-to-INT8 Tensor Transfer Between Models

We are facing an issue related to radix/scaling between two models in our pipeline.

Currently:

  • The output tensor from the first model is in float format.
  • The next classification model expects INT8 quantized input.

Because of this, we need to apply a proper scale factor/radix conversion before passing the tensor to the next model.
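To make the question concrete, this is roughly the conversion we believe is needed, assuming the radix works as a power-of-two scale (i.e. `float ≈ int8 / 2^radix`). The `RADIX` value here is a placeholder, not taken from any actual model JSON:

```python
import numpy as np

RADIX = 7  # hypothetical example value; we would read this from the model JSON

def float_to_int8(x: np.ndarray, radix: int) -> np.ndarray:
    """Quantize a float tensor to INT8, assuming a power-of-two step of 1 / 2^radix."""
    q = np.round(x * (2.0 ** radix))              # scale up by 2^radix
    return np.clip(q, -128, 127).astype(np.int8)  # saturate to the INT8 range

def int8_to_float(q: np.ndarray, radix: int) -> np.ndarray:
    """Inverse mapping: recover the approximate float value."""
    return q.astype(np.float32) / (2.0 ** radix)
```

If this interpretation is wrong (for example, if rounding or saturation is handled differently on the hardware), please correct us.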

Our questions are:

  1. On what basis should we determine the scale factor/radix value for converting float output to INT8 input?
  2. Is there any standard or automated process in the Kneron pipeline for handling this conversion between models?
  3. How should the radix values mentioned in the model JSON files actually be interpreted and applied during tensor transfer?

We discussed this internally, and it seems that in some cases the pipeline works even without explicit quantization, while in other cases radix scaling is required. We currently do not understand what distinguishes these cases.

We also checked the input/output radix values in the model JSON files, but applying those radix values directly does not produce correct results for our use case.
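Part of our uncertainty in question 1 is whether a separate per-tensor scale factor combines with the radix, so that the effective step is `1 / (scale * 2^radix)` rather than `1 / 2^radix` alone. The sketch below shows this alternative interpretation; both `SCALE` and `RADIX` are hypothetical placeholders, not values from any real JSON:

```python
import numpy as np

SCALE = 1.0  # hypothetical per-tensor scale factor (if one exists)
RADIX = 7    # hypothetical radix from the model JSON

def quantize_with_scale(x: np.ndarray, scale: float, radix: int) -> np.ndarray:
    """Quantize assuming the combined factor is scale * 2^radix (our guess)."""
    q = np.round(x * scale * (2.0 ** radix))
    return np.clip(q, -128, 127).astype(np.int8)
```

Knowing whether such a scale factor exists, and where to read it from, would likely resolve why the radix values alone are not working for us.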

Our pipeline is:

Pose Estimation Model → Classification Model

Any clarification, recommended workflow, or examples regarding radix/scaling handling between chained models would be very helpful.

Thanks.
