Issues when converting a custom YOLOv5s from PyTorch

Hello, I am having some issues when converting a yolov5s with custom weights from pytorch.

The official repo for yoloV5 currently requires at least pytorch version 1.6.0 since it uses hardswish activation function. In your docker toolchain documentation, regarding the conversion from pytorch to onnx, you specifically say that only pytorch<1.6 models are supported. And I am indeed unable to do the conversion when I provide your toolchain with an onnx of the model I obtained using the official repo.

In order to avoid the compatibility issues, I went back to using an outdated version of the repo (Leaky ReLU instead of hardswish), one that could be successfully run in pytorch 1.5.1.

After exporting the model into an onnx file, I was encountering the following error when using your toolchain to obtain the optimized onnx: "Your model ir_version is higher than the checker's".

From what I found, this error occurs when the model is exported with a newer onnx version than the one in your toolchain.
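
One way to confirm this diagnosis is to read the ir_version field straight from the exported file. The sketch below is only an illustration and not part of the toolchain: it walks the top-level protobuf fields using the standard library alone, assuming the file is a serialized ONNX ModelProto in which ir_version is field number 1. The helper names `read_varint` and `read_ir_version` are made up for this example.

```python
def read_varint(buf, pos):
    """Decode one protobuf varint starting at buf[pos]; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def read_ir_version(data):
    """Return the ir_version stored in serialized ModelProto bytes, or None.

    Assumption: ir_version is top-level field 1 with varint wire type.
    """
    pos = 0
    while pos < len(data):
        key, pos = read_varint(data, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:  # varint
            value, pos = read_varint(data, pos)
            if field_number == 1:
                return value
        elif wire_type == 2:  # length-delimited (strings, nested messages)
            length, pos = read_varint(data, pos)
            pos += length
        elif wire_type == 1:  # 64-bit fixed
            pos += 8
        elif wire_type == 5:  # 32-bit fixed
            pos += 4
        else:
            raise ValueError("unsupported wire type %d" % wire_type)
    return None


# Usage (the file name is a placeholder):
# with open("yolov5s.onnx", "rb") as f:
#     print(read_ir_version(f.read()))
```

If a working onnx installation is available, `onnx.load(path).ir_version` reports the same field.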

Therefore, I set onnx to 1.4.1 and tried to export yolov5s again, but as you can see below, it seems some operations are not supported in this onnx version and setting opset version 9, as required by your toolchain, seems to be causing issues too:

"ONNX export failed on slice because step!=1 is currently not supported not supported

/usr/local/lib/python3.6/dist-packages/torch/onnx/symbolic_helper.py:243: UserWarning: You are trying to export the model with onnx:Upsample for ONNX opset version 9. This operator might cause results to not match the expected results by PyTorch. ONNX's Upsample/Resize operator did not match Pytorch's Interpolation until opset 11. Attributes to determine how to transform the input were added in onnx:Resize in opset 11 to support Pytorch's behavior (like coordinate_transformation_mode and nearest_mode). We recommend using opset 11 and above for models using this operator."

Could you shed some light on this issue, please? Did you use a particular version of the repository that integrates directly with the onnx and opset versions you are using, or did you change the model so that you could use a lower opset version?

Thanks in advance!

Comments

  • Hi Ricardo,

    I'm sorry, but our toolchain currently supports only opset 9 ONNX. We're upgrading it to support opset 11, but that will come in a later version.

    For now, we have a workaround. You can add the flag --align-corner when using pytorch2onnx.py. It introduces a special mode in Upsample that is not defined in the official ONNX documentation but should match the behavior of pytorch. Hope this helps.

  • Hello Jiyuan,

    Thank you for the help. I will give it a try and check whether it solves the issues.

  • Hello Jiyuan,

    Unfortunately, it seems the issue occurs before your toolchain is involved: it happens as soon as I generate the onnx file with the torch.onnx.export function. If I use onnx (with opset version 9), I have no problem generating an onnx file, but it causes an error when fed to your toolchain: "onnx.onnx_cpp2py_export.checker.ValidationError: Your model ir_version is higher than the checker's". I think this is expected since your toolchain has onnx version 1.4.1.

    The problem is that I cannot use onnx<=1.5.0 when generating the onnx file for yolov5, since I get the error I mentioned in the original post. Looking at https://github.com/onnx/onnx/blob/master/docs/Versioning.md#released-versions, I suspect that your toolchain may require file format version 4, whereas the yolov5 repository is creating a file format version which is higher than that. Moreover, the error that is shown in the original post may result from unhandled slice operations in lower versions of onnx (< 1.6.0).

    Do you have any ideas on what is happening here? Is there some release of YoloV5 where this problem does not happen?


    Thanks in advance
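
The version mismatch suspected above can be written down as a small lookup. This is a sketch under stated assumptions: the table is my reading of the "Released Versions" section of the linked Versioning.md and should be checked against that page, and `checker_accepts` is a hypothetical helper that only mirrors the comparison behind the "ir_version is higher than the checker's" error.

```python
# onnx release -> (IR version, highest ai.onnx opset), as listed in
# Versioning.md. Assumption: transcribed by hand, verify before relying on it.
ONNX_RELEASES = {
    "1.3.0": (3, 8),
    "1.4.1": (4, 9),
    "1.5.0": (5, 10),
    "1.6.0": (6, 11),
    "1.7.0": (7, 12),
}


def checker_accepts(model_ir_version, checker_release):
    """True if a checker from `checker_release` would accept the model's ir_version."""
    checker_ir_version, _max_opset = ONNX_RELEASES[checker_release]
    return model_ir_version <= checker_ir_version


# A file written with ir_version 6 is rejected by an onnx 1.4.1 checker,
# while a file with ir_version 4 passes this particular check.
print(checker_accepts(6, "1.4.1"))
print(checker_accepts(4, "1.4.1"))
```

Passing this check says nothing about whether every operator in the graph is available at opset 9; that is a separate constraint.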

  • Hi Ricardo,

    If you are converting yolov5s, I'm sorry, but those slice operators cannot be converted correctly for now. We plan to switch to opset 11 starting with toolchain version 0.14, which should come out around May. If you are curious about how to support yolov5 in the current version, I'll describe how we do it below.

    1. Remove the Space2Depth-like node from the model, since it is what generates the Slice nodes in onnx. Its work can be done in the preprocessing instead.
    2. Export the model in opset 9 using a PyTorch version lower than 1.6.0. The `ir_version` of a model exported by PyTorch is determined by the PyTorch version, not by your local onnx version, and it cannot be set manually. PyTorch 1.6.0 and later always export with ir_version 6, which causes the error you saw; versions below 1.6.0 are fine. You will see some warnings while exporting the model. Just ignore them for now.
    3. Use pytorch_exported_onnx_preprocess.py to optimize the model. The command should look like: python pytorch_exported_onnx_preprocess.py pytorch_exported.onnx optimized.onnx. If you see warnings about the Resize/Upsample operator, add the --align-corner flag in this step.
    4. Use onnx2onnx.py to do the final optimization.
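
For step 1, the rearrangement that the removed node performed has to happen on the input before it reaches the model. The sketch below shows one way to do that in preprocessing. It assumes the node takes every second pixel at each of the four row/column offsets and stacks the results along the channel axis; the offset order used here is an assumption and must match what the trained model expects. It uses plain nested lists so it runs without dependencies, and `space_to_depth` is a hypothetical name; in practice the same strided slices would be applied to an array.

```python
def space_to_depth(image):
    """Rearrange a (C, H, W) image into (4*C, H/2, W/2).

    `image` is a list of C channels, each a list of H rows of W values,
    with H and W even.
    """
    # (row offset, column offset) pairs; this order is an assumption.
    offsets = [(0, 0), (1, 0), (0, 1), (1, 1)]
    out = []
    for row_off, col_off in offsets:
        for channel in image:
            # Keep every second row and every second column from this offset.
            out.append([row[col_off::2] for row in channel[row_off::2]])
    return out


# One 2x2 channel becomes four 1x1 channels.
print(space_to_depth([[[1, 2], [3, 4]]]))
```

With the rearrangement moved here, the exported graph starts at the first convolution and takes a 4*C-channel input at half the spatial resolution.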

    With the above steps, you will get a yolov5 model that runs in our toolchain, with some compromises. If you want full yolov5 support, please wait for the toolchain v0.14 release. Thanks a lot.
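
Step 2 can be guarded with a version check so that an export from an unsuitable PyTorch fails early. This is a minimal sketch, not the toolchain's own script: `is_exportable_torch` and `export_opset9` are hypothetical helpers, and the model and dummy input are whatever you already load from the repository. The only PyTorch call used is `torch.onnx.export` with `opset_version=9`.

```python
import re


def is_exportable_torch(version):
    """True if a PyTorch version string (e.g. "1.5.1+cu101") is below 1.6.0."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    parts = tuple(int(group or 0) for group in match.groups())
    return parts < (1, 6, 0)


def export_opset9(model, dummy_input, path):
    """Export `model` to `path` in opset 9, refusing to run on PyTorch >= 1.6.0."""
    import torch  # imported here so the version helper works without PyTorch

    if not is_exportable_torch(torch.__version__):
        raise RuntimeError(
            "PyTorch %s is too new for this export; use a version below 1.6.0"
            % torch.__version__
        )
    model.eval()
    torch.onnx.export(model, dummy_input, path, opset_version=9)


print(is_exportable_torch("1.5.1"))
print(is_exportable_torch("1.6.0"))
```

The resulting file is then the input to pytorch_exported_onnx_preprocess.py in step 3.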
