[YOLOv5] Teach you to use LabVIEW ONNX Runtime with TensorRT acceleration to realize YOLOv5 real-time object recognition (source code included)



Foreword

The last blog introduced the LabVIEW Open Neural Network Interaction Toolkit (ONNX), including how to download and install it. Today we will take a look at how to use the LabVIEW Open Neural Network Interaction Toolkit with TensorRT to accelerate YOLOv5.

The following is a summary of the related YOLOv5 notes; I hope it will be helpful to everyone.

Related articles:
【YOLOv5】LabVIEW+OpenVINO lets your YOLOv5 fly on the CPU: /virobotics/article/details/124951862
【YOLOv5】LabVIEW OpenCV dnn quickly realizes real-time object recognition (Object Detection): /virobotics/article/details/124929483

1. Introduction to TensorRT

TensorRT is a high-performance deep learning inference optimizer that provides low-latency, high-throughput inference deployment for deep learning applications. It can be used to accelerate inference in hyperscale data centers, on embedded platforms, or on autonomous driving platforms. TensorRT now supports almost all mainstream deep learning frameworks, such as TensorFlow, Caffe, MXNet, and PyTorch, and combined with NVIDIA GPUs it enables fast and efficient inference deployment for models from almost any framework. In short, TensorRT is mainly used for high-performance inference acceleration on NVIDIA GPUs.

When we do a project and want to speed up inference for deployment, there are only a few options. If the device is a CPU, we can use OpenVINO; if we want to use a GPU, we can try TensorRT. Why choose TensorRT? Because our compute devices are mainly from NVIDIA, and TensorRT is NVIDIA's own product, it is naturally the best choice on NVIDIA hardware.

However, TensorRT has a fairly high entry threshold, which discourages many would-be users. Part of the reason is that the official documentation is scattered; another part is that TensorRT is relatively low-level and requires some knowledge of C++ and the underlying hardware, which makes it harder to learn. With the GPU version of our Open Neural Network Interaction Toolkit, ONNX Runtime can use CUDA as the backend for GPU inference, and if you need more speed you can switch the backend to TensorRT. Although there is still a gap compared with pure TensorRT inference, it is already very fast. This greatly reduces development difficulty while enabling faster and better inference.
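To give a sense of what the toolkit does under the hood, here is a minimal Python sketch of selecting ONNX Runtime execution providers; the LabVIEW toolkit wraps equivalent calls, and the model path yolov5s.onnx is just a placeholder for the model exported later in this article.

import onnxruntime as ort

# Minimal sketch: create an ONNX Runtime session that prefers TensorRT,
# then falls back to CUDA, then CPU if a provider is unavailable.
providers = [
    'TensorrtExecutionProvider',   # requires a GPU build of onnxruntime with TensorRT support
    'CUDAExecutionProvider',       # plain CUDA acceleration
    'CPUExecutionProvider',        # always-available fallback
]
session = ort.InferenceSession('yolov5s.onnx', providers=providers)
print('Active providers:', session.get_providers())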

2. Preparations

Follow the LabVIEW Open Neural Network Interaction Toolkit (ONNX) download and ultra-detailed installation tutorial (/virobotics/article/details/124998746) to install the required software. Because this blog mainly introduces how to use TensorRT to accelerate YOLOv5, it is recommended that you install the GPU version of the onnx toolkit; otherwise TensorRT acceleration will not be possible.

3. Acquisition of YOLOv5 model

For convenience, the blogger has already converted the yolov5 model to onnx format, which can be downloaded from the Baidu network disk.
Link: https://pan.baidu.com/s/15dwoBM4W-5_nlRj4G9EhRg?pwd=yiku
Extraction code: yiku

1. Download the source code

Clone or download the open-source Ultralytics YOLOv5 code to your local machine; you can simply click Download ZIP to download it.

Download address: https://github.com/ultralytics/yolov5

2. Install the required modules

Unzip the zip file you just downloaded, and then install the modules required by yolov5. Remember that the working directory of cmd should be the yolov5 folder: open cmd, switch the path to the yolov5 folder, and enter the following command to install the required modules:

pip install -r requirements.txt

3. Download the pretrained model

Open cmd, enter the python environment, and use the following command to download the pre-trained model:

import torch

# Model
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')  # or yolov5n - yolov5x6, custom

After successful download, it is shown as below:
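If you want to double-check that the pretrained model works before exporting it, the following is a minimal sketch; it assumes an internet connection and uses the sample image URL from the Ultralytics README.

import torch

# Minimal sanity check of the downloaded pretrained model.
model = torch.hub.load('ultralytics/yolov5', 'yolov5s')
results = model('https://ultralytics.com/images/zidane.jpg')  # run inference on a sample image
results.print()  # prints detected classes and inference time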

4. Convert to onnx model

The official code of yolov3 and yolov4 (before yolov5) was based on the darknet framework, so OpenCV's dnn module could do object detection by reading the .cfg and .weights files directly, which was very convenient. The official yolov5 code, however, is based on the PyTorch framework, so you need to convert the PyTorch training model (.pt file) to a .onnx file before it can be loaded into OpenCV's dnn module.

To convert .pt files to .onnx files, we mainly refer to nihate's blog: /nihate/article/details/112731327

Modify export.py as follows: comment out the second try block in def export_onnx(), i.e. the section shown below:

'''
    try:
        check_requirements(('onnx',))
        import onnx

        LOGGER.info(f'\n{prefix} starting export with onnx {onnx.__version__}...')
        f = file.with_suffix('.onnx')
        print(f)

        torch.onnx.export(
            model,
            im,
            f,
            verbose=False,
            opset_version=opset,
            training=torch.onnx.TrainingMode.TRAINING if train else torch.onnx.TrainingMode.EVAL,
            do_constant_folding=not train,
            input_names=['images'],
            output_names=['output'],
            dynamic_axes={
                'images': {
                    0: 'batch',
                    2: 'height',
                    3: 'width'},  # shape(1,3,640,640)
                'output': {
                    0: 'batch',
                    1: 'anchors'}  # shape(1,25200,85)
            } if dynamic else None)

        # Checks
        model_onnx = onnx.load(f)  # load onnx model
        onnx.checker.check_model(model_onnx)  # check onnx model

        # Metadata
        d = {'stride': int(max(model.stride)), 'names': model.names}
        for k, v in d.items():
            meta = model_onnx.metadata_props.add()
            meta.key, meta.value = k, str(v)
        onnx.save(model_onnx, f)'''

And add a function def my_export_onnx():

def my_export_onnx(model, im, file, opset, train, dynamic, simplify, prefix=colorstr('ONNX:')):
    # Note: this function is added inside yolov5's export.py, so os, torch and
    # colorstr are already imported at the top of that file.
    print('anchors:', model.yaml['anchors'])
    # Write the class names to class.names so they can be read later as label files
    wtxt = open('class.names', 'w')
    for name in model.names:
        wtxt.write(name+'\n')
    wtxt.close()
    # YOLOv5 ONNX export
    print(im.shape)
    if not dynamic:
        # Fixed input shape: export to <model>.onnx
        f = os.path.splitext(file)[0] + '.onnx'
        torch.onnx.export(model, im, f, verbose=False, opset_version=12, input_names=['images'], output_names=['output'])
    else:
        # Dynamic batch/height/width: export to <model>_dynamic.onnx
        f = os.path.splitext(file)[0] + '_dynamic.onnx'
        torch.onnx.export(model, im, f, verbose=False, opset_version=12, input_names=['images'],
                          output_names=['output'], dynamic_axes={'images': {0: 'batch', 2: 'height', 3: 'width'},  # shape(1,3,640,640)
                                        'output': {0: 'batch', 1: 'anchors'}  # shape(1,25200,85)
                                        })
    return f

In cmd, enter the following command to convert to onnx (remember that export.py and the .pt model must be in the same path):

python export.py --weights yolov5s.pt --include onnx

The following figure shows the interface after a successful conversion.
Here yolov5s can be replaced with yolov5m, yolov5l, or yolov5x.
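To confirm that the exported file is valid before loading it in LabVIEW, a minimal check, assuming the onnx and onnxruntime Python packages are installed, might look like this:

import onnx
import onnxruntime as ort

# Minimal sketch: verify the exported model and inspect its input/output names and shapes.
model_onnx = onnx.load('yolov5s.onnx')
onnx.checker.check_model(model_onnx)  # raises an exception if the graph is malformed

session = ort.InferenceSession('yolov5s.onnx', providers=['CPUExecutionProvider'])
print('input :', session.get_inputs()[0].name, session.get_inputs()[0].shape)
print('output:', session.get_outputs()[0].name, session.get_outputs()[0].shape)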

4. LabVIEW uses TensorRT to accelerate YOLOv5 for real-time object recognition (yolov5_new_onnx.vi)

1. LabVIEW calls YOLOv5 source code

2. Recognition results

Select the acceleration method as: TensorRT

Using TensorRT acceleration, the real-time detection inference time is 20~30 ms/frame, which is about 30% faster than using CUDA acceleration alone, without any loss of accuracy. The blogger's graphics card is a GTX 1060; if you use a 30-series card, the speed should be even faster.
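If you want to reproduce a rough per-frame timing outside LabVIEW, the following Python sketch measures average inference time on dummy input; it assumes a GPU build of onnxruntime and the exported yolov5s.onnx, and the numbers will of course vary with hardware.

import time
import numpy as np
import onnxruntime as ort

# Minimal timing sketch: average inference time per frame on dummy input.
session = ort.InferenceSession(
    'yolov5s.onnx',
    providers=['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'])
dummy = np.random.rand(1, 3, 640, 640).astype(np.float32)  # NCHW, matches the export shape
input_name = session.get_inputs()[0].name

for _ in range(5):  # warm-up runs (TensorRT builds its engine here)
    session.run(None, {input_name: dummy})

runs = 100
start = time.time()
for _ in range(runs):
    session.run(None, {input_name: dummy})
print('average inference time: %.1f ms/frame' % ((time.time() - start) / runs * 1000))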

5. Inference time comparison: loading YOLOv5 with opencv dnn vs. the onnx toolkit on a pure CPU

1. The inference speed of YOLOv5 with opencv dnn on the CPU is about 300 ms/frame

2. The inference speed of YOLOv5 with the onnx toolkit on the CPU is about 200 ms/frame

By comparison, we find that on the same CPU the inference speed of the onnx toolkit is about 30% faster than that of opencv dnn.
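For reference, here is a minimal sketch of the opencv dnn CPU path used in this comparison; it assumes an OpenCV build (4.5 or later) with the dnn module and the yolov5s.onnx exported above, and the image file name is just an example.

import cv2

# Minimal sketch: load the exported ONNX model with OpenCV's dnn module on the CPU.
net = cv2.dnn.readNetFromONNX('yolov5s.onnx')
img = cv2.imread('bus.jpg')  # any test image
blob = cv2.dnn.blobFromImage(img, 1 / 255.0, (640, 640), swapRB=True, crop=False)
net.setInput(blob)
outputs = net.forward()  # raw predictions, shape (1, 25200, 85) for the COCO model
print(outputs.shape)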

6. Source code and model download

Link: https://pan.baidu.com/s/1cBVt8niF2fNT4j40JfiA-w?pwd=yiku
Extraction code: yiku

Additional Notes: Computer Environment

OS: Windows10
python: 3.6 and above
LabVIEW: 2018 and above 64-bit version
Vision Toolkit: virobotics_lib_onnx_cuda_tensorrt-1.0.0.11.vip

Summary

That's all I want to share with you today. You can download the relevant source code and models from the links above.

If you have any questions, you can discuss them in the comment area. Before asking questions, please like and support the blogger. If you want to discuss more about LabVIEW and artificial intelligence technology, please join our technical exchange group: 705637299.

If this article is helpful to you, please follow, like, and bookmark it.
