Using LabVIEW and OpenCV DNN to Implement Image Classification (with Source Code)


Foreword

The previous article showed how to use LabVIEW with OpenCV DNN to implement handwritten digit recognition. Today, let's look at how to use LabVIEW and OpenCV DNN to implement image classification.

I. What is image classification?

1. The concept of image classification

Image classification, at its core, is the task of assigning a label to an image from a fixed set of categories. In practice, this means our task is to analyze an input image and return the label that classifies it. The label always comes from a predefined set of possible categories.
Example: assume the set of possible categories is categories = {dog, cat, eagle}, and we provide an image to the classification system. The goal is to assign one category from the set to the input image, in this case eagle. The classification system can also assign multiple labels to an image with associated probabilities, such as eagle: 95%, cat: 4%, dog: 1%.
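
As a toy sketch of that idea (the class set and scores here are made up, not real model outputs):

import numpy as np

# hypothetical class set and classifier scores (probabilities summing to 1)
categories = ["dog", "cat", "eagle"]
scores = np.array([0.01, 0.04, 0.95])

# the predicted label is the class with the highest probability
print(categories[int(np.argmax(scores))])  # -> eagle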

2. Introduction to MobileNet

MobileNet: the basic unit is the depthwise separable convolution, a structure that had in fact already been used in the Inception models. A depthwise separable convolution is a factorized convolution that decomposes a standard convolution into two smaller operations: a depthwise convolution and a pointwise convolution, as shown in Figure 1. Depthwise convolution differs from standard convolution: a standard convolution applies each kernel across all input channels, whereas depthwise convolution uses a separate kernel for each input channel, i.e. one kernel corresponds to one input channel, making it a per-channel (depth-level) operation. Pointwise convolution is just an ordinary convolution with a 1×1 kernel. Both operations are shown more clearly in Figure 2. A depthwise separable convolution first convolves each input channel separately with the depthwise convolution, then uses the pointwise convolution to combine the outputs. The overall effect is similar to a standard convolution, but the amount of computation and the number of model parameters are greatly reduced.
The network structure of MobileNet is shown in the table. It starts with a standard 3×3 convolution, followed by a stack of depthwise separable convolutions; note that some of the depthwise convolutions downsample via stride 2. Average pooling then reduces the feature map to 1×1, a fully connected layer sized to the number of predicted categories is added, and finally a softmax layer. If the depthwise and pointwise convolutions are counted as separate layers, the whole network has 28 layers (average pooling and softmax are not counted here).
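
To make the structure concrete, here is a minimal Keras sketch of one depthwise separable block (depthwise 3×3 followed by pointwise 1×1, each with batch normalization and ReLU6), assuming TensorFlow 2.x; it is an illustrative sketch rather than the exact MobileNet source:

import tensorflow as tf

def depthwise_separable_block(x, pointwise_filters, strides=1):
    # depthwise 3x3: one filter per input channel
    x = tf.keras.layers.DepthwiseConv2D(3, strides=strides, padding="same", use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    x = tf.keras.layers.ReLU(max_value=6.0)(x)  # MobileNet uses ReLU6
    # pointwise 1x1: an ordinary convolution that mixes the channels
    x = tf.keras.layers.Conv2D(pointwise_filters, 1, padding="same", use_bias=False)(x)
    x = tf.keras.layers.BatchNormalization()(x)
    return tf.keras.layers.ReLU(max_value=6.0)(x)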

II. Implementing image classification in Python (py_to_py_ssd_mobilenet.py)

1. Get the pre-trained model

  • Use tensorflow.keras.applications to obtain the model (taking MobileNet as an example);

from tensorflow.keras.applications import MobileNet

original_tf_model = MobileNet(
    include_top=True,
    weights="imagenet"
)

  • Freeze original_tf_model into a .pb file

import os

import tensorflow as tf
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2


def get_tf_model_proto(tf_model):
    # define the directory for .pb model
    pb_model_path = "models"

    # define the name of .pb model
    pb_model_name = "mobilenet.pb"

    # create directory for further converted model
    os.makedirs(pb_model_path, exist_ok=True)

    # get model TF graph
    tf_model_graph = tf.function(lambda x: tf_model(x))

    # get concrete function
    tf_model_graph = tf_model_graph.get_concrete_function(
        tf.TensorSpec(tf_model.inputs[0].shape, tf_model.inputs[0].dtype))

    # obtain frozen concrete function
    frozen_tf_func = convert_variables_to_constants_v2(tf_model_graph)
    # get frozen graph
    frozen_tf_func.graph.as_graph_def()

    # save full tf model
    tf.io.write_graph(graph_or_graph_def=frozen_tf_func.graph,
                      logdir=pb_model_path,
                      name=pb_model_name,
                      as_text=False)

    return os.path.join(pb_model_path, pb_model_name)
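
Once frozen, the .pb graph can be read back with OpenCV's DNN module, exactly as main() does in the full listing below:

import cv2

opencv_net = cv2.dnn.readNetFromTensorflow("models/mobilenet.pb")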

2. Inference using opencv_dnn

  • Image preprocessing (blob)

import cv2
import numpy as np


def get_preprocessed_img(img_path):
    # read the image
    input_img = cv2.imread(img_path, cv2.IMREAD_COLOR)
    input_img = input_img.astype(np.float32)

    # define preprocess parameters
    mean = np.array([1.0, 1.0, 1.0]) * 127.5
    scale = 1 / 127.5

    # prepare input blob to fit the model input:
    # 1. subtract mean
    # 2. scale pixel values to the range [-1, 1]
    input_blob = cv2.dnn.blobFromImage(
        image=input_img,
        scalefactor=scale,
        size=(224, 224),  # img target size
        mean=mean,
        swapRB=True,  # BGR -> RGB
        crop=True  # center crop
    )
    print("Input blob shape: {}\n".format(input_blob.shape))

    return input_blob
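
The resulting blob is in NCHW layout. For example (assuming an image file example.jpg exists):

input_blob = get_preprocessed_img("example.jpg")
# prints: Input blob shape: (1, 3, 224, 224)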

  • Run inference with the original TensorFlow model for comparison (the OpenCV DNN version that invokes the pb model follows right after)

def get_tf_dnn_prediction(original_net, preproc_img, imagenet_labels):
    # inference
    preproc_img = preproc_img.transpose(0, 2, 3, 1)  # NCHW -> NHWC, since Keras expects channels-last
    print("TF input blob shape: {}\n".format(preproc_img.shape))

    out = original_net(preproc_img)

    print("\nTensorFlow model prediction: \n")
    print("* shape: ", out.shape)

    # get the predicted class ID
    imagenet_class_id = np.argmax(out)
    print("* class ID: {}, label: {}".format(imagenet_class_id, imagenet_labels[imagenet_class_id]))

    # get confidence
    confidence = out[0][imagenet_class_id]
    print("* confidence: {:.4f}".format(confidence))

3. Implement image classification (full code)

import os

import cv2
import numpy as np
import tensorflow as tf
from tensorflow.keras.applications import MobileNet
from tensorflow.python.framework.convert_to_constants import convert_variables_to_constants_v2


def get_tf_model_proto(tf_model):
    # define the directory for .pb model
    pb_model_path = "models"

    # define the name of .pb model
    pb_model_name = "mobilenet.pb"

    # create directory for further converted model
    os.makedirs(pb_model_path, exist_ok=True)

    # get model TF graph
    tf_model_graph = tf.function(lambda x: tf_model(x))

    # get concrete function
    tf_model_graph = tf_model_graph.get_concrete_function(
        tf.TensorSpec(tf_model.inputs[0].shape, tf_model.inputs[0].dtype))

    # obtain frozen concrete function
    frozen_tf_func = convert_variables_to_constants_v2(tf_model_graph)
    # get frozen graph
    frozen_tf_func.graph.as_graph_def()

    # save full tf model
    tf.io.write_graph(graph_or_graph_def=frozen_tf_func.graph,
                      logdir=pb_model_path,
                      name=pb_model_name,
                      as_text=False)

    return os.path.join(pb_model_path, pb_model_name)


def get_preprocessed_img(img_path):
    # read the image
    input_img = cv2.imread(img_path, cv2.IMREAD_COLOR)
    input_img = input_img.astype(np.float32)

    # define preprocess parameters
    mean = np.array([1.0, 1.0, 1.0]) * 127.5
    scale = 1 / 127.5

    # prepare input blob to fit the model input:
    # 1. subtract mean
    # 2. scale pixel values to the range [-1, 1]
    input_blob = cv2.dnn.blobFromImage(
        image=input_img,
        scalefactor=scale,
        size=(224, 224),  # img target size
        mean=mean,
        swapRB=True,  # BGR -> RGB
        crop=True  # center crop
    )
    print("Input blob shape: {}\n".format(input_blob.shape))

    return input_blob


def get_imagenet_labels(labels_path):
    with open(labels_path) as f:
        imagenet_labels = [line.strip() for line in f.readlines()]
    return imagenet_labels


def get_opencv_dnn_prediction(opencv_net, preproc_img, imagenet_labels):
    # set OpenCV DNN input
    opencv_net.setInput(preproc_img)

    # OpenCV DNN inference
    out = opencv_net.forward()
    print("OpenCV DNN prediction: \n")
    print("* shape: ", out.shape)

    # get the predicted class ID
    imagenet_class_id = np.argmax(out)

    # get confidence
    confidence = out[0][imagenet_class_id]
    print("* class ID: {}, label: {}".format(imagenet_class_id, imagenet_labels[imagenet_class_id]))
    print("* confidence: {:.4f}\n".format(confidence))


def get_tf_dnn_prediction(original_net, preproc_img, imagenet_labels):
    # inference
    preproc_img = preproc_img.transpose(0, 2, 3, 1)  # NCHW -> NHWC, since Keras expects channels-last
    print("TF input blob shape: {}\n".format(preproc_img.shape))

    out = original_net(preproc_img)

    print("\nTensorFlow model prediction: \n")
    print("* shape: ", out.shape)

    # get the predicted class ID
    imagenet_class_id = np.argmax(out)
    print("* class ID: {}, label: {}".format(imagenet_class_id, imagenet_labels[imagenet_class_id]))

    # get confidence
    confidence = out[0][imagenet_class_id]
    print("* confidence: {:.4f}".format(confidence))


def main():

    # initialize TF MobileNet model
    original_tf_model = MobileNet(
        include_top=True,
        weights="imagenet"
    )

    # get TF frozen graph path
    full_pb_path = get_tf_model_proto(original_tf_model)
    print(full_pb_path)

    # read frozen graph with OpenCV API
    opencv_net = cv2.dnn.readNetFromTensorflow(full_pb_path)
    print("OpenCV model was successfully read. Model layers: \n", opencv_net.getLayerNames())

    # get preprocessed image
    input_img = get_preprocessed_img("yaopin.png")

    # get ImageNet labels
    imagenet_labels = get_imagenet_labels("classification_classes.txt")

    # obtain OpenCV DNN predictions
    get_opencv_dnn_prediction(opencv_net, input_img, imagenet_labels)

    # obtain TF model predictions
    get_tf_dnn_prediction(original_tf_model, input_img, imagenet_labels)


if __name__ == "__main__":
    main()

III. Image classification using LabVIEW DNN (callpb_photo.vi)

The example in this post is based on LabVIEW 2018 and calls the MobileNet pb model.

1. Read the image to be classified and the pb model

2. Preprocess the image to be classified

3. Feed the image into the neural network and run inference

4. Implement image classification

5. The overall program source code:

Wire up the block diagram shown in the figure below to implement image classification. In this example, the category with the highest confidence is taken as the classification result.

The following figure shows the classification result for a picture of a medicine bottle; you can see the image and its label on the front panel:

IV. Source code download

Link: https://pan.baidu.com/s/10yO72ewfGjxAg_f07wjx0A?pwd=8888
Extraction code: 8888

Summary

For more information on LabVIEW and artificial intelligence technology, you can join the technical exchange group for further discussion. QQ group number: 705637299
