Build a computer vision project


Do you want to create an app to detect something? Cats and dogs, the ripeness of fruit, brands in pictures?

If your answer is yes, then this article is for you!

I will show you how to create an app for your detector and put it on the internet for everyone to see.

In the end, you’ll have something like this to show your colleagues and friends: https://huggingface.co/spaces/Kili/plastic_in_river

You will be able to upload a test image and the model will return predicted boxes and labels.

Disclaimer: You need git installed on your computer to upload files to HuggingFace Spaces. If you don’t have it, don’t worry! It’s easy to install. Follow this tutorial: https://git-scm.com/downloads
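You can check whether git is already installed by running the following command in a terminal; it prints the version if git is present:

git --version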

This will be the workflow of the project:

  1. First, you have to collect images for your project. Do you want to tell zebras apart from giraffes? Then you need images of both animals. Whatever you want to detect, you need images of it. This step is white in the workflow, which means you do the work on your own computer.

  2. Labeling the images is shown in blue in the workflow because you will be using Datature’s labeling tool. Datature is a company that specializes in building user-friendly tools for data labeling and model training.

  3. You will also use Datature to train the model.

  4. Once the model is trained, you download it to your computer and put it together with the rest of the files (which I will give you).

  5. When all the files are together, you’ll upload them to HuggingFace Spaces and your model is ready to use!

1. Collect the images

In a computer vision project, the first thing we need to do is collect images. If we wanted to train a deep neural network from scratch, we would need thousands of images.

Luckily, Datature uses very advanced models that come pre-trained, which means that instead of training a model from scratch, we only need a fraction of those images.

About 100 images per class is sufficient. For example, if you want to detect t-shirts and pants, you need 100 images of t-shirts and 100 images of pants. The same applies to other cases: for cats and dogs, you need 100 examples of cats and 100 examples of dogs.

It’s ok if there is some class imbalance: for example, if your project detects sunny and cloudy days, you can have 120 pictures of sunny days and 100 pictures of cloudy days. Roughly 100 images per class is enough.

Collect all the images and store them in a folder on your computer.
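If you keep one subfolder per class, a small script like the following (a hypothetical sketch; adjust the folder name and file extensions to your setup) can verify that you have roughly 100 images per class:

import os

data_dir = "images"  # hypothetical folder with one subfolder per class
for class_name in sorted(os.listdir(data_dir)):
    class_dir = os.path.join(data_dir, class_name)
    if not os.path.isdir(class_dir):
        continue
    n_images = len([f for f in os.listdir(class_dir)
                    if f.lower().endswith((".jpg", ".jpeg", ".png"))])
    print(f"{class_name}: {n_images} images")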

2. Label the images

Create an account in Datature and create a project for your use case. This tutorial from the Datature team explains how to create a project and annotate images.

https://datature.io/blog/train-and-visualize-face-mask-detection-model

This blog post details how to:

  • Create a Datature Nexus Account (Free Trial)

  • Create a project

  • Upload images

  • Create classes

  • Annotate images

  • Create a rectangular box in an image

  • Assign a class to each box

For each image, you will annotate a box (where is the object?) and a class (what is the object?).

You only need to read up to the labeling section; after that, in the project overview, you should see your image stats, label distribution, etc. For example, a project overview should look like this:

In this example, I have a project called bananas, I have 16 images labeled, and I have 3 classes: ripe, edible, and inactive. This is just an example, so make sure you have at least 100 examples per class!

3. Train the model

Once we have the images labeled, we can train our model! We will have to create a “workflow” in Nexus. Use the same blog post as before, https://datature.io/blog/train-and-visualize-face-mask-detection-model, to complete the following steps:

  • Build a training workflow: choose the train-test split ratio, choose augmentations, choose the model settings

  • Train the model

  • Monitor the model: loss, precision, recall

  • Export the model

The model will take about 1 hour to train, after which you should see something like this:

Go to Artifacts and download the TensorFlow model.

This part is complete when the .zip file has been downloaded to your computer.
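If you want to inspect the export before moving on, you can unzip and list it with Python (a sketch; replace the file name with whatever Datature gave you):

import zipfile

with zipfile.ZipFile("datature_export.zip") as zf:  # hypothetical file name
    zf.extractall("export")
    print(zf.namelist())  # you should see saved_model/ and label_map.pbtxt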

4. Create a HuggingFace account

The model is trained and we download it to our computer as a .zip. But how do we interact with it?

We can interact with it by uploading photos to a HuggingFace Space. We also need some code for the front end of the website.

Huggingface Spaces is a website owned by Huggingface where people can display their models and interact with them.

These are the steps to create it:

  1. Create a Huggingface account

  2. Create a Space

  3. Write a name for the Space. Remember, this site will be public, so choose a name that matches the app! Example: Banana Analysis or something like that

  4. Select Streamlit as the Space SDK

  5. Select Public

  6. After the Space is created, clone the repository to a folder on your local computer (see the example after this list)

  7. Optionally, edit the README.md
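The clone command looks like this, with placeholder user and Space names that you should replace with your own:

git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME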

5. Gather all the files and upload them to the HuggingFace Space

We now have a folder on our computer that belongs to the Space. We have to copy all the files into it and push them to the Space using git.

First, copy the model files (the saved_model/ folder and label_map.pbtxt) into the folder.

Then, create these 3 files (available in this gist: https://gist.github.com/anebz/2f62caeab1f24aabb9f5d1a60a4c2d25) in the folder:

app.py

This file contains code for uploading images, loading the model, preprocessing, and getting predictions from the model.

Note the lines with #TODO; you must modify them!

Specifically the color_map, which holds the colors of the boxes for each class. Open the file label_map.pbtxt to see which label_id is assigned to each class, and use this label_id to assign RGB values to the colors.

In this example, I only have 2 classes, so only 2 colors. If you have more classes, add more lines in the format of the example:

1: [255, 0, 0],

Remember, there should be a comma at the end of every line except the last!
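For instance, a hypothetical 3-class project could use a color_map like this (check your own label_map.pbtxt for the actual ids):

color_map = {
    1: [255, 0, 0],  # class with id 1 -> red
    2: [0, 255, 0],  # class with id 2 -> green
    3: [0, 0, 255]   # class with id 3 -> blue
}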

import cv2
import numpy as np
from PIL import Image
import streamlit as st
import tensorflow as tf

# most of this code has been obtained from Datature's prediction script
# https://github.com/datature/resources/blob/main/scripts/bounding_box/prediction.py

st.set_option('deprecation.showfileUploaderEncoding', False)

@st.cache(allow_output_mutation=True)
def load_model():
    # load the exported Datature model from the saved_model/ folder
    return tf.saved_model.load('./saved_model')

def load_label_map(label_map_path):
    """
    Reads label map in the format of .pbtxt and parse into dictionary
    Args:
      label_map_path: the file path to the label_map
    Returns:
      dictionary with the format of {label_index: {'id': label_index, 'name': label_name}}
    """
    label_map = {}

    with open(label_map_path, "r") as label_file:
        for line in label_file:
            if "id" in line:
                label_index = int(line.split(":")[-1])
                
                label_name = next(label_file).split(":")[-1].strip().strip('"')
                
                label_map[label_index] = {"id": label_index, "name": label_name}
    return label_map
 
# helper kept from the original Datature script; it is not used in this app
def predict_class(image, model):
    image = tf.cast(image, tf.float32)
    image = tf.image.resize(image, [150, 150])
    image = np.expand_dims(image, axis=0)
    return model.predict(image)

def plot_boxes_on_img(color_map, classes, bboxes, scores, category_index, image_origi, origi_shape):
    # boxes come normalized as [ymin, xmin, ymax, xmax], so scale them by the image shape
    for idx, each_bbox in enumerate(bboxes):
        color = color_map[classes[idx]]

        ## Draw bounding box
        cv2.rectangle(
            image_origi,
            (int(each_bbox[1] * origi_shape[1]),
             int(each_bbox[0] * origi_shape[0]),),
            (int(each_bbox[3] * origi_shape[1]),
             int(each_bbox[2] * origi_shape[0]),),
            color,
            2,
        )

        ## Draw label background
        cv2.rectangle(
            image_origi,
            (int(each_bbox[1] * origi_shape[1]),
             int(each_bbox[2] * origi_shape[0]),),
            (int(each_bbox[3] * origi_shape[1]),
             int(each_bbox[2] * origi_shape[0] + 15),),
            color,
            -1,
        )

        ## Insert label class & score
        cv2.putText(
            image_origi,
            "Class: {}, Score: {}".format(
                str(category_index[classes[idx]]["name"]),
                str(round(scores[idx], 2)),
            ),
            (int(each_bbox[1] * origi_shape[1]),
             int(each_bbox[2] * origi_shape[0] + 10),),
            cv2.FONT_HERSHEY_SIMPLEX,
            0.3,
            (0, 0, 0),
            1,
            cv2.LINE_AA,
        )
    return image_origi


# Webpage code starts here

#TODO change this
st.title('YOUR PROJECT NAME')

st.text('made by XXX')

st.markdown('## Description about your project')

with st.spinner('Model is being loaded...'):
    model = load_model()

# ask user to upload an image
file = st.file_uploader("Upload image", type=["jpg", "png"])

if file is None:
    st.text('Waiting for upload...')
else:
    st.text('Running inference...')

    # open image
    test_image = Image.open(file).convert("RGB")
    origi_shape = np.asarray(test_image).shape

    # resize image to default shape
    default_shape = 320
    image_resized = np.array(test_image.resize((default_shape, default_shape)))

    ## Load color map
    category_index = load_label_map("./label_map.pbtxt")

    # TODO Add more colors if there are more classes
    # color of each label. check label_map.pbtxt to check the index for each class
    color_map = {
        1: [255, 0, 0],  # bad -> red
        2: [0, 255, 0]   # good -> green
    }

    ## The model input needs to be a tensor
    input_tensor = tf.convert_to_tensor(image_resized)
    ## The model expects a batch of images, so add an axis with tf.newaxis
    input_tensor = input_tensor[tf.newaxis, ...]

    ## Feed image into model and obtain output
    detections_output = model(input_tensor)
    num_detections = int(detections_output.pop("num_detections"))
    detections = {key: value[0, :num_detections].numpy() for key, value in detections_output.items()}
    detections["num_detections"] = num_detections

    ## Filter out predictions below threshold
    # if threshold is higher, there will be fewer predictions
    # TODO change this number to see how the predictions change
    confidence_threshold = 0.8
    indexes = np.where(detections["detection_scores"] > confidence_threshold)

    ## Extract predicted bounding boxes
    bboxes = detections["detection_boxes"][indexes]

    # there are no predicted boxes
    if len(bboxes) == 0:
        st.error('No boxes predicted')
    # there are predicted boxes
    else:
        st.success('Boxes predicted')
        classes = detections["detection_classes"][indexes].astype(np.int64)
        scores = detections["detection_scores"][indexes]

        # plot boxes and labels on image
        image_origi = np.array(Image.fromarray(image_resized).resize((origi_shape[1], origi_shape[0])))
        image_origi = plot_boxes_on_img(color_map, classes, bboxes, scores, category_index, image_origi, origi_shape)

        # show image in web page
        st.image(Image.fromarray(image_origi), caption="Image with predictions", width=400)
        st.markdown("### Predicted boxes")
        for idx in range(len(bboxes)):
            st.markdown(f"* Class: {str(category_index[classes[idx]]['name'])}, confidence score: {str(round(scores[idx], 2))}")

packages.txt:

ffmpeg
libsm6
libxext6

requirements.txt:

numpy==1.18.5
opencv-python-headless
Pillow==7.2.0
streamlit
tensorflow==2.3.0

packages.txt and requirements.txt list the system packages and Python libraries that will be installed in the Space. These files are very important; without them, the code will not run.
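Before pushing, you can optionally test the app locally (assuming you have Python installed). packages.txt lists system packages installed on the Space itself; locally, the opencv-python-headless wheel usually works without them:

pip install -r requirements.txt
streamlit run app.py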

In the end, the folder should look like this:

  • saved_model/ is the folder from the .zip file you downloaded earlier from Datature

  • label_map.pbtxt is also in the .zip file

  • .gitattributes

  • README.md

  • app.py is the file with the code shown earlier in this article

  • requirements.txt has the contents shown above

  • packages.txt has the contents shown above

Once all the files we need are in the folder, we can push them to the Space. Open Git Bash and paste the following commands in sequence:

  • git add .

  • git commit -m "Added files"

  • git push

It takes some time to upload the files, especially the model files. After the git push is complete, the Space will take a few minutes to build the application and display it.

If git shows errors that the model file is too large, check out these posts: https://discuss.huggingface.co/t/uploading-large-files-5gb-to-hf-spaces/12001/4 and https://discuss.huggingface.co/t/uploading-files-larger-than-5gb-to-model-hub/4081/6
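As those posts explain, large files on HuggingFace are handled with Git LFS. A typical sequence, run inside the Space folder before committing (the tracking pattern here is just an example), looks like this:

git lfs install
git lfs track "saved_model/*"
git add .gitattributes

Note that the .gitattributes file that comes with the Space may already track the common binary file types.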

Finally, your app will look like this: https://huggingface.co/spaces/anebz/test

You can upload an image, the model takes a few seconds to load, and then you can see the predictions. The project is complete!

Conclusion

Your project is done, congratulations! You can quickly create an application from scratch: you just need Git installed on your computer, with no Python setup and no code to write yourself.
