Extracting Wei Xiaobao’s ID card information with ten lines of Python

I remember an advertising slogan that went: “Learn mathematics, physics and chemistry well, and you need not fear traveling anywhere in the world.” I think one more sentence should be added: “And bring your ID card.” In this article, we will look at how to use Python to extract the information on an ID card.

Implementation methods

The implementation methods can be roughly divided into two types:

  • Build your own wheel, for example by writing the recognition code yourself with OpenCV. Every function has to be implemented by hand, which is time-consuming and labor-intensive. The advantage is greater flexibility.
  • Use an off-the-shelf wheel, such as Baidu Cloud. The platform has already implemented the core functions and exposes them through an API, so we can call the interface directly. This saves time and effort, but may be less flexible.

Implementation process

Because what we want to do is relatively simple, we will use the second approach for the demonstration. Let’s take a brief look at the implementation process.

SDK installation

Baidu Cloud provides SDKs for multiple languages. Here we install the Python version with the command pip install baidu-aip. The SDK directory structure is as follows:

├── README.md
├── aip                    // SDK directory
│   ├── __init__.py        // exported classes
│   ├── base.py            // aip base class
│   ├── http.py            // HTTP requests
│   └── ocr.py             // OCR
└── setup.py               // setuptools installation
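Before going further, it can help to confirm that the package is importable from the interpreter you plan to use. The following is a minimal sketch using only the standard library; the helper name sdk_installed is made up for this illustration, and the module name aip is taken from the directory listing above.

```python
import importlib.util


def sdk_installed(module_name="aip"):
    """Return True if a top-level module with this name can be found."""
    # find_spec returns None when no such top-level module is installed
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("baidu-aip importable:", sdk_installed())
```

If this prints False, the pip command was probably run for a different Python environment than the one running the script.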

Create an app

After the SDK is installed, we need to create an application. This requires a Baidu account or a Baidu Cloud account; if you don’t have one, you can register. The login and registration address is: https://login.bce.baidu.com/?redirect=http%3A%2F%2Fcloud.baidu.com%2Fcampaign%2Fcampus-2018%2Findex.html. The process is basically the same as for license plate recognition; if anything is unclear, you can read the article on license plate recognition.

Implementation

Let’s find an ID picture first, as shown in the figure:

Now let’s look at the code. First, create an AipOcr instance; AipOcr is the OCR client class of the Python SDK. The code is as follows:

from aip import AipOcr

# Your own APP ID, API Key and Secret Key
APP_ID = 'own App ID'
API_KEY = 'own Api Key'
SECRET_KEY = 'own Secret Key'

client = AipOcr(APP_ID, API_KEY, SECRET_KEY)

For how to obtain the three values above, you can also refer to the introduction in the license plate recognition article.
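If you would rather not keep the three keys in the source file, one option is to read them from environment variables. This is only a sketch: the variable names BAIDU_APP_ID, BAIDU_API_KEY and BAIDU_SECRET_KEY and the helper load_credentials are invented for this example and are not defined by the SDK.

```python
import os


def load_credentials(env=os.environ):
    """Return (app_id, api_key, secret_key) read from a mapping of variables."""
    # Hypothetical variable names, chosen for this example only
    names = ("BAIDU_APP_ID", "BAIDU_API_KEY", "BAIDU_SECRET_KEY")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in names)


# Possible usage once the variables are set:
#   client = AipOcr(*load_credentials())
```

Passing the mapping as a parameter makes the helper easy to try with an ordinary dict before touching the real environment.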

There are two modes of information extraction: normal mode and high-precision mode. The normal mode code is implemented as follows:

# Open the image and read its content as bytes
with open("card.jpg", "rb") as f:
    image = f.read()
res = client.basicGeneral(image)  # normal mode
# Print each recognized line of text
for tex in res["words_result"]:
    row = tex["words"]
    print(row)

The output is as follows:

Name Wei Xiaobao
gender male ethnic
Born December 20, 1654
Address No. 4 Jingshan Front Street, Dongcheng District, Beijing
Forbidden City Reverend Room
Citizen ID number 112441654122 2438
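The loop over res["words_result"] assumes the call succeeded. As far as I know, a failed request (wrong keys, unsupported image, quota exceeded) comes back as a dictionary containing error_code and error_msg instead, in which case indexing "words_result" would raise a KeyError. A small helper along these lines makes the failure explicit; the name words_from_response is made up for this sketch, and the error field names should be checked against the Baidu documentation.

```python
def words_from_response(res):
    """Return the recognized text lines from an OCR response dictionary."""
    # Assumption: a failed call carries "error_code" and "error_msg" keys
    if "error_code" in res:
        raise RuntimeError(
            "OCR failed: {} {}".format(res["error_code"], res.get("error_msg", ""))
        )
    # Each entry of "words_result" is a dict with the text under "words"
    return [item["words"] for item in res.get("words_result", [])]


if __name__ == "__main__":
    sample = {"words_result": [{"words": "Name Wei Xiaobao"}]}
    for row in words_from_response(sample):
        print(row)
```

The same helper can be applied to the result of either recognition mode, since both are read the same way in this article.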

Let’s try the high-precision mode again. The code is implemented as follows:

# Open the image and read its content as bytes
with open("card.jpg", "rb") as f:
    image = f.read()
res = client.basicAccurate(image)  # high-precision mode
# Print each recognized line of text
for tex in res["words_result"]:
    row = tex["words"]
    print(row)

The output is as follows:

Name Wei Xiaobao
gender male ethnic
Born December 20, 1654
Address No. 4 Jingshan Front Street, Dongcheng District, Beijing
Forbidden City Reverend Room
Citizen ID number 11204416541220243X

From the output, we can see that high-precision mode extracts the ID number correctly, while the ID number extracted in normal mode contains errors.
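A difference like this can also be caught in code rather than by eye. The sketch below checks only the shape of the number, assuming the 18-character format in which the first 17 characters are digits, the last is a digit or X, and characters 7 to 14 hold the birth date as YYYYMMDD. It does not verify the check character, and the function name looks_like_id_number is invented for this example.

```python
import re
from datetime import datetime

# 17 ASCII digits followed by one digit or an uppercase X
ID_PATTERN = re.compile(r"[0-9]{17}[0-9X]")


def looks_like_id_number(text):
    """Return True if text has the shape of an 18-character ID number."""
    if ID_PATTERN.fullmatch(text) is None:
        return False
    try:
        # Characters 7-14 (indexes 6..13) should form a valid calendar date
        datetime.strptime(text[6:14], "%Y%m%d")
    except ValueError:
        return False
    return True


if __name__ == "__main__":
    for candidate in ("11204416541220243X", "112441654122 2438"):
        print(candidate, looks_like_id_number(candidate))
```

With the two numbers from the outputs above, the high-precision result passes this check and the normal-mode result does not, because it has the wrong length and contains a space.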

Summary

In this article, we used Python together with the Baidu Cloud interface to extract ID card information in a few lines of code. Besides ID cards, information from other cards, such as bank cards, can also be extracted. If you are interested, give it a try.

