Text Recognition using EasyOCR Python on SAP Data Intelligence

Sejal Goyal
4 min read · Jul 3, 2021

In this post, we will create a REST API for text recognition in images using the EasyOCR Python library. I recently worked with SAP Data Intelligence for the first time and found it difficult to create a REST API, so I am sharing what I learned here. I hope it helps!

Step-1: Create Scenario
Go to the main page of SAP Data Intelligence and open the “ML Scenario Manager”. Click the small “+” sign at the top right to create a new scenario. Name it “EasyOCR example” and enter any further details in the “Text Recognition” section. Click “Create”.

You now see the empty scenario. You can use the Notebooks to explore the data. Pipelines bring the code into production. Executions of these pipelines create Machine Learning models, which are then deployed as REST APIs for inference. In this tutorial, however, we are not using any ML model from the scenario.

Step-2: Create Pipeline
Create a pipeline by selecting “Create”, name it “OCR API”, and choose “Python Consumer” as the template. You will see a diagram of the pipeline. Because we are not using any ML model, the pipeline needs to be changed to match the diagram below.

Step-3: Change Python file
The Python script in our pipeline accepts an image file and sends back the extracted text. For text extraction we use OpenCV (cv2) and EasyOCR. Right-click “Python36-Inference” and select “Open Script” from the list. Copy in the Python code below and save the pipeline.
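As a rough illustration of what such a script has to do, here is a minimal sketch of the OCR part only. It is not the original script from this pipeline. The function names (`extract_text`, `results_to_json`, `handle_image`) are hypothetical, and the English-only, CPU-only reader is an assumption. Wiring it to the operator's ports (for example through a port callback on the `api` object) depends on your pipeline and is left out.

```python
import json


def results_to_json(texts):
    """Join recognised text fragments and wrap them in the JSON shape used in this post."""
    return json.dumps({"Extracted text": " ".join(texts)})


def extract_text(image_bytes, reader=None):
    """Decode raw image bytes with OpenCV and run EasyOCR on the decoded image."""
    # Imported lazily so results_to_json can be used without the OCR stack installed.
    import cv2
    import numpy as np
    import easyocr

    # Assumption: English text, no GPU. Adjust the language list and gpu flag as needed.
    reader = reader or easyocr.Reader(["en"], gpu=False)

    # Turn the request body (raw bytes) into a BGR image array.
    image = cv2.imdecode(np.frombuffer(image_bytes, np.uint8), cv2.IMREAD_COLOR)
    if image is None:
        raise ValueError("request body is not a decodable image")

    # detail=0 makes readtext return only the recognised strings.
    return reader.readtext(image, detail=0)


def handle_image(image_bytes, reader=None):
    """Hypothetical entry point: image bytes in, JSON string out."""
    return results_to_json(extract_text(image_bytes, reader))
```

Creating an `easyocr.Reader` loads the detection and recognition models, so in practice you would probably build it once when the operator starts and pass it in as `reader`, rather than creating it for every request.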

Step-4: Create DockerFile
Select “Repository” in the left-hand menu and then select “dockerfiles”. Create a folder, add a file named “Dockerfile” containing the commands below, and save it. Open the configuration at the top right and set the tag name “OCR_image” and a version. Build this Dockerfile. Once the build succeeds, go to the next step.

FROM $com.sap.sles.ml.python
RUN python3.6 -m pip --no-cache-dir install --user --upgrade pip
RUN python3.6 -m pip --no-cache-dir install --user easyocr
RUN python3.6 -m pip --no-cache-dir install --user opencv-python
RUN python3.6 -m pip --no-cache-dir install --user numpy

Step-5: Apply DockerFile
Go to the pipeline diagram and right-click the Python script operator. Select “Group” from the list. Now right-click the newly created group and select “Open configuration”. Go to the tags and add “OCR_image” with version “V2”; the tag and version should match those set on the Dockerfile in Step-4. Save it.

Step-6: Deploy API
Go back to the scenario page, select the “OCR API” pipeline, and select “Deploy”.
Once the status shows “Running”, you can copy the deployment URL from the top.

Congrats! You have successfully created the REST API.

Access this REST API using Postman:
1. Create a request in Postman. Enter <Deployment URL>/v1/uploadjson/ as the URL and select POST.
2. Select “Authorization” and enter “<tenant name>\<user name>” (e.g. default\I456789) as the username, along with your password.
3. Select “Headers” and add the header “X-Requested-With: XMLHttpRequest”.
4. Select “Body”, then “Binary”, and choose an image from your system.
5. Click “Send”. The result will be JSON.

Suppose the image below is the one selected (e.g. test.jpg).

The result will be: {"Extracted text": "619121"}
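The same call can be made without Postman. Below is a minimal sketch using only the Python standard library. It assumes the deployment accepts HTTP Basic authentication with the “<tenant name>\<user name>” username described above. The helper names (`build_ocr_request`, `recognise`) and the `application/octet-stream` content type are assumptions for illustration, not something prescribed by SAP Data Intelligence.

```python
import base64
import json
import urllib.request


def build_ocr_request(deployment_url, tenant, user, password, image_bytes):
    """Build the POST request described in the Postman steps (nothing is sent here)."""
    # Basic auth with the "<tenant>\<user>" username format used in this post.
    credentials = "{}\\{}:{}".format(tenant, user, password).encode("utf-8")
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
        "X-Requested-With": "XMLHttpRequest",
        # Assumption: the image is sent as a raw binary body.
        "Content-Type": "application/octet-stream",
    }
    url = deployment_url.rstrip("/") + "/v1/uploadjson/"
    return urllib.request.Request(url, data=image_bytes, headers=headers, method="POST")


def recognise(deployment_url, tenant, user, password, image_path):
    """Send an image file to the deployed API and return the parsed JSON response."""
    with open(image_path, "rb") as f:
        request = build_ocr_request(deployment_url, tenant, user, password, f.read())
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `recognise("<Deployment URL>", "default", "I456789", "<password>", "test.jpg")` would be expected to return a dictionary with the key “Extracted text”, provided the deployment is running and the credentials are valid.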

If you would like to learn more or have feedback, please let me know so I can improve and write more for interested readers.
Keep learning! :)
