Adding AI to the Raspberry Pi with the Movidius Neural Compute Stick Part 2: Using the Raspberry Pi Camera Module

Previously I introduced the Movidius NCS and showed how it could be combined with a Raspberry Pi 3 B+, a USB Camera and a Deep Learning Model to produce real-time image analysis and annotation.

This time we’ll replace the USB Camera with a Raspberry Pi Camera Module and look into how to solve the resulting performance issues.

Mark West

Motivation

The Raspberry Pi Camera Module is the preferred camera for many Raspberry Pi enthusiasts. This is partly due to its low price (around 300-350 NOK), but also due to its reputation for generally providing better performance than USB Cameras.

This performance advantage comes from the Pi Camera utilising the Raspberry Pi's GPU for image processing. This frees up valuable CPU capacity for other things, such as running OpenCV or a Desktop. USB Cameras, on the other hand, place a direct load on the Pi's CPU.

I was therefore keen to try out the Pi Camera and see how it performed along with my Pi3B+ and the NCS. The initial results were not what I expected.

Preparing the Raspberry Pi for the Pi Camera 

First off, make sure you've read through my previous article, and that you have successfully completed all the steps defined in it!

Next you'll need to make sure that your Pi Camera is enabled in Raspbian. Do this by opening a terminal, running

sudo raspi-config

...and enabling the Pi Camera under the "Interfacing options" section.
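To verify that the camera has been enabled and is detected (an optional step that can save debugging later), you can run:

vcgencmd get_camera

...which should report supported=1 detected=1.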

Finally you'll need to make sure that the picamera Python interface and the picamera.array module are installed. To double-check that these are installed on Raspbian, run the following:

sudo apt-get install python3-picamera

sudo pip3 install "picamera[array]"
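If you want to confirm that everything works end-to-end, this minimal snippet (my own addition, not part of the example code) captures a single frame into a NumPy array:

import time
from picamera import PiCamera
from picamera.array import PiRGBArray

with PiCamera() as camera:
    camera.resolution = (640, 480)
    time.sleep(2)                      # give the sensor time to warm up
    raw = PiRGBArray(camera)
    camera.capture(raw, format="bgr")  # BGR order, ready for OpenCV
    print("Captured frame with shape:", raw.array.shape)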

You should now have everything you need to continue.

Round One: Adapting Example Code to use the Pi Camera

In a previous article I showed you how to run the live-object-detector example from the Movidius Examples with a USB Camera (Task 4). This produced annotated video at a rate of 4.48 frames per second.

I decided to convert this example to work with the Pi Camera. This required some small changes to the original live-object-detector.py file. The amended code is available as a download from Dropbox. To use it, download a copy to the:

~/workspace/ncappzoo/apps/live-object-detector/

directory and run the following command:

python3 live-object-detector-picam.py
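The heart of the change is swapping OpenCV's cv2.VideoCapture loop for a picamera capture loop. A minimal sketch of that pattern follows (illustrative only - the variable names and the inference placeholder are mine, not the exact contents of the download):

from picamera import PiCamera
from picamera.array import PiRGBArray

camera = PiCamera()
camera.resolution = (640, 480)
raw_capture = PiRGBArray(camera, size=(640, 480))

# capture_continuous yields frames indefinitely, reusing the same buffer
for frame in camera.capture_continuous(raw_capture, format="bgr", use_video_port=True):
    image = frame.array        # NumPy array in BGR order, ready for OpenCV
    # ... run NCS inference and annotate 'image' here ...
    raw_capture.truncate(0)    # reset the buffer before the next frame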

Here is what I saw on running this code:

Pi Camera with Pi 3 B+

This was, to be honest, rather disappointing. An FPS of around 1.5 made the Pi Camera three times slower than the USB Camera. I ran a new test with the USB Camera to confirm that it was indeed faster:

USB Camera with Pi 3 B+

Much smoother. There was definitely a performance difference between the Pi Camera and USB Camera. But why?

One hunch was that the performance difference was due to the extra effort of moving each image in the Pi Camera video stream from the GPU to the CPU. This transfer is the main difference between the USB Camera and Pi Camera processing paths.

Another suggestion was that the image sizes produced by the Pi Camera and the USB Camera were different, which could also affect performance.

I therefore decided to address the performance problem programmatically, with the plan of moving image handling into a separate thread.

Round Two: Adding Threading

Googling "threading python raspberry pi camera" quickly led me to Adrian Rosebrock's imutils Python library.

This library has a lot of functionality, but I was specifically interested in the PiVideoStream.py class. This is a wrapper for picamera that moves image I/O into its own thread, freeing up the main thread to handle NCS inference and annotation of the current frame.

When the main thread has finished processing the current image, the image I/O thread will already have the next image ready. This is potentially much faster than a single thread that processes the current frame, waits for the next one to load, processes it, and so on.
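To illustrate the idea, here is a simplified sketch of the pattern (this is not the actual PiVideoStream code, and read_next_frame is a placeholder for the real camera I/O):

import threading

class ThreadedFrameGrabber:
    def __init__(self, camera_stream):
        self.stream = camera_stream  # any object exposing read_next_frame()
        self.frame = None
        self.stopped = False

    def start(self):
        # a daemon thread keeps grabbing frames in the background
        threading.Thread(target=self._update, daemon=True).start()
        return self

    def _update(self):
        while not self.stopped:
            self.frame = self.stream.read_next_frame()

    def read(self):
        # returns the most recently grabbed frame, without blocking
        return self.frame

    def stop(self):
        self.stopped = True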

In addition, the imutils package includes VideoStream.py. This class encapsulates the functionality for working with a video stream from either a Pi Camera or a USB Camera, and allows switching between the two video sources by setting a boolean flag.
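Usage looks something like this (resolution is an optional parameter; I've used the same 640x480 as in my tests):

from imutils.video import VideoStream

# usePiCamera=True selects the Pi Camera; False falls back to a USB Camera
vs = VideoStream(usePiCamera=True, resolution=(640, 480)).start()
frame = vs.read()   # most recent frame as a NumPy array
vs.stop()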

Once again I decided to base my test on the live-object-detector example from the Movidius Examples. But first I had to install imutils: 

sudo pip3 install imutils

My next step was to copy my existing code and amend it to use imutils. This new file is also available as a download from Dropbox. To use it, download a copy to the:

~/workspace/ncappzoo/apps/live-object-detector/

directory and run the following:

python3 live-object-detector-universal.py

This was the result when I ran the amended code with imutils and threading:

Threaded Pi Camera with Pi 3 B+

I think you can agree that this new version is somewhat better than my first attempt with the Pi Camera! But how does it compare to the USB Camera version?

Benchmarking with FPS.py

I benchmarked the different camera setups using the FPS.py class from imutils. This class counts processed frames and calculates the overall FPS once processing is terminated. I ran a few tests and averaged the FPS for each setup.
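Using FPS.py is straightforward. A minimal sketch of the pattern (the loop body below stands in for the real capture and inference code):

from imutils.video import FPS

fps = FPS().start()
for _ in range(100):   # stand-in for the real capture/inference loop
    # ... grab a frame, run inference, annotate it ...
    fps.update()       # count one processed frame
fps.stop()
print("Elapsed time: {:.2f}s".format(fps.elapsed()))
print("Approx. FPS: {:.2f}".format(fps.fps()))

The averages were: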

  • USB Camera: 4.48 FPS (from part one of this series)
  • Pi Camera (unthreaded): 1.4 FPS (round one from this article) [code]
  • Pi Camera (threaded): 4.2 FPS (round two from this article) [code]

As you can see, the USB Camera slightly outperforms the Pi Camera in my tests. All tests used a 640x480 resolution with the same items in the frame.

Reducing the Pi Camera resolution to 320x240 gave me just over 5 FPS, which indicates that there are multiple things that could be tuned to improve the framerate. I'd also suggest that performance will be affected by the number of detected objects in the frame, as processing the output from the NCS (deserialization and image annotation) has more work to do when more objects are detected.

Movidius NCS Performance

You'll recall from the first article in this series that the Movidius NCS has one job in the live-object-detector use case: finding occurrences of 20 predefined objects in each frame of the video stream, using inference based on a pre-trained Deep Learning model.

Video I/O, image annotation and video display are handled by the Raspberry Pi.

Not surprisingly, the NCS performs the same whether I use a Pi Camera or a USB Camera. In my tests it took the NCS around 80 milliseconds to process each frame, indicating that the NCS processes frames faster than the Pi can produce them.
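For reference, this is roughly how such a measurement can be made with the NCSDK Python API (a sketch - it assumes graph, an allocated mvnc Graph, and input_tensor, a float16 NumPy array, have been prepared as in the ncappzoo examples):

import time

start = time.time()
graph.LoadTensor(input_tensor, 'user object')  # send the frame to the NCS
output, userobj = graph.GetResult()            # block until inference completes
print("Inference took {:.0f} ms".format((time.time() - start) * 1000))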

Final Thoughts and Conclusion

The imutils package definitely made a difference to the performance of my Raspberry Pi Camera Module, and the VideoStream.py helper class made it very easy for me to switch between video sources without having to change any code. In addition, the FPS.py class made it easy to benchmark different combinations of camera and code.

What was very interesting was that my USB Camera and Pi Camera performed more or less the same once I'd added threading to the Pi Camera implementation. I would have expected the Pi Camera to perform better, but every use case is different.

Can my results be improved upon? Quite possibly - especially if you take some time to optimize the code, or look into the complexities of OpenCV and the Raspberry Pi Camera Module. Adrian Rosebrock (the creator of imutils) has plenty of online resources to help you do exactly this. 

Next time I'll share my experiences using the Movidius NCS with the Raspberry Pi Zero. As always, share your comments and feedback below! It's always nice to see that people actually read this stuff!

You can also find me on Twitter under the handle @markawest.

EDIT: 17th August - I've rewritten parts of this article for clarification and updated the video examples with better quality videos.

Thanks for reading!