The goal here is to run multiple threads in C++, each executing only one (or a few) image-processing functions, and in this way achieve high frame rates combined with a large image size.
Reading a frame from a camera is generally a time-consuming operation (especially with a USB-connected webcam, as in this post). So this step should definitely get its own thread to reach the highest possible frame rate at the largest possible video size.



The code was developed on a Raspberry Pi 2 running the standard Raspbian OS. Compiling the C++ code therefore requires a Makefile, such as the following:
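The Makefile itself is missing from this text; a minimal sketch of what it could look like follows (the file names are from this post, everything else is an assumption, and the line numbering of the original Makefile referenced below will not match this shortened version):

```makefile
# Makefile -- sketch, assuming OpenCV is visible to pkg-config
CXX      = g++
CXXFLAGS = -std=c++11 -Wall -O2
LIBS     = $(shell pkg-config --libs opencv)
OBJS     = camLib.o camera.o

all: camera

camera: $(OBJS)
	$(CXX) $(CXXFLAGS) -o camera $(OBJS) $(LIBS) -lpthread

%.o: %.cpp camLib.hpp
	$(CXX) $(CXXFLAGS) -c $<

clean:
	rm -f camera $(OBJS)
```

Note that on Linux, programs using std::thread generally also need to be linked with -lpthread (or compiled with -pthread), in addition to the OpenCV libraries.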

Note: In this Makefile I am assuming a flat hierarchy – i.e. the three files I am implementing are all in the same folder.
To get the highest frame rate at the largest frame size on the now relatively "old" Raspberry Pi 2, it is really necessary to use threads so that all available resources (i.e. the four cores of the Raspberry Pi's processor) are used. This requires the C++11 or C++14 (or later) standard, which introduced threading functionality to C++. It is enabled with the flag -std=c++11 or -std=c++14, as is done in the above Makefile in lines 15 and 17. The Makefile also links the OpenCV libraries into the build via the LIBS variable defined in line 7.
I usually clean the folder from previous compilations with the call make clean and then start a fresh compilation and linking with the make call.

With this Makefile it is now possible to compile the code.


Below is the header file, which declares the main camera class and its member functions. I am calling this header file camLib.hpp:
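Since the header itself is not reproduced in this text, here is a sketch of what camLib.hpp could look like. The return type of captureVideo() and the include guards are my assumptions; only the class name, the member function name, and the frame variable come from the description below.

```cpp
// camLib.hpp -- sketch of the camera class declaration
#ifndef CAMLIB_HPP
#define CAMLIB_HPP

#include <opencv2/opencv.hpp>

class Camera {
public:
    Camera();                // open the capture device
    ~Camera();               // release the capture device
    cv::Mat captureVideo();  // grab and return the latest frame
private:
    cv::Mat frame;           // latest captured frame
};

#endif // CAMLIB_HPP
```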


This header declares the constructor and destructor of the class Camera and one member function called captureVideo, which takes no arguments. It also declares a few private variables, such as the OpenCV matrix frame, which contains the latest captured frame.
The member functions are defined in the file camLib.cpp:
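The implementation file is also missing here; a sketch of what camLib.cpp could look like follows. The device index 0 and the exact body of each function are assumptions; the file-scope placement of the VideoCapture instance matches the description below.

```cpp
// camLib.cpp -- sketch of the member-function definitions
#include "camLib.hpp"

// The VideoCapture instance lives at file scope (outside the
// class functions), so every member function can reach the same
// capture device. Device index 0 is an assumption.
cv::VideoCapture cap(0);

Camera::Camera()  { }                 // device is opened by cap above
Camera::~Camera() { cap.release(); }  // free the camera on destruction

cv::Mat Camera::captureVideo() {
    cap >> frame;                     // read the next frame
    return frame;
}
```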

In this file the OpenCV VideoCapture instance is declared outside the class functions, in line 5. This makes the VideoCapture instance cap accessible from anywhere in the file. That has advantages and disadvantages. One advantage is simplicity. A big disadvantage is that when the camera assertion fails (which can happen), the program needs to be restarted manually, maybe even multiple times, until the camera finally starts up.

Main file camera.cpp

The following, finally, is the main code camera.cpp. To get the highest possible frame rate while applying some OpenCV functions on a Raspberry Pi 2, using threads is a good way to go.
The file camera.cpp contains three functions:

  • main()
  • grabFrame()
  • processFrame()

Some important variables are declared global in this code:

  • cam1, which is an instance of the camera class and needs to be accessible from main() and from grabFrame().
  • frameBuffer, which is the maximum number of frames that can go onto the OpenCV Mat stack before the whole stack is erased.
  • frameStack, which is the OpenCV Mat stack containing the frames read from the camera (up to frameBuffer frames).
  • contourStack, which is the OpenCV Mat stack containing the frames taken from frameStack and processed using OpenCV functions.
  • stopSig, which is a flag that, when set to 1, signals all threads to stop and return to the main routine.

In camera.cpp, first the two OpenCV stacks frameStack and contourStack are cleared in lines 90 and 91 (they were defined in lines 8 and 9). Then the two threads are started in lines 92 and 93.
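Since camera.cpp is not reproduced in this text, here is a condensed sketch of its overall shape. The globals follow the description above; the mutex, the vector-based "stacks", and the exact OpenCV parameters are my assumptions, and the line numbering of the original file will not match this shortened version.

```cpp
// camera.cpp -- condensed, untested sketch of the three-thread layout
#include <mutex>
#include <thread>
#include <vector>
#include <opencv2/opencv.hpp>
#include "camLib.hpp"

Camera cam1;                          // global camera instance
const std::size_t frameBuffer = 100;  // max frames per stack
std::vector<cv::Mat> frameStack;      // raw frames from the camera
std::vector<cv::Mat> contourStack;    // processed (Canny) frames
std::mutex stackMutex;                // protects both stacks (assumption)
volatile int stopSig = 0;             // 1 => all threads stop

void grabFrame() {                    // producer thread
    while (stopSig == 0) {
        cv::Mat f = cam1.captureVideo();
        std::lock_guard<std::mutex> lock(stackMutex);
        if (frameStack.size() >= frameBuffer) frameStack.clear();
        frameStack.push_back(f.clone());
    }
}

void processFrame() {                 // worker thread
    while (stopSig == 0) {
        cv::Mat f, gray, edges;
        {
            std::lock_guard<std::mutex> lock(stackMutex);
            if (frameStack.empty()) continue;
            f = frameStack.back();
        }
        cv::GaussianBlur(f, f, cv::Size(5, 5), 0);   // blur first
        cv::cvtColor(f, gray, cv::COLOR_BGR2GRAY);   // then grayscale
        cv::Canny(gray, edges, 50, 150);             // then Canny
        std::lock_guard<std::mutex> lock(stackMutex);
        if (contourStack.size() >= frameBuffer) contourStack.clear();
        contourStack.push_back(edges);
    }
}

int main() {
    frameStack.clear();               // clear both stacks first
    contourStack.clear();
    std::thread t1(grabFrame);        // start the two worker threads
    std::thread t2(processFrame);
    while (true) {
        {
            std::lock_guard<std::mutex> lock(stackMutex);
            if (!contourStack.empty())
                cv::imshow("Canny", contourStack.back());
        }
        if (cv::waitKey(1) == 27) { stopSig = 1; break; }  // Esc
    }
    t1.join();
    t2.join();
    return 0;
}
```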


In the picture below I am trying to visualize the sequence of image processing through the three threads:

  1. grabFrame() is symbolized as a conveyor belt whose only purpose is to grab frames from the camera and store them on the frameStack stack. This function also makes sure that the stack does not overflow: when the stack is full, it is cleared and filled from scratch. Note: A great place to look up the functions that can be used with the OpenCV stack is the GeeksForGeeks website.
  2. The function processFrame() is symbolized as a conveyor belt as well. It reads the last frame from the frameStack stack and applies some OpenCV functions to the image. In the code above you can see a few different test functions I ran on the frames. The latest one that is not commented out is the Canny edge detection (line 32). To get better results, before applying the Canny function I apply a Gaussian blur in line 28 and convert the image from RGB to grayscale in line 29. This function then saves the resulting frames in a second stack, called contourStack.
  3. The third thread is the main() function. It starts the two other threads and waits for the user to press the Esc key, at which point it sets stopSig=1, which signals the other two threads to stop and join back in main(). The other job of main() is to display the resulting images from the threads. Note: The OpenCV functions imshow() and waitKey() don't seem to work reliably if you try to use them in sub-threads. Stable results can be achieved only by using these two functions in the main() function.


The image below shows the result of the Canny edge detection running at 720p, 30 fps on the Raspberry Pi 2 using a Logitech QuickCam Pro 9000.

This is already pretty nice with what I have. However, it is also pretty much at the limit of what is possible with this hardware.
One major bottleneck is the USB 2.0 throughput of the Raspberry Pi 2 at 480 Mb/s. The camera produces HD video with up to 1280×720 = 921,600 pixels, each with 8 bits, at 30 frames per second. That is 1280 × 720 × 8 × 30 = 221.184 Mb/s, which is close to the maximum of what the USB port can accept.
So for this reason, in the next step I am going to use the Raspberry Pi camera, which has a 2-lane CSI-2 MIPI bus allowing up to 1 Gb/s, which should permit 1080p frames at higher frame rates.