One of the many technologies needed for the ongoing FPath project is the ability of a computer to control and manipulate its environment via robotic methods. As you are no doubt all too aware, such control is a very big topic indeed. Drilling down further, one notes that computer image recognition could be usefully applied to the problem. Image recognition is itself an extraordinarily large area of study - but one has to start somewhere and so I set about learning a bit about it.
I like to program in C# and, while digging about to see what technologies might be available, I came across the EmguCV library, which is a C# wrapper for the standard C++ based OpenCV library. EmguCV seems like a very usable technology and I thought I would devote a bit of time to setting up some code and figuring out how to use it. While doing so I noticed that, although there are plenty of code snippets available on the Internet, there are few (if any) complete Visual Studio demonstration solutions that a beginner can download in order to try things out.
As part of my explorations I built six sample programs (Prism1 to Prism6) which demonstrate various aspects of EmguCV. Each Prism application is coded in C# as a Windows Form and is embedded in its own Visual Studio solution which is both complete and standalone. This code has now been released on GitHub as open source under the MIT License in the hope that it may prove useful to others.
Below is a brief summary of the Prism Applications - a more extended discussion of each application can be found further on down the page.
Please note that the primary purpose of these test applications was to get some experience using EmguCV. As such, none of the Prism apps attempt to tune the image recognition algorithms. That code has pretty much just been clipped verbatim from the readily available sample sources. It is simply assumed that the recognition aspect of the applications could be made to work better when that is required. All the current Prism applications really do is recognise "clean" circles and squares of pure primary red, blue and green colors.
Note that all of the applications expect to read from, and write to, a directory named C:\Dump\PrismData. This can be changed in the code, of course, but if you just want to run the apps you can save yourself some typing by creating the C:\Dump\ directory and copying the PrismData directory from the Git Repo over into it.
In Visual Studio on a Windows system the installation of EmguCV is trivial. Simply use NuGet to download all of the EmguCV libraries. Once done just reference the libraries as usual.
using Emgu.CV;
using Emgu.CV.CvEnum;
using Emgu.CV.Structure;
using Emgu.CV.Util;
The YouTube Video has a demonstration of the NuGet download process if you are inexperienced. It is pretty easy - the only thing to watch out for is to make sure to search on "Emgu.CV" and get the files from Emgu Corporation. There are other libraries in NuGet with names like "EmguCV" which do not seem to be the official libraries. It is not at all certain what they are or what you might get if you use those ones.
Prism1 is pretty much the "Hello World" of EmguCV applications and is about as simple as this sort of EmguCV app can get. All Prism1 does is enable the user to pick a static image file (usually .jpg) off the disk and display it on the screen. The Mat() object was used as the image carrier for the display and detection process.
An option provides the ability to detect circles or squares in the image and a second option provides the ability to select which color is detected. The center of a detected object is marked with a black cross. As mentioned previously, no attempt was made to tune the detection algorithm - it is pretty much just clipped right out of the sample code and the images it was tested on were very "clean".
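The detection step described above can be sketched roughly as follows. This is a minimal sketch and not the actual Prism1 source: the file names, the color thresholds, the Hough parameters and the FindRedCircles helper are all assumptions chosen for illustration. It isolates the strongly red pixels with an in-range mask, looks for circles in that mask and marks each center with a black cross.

```csharp
using System;
using System.Drawing;
using Emgu.CV;
using Emgu.CV.CvEnum;
using Emgu.CV.Structure;

static class CircleSketch
{
    // Hypothetical helper: returns the circles found among the strongly red pixels of a BGR image.
    public static CircleF[] FindRedCircles(Mat bgrImage)
    {
        using (Mat mask = new Mat())
        using (ScalarArray lower = new ScalarArray(new MCvScalar(0, 0, 200)))   // B, G, R lower bound (assumed)
        using (ScalarArray upper = new ScalarArray(new MCvScalar(60, 60, 255))) // B, G, R upper bound (assumed)
        {
            // Pixels inside the range become 255 in the single channel mask, all others become 0
            CvInvoke.InRange(bgrImage, lower, upper, mask);
            // A light blur softens the mask edges before the Hough transform runs
            CvInvoke.GaussianBlur(mask, mask, new Size(9, 9), 2);
            // dp = 1, minDist = 20, Canny threshold = 100, accumulator threshold = 20 (all assumed, untuned)
            return CvInvoke.HoughCircles(mask, HoughModes.Gradient, 1.0, 20.0, 100.0, 20.0);
        }
    }

    static void Main()
    {
        // Hypothetical input file inside the data directory the Prism apps use
        using (Mat image = CvInvoke.Imread(@"C:\Dump\PrismData\RedCircle.jpg"))
        {
            if (image.IsEmpty)
            {
                Console.WriteLine("Could not read the input image");
                return;
            }
            foreach (CircleF circle in FindRedCircles(image))
            {
                Point center = Point.Round(circle.Center);
                // Mark the center of each detected circle with a black cross
                CvInvoke.DrawMarker(image, center, new MCvScalar(0, 0, 0), MarkerTypes.Cross, 20, 2);
                Console.WriteLine("Circle at " + center + ", radius " + circle.Radius);
            }
            CvInvoke.Imwrite(@"C:\Dump\PrismData\RedCircle_marked.jpg", image);
        }
    }
}
```

The thresholds leave some slack around pure red because JPEG compression rarely preserves exact primary colors. Detecting blue or green objects would only need different bounds.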
Prism2 is the same as Prism1 except that the image carrier is the older, and deprecated, Image<TColor, TDepth>() object. This was done to get a feel for the capabilities of the Image<> object. All the options of the Prism1 application are available in Prism2 and the same caveats regarding the lack of "tuning" on the recognition algorithm apply.
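To give a feel for how the Image<> carrier differs from Mat(), here is a small sketch of a color mask built with the typed Image<Bgr, byte> API. It is not taken from the Prism2 source: the CountRedPixels helper, the thresholds and the synthetic test image are assumptions made for illustration.

```csharp
using System;
using Emgu.CV;
using Emgu.CV.Structure;

static class ImageCarrierSketch
{
    // Hypothetical helper: counts the strongly red pixels using the Image<,> API.
    public static int CountRedPixels(Image<Bgr, byte> image)
    {
        // InRange on an Image<,> takes typed colors and returns a single channel mask
        using (Image<Gray, byte> mask = image.InRange(new Bgr(0, 0, 200), new Bgr(60, 60, 255)))
        {
            return CvInvoke.CountNonZero(mask);
        }
    }

    static void Main()
    {
        // Synthetic test image: white background with a 20 x 20 block of pure red pixels
        using (Image<Bgr, byte> image = new Image<Bgr, byte>(100, 100, new Bgr(255, 255, 255)))
        {
            for (int y = 10; y < 30; y++)
                for (int x = 10; x < 30; x++)
                    image[y, x] = new Bgr(0, 0, 255); // the indexer is [row, column]

            Console.WriteLine("Red pixels: " + CountRedPixels(image));

            // The two carriers convert into each other, so code can mix them where needed
            Mat asMat = image.Mat;
            using (Image<Bgr, byte> roundTrip = asMat.ToImage<Bgr, byte>())
            {
                Console.WriteLine(roundTrip.Width + " x " + roundTrip.Height);
            }
        }
    }
}
```

The typed colors (Bgr, Gray) are the main convenience of Image<>. The Mat() version of the same mask needs ScalarArray bounds and an explicit output object.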
Prism3 is an odd one. Once Prism1 and Prism2 were completed I became interested in the performance of the various methods of accessing a pixel in Mat() and Image<> objects. After digging about for a while I put every access method I could find into the Prism3 app for a side by side comparison. Later this app was expanded to include things like the performance of bulk copies in and out of the objects and setting the object to solid colors. Of course, the speeds reported by these tests are all relative to my PC - but you can easily run the app yourself if you are interested in the relative performance benefits of various access methods.
The Prism3Report.txt file contains a copy of the output from the complete list of tests in case you would like to look at it without downloading and running the Prism3 app on your system.
If you see any access methods or tests I missed, by all means make the appropriate changes and submit a Pull Request to the Git Repo. Or you can contact me with the details. It would be generally useful to turn Prism3 into a generic "Complete List of EmguCV Access Techniques" reference.
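As an illustration of the kind of comparison Prism3 makes, the sketch below times three ways of reading the blue channel of every pixel. It is not the Prism3 code: the helper names and the 640 x 480 test image are assumptions, and only three of the many possible access methods are shown.

```csharp
using System;
using System.Diagnostics;
using System.Runtime.InteropServices;
using Emgu.CV;
using Emgu.CV.Structure;

static class PixelAccessSketch
{
    // Reads through the Image<,> indexer, which builds a Bgr value for every pixel
    public static long SumBlueViaIndexer(Image<Bgr, byte> image)
    {
        long sum = 0;
        for (int y = 0; y < image.Height; y++)
            for (int x = 0; x < image.Width; x++)
                sum += (long)image[y, x].Blue;
        return sum;
    }

    // Reads through the managed Data array, indexed as [row, column, channel]
    public static long SumBlueViaData(Image<Bgr, byte> image)
    {
        byte[,,] data = image.Data;
        long sum = 0;
        for (int y = 0; y < image.Height; y++)
            for (int x = 0; x < image.Width; x++)
                sum += data[y, x, 0];
        return sum;
    }

    // Reads an 8 bit, 3 channel Mat by copying each row into a managed buffer
    public static long SumBlueViaMarshalCopy(Mat mat)
    {
        byte[] row = new byte[mat.Cols * 3];
        long sum = 0;
        for (int y = 0; y < mat.Rows; y++)
        {
            // Step is the number of bytes per row, which may include padding
            Marshal.Copy(IntPtr.Add(mat.DataPointer, y * mat.Step), row, 0, row.Length);
            for (int x = 0; x < row.Length; x += 3)
                sum += row[x];
        }
        return sum;
    }

    static void Time(string label, Func<long> test)
    {
        Stopwatch watch = Stopwatch.StartNew();
        long sum = test();
        watch.Stop();
        Console.WriteLine(label + ": sum=" + sum + ", " + watch.Elapsed.TotalMilliseconds + " ms");
    }

    static void Main()
    {
        using (Image<Bgr, byte> image = new Image<Bgr, byte>(640, 480, new Bgr(10, 20, 30)))
        {
            Mat mat = image.Mat; // a Mat view of the same pixels
            Time("Image<> indexer", () => SumBlueViaIndexer(image));
            Time("Image<>.Data   ", () => SumBlueViaData(image));
            Time("Mat + Marshal  ", () => SumBlueViaMarshalCopy(mat));
        }
    }
}
```

All three methods should report the same sum, which is a handy sanity check before comparing the timings. A single pass like this is noisy, so a real comparison would repeat each test many times.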
Prism4 displays a webcam stream in a window on the form. This application is just a minimally modified version of the CameraCapture example in the Emgu.CV.Example solution available from the EmguCV GitHub repo. I could not get the Emgu.CV.Example.sln to compile. It contains a lot of other things besides the CameraCapture solution and there were many, many compile errors from missing dependencies.
So, this app is just the CameraCapture code clipped out of those examples and set up as a standalone solution. It does compile and run, and it found the only webcam without issue. It is very slow to start (maybe 10-15 seconds) but runs without lag once started.
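For readers who just want to see frames arriving from a webcam, a minimal sketch follows. It is an assumption-laden simplification and not the Prism4 code: Prism4 is a Windows Form built from the CameraCapture example, whereas this sketch is a console program that polls the camera and uses the simple OpenCV display window.

```csharp
using System;
using Emgu.CV;

static class WebcamSketch
{
    static void Main()
    {
        // Camera index 0 is the first webcam the system reports
        using (VideoCapture capture = new VideoCapture(0))
        {
            if (!capture.IsOpened)
            {
                Console.WriteLine("No webcam found");
                return;
            }
            while (true)
            {
                // QueryFrame grabs and decodes the next frame
                using (Mat frame = capture.QueryFrame())
                {
                    if (frame == null || frame.IsEmpty) break;
                    CvInvoke.Imshow("Webcam", frame);
                }
                // WaitKey services the window and returns the key code; 27 is Escape
                if (CvInvoke.WaitKey(1) == 27) break;
            }
            CvInvoke.DestroyAllWindows();
        }
    }
}
```

Polling in a loop keeps the sketch short. A Windows Forms application would more likely react to frames as they arrive, so that the user interface thread is not blocked.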
Prism5 expands on Prism4 and adds the ability to save the webcam stream to an mp4 file. In addition, the user can choose to recognise solid circles of various colors and those circles, when identified, will have their centers marked with a small black cross. As an experiment, the ability to set various other image parameters (such as the frame size and format) has also been provided - this also includes the type of backend the VideoCapture() object will use. Basically this application was intended to provide me with a reference implementation of image recognition in EmguCV with the additional ability to save the stream to disk. Only webcam 0 is displayed - however this choice is trivial to change in the code.
If you choose to use the Windows Media Foundation backend, this code is very slow to start (maybe 30-60 seconds). There does not seem to be much lag after that unless the larger frame sizes are chosen. When using DirectShow, the application displays the webcam image immediately but it is not possible for EmguCV to save the webcam stream to the disk using that backend.
Please note that the same caveats as stated previously regarding the lack of "tuning" on the recognition algorithm apply.
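The backend choice and the save-to-disk step described above can be sketched as follows. This is not the Prism5 source. It assumes the Emgu.CV 4.x method names (Set and Get on VideoCapture), and the output file name, the frame size, the 30 frames per second rate, the 'mp4v' codec code and the 150 frame limit are all illustrative assumptions.

```csharp
using System;
using System.Drawing;
using Emgu.CV;
using Emgu.CV.CvEnum;

static class RecordSketch
{
    static void Main()
    {
        // Msmf selects Windows Media Foundation; VideoCapture.API.DShow would select DirectShow
        using (VideoCapture capture = new VideoCapture(0, VideoCapture.API.Msmf))
        {
            if (!capture.IsOpened)
            {
                Console.WriteLine("No webcam found");
                return;
            }

            // Request a frame size; the driver may substitute another, so read the values back
            capture.Set(CapProp.FrameWidth, 640);
            capture.Set(CapProp.FrameHeight, 480);
            Size size = new Size((int)capture.Get(CapProp.FrameWidth),
                                 (int)capture.Get(CapProp.FrameHeight));

            int fourcc = VideoWriter.Fourcc('m', 'p', '4', 'v');
            using (VideoWriter writer = new VideoWriter(@"C:\Dump\PrismData\capture.mp4", fourcc, 30.0, size, true))
            {
                // Record a fixed number of frames (about five seconds at 30 frames per second)
                for (int i = 0; i < 150; i++)
                {
                    using (Mat frame = capture.QueryFrame())
                    {
                        if (frame == null || frame.IsEmpty) break;
                        writer.Write(frame);
                    }
                }
            }
        }
    }
}
```

The size handed to the VideoWriter is read back from the capture object rather than assumed, since a writer given a frame size that differs from the actual frames may produce an unplayable file.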
Prism6 is a Windows Media Foundation application which contains the same functionality as Prism5. Whereas Prism5 uses WMF behind the scenes in an extremely opaque manner, Prism6 provides a full Windows Media Foundation pipeline implementation. In other words, Prism5 is an EmguCV application which uses WMF to display and record the video stream and Prism6 is a WMF application which uses EmguCV calls only for purposes of image recognition. Like Prism5, Prism6 displays the video in realtime and can record the stream to disk if needed.
Please be aware that Prism6 is not a simple application. However, it does not seem to suffer from any of the startup delay which plagues the Prism5 implementation. Since it is a full WMF implementation, the programmer has considerably more control over the video parameters and the processing of the frames as they move through the system. For example, the application interrogates the webcams on the system (all of them) and provides a list of the video options and formats they offer. The user can just pick the one they prefer from a "known good" list rather than just guessing and hoping for the best as is necessary in Prism5.
The Prism6 application uses EmguCV in the MFTDetectCircles_Sync transform in order to detect and mark solid red circles. The video frame will be in RGBA format when it arrives in the transform regardless of the format chosen from the webcam. Once in the transform, the frame will be converted into a Mat() object and run through the image recognition code. This image recognition code is pretty much the same as that used in all of the other Prism apps. Again, please remember the caveat that no "tuning" has been applied to the image recognition algorithm - the intent was to demonstrate the integration of EmguCV and Windows Media Foundation and not to provide a robust image recognition app that would recognise circles in all sorts of environments.
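The conversion of a raw frame buffer into a Mat() object can be sketched as below. This is an illustration under stated assumptions and not the MFTDetectCircles_Sync code: the RgbaBufferToBgrMat helper is hypothetical, the buffer is assumed to be tightly packed at 4 bytes per pixel, and the channel order is assumed to be R, G, B, A. If the buffer were actually laid out as B, G, R, A then ColorConversion.Bgra2Bgr would be the conversion to use.

```csharp
using System;
using System.Runtime.InteropServices;
using Emgu.CV;
using Emgu.CV.CvEnum;

static class FrameBufferSketch
{
    // Hypothetical helper: converts a tightly packed 4 byte per pixel frame into a 3 channel BGR Mat.
    public static Mat RgbaBufferToBgrMat(byte[] buffer, int width, int height)
    {
        // Pin the managed buffer so native code can read it at a fixed address
        GCHandle handle = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        try
        {
            // This Mat is only a header over the pinned buffer; no pixel data is copied here
            using (Mat rgba = new Mat(height, width, DepthType.Cv8U, 4, handle.AddrOfPinnedObject(), width * 4))
            {
                Mat bgr = new Mat();
                // CvtColor allocates bgr and writes a converted copy, so bgr remains valid after the unpin
                CvInvoke.CvtColor(rgba, bgr, ColorConversion.Rgba2Bgr);
                return bgr;
            }
        }
        finally
        {
            handle.Free();
        }
    }

    static void Main()
    {
        // A 2 x 1 test frame: one red pixel and one blue pixel in R, G, B, A order
        byte[] frame = { 255, 0, 0, 255, 0, 0, 255, 255 };
        using (Mat bgr = RgbaBufferToBgrMat(frame, 2, 1))
        {
            Console.WriteLine(bgr.Cols + " x " + bgr.Rows + ", channels: " + bgr.NumberOfChannels);
        }
    }
}
```

Wrapping the buffer in a Mat header avoids one copy of the frame, which matters when the transform runs on every frame of a live stream. The resulting BGR Mat can then go through the same kind of recognition code as in the other Prism apps.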
Writing a Windows Media Foundation app in C# is a big, big subject and space prohibits any discussion of it here. There is a free 350 page book on the topic (in pdf format) available at http://www.ofitselfso.com/Tanta/Windows_Media_Foundation_Getting_Started_CSharp.pdf and the writeup on the Tanta Project provides much more information and examples.
The Prism6 application was also used as the basis for the Walnut Server software which forms the PC side of a Server-Client pair (the client is on the Beaglebone Black). Walnut is the software component controlling servos and waldos used in the FPath project which explores the top down approach of the Feynman Path to Nanotechnology.
The Prism Sample Applications are open source and released under the MIT License. You can download, clone or fork the Prism Sample Projects at the following address:
https://github.com/OfItselfSo/Prism
Note that all of the Prism applications expect to read from, and write to, a directory named C:\Dump\PrismData. This can be changed in the code, of course, but if you just want to run the apps you can save yourself some typing by creating the C:\Dump\ directory and copying the PrismData directory from the Git Repo over into it.
The contents of this web page are provided "as is" without any warranty of any kind and without any claim to accuracy. Please be aware that the information provided may be out-of-date, incomplete, erroneous or simply unsuitable for your purposes. Any use you make of the information is entirely at your discretion and any consequences of that use are entirely your responsibility. All source code is provided under the terms of the MIT License.
The icon used for the Prism apps is from the lovely Underwater Icons set by TurboMilk.