Some people seriously believe that the webcam built into their laptop can spy
on them, and they are wary of it. Some are so afraid of being watched that they
even tape over the device's watchful eye. And they may not be doing it in vain.
We'll show you how to take control of a laptop's built-in webcam and use it for
peaceful purposes, and for some not so peaceful ones too.
Implementation: first annoying troubles
I was surprised and upset to learn that the great and mighty .NET Framework
offers no simple way to work with a webcam. In .NET v4 the situation has
improved a little (Silverlight projects got some relevant classes), but I had
no time to test it, because I started writing the code examples for this article
before the official release of VS2010 and .NET v4.
Almost desperate, I buried myself in Google. All I found were links to MSDN and
to DirectDraw technology. I even tried to knock together a simple application,
but with no DirectDraw experience I only opened a can of worms. The application
did get written, but I never managed to find and fix all the bugs in it.
Growing even more desperate, I started browsing our Western friends' web
resources. After studying a few dozen links I dug up a lot of goodies, among
them various sample applications and short articles (Americans don't like to
write a lot). I even found a working DirectDraw-based sample, but I was
horrified when I saw its source code: it was very hard to understand. So I
decided not to bother with it and to look for an easier way. I had barely said
goodbye to that first DirectDraw sample when another one caught my eye. Its
author had written a whole library for handling webcams and other video capture
devices on top of VFW (Video for Windows).
Unfortunately, the project (I mean the library) had been stripped to the bare
minimum. All it could do was display the webcam picture. It offered neither
single-frame capture nor video recording, nor any other useful feature.
Nevertheless, my subconscious firmly told me that this project was what I was
looking for. As soon as I glanced through its code, I saw the names of some
familiar Windows messages and equally familiar WinAPI functions. Once upon a
time I had to write a Delphi application that worked with a webcam, and that
was when I first met these functions.
A single PC or laptop can have several webcams connected at the same time. An
example is not far to seek: at work I often have to set up simple
videoconferences. They usually involve two people, each filmed by a separate
camera, and both cameras are connected to my PC. When I start shooting, I use
special software to pick the camera I need at that moment. Since we have decided
to take the webcam under our control, we have to figure out how to get the list
of video capture devices installed in the system and choose the one to work
with.
The Windows API provides the capGetDriverDescription() function for this simple
task. It takes five parameters:
- wDriverIndex – index of the capture driver, from 0 to 9;
- lpszName – pointer to a buffer that receives the driver name;
- cbName – size of the lpszName buffer (in bytes);
- lpszVer – pointer to a buffer that receives the description of the driver;
- cbVer – size of the lpszVer buffer (in bytes).
The function returns TRUE on success. Now that we have its description, let's
see how to declare it in C#. This can be done as follows:
[DllImport("avicap32.dll")]
protected static extern bool capGetDriverDescriptionA(
    short wDriverIndex,
    [MarshalAs(UnmanagedType.VBByRefStr)] ref String lpszName, int cbName,
    [MarshalAs(UnmanagedType.VBByRefStr)] ref String lpszVer, int cbVer);
Note that before the function declaration you must specify the name of the DLL
that contains it. In our case it's avicap32.dll.
So, the function is imported, and now we can write a class that uses it. I'm not
going to show the whole class, only its key method:
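public static Device[] GetAllCapturesDevices()
{
    String dName = "".PadRight(100);
    String dVersion = "".PadRight(100);
    List<Device> devices = new List<Device>();
    for (short i = 0; i < 10; i++) // driver indexes range from 0 to 9
    {
        if (capGetDriverDescriptionA(i, ref dName, 100, ref dVersion, 100))
        {
            Device d = new Device(i);
            d.Name = dName.Trim();
            d.Version = dVersion.Trim();
            devices.Add(d);
        }
    }
    return devices.ToArray();
}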
The source code looks like child's play. The most interesting part is the loop
that calls the capGetDriverDescription function mentioned above. MSDN tells us
that the index (the first parameter of capGetDriverDescription()) can range from
0 to 9, so the loop deliberately covers exactly this range. The method returns
an array of Device objects (I defined this class myself; see the accompanying
source code).
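The same enumeration can also be sketched with StringBuilder buffers, which the
marshaller fills in and cuts at the terminating null, so no padding or trimming
is needed. This is only an illustration and not the library's code: the
CaptureDriverList class, the DescribeDriver delegate and the int index type are
my own assumptions. The delegate is there so that the loop can be tried out
without a camera attached.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;
using System.Text;

// Hypothetical helper, not part of the library described in the article.
public static class CaptureDriverList
{
    private const int MaxDrivers = 10;   // driver indexes range from 0 to 9
    private const int BufferSize = 100;  // same buffer size as in the article

    [DllImport("avicap32.dll", CharSet = CharSet.Ansi)]
    private static extern bool capGetDriverDescriptionA(
        int wDriverIndex, StringBuilder lpszName, int cbName,
        StringBuilder lpszVer, int cbVer);

    // Fills name and version for the given index; returns false if there is no such driver.
    public delegate bool DescribeDriver(int index, StringBuilder name, StringBuilder version);

    // Walks all possible indexes and collects "index: name (version)" lines.
    public static List<string> Enumerate(DescribeDriver describe)
    {
        var found = new List<string>();
        for (int i = 0; i < MaxDrivers; i++)
        {
            var name = new StringBuilder(BufferSize);
            var version = new StringBuilder(BufferSize);
            if (describe(i, name, version))
            {
                found.Add(i + ": " + name + " (" + version + ")");
            }
        }
        return found;
    }

    // Windows only: asks VFW for the drivers that are really installed.
    public static List<string> EnumerateInstalled()
    {
        return Enumerate((i, name, version) =>
            capGetDriverDescriptionA(i, name, name.Capacity, version, version.Capacity));
    }
}
```

On a Windows machine, CaptureDriverList.EnumerateInstalled() should return one
line per installed capture driver; an empty list means VFW sees no devices.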
Once we have the device list, we need to display the camera's video stream. The
capCreateCaptureWindow() function helps here by creating a capture window.
Jumping a little ahead, I'll say that all further work with the camera comes
down to plain message passing to the capture window. Yes, we'll have to use the
SendMessage() function, painfully familiar to every Windows programmer.
Now let's take a closer look at the capCreateCaptureWindow() function. It takes
eight arguments:
- lpszWindowName – null-terminated string containing the name of the capture
window;
- dwStyle – window style;
- x – X coordinate;
- y – Y coordinate;
- nWidth – window width;
- nHeight – window height;
- hWnd – parent window handle;
- nID – window ID.
The function returns a handle to the created window, or NULL on error. It also
belongs to WinAPI, so it has to be imported too. I won't show the import code,
because it's almost identical to the one I wrote for capGetDriverDescription().
Let's look at the camera initialization code instead:
// Create the capture window as a visible child of the parent window
deviceHandle = capCreateCaptureWindowA(ref deviceIndex, WS_VISIBLE | WS_CHILD,
    0, 0, windowWidth, windowHeight, handle, 0);

// Try to connect the capture window to the driver
if (SendMessage(deviceHandle, WM_CAP_DRIVER_CONNECT, this.index, 0) > 0)
{
    SendMessage(deviceHandle, WM_CAP_SET_SCALE, -1, 0);         // scale the picture to the window
    SendMessage(deviceHandle, WM_CAP_SET_PREVIEWRATE, 0x42, 0); // one frame every 66 ms
    SendMessage(deviceHandle, WM_CAP_SET_PREVIEW, -1, 0);       // turn preview on
    SetWindowPos(deviceHandle, 1, 0, 0, windowWidth, windowHeight, 6);
}
This code tries to send the WM_CAP_DRIVER_CONNECT message immediately after the
window is created. A nonzero result tells us that the connection succeeded.
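For reference, the imports behind this initialization code could be declared
roughly as follows. Treat it as a sketch under assumptions: the CaptureNative
class and its Connect and Disconnect helpers are hypothetical names, and the
window name is passed here as a plain string rather than by reference as in the
library. The DLL names and the message numbers are the standard ones.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical wrapper; only the imported functions and constants are real WinAPI.
public static class CaptureNative
{
    public const int WS_CHILD = 0x40000000;
    public const int WS_VISIBLE = 0x10000000;
    public const int WM_CAP_DRIVER_CONNECT = 0x40a;
    public const int WM_CAP_DRIVER_DISCONNECT = 0x40b;

    [DllImport("avicap32.dll", CharSet = CharSet.Ansi)]
    public static extern IntPtr capCreateCaptureWindowA(
        string lpszWindowName, int dwStyle, int x, int y,
        int nWidth, int nHeight, IntPtr hWndParent, int nID);

    [DllImport("user32.dll")]
    public static extern IntPtr SendMessage(IntPtr hWnd, int msg, IntPtr wParam, IntPtr lParam);

    [DllImport("user32.dll")]
    public static extern bool DestroyWindow(IntPtr hWnd);

    // Returns true when the driver with the given index accepted the connection.
    public static bool Connect(IntPtr captureWindow, int driverIndex)
    {
        IntPtr result = SendMessage(captureWindow, WM_CAP_DRIVER_CONNECT,
                                    new IntPtr(driverIndex), IntPtr.Zero);
        return result != IntPtr.Zero;
    }

    // Releases the driver and destroys the capture window (what a "Stop" button would do).
    public static void Disconnect(IntPtr captureWindow)
    {
        SendMessage(captureWindow, WM_CAP_DRIVER_DISCONNECT, IntPtr.Zero, IntPtr.Zero);
        DestroyWindow(captureWindow);
    }
}
```

Using IntPtr for handles and message parameters keeps the declarations valid in
both 32-bit and 64-bit processes, which is why I prefer it here to plain int.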
Now let's imagine that the gods are on our side today and go on to send several
more messages: WM_CAP_SET_SCALE, WM_CAP_SET_PREVIEWRATE and WM_CAP_SET_PREVIEW.
Alas, the story is the same as with the functions: C# knows nothing about these
constants, so you have to define them yourself. The list of all the necessary
constants, with comments, follows.
// Base value for capture window messages (WM_USER)
private const int WM_CAP = 0x400;
// Connect the capture window to a video capture driver
private const int WM_CAP_DRIVER_CONNECT = 0x40a;
// Disconnect the capture window from the video capture driver
private const int WM_CAP_DRIVER_DISCONNECT = 0x40b;
// Copy the current frame to the clipboard
private const int WM_CAP_EDIT_COPY = 0x41e;
// Preview mode on/off
private const int WM_CAP_SET_PREVIEW = 0x432;
// Overlay mode on/off
private const int WM_CAP_SET_OVERLAY = 0x433;
// Preview rate (milliseconds per frame)
private const int WM_CAP_SET_PREVIEWRATE = 0x434;
// Scaling of the preview picture to the window size on/off
private const int WM_CAP_SET_SCALE = 0x435;
// Window styles: child window, initially visible
private const int WS_CHILD = 0x40000000;
private const int WS_VISIBLE = 0x10000000;
// Set the callback function called for each preview frame
private const int WM_CAP_SET_CALLBACK_FRAME = 0x405;
// Grab a single frame from the video capture driver
private const int WM_CAP_GRAB_FRAME = 0x43c;
// Save the current frame to a file
private const int WM_CAP_SAVEDIB = 0x419;
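The hex values are less magical than they look: every capture message is the
base value 0x400 plus a small decimal offset. The sketch below writes the same
constants that way and adds a helper showing where the 0x42 preview rate comes
from (66 ms per frame, about 15 frames per second). The CapMessages class and
the PreviewIntervalMs method are hypothetical names of mine.

```csharp
using System;

// Hypothetical class; the offsets reproduce the hex values listed in the article.
public static class CapMessages
{
    public const int WM_CAP = 0x400; // base value (WM_USER)

    public const int WM_CAP_SET_CALLBACK_FRAME = WM_CAP + 5;
    public const int WM_CAP_DRIVER_CONNECT = WM_CAP + 10;
    public const int WM_CAP_DRIVER_DISCONNECT = WM_CAP + 11;
    public const int WM_CAP_SAVEDIB = WM_CAP + 25;
    public const int WM_CAP_EDIT_COPY = WM_CAP + 30;
    public const int WM_CAP_SET_PREVIEW = WM_CAP + 50;
    public const int WM_CAP_SET_OVERLAY = WM_CAP + 51;
    public const int WM_CAP_SET_PREVIEWRATE = WM_CAP + 52;
    public const int WM_CAP_SET_SCALE = WM_CAP + 53;
    public const int WM_CAP_GRAB_FRAME = WM_CAP + 60;

    // Converts a desired frame rate into the millisecond interval
    // that WM_CAP_SET_PREVIEWRATE expects.
    public static int PreviewIntervalMs(int framesPerSecond)
    {
        if (framesPerSecond <= 0)
        {
            throw new ArgumentOutOfRangeException("framesPerSecond");
        }
        return 1000 / framesPerSecond;
    }
}
```

For example, CapMessages.PreviewIntervalMs(15) gives 66, which is exactly the
0x42 passed to WM_CAP_SET_PREVIEWRATE in the initialization code.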
I'll omit the rest of the class description, since its basic structure has been
covered. Everything else is easy to figure out from my well-commented source
code. The only thing I don't want to leave behind the scenes is an example of
how to use the library.
In total, I implemented a handful of methods in this library: GetAllDevices
(already discussed), GetDevice (gets a video capture device by its index),
ShowWindow (displays the webcam video stream), GetFrame (captures a single
frame to an image file) and GetCapture (captures the video stream).
To demonstrate the library at work, I wrote a small application. It uses one
ComboBox (which holds the list of available video capture devices) and a few
buttons: "Refresh", "Start", "Stop" and "Screenshot". Ah, yes, there is also an
Image component, which displays the camera's video stream.
We'll start with the "Refresh" button. It gets a fresh list of all installed
video capture devices. The event handler source code:
Device[] devices = DeviceManager.GetAllDevices();
foreach (Device d in devices)
{
    cmbDevices.Items.Add(d);
}
Looks simple, doesn't it? We simply enjoy object-oriented programming while the
library does all the dirty work. The code that displays the camera's video
stream is even simpler:
Device selectedDevice = DeviceManager.GetDevice(cmbDevices.SelectedIndex);
Again, a piece of cake. Now let's take a look at the "Screenshot" button handler:
Device selectedDevice = DeviceManager.GetDevice(cmbDevices.SelectedIndex);
I won't dwell on the FrameGrabber() method. In my source code, calling it saves
the current frame straight to the root of the system drive. Of course, that's
not how it should be, so don't forget to make the necessary changes before using
the application "in the field".
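One possible change is to build a timestamped file name inside the user's
Pictures folder instead of writing to the drive root. The sketch below only
builds the path; the FramePath class and the "frame_" prefix are my own
assumptions, and passing the result to the library is left to you.

```csharp
using System;
using System.Globalization;
using System.IO;

// Hypothetical helper for choosing where a captured frame is saved.
public static class FramePath
{
    // Builds "<directory>/frame_yyyyMMdd_HHmmss.bmp" for the given moment.
    public static string Build(string directory, DateTime takenAt)
    {
        string stamp = takenAt.ToString("yyyyMMdd_HHmmss", CultureInfo.InvariantCulture);
        return Path.Combine(directory, "frame_" + stamp + ".bmp");
    }

    // Same as Build, but targets the current user's Pictures folder.
    public static string BuildInPictures(DateTime takenAt)
    {
        string pictures = Environment.GetFolderPath(Environment.SpecialFolder.MyPictures);
        return Build(pictures, takenAt);
    }
}
```

A call such as FramePath.BuildInPictures(DateTime.Now) gives every screenshot
its own file, so an earlier frame is not overwritten by the next one unless two
shots fall within the same second.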
Now it's time to talk about how to build a simple but reliable video
surveillance system. Such systems are typically based on two algorithms:
two-frame differencing and simple background modeling. Implementing them from
scratch takes quite a lot of code, so at the last moment I decided to take an
easier route: the powerful but still little-known AForge.NET framework for .NET.
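To give a feeling for the second algorithm, here is a minimal sketch of simple
background modeling on grayscale frames stored as byte arrays. It is a generic
textbook variant written for this illustration and not the AForge.NET
implementation: the BackgroundModel class, the one-step-per-frame update rule
and the threshold parameter are all my assumptions.

```csharp
using System;

// Hypothetical illustration of simple background modeling on grayscale frames.
public class BackgroundModel
{
    private readonly byte[] background;
    private readonly int threshold;

    public BackgroundModel(byte[] firstFrame, int threshold)
    {
        background = (byte[])firstFrame.Clone(); // the first frame becomes the background
        this.threshold = threshold;
    }

    // Returns the fraction of pixels that differ from the background by more
    // than the threshold, then moves the background one step toward the frame.
    public float ProcessFrame(byte[] frame)
    {
        if (frame.Length != background.Length)
        {
            throw new ArgumentException("Frame size does not match the background.");
        }
        int changed = 0;
        for (int i = 0; i < frame.Length; i++)
        {
            int diff = frame[i] - background[i];
            if (Math.Abs(diff) > threshold)
            {
                changed++;
            }
            // Slowly adapt, so that a lasting change eventually becomes background.
            if (diff > 0) background[i]++;
            else if (diff < 0) background[i]--;
        }
        return frame.Length == 0 ? 0f : (float)changed / frame.Length;
    }
}
```

The slow update is the whole point of the design: an object that enters the
scene triggers motion, while a lamp that stays switched on gradually blends
into the background and stops raising alarms.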
AForge.NET is intended primarily for developers and researchers. It greatly
simplifies projects in the following areas: neural networks, image processing
(filtering, editing, per-pixel operations, resizing and rotation), genetic
algorithms, robotics, interaction with video devices, and so on. AForge.NET
comes with good documentation that describes everything about the product;
take the time to read it thoroughly. I'd especially like to mention the quality
of the source code: digging through it is a real pleasure.
Now back to our immediate problem. Frankly, with this framework it is as easy
as two times two. "Then why did you rack my brain with all those WinAPI
functions?" you may ask with displeasure. Just so that you are not limited in
anything. As you know, projects differ: in one case it's more convenient to use
.NET, while in another it's easier to get by with good old WinAPI.
Let's return to our problem. To implement a motion detector, we'll use the
MotionDetector class of the framework mentioned above. The class works well with
Bitmap objects and quickly calculates how much two images differ. A source code
example:
MotionDetector detector = new MotionDetector(
    new TwoFramesDifferenceDetector( ),
    new MotionAreaHighlighting( ) );

// Next frame processing
if ( detector != null )
{
    float motionLevel = detector.ProcessFrame( image );
    if ( motionLevel > motionAlarmLevel )
    {
        // Number of timer ticks in two seconds
        flash = (int) ( 2 * ( 1000 / alarmTimer.Interval ) );
    }
    if ( detector.MotionProcessingAlgorithm is BlobCountingObjectsProcessing )
    {
        BlobCountingObjectsProcessing countingDetector =
            (BlobCountingObjectsProcessing) detector.MotionProcessingAlgorithm;
        objectsCountLabel.Text = "Objects: " + countingDetector.ObjectsCount.ToString( );
    }
    else
    {
        objectsCountLabel.Text = "";
    }
}
The code above (apart from the initialization of the MotionDetector class) runs
every time a new frame arrives from the webcam. Once we have a frame, a plain
comparison follows (the ProcessFrame method). If the motionLevel value is
greater than motionAlarmLevel (0.015f), it's time to sound the alarm: motion has
been detected. One of the screenshots clearly shows the motion detector at
work.
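To make the 0.015f threshold more tangible, here is a sketch of two-frame
differencing written by hand, assuming that the motion level is the fraction of
pixels whose brightness changed noticeably between two grayscale frames. The
FrameDifference class, its method names and the per-pixel threshold are
hypothetical; the framework's own detector is more elaborate.

```csharp
using System;

// Hypothetical illustration of two-frame differencing on grayscale frames.
public static class FrameDifference
{
    // Fraction (0..1) of pixels whose brightness changed by more than pixelThreshold.
    public static float MotionLevel(byte[] previous, byte[] current, int pixelThreshold)
    {
        if (previous.Length != current.Length)
        {
            throw new ArgumentException("Frames must have the same size.");
        }
        if (current.Length == 0)
        {
            return 0f;
        }
        int changed = 0;
        for (int i = 0; i < current.Length; i++)
        {
            if (Math.Abs(current[i] - previous[i]) > pixelThreshold)
            {
                changed++;
            }
        }
        return (float)changed / current.Length;
    }

    // Same comparison as in the article: alarm when the level exceeds the limit.
    public static bool IsAlarm(float motionLevel, float motionAlarmLevel)
    {
        return motionLevel > motionAlarmLevel;
    }
}
```

With an alarm level of 0.015f, a 640x480 frame raises the alarm once more than
1.5% of its pixels, that is more than 4608 of them, have changed.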
Any webcam can easily be adapted for face recognition and for an advanced
system logon. If after reading all this material you think it's difficult,
you're completely wrong! In late March an example (and later a link to the
article) appeared on http://codeplex.com (Microsoft's hosting for open source
projects) that demonstrates a webcam face detection application. The example
relies on the new capabilities of .NET and Silverlight. It can't realistically
be reviewed within a single magazine article, because the author of the source
code tried to make everything as elegant as possible. It contains
image-processing algorithms (blur filter, noise reduction, pixel-by-pixel
comparison, stretching, etc.), a demonstration of the new Silverlight features
and much more. In other words, it earns the "must use" label without a doubt!
See the project and article links below.
All the examples reviewed in this article will serve you as a good starting
point. On their basis it's easy to create a professional webcam tool and earn a
few hundred bucks a quarter selling it, or to write some nasty, creepy spy
Trojan.
Remember the story about backing up Skype conversations. It said that the time
of keyloggers has already passed and that audio and video data is now the
hottest commodity. Considering that a webcam is a mandatory attribute of any
laptop nowadays, it's easy to imagine how many interesting videos you could
shoot by slipping this kind of "useful program" to your victim... But, anyway, I
told you nothing about that, did I? :) Good luck with your programming! And
remember, if you have any questions, feel free to ask me.
"Silverlight 4 real-time Face Detection" Russian version.
http://facelight.codeplex.com/ – the "Facelight" project is hosted here. It
performs real-time face recognition. If you're going to write serious software
for person identification or system logon, you simply must check out this
project.
http://www.aforgenet.com/framework/ – AForge.NET, an excellent and easy-to-use
framework for video and image processing.
http://vr-online.ru – all the source code examples and a lot more information
can be found on the VR-Online project web site.