Matrix Multiplication + OpenCL = ???

I’ve been trying to learn OpenCL lately and was presented with an excellent opportunity to practice it. I was given a simple homework assignment to write a matrix multiplication program. I did, and now I plan to rewrite it with OpenCL. I’m going to clock its runtime before and after optimization and plot the results on a graph. So stay tuned…

You can get all of my code at my git repository.

Once you copy all of the files to your own program directory, just create a build folder, move into it and run CMake from there.

OpenCL optimization coming soon…


OpenCL on ATI Graphics Card in Ubuntu

I’m working on optimizing some OpenCV algorithms with OpenCL. I have an ATI graphics card and run Ubuntu 10.04. To familiarize myself with OpenCL I’m following enja’s Adventures in OpenCL tutorial. Unfortunately before I can get familiar with OpenCL, I have to install it! This was not terribly easy to figure out. Now that it’s done though, it’s a piece of cake. So I’m sharing what I had to do to get OpenCL running on my Ubuntu+ATI system.

Before OpenCL can utilize your graphics card, you need a driver which can give you that control. The new ATI Catalyst 10.7 driver is running great on my system. Don’t download it from there though; it’s a real pain to install. The script below takes care of it for you.

Ubuntu 10.04 – i386

wget http://mathnathan.com/wp-content/uploads/2010/08/aticatalist10_7_i386.txt

chmod +x aticatalist10_7_i386.txt

./aticatalist10_7_i386.txt

Ubuntu 10.04 – amd64

wget http://mathnathan.com/wp-content/uploads/2010/08/aticatalist10_7_amd64.txt

chmod +x aticatalist10_7_amd64.txt

./aticatalist10_7_amd64.txt

Once the new driver is installed, reboot and make sure it all went smoothly.

Next you need the ATI Stream SDK. This has all the OpenCL libraries. Once you’ve got it, extract it to a location of your choice using something like…

For 32 bit installation

cd /path/to/chosen/location
tar -xvzf /path/to/download/ati-stream-sdk-v2.2-lnx32.tgz

For 64 bit installation

cd /path/to/chosen/location
tar -xvzf /path/to/ati-stream-sdk-v2.2-lnx64.tgz

Now you need to set some environment variables. Add these to your .bashrc file.

32 bit Installation

vim ~/.bashrc

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH":/path/to/ati-stream-sdk-v2.2-lnx32/lib/x86/"
export LIBRARY_PATH=$LIBRARY_PATH":/path/to/ati-stream-sdk-v2.2-lnx32/lib/x86/"
export C_INCLUDE_PATH=$C_INCLUDE_PATH":/path/to/ati-stream-sdk-v2.2-lnx32/include/"

64 bit Installation

vim ~/.bashrc

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH":/path/to/ati-stream-sdk-v2.2-lnx64/lib/x86/"
export LIBRARY_PATH=$LIBRARY_PATH":/path/to/ati-stream-sdk-v2.2-lnx64/lib/x86/"
export C_INCLUDE_PATH=$C_INCLUDE_PATH":/path/to/ati-stream-sdk-v2.2-lnx64/include/"

Lastly you need to download the icd-registration.tgz file, and extract it to your /etc folder…

cd /
sudo tar xzf /path/to/icd-registration.tgz

Extracting it in the / directory automatically puts both atiocl32.icd and atiocl64.icd in the /etc/OpenCL/vendors directory.

That’s all there was to it for me. After this I was able to compile enja’s programs and follow the rest of his tutorials. Good luck!


It’s Lonely Over Here

Looking at everything through numbers and equations just makes more sense than trying to wade your way through some overly complicated man made emotional veil of confusion… However, I’m certain I’m not alone when I say I’ve made some bad decisions that were terribly enjoyable, and some very good decisions that were unbearable. I know dedicating myself to understanding math and science is a good thing… It satisfies me and fulfills my rational mind. But when I find myself back in reality again, I often just feel alone. I find it difficult to enjoy things that most people do. It’s rare that I can have a conversation where I completely relate with someone else and connect on an intellectual level. I know these “nerdy people” who do math and science are perceived to see the world in a different way than most, and though that may be true in some situations, I for one would still like to feel a sense of belonging.

So I’m curious how the many scholarly people in the mathematics/computer science world perceive their world around them. How do you see your role in society? Do you feel like you belong where you are?


A Logical Definition of Love

Logic and Love

We can all see patterns of behavior, but that is not to say that behavior follows patterns. This paper is to define a pattern that I have found in the ideas and actions which mankind has used the word love to describe. Any axioms and assumptions I make must be accepted as true for the majority of society.

This paper is most certainly not a rigorous proof; however, I still want to follow a logical flow, so I need a solid foundation to work from. My chosen axioms are based on what seem to me to be traits shared by our society as a whole, traits so common I’d say they’re ubiquitous and can support the rest of my contention. To state these axioms I must stereotype our culture, as must be done to discuss any large collection of things as a whole. Thus my coming argument can only be valid for the stereotyped portion of society to which these axioms apply.

With that said, I will use this as my pedestal…

 Axiom 1 - People want what they do not have.

Seems pretty straightforward. It’s a very widely used phrase which has given birth to many idioms and philosophical quotes – “The grass is always greener on the other side of the fence”, “Have one’s cake and eat it too”, etc.

I’m using this declaration in more of a metaphysical sense however. The word ‘have’ in my usage refers to a sense of intellectual ownership. To completely and thoroughly ‘understand’ a given topic (if that is even possible) instills a sense of possession, a custody of the intellectual property. Let’s say I have been taking derivatives since I was a fetus and now I ‘understand’ everything there is to know about derivatives and rates of change. Then in my mind I feel I can do with derivatives anything I please, just as I can do anything I please with this dead fish I’m holding in my hands. I ‘understand’ derivatives, just as much as I ‘have’ this fish.

So from here on I’m replacing the word ‘have’ with the word ‘understand’. I know this next claim may ruffle some feathers initially, but give me a moment to justify it.

 Axiom 2 - People want what they do not understand

This may seem off the wall at first, but it’s true. A colloquy might go as follows…

hooplehead - "A lot of people don't understand physics, and the lot of them
              hate it!"
mathnathan - "What do they hate?"
hooplehead - "Well, all of it! It's just confusing math, and Greek symbols."
mathnathan - "Do they hate televisions, and airplanes, and satellites?"
hooplehead - "Well no, of course not. But those aren't physics."
mathnathan - "Then what are they?"
hooplehead - "It's just technology, stuff that we use. Physics is the math
              and the equations behind it."
mathnathan - "Would you disagree with me if I said everything comes from
              something?"
hooplehead - "Well no. I see where you're going with this though, you
              think that because this technology came from physics, it
              is physics?"
mathnathan - "Not at all. I'm sure you'd agree however, that just as in
              religions and in our families, the creator, or those which
              make things possible deserve high reverence and
              respect, no?"
hooplehead - "Yes, that is true. Religions promote worship of their
              gods for giving them life, and children are taught to respect
              their parents and elders for making their lives possible
              as well. I see what you mean."
mathnathan - "Then you'd agree that people at least respect physics."
hooplehead - "Hmmm... Yes, this I agree with."
mathnathan - "Why do you suppose that is?"
hooplehead - "They respect what it can do."
mathnathan - "Physics has made a lot of things possible, it's quite powerful
              is it not?"
hooplehead - "It is..."
mathnathan - "Would you disagree that people as a whole, yearn for power?
              Whether for selfish, or righteous gains?"
hooplehead - "Hmmm, that may be debatable."
mathnathan - "Power brings possibility, and that can bring necessities.
              Even if indirectly or on a small scale people yearn for
              power to make things happen, for the betterment of
              society, or for the betterment of themselves.
              Either way, people do yearn for power."
hooplehead - "Okay, I'll give you that one."
mathnathan - "So if people want power, and physics is power, then people
              want to have physics."
hooplehead - "Well you can't have physics."
mathnathan - "Then they would want to be able to use its power, so they
              would want to understand physics."
hooplehead - "I suppose they would..."

What happens if people do understand something then? (2) has a truth value so I can apply some laws of propositional logic to it.

Since (2) is actually an implication, or the “if-then” logic structure it can be written as follows…

 Axiom 2 - If people do not understand it, then they want it.

Notice this says nothing about what happens if people do understand something. This states if they don’t understand it, they want it. That is all it says, so I need another axiom.

Again I chose to look at the world around me to find another trait riddled in our personalities, and characters.

Music… Everyone has a favorite song. We’ll listen to the song over and over and over. Eventually though, we know every word. We can hum every chord progression. We know the life story of the musicians. We heard the song performed live… And eventually we move onto a new song.

Books… Stories are an excellent example. There is so much mystery in the plot twists, or there is so much drama in the love triangles that we can’t put the books down. However, after reading the book 5 times, the appeal begins to die. We know exactly what is going to happen to Sally if she opens that door…

Movies… Plays… Art… After it is wholesomely digested we move on to something else.

Once we know something entirely, inside and out, then our desire to ‘have’, or ‘understand’ it dies. Which led me to my 3rd axiom.

 Axiom 3 - If people do understand it, then they do not want it.

I hope all of my pure mathematicians reading this noticed that (3) is the inverse implication of (2). Of course the irrational behavior of humans would satisfy this…

There are occasions which may look like exceptions. Humphrey likes this painting so much, that no matter how many times he looks at it, it fills him with euphoria. No matter how many times Gretchen listens to this song it still gives her chills. My theory can explain these anomalies, and I’ll touch on them again at the end.

Why did our interest in these physical things die? Their existence, and their complexities are finite. A book is a book is a book. We see a new book that intrigues us, and by (2) we’ve got to have that book. The book cannot change though, and once our minds have mastered its existence, by (3) we move on to another book. Our desires for things in the physical world follow a cyclical path. Our cycle from (2) -> (3) -> (2) -> (3) continues to spin, we just introduce new infatuations from the physical world to accommodate it.

People are not books… Nor are they songs, or movies, or sculptures. Though some people may look like they were sculpted out of stone, fortunately for us they were not. People have a constant ability to grow and learn and their inner complexities can become more and more complex with experience and knowledge. So let’s say we see a person who looks as if they were sculpted from stone and they appeal to us. Just as in finding an intriguing book, by (2) we want this person! We get close to this person, and begin to learn about them. The more we learn about them, the more they learn about us. They become more and more sophisticated with what we’ve contributed to them, and our contributions become more and more sophisticated with what they contribute to us. THIS cycle breaks our previous cycle, and (3) is never reached.

From the perspective of an entire planet the life of a living thing lasts no longer than the blink of an eye, and this cycle throughout our history has claimed the majority of many lives. We are no longer interested in trying to understand anything else. This struggle to understand someone matches the ideas and actions that our culture defines with the word love.

Which brings me to my definition of love.

 Theorem 1 - Love is the unending attempt to understand someone.

What about when people say they ‘love’ things, or ideas? This theorem leads to an intuitive corollary which can answer those as well.

Corollary 1 - To love something unchanging is the self-reflection of your
              changes due to that thing, leading to the unending attempt to
              understand yourself.

Corollary 1 encapsulates the idea of loving yourself. If you were in a moment wildly memorable, and a particular song was played, that song will stick with you for a long time. You’ll listen to it later on and remember the moment and the feelings that came along with it. Let’s say however that this moment was so impacting that it changed your life. This is like a semi-circle of the cycle of love. This moment contributed to the growth and development of you, so when reliving the moment, over and over, the change that this moment brought to you meets who you are now. So you can “relive” the change and this song will constantly be changing you. With that foundation, I say that through this self reflection you’re constantly trying to understand yourself.


5. Alpha Blending

Alpha Blending is a technique to merge/blend two images or ROI’s (Region Of Interest) together. It is done by taking two arrays and applying weights to each. With this, the AddWeighted() function calculates the new pixel values in the blended ROI. Here is a simple program I used to implement an alpha blend.

#include <stdlib.h>
#include <cv.h>
#include <highgui.h>

int main( int argc, char** argv ) {
	IplImage *src1, *src2;
	/* Expect eight arguments and two images that load successfully. */
	if( argc == 9
		&& (src1 = cvLoadImage(argv[1], 1)) != 0
		&& (src2 = cvLoadImage(argv[2], 1)) != 0 )
	{
		int x = atoi(argv[3]);
		int y = atoi(argv[4]);
		int width = atoi(argv[5]);
		int height = atoi(argv[6]);
		double alpha = atof(argv[7]);
		double beta = atof(argv[8]);
		/* Restrict both images to the same rectangle. */
		cvSetImageROI(src1, cvRect(x,y,width,height));
		cvSetImageROI(src2, cvRect(x,y,width,height));
		/* Blend src2 into src1, inside the ROI only. */
		cvAddWeighted(src1, alpha, src2, beta, 0.0, src1);
		/* Drop the ROI so the whole image is displayed. */
		cvResetImageROI(src1);
		cvNamedWindow( "Alpha_blend", 1 );
		cvShowImage( "Alpha_blend", src1 );
		cvWaitKey(0);
	}
	return 0;
}

Get the code from my Git Repository
If you need help with Git, follow this quick introduction – Getting Started With Git

This program takes in eight arguments…

argv[1] - The destination image.
argv[2] - Source image to be blended onto the destination image.
argv[3], argv[4] - The (x,y) coordinate of the ROI in both the
                   destination image and the source image to be blended.
argv[5], argv[6] - Width and Height from the point (x,y) of the ROI.
argv[7] - Alpha, the weight of argv[1].
argv[8] - Beta, the weight of argv[2] to be blended onto argv[1].

For this simple example I didn’t have any applications in mind, I just wanted to see some images blended together. So the (x,y) coordinate and the width and height for the ROI are the same for both images.

The first new function that I used here was the cvSetImageROI…

void cvSetImageROI( IplImage* image, CvRect rect );
image - Image header.
rect - ROI rectangle.

This function sets the ROI (Region Of Interest) for a particular image. This happens inside the IplImage structure. Recall there was a data member…

 struct _IplROI *roi;

in the IplImage which is a pointer to the _IplROI struct and looks like this…

typedef struct _IplROI
{
     int coi;
     int xOffset;
     int yOffset;
     int width;
     int height;
} IplROI;

Notice this is just information for a subregion of the original image, where coi is channel of interest. To set that subregion I needed a rectangular region of the original image, and for this I used a CvRect structure…

typedef struct CvRect
{
     int x;
     int y;
     int width;
     int height;
} CvRect;

This is pretty self explanatory…

The way it is initialized is even easier!

CvRect  cvRect( int x, int y, int width, int height )
{
    CvRect r;

    r.x = x;
    r.y = y;
    r.width = width;
    r.height = height;

    return r;
}

There is kind of a lot going on here in this one line.

cvSetImageROI(src1, cvRect(x,y,width,height));

First, calling cvRect returns a CvRect structure, as we can see above. The cvSetImageROI function expects a CvRect as its second parameter, so it wouldn’t matter whether I built the CvRect inside or outside the function call, as long as I’m sure to pass it in. Then cvSetImageROI takes the image passed in and sets its roi pointer to the region defined by the CvRect structure.

For example, say I create a pointer to an IplImage structure called img, then set its roi pointer to some rectangular subregion of the image. From then on, whenever I pass the img pointer around and a function references its picture, it will not see the original image but only the roi that was set.

The IplImage structure doesn’t lose the original image though. To make it reference the original again, just call cvResetImageROI(), which points the roi at NULL again.

void cvResetImageROI( IplImage* image );
image - Image header

Back to the program. Once the roi is set for each image, the blending happens. Note that with the roi’s set to NULL you can still pass in both images and, provided they are the same size, both images in their entirety will be blended. It’s a pretty simple concept: the destination image is computed following this simple calculation…

dst = src1*alpha + src2*beta + gamma

So I know alpha, beta, and gamma are doubles, but what are src1 and src2? Well, they are arrays, or matrices to be more precise. In each of their entries they store the RGB values of each pixel, and as our handy dandy linear algebra taught us, multiplying a matrix by a scalar multiplies each entry by that scalar. This means I’m just changing the value of each pixel to be a mix (weighted by the alpha and beta values) of the two images! Here is its declaration.

void  cvAddWeighted( const CvArr* src1, double alpha,
                     const CvArr* src2, double beta,
                     double gamma, CvArr* dst );
src1 - The first source array.
alpha - Weight of the first array elements.
src2 - The second source array.
beta - Weight of the second array elements.
gamma - Scalar, added to each sum.
dst - The destination array.

CvArr is a peculiar data type. I went rummaging through the source and this is what I found out about it.

typedef void CvArr;
/* CvArr* is used to pass arbitrary
 * array-like data structures
 * into functions where the particular
 * array type is recognized at runtime:
 */

Ya, so take that however you’d like.

Alright so how did it work? Well I’ll show ya!

There it is! It’s no longer a secret…. Yes, I am Dr. Manhattan!

References

I’m studying from Gary Bradski’s Learning OpenCV


CvVideoWriter Not Writing

I was trying to write a video to a file using the CvVideoWriter structure, but nothing was being written. I received no errors, nothing. The program would run, everything would go as expected, but the video would not write. This was my code.

..... more code here .....

CvVideoWriter *writer = cvCreateVideoWriter( argv[ 1 ],
                         CV_FOURCC('R','I','F','F'),
                         70,
                         size
                         );

..... more code here .....

cvWriteFrame( writer, bgr_frame );

..... more code here .....

After a bit of testing, what pointed me to the problem was placing this code in my program.

CvVideoWriter *writer = cvCreateVideoWriter( argv[ 1 ],
                         CV_FOURCC('R','I','F','F'),
                         70,
                         size
                         );
if ( writer == NULL )
	printf("writer is NULL\n");
if ( writer == false)
	printf("writer is false\n");

This led me to believe that the cvCreateVideoWriter function was returning NULL to my writer pointer, so when cvWriteFrame was called it did nothing. I wanted to know why, so I looked through the source to find the definition of the cvCreateVideoWriter function. This is what I found.

CvVideoWriter* cvCreateVideoWriter( const char* filename,
                                    int fourcc,
                                    double fps,
                                    CvSize frameSize,
                                    int isColor )
{
CvVideoWriter_FFMPEG* writer = new CvVideoWriter_FFMPEG;
if(writer->open(filename, fourcc, fps, frameSize, isColor != 0))
     return writer;
delete writer;
return 0;
}

So I saw here that what was truly causing the NULL was the writer object’s open() function returning false; the next step was to hunt it down. After testing the parts of that function which could return false, this is what I found.

bool CvVideoWriter::open( const char * filename, int fourcc,
		double fps, CvSize frameSize, bool is_color )
{
        ..... code code code .....
        ..... code code code .....
	/* Lookup codec_id for given fourcc */
#if LIBAVCODEC_VERSION_INT<((51<<16)+(49<<8)+0)
    if((codec_id = codec_get_bmp_id( fourcc )) == CODEC_ID_NONE )
        return false;
#else
	const struct AVCodecTag * tags[] = { codec_bmp_tags, NULL};
    if((codec_id = av_codec_get_id(tags, fourcc)) == CODEC_ID_NONE)
        return false;
#endif
        ..... code code code .....
        ..... code code code .....
	return true;
}

This hint took me to my final conclusion. I decided to look through the codec support in OpenCV and found this big list of codecs, only to be disappointed that the codec I was putting into my function wasn’t listed here! So I replaced it with one on the list that was also on my computer. If you’re having this problem, look through this list to see if your codec is here.

FIX: Just replace the codec with one of the codecs on this list that is also on your computer.

typedef struct AVCodecTag {
    int id;
    unsigned int tag;
} AVCodecTag;

const AVCodecTag codec_bmp_tags[] = {
    { CODEC_ID_H264, MKTAG('H', '2', '6', '4') },
    { CODEC_ID_H264, MKTAG('h', '2', '6', '4') },
    { CODEC_ID_H264, MKTAG('X', '2', '6', '4') },
    { CODEC_ID_H264, MKTAG('x', '2', '6', '4') },
    { CODEC_ID_H264, MKTAG('a', 'v', 'c', '1') },
    { CODEC_ID_H264, MKTAG('V', 'S', 'S', 'H') },

    { CODEC_ID_H263, MKTAG('H', '2', '6', '3') },
    { CODEC_ID_H263P, MKTAG('H', '2', '6', '3') },
    { CODEC_ID_H263I, MKTAG('I', '2', '6', '3') }, /* intel h263 */
    { CODEC_ID_H261, MKTAG('H', '2', '6', '1') },

    /* added based on MPlayer */
    { CODEC_ID_H263P, MKTAG('U', '2', '6', '3') },
    { CODEC_ID_H263P, MKTAG('v', 'i', 'v', '1') },

    { CODEC_ID_MPEG4, MKTAG('F', 'M', 'P', '4') },
    { CODEC_ID_MPEG4, MKTAG('D', 'I', 'V', 'X') },
    { CODEC_ID_MPEG4, MKTAG('D', 'X', '5', '0') },
    { CODEC_ID_MPEG4, MKTAG('X', 'V', 'I', 'D') },
    { CODEC_ID_MPEG4, MKTAG('M', 'P', '4', 'S') },
    { CODEC_ID_MPEG4, MKTAG('M', '4', 'S', '2') },
    { CODEC_ID_MPEG4, MKTAG(0x04, 0, 0, 0) }, /* some broken avi use this */

    /* added based on MPlayer */
    { CODEC_ID_MPEG4, MKTAG('D', 'I', 'V', '1') },
    { CODEC_ID_MPEG4, MKTAG('B', 'L', 'Z', '0') },
    { CODEC_ID_MPEG4, MKTAG('m', 'p', '4', 'v') },
    { CODEC_ID_MPEG4, MKTAG('U', 'M', 'P', '4') },
    { CODEC_ID_MPEG4, MKTAG('W', 'V', '1', 'F') },
    { CODEC_ID_MPEG4, MKTAG('S', 'E', 'D', 'G') },

    { CODEC_ID_MPEG4, MKTAG('R', 'M', 'P', '4') },

    { CODEC_ID_MSMPEG4V3, MKTAG('D', 'I', 'V', '3') }, /* default signature when using MSMPEG4 */
    { CODEC_ID_MSMPEG4V3, MKTAG('M', 'P', '4', '3') },

    /* added based on MPlayer */
    { CODEC_ID_MSMPEG4V3, MKTAG('M', 'P', 'G', '3') },
    { CODEC_ID_MSMPEG4V3, MKTAG('D', 'I', 'V', '5') },
    { CODEC_ID_MSMPEG4V3, MKTAG('D', 'I', 'V', '6') },
    { CODEC_ID_MSMPEG4V3, MKTAG('D', 'I', 'V', '4') },
    { CODEC_ID_MSMPEG4V3, MKTAG('A', 'P', '4', '1') },
    { CODEC_ID_MSMPEG4V3, MKTAG('C', 'O', 'L', '1') },
    { CODEC_ID_MSMPEG4V3, MKTAG('C', 'O', 'L', '0') },

    { CODEC_ID_MSMPEG4V2, MKTAG('M', 'P', '4', '2') },

    /* added based on MPlayer */
    { CODEC_ID_MSMPEG4V2, MKTAG('D', 'I', 'V', '2') },

    { CODEC_ID_MSMPEG4V1, MKTAG('M', 'P', 'G', '4') },

    { CODEC_ID_WMV1, MKTAG('W', 'M', 'V', '1') },

    /* added based on MPlayer */
    { CODEC_ID_WMV2, MKTAG('W', 'M', 'V', '2') },
    { CODEC_ID_DVVIDEO, MKTAG('d', 'v', 's', 'd') },
    { CODEC_ID_DVVIDEO, MKTAG('d', 'v', 'h', 'd') },
    { CODEC_ID_DVVIDEO, MKTAG('d', 'v', 's', 'l') },
    { CODEC_ID_DVVIDEO, MKTAG('d', 'v', '2', '5') },
    { CODEC_ID_MPEG1VIDEO, MKTAG('m', 'p', 'g', '1') },
    { CODEC_ID_MPEG1VIDEO, MKTAG('m', 'p', 'g', '2') },
    { CODEC_ID_MPEG2VIDEO, MKTAG('m', 'p', 'g', '2') },
    { CODEC_ID_MPEG2VIDEO, MKTAG('M', 'P', 'E', 'G') },
    { CODEC_ID_MPEG1VIDEO, MKTAG('P', 'I', 'M', '1') },
    { CODEC_ID_MPEG1VIDEO, MKTAG('V', 'C', 'R', '2') },
    { CODEC_ID_MPEG1VIDEO, 0x10000001 },
    { CODEC_ID_MPEG2VIDEO, 0x10000002 },
    { CODEC_ID_MPEG2VIDEO, MKTAG('D', 'V', 'R', ' ') },
    { CODEC_ID_MPEG2VIDEO, MKTAG('M', 'M', 'E', 'S') },
    { CODEC_ID_MJPEG, MKTAG('M', 'J', 'P', 'G') },
    { CODEC_ID_MJPEG, MKTAG('L', 'J', 'P', 'G') },
    { CODEC_ID_LJPEG, MKTAG('L', 'J', 'P', 'G') },
    { CODEC_ID_MJPEG, MKTAG('J', 'P', 'G', 'L') }, /* Pegasus lossless JPEG */
    { CODEC_ID_MJPEG, MKTAG('M', 'J', 'L', 'S') }, /* JPEG-LS custom FOURCC for avi - decoder */
    { CODEC_ID_MJPEG, MKTAG('j', 'p', 'e', 'g') },
    { CODEC_ID_MJPEG, MKTAG('I', 'J', 'P', 'G') },
    { CODEC_ID_MJPEG, MKTAG('A', 'V', 'R', 'n') },
    { CODEC_ID_HUFFYUV, MKTAG('H', 'F', 'Y', 'U') },
    { CODEC_ID_FFVHUFF, MKTAG('F', 'F', 'V', 'H') },
    { CODEC_ID_CYUV, MKTAG('C', 'Y', 'U', 'V') },
    { CODEC_ID_RAWVIDEO, 0 },
    { CODEC_ID_RAWVIDEO, MKTAG('I', '4', '2', '0') },
    { CODEC_ID_RAWVIDEO, MKTAG('Y', 'U', 'Y', '2') },
    { CODEC_ID_RAWVIDEO, MKTAG('Y', '4', '2', '2') },
    { CODEC_ID_RAWVIDEO, MKTAG('Y', 'V', '1', '2') },
    { CODEC_ID_RAWVIDEO, MKTAG('U', 'Y', 'V', 'Y') },
    { CODEC_ID_RAWVIDEO, MKTAG('I', 'Y', 'U', 'V') },
    { CODEC_ID_RAWVIDEO, MKTAG('Y', '8', '0', '0') },
    { CODEC_ID_RAWVIDEO, MKTAG('H', 'D', 'Y', 'C') },
    { CODEC_ID_INDEO3, MKTAG('I', 'V', '3', '1') },
    { CODEC_ID_INDEO3, MKTAG('I', 'V', '3', '2') },
    { CODEC_ID_VP3, MKTAG('V', 'P', '3', '1') },
    { CODEC_ID_VP3, MKTAG('V', 'P', '3', '0') },
    { CODEC_ID_ASV1, MKTAG('A', 'S', 'V', '1') },
    { CODEC_ID_ASV2, MKTAG('A', 'S', 'V', '2') },
    { CODEC_ID_VCR1, MKTAG('V', 'C', 'R', '1') },
    { CODEC_ID_FFV1, MKTAG('F', 'F', 'V', '1') },
    { CODEC_ID_XAN_WC4, MKTAG('X', 'x', 'a', 'n') },
    { CODEC_ID_MSRLE, MKTAG('m', 'r', 'l', 'e') },
    { CODEC_ID_MSRLE, MKTAG(0x1, 0x0, 0x0, 0x0) },
    { CODEC_ID_MSVIDEO1, MKTAG('M', 'S', 'V', 'C') },
    { CODEC_ID_MSVIDEO1, MKTAG('m', 's', 'v', 'c') },
    { CODEC_ID_MSVIDEO1, MKTAG('C', 'R', 'A', 'M') },
    { CODEC_ID_MSVIDEO1, MKTAG('c', 'r', 'a', 'm') },
    { CODEC_ID_MSVIDEO1, MKTAG('W', 'H', 'A', 'M') },
    { CODEC_ID_MSVIDEO1, MKTAG('w', 'h', 'a', 'm') },
    { CODEC_ID_CINEPAK, MKTAG('c', 'v', 'i', 'd') },
    { CODEC_ID_TRUEMOTION1, MKTAG('D', 'U', 'C', 'K') },
    { CODEC_ID_MSZH, MKTAG('M', 'S', 'Z', 'H') },
    { CODEC_ID_ZLIB, MKTAG('Z', 'L', 'I', 'B') },
    { CODEC_ID_SNOW, MKTAG('S', 'N', 'O', 'W') },
    { CODEC_ID_4XM, MKTAG('4', 'X', 'M', 'V') },
    { CODEC_ID_FLV1, MKTAG('F', 'L', 'V', '1') },
    { CODEC_ID_SVQ1, MKTAG('s', 'v', 'q', '1') },
    { CODEC_ID_TSCC, MKTAG('t', 's', 'c', 'c') },
    { CODEC_ID_ULTI, MKTAG('U', 'L', 'T', 'I') },
    { CODEC_ID_VIXL, MKTAG('V', 'I', 'X', 'L') },
    { CODEC_ID_QPEG, MKTAG('Q', 'P', 'E', 'G') },
    { CODEC_ID_QPEG, MKTAG('Q', '1', '.', '0') },
    { CODEC_ID_QPEG, MKTAG('Q', '1', '.', '1') },
    { CODEC_ID_WMV3, MKTAG('W', 'M', 'V', '3') },
    { CODEC_ID_LOCO, MKTAG('L', 'O', 'C', 'O') },
    { CODEC_ID_THEORA, MKTAG('t', 'h', 'e', 'o') },
#if LIBAVCODEC_VERSION_INT>0x000409
    { CODEC_ID_WNV1, MKTAG('W', 'N', 'V', '1') },
    { CODEC_ID_AASC, MKTAG('A', 'A', 'S', 'C') },
    { CODEC_ID_INDEO2, MKTAG('R', 'T', '2', '1') },
    { CODEC_ID_FRAPS, MKTAG('F', 'P', 'S', '1') },
    { CODEC_ID_TRUEMOTION2, MKTAG('T', 'M', '2', '0') },
#endif
#if LIBAVCODEC_VERSION_INT>((50<<16)+(1<<8)+0)
    { CODEC_ID_FLASHSV, MKTAG('F', 'S', 'V', '1') },
    { CODEC_ID_JPEGLS,MKTAG('M', 'J', 'L', 'S') }, /* JPEG-LS custom FOURCC for avi - encoder */
    { CODEC_ID_VC1, MKTAG('W', 'V', 'C', '1') },
    { CODEC_ID_VC1, MKTAG('W', 'M', 'V', 'A') },
    { CODEC_ID_CSCD, MKTAG('C', 'S', 'C', 'D') },
    { CODEC_ID_ZMBV, MKTAG('Z', 'M', 'B', 'V') },
    { CODEC_ID_KMVC, MKTAG('K', 'M', 'V', 'C') },
#endif
#if LIBAVCODEC_VERSION_INT>((51<<16)+(11<<8)+0)
    { CODEC_ID_VP5, MKTAG('V', 'P', '5', '0') },
    { CODEC_ID_VP6, MKTAG('V', 'P', '6', '0') },
    { CODEC_ID_VP6, MKTAG('V', 'P', '6', '1') },
    { CODEC_ID_VP6, MKTAG('V', 'P', '6', '2') },
    { CODEC_ID_VP6F, MKTAG('V', 'P', '6', 'F') },
    { CODEC_ID_JPEG2000, MKTAG('M', 'J', '2', 'C') },
    { CODEC_ID_VMNC, MKTAG('V', 'M', 'n', 'c') },
#endif
#if LIBAVCODEC_VERSION_INT>=((51<<16)+(49<<8)+0)
// this tag seems not to exist in older versions of FFMPEG
    { CODEC_ID_TARGA, MKTAG('t', 'g', 'a', ' ') },
#endif
    { CODEC_ID_NONE, 0 },
};

4. Reading From a Webcam

From my perspective one of the most intriguing applications of all this simple image and file handling so far is reading in data from a webcam. I was pleasantly surprised at how easy it was using OpenCV. Below is the code I used to read in information from a webcam and save it to a file.

# include "highgui.h"
# include "cv.h"

int main( int argc, char** argv ) {
	/* argv[1] is the output video file */
	if( argc < 2 ) return 1;

	CvCapture* capture;

	capture = cvCreateCameraCapture(0);

	assert( capture != NULL );

	IplImage* bgr_frame = cvQueryFrame( capture );

	CvSize size = cvSize(
			(int)cvGetCaptureProperty( capture,
                            CV_CAP_PROP_FRAME_WIDTH),
			(int)cvGetCaptureProperty( capture,
                            CV_CAP_PROP_FRAME_HEIGHT)
			);

	cvNamedWindow( "Webcam", CV_WINDOW_AUTOSIZE );

	CvVideoWriter *writer = cvCreateVideoWriter( argv[ 1 ],
                            CV_FOURCC('D','I','V','X'),
                            30,
                            size
                            );
	/* cvCreateVideoWriter returns NULL if the codec is not available */
	assert( writer != NULL );

	while( (bgr_frame = cvQueryFrame( capture )) != NULL ) {
		cvWriteFrame( writer, bgr_frame );
		cvShowImage( "Webcam", bgr_frame );
		char c = cvWaitKey( 33 );
		if( c == 27 ) break;
	}
	cvReleaseVideoWriter( &writer );
	cvReleaseCapture( &capture );
	cvDestroyWindow( "Webcam" );
	return( 0 );
}

Get the code from my Git Repository
If you need help with Git, follow this quick introduction – Getting Started With Git

So the idea behind the program is this… First I wanted to make a pointer to a CvCapture structure, because I’m going to need somewhere to store the webcam information and capture feed. When I initialized the new capture structure, I used a new function, cvCreateCameraCapture…

CvCapture* cvCreateCameraCapture( int index )
index - The camera device number
        Index   Device
          0     /dev/video0
          1     /dev/video1
          2     /dev/video2
          3     /dev/video3
          ...
          7     /dev/video7
          -1    /dev/video

So I initialized the structure with my webcam, /dev/video0. Any index from the table can be used, and -1 will choose one arbitrarily.

Next I needed an IplImage structure pointer for displaying each frame captured by my webcam, but I wanted to do more than just display it. I wanted to write each frame down into a file as the camera was running. For the CreateVideoWriter function, which we’ll see momentarily, I needed the size of the images being captured by my webcam. There is another structure for storing the sizes, CvSize. It’s declared inside cxtypes.h as simply as it sounds.

typedef struct
{
    int width;
    int height;
}
CvSize;

After declaring a variable of type CvSize, I instantiated it with the cvSize function.

CvSize cvSize( int width, int height )
{
      CvSize s;
      s.width = width;
      s.height = height;
      return s;
}

It’s as easy as that, I don’t think that needs much explanation.

What does deserve an explanation is the super handy dandy cvGetCaptureProperty function. Notice I used it to get the height and width of my camera’s captured images and set that as the height and width of the new CvSize structure. Here is how it works.

double cvGetCaptureProperty(CvCapture* capture, int property_id);
capture - This is the structure containing all the information about
           its video/camera/movie
property_id - The property information desired.

Can be any of the following: CV_CAP_PROP_POS_MSEC       0
                             CV_CAP_PROP_POS_FRAMES     1
                             CV_CAP_PROP_POS_AVI_RATIO  2
                             CV_CAP_PROP_FRAME_WIDTH    3
                             CV_CAP_PROP_FRAME_HEIGHT   4
                             CV_CAP_PROP_FPS            5
                             CV_CAP_PROP_FOURCC         6
                             CV_CAP_PROP_FRAME_COUNT    7
                             CV_CAP_PROP_FORMAT         8
                             CV_CAP_PROP_MODE           9
                             CV_CAP_PROP_BRIGHTNESS    10
                             CV_CAP_PROP_CONTRAST      11
                             CV_CAP_PROP_SATURATION    12
                             CV_CAP_PROP_HUE           13
                             CV_CAP_PROP_GAIN          14
                             CV_CAP_PROP_EXPOSURE      15
                             CV_CAP_PROP_CONVERT_RGB   16
                             CV_CAP_PROP_WHITE_BALANCE 17
                             CV_CAP_PROP_RECTIFICATION 18

I find this to be a great function, one worth memorizing. Take note that it returns a double, so I needed to cast the results to int before the cvSize function would accept them.
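To make the double-to-int step concrete, here is a minimal sketch in plain C. It does not call OpenCV at all: FrameSize is a hypothetical stand-in for CvSize, and the rule of rounding to the nearest integer and returning 0 for unusable values is my own assumption, not something the library does for you.

```c
#include <limits.h>

/* Hypothetical stand-in for CvSize so this compiles without OpenCV. */
typedef struct {
    int width;
    int height;
} FrameSize;

/* Convert one property value (a double) to an int.
 * Rounds to the nearest integer. Anything that is not a positive number
 * in int range (including NaN) becomes 0 so the caller can detect it. */
int property_to_int( double value ) {
    if( !( value > 0.0 ) || value >= (double)INT_MAX ) return 0;
    return (int)( value + 0.5 );
}

/* Build a size from the width and height property values. */
FrameSize frame_size_from_properties( double width, double height ) {
    FrameSize s;
    s.width = property_to_int( width );
    s.height = property_to_int( height );
    return s;
}
```

In the webcam program the two arguments would be the values returned by cvGetCaptureProperty for CV_CAP_PROP_FRAME_WIDTH and CV_CAP_PROP_FRAME_HEIGHT. A width or height of 0 would then be a hint that the camera did not report its size.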

Inside the loop I’m thinking, I want to be writing the frames from the video feed before I display them. This is from remembering that the IplImage returned by cvQueryFrame belongs to the CvCapture structure: its buffer is reused by the next cvQueryFrame call, and I must not release it myself. So I’d have to write each frame down before grabbing the next one, and to write an image I needed a writer…

typedef struct CvVideoWriter CvVideoWriter;
CvVideoWriter* cvCreateVideoWriter(
                           const char* filename,
                           int fourcc,
                           double fps,
                           CvSize frame_size
                           );
filename - Name of the output video file.
fourcc - 4-character code of codec used to compress the frames.
         For example,
             CV_FOURCC('P','I','M','1') is MPEG-1 codec,
             CV_FOURCC('M','J','P','G') is motion-jpeg codec etc.
         Under Win32 it is possible to pass -1 in order to choose
         compression method and additional compression
         parameters from dialog.
fps - Framerate of the created video stream.
frame_size - Size of video frames.

I had some trouble with this function for a bit. The codec you choose to compress the frames must be installed on your computer. I put in a codec I believed was on my system, only to find the file wasn’t being written. After a while of searching and reading through the source I found the problem, which I recorded in my OpenCV Errors section.
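It helped me to see what a fourcc actually is: just four characters packed into one int. Below is a small sketch in plain C of that packing, with the first character in the lowest byte, which is how I understand the CV_FOURCC macro to work. The function name pack_fourcc is my own, not part of OpenCV.

```c
/* Pack four characters into one int, first character in the lowest byte.
 * The arithmetic is done in unsigned to avoid shifting into the sign bit. */
int pack_fourcc( char c1, char c2, char c3, char c4 ) {
    unsigned int code = ( (unsigned int)c1 & 255u )
                      | ( ( (unsigned int)c2 & 255u ) << 8 )
                      | ( ( (unsigned int)c3 & 255u ) << 16 )
                      | ( ( (unsigned int)c4 & 255u ) << 24 );
    return (int)code;
}
```

For example, pack_fourcc('D','I','V','X') gives 0x58564944, since 'D' is 0x44, 'I' is 0x49, 'V' is 0x56 and 'X' is 0x58.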

I know what you’re thinking. “Hey! You put the fps in manually to 30, but I remember seeing a CV_CAP_PROP_FPS, why didn’t you use the cvGetCaptureProperty function and store it in a double and put it in that way?” Well I tried that. For a reason I’m not sure of yet, my webcam doesn’t seem to be sharing that information nicely with my computer, so I was receiving an error telling me it couldn’t read the fps from the CvCapture structure. Oh well. 30 fps seems to be working fine.
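If I come back to this, one option is to ask the capture for its frame rate and only fall back to 30 when the answer is unusable. Here is a minimal sketch of that decision in plain C. The function name choose_fps and the accepted range of 1 to 240 fps are assumptions of mine, not anything OpenCV defines.

```c
/* Return the reported frame rate if it looks usable, otherwise the fallback.
 * The accepted range (1 to 240 fps) is an assumption of this sketch.
 * NaN fails both comparisons, so it also ends up at the fallback. */
double choose_fps( double reported, double fallback ) {
    if( reported >= 1.0 && reported <= 240.0 ) return reported;
    return fallback;
}
```

The first argument would be whatever cvGetCaptureProperty returns for CV_CAP_PROP_FPS, and the result would be passed to cvCreateVideoWriter in place of the hard-coded 30.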

So I’m ready for the loop. I get it going with some new pizazz, but the logic is the same. As planned I write the frame first, then display it.

int cvWriteFrame( CvVideoWriter* writer, const IplImage* image );
writer - Video writer structure.
image - The frame to be written.

It does precisely what it says: it takes in an image and writes it to the file that was given to the cvCreateVideoWriter function. Then I show the image and do the standard clean up, including releasing the new CvVideoWriter with the usual approach.

void cvReleaseVideoWriter( CvVideoWriter** writer );
writer - Pointer to the video file writer structure.

That’s that. Running this program pops up a window displaying everything the webcam sees and saves it to the file given as the first command-line argument. You can use the video player that was written in the previous post, Basic Video Manipulation, to test out the newly recorded video.

References

I’m studying from Gary Bradski’s Learning OpenCV

Posted in OpenCV Journey | Leave a comment

3. Basic Video Manipulation

Video Handling Basics

Understanding how to manipulate images was critical to learning video manipulation. This is because a video is just a loop of images! So toying with videos was a piece of cake after getting comfortable with image handling. Since I learned how to load images and display them, the only thing I needed to play videos was a loop. The next program should look very similar to the first two…

#include <stdio.h>
#include <stdlib.h>
#include "highgui.h"

int main( int argc, char** argv ) {
	IplImage* frame = 0;

	if( argc < 2 ) {
		printf( "Usage: Accepts one video as argument\n" );
		exit( 0 );
	}

	cvNamedWindow( "Example2", CV_WINDOW_AUTOSIZE );
	cvMoveWindow( "Example2", 100, 100);
	CvCapture* capture = cvCreateFileCapture( argv[1] );

	while(1) {
	    frame = cvQueryFrame( capture );
	    if( !frame) break;
	    cvShowImage( "Example2", frame );
	    char c = cvWaitKey(33);
	    if( c == 27 ) break;
	}

	cvReleaseCapture( &capture );
	cvDestroyWindow( "Example2" );
	return 0;
}

Get the code from my Git Repository
If you need help with Git, follow this quick introduction – Getting Started With Git

First I created another window “Example2” with my usual approach, no problems here.

Dealing with videos seemed simple enough in theory, but I was slightly taken aback at its complexity once I delved into the source files on my computer. CvCapture is another data structure, and it has a complicated definition. All of its members are declared virtual, and it has different capture objects that it uses for different situations (which ones, I don’t know yet…).

The way I loaded a video’s data into the CvCapture data structure is through the cvCreateFileCapture function. Looking through the source, I found that this function calls other functions for various usages, all of the form

CvCapture* cvCreateFileCapture_XXX( const char* filename );
filename - Filename of the video you want to pass in.

where XXX stood for different names. I just took note to remember that this function takes the filename of the video I’m trying to load.

Now remembering that a video is just a loop of images, I could be sure I was going to need an IplImage structure to store each one in. After creating the image variable, there was really nothing left but to jump into the loop.

while(1) {
   frame = cvQueryFrame( capture );
   if( !frame) break;
   cvShowImage( "Example2", frame );
   char c = cvWaitKey(33);
   if( c == 27 ) break;
}

Now inside the loop, my thinking process is this. I want to take each frame from the CvCapture data structure and display them one at a time, quickly. The first step was extracting each frame. I accomplished this using the QueryFrame function.

 IplImage* cvQueryFrame( CvCapture* capture );
capture - Here you pass the video structure.

It takes a pointer to a CvCapture structure, parses through the data and returns a pointer to each frame which is an IplImage structure. This function is just a combination of cvGrabFrame and cvRetrieveFrame in one call. Pretty handy, so I just stored it in frame, my IplImage pointer.

I then displayed the image in my “Example2” window using my old friend cvShowImage. My question was “Okay, how long do I let this image sit here?”. After a little reading, I learned that a good delay to assume without knowledge of the capture’s frame rate is 33 ms, which works out to roughly 30 frames per second. cvWaitKey spends those 33 milliseconds waiting for input (the Esc key is 27 in ASCII). There is a way to extract the correct frame rate from the CvCapture structure, but I’ve yet to glean this procedure…

char c = cvWaitKey(33);
if( c == 27 ) break;

Then if the esc key is pressed I could exit the loop. The loop only has two exit methods.

if( !frame ) break;
if( c == 27 ) break;
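On the question of how long each frame should sit on screen, the arithmetic is just 1000 milliseconds divided by the frame rate. Here is a minimal sketch in plain C. The name frame_delay_ms is hypothetical, and treating anything below 1 fps as "unknown" and falling back to 33 ms is my own assumption.

```c
/* Milliseconds to wait between frames for a given frame rate.
 * Rounded to the nearest millisecond and never less than 1, because a
 * delay of 0 makes cvWaitKey block until a key is pressed.
 * Rates below 1 fps (including 0 and NaN) are treated as unknown and
 * get an assumed default of 33 ms, roughly 30 fps. */
int frame_delay_ms( double fps ) {
    int delay;
    if( !( fps >= 1.0 ) ) return 33;
    delay = (int)( 1000.0 / fps + 0.5 );
    return delay < 1 ? 1 : delay;
}
```

With this, 30 fps gives a delay of 33 ms and 25 fps gives 40 ms. The result would replace the literal 33 in the call to cvWaitKey.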

Then the usual clean ups were done, minus one. Notice I didn’t kill any of the IplImage structures? Apparently the CvCapture structure took care of that for me, how kind. Anyway, trying to release information that isn’t there is clearly not a good idea!

I saved the file, edited my CMakeLists.txt to accommodate the extra program and built it. I tested it on the sample video tree.avi and some other vids on my hard drive. You may need to finagle with the frame rate a bit to get a good flow for now.

It would be wise to put more exit procedures and provide more error checking. However, this was boring… I wanted to get to the fun stuff.

The Fun Stuff

As with the images, once I got my video basics down I wanted to try some more simple manipulations. My thought, play a video I normally enjoy with the previous procedure, then play another one next to it with each frame being blurred.

Alright so here it is.

#include <stdio.h>
#include <stdlib.h>
#include "cv.h"
#include "highgui.h"

int main( int argc, char** argv ) {
	IplImage* frame = 0;
	IplImage* augframe = 0;

	if( argc < 2 ) {
		printf( "Usage: Accepts one video as argument\n" );
		exit( 0 );
	}

	cvNamedWindow( "Example2", CV_WINDOW_AUTOSIZE );
	cvMoveWindow( "Example2", 100, 100);
	cvNamedWindow( "output", CV_WINDOW_AUTOSIZE );
	cvMoveWindow( "output", 500, 100);
	CvCapture* capture = cvCreateFileCapture( argv[1] );

	while(1) {
	    frame = cvQueryFrame( capture );
	    if( !frame) break;
	    /* Allocate the output image once and reuse it for every frame,
	       instead of leaking a new image on each pass through the loop */
	    if( !augframe )
	        augframe = cvCreateImage( cvGetSize( frame ),
                                 IPL_DEPTH_8U, 3 );
	    cvSmooth( frame, augframe, CV_BLUR, 3, 3 );
	    cvShowImage( "Example2", frame );
	    cvShowImage( "output", augframe );
	    char c = cvWaitKey(73);
	    if( c == 27 ) break;
	}

	/* augframe is mine to release; frame belongs to the capture */
	if( augframe ) cvReleaseImage( &augframe );
	cvReleaseCapture( &capture );
	cvDestroyWindow( "Example2" );
	cvDestroyWindow( "output" );
	return 0;
}

Get the code from my Git Repository
If you need help with Git, follow my quick introduction – Getting Started With Git

Everything highlighted I’ve used before, so there is nothing new to talk about. I used the same cvSmooth function from the previous post; I just chose another one of the smoothing procedures listed in the table. CV_BLUR was my choice. I don’t see much of a difference between it and CV_GAUSSIAN at this kernel size. Going by the table, CV_BLUR takes a plain average of the neighborhood while CV_GAUSSIAN weights it with a Gaussian kernel.

I did want to note one thing though. Recall that the CvCapture structure is taking care of the frames for me now. The image stored in frame lives in a buffer owned by the capture, and that buffer is reused by the next call to cvQueryFrame. This means I must not release frame myself, and anything I want to do with a frame, like smoothing it into augframe, has to happen before the next frame is queried!

I say try this out with some videos on your computer. Then perhaps take a look at some of the other filters, and what they do.

References

I’m studying from Gary Bradski’s Learning OpenCV

Posted in OpenCV Journey | Leave a comment

2. Basic Image Manipulation

Image Data Handling

After I installed OpenCV I wanted to get started with some image manipulations. The concept is pretty simple with Computer Vision being defined as basically “the transformation of data from a still or video camera into either a decision or a new representation” (Bradski). OpenCV provides a certain procedure for loading the data of the images. So the first thing I did was get comfortable with some of the basic image handling protocol.

#include <stdio.h>
#include <stdlib.h>
#include "highgui.h"

int main( int argc, char** argv ) {
  IplImage* img = 0;

  if( argc < 2 ) {
    printf( "Usage: Accepts one image as argument\n" );
    exit( EXIT_SUCCESS );
  }

  img = cvLoadImage( argv[1] );

  if( !img ) {
    printf( "Error loading image file %s\n", argv[1]);
    exit( EXIT_FAILURE );
  }

  cvNamedWindow( "Example1", CV_WINDOW_AUTOSIZE );
  cvMoveWindow( "Example1", 100, 100 );
  cvShowImage( "Example1", img );
  cvWaitKey( 0 );
  cvReleaseImage( &img );
  cvDestroyWindow( "Example1" );
  return EXIT_SUCCESS;
}

Get the code from my Git Repository
If you need help with Git, follow this quick introduction – Getting Started With Git

It’s pretty clear from here why OpenCV is so user friendly: its interface is notably intuitive. My impression of the highgui.h header thus far leads me to believe it is host to the display functions which post all of the image data, as its name “highgui” suggests.

 IplImage* img = 0;

IplImage is a data structure used to store image data. It got its name from its home, Intel, and it stands for Image Processing Library (IPL) Image. Here’s a taste of what it looks like.

typedef struct _IplImage {
    int  nSize;         /* sizeof(IplImage) */
    int  ID;            /* version (=0) */
    int  nChannels;     /* Most of OpenCV functions support
                           1,2,3 or 4 channels */
    int  alphaChannel;  /* Ignored by OpenCV */
    int  depth;         /* Pixel depth in bits: IPL_DEPTH_8U,
                           IPL_DEPTH_8S, IPL_DEPTH_16S,
                           IPL_DEPTH_32S, IPL_DEPTH_32F
                           and IPL_DEPTH_64F are supported */
    char colorModel[4]; /* Ignored by OpenCV */
    char channelSeq[4]; /* Ditto */
    int  dataOrder;     /* 0 - interleaved color channels,
                           1 - separate color channels.
                           cvCreateImage can only create
                           interleaved images */
    int  origin;        /* 0 - top-left origin,
                           1 - bottom-left origin
                           (Windows bitmaps style) */
    int  align;         /* Alignment of image rows (4 or 8).
                           OpenCV ignores it and uses
                           widthStep instead */
    int  width;         /* Image width in pixels */
    int  height;        /* Image height in pixels */
    struct _IplROI *roi;/* Image ROI. If NULL, the whole
                               image is selected */
    struct _IplImage *maskROI;     /* Must be NULL */
    void  *imageId;                /* "           " */
    struct _IplTileInfo *tileInfo; /* "           " */
    int  imageSize;     /* Image data size in bytes
                           (==image->height*image->widthStep
                           in case of interleaved data)*/
    char *imageData;    /* Pointer to aligned image data */
    int  widthStep;     /* Size of aligned image row in bytes */
    int  BorderMode[4]; /* Ignored by OpenCV */
    int  BorderConst[4];/* Ditto*/
    char *imageDataOrigin;/* Pointer to very origin of image data
                            (not necessarily aligned) -
                             needed for correct deallocation */
} IplImage; 

Most of it is pretty intuitive, but I’m sure I’ll learn more about some of its more obscure data members soon. The problem I faced next was: given an image file, how do I extract all of that data? That’s where the next function came in.

IplImage* cvLoadImage(const char* filename, int iscolor);
filename - name of the file to be loaded

iscolor - Specifies colorness of the loaded image:
    If >0 or omitted, the loaded image is forced to be color 3-channel image;
    If 0, the loaded image is forced to be grayscale;
    If <0, the loaded image will be loaded as is
    with the number of channels depending on the file.

This is an impressive function. It parses through its argument, interprets all the information, fills in that IplImage structure with all the data, and finally returns a pointer to the IplImage. It allocates the appropriate amount of memory and has support for the following formats:

  • Windows bitmaps - BMP, DIB
  • JPEG files - JPEG, JPG, JPE
  • Portable Network Graphics - PNG
  • Portable image format - PBM, PGM, PPM
  • Sun rasters - SR, RAS
  • TIFF files - TIFF, TIF

Notice however that I assign it to img. img is a pointer to the IplImage. That data structure is large and bulky, so passing it around by value would be costly in overhead.

Alright, so I've got an image loaded into my program here, how do I output it? What I learned next is that you have to create your own windows for data output.

 int cvNamedWindow( const char* name, int flags );
name - Name of the window which is used as window identifier
       and appears in the window caption.

flags - Flags of the window. Currently the only supported flag
        is CV_WINDOW_AUTOSIZE. If it is set, window size is
        automatically adjusted to fit the displayed image while
        user can not change the window size manually.

The first parameter, the character array, is the name I used to refer to this exact window throughout the rest of the program, and the second is a set of flags controlling how the window is sized. There is something interesting to note here: the window identifier "Example1" is also the name that appears in the window's title bar after it is displayed... I haven't found any workarounds for this yet. I'm not sure if there is a need, but what if you want to make multiple windows with the same title, but need to refer to them independently throughout the program? Please let me know if you find a solution.

Now I noticed when the window appeared it was always in the corner... mathnathan doesn't like this... So a simple little move window function served me great here.

 void cvMoveWindow( const char* name, int x, int y );
name - Name of the window to be relocated.

x - New x coordinate of top-left corner

y - New y coordinate of top-left corner

Remember, "Example1" is now the string literal I had to use to refer to that window. The next two parameters are ints. They simply refer to the (x, y) coordinates where the window's upper left corner will be.

I got an image loaded and a window up, so then I needed to put the image in the window. OpenCV provides a function which displays an IplImage* pointer in an existing window, in my case "Example1".

 void cvShowImage( const char* name, const CvArr* image );
name - The name of the window.

image - The image to be shown.

Now the image is displayed in the window I created. Hooray. Note, if the window was set with CV_WINDOW_AUTOSIZE flag, then the window will resize to fit the image, otherwise the image will be resized to fit in the window.

To make the window go away I simply waited for any key to be pressed.

 int cvWaitKey( int delay=0 );
delay - The delay in milliseconds.

When given a parameter of less than or equal to 0, this function simply pauses the program and waits for a key to be pressed. Any positive integer will pause the program for that many milliseconds. This is the only function within highgui for event handling.

Being the good little C programmer that I am, I freed the allocated memory used to hold the image.

 void  cvReleaseImage( IplImage** image );
image - The IplImage structure you wish to be released.

Be sure to take note that this function expects a pointer to the IplImage*. So you should pass the address of your pointer using the address-of operator ' & '.
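Why a pointer to a pointer? Because it lets the function set my pointer to NULL after freeing the memory, so I can't accidentally use it afterwards. Here is a minimal sketch of the same pattern in plain C. ToyImage, toy_create and toy_release are made-up names standing in for IplImage, cvCreateImage and cvReleaseImage; this is not how OpenCV is implemented, just the shape of the idea.

```c
#include <stdlib.h>

/* A toy image type standing in for IplImage. */
typedef struct {
    int width;
    int height;
    unsigned char* data;
} ToyImage;

/* Allocate a zero-filled single-channel image, or return NULL on failure. */
ToyImage* toy_create( int width, int height ) {
    ToyImage* img;
    if( width <= 0 || height <= 0 ) return NULL;
    img = malloc( sizeof( *img ) );
    if( !img ) return NULL;
    img->width = width;
    img->height = height;
    img->data = calloc( (size_t)width * (size_t)height, 1 );
    if( !img->data ) {
        free( img );
        return NULL;
    }
    return img;
}

/* Takes the ADDRESS of the caller's pointer, frees the image,
 * then sets the caller's pointer to NULL. Safe to call twice. */
void toy_release( ToyImage** image ) {
    if( !image || !*image ) return;
    free( (*image)->data );
    free( *image );
    *image = NULL;
}
```

After calling toy_release( &img ), img itself is NULL, and a second call does nothing instead of freeing the same memory twice.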

Lastly I destroyed the window.

 void cvDestroyWindow( const char* name );
name - The name of the window you wish to destroy.

I'm quite fond of this function, they really gave it a great name. Programming suddenly got way more fun!

Compiling

The program is ready to go, but building it wasn't as easy as gcc myprogram.c; it required linking to the OpenCV libraries. This can be a nuisance, however OpenCV makes linking to their stuff a breeze when using CMake. If you're new to CMake and would like a quick introduction, check out my Getting Started with CMake tutorial. When I built OpenCV from source in the previous post, it produced an OpenCVConfig.cmake file (making it easy!) for me to put into my CMakeLists.txt file.

cmake_minimum_required(VERSION 2.6)
 project(examples)

 include($ENV{OpenCV_DIR}/OpenCVConfig.cmake)
 include_directories(${OPENCV_INCLUDE_DIR})

 add_executable(example1 example2-1.cpp)

 target_link_libraries(example1 ${OpenCV_LIBS})

Plain and simple, that's all there was to it. I saved that as my CMakeLists.txt file, moved into the build directory, ran cmake, then make, and tested my program with some sample pictures. SUCCESS!

Applying a Gaussian Transformation

Using my new found image handling skills I attempted to smooth an image using cvSmooth(). There are a few minor additions to the previous program.

#include <stdio.h>
#include <stdlib.h>
#include "highgui.h"
#include "cv.h"

int main( int argc, char** argv ) {
	IplImage* img = 0;
	IplImage* out = 0;

	if( argc < 2 ) {
		printf( "Usage: Accepts one image as argument\n" );
		exit( EXIT_SUCCESS );
	}

	img = cvLoadImage( argv[1] );

	if( !img ) {
		printf( "Error loading image file %s\n", argv[1]);
		exit( EXIT_FAILURE );
	}

	out = cvCreateImage( cvGetSize(img), IPL_DEPTH_8U, 3 );

	cvNamedWindow( "Example1", CV_WINDOW_AUTOSIZE );
	cvMoveWindow( "Example1", 100, 100 );
	cvNamedWindow( "Output", CV_WINDOW_AUTOSIZE );
	cvMoveWindow( "Output", 300, 100 );
	cvShowImage( "Example1", img );
	cvSmooth( img, out, CV_GAUSSIAN, 3, 3 );
	cvShowImage( "Output", out );
	cvWaitKey( 0 );
	cvReleaseImage( &img );
	cvReleaseImage( &out );
	cvDestroyWindow( "Example1" );
	cvDestroyWindow( "Output" );
	return EXIT_SUCCESS;
}

Get the code from my Git Repository
If you need help with Git, follow my quick introduction - Getting Started With Git

First off, I added cv.h into the mix here. This is where I found all of my Computer Vision toys. To be able to see the before and after, I created another IplImage* pointer, which I called out, to store the transformed data. I learned another method for creating an image without using the cvLoadImage() function, which I used here.

 IplImage*  cvCreateImage( CvSize size, int depth, int channels );
size - The image's width and height

depth - Bit depth of image elements. Can be one of:
        IPL_DEPTH_8U - unsigned 8-bit integers
        IPL_DEPTH_8S - signed 8-bit integers
        IPL_DEPTH_16U - unsigned 16-bit integers
        IPL_DEPTH_16S - signed 16-bit integers
        IPL_DEPTH_32S - signed 32-bit integers
        IPL_DEPTH_32F - single precision floating-point numbers
        IPL_DEPTH_64F - double precision floating-point numbers

channels - Number of channels per element, pixel, in the image.

I got the size automatically using cvGetSize.

CvSize cvGetSize( const CvArr* arr );
arr - The array header

The function cvGetSize returns number of rows (CvSize::height) and number of columns (CvSize::width) of the input matrix or image. If I were to have the *roi pointer pointing to a subset of the original image, the size of ROI (Region Of Interest) is returned.

The next parameter is an int, and is the bit depth of the image elements. Lastly it wants the number of channels for the image. That is the number of channels per element (pixel), which can be 1, 2, 3 or 4. The channels are interleaved, for example the usual data layout of a color image is: b0 g0 r0 b1 g1 r1... Although in general IPL image format can store non-interleaved images as well and some of OpenCV can process it, cvCreateImage() creates interleaved images only.... How sad... :(
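To convince myself I understood the interleaved layout, I sketched how to find one channel of one pixel in the raw bytes. This is plain C with no OpenCV calls. It assumes 8-bit channels (one byte each) and rows padded to a multiple of 4 bytes, which is the kind of alignment the widthStep field accounts for. The function names are my own.

```c
/* Bytes per row when each row is padded up to a multiple of `align` bytes.
 * Assumes 8-bit channels, so one pixel takes `channels` bytes. */
int row_stride_bytes( int width, int channels, int align ) {
    int row = width * channels;
    return ( row + align - 1 ) / align * align;
}

/* Byte offset of channel c of pixel (x, y) in an interleaved image.
 * For the usual b0 g0 r0 b1 g1 r1... layout, c = 0 is blue, 1 is green
 * and 2 is red. width_step plays the role of IplImage's widthStep. */
int pixel_offset( int x, int y, int c, int channels, int width_step ) {
    return y * width_step + x * channels + c;
}
```

For a 5 pixel wide, 3 channel image, a row holds 15 bytes of pixel data and is padded to 16. The green value of the pixel at x = 2, y = 1 then sits at offset 1*16 + 2*3 + 1 = 23 from the start of the image data.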

And finally here is the transformation.

void cvSmooth( const CvArr* src,
               CvArr* dst,
               int smoothtype=CV_GAUSSIAN,
               int param1=3,
               int param2=0,
               double param3=0 );
src - The source image, to be smoothed.
dst - The destination image, where to save the image.
smoothtype - Smoothing algorithm to be employed
  CV_BLUR_NO_SCALE (simple blur with no scaling) - summation over a
       pixel param1×param2 neighborhood. If the neighborhood size may
       vary, one may precompute integral image with cvIntegral function.
  CV_BLUR (simple blur) - summation over a pixel param1×param2
       neighborhood with subsequent scaling by 1/(param1•param2).
  CV_GAUSSIAN (gaussian blur) - convolving image with param1×param2
       Gaussian kernel.
  CV_MEDIAN (median blur) - finding median of param1×param1
       neighborhood (i.e. the neighborhood is square).
  CV_BILATERAL (bilateral filter) - applying bilateral 3x3 filtering
       with color sigma=param1 and space sigma=param2.
param1 - The first parameter of smoothing operation.
param2 - The second parameter of smoothing operation. In case
         of simple scaled/non-scaled and Gaussian blur if param2
         is zero, it is set to param1.
param3 - In case of Gaussian blur this parameter may specify
         Gaussian sigma (standard deviation). If it is zero, it is
         calculated from the kernel size:

              sigma = (n/2 - 1)*0.3 + 0.8
                      n=param1 for horizontal kernel
                      n=param2 for vertical kernel.

         Using standard sigma for small kernels (3×3 to 7×7) gives
         better speed. If param3 is not zero, while param1 and param2
         are zeros, the kernel size is calculated from the sigma.

Your source image and destination image are self explanatory, the next parameter is your smoothing algorithm, and the other 3 parameters depend on the smooth type. Bradski's book is another reference for the various smooth types.
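To see what "summation over a neighborhood with subsequent scaling" means, here is a minimal sketch of a scaled 3x3 box blur in plain C, the idea behind CV_BLUR with param1 = param2 = 3. It is not OpenCV's implementation. It works on a single-channel 8-bit image stored row by row with no padding, and repeating the edge pixels at the border is an assumption I made to keep it short.

```c
/* Clamp an index into [0, n-1]; used to repeat edge pixels at the border. */
int clamp_index( int i, int n ) {
    if( i < 0 ) return 0;
    if( i >= n ) return n - 1;
    return i;
}

/* Scaled 3x3 box blur. Each output pixel is the sum of its 3x3
 * neighborhood scaled by 1/9, rounded to the nearest integer.
 * src and dst must be different buffers of width*height bytes. */
void box_blur_3x3( const unsigned char* src, unsigned char* dst,
                   int width, int height ) {
    int x, y, dx, dy;
    for( y = 0; y < height; y++ ) {
        for( x = 0; x < width; x++ ) {
            int sum = 0;
            for( dy = -1; dy <= 1; dy++ ) {
                for( dx = -1; dx <= 1; dx++ ) {
                    int sx = clamp_index( x + dx, width );
                    int sy = clamp_index( y + dy, height );
                    sum += src[ sy * width + sx ];
                }
            }
            /* Adding 4 before dividing by 9 rounds to nearest */
            dst[ y * width + x ] = (unsigned char)( ( sum + 4 ) / 9 );
        }
    }
}
```

A single bright pixel of value 9 in the middle of a 3x3 image of zeros gets spread out so that the center becomes 1, which is the smoothing effect in miniature.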

I then displayed the output image in the "Output" window and was a good little C programmer and cleaned up after myself.

So those are the changes in the new program. It takes an image as input, creates a second image of the same size, creates two windows, displays the original, runs it through a smoothing filter into the second image, then displays that one as well! Easy stuff. Here is the output I received after running a picture of myself through the filter. COOL!!

References

I'm studying from Gary Bradski's Learning OpenCV

Posted in OpenCV Journey | 2 Comments

Performances

Here are a few performances I’ve done…

The Ocean Etude by Chopin

Fantaisie-Impromptu by Chopin

Final Fantasy VII Main Theme by Nobuo Uematsu

Posted in Performances | 3 Comments