It’s been a while since I did much with MVTron, but I took a brief look back at it the other day, and it’s actually coming along, I think. That said, it isn’t something I’m actively working on, so there probably aren’t going to be any spectacular results or anything.
Earlier, my approach to MVTron was to wrap AviSynth in the hopes that it could take advantage of all the fast AviSynth plugins people have written in C. I managed to get a scene detector working that processed my test video in about twice the video’s duration. That struck me as pretty poor performance, but I was still hopeful, and I was planning to build the rest of MVTron from there.
Well, it’s kind of embarrassing to say this, but I completely forgot about that AviSynth-Groovy success story until just now when I read my own blog. That said, this time around I duplicated that success in less than a day without interfacing to AviSynth at all, using Xuggle’s (relatively) new MediaTools API. And now I’m usually looking at processing times of only 170% of the test video’s duration. Thanks, Xuggle!
I’ve been looking over audio analysis techniques, wrapping my mind around how exactly I might go about implementing (or finding someone else’s implementation of) an FFT or DWT, and getting to the point where I can understand this abstract, and I noticed something there. The writers of that used something called MARSYAS. What’s that?
Well, Marsyas is an open-source C++ project that apparently is exactly as ambitious as MVTron would like to be in exactly the same ways. Although its main focus is music and other audio, apparently there’s MarsyasX, a branch or something, which is a reimagining of Marsyas to be more video-inclusive. Considering Marsyas’s seeming focus on feature extraction, similarity detection, and… well, all kinds of stuff, it seems like MVTron would be just another application in the sea over there. That takes a lot of (self-inflicted) pressure off of me as a lone programmer. :-p
It almost looks at this point like MVTron will end up being a project entirely submerged in Marsyas, maybe even to the point that it’s written in C++… but I guess I shouldn’t be so hasty. I’ve only just heard of Marsyas, and besides, there’s an entry on their ideas page calling for “Porting the Marsyas dataflow architecture to Java.” ^_-
I ported the scene-detecting part of my AviSynth script to Groovy (using the AviSynth wrapper I was talking about yesterday), and it was still much slower than I’d have liked it to be. It paused every once in a while, probably to run a garbage collection pass or to find more space to allocate frames in. In the hopes that I could avoid this problem, I spent a few days rewriting the library as a three-layered system that was operable first, memory-safe second, and easy-to-use third.
It had previously been a two-layered system that was a low-level library first and a high-level library second. The middle layer is what was new here, but its arrival pretty much forced a complete refactoring of the two layers that surrounded it.
In any case, I think this was an important step to take, but it didn’t help the speed at all. So, I tried a few optimizations of the ported scene-detection script itself. First of all, I took away the kludge I was using to represent the boolean either-this-is-the-start-of-a-scene-or-it-isn’t stream. Rather than representing true and false with white frames and black frames, as I was forced to do in AviSynth, I represented them with, well, Groovy’s true and false. Instead of having the function return a Clip, I had it return a Closure (which itself would take an int and return a boolean). This did the trick. I took away six of the filters I was using in the script, and the speed improved markedly.
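To make the Clip-to-Closure change a bit more concrete, here is a minimal sketch of the same idea in Java rather than Groovy, with an IntPredicate standing in for the Closure. The class name SceneStarts, the method fromDifferences, and the idea of thresholding precomputed per-frame difference scores are all made up for illustration; this is not the actual MVTron code.

```java
import java.util.function.IntPredicate;

class SceneStarts {
    // Hypothetical helper: turns per-frame "difference from the previous frame"
    // scores into a function from frame number to boolean, instead of encoding
    // true/false as white and black frames in a clip.
    static IntPredicate fromDifferences(double[] diffFromPrevious, double threshold) {
        return frame -> {
            if (frame == 0) {
                return true; // assumption: the first frame always starts a scene
            }
            if (frame < 0 || frame >= diffFromPrevious.length) {
                return false; // out-of-range frames never start a scene
            }
            return diffFromPrevious[frame] > threshold;
        };
    }
}
```

A caller just asks about one frame at a time, for example isSceneStart.test(42), and no frame-sized buffers get allocated to carry a single bit of information.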
Once I put in some caching for the intermediate frames, the speed improved again by about 30%. Finally, I thought I’d push the limits of Groovy optimization a bit more by implementing it in Java. There was practically no improvement. Oh, well.
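For the curious, the caching idea can be sketched roughly like this. It is a small least-recently-used cache in Java built on the standard LinkedHashMap; the FrameCache name, the capacity parameter, and the miss counter are assumptions for the sake of the example, not what the library actually does.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.IntFunction;

// Hypothetical cache for intermediate per-frame results (frames, scores, etc.).
class FrameCache<T> {
    private final IntFunction<T> compute;
    private final Map<Integer, T> cache;
    int misses = 0; // counts how often the underlying computation actually ran

    FrameCache(int capacity, IntFunction<T> compute) {
        this.compute = compute;
        // accessOrder = true makes iteration order least-recently-used first,
        // so removeEldestEntry evicts the entry that was touched longest ago.
        this.cache = new LinkedHashMap<Integer, T>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<Integer, T> eldest) {
                return size() > capacity;
            }
        };
    }

    T get(int frame) {
        T value = cache.get(frame);
        if (value == null) { // note: a null result would be recomputed every time
            misses++;
            value = compute.apply(frame);
            cache.put(frame, value);
        }
        return value;
    }
}
```

A small capacity is probably enough when a filter only ever looks at a frame and its immediate neighbours, since each intermediate result gets reused within a few frames and then never again.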
In the end, it tends to take about twice as long to process the scenes as it takes to actually play the movie. That’s still somewhat dismal, but I think it’s good enough for now. Maybe once I have a complete working prototype the speed improvements will follow.
Well, I was going to work on writing something that would preprocess a video file and calculate exactly where all the cuts were and what the intensity and “bend” were at any given time, and I ran into a roadblock.
Groovy is an awesome programming language. I want to program in Groovy as much as I possibly can. And so I found Xuggler (a SWIG-based Java interface to ffmpeg), and I wrote a convenient Groovy wrapper class for accessing frames in an ffmpeg video stream.
It was great. I could load a file, get all the frames I wanted one at a time as Java BufferedImages, draw on them using Java2D, and write them back out in an image sequence. (I figure I can probably export another movie file, too, but I haven’t tried that part of Xuggler yet, let alone wrapped it.)
Now I actually needed to process those images to find motion in them, and so I set up a simple image difference filter in Groovy, and I tried it out. It was slooooooow. It processed about one frame per second. I realized Groovy would be slow, but I hoped it wouldn’t be that slow.
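For reference, a simple image difference filter looks something like the following. This is a sketch in plain Java rather than the Groovy I actually wrote, and the FrameDifference class and its method names are invented for the example. It takes the absolute per-channel difference of two same-sized BufferedImages and can also boil that down to a single 0-to-1 score.

```java
import java.awt.image.BufferedImage;

class FrameDifference {
    // Returns an image whose pixels are the absolute per-channel difference of a and b.
    static BufferedImage difference(BufferedImage a, BufferedImage b) {
        if (a.getWidth() != b.getWidth() || a.getHeight() != b.getHeight()) {
            throw new IllegalArgumentException("frames must have the same size");
        }
        int w = a.getWidth(), h = a.getHeight();
        BufferedImage out = new BufferedImage(w, h, BufferedImage.TYPE_INT_RGB);
        for (int y = 0; y < h; y++) {
            for (int x = 0; x < w; x++) {
                int p = a.getRGB(x, y), q = b.getRGB(x, y);
                int dr = Math.abs(((p >> 16) & 0xFF) - ((q >> 16) & 0xFF));
                int dg = Math.abs(((p >> 8) & 0xFF) - ((q >> 8) & 0xFF));
                int db = Math.abs((p & 0xFF) - (q & 0xFF));
                out.setRGB(x, y, (dr << 16) | (dg << 8) | db);
            }
        }
        return out;
    }

    // Average difference over all pixels and channels, scaled to the range 0..1.
    static double meanDifference(BufferedImage a, BufferedImage b) {
        BufferedImage d = difference(a, b);
        long sum = 0;
        for (int y = 0; y < d.getHeight(); y++) {
            for (int x = 0; x < d.getWidth(); x++) {
                int p = d.getRGB(x, y);
                sum += ((p >> 16) & 0xFF) + ((p >> 8) & 0xFF) + (p & 0xFF);
            }
        }
        return sum / (3.0 * 255.0 * d.getWidth() * d.getHeight());
    }
}
```

Per-pixel getRGB and setRGB calls are convenient but not the fastest way to get at image data, so even this version leaves room for improvement; the point is only to show how little arithmetic the filter itself involves.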
A couple of months ago, I took a stab at making an AMV (anime music video). I’ve done it before… once… but this time, like lots of times, I had absolutely no time to work on it because I’d jam-packed myself with more homework than I knew what to do with.
So, I found myself pacing along, in a sleep-deprived state, talking at a friend about how I tried to do such-and-such and had to do wossname instead and never found time for, um, doing that AMV thing and that since it was me of all people doing it I might as well save time in an insanely difficult way by, say, programming my computer to do it all for me.
An automatic AMV generator. Sometimes I amaze myself. Oftentimes I come up with some idea that blows my socks off and then realize it’s just a telephone or a wheel. This time, whether or not it had been done before, I knew I was particularly persnicketily picky about my AMVs… so picky that I was sure no other automatic AMV generator could be exactly what I wanted.
To make myself clear, this is something that doesn’t exist yet, but it’s something I’ve been actively working on for a few weeks. And I’m calling it MVTron.