I'm using mplex (mjpegtools-1.9.0) in a customer project and I have some questions about the performance of this tool. The goal of my customer's project is to capture video (via the customer's special hardware) and audio (via ALSA) from an input device, to encode the video stream to MPEG-2, to multiplex the audio with the encoded video stream, and to write the resulting stream to a DVD. All of this has to run "on the fly".
I'm running Linux 2.6.32_2 on an ARM9 controller, a DaVinci DM365 (about 400 MHz), and I'm using the corresponding SDK from Texas Instruments (version 12). For multiplexing I'm using mplex from the mjpegtools-1.9.0 package, and for authoring the stream before writing it to a DVD I'm using dvdauthor. The encoder of the DaVinci controller is able to create I-frames and P-frames but no B-frames. I can configure the bitrate between 2 Mbit/s and 6.5 Mbit/s (currently I'm using 2 Mbit/s). The stream on the DVD has to be PAL (or NTSC) with 24 frames per sec.
My problem now is that I'm losing lots of frames, and the result is an increasing offset between audio and video.
In order to make sure that I don't have a problem with the encoder, I checked all the processing steps individually: as long as mplex doesn't run in parallel with the encoder (part of my application), everything is fine and no frames are lost. By the way, the encoder is a hardware block of the DM365 and is controlled by my application.
Multiplexing this video stream by calling mplex as a separate process, without running my app at the same time, leads to good results. In this scenario I noticed that mplex consumes about 90% of the CPU time, and the time for multiplexing is nearly the same as the time I needed for capturing and encoding.
When mplex runs in parallel (on the fly) with the rest of my app (which consumes about 20%), it gets only about 65% of the CPU time. -> All the FIFOs and intermediate buffers fill up very quickly, and the consequence is that the capturing unit drops frames.
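As a rough back-of-the-envelope check of these numbers, the following Python sketch estimates how fast a buffer in front of mplex would fill. It is only an illustration under stated assumptions: it assumes that mplex throughput scales linearly with its CPU share, it ignores the audio stream (its bitrate is not given here), and the 1 MiB FIFO size and all function names are hypothetical, not taken from my application.

```python
def backlog_growth_bps(stream_bps, cpu_needed, cpu_available):
    """Bits per second that pile up in front of the multiplexer."""
    # Assumption: throughput is proportional to the CPU share mplex gets.
    speed = min(1.0, cpu_available / cpu_needed)  # fraction of real time
    return stream_bps * (1.0 - speed)


def seconds_until_full(fifo_bytes, stream_bps, cpu_needed, cpu_available):
    """Time until a FIFO of the given size overflows (inf if it never does)."""
    growth = backlog_growth_bps(stream_bps, cpu_needed, cpu_available)
    if growth <= 0:
        return float("inf")
    return fifo_bytes * 8 / growth


VIDEO_BPS = 2_000_000      # 2 Mbit/s video bitrate (from the setup above)
CPU_NEEDED = 0.90          # share mplex used when keeping up with real time
CPU_AVAILABLE = 0.65       # share mplex gets next to the application
FIFO_BYTES = 1024 * 1024   # hypothetical 1 MiB FIFO, purely illustrative

if __name__ == "__main__":
    growth = backlog_growth_bps(VIDEO_BPS, CPU_NEEDED, CPU_AVAILABLE)
    t_full = seconds_until_full(FIFO_BYTES, VIDEO_BPS, CPU_NEEDED, CPU_AVAILABLE)
    print(f"backlog grows by about {growth / 1000:.0f} kbit/s")
    print(f"a {FIFO_BYTES} byte FIFO would be full after about {t_full:.0f} s")
```

Under these assumptions mplex runs at roughly 72% of real-time speed, so any fixed-size buffer only postpones the overflow; making the FIFOs bigger would not fix the dropped frames.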
On an x86 architecture it's completely different: multiplexing there takes only a few seconds. Looking into the code, I see some hints about optimizations for x86 and for PPC.
- So what about ARM9, are there any optimizations available?
- Is a Floating Point Unit necessary?
- Or even worse, is mplex suitable at all for this "on the fly" project?
- Do you have any hints for improving the performance of mplex (besides using a different CPU :))?
It would be great if someone could help.
Thanks in advance and Best Regards