On Thu, 2009-10-22 at 07:13 +0200, Sebastian Dröge wrote:
> On Wednesday, 21.10.2009, at 12:05 -0700, Edward Hervey wrote:
> > Module: gst-plugins-base
> > Branch: master
> > Commit: d48d47e68365990d7c66782225f7fddf7efde86e
> > URL: http://cgit.freedesktop.org/gstreamer/gst-plugins-base/commit/?id=d48d47e68365990d7c66782225f7fddf7efde86e
> >
> > Author: Edward Hervey <[hidden email]>
> > Date: Wed Oct 21 20:44:33 2009 +0200
> >
> > typefind: speed up mxf_type_find over 300 times for worst case scenarios
> >
> > * memcmp is expensive and was being abused, reduce calling it by checking
> > the first byte.
>
> I expected that the memcmp() would be inlined by the compiler because of
> the fixed length... and then it would be as fast as your change. Any
> idea why the compiler doesn't inline it here? :)
I'm now totally confused. I just recompiled it and checked the asm,
and it does inline it with a "repz cmpsb" on x86 (which is what you'd
expect).
The reason why the initial data[i] check speeds things up by an extra
50% seems to be that "cmp" (of data[i] == 0x06) is faster than the
setup/usage of "rep[z] cmps[b]". That would correspond to reducing the
probability that "cmps" needs to be run to roughly 1/256 on random data.
Adding a manual check for the second byte speeds things up only a tiny
bit more (less than 1% speedup compared to checking the first byte), so
I left it out.
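A minimal sketch of the first-byte check described above, not the code from the commit. The 4-byte example_key and the function name are assumptions made for illustration; the only detail taken from this message is that the key starts with 0x06.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative pattern only (assumption): any key whose first byte is
 * 0x06 would do. This is not claimed to be the key used in the commit. */
static const unsigned char example_key[4] = { 0x06, 0x0e, 0x2b, 0x34 };

/* Returns the offset of the first occurrence of example_key in data,
 * or -1 if it does not occur. */
long
scan_with_first_byte_check (const unsigned char *data, size_t size)
{
  size_t i;

  for (i = 0; i + sizeof (example_key) <= size; i++) {
    /* Cheap single-byte compare first: memcmp() only runs when the
     * first byte already matches. */
    if (data[i] == 0x06 &&
        memcmp (data + i, example_key, sizeof (example_key)) == 0)
      return (long) i;
  }
  return -1;
}
```

The short-circuit && is what saves the work: on data where 0x06 is rare, almost every position is rejected by the one-byte compare.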
The biggest overhead is definitely calling the data_scan_ctx methods 1
byte at a time, even if they're inlined. The mp3 typefind function
handles that on its own, which results in a much faster typefinder
despite its complexity.
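To illustrate the difference, here is a rough sketch of the two scanning styles. The scan_ctx type and its helpers are hypothetical stand-ins, not the real data_scan_ctx API, and find_bulk is just one possible way of working on the buffer directly (it uses memchr to skip to candidate positions); it is not what the mp3 typefinder does.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a scan context (assumption, not the real
 * data_scan_ctx from gst-plugins-base). */
typedef struct
{
  const unsigned char *data;
  size_t size;
  size_t offset;
} scan_ctx;

static int
ctx_ensure (const scan_ctx * c, size_t n)
{
  return c->offset + n <= c->size;
}

static void
ctx_advance (scan_ctx * c, size_t n)
{
  c->offset += n;
}

/* Style 1: go through the context helpers for every single byte. */
long
find_per_byte (const unsigned char *buf, size_t len,
    const unsigned char *key, size_t key_len)
{
  scan_ctx c = { buf, len, 0 };

  if (key_len == 0)
    return -1;
  while (ctx_ensure (&c, key_len)) {
    if (memcmp (c.data + c.offset, key, key_len) == 0)
      return (long) c.offset;
    ctx_advance (&c, 1);
  }
  return -1;
}

/* Style 2: work on the raw buffer and let memchr() jump straight to
 * the next position whose first byte matches. */
long
find_bulk (const unsigned char *buf, size_t len,
    const unsigned char *key, size_t key_len)
{
  const unsigned char *p = buf;
  const unsigned char *end;     /* one past the last valid start */

  if (key_len == 0 || len < key_len)
    return -1;
  end = buf + (len - key_len) + 1;
  while (p < end && (p = memchr (p, key[0], (size_t) (end - p))) != NULL) {
    if (memcmp (p, key, key_len) == 0)
      return (long) (p - buf);
    p++;
  }
  return -1;
}
```

Both functions return the same offsets; the second one simply avoids the per-byte bookkeeping.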
FWIW, the other expensive typefinders are:
* mpeg_video_stream_type
* mpeg_find_next_header
* mpeg_sys_type_find
* h264_video_type_find
And most of the overhead in those comes from using the
data_scan_ctx methods over a small number of bytes at a time.
FYI, on most modern CPUs, cmp is 7-10 times faster than cmps for a
single mismatching byte (yes, crazy), and the setup is also more
expensive (you need to set up 3 registers, whereas with cmp you're
comparing a constant to a memory region which is already loaded).
>
> But thanks for noticing and fixing this :)
gstreamer-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gstreamer-devel