Hi again,
I'll try to restate and simplify my questions. 1. Is it preferable to optimize the existing audioconvert plugin for a special case where the only operation to be performed is endianness conversion or to create a new plugin for this particular purpose? The 'if' clause to see if the special case applies is fairly quick and lightweight and the overall modification does not add much to the code size. 2. Is there a recommended method to passthrough a buffer with zero copy when no modification of the buffer contents are required, but allow modification in place when the data needs to be modified by the filter? More details are listed in my original post below for those who might be interested. Thanks, Ed ________________________________________ From: gstreamer-devel-bounces+edward.endejan=[hidden email] [gstreamer-devel-bounces+edward.endejan=[hidden email]] On Behalf Of Endejan, Edward [[hidden email]] Sent: Wednesday, February 01, 2012 10:58 AM To: [hidden email] Subject: audioconvert inefficient when converting only endianness Hi, I have two main questions listed near the end, but first here is the background information ... I have been evaluating the CPU load of various GStreamer pipelines. In one case of interest, I needed to use audioconvert simply to change the endianness of a PCM audio stream. Server pipeline: alsasrc ! audioconvert ! \ "audio/x-raw-int, endianness=(int)4321, \ signed=(boolean)true, width=(int)16, depth=(int)16, \ rate=(int)44100, channels=(int)2" ! \ rtpL16pay ! udpsink Client pipeline: udpsrc ! rtpL16depay ! queue2 ! audioconvert ! "audio/x-raw-int, endianness=(int)1234, \ signed=(boolean)true, width=(int)16, depth=(int)16, \ rate=(int)44100, channels=(int)2" ! \ alsasink I found that the audioconvert plugin was consuming more than 50% of the processor budget for the entire chain. When I investigated this I noticed that audioconvert was always up-converting the input stream to at least 32-bit signed, presumably to retain precision for the potential conversions that it would do. But in the case that I need, where endianness conversion is the only change required, this up-conversion is unnecessary and wasteful in terms of CPU load and potentially memory consumption as well. I implemented a modification (see the end of my post for a diff showing my current, rough proposed changes) which first checks if endianness conversion is the only required change, and if so avoids the type conversion step. The performance improvement is dramatic for the 16-bit width case that I've tested. I also verified proper operation for 8, 32, and 64-bit width cases, but have not compared performance for these cases. I tried to test the 24-bit width case, which supposedly is a valid setting for width, but I always get the error "size 4096 is not a multiple of unit size 6" as the pipeline is being set up and so have not been able to verify that case. I also tried to use "width=24, depth=32" but it did not help. (For all the other tests I used settings with depth == width.) If someone could explain how to test the 24-bit width case, please let me know and I'll give it a try. Though the 8-bit case works as implemented, I think that it could be further optimized if the input buffer could simply be passed along as the output buffer rather than being copied. I attempted to use gst_buffer_replace(), but that causes the gst-launch process to hang. I have seen the gst_base_transform_is_passthrough flag in the GstBaseTransform object which looks promising, but I'm not sure how or if I could use it for this case. I would appreciate a recommendation or a reference to example code on the proper method for passing along a buffer without change from input to output. Notice that I had to add a special clause to detect the 8-bit width case since the endianness comparison alone was not catching it. Endianness really doesn't matter for the 8-bit width case, since we are only talking about byte endianness, not bit endianness. So, here are my main questions: 1. Is it better to optimize endianness conversion within the audioconvert plugin or to create a new plugin for this explicit purpose? 2. What is the proper way to pass a buffer from input to output without doing a copy? Best Regards, Ed Endejan P.S. Below are the changes I've tested thus far: --------------- --- gst-plugins-base-0.10.35/gst/audioconvert/audioconvert.c.org +++ gst-plugins-base-0.10.35/gst/audioconvert/audioconvert.c @@ -709,6 +709,81 @@ return TRUE; } +static inline void +audio_convert_endianness_only (gpointer src, gpointer dst, + gint width, gint samples) +{ + /* assumes src & dst already checked for NULL and samples > 0 */ + /* which is a valid assumption since this is only called from */ + /* within audio_convert_convert after such checks */ + switch(width) { + case 64: { + uint64_t *sp = (uint64_t *)src; + uint64_t *dp = (uint64_t *)dst; + for(; samples; samples--) { + *dp = (*sp << 56) + | ((*sp & 0x000000000000FF00) << 40) + | ((*sp & 0x0000000000FF0000) << 24) + | ((*sp & 0x00000000FF000000) << 8) + | ((*sp & 0x000000FF00000000) >> 8) + | ((*sp & 0x0000FF0000000000) >> 24) + | ((*sp & 0x00FF000000000000) >> 40) + | (*sp >> 56); + sp++; + dp++; + } + break; + } + case 32: { + uint32_t *sp = (uint32_t *)src; + uint32_t *dp = (uint32_t *)dst; + for(; samples; samples--) { + *dp = (*sp << 24) + | ((*sp & 0x0000FF00) << 8) + | ((*sp & 0x00FF0000) >> 8) + | (*sp >> 24); + sp++; + dp++; + } + break; + } +#if 0 + case 24: { + /* 24-bit case is not real */ + /* attempting to use width=24 causes error "size 4096 is not a multiple of unit size 6" */ + GST_WARNING ("24-bit endianness conversion ignored\n"); + (void)memcpy((void *)dst, (void *)src, (size_t)(width/8 * samples)); + /* gst_buffer_replace(&dst, src); */ /* just pass along the unmodified buffer */ + break; + } +#endif + case 16: { + uint16_t *sp = (uint16_t *)src; + uint16_t *dp = (uint16_t *)dst; + for(; samples; samples--) { + *dp = (*sp << 8) | (*sp >> 8); + sp++; + dp++; + } + break; + } + case 8: { + /* endianness conversion is a nop for 8-bit width */ + (void)memcpy((void *)dst, (void *)src, (size_t)samples); + /* gst_buffer_replace(&dst, src); */ /* just pass along the unmodified buffer */ + break; + } +#if 0 + default: { + GST_WARNING ("Unexpected width - endianness conversion ignored\n"); + (void)memcpy((void *)dst, (void *)src, (size_t)(width/8 * samples)); + /* gst_buffer_replace(&dst, src); */ /* just pass along the unmodified buffer */ + break; + } +#endif + } +} + gboolean audio_convert_convert (AudioConvertCtx * ctx, gpointer src, gpointer dst, gint samples, gboolean src_writable) @@ -725,6 +800,14 @@ if (samples == 0) return TRUE; + if (ctx->mix_passthrough && (ctx->in.width == ctx->out.width) + && ((ctx->in.width == 8) || (ctx->in.endianness != ctx->out.endianness))) { + /* no need for type conversion if only endianness conversion is needed */ + audio_convert_endianness_only (src, dst, ctx->out.width, samples * ctx->out.channels); + + return TRUE; + } + insize = ctx->in.unit_size * samples; outsize = ctx->out.unit_size * samples; --------------- _______________________________________________ gstreamer-devel mailing list [hidden email] http://lists.freedesktop.org/mailman/listinfo/gstreamer-devel _______________________________________________ gstreamer-devel mailing list [hidden email] http://lists.freedesktop.org/mailman/listinfo/gstreamer-devel |
Edward,
Patches aren't usually sent to the mailing list - please file something in bugzilla and attach your suggested patch. That'll ensure that it doesn't get lost. As for your specific questions: 1) Yes, it's entirely sensible to optimise audioconvert for common/simple conversions like this. Please finish your patch and attach it to a bugzilla entry! 2) That's generally how gstreamer works. If zero-copy is possible at all (it might not be - the obvious example is where the same buffer is being passed to multiple different elements simultaneously, some of which are writing to the buffer), it happens automatically. You just need to call gst_buffer_make_writable() before writing to it. If you're writing a subclass of GstBaseTransform, like audioconvert, some of this is done a little differently - but still, zero-copy will be done whenever possible. Mike On Thu, Feb 2, 2012 at 6:57 PM, Endejan, Edward <[hidden email]> wrote: > Hi again, > > I'll try to restate and simplify my questions. > > 1. Is it preferable to optimize the existing audioconvert > plugin for a special case where the only operation > to be performed is endianness conversion or > to create a new plugin for this particular purpose? > The 'if' clause to see if the special case applies is > fairly quick and lightweight and the overall > modification does not add much to the code size. > > 2. Is there a recommended method to passthrough a > buffer with zero copy when no modification of the > buffer contents are required, but allow modification > in place when the data needs to be modified by > the filter? > > More details are listed in my original post below for those > who might be interested. > > Thanks, > Ed > > ________________________________________ > From: gstreamer-devel-bounces+edward.endejan=[hidden email] [gstreamer-devel-bounces+edward.endejan=[hidden email]] On Behalf Of Endejan, Edward [[hidden email]] > Sent: Wednesday, February 01, 2012 10:58 AM > To: [hidden email] > Subject: audioconvert inefficient when converting only endianness > > Hi, > > I have two main questions listed near the end, but first here is the > background information ... > > I have been evaluating the CPU load of various GStreamer pipelines. > In one case of interest, I needed to use audioconvert simply to > change the endianness of a PCM audio stream. > Server pipeline: > alsasrc ! audioconvert ! \ > "audio/x-raw-int, endianness=(int)4321, \ > signed=(boolean)true, width=(int)16, depth=(int)16, \ > rate=(int)44100, channels=(int)2" ! \ > rtpL16pay ! udpsink > Client pipeline: > udpsrc ! rtpL16depay ! queue2 ! audioconvert ! > "audio/x-raw-int, endianness=(int)1234, \ > signed=(boolean)true, width=(int)16, depth=(int)16, \ > rate=(int)44100, channels=(int)2" ! \ > alsasink > > I found that the audioconvert plugin was consuming more than 50% > of the processor budget for the entire chain. When I investigated > this I noticed that audioconvert was always up-converting the > input stream to at least 32-bit signed, presumably to retain > precision for the potential conversions that it would do. But in > the case that I need, where endianness conversion is the only > change required, this up-conversion is unnecessary and wasteful > in terms of CPU load and potentially memory consumption as well. > > I implemented a modification (see the end of my post for a diff > showing my current, rough proposed changes) which first checks > if endianness conversion is the only required change, and if so > avoids the type conversion step. The performance improvement is > dramatic for the 16-bit width case that I've tested. I also > verified proper operation for 8, 32, and 64-bit width cases, but > have not compared performance for these cases. > > I tried to test the 24-bit width case, which supposedly is a > valid setting for width, but I always get the error > "size 4096 is not a multiple of unit size 6" as the pipeline is > being set up and so have not been able to verify that case. > I also tried to use "width=24, depth=32" but it did not help. > (For all the other tests I used settings with depth == width.) > If someone could explain how to test the 24-bit width case, > please let me know and I'll give it a try. > > Though the 8-bit case works as implemented, I think that it > could be further optimized if the input buffer could simply be > passed along as the output buffer rather than being copied. > I attempted to use gst_buffer_replace(), but that causes the > gst-launch process to hang. I have seen the > gst_base_transform_is_passthrough flag in the GstBaseTransform > object which looks promising, but I'm not sure how or if I could > use it for this case. I would appreciate a recommendation or a > reference to example code on the proper method for passing along > a buffer without change from input to output. > > Notice that I had to add a special clause to detect the 8-bit width > case since the endianness comparison alone was not catching it. > Endianness really doesn't matter for the 8-bit width case, since > we are only talking about byte endianness, not bit endianness. > > So, here are my main questions: > 1. Is it better to optimize endianness conversion > within the audioconvert plugin or > to create a new plugin for this explicit purpose? > 2. What is the proper way to pass a buffer from input > to output without doing a copy? > > Best Regards, > Ed Endejan > > P.S. Below are the changes I've tested thus far: > --------------- > --- gst-plugins-base-0.10.35/gst/audioconvert/audioconvert.c.org > +++ gst-plugins-base-0.10.35/gst/audioconvert/audioconvert.c > @@ -709,6 +709,81 @@ > return TRUE; > } > > +static inline void > +audio_convert_endianness_only (gpointer src, gpointer dst, > + gint width, gint samples) > +{ > + /* assumes src & dst already checked for NULL and samples > 0 */ > + /* which is a valid assumption since this is only called from */ > + /* within audio_convert_convert after such checks */ > + switch(width) { > + case 64: { > + uint64_t *sp = (uint64_t *)src; > + uint64_t *dp = (uint64_t *)dst; > + for(; samples; samples--) { > + *dp = (*sp << 56) > + | ((*sp & 0x000000000000FF00) << 40) > + | ((*sp & 0x0000000000FF0000) << 24) > + | ((*sp & 0x00000000FF000000) << 8) > + | ((*sp & 0x000000FF00000000) >> 8) > + | ((*sp & 0x0000FF0000000000) >> 24) > + | ((*sp & 0x00FF000000000000) >> 40) > + | (*sp >> 56); > + sp++; > + dp++; > + } > + break; > + } > + case 32: { > + uint32_t *sp = (uint32_t *)src; > + uint32_t *dp = (uint32_t *)dst; > + for(; samples; samples--) { > + *dp = (*sp << 24) > + | ((*sp & 0x0000FF00) << 8) > + | ((*sp & 0x00FF0000) >> 8) > + | (*sp >> 24); > + sp++; > + dp++; > + } > + break; > + } > +#if 0 > + case 24: { > + /* 24-bit case is not real */ > + /* attempting to use width=24 causes error "size 4096 is not a multiple of unit size 6" */ > + GST_WARNING ("24-bit endianness conversion ignored\n"); > + (void)memcpy((void *)dst, (void *)src, (size_t)(width/8 * samples)); > + /* gst_buffer_replace(&dst, src); */ /* just pass along the unmodified buffer */ > + break; > + } > +#endif > + case 16: { > + uint16_t *sp = (uint16_t *)src; > + uint16_t *dp = (uint16_t *)dst; > + for(; samples; samples--) { > + *dp = (*sp << 8) | (*sp >> 8); > + sp++; > + dp++; > + } > + break; > + } > + case 8: { > + /* endianness conversion is a nop for 8-bit width */ > + (void)memcpy((void *)dst, (void *)src, (size_t)samples); > + /* gst_buffer_replace(&dst, src); */ /* just pass along the unmodified buffer */ > + break; > + } > +#if 0 > + default: { > + GST_WARNING ("Unexpected width - endianness conversion ignored\n"); > + (void)memcpy((void *)dst, (void *)src, (size_t)(width/8 * samples)); > + /* gst_buffer_replace(&dst, src); */ /* just pass along the unmodified buffer */ > + break; > + } > +#endif > + } > +} > + > gboolean > audio_convert_convert (AudioConvertCtx * ctx, gpointer src, > gpointer dst, gint samples, gboolean src_writable) > @@ -725,6 +800,14 @@ > if (samples == 0) > return TRUE; > > + if (ctx->mix_passthrough && (ctx->in.width == ctx->out.width) > + && ((ctx->in.width == 8) || (ctx->in.endianness != ctx->out.endianness))) { > + /* no need for type conversion if only endianness conversion is needed */ > + audio_convert_endianness_only (src, dst, ctx->out.width, samples * ctx->out.channels); > + > + return TRUE; > + } > + > insize = ctx->in.unit_size * samples; > outsize = ctx->out.unit_size * samples; > > --------------- > _______________________________________________ > gstreamer-devel mailing list > [hidden email] > http://lists.freedesktop.org/mailman/listinfo/gstreamer-devel > > _______________________________________________ > gstreamer-devel mailing list > [hidden email] > http://lists.freedesktop.org/mailman/listinfo/gstreamer-devel gstreamer-devel mailing list [hidden email] http://lists.freedesktop.org/mailman/listinfo/gstreamer-devel |
Free forum by Nabble | Edit this page |