RE: audioconvert inefficient when converting only endianness / buffer passthrough method

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

RE: audioconvert inefficient when converting only endianness / buffer passthrough method

Endejan, Edward
Hi again,

I'll try to restate and simplify my questions.

        1. Is it preferable to optimize the existing audioconvert
                plugin for a special case where the only operation
                to be performed is endianness conversion or
                to create a new plugin for this particular purpose?
                The 'if' clause to see if the special case applies is
                fairly quick and lightweight and the overall
                modification does not add much to the code size.

        2. Is there a recommended method to passthrough a
                buffer with zero copy when no modification of the
                buffer contents are required, but allow modification
                in place when the data needs to be modified by
                the filter?

More details are listed in my original post below for those
who might be interested.

Thanks,
Ed

________________________________________
From: gstreamer-devel-bounces+edward.endejan=[hidden email] [gstreamer-devel-bounces+edward.endejan=[hidden email]] On Behalf Of Endejan, Edward [[hidden email]]
Sent: Wednesday, February 01, 2012 10:58 AM
To: [hidden email]
Subject: audioconvert inefficient when converting only endianness

Hi,

I have two main questions listed near the end, but first here is the
background information ...

I have been evaluating the CPU load of various GStreamer pipelines.
In one case of interest, I needed to use audioconvert simply to
change the endianness of a PCM audio stream.
Server pipeline:
        alsasrc ! audioconvert ! \
        "audio/x-raw-int, endianness=(int)4321, \
        signed=(boolean)true, width=(int)16, depth=(int)16, \
        rate=(int)44100, channels=(int)2" ! \
        rtpL16pay ! udpsink
Client pipeline:
        udpsrc ! rtpL16depay ! queue2 ! audioconvert !
        "audio/x-raw-int, endianness=(int)1234, \
        signed=(boolean)true, width=(int)16, depth=(int)16, \
        rate=(int)44100, channels=(int)2" ! \
        alsasink

I found that the audioconvert plugin was consuming more than 50%
of the processor budget for the entire chain. When I investigated
this I noticed that audioconvert was always up-converting the
input stream to at least 32-bit signed, presumably to retain
precision for the potential conversions that it would do. But in
the case that I need, where endianness conversion is the only
change required, this up-conversion is unnecessary and wasteful
in terms of CPU load and potentially memory consumption as well.

I implemented a modification (see the end of my post for a diff
showing my current, rough proposed changes) which first checks
if endianness conversion is the only required change, and if so
avoids the type conversion step. The performance improvement is
dramatic for the 16-bit width case that I've tested. I also
verified proper operation for 8, 32, and 64-bit width cases, but
have not compared performance for these cases.

I tried to test the 24-bit width case, which supposedly is a
valid setting for width, but I always get the error
"size 4096 is not a multiple of unit size 6" as the pipeline is
being set up and so have not been able to verify that case.
I also tried to use "width=24, depth=32" but it did not help.
(For all the other tests I used settings with depth == width.)
If someone could explain how to test the 24-bit width case,
please let me know and I'll give it a try.

Though the 8-bit case works as implemented, I think that it
could be further optimized if the input buffer could simply be
passed along as the output buffer rather than being copied.
I attempted to use gst_buffer_replace(), but that causes the
gst-launch process to hang. I have seen the
gst_base_transform_is_passthrough flag in the GstBaseTransform
object which looks promising, but I'm not sure how or if I could
use it for this case. I would appreciate a recommendation or a
reference to example code on the proper method for passing along
a buffer without change from input to output.

Notice that I had to add a special clause to detect the 8-bit width
case since the endianness comparison alone was not catching it.
Endianness really doesn't matter for the 8-bit width case, since
we are only talking about byte endianness, not bit endianness.

So, here are my main questions:
        1. Is it better to optimize endianness conversion
                within the audioconvert plugin or
                to create a new plugin for this explicit purpose?
        2. What is the proper way to pass a buffer from input
                to output without doing a copy?

Best Regards,
Ed Endejan

P.S. Below are the changes I've tested thus far:
---------------
--- gst-plugins-base-0.10.35/gst/audioconvert/audioconvert.c.org
+++ gst-plugins-base-0.10.35/gst/audioconvert/audioconvert.c
@@ -709,6 +709,81 @@
   return TRUE;
 }

+static inline void
+audio_convert_endianness_only (gpointer src, gpointer dst,
+    gint width, gint samples)
+{
+  /* assumes src & dst already checked for NULL and samples > 0 */
+  /* which is a valid assumption since this is only called from */
+  /* within audio_convert_convert after such checks */
+  switch(width) {
+    case 64: {
+      uint64_t *sp = (uint64_t *)src;
+      uint64_t *dp = (uint64_t *)dst;
+      for(; samples; samples--) {
+        *dp = (*sp << 56)
+            | ((*sp & 0x000000000000FF00) << 40)
+            | ((*sp & 0x0000000000FF0000) << 24)
+            | ((*sp & 0x00000000FF000000) <<  8)
+            | ((*sp & 0x000000FF00000000) >>  8)
+            | ((*sp & 0x0000FF0000000000) >> 24)
+            | ((*sp & 0x00FF000000000000) >> 40)
+            | (*sp >> 56);
+        sp++;
+        dp++;
+      }
+      break;
+    }
+    case 32: {
+      uint32_t *sp = (uint32_t *)src;
+      uint32_t *dp = (uint32_t *)dst;
+      for(; samples; samples--) {
+        *dp = (*sp << 24)
+            | ((*sp & 0x0000FF00) << 8)
+            | ((*sp & 0x00FF0000) >> 8)
+            | (*sp >> 24);
+        sp++;
+        dp++;
+      }
+      break;
+    }
+#if 0
+    case 24: {
+      /* 24-bit case is not real */
+      /* attempting to use width=24 causes error "size 4096 is not a multiple of unit size 6" */
+      GST_WARNING ("24-bit endianness conversion ignored\n");
+      (void)memcpy((void *)dst, (void *)src, (size_t)(width/8 * samples));
+      /* gst_buffer_replace(&dst, src); */ /* just pass along the unmodified buffer */
+      break;
+    }
+#endif
+    case 16: {
+      uint16_t *sp = (uint16_t *)src;
+      uint16_t *dp = (uint16_t *)dst;
+      for(; samples; samples--) {
+        *dp = (*sp << 8) | (*sp >> 8);
+        sp++;
+        dp++;
+      }
+      break;
+    }
+    case 8: {
+      /* endianness conversion is a nop for 8-bit width */
+      (void)memcpy((void *)dst, (void *)src, (size_t)samples);
+      /* gst_buffer_replace(&dst, src); */ /* just pass along the unmodified buffer */
+      break;
+    }
+#if 0
+    default: {
+      GST_WARNING ("Unexpected width - endianness conversion ignored\n");
+      (void)memcpy((void *)dst, (void *)src, (size_t)(width/8 * samples));
+      /* gst_buffer_replace(&dst, src); */ /* just pass along the unmodified buffer */
+      break;
+    }
+#endif
+  }
+}
+
 gboolean
 audio_convert_convert (AudioConvertCtx * ctx, gpointer src,
     gpointer dst, gint samples, gboolean src_writable)
@@ -725,6 +800,14 @@
   if (samples == 0)
     return TRUE;

+  if (ctx->mix_passthrough && (ctx->in.width == ctx->out.width)
+      && ((ctx->in.width == 8) || (ctx->in.endianness != ctx->out.endianness))) {
+    /* no need for type conversion if only endianness conversion is needed */
+    audio_convert_endianness_only (src, dst, ctx->out.width, samples * ctx->out.channels);
+
+    return TRUE;
+  }
+
   insize = ctx->in.unit_size * samples;
   outsize = ctx->out.unit_size * samples;

---------------
_______________________________________________
gstreamer-devel mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/gstreamer-devel

_______________________________________________
gstreamer-devel mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/gstreamer-devel
Reply | Threaded
Open this post in threaded view
|

Re: audioconvert inefficient when converting only endianness / buffer passthrough method

michael smith-6-3
Edward,

Patches aren't usually sent to the mailing list - please file
something in bugzilla and attach your suggested patch. That'll ensure
that it doesn't get lost.

As for your specific questions:

1) Yes, it's entirely sensible to optimise audioconvert for
common/simple conversions like this. Please finish your patch and
attach it to a bugzilla entry!

2) That's generally how gstreamer works. If zero-copy is possible at
all (it might not be - the obvious example is where the same buffer is
being passed to multiple different elements simultaneously, some of
which are writing to the buffer), it happens automatically. You just
need to call gst_buffer_make_writable() before writing to it. If
you're writing a subclass of GstBaseTransform, like audioconvert, some
of this is done a little differently - but still, zero-copy will be
done whenever possible.

Mike


On Thu, Feb 2, 2012 at 6:57 PM, Endejan, Edward
<[hidden email]> wrote:

> Hi again,
>
> I'll try to restate and simplify my questions.
>
>        1. Is it preferable to optimize the existing audioconvert
>                plugin for a special case where the only operation
>                to be performed is endianness conversion or
>                to create a new plugin for this particular purpose?
>                The 'if' clause to see if the special case applies is
>                fairly quick and lightweight and the overall
>                modification does not add much to the code size.
>
>        2. Is there a recommended method to passthrough a
>                buffer with zero copy when no modification of the
>                buffer contents are required, but allow modification
>                in place when the data needs to be modified by
>                the filter?
>
> More details are listed in my original post below for those
> who might be interested.
>
> Thanks,
> Ed
>
> ________________________________________
> From: gstreamer-devel-bounces+edward.endejan=[hidden email] [gstreamer-devel-bounces+edward.endejan=[hidden email]] On Behalf Of Endejan, Edward [[hidden email]]
> Sent: Wednesday, February 01, 2012 10:58 AM
> To: [hidden email]
> Subject: audioconvert inefficient when converting only endianness
>
> Hi,
>
> I have two main questions listed near the end, but first here is the
> background information ...
>
> I have been evaluating the CPU load of various GStreamer pipelines.
> In one case of interest, I needed to use audioconvert simply to
> change the endianness of a PCM audio stream.
> Server pipeline:
>        alsasrc ! audioconvert ! \
>        "audio/x-raw-int, endianness=(int)4321, \
>        signed=(boolean)true, width=(int)16, depth=(int)16, \
>        rate=(int)44100, channels=(int)2" ! \
>        rtpL16pay ! udpsink
> Client pipeline:
>        udpsrc ! rtpL16depay ! queue2 ! audioconvert !
>        "audio/x-raw-int, endianness=(int)1234, \
>        signed=(boolean)true, width=(int)16, depth=(int)16, \
>        rate=(int)44100, channels=(int)2" ! \
>        alsasink
>
> I found that the audioconvert plugin was consuming more than 50%
> of the processor budget for the entire chain. When I investigated
> this I noticed that audioconvert was always up-converting the
> input stream to at least 32-bit signed, presumably to retain
> precision for the potential conversions that it would do. But in
> the case that I need, where endianness conversion is the only
> change required, this up-conversion is unnecessary and wasteful
> in terms of CPU load and potentially memory consumption as well.
>
> I implemented a modification (see the end of my post for a diff
> showing my current, rough proposed changes) which first checks
> if endianness conversion is the only required change, and if so
> avoids the type conversion step. The performance improvement is
> dramatic for the 16-bit width case that I've tested. I also
> verified proper operation for 8, 32, and 64-bit width cases, but
> have not compared performance for these cases.
>
> I tried to test the 24-bit width case, which supposedly is a
> valid setting for width, but I always get the error
> "size 4096 is not a multiple of unit size 6" as the pipeline is
> being set up and so have not been able to verify that case.
> I also tried to use "width=24, depth=32" but it did not help.
> (For all the other tests I used settings with depth == width.)
> If someone could explain how to test the 24-bit width case,
> please let me know and I'll give it a try.
>
> Though the 8-bit case works as implemented, I think that it
> could be further optimized if the input buffer could simply be
> passed along as the output buffer rather than being copied.
> I attempted to use gst_buffer_replace(), but that causes the
> gst-launch process to hang. I have seen the
> gst_base_transform_is_passthrough flag in the GstBaseTransform
> object which looks promising, but I'm not sure how or if I could
> use it for this case. I would appreciate a recommendation or a
> reference to example code on the proper method for passing along
> a buffer without change from input to output.
>
> Notice that I had to add a special clause to detect the 8-bit width
> case since the endianness comparison alone was not catching it.
> Endianness really doesn't matter for the 8-bit width case, since
> we are only talking about byte endianness, not bit endianness.
>
> So, here are my main questions:
>        1. Is it better to optimize endianness conversion
>                within the audioconvert plugin or
>                to create a new plugin for this explicit purpose?
>        2. What is the proper way to pass a buffer from input
>                to output without doing a copy?
>
> Best Regards,
> Ed Endejan
>
> P.S. Below are the changes I've tested thus far:
> ---------------
> --- gst-plugins-base-0.10.35/gst/audioconvert/audioconvert.c.org
> +++ gst-plugins-base-0.10.35/gst/audioconvert/audioconvert.c
> @@ -709,6 +709,81 @@
>   return TRUE;
>  }
>
> +static inline void
> +audio_convert_endianness_only (gpointer src, gpointer dst,
> +    gint width, gint samples)
> +{
> +  /* assumes src & dst already checked for NULL and samples > 0 */
> +  /* which is a valid assumption since this is only called from */
> +  /* within audio_convert_convert after such checks */
> +  switch(width) {
> +    case 64: {
> +      uint64_t *sp = (uint64_t *)src;
> +      uint64_t *dp = (uint64_t *)dst;
> +      for(; samples; samples--) {
> +        *dp = (*sp << 56)
> +            | ((*sp & 0x000000000000FF00) << 40)
> +            | ((*sp & 0x0000000000FF0000) << 24)
> +            | ((*sp & 0x00000000FF000000) <<  8)
> +            | ((*sp & 0x000000FF00000000) >>  8)
> +            | ((*sp & 0x0000FF0000000000) >> 24)
> +            | ((*sp & 0x00FF000000000000) >> 40)
> +            | (*sp >> 56);
> +        sp++;
> +        dp++;
> +      }
> +      break;
> +    }
> +    case 32: {
> +      uint32_t *sp = (uint32_t *)src;
> +      uint32_t *dp = (uint32_t *)dst;
> +      for(; samples; samples--) {
> +        *dp = (*sp << 24)
> +            | ((*sp & 0x0000FF00) << 8)
> +            | ((*sp & 0x00FF0000) >> 8)
> +            | (*sp >> 24);
> +        sp++;
> +        dp++;
> +      }
> +      break;
> +    }
> +#if 0
> +    case 24: {
> +      /* 24-bit case is not real */
> +      /* attempting to use width=24 causes error "size 4096 is not a multiple of unit size 6" */
> +      GST_WARNING ("24-bit endianness conversion ignored\n");
> +      (void)memcpy((void *)dst, (void *)src, (size_t)(width/8 * samples));
> +      /* gst_buffer_replace(&dst, src); */ /* just pass along the unmodified buffer */
> +      break;
> +    }
> +#endif
> +    case 16: {
> +      uint16_t *sp = (uint16_t *)src;
> +      uint16_t *dp = (uint16_t *)dst;
> +      for(; samples; samples--) {
> +        *dp = (*sp << 8) | (*sp >> 8);
> +        sp++;
> +        dp++;
> +      }
> +      break;
> +    }
> +    case 8: {
> +      /* endianness conversion is a nop for 8-bit width */
> +      (void)memcpy((void *)dst, (void *)src, (size_t)samples);
> +      /* gst_buffer_replace(&dst, src); */ /* just pass along the unmodified buffer */
> +      break;
> +    }
> +#if 0
> +    default: {
> +      GST_WARNING ("Unexpected width - endianness conversion ignored\n");
> +      (void)memcpy((void *)dst, (void *)src, (size_t)(width/8 * samples));
> +      /* gst_buffer_replace(&dst, src); */ /* just pass along the unmodified buffer */
> +      break;
> +    }
> +#endif
> +  }
> +}
> +
>  gboolean
>  audio_convert_convert (AudioConvertCtx * ctx, gpointer src,
>     gpointer dst, gint samples, gboolean src_writable)
> @@ -725,6 +800,14 @@
>   if (samples == 0)
>     return TRUE;
>
> +  if (ctx->mix_passthrough && (ctx->in.width == ctx->out.width)
> +      && ((ctx->in.width == 8) || (ctx->in.endianness != ctx->out.endianness))) {
> +    /* no need for type conversion if only endianness conversion is needed */
> +    audio_convert_endianness_only (src, dst, ctx->out.width, samples * ctx->out.channels);
> +
> +    return TRUE;
> +  }
> +
>   insize = ctx->in.unit_size * samples;
>   outsize = ctx->out.unit_size * samples;
>
> ---------------
> _______________________________________________
> gstreamer-devel mailing list
> [hidden email]
> http://lists.freedesktop.org/mailman/listinfo/gstreamer-devel
>
> _______________________________________________
> gstreamer-devel mailing list
> [hidden email]
> http://lists.freedesktop.org/mailman/listinfo/gstreamer-devel
_______________________________________________
gstreamer-devel mailing list
[hidden email]
http://lists.freedesktop.org/mailman/listinfo/gstreamer-devel