webrtcdsp and webrtcechoprobe on Windows

webrtcdsp and webrtcechoprobe on Windows

Attila
Hi,

I've been trying to create a pipeline for a voice chat application and would
like to include echo cancellation.

Since the example local loop pipeline does not work on Windows 10 (at least
not on the computers I tried), I came up with this:

wasapisrc low-latency=true ! queue max-size-time=400000000 ! audioconvert !
audio/x-raw, format=S16LE ! webrtcdsp ! webrtcechoprobe ! audioconvert !
wasapisink

This seems to work, but it is very fragile. Depending on the PC, I have to
vary the queue size; otherwise it will either not do any cancellation or, if
the queue is too small, not produce any sound at all. What can I do
to fix that? (I know that this is only a test loop, but I'm hoping that
understanding the problem here will help me with my actual problem.)

For my actual pipeline, I have not been able to make echo-cancellation work
at all - the sound goes through and I get a very strong echo that does not
die down at all.

  wasapisrc low-latency=true ! queue max-size-time=400000000 ! audioconvert
! audio/x-raw, format=S16LE ! webrtcdsp^
    ! opusenc audio-type=2048 bitrate=24000 inband-fec=true
packet-loss-percentage=5 ! rtpopuspay ^
    ! udpsink host=$FARENDIP port=7480 async=FALSE ^
  udpsrc port=7480 caps="application/x-rtp, ssrc=(uint)1537893241,
payload=(int)96, channels=1, clock-rate=48000" ^
    ! rtpjitterbuffer latency=10 ! rtpopusdepay ! opusdec plc=true
use-inband-fec=true ! webrtcechoprobe ^
    ! audioconvert ! autoaudiosink

Any ideas/help would be appreciated.

Regards,
Attila



--
Sent from: http://gstreamer-devel.966125.n4.nabble.com/
_______________________________________________
gstreamer-devel mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/gstreamer-devel

Re: webrtcdsp and webrtcechoprobe on Windows

Nicolas Dufresne-5


On Wed, 17 June 2020 at 16:30, Attila <[hidden email]> wrote:
> Hi,
>
> I've been trying to create a pipeline for a voice chat application and would
> like to include echo cancellation.
>
> Since the example local loop pipeline does not work on Windows 10 (at least
> not on the computers I tried), I came up with this:
>
> wasapisrc low-latency=true ! queue max-size-time=400000000 ! audioconvert !
> audio/x-raw, format=S16LE ! webrtcdsp ! webrtcechoprobe ! audioconvert !
> wasapisink
>
> This seems to work, but it is very fragile. Depending on the PC, I have to
> vary the queue size; otherwise it will either not do any cancellation or, if
> the queue is too small, not produce any sound at all. What can I do

The queue in this example should not be strictly needed, since you control the source buffer-time. Be aware that while webrtcdsp can cope with up to 400 ms of delay, it won't perform consistently in those conditions. The default playback buffer-time on the sink in GStreamer is 200 ms; you should reduce that to 2-3 times the latency-time.

Another detail that helps webrtcdsp perform better is to avoid adding elements between your source and the DSP element. The queue in this case likely adds jitter and lowers the speed of sync convergence. That's also why real-time integration inside a system audio daemon always performs better. Still, after some tweaks, I did manage to get results as consistent as Chrome and Firefox.
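
The "2-3 times the latency-time" rule of thumb can be worked through with a minimal Python sketch. This is an editorial illustration, not part of the original reply: sink_timing and as_properties are hypothetical helper names, and the values are in microseconds, which is the unit used by the buffer-time and latency-time properties.

```python
def sink_timing(latency_time_us: int, multiple: int = 3) -> dict:
    """Return sink timing properties with buffer-time = multiple * latency-time."""
    if latency_time_us <= 0:
        raise ValueError("latency-time must be positive")
    if multiple < 1:
        raise ValueError("multiple must be at least 1")
    return {
        "latency-time": latency_time_us,
        "buffer-time": multiple * latency_time_us,
    }


def as_properties(props: dict) -> str:
    """Render a dict as gst-launch style 'key=value' pairs."""
    return " ".join(f"{key}={value}" for key, value in props.items())


# 10 ms periods with a 3-period buffer, well below the 200 ms default.
timing = sink_timing(10_000, multiple=3)
sink = "wasapisink " + as_properties(timing)
print(sink)
```

The printed fragment can be pasted at the end of a gst-launch-1.0 pipeline; whether a given device accepts periods this small is something to verify on the target machine.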

> to fix that? (I know that this is only a test loop, but I'm hoping that
> understanding the problem here will help me with my actual problem.)
>
> For my actual pipeline, I have not been able to make echo-cancellation work
> at all - the sound goes through and I get a very strong echo that does not
> die down at all.
>
>   wasapisrc low-latency=true ! queue max-size-time=400000000 ! audioconvert
> ! audio/x-raw, format=S16LE !

The DSP performs better in float, if that is an option on your platform.

> webrtcdsp^
>     ! opusenc audio-type=2048 bitrate=24000 inband-fec=true
> packet-loss-percentage=5 ! rtpopuspay ^
>     ! udpsink host=$FARENDIP port=7480 async=FALSE ^
>   udpsrc port=7480 caps="application/x-rtp, ssrc=(uint)1537893241,
> payload=(int)96, channels=1, clock-rate=48000" ^
>     ! rtpjitterbuffer latency=10 ! rtpopusdepay ! opusdec plc=true
> use-inband-fec=true ! webrtcechoprobe ^
>     ! audioconvert ! autoaudiosink
>
> Any ideas/help would be appreciated.

I think replacing autoaudiosink with the wasapi sink and an appropriate real-time configuration should help. I haven't tested much on Windows, as all my work happens on Linux.



Re: webrtcdsp and webrtcechoprobe on Windows

Attila
Thanks for your suggestions Nicolas! I have made some changes based on them,
but I still have problems.

First, a general question: I have only one audio channel (recording from the
webcam). What does converting from interleaved to non-interleaved do in
this context?

After switching to floats for the DSP and removing the queue, my local echo
loop now works reasonably well on one computer (not perfect, but it's
definitely cutting out most echoes). However, when I tried it on another, less
powerful but still modern laptop, it sounded like an echo chamber. The script
looks like this now:

wasapisrc buffer-time=60000 ! audioconvert ! audio/x-raw,
layout=non-interleaved ! webrtcdsp noise-suppression-level=high
echo-suppression-level=high ! audioconvert ! webrtcechoprobe ! audioconvert
! wasapisink low-latency=true

For my complete script below, I now use floats for the dsp and moved the
queue after it. This is also an echo chamber, but since one of the computers
I'm using doesn't work with the local loop, this might be expected...

  wasapisrc buffer-time=60000 ! audioconvert ! audio/x-raw,
layout=non-interleaved ! webrtcdsp noise-suppression-level=high
echo-suppression-level=high^
    ! queue ! audioconvert ! audio/x-raw, format=S16LE ! opusenc
audio-type=2048 bitrate=24000 inband-fec=true packet-loss-percentage=5 !
rtpopuspay ^
    ! udpsink host=192.168.1.108 port=7480 async=FALSE ^
  udpsrc port=7480 caps="application/x-rtp, ssrc=(uint)1537893241,
payload=(int)96, channels=1, clock-rate=48000" ^
    ! rtpjitterbuffer latency=10 ! rtpopusdepay ! opusdec plc=true
use-inband-fec=true  ! audioconvert ! audio/x-raw, format=F32LE,
layout=non-interleaved, rate=48000 ! webrtcechoprobe ^
    ! audioconvert ! wasapisink low-latency=true

Also, while it does not seem to affect actual operation, the above pipeline
generates a lot of the following errors. Any idea what this is?

** (gst-launch-1.0:10236): CRITICAL **: 16:51:58.551: the GstAudioInfo
argument is not equal to the GstAudioMeta's attached info

Thanks,
Attila






Re: webrtcdsp and webrtcechoprobe on Windows

Nicolas Dufresne-5
On Thursday, 18 June 2020 at 16:29 -0500, Attila wrote:
> Thanks for your suggestions Nicolas! I have made some changes based on them,
> but I still have problems.
>
> First, a general question: I have only one audio channel (recording from the
> webcam). What does converting from interleaved to non-interleaved do in
> this context?

It adds the audio meta (GstAudioMeta) to the buffers.
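
To see why the layout change makes no difference to the samples of a mono stream, here is a small sketch in plain Python. It is an editorial illustration rather than GStreamer code, and to_planar is a hypothetical helper name. With one channel the samples come out in the same order, so what the caps change adds is the metadata attached to each buffer.

```python
def to_planar(interleaved: list, channels: int) -> list:
    """Split interleaved samples [c0, c1, ..., c0, c1, ...] into one list per channel."""
    if channels < 1:
        raise ValueError("channels must be at least 1")
    if len(interleaved) % channels != 0:
        raise ValueError("sample count is not a multiple of the channel count")
    # Channel ch owns every `channels`-th sample, starting at offset ch.
    return [interleaved[ch::channels] for ch in range(channels)]


stereo = [0.1, -0.1, 0.2, -0.2]   # L0, R0, L1, R1
mono = [0.1, 0.2, 0.3]

stereo_planes = to_planar(stereo, 2)  # samples are regrouped per channel
mono_planes = to_planar(mono, 1)      # one plane, identical to the input

print(stereo_planes)
print(mono_planes)
```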

>
> After switching to floats for the DSP and removing the queue, my local echo
> loop now works reasonably well on one computer (not perfect but it's
> definitely cutting out most echoes). However when I tried on another less
> powerful but still modern laptop it sounded like an echo chamber. The script
> looks like this now:
>
> wasapisrc buffer-time=60000 ! audioconvert ! audio/x-raw,
> layout=non-interleaved ! webrtcdsp noise-suppression-level=high
> echo-suppression-level=high ! audioconvert ! webrtcechoprobe ! audioconvert
> ! wasapisink low-latency=true

I would copy the buffer-time configuration to the sink too, as the sink is the
one that introduces a large latency by default (200 ms).
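
A minimal sketch of that suggestion, assembling the local loop quoted above with the same buffer-time on both ends. This is an editorial illustration: echo_loop is a hypothetical helper, the element and property names are taken from the quoted pipeline, and 60000 microseconds is simply the value already used on the source.

```python
def echo_loop(buffer_time_us: int) -> str:
    """Build the local echo-cancellation loop with matching source/sink buffer-time."""
    if buffer_time_us <= 0:
        raise ValueError("buffer-time must be positive")
    elements = [
        f"wasapisrc buffer-time={buffer_time_us}",
        "audioconvert",
        "audio/x-raw, layout=non-interleaved",
        "webrtcdsp noise-suppression-level=high echo-suppression-level=high",
        "audioconvert",
        "webrtcechoprobe",
        "audioconvert",
        # Mirror the source setting instead of keeping the sink default.
        f"wasapisink low-latency=true buffer-time={buffer_time_us}",
    ]
    return " ! ".join(elements)


pipeline = echo_loop(60_000)
print("gst-launch-1.0 " + pipeline)
```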

>
> For my complete script below, I now use floats for the dsp and moved the
> queue after it. This is also an echo chamber, but since one of the computers
> I'm using doesn't work with the local loop, this might be expected...
>
>   wasapisrc buffer-time=60000 ! audioconvert ! audio/x-raw,
> layout=non-interleaved ! webrtcdsp noise-suppression-level=high
> echo-suppression-level=high^
>     ! queue ! audioconvert ! audio/x-raw, format=S16LE ! opusenc
> audio-type=2048 bitrate=24000 inband-fec=true packet-loss-percentage=5 !
> rtpopuspay ^
>     ! udpsink host=192.168.1.108 port=7480 async=FALSE ^
>   udpsrc port=7480 caps="application/x-rtp, ssrc=(uint)1537893241,
> payload=(int)96, channels=1, clock-rate=48000" ^
>     ! rtpjitterbuffer latency=10 ! rtpopusdepay ! opusdec plc=true
> use-inband-fec=true  ! audioconvert ! audio/x-raw, format=F32LE,
> layout=non-interleaved, rate=48000 ! webrtcechoprobe ^
>     ! audioconvert ! wasapisink low-latency=true
>
> Also, while it does not seem to affect actual operation, the above pipeline
> generates a lot of the following errors. Any idea what this is?
>
> ** (gst-launch-1.0:10236): CRITICAL **: 16:51:58.551: the GstAudioInfo
> argument is not equal to the GstAudioMeta's attached info

Re: webrtcdsp and webrtcechoprobe on Windows

Attila
Thanks again. After experimenting a bit with the last suggestion I ended up
with a buffer-time of 10,000 on the sink and 30,000 on the source. This
seems to suppress voice echoes reasonably well on both test computers.

My problem now is that there is a strong high-pitched echo - it's like
ringing. It's almost as if the suppressor only works on certain frequencies
and leaves others to echo freely. Below is my modified script.

Any suggestions on how to get rid of this "ringing"?

wasapisrc buffer-time=30000 ! audioconvert ! audio/x-raw,
layout=non-interleaved ! webrtcdsp noise-suppression-level=high
echo-suppression-level=high ! audioconvert ! webrtcechoprobe ! audioconvert
! wasapisink buffer-time=10000

Thanks,
Attila




Re: webrtcdsp and webrtcechoprobe on Windows

Nicolas Dufresne-5


On Fri, 19 June 2020 at 16:45, Attila <[hidden email]> wrote:
> Thanks again. After experimenting a bit with the last suggestion I ended up
> with a buffer-time of 10,000 on the sink and 30,000 on the source. This
> seems to suppress voice echoes reasonably well on both test computers.
>
> My problem now is that there is a strong high-pitched echo - it's like
> ringing. It's almost as if the suppressor only works on certain frequencies
> and leaves others to echo freely. Below is my modified script.
>
> Any suggestions on how to get rid of this "ringing"?
>
> wasapisrc buffer-time=30000 ! audioconvert ! audio/x-raw,
> layout=non-interleaved ! webrtcdsp noise-suppression-level=high
> echo-suppression-level=high ! audioconvert ! webrtcechoprobe ! audioconvert
> ! wasapisink buffer-time=10000

That is strange; I don't recall having hissing, and it would be unfortunate to have to add a band filter afterward. About the sink, check that buffer-time is slightly bigger than the latency-time; perhaps it has to do with that. Perhaps also test different noise-suppression and echo-suppression levels?
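
One way to test the level combinations systematically is to generate every variant of the webrtcdsp element and try them in turn. The sketch below is an editorial illustration: dsp_variants is a hypothetical helper, and the lists of level names are an assumption about the element's enum values that should be confirmed with `gst-inspect-1.0 webrtcdsp` on the installed version.

```python
from itertools import product

# Assumed enum nicknames; confirm them with `gst-inspect-1.0 webrtcdsp`.
NS_LEVELS = ("low", "moderate", "high", "very-high")
EC_LEVELS = ("low", "moderate", "high")


def dsp_variants() -> list:
    """Return one webrtcdsp element description per level combination."""
    return [
        f"webrtcdsp noise-suppression-level={ns} echo-suppression-level={ec}"
        for ns, ec in product(NS_LEVELS, EC_LEVELS)
    ]


variants = dsp_variants()
for variant in variants:
    print(variant)
```

Each printed line can replace the webrtcdsp element in the loop pipeline, so that the ringing can be compared across settings on the same machine.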



Re: webrtcdsp and webrtcechoprobe on Windows

Attila
I've tried setting the latency-time to a value lower than the buffer-time.
There's a small improvement, but the problem with the ringing persists.

I've recorded the result and shared the file (echo.mp3) on my google drive:
https://drive.google.com/file/d/1gB5U77-PK2GuYnVzRMTl0gx1xEpsNsIe/view

To make it work, I had to increase the buffer-time on both the source and
the sink. I've also tried adjusting the noise-suppression-level and
echo-suppression-level, but setting them both to high seems to give the best
result (no change).

After some testing, the pipeline below seems to give the best result.

gst-launch-1.0 -v -e wasapisrc buffer-time=600000 ! audioconvert !
audio/x-raw, layout=non-interleaved ! webrtcdsp noise-suppression-level=high
echo-suppression-level=high ! audioconvert ! webrtcechoprobe ! audioconvert
! wasapisink buffer-time=20000 latency-time=15000  

 ***

At this point, I'm wondering how I should proceed... I can't seem to get
this to a point where a 2-way conversation without headphones can be carried
out, and it seems unlikely that I can fix this by following up on short
hints (which are greatly appreciated!).

Is there a way to get someone to give me a little more hands-on assistance?
If so, how should I go about that?
Am I even on the right track trying to do this with GStreamer? ...or should
I try coding this using the relevant libraries directly (clearly much more
work, but worth it if it works better)?

Thanks,
Attila





Re: webrtcdsp and webrtcechoprobe on Windows

Nicolas Dufresne-5


On Tue, 23 June 2020 at 21:30, Attila <[hidden email]> wrote:
> I've tried setting the latency-time to a value lower than the buffer-time.
> There's a small improvement, but the problem with the ringing persists.
>
> I've recorded the result and shared the file (echo.mp3) on my google drive:
> https://drive.google.com/file/d/1gB5U77-PK2GuYnVzRMTl0gx1xEpsNsIe/view
>
> To make it work, I had to increase the buffer-time on both the source and
> the sink. I've also tried adjusting the noise-suppression-level and
> echo-suppression-level, but setting them both to high seems to give the best
> result (no change).
>
> After some testing, the pipeline below seems to give the best result.
>
> gst-launch-1.0 -v -e wasapisrc buffer-time=600000 ! audioconvert !
> audio/x-raw, layout=non-interleaved ! webrtcdsp noise-suppression-level=high
> echo-suppression-level=high ! audioconvert ! webrtcechoprobe ! audioconvert
> ! wasapisink buffer-time=20000 latency-time=15000

That is strange, but I'm not familiar with the wasapi implementation. Normally you use 20/10 or 30/10 (buffer-time/latency-time, in ms); basically, buffer-time is usually a multiple of latency-time.
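
The "multiple of latency-time" guideline can be checked mechanically before launching a pipeline. The sketch below is an editorial illustration: check_sink_timing is a hypothetical helper, values are in microseconds, and the two rules it applies are only the guidelines mentioned in this thread, not constraints enforced by GStreamer.

```python
def check_sink_timing(buffer_time_us: int, latency_time_us: int) -> list:
    """Return a list of guideline violations for a buffer-time/latency-time pair."""
    if buffer_time_us <= 0 or latency_time_us <= 0:
        raise ValueError("both values must be positive")
    problems = []
    if buffer_time_us <= latency_time_us:
        problems.append("buffer-time should be larger than latency-time")
    if buffer_time_us % latency_time_us != 0:
        problems.append("buffer-time is not a whole multiple of latency-time")
    return problems


# The 20000/15000 pair from the quoted pipeline breaks the multiple guideline.
quoted_pair = check_sink_timing(20_000, 15_000)
# The 20/10 and 30/10 ms pairs pass both checks.
pair_20_10 = check_sink_timing(20_000, 10_000)
pair_30_10 = check_sink_timing(30_000, 10_000)

print(quoted_pair)
print(pair_20_10, pair_30_10)
```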


>  ***
>
> At this point, I'm wondering how I should proceed... I can't seem to get
> this to a point where a 2-way conversation without headphones can be carried
> out, and it seems unlikely that I can fix this by following up on short
> hints (which are greatly appreciated!).

The fact that you still get ringing might indicate inaccurate latency reporting in this plugin. The accuracy could in fact change depending on the underlying hardware driver. Have you tried using delay-agnostic mode?
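
A minimal sketch of what enabling that mode could look like on the element description. This is an editorial illustration: dsp_with_delay_agnostic is a hypothetical helper, and the property name delay-agnostic is an assumption about the webrtcdsp element that should be confirmed with `gst-inspect-1.0 webrtcdsp`, since the available properties can differ between GStreamer versions.

```python
def dsp_with_delay_agnostic(enabled: bool = True) -> str:
    """Render a webrtcdsp element description with delay-agnostic mode toggled."""
    props = {
        "noise-suppression-level": "high",
        "echo-suppression-level": "high",
        # Assumed property name; confirm with `gst-inspect-1.0 webrtcdsp`.
        "delay-agnostic": "true" if enabled else "false",
    }
    return "webrtcdsp " + " ".join(f"{key}={value}" for key, value in props.items())


dsp = dsp_with_delay_agnostic()
print(dsp)
```

The printed element can replace webrtcdsp in the loop pipeline; if the reported latency is indeed unreliable on a given driver, comparing the result with and without this setting would show whether it is the cause of the ringing.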


> Is there a way to get someone to give me a little more hands-on assistance?
> If so, how should I go about that?
> Am I even on the right track trying to do this with GStreamer? ...or should
> I try coding this using the relevant libraries directly (clearly much more
> work, but worth it if it works better)?
