Hi,
I've been trying to create a pipeline for a voice chat application and would like to include echo cancellation. Since the example local loop pipeline does not work on Windows 10 (at least not on the computers I tried), I came up with this:

wasapisrc low-latency=true ! queue max-size-time=400000000 ! audioconvert ! audio/x-raw, format=S16LE ! webrtcdsp ! webrtcechoprobe ! audioconvert ! wasapisink

This seems to work, but it is very fragile. Depending on the PC, I have to vary the queue size; otherwise it will either not do any cancellation or, if the queue is too small, not produce any sound at all. What can I do to fix that? (I know that this is a test loop only, but I'm hoping that understanding the problem here will help me with my actual problem.)

For my actual pipeline, I have not been able to make echo cancellation work at all: the sound goes through and I get a very strong echo that does not die down at all.

wasapisrc low-latency=true ! queue max-size-time=400000000 ! audioconvert ! audio/x-raw, format=S16LE ! webrtcdsp^
 ! opusenc audio-type=2048 bitrate=24000 inband-fec=true packet-loss-percentage=5 ! rtpopuspay ^
 ! udpsink host=$FARENDIP port=7480 async=FALSE ^
 udpsrc port=7480 caps="application/x-rtp, ssrc=(uint)1537893241, payload=(int)96, channels=1, clock-rate=48000" ^
 ! rtpjitterbuffer latency=10 ! rtpopusdepay ! opusdec plc=true use-inband-fec=true ! webrtcechoprobe ^
 ! audioconvert ! autoaudiosink

Any ideas/help would be appreciated.

Regards,
Attila

--
Sent from: http://gstreamer-devel.966125.n4.nabble.com/
_______________________________________________
gstreamer-devel mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/gstreamer-devel
On Wed, 17 June 2020 at 16:30, Attila <[hidden email]> wrote:

> Hi,

The queue in this example should not be strictly needed, since you control the source buffer-time. Be aware that while webrtcdsp works with up to 400ms of delay, it won't perform consistently in these conditions. The default playback buffer-time in GStreamer is 200ms on the sink; you should reduce that to 2-3 times the latency-time.

Another detail that helps webrtcdsp perform better is to avoid adding elements between your source and the DSP element. The queue in this case likely adds jitter and lowers the speed of sync convergence. That's also why real-time integration inside a system audio daemon always performs better. Though, after some tweaks, I did manage to get results as consistent as Chrome and Firefox.

> to fix that (I know that this is a test loop only, but I'm hoping that

The DSP performs better in float, if that is an option on your platform.

> webrtcdsp^

I think replacing the autoaudiosink with wasapi and an appropriate real-time configuration should help. I haven't tested much on Windows, as all my work happens on Linux.
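
A minimal sketch of the advice in this reply, for anyone who wants to try it: drop the queue, feed webrtcdsp float samples, and set the sink buffer-time to a small multiple of the latency-time. The helper name `loopback_pipeline`, the 10000 µs latency-time and the factor of 3 are illustrative assumptions, not values given in the thread; it also assumes that buffer-time and latency-time are expressed in microseconds, as on GStreamer's audio base classes. The script only builds the gst-launch-1.0 command line as a string; it does not run GStreamer.

```python
def loopback_pipeline(latency_time_us: int = 10_000, factor: int = 3) -> str:
    """Build a gst-launch-1.0 description for the local echo test loop.

    latency_time_us and factor are hypothetical starting points; the reply
    above only says buffer-time should be 2-3 times the latency-time.
    """
    if factor < 2:
        raise ValueError("buffer-time should be at least 2x latency-time")
    buffer_time_us = factor * latency_time_us
    elements = [
        # No queue between the source and the DSP element.
        f"wasapisrc buffer-time={buffer_time_us} latency-time={latency_time_us}",
        "audioconvert",
        # Float samples; webrtcdsp takes F32LE in non-interleaved layout.
        "audio/x-raw,format=F32LE,layout=non-interleaved",
        "webrtcdsp",
        "webrtcechoprobe",
        "audioconvert",
        # Override the 200ms default playback buffer on the sink.
        f"wasapisink buffer-time={buffer_time_us} latency-time={latency_time_us}",
    ]
    return " ! ".join(elements)


if __name__ == "__main__":
    print("gst-launch-1.0 " + loopback_pipeline())
```

The printed line can be pasted into a Windows command prompt. Whether these exact numbers work will depend on the audio driver, so treat them as a starting point to tune.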
Thanks for your suggestions, Nicolas! I have made some changes based on them, but I still have problems.

First, a general question: I have only one audio channel (recording from the webcam). What does converting from interleaved to non-interleaved do in this context?

After switching to floats for the DSP and removing the queue, my local echo loop now works reasonably well on one computer (not perfect, but it's definitely cutting out most echoes). However, when I tried it on another, less powerful but still modern laptop, it sounded like an echo chamber. The script looks like this now:

wasapisrc buffer-time=60000 ! audioconvert ! audio/x-raw, layout=non-interleaved ! webrtcdsp noise-suppression-level=high echo-suppression-level=high ! audioconvert ! webrtcechoprobe ! audioconvert ! wasapisink low-latency=true

For my complete script below, I now use floats for the DSP and moved the queue after it. This is also an echo chamber, but since one of the computers I'm using doesn't work with the local loop, this might be expected...

wasapisrc buffer-time=60000 ! audioconvert ! audio/x-raw, layout=non-interleaved ! webrtcdsp noise-suppression-level=high echo-suppression-level=high^
 ! queue ! audioconvert ! audio/x-raw, format=S16LE ! opusenc audio-type=2048 bitrate=24000 inband-fec=true packet-loss-percentage=5 ! rtpopuspay ^
 ! udpsink host=192.168.1.108 port=7480 async=FALSE ^
 udpsrc port=7480 caps="application/x-rtp, ssrc=(uint)1537893241, payload=(int)96, channels=1, clock-rate=48000" ^
 ! rtpjitterbuffer latency=10 ! rtpopusdepay ! opusdec plc=true use-inband-fec=true ! audioconvert ! audio/x-raw, format=F32LE, layout=non-interleaved, rate=48000 ! webrtcechoprobe ^
 ! audioconvert ! wasapisink low-latency=true

Also, while it does not seem to affect actual operation, the above pipeline generates a lot of the following errors. Any idea what this is?

** (gst-launch-1.0:10236): CRITICAL **: 16:51:58.551: the GstAudioInfo argument is not equal to the GstAudioMeta's attached info

Thanks,
Attila
On Thursday, 18 June 2020 at 16:29 -0500, Attila wrote:

> First a general question: I have only one audio channel (recording from the
> webcam). What does converting from interleaved to non-interleaved does in
> this context?

It adds the audio meta.

> After swithching to floats for the dsp and removing the queue, my local echo
> loop now works reasonably well on one computer (not perfect but it's
> definitely cutting out most echoes). However when I tried on another less
> powerful but still modern laptop it sounded like an echo chamber. The script
> looks like this now:
>
> wasapisrc buffer-time=60000 ! audioconvert ! audio/x-raw,
> layout=non-interleaved ! webrtcdsp noise-suppression-level=high
> echo-suppression-level=high ! audioconvert ! webrtcechoprobe ! audioconvert
> ! wasapisink low-latency=true

I would copy the buffer-time configuration to the sink too, as the sink is the one that introduces a large latency by default (200ms).
Thanks again. After experimenting a bit with the last suggestion, I ended up with a buffer-time of 10,000 on the sink and 30,000 on the source. This seems to suppress voice echoes reasonably well on both test computers.

My problem now is that there is a strong, high-pitched echo, like ringing. It's almost as if the suppressor only works on certain frequencies and leaves others to echo freely. Below is my modified script. Any suggestions on how to get rid of this "ringing"?

wasapisrc buffer-time=30000 ! audioconvert ! audio/x-raw, layout=non-interleaved ! webrtcdsp noise-suppression-level=high echo-suppression-level=high ! audioconvert ! webrtcechoprobe ! audioconvert ! wasapisink buffer-time=10000

Thanks,
Attila
On Fri, 19 June 2020 at 16:45, Attila <[hidden email]> wrote:

> Thanks again. After experimenting a bit with the last suggestion I ended up

That is strange; I don't recall having hissing. It would be unfortunate to have to add a band filter afterward.

About the sink, check that buffer-time is slightly bigger than the latency-time; perhaps it has to do with that. Perhaps also test different ns (noise-suppression) and ec (echo-suppression) levels?
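
One way to act on both suggestions is to generate a test command for every combination of levels and try them one by one. This is a rough sketch only: the level names in `NS_LEVELS` and `EC_LEVELS` are assumed from memory of the webrtcdsp element and should be confirmed with `gst-inspect-1.0 webrtcdsp`, and the helper names and default timings (in microseconds) are hypothetical. Note that the pipeline in the previous message sets only buffer-time=10000 on the sink; if the sink's default latency-time is also 10000 µs (believed to be the GStreamer audio sink default), buffer-time would be equal to it rather than bigger.

```python
from itertools import product

# Assumed enum nicks; confirm with `gst-inspect-1.0 webrtcdsp`.
NS_LEVELS = ["low", "moderate", "high", "very-high"]
EC_LEVELS = ["low", "moderate", "high"]


def check_sink_timing(buffer_time_us: int, latency_time_us: int) -> bool:
    """Return True when buffer-time is strictly larger than latency-time."""
    return buffer_time_us > latency_time_us


def sweep(src_buffer_us=30_000, sink_buffer_us=20_000, sink_latency_us=10_000):
    """Yield one gst-launch-1.0 command per (ns, ec) level combination."""
    if not check_sink_timing(sink_buffer_us, sink_latency_us):
        raise ValueError("sink buffer-time must exceed latency-time")
    for ns, ec in product(NS_LEVELS, EC_LEVELS):
        yield (
            f"gst-launch-1.0 wasapisrc buffer-time={src_buffer_us} ! audioconvert "
            "! audio/x-raw,layout=non-interleaved "
            f"! webrtcdsp noise-suppression-level={ns} echo-suppression-level={ec} "
            "! audioconvert ! webrtcechoprobe ! audioconvert "
            f"! wasapisink buffer-time={sink_buffer_us} latency-time={sink_latency_us}"
        )


if __name__ == "__main__":
    for command in sweep():
        print(command)
```

Each printed command is the local loop from the thread with only the two level properties and the sink timing changed, so differences in the ringing can be attributed to those settings.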
I've tried setting the latency-time to a value lower than the buffer-time. There's a small improvement, but the problem with the ringing persists. I've recorded the result and shared the file (echo.mp3) on my Google Drive:

https://drive.google.com/file/d/1gB5U77-PK2GuYnVzRMTl0gx1xEpsNsIe/view

To make it work, I had to increase the buffer-time on both the source and the sink. I've also tried adjusting the noise-suppression-level and echo-suppression-level, but setting them both to high seems to give the best result (no change). After some testing, the pipeline below seems to give the best result.

gst-launch-1.0 -v -e wasapisrc buffer-time=600000 ! audioconvert ! audio/x-raw, layout=non-interleaved ! webrtcdsp noise-suppression-level=high echo-suppression-level=high ! audioconvert ! webrtcechoprobe ! audioconvert ! wasapisink buffer-time=20000 latency-time=15000

***

At this point, I'm wondering how I should proceed... I can't seem to get this to a point where a two-way conversation can be carried out without headphones, and it seems unlikely that I can fix this by following up on short hints (which are greatly appreciated!).

Is there a way to get someone to give me a little more hands-on assistance? If so, how should I go about that?

Am I even on the right track trying to do this with GStreamer? Or should I try coding this using the relevant libraries directly (clearly much more work, but worthwhile if it works better)?

Thanks,
Attila
On Tue, 23 June 2020 at 21:30, Attila <[hidden email]> wrote:

> I've tried setting the latency-time to a value lower than the buffer-time.

That is strange, but I'm not familiar with the wasapi implementation. Normally you use 20/10 or 30/10 (buffer-time/latency-time, in ms); basically, buffer-time is usually a multiple of latency-time.

The fact that you still get ringing might indicate inaccurate latency reporting in this plugin. The accuracy could in fact change depending on the hardware driver behind it. Have you tried using delay-agnostic mode?
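
As a rough illustration of these two points, the sketch below checks whether a buffer-time/latency-time pair follows the "multiple" rule of thumb and builds a test loop with delay-agnostic mode switched on. It assumes the webrtcdsp element in your GStreamer build exposes a boolean `delay-agnostic` property (check with `gst-inspect-1.0 webrtcdsp`; it may not exist in every version). The function names and the 30000/10000 µs defaults are hypothetical, and the script only prints a command line.

```python
def timing_report(buffer_time_us: int, latency_time_us: int) -> dict:
    """Describe how buffer-time relates to latency-time (both in microseconds)."""
    segments, remainder = divmod(buffer_time_us, latency_time_us)
    return {
        "segments": segments,  # whole latency-time periods in the buffer
        "is_multiple": remainder == 0 and segments >= 2,
    }


def delay_agnostic_loop(buffer_time_us: int = 30_000, latency_time_us: int = 10_000) -> str:
    """Build the local test loop with delay-agnostic mode enabled on webrtcdsp."""
    if not timing_report(buffer_time_us, latency_time_us)["is_multiple"]:
        raise ValueError("buffer-time should be a multiple (>= 2x) of latency-time")
    timing = f"buffer-time={buffer_time_us} latency-time={latency_time_us}"
    return (
        f"gst-launch-1.0 -v -e wasapisrc {timing} ! audioconvert "
        "! audio/x-raw,layout=non-interleaved "
        # `delay-agnostic` is assumed to be available on this element.
        "! webrtcdsp delay-agnostic=true noise-suppression-level=high "
        "echo-suppression-level=high "
        "! audioconvert ! webrtcechoprobe ! audioconvert "
        f"! wasapisink {timing}"
    )


if __name__ == "__main__":
    # The sink settings from the previous message: 20000/15000 is not a multiple.
    print(timing_report(20_000, 15_000))
    print(delay_agnostic_loop())
```

With the values from the previous message, the report shows that 20000 is not a whole multiple of 15000, which is one concrete thing to change before judging the effect of delay-agnostic mode.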