Hello,
I'm currently evaluating the following pipeline to receive, display and record an MJPEG video stream:

  # nice -20 \
    gst-launch-1.0 -v \
      udpsrc port=50004 buffer-size=180000000 do-timestamp=1 \
      caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, \
      encoding-name=(string)JPEG, payload=(int)26, framerate=(fraction)50/1" \
      ! rtpjitterbuffer latency=20 \
      ! rtpjpegdepay \
      ! vaapijpegdec \
      ! timeoverlay \
      ! tee name=t \
      t. ! queue ! vaapisink \
      t. ! queue ! vaapih264enc ! mp4mux ! filesink location=/tmp/test.mp4

The CPU usage of the various threads with and without the element "timeoverlay" is listed below:

                 with timeoverlay ->   no     yes
  TID  Thread Name                  CPU %   CPU %
  -----------------------------------------------
  542  gst-launch-1.0                42.0    88.8
  550  queue0:src                     0.2     0.4
  549  vaapiencodeh264                5.3     5.5
  548  gmain                          0.0     0.0
  547  udpsrc0:src                    8.9     8.4
  546  rtpjitterbuffer                8.5    51.6
  545  timer                          0.0     0.0
  544  queue1:src                    14.8    17.6
  543  queue0:src                     3.9     4.9

The "timeoverlay" adds approx. 45% to the CPU load (max is 4 x 100%). What is taking that much CPU time?

Is there a faster way to do the overlay, or is that element from the Pango plugin already quite efficient? I want to overlay an image over the video stream, ideally done by the graphics hardware.

Thanks for your help,

Wolfgang.
_______________________________________________
gstreamer-devel mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/gstreamer-devel
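The post does not say how the per-thread table was produced (`top -H -p PID` is one common way). As a rough sketch of how such numbers could be gathered on Linux, the following reads `/proc/<pid>/task/<tid>/stat` twice and reports each thread's share of one core. The function names are made up for this illustration, and the field positions follow proc(5).

```python
import os
import time


def parse_task_stat(stat_line):
    """Return (thread_name, cpu_ticks) from one /proc/<pid>/task/<tid>/stat line."""
    # The thread name sits in parentheses and may itself contain spaces or
    # parentheses, so split on the last ')' instead of on whitespace.
    head, _, tail = stat_line.rpartition(")")
    name = head.split("(", 1)[1]
    fields = tail.split()
    # fields[0] is the state (field 3 in proc(5)); utime and stime are
    # fields 14 and 15, which are indices 11 and 12 in this list.
    return name, int(fields[11]) + int(fields[12])


def sample_threads(pid):
    """Map tid -> (thread_name, cpu_ticks) for every thread of a process."""
    result = {}
    task_dir = f"/proc/{pid}/task"
    for tid in os.listdir(task_dir):
        try:
            with open(f"{task_dir}/{tid}/stat") as f:
                result[int(tid)] = parse_task_stat(f.read())
        except FileNotFoundError:
            pass  # the thread exited between listdir() and open()
    return result


def thread_cpu_percent(pid, interval=1.0):
    """Per-thread CPU usage in percent of one core over `interval` seconds."""
    ticks_per_second = os.sysconf("SC_CLK_TCK")
    before = sample_threads(pid)
    time.sleep(interval)
    after = sample_threads(pid)
    usage = {}
    for tid, (name, ticks) in after.items():
        if tid in before:
            delta = ticks - before[tid][1]
            usage[tid] = (name, 100.0 * delta / ticks_per_second / interval)
    return usage
```

Calling `thread_cpu_percent(pid)` once with and once without "timeoverlay" in the pipeline would give two columns comparable to the table above, where `pid` is the process ID of the running `gst-launch-1.0`.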
On Thursday, 18 October 2018 at 16:21 +0200, Wolfgang Grandegger wrote:
> The "timeoverlay" adds approx. 45% to the CPU load (max is 4 x 100%).
> What is taking that much CPU time?

It's called software rendering (with anti-aliasing and all). There is also a hit because you need to download/upload the pixels from/to the GPU.

> Is there a faster way to do the overlay, or is that element from the
> Pango plugin already quite efficient?
> I want to overlay an image over the video stream, ideally done by
> the graphics hardware.

There is an active effort to enable GL rendering of CompositionOverlay meta, but we don't have a fast method to import GL textures back into the VAAPI encoder, iirc. So there is still quite some work.
Hello,
digging deeper into text, image and graphics overlay...

On 18.10.2018 at 17:55, Nicolas Dufresne wrote:
>> The "timeoverlay" adds approx. 45% to the CPU load (max is 4 x 100%).
>> What is taking that much CPU time?
>
> It's called software rendering (with anti-aliasing and all). There is
> also a hit because you need to download/upload the pixels from/to the
> GPU.

I see! The text needs to be rendered and blended into the video frame. The CPU usage really depends on how often and what is rendered. Just disabling shadow or outline drawing, or using a smaller font, already reduces the CPU load.

BTW, is it possible to specify the colour of the shaded background? I know that it could be achieved with Pango "<span>" text attributes, e.g. "bgcolor", but that requires more CPU time than the "shaded" background from the text overlay.

>> Is there a faster way to do the overlay, or is that element from the
>> Pango plugin already quite efficient?
>> I want to overlay an image over the video stream, ideally done by
>> the graphics hardware.
>
> There is an active effort to enable GL rendering of CompositionOverlay
> meta, but we don't have a fast method to import GL textures back into
> the VAAPI encoder, iirc. So there is still quite some work.

Is this work in progress visible somewhere, e.g. as a Git repo?

I realized that the i.MX6 GStreamer-IMX plugin [1] has a somewhat optimized implementation of "textoverlay" using 2D acceleration. Would that approach be feasible and help on Intel graphics hardware as well?

Another option for text, image and graphics overlay is to use Cairo directly via "cairooverlay". That would allow using the Cairo text renderer and also adding graphics or images in one process. Would that be "lighter" or more efficient? I think GStreamer 0.10 did have a "cairotextoverlay".

[1] https://github.com/Freescale/gstreamer-imx/src/g2d/pango/

Thanks for any input!

Wolfgang.
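To make the cheaper overlay settings mentioned above concrete, here is a small sketch that assembles the pipeline from the first post with shadow and outline drawing switched off. It relies on "timeoverlay" sharing the `draw-shadow`, `draw-outline` and `font-desc` properties of the base text overlay; the helper names and the "Sans 12" font are assumptions chosen for illustration, and the resulting CPU saving has not been measured here.

```python
def overlay_element(font_desc="Sans 12", draw_shadow=False, draw_outline=False):
    """Describe a timeoverlay element with the cheaper rendering options."""
    return (
        f'timeoverlay font-desc="{font_desc}" '
        f"draw-shadow={str(draw_shadow).lower()} "
        f"draw-outline={str(draw_outline).lower()}"
    )


def build_pipeline(overlay):
    """Return the pipeline from the first post with `overlay` as the overlay element."""
    caps = (
        "application/x-rtp, media=(string)video, clock-rate=(int)90000, "
        "encoding-name=(string)JPEG, payload=(int)26, framerate=(fraction)50/1"
    )
    elements = [
        f'udpsrc port=50004 buffer-size=180000000 do-timestamp=1 caps="{caps}"',
        "rtpjitterbuffer latency=20",
        "rtpjpegdepay",
        "vaapijpegdec",
        overlay,
        "tee name=t",
    ]
    branches = [
        "t. ! queue ! vaapisink",
        "t. ! queue ! vaapih264enc ! mp4mux ! filesink location=/tmp/test.mp4",
    ]
    return " ! ".join(elements) + " " + " ".join(branches)


if __name__ == "__main__":
    # Print a command line that can be pasted into a shell.
    print("gst-launch-1.0 -v " + build_pipeline(overlay_element()))
```

Passing `draw_shadow=True` or a larger font to `overlay_element()` gives the variants to compare against in the per-thread CPU table.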
On Thursday, 15 November 2018 at 09:46 +0100, Wolfgang Grandegger wrote:
> BTW, is it possible to specify the colour of the shaded background? I
> know that it could be achieved with Pango "<span>" text attributes, e.g.
> "bgcolor", but that requires more CPU time than the "shaded" background
> from the text overlay.

It looks like you can specify the font and outline color, but not the shade or the shadow color. Would be nice to add properties for this.

The shaded background is done using plain cairo, I'm not sure why it's faster.

> > There is an active effort to enable GL rendering of CompositionOverlay
> > meta, but we don't have a fast method to import GL textures back into
> > the VAAPI encoder, iirc. So there is still quite some work.
>
> Is this work in progress visible somewhere, e.g. as a Git repo?

https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/blob/master/gst/overlaycomposition/gstoverlaycomposition.c
https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/blob/master/ext/gl/gstgloverlaycompositorelement.c

> I realized that the i.MX6 GStreamer-IMX plugin [1] has a somewhat
> optimized implementation of "textoverlay" using 2D acceleration.
> Would that approach be feasible and help on Intel graphics hardware as well?

On Intel, it is probably simpler to use GL. In any case, the fonts are still rendered in software using cairo. I have long been thinking that we could take advantage of a bitmap font cache, but never got the time to work on it.

On IMX2, there is a 2D blitter that could be used to implement an overlay-compositor.

> Another option for text, image and graphics overlay is to use Cairo
> directly via "cairooverlay". That would allow using the Cairo
> text renderer and also adding graphics or images in one process. Would that
> be "lighter" or more efficient? I think GStreamer 0.10 did have a
> "cairotextoverlay".

Text overlay extracts the vector path from pango and uses cairo to render. Using cairo instead of a bitmap font cache is the second performance hit (but only if you change the text every frame). Blending with the YUV video is the biggest performance hit. The blending is optimized in the GL implementation. I haven't tested it, but it's likely that adding an imxoverlaycompositor to gstreamer-imx would do the job with less work.

I guess I should implement such an element for mainline kernel too. Adding this to my todos.

> [1] https://github.com/Freescale/gstreamer-imx/src/g2d/pango/
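The bitmap font cache idea mentioned above is not implemented in the text overlay as far as this thread says, so the following is only a toy sketch of the principle: rasterise each glyph once, then reuse the bitmap when the text changes from frame to frame. The `GlyphCache` class and the stand-in rasteriser are hypothetical and not part of any GStreamer, Pango or cairo API.

```python
class GlyphCache:
    """Cache rendered glyph bitmaps keyed by (character, font description)."""

    def __init__(self, rasterize):
        # `rasterize(char, font_desc)` stands for the expensive software
        # renderer; it is supplied by the caller.
        self._rasterize = rasterize
        self._bitmaps = {}
        self.misses = 0  # number of times the expensive path was taken

    def get(self, char, font_desc):
        key = (char, font_desc)
        if key not in self._bitmaps:
            self._bitmaps[key] = self._rasterize(char, font_desc)
            self.misses += 1
        return self._bitmaps[key]

    def render(self, text, font_desc="Sans 12"):
        """Return the list of glyph bitmaps that make up `text`."""
        return [self.get(char, font_desc) for char in text]


def fake_rasterize(char, font_desc):
    """Placeholder renderer: a real one would return an alpha mask."""
    return f"<bitmap of {char!r} in {font_desc}>"


if __name__ == "__main__":
    cache = GlyphCache(fake_rasterize)
    for stamp in ("0:00:00.020", "0:00:00.040", "0:00:00.060"):
        cache.render(stamp)
    print("glyphs rasterised:", cache.misses)
```

A time stamp draws on only a dozen or so distinct characters, so after the first few frames nearly every glyph lookup is a cache hit. The blending into the YUV frame, described above as the biggest cost, is not addressed by this sketch.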
Hello,
On 15.11.2018 at 16:56, Nicolas Dufresne wrote:
> It looks like you can specify the font and outline color, but not the
> shade or the shadow color. Would be nice to add properties for this.

[...] complicated/complex for the shade...

> The shaded background is done using plain cairo, I'm not sure why it's
> faster.

On Intel, we are in the NV12 color space. As I see it, the shading is done on the raw video frame here:

https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/blob/master/ext/pango/gstbasetextoverlay.c#L2020

Only the luminance (Y) is decremented. With color, more data needs to be updated, taking more CPU time. Maybe that's the reason why it has not been implemented.

Thanks,

Wolfgang.
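As a simplified illustration of the luminance-only shading described above, the sketch below darkens a rectangle in the Y plane of an NV12 frame held in a `bytearray`. It is not the code from gstbasetextoverlay.c; the function name, the argument layout and the shading value of 80 are assumptions made for this example.

```python
def shade_nv12_luma(frame, width, height, stride, box, shading_value=80):
    """Darken the Y plane of an NV12 frame inside box = (x0, y0, x1, y1).

    NV12 stores a full-resolution Y plane of stride * height bytes,
    followed by an interleaved UV plane at half vertical resolution.
    Only the Y bytes are written here; the UV plane is left untouched.
    """
    x0, y0, x1, y1 = box
    # Clip the box to the visible frame.
    x0, x1 = max(x0, 0), min(x1, width)
    y0, y1 = max(y0, 0), min(y1, height)
    for row in range(y0, y1):
        base = row * stride
        for col in range(x0, x1):
            # Decrement the luminance and clamp at black.
            frame[base + col] = max(frame[base + col] - shading_value, 0)


if __name__ == "__main__":
    width = height = stride = 4
    # 4x4 grey frame: 16 Y bytes followed by 8 interleaved UV bytes.
    frame = bytearray([100] * (stride * height) + [128] * (stride * height // 2))
    shade_nv12_luma(frame, width, height, stride, (1, 1, 3, 3))
    print(list(frame[:16]))
```

A coloured shade would additionally have to rewrite the interleaved UV bytes under the box, which fits the observation above that colour means more data to update per frame.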