Image overlay over a video stream

Image overlay over a video stream

Wolfgang Grandegger
Hello,

I'm currently evaluating the following pipeline to receive, display and
record a MJPEG video stream:

  # nice -20 \
    gst-launch-1.0 -v \
    udpsrc port=50004 buffer-size=180000000 do-timestamp=1 \
      caps="application/x-rtp, media=(string)video, clock-rate=(int)90000, \
      encoding-name=(string)JPEG, payload=(int)26, framerate=(fraction)50/1" \
    ! rtpjitterbuffer latency=20 \
    ! rtpjpegdepay \
    ! vaapijpegdec \
    ! timeoverlay \
    ! tee name=t \
    t. ! queue ! vaapisink \
    t. ! queue ! vaapih264enc ! mp4mux ! filesink location=/tmp/test.mp4

The CPU usage of the various threads with and without the element
"timeoverlay" is listed below:

    with timeoverlay ->  no     yes
  TID  Thread Name      CPU %  CPU %
  ----------------------------------
  542  gst-launch-1.0    42.0   88.8
  550  queue0:src         0.2    0.4
  549  vaapiencodeh264    5.3    5.5
  548  gmain              0.0    0.0
  547  udpsrc0:src        8.9    8.4
  546  rtpjitterbuffer    8.5   51.6
  545  timer              0.0    0.0
  544  queue1:src        14.8   17.6
  543  queue0:src         3.9    4.9

The "timeoverlay" element adds approx. 45% to the CPU load (the maximum
is 4 x 100%).
What takes up that much CPU time?
Is there a faster way to do the overlay, or is that element from the
Pango plugin already quite efficient?
I want to overlay an image on the video stream, ideally done by
the graphics hardware.
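
The numbers in the table can be cross-checked with a few lines of Python. This is only a sketch of the arithmetic: the values are copied from the table, and the reading that the first row (TID 542) is roughly the sum of the other threads is an assumption based on the totals, not something stated by the tool that produced them.

```python
# (TID, thread name, CPU % without timeoverlay, CPU % with timeoverlay)
rows = [
    (542, "gst-launch-1.0", 42.0, 88.8),
    (550, "queue0:src", 0.2, 0.4),
    (549, "vaapiencodeh264", 5.3, 5.5),
    (548, "gmain", 0.0, 0.0),
    (547, "udpsrc0:src", 8.9, 8.4),
    (546, "rtpjitterbuffer", 8.5, 51.6),
    (545, "timer", 0.0, 0.0),
    (544, "queue1:src", 14.8, 17.6),
    (543, "queue0:src", 3.9, 4.9),
]


def deltas(table):
    """Return {TID: CPU % increase when timeoverlay is enabled}."""
    return {tid: round(with_ - without, 1) for tid, _name, without, with_ in table}


def worker_totals(table, main_tid=542):
    """Sum both columns over all rows except the assumed process row."""
    workers = [r for r in table if r[0] != main_tid]
    return (round(sum(r[2] for r in workers), 1),
            round(sum(r[3] for r in workers), 1))


increase = deltas(rows)
totals = worker_totals(rows)
```

Almost the whole increase lands in the thread named "rtpjitterbuffer". Since there is no queue between rtpjitterbuffer and tee, the depayloader, decoder and timeoverlay most likely all run in that streaming thread.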

Thanks for your help,

Wolfgang.

_______________________________________________
gstreamer-devel mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/gstreamer-devel

Re: Image overlay over a video stream

Nicolas Dufresne-5
On Thursday, 18 October 2018 at 16:21 +0200, Wolfgang Grandegger wrote:

> The "timeoverlay" adds approx. 45% to the CPU load (max is 4 x 100%).
> What does take that much CPU time?
That is software rendering (with anti-aliasing and all). There is
also a cost because the pixels have to be downloaded from the GPU and
uploaded back to it.
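
To get a feeling for the download/upload cost, here is a rough back-of-the-envelope sketch. The thread does not state the video resolution, so 1920x1080 is purely a hypothetical value; NV12 is assumed as the frame format, and each frame is assumed to be copied once in each direction.

```python
def nv12_frame_bytes(width, height):
    """Size of one NV12 frame: full-size Y plane plus half-size interleaved UV plane."""
    luma = width * height
    chroma = 2 * ((width + 1) // 2) * ((height + 1) // 2)
    return luma + chroma


def roundtrip_bytes_per_second(width, height, fps):
    """Bytes moved per second if every frame is downloaded and uploaded once."""
    return 2 * nv12_frame_bytes(width, height) * fps


# Hypothetical example: 1920x1080 at the 50 fps given in the caps.
example = roundtrip_bytes_per_second(1920, 1080, 50)
```

With these assumed numbers the copies alone move about 311 MB per second, before any rendering or blending is done.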

> Is there a faster way to do the overlay or is that element from the
> Pango plugin already quite efficient?
> I want to overlay an image over the video stream, ideally done by
> the graphics hardware.

There is an active effort to enable GL rendering of the overlay
composition meta (GstVideoOverlayCompositionMeta), but as far as I
recall we don't have a fast method to import GL textures back into the
VAAPI encoder. So there is still quite some work to do.


Image overlay over a video stream

Wolfgang Grandegger
Hello,

Digging deeper into text, image and graphics overlay...

On 18.10.2018 at 17:55, Nicolas Dufresne wrote:

> It's called software rendering (with anti-aliasing and all). There is
> also a hit because you need to download/upload the pixels from/to the
> GPU.

I see! The text needs to be rendered and blended into (overlaid on)
the video frame. The CPU usage really depends on what is rendered and
how often. Disabling shadow or outline drawing, or using a smaller
font, already reduces the CPU load.
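
As a sketch of what such a cheaper configuration could look like, the helper below builds the "timeoverlay" part of a pipeline description with shadow and outline drawing switched off. The properties "draw-shadow", "draw-outline" and "font-desc" belong to the text overlay base class; the helper name and the default font "Sans 12" are only illustrative assumptions.

```python
def timeoverlay_desc(draw_shadow=False, draw_outline=False, font_desc="Sans 12"):
    """Return a gst-launch style description of a cheaper timeoverlay element."""
    props = {
        "draw-shadow": str(draw_shadow).lower(),    # skip the shadow pass
        "draw-outline": str(draw_outline).lower(),  # skip the outline pass
        "font-desc": f'"{font_desc}"',              # smaller font, fewer pixels
    }
    return "timeoverlay " + " ".join(f"{key}={value}" for key, value in props.items())


element = timeoverlay_desc()
```

The resulting string can replace the plain "timeoverlay" in the pipeline above; how much CPU time it saves has to be measured on the target system.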

BTW, is it possible to specify the colour for the shaded background? I
know that it could be achieved with Pango "<span>" text attributes, e.g.
"bgcolor", but it requires more CPU time than the "shaded" background
from the text overlay.

> There is an active effort to enable GL rendering of overlay composition
> meta, but we don't have a fast method to import back GL textures into
> VAAPI encoder iirc. So there is still quite some work.

Is this work in progress visible somewhere, e.g. in a Git repository?

I noticed that the i.MX6 gstreamer-imx plugin [1] has a somewhat
optimized implementation of "textoverlay" using 2D acceleration.
Would that approach be feasible, and would it help, on Intel graphics
hardware as well?

Another option for text, image and graphics overlay is to use Cairo
directly via the "cairooverlay" element. That would allow using the
Cairo text renderer and also adding graphics or images in one pass.
Would that be "lighter" or more efficient? I think GStreamer 0.10 had a
"cairotextoverlay" element.

[1] https://github.com/Freescale/gstreamer-imx/src/g2d/pango/

Thanks for any input!

Wolfgang.

Re: Image overlay over a video stream

Nicolas Dufresne-5
On Thursday, 15 November 2018 at 09:46 +0100, Wolfgang Grandegger wrote:

> I see! The text needs to be rendered and inserted into (overlayed with)
> the video frame. The CPU usage really depends how often and what is
> rendered. Already disabling shadow or outline drawing or a smaller font
> reduces the CPU load.
>
> BTW, is it possible to specify the colour for the shaded background? I
> know that it could be achieved with Pango "<span>" text attributes, e.g.
> "bgcolor", but it requires more CPU time than the "shaded" background
> from the text overlay.
It looks like you can specify the font and outline color, but not the
shade or the shadow color. It would be nice to add properties for this.

The shaded background is done using plain cairo; I'm not sure why it's
faster.

> Is this work in progress visible somewhere, e.g. as GIT repo?
https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/blob/master/gst/overlaycomposition/gstoverlaycomposition.c
https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/blob/master/ext/gl/gstgloverlaycompositorelement.c


>
> I realized, that the i.MX6 GStreamer-IMX Plugin [1] does have a somehow
> optimized implementation of the "textoverlay" using 2D acceleration.
> Would that approach be feasible and help on Intel graphics hardware as well?

On Intel, it is probably simpler to use GL. In any case, the fonts are
still rendered in software using cairo. I have long thought that we
could take advantage of a bitmap font cache, but I never found the time
to work on it.
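
The bitmap font cache idea can be sketched as follows. This is not GStreamer code: glyph_bitmap() is a hypothetical stand-in for a real rasterizer, and its fake "bitmap" only exists to make the caching behaviour visible. The point is that a changing timestamp reuses almost all glyphs from one frame to the next.

```python
from functools import lru_cache


@lru_cache(maxsize=256)
def glyph_bitmap(char, size):
    """Hypothetical rasterizer stand-in: returns a size x size alpha bitmap.

    A real implementation would render the glyph once (e.g. with cairo)
    and keep the resulting pixels; here the bytes are just placeholders.
    """
    return tuple(bytes([ord(char) % 256] * size) for _ in range(size))


def render_text(text, size=16):
    """Compose a line of text from cached glyph bitmaps, one per character."""
    return [glyph_bitmap(char, size) for char in text]
```

Calling render_text("0:00:00.000") rasterizes only three distinct glyphs ("0", ":" and "."), and the following call with "0:00:00.020" adds just one more ("2"); glyph_bitmap.cache_info() reports the hits and misses.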

On IMX2, there is a 2D blitter that could be used to implement an
overlay-compositor.

>
> Another option for text, image and graphics overlay is to use Cairo
> directly using the "cairooverlay". That would allow to use the Cairo
> text renderer and also add graphics or images in one process. Would that
> be "lighter" or more efficient? I think GStreamer 0.10 did have a
> "cairotextoverlay".

Text overlay extracts the vector path from pango and uses cairo to
render it. Using cairo instead of a bitmap font cache is the second
biggest performance hit (but only if you change the text every frame).
Blending with the YUV video is the biggest performance hit. The
blending is optimized in the GL implementation. I haven't tested it, but
it's likely that adding an imxoverlaycompositor to gstreamer-imx would
do the job with less work.
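
To illustrate why blending in software is costly, here is a minimal sketch of per-pixel alpha blending into the luma (Y) plane of a frame. It is not the GStreamer implementation; the function name and the flat buffer layout are assumptions, the overlay is assumed to lie fully inside the frame, and the chroma planes (which need the same treatment) are left out.

```python
def blend_luma(y_plane, stride, overlay_y, overlay_a, ov_w, ov_h, x0, y0):
    """Blend an overlay's luma into the frame's Y plane in place.

    y_plane   -- bytearray holding the frame's Y plane
    stride    -- bytes per row of the Y plane
    overlay_y -- overlay luma values, ov_w * ov_h entries
    overlay_a -- overlay alpha values (0..255), same layout as overlay_y
    x0, y0    -- top-left position of the overlay inside the frame
    """
    for row in range(ov_h):
        for col in range(ov_w):
            alpha = overlay_a[row * ov_w + col]
            if alpha == 0:
                continue  # fully transparent, frame pixel is kept
            i = (y0 + row) * stride + (x0 + col)
            src = overlay_y[row * ov_w + col]
            # Rounded integer mix of overlay and frame pixel.
            y_plane[i] = (alpha * src + (255 - alpha) * y_plane[i] + 127) // 255
```

The work grows with the overlay area and is repeated for every frame, which is why a GPU or a 2D blitter is attractive for this step.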

I guess I should implement such an element for mainline kernel too.
Adding this to my todos.


Re: Image overlay over a video stream

Wolfgang Grandegger
Hello,

On 15.11.2018 at 16:56, Nicolas Dufresne wrote:

> It looks like you can specify the font and outline color, but not the
> shade or the shadow color. It would be nice to add properties for this.
OK. While adding color to the shadow is trivial, it gets more
complicated for the shade...

>
> The shaded background is done using plain cairo, I'm not sure why it's
> faster.

On Intel, the video frames are in the NV12 pixel format. As I see it, the
shading is done directly on the raw video frame here:

  https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/blob/master/ext/pango/gstbasetextoverlay.c#L2020

Only the luminance (Y) is decremented. With a colored shade, more data
(the chroma samples as well) would need to be updated, taking more CPU
time. Maybe that's the reason why it has not been implemented.
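
The difference can be sketched in a few lines of Python. This is not the code from gstbasetextoverlay.c, just an illustration under assumptions: the frame is NV12, the UV plane uses the same stride as the Y plane, and shade_color() is a hypothetical colored variant that does not exist in the element.

```python
def shade_luma(y_plane, stride, x0, y0, x1, y1, amount):
    """Darken the rectangle [x0, x1) x [y0, y1) by lowering Y only."""
    for row in range(y0, y1):
        base = row * stride
        for col in range(x0, x1):
            y_plane[base + col] = max(0, y_plane[base + col] - amount)


def mix(dst, src, alpha):
    """Rounded integer blend of src over dst with alpha in 0..255."""
    return (alpha * src + (255 - alpha) * dst + 127) // 255


def shade_color(y_plane, uv_plane, stride, x0, y0, x1, y1, color_yuv, alpha):
    """Hypothetical colored shade: has to touch the Y plane and the UV plane."""
    col_y, col_u, col_v = color_yuv
    for row in range(y0, y1):
        for col in range(x0, x1):
            i = row * stride + col
            y_plane[i] = mix(y_plane[i], col_y, alpha)
    # NV12 chroma is subsampled 2x2 and stored as interleaved U, V bytes.
    for row in range(y0 // 2, (y1 + 1) // 2):
        for col in range(x0 // 2, (x1 + 1) // 2):
            j = row * stride + 2 * col
            uv_plane[j] = mix(uv_plane[j], col_u, alpha)
            uv_plane[j + 1] = mix(uv_plane[j + 1], col_v, alpha)
```

The luma-only version is a single subtraction per pixel, while the colored version needs a multiply-based blend per luma sample plus a second pass over the chroma samples.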

>
> https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/blob/master/gst/overlaycomposition/gstoverlaycomposition.c
> https://gitlab.freedesktop.org/gstreamer/gst-plugins-base/blob/master/ext/gl/gstgloverlaycompositorelement.c
OK.

> Text overlay extracts the vector path from pango and uses cairo to
> render it. Using cairo instead of a bitmap font cache is the second
> biggest performance hit (but only if you change the text every frame).
> Blending with the YUV video is the biggest performance hit. The
> blending is optimized in the GL implementation. I haven't tested it, but
> it's likely that adding an imxoverlaycompositor to gstreamer-imx would
> do the job with less work.
OK.


Thanks,

Wolfgang.

