proxy allocation query to fix appsink zero-copy issue

proxy allocation query to fix appsink zero-copy issue

mksafavi
Hi.
To fix the zero-copy issues I had with appsink, I tried proxying the
allocation query as suggested, based on the kmscube decoder:

https://gitlab.freedesktop.org/mesa/kmscube/blob/master/gst-decoder.c#L242

So far I have added a probe to my appsink. In the callback function I check
for GST_QUERY_ALLOCATION queries and add the GST_VIDEO_META_API_TYPE meta to
them.
I logged the queries. After a few Caps and Drain queries, I got a
GST_QUERY_ALLOCATION:
query type : 40963
query type : 35846  <----- GST_QUERY_ALLOCATION

What should I expect to change after adding the allocation meta?
 gst_query_add_allocation_meta(query, GST_VIDEO_META_API_TYPE, NULL);

Is there any step that I'm missing?

thanks

PS, my followup question:
http://gstreamer-devel.966125.n4.nabble.com/appsrc-performance-issue-tt4693030.html



--
Sent from: http://gstreamer-devel.966125.n4.nabble.com/
_______________________________________________
gstreamer-devel mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/gstreamer-devel

Re: proxy allocation query to fix appsink zero-copy issue

Nicolas Dufresne-5
On Thursday, 30 January 2020 at 05:39 -0600, mksafavi wrote:

> Hi.
> To fix the zero-copy issues I had with appsink, I tried proxying the
> allocation query as suggested, based on the kmscube decoder:
>
> https://gitlab.freedesktop.org/mesa/kmscube/blob/master/gst-decoder.c#L242
>
> So far I have added a probe to my appsink. In the callback function I check
> for GST_QUERY_ALLOCATION queries and add the GST_VIDEO_META_API_TYPE meta to
> them.
> I logged the queries. After a few Caps and Drain queries, I got a
> GST_QUERY_ALLOCATION:
> query type : 40963
> query type : 35846  <----- GST_QUERY_ALLOCATION
>
> What should I expect to change after adding the allocation meta?
>  gst_query_add_allocation_meta(query, GST_VIDEO_META_API_TYPE, NULL);
>
> Is there any step that I'm missing?

I have forgotten a lot of the context, but I believe you are proxying, which
requires different code than kmscube. Answer these questions and I may be able
to help further:

  1. Are you modifying the buffers in-place?
  2. If not, do you have the ability to allocate DMABuf memory?
  3. What do you know about video buffer alignment?

This is all a bit tricky, since you have to do what the GStreamer elements on
your platform do, which is very complex due to the asymmetry in the HW design.

>
> thanks
>
> PS, my followup question:
> http://gstreamer-devel.966125.n4.nabble.com/appsrc-performance-issue-tt4693030.html

Re: proxy allocation query to fix appsink zero-copy issue

mksafavi
  > 1. Are you modifying the buffers in-place?
Yes. I memcpy the buffer data to an mmapped region in order to send the
buffer to the accelerator with axidma. The returned data is scaled down
(right now I'm only testing 1080p to 720p on NV12). Then I wrap it in a
GstMemory and push the buffer to appsrc.
(We're trying to avoid the memcpy by setting up the SMMU using the VFIO driver.)

  > 2. If not, do you have the ability to allocate DMABuf memory?
I'd like to allocate DMABuf memory, but I couldn't make it work with the
Xilinx VCU decoder.
The Xilinx changelog states that it's possible to get DMABuf from the decoder.
I remember there was an iomode=1 option for v4l2src that enabled dmabuf on
the decoder,
but I couldn't find anything similar for my pipeline:
filesrc->decoder->appsink->scaler->appsrc->encoder->filesink

  > 3. What do you know about video buffer alignment?
I'm not really familiar with buffer alignment. I just know that it affects
the throughput of memcpy, but I don't know exactly why.






Re: proxy allocation query to fix appsink zero-copy issue

Nicolas Dufresne-5
On Thursday, 30 January 2020 at 16:28 -0600, mksafavi wrote:
>   > 1. Are you modifying the buffers in-place?
> Yes. I memcpy the buffer data to an mmapped region in order to send the
> buffer to the accelerator with axidma. The returned data is scaled down
> (right now I'm only testing 1080p to 720p on NV12). Then I wrap it in a
> GstMemory and push the buffer to appsrc.
> (We're trying to avoid the memcpy by setting up the SMMU using the VFIO driver.)

So on this side, you would have to reply to the allocation query by proposing a
pool. Unfortunately, the VCU OMX code only supports DMABuf importation, so
you'll need DMABuf support on your accelerator. The simple thing to do is to run
the pipeline with decode ! encode, add a probe in between, print the
strides/offsets in use, and try to reproduce them in your accelerator.

>
>   > 2. If not, do you have the ability to allocate DMABuf memory?
> I'd like to allocate DMABuf memory, but I couldn't make it work with the
> Xilinx VCU decoder.
> The Xilinx changelog states that it's possible to get DMABuf from the decoder.
> I remember there was an iomode=1 option for v4l2src that enabled dmabuf on
> the decoder,
> but I couldn't find anything similar for my pipeline:
> filesrc->decoder->appsink->scaler->appsrc->encoder->filesink

The decoder always produces DMABuf. As long as you set GST_VIDEO_META_API_TYPE
in your allocation query reply, it will produce that. A newer version of the
VCU/OMX will support DMABuf importation.

>
>   > 3. What do you know about video buffer alignment?
> I'm not really familiar with buffer alignment. I just know that it affects
> the throughput of memcpy, but I don't know exactly why.

OK, no worries. It's mostly used to satisfy batch-processing requirements. As an
example, if you have an instruction that deals with 128 bits at a time, it will
also require the memory pointers to be 128-bit aligned (a multiple of 128 bits,
i.e. 16 bytes) and the buffer size to be a multiple of 16 bytes.

As the processing is often per line, we need this alignment/padding per line, so
we use strides (the line size in bytes) larger than the width in order to allow
using these accelerators. See GstVideoMeta for the parameters GStreamer supports.
We also have some alignment support in GstMemory.


Re: proxy allocation query to fix appsink zero-copy issue

mksafavi
Hi.
Thank you for your detailed response.


I checked the strides on (decoder ! encoder) by probing the encoder and the
decoder. For a 1080p video the width was already aligned (stride=1920), and for
non-standard video resolutions (e.g. 854x480) the stride was padded (stride=856).

When I tried to get the same results with (decoder ! accelerator ! encoder),
I noticed that it's possible to get a caps query on the encoder and see the
frame size, strides, etc. But when I probed the decoder, it didn't contain any
video metadata, and I got the following error:
CRITICAL **: gst_video_info_from_caps: assertion 'gst_caps_is_fixed (caps)'
failed

Correct me if I'm wrong: I think the caps are negotiated upstream from the
encoder, and (appsink -> appsrc) is stopping them from reaching the decoder. So
I should set a probe on the decoder and edit its allocation metadata,
though I want to implement it with a bufferpool as you said.
I think I should make two bufferpools, one each for the decoder and the
encoder, and set their caps respectively (they will have different resolutions
because of the accelerator).


Is there any example available for implementing a bufferpool?

Is it possible to allocate buffers on a mmap-ed memory region?


thanks

 


