Gstreamer support for GPU codecs

Gstreamer support for GPU codecs

boxerab@gmail.com
I have a question about how the streaming architecture works with GPU acceleration.

Since discrete cards sit on the PCI bus, the best performance is achieved when data is
moved to the card in a pipelined fashion and the host is notified once the data has been
processed and copied back to the host.

Also, it is sometimes more efficient to process N frames at a time.

So, for best performance, the flow would be:

A) the host keeps a list of N host-side memory buffers
B) the host waits for a host buffer to become available
C) when a buffer is available, the host copies the frame data into that buffer and queues
the buffer to be copied over to the card
D) when N buffers have been processed and copied back to the host, the host receives an event
E) the host can use the processed buffers; when it is finished with a buffer, that buffer
becomes available for another frame

Would this workflow work with GStreamer? 
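
For reference, it looks like steps A), B) and E) would map onto something like GStreamer's
GstBufferPool: a fixed pool of N buffers where acquiring blocks until one is free, and a
buffer returns to the pool automatically once its last reference is dropped. A minimal
sketch of that idea (the caps, resolution and pool size are just placeholders):

#include <gst/gst.h>

#define N_BUFFERS 4   /* the "N" host-side staging buffers */

int
main (int argc, char **argv)
{
  gst_init (&argc, &argv);

  /* A) keep a fixed pool of N host-side buffers */
  GstBufferPool *pool = gst_buffer_pool_new ();
  GstCaps *caps = gst_caps_new_simple ("video/x-raw",
      "format", G_TYPE_STRING, "I420",
      "width", G_TYPE_INT, 1920, "height", G_TYPE_INT, 1080, NULL);
  GstStructure *config = gst_buffer_pool_get_config (pool);
  gst_buffer_pool_config_set_params (config, caps,
      1920 * 1080 * 3 / 2, N_BUFFERS, N_BUFFERS);
  gst_buffer_pool_set_config (pool, config);
  gst_buffer_pool_set_active (pool, TRUE);

  /* B) blocks until one of the N buffers is free again */
  GstBuffer *buf = NULL;
  gst_buffer_pool_acquire_buffer (pool, &buf, NULL);

  /* C/D) fill the buffer and push it to the element that feeds the card;
   * that element takes ownership of the reference it receives */

  /* E) once the last downstream reference is dropped, the buffer
   * automatically goes back to the pool and a new acquire can succeed */
  gst_buffer_unref (buf);

  gst_buffer_pool_set_active (pool, FALSE);
  gst_object_unref (pool);
  gst_caps_unref (caps);
  return 0;
}

The batching in C)/D), with N frames in flight on the card, would still be up to the
element itself.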

Thanks,
Aaron

Re: Gstreamer support for GPU codecs

Sebastian Dröge
On Fri, 2016-05-13 at 08:34 -0400, Aaron Boxer wrote:

> Would this workflow work with GStreamer?
Yes, you just need to ensure that latency is reported accordingly by
your element.
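
Concretely, that usually means answering the LATENCY query on the element's source pad
(installed with gst_pad_set_query_function()): forward the query upstream, then add the
element's own contribution. A rough sketch, where the handler name and the figure of
N = 4 frames at 30 fps are only placeholders:

#include <gst/gst.h>

static gboolean
gpu_codec_src_query (GstPad * pad, GstObject * parent, GstQuery * query)
{
  switch (GST_QUERY_TYPE (query)) {
    case GST_QUERY_LATENCY:{
      gboolean live;
      GstClockTime min, max;

      /* let the default handler forward the query upstream first */
      if (!gst_pad_query_default (pad, parent, query))
        return FALSE;
      gst_query_parse_latency (query, &live, &min, &max);

      /* add the element's own latency, e.g. N = 4 frames at 30 fps */
      GstClockTime own = gst_util_uint64_scale (4, GST_SECOND, 30);
      min += own;
      if (max != GST_CLOCK_TIME_NONE)
        max += own;
      gst_query_set_latency (query, live, min, max);
      return TRUE;
    }
    default:
      return gst_pad_query_default (pad, parent, query);
  }
}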

--
Sebastian Dröge, Centricular Ltd · http://www.centricular.com

Re: Gstreamer support for GPU codecs

boxerab@gmail.com


On Sat, May 14, 2016 at 3:41 AM, Sebastian Dröge <[hidden email]> wrote:
> Yes, you just need to ensure that latency is reported accordingly by
> your element.


Thanks. So, you mean to update the time stamp on the frame?

Re: Gstreamer support for GPU codecs

Sebastian Dröge
On Sat, 2016-05-14 at 12:25 -0400, Aaron Boxer wrote:

> > Yes, you just need to ensure that latency is reported accordingly
> > by your element.
>
> Thanks. So, you mean to update the time stamp on the frame?

No, handling of the LATENCY query. Or e.g.
gst_video_decoder_set_latency() :) 
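
In a GstVideoDecoder subclass that could look roughly like the sketch below, e.g. from the
::set_format vfunc once the frame rate is known; the subclass name and the batch size of
4 frames are made up for illustration:

#include <gst/video/gstvideodecoder.h>

static gboolean
my_gpu_j2k_dec_set_format (GstVideoDecoder * decoder, GstVideoCodecState * state)
{
  /* ... normal set_format work (output state, etc.) omitted ... */

  /* the card processes batches of 4 frames, so report 4 frame
   * durations of latency */
  if (state->info.fps_n > 0) {
    GstClockTime latency = gst_util_uint64_scale (4,
        GST_SECOND * state->info.fps_d, state->info.fps_n);
    gst_video_decoder_set_latency (decoder, latency, latency);
  }

  return TRUE;
}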

--
Sebastian Dröge, Centricular Ltd · http://www.centricular.com


Re: Gstreamer support for GPU codecs

boxerab@gmail.com


On Sun, May 15, 2016 at 2:02 AM, Sebastian Dröge <[hidden email]> wrote:
> No, handling of the LATENCY query. Or e.g.
> gst_video_decoder_set_latency() :)

I see, thanks.

Re: Gstreamer support for GPU codecs

Nicolas Dufresne
On Saturday, 14 May 2016 at 10:41 +0300, Sebastian Dröge wrote:

> > Would this workflow work with GStreamer?  
> Yes, you just need to ensure that latency is reported accordingly by
> your element.

Note, it's arguably not the most efficient way. Ideally, you should
implement a V4L2 mem-to-mem driver for your card. The videobuf2 and/or
v4l2_mem2mem frameworks will provide you with an appropriate queuing
mechanism and an efficient memory allocation model. Those drivers are already
supported by GStreamer.
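
Once such a driver exists, the v4l2 plugin in gst-plugins-good exposes it as an ordinary
element and it can be dropped into a normal pipeline. A rough sketch; the file name and
decoder element name are examples only, since the actual name is generated from the
driver/codec that is found (e.g. v4l2h264dec):

#include <gst/gst.h>

int
main (int argc, char **argv)
{
  gst_init (&argc, &argv);

  /* the decoder element name below is a guess -- the v4l2 plugin
   * derives it from the driver/codec it probes */
  GError *err = NULL;
  GstElement *pipeline = gst_parse_launch (
      "filesrc location=sample.mp4 ! qtdemux ! h264parse ! "
      "v4l2h264dec ! videoconvert ! autovideosink", &err);
  if (pipeline == NULL) {
    g_printerr ("parse error: %s\n", err->message);
    g_clear_error (&err);
    return 1;
  }

  gst_element_set_state (pipeline, GST_STATE_PLAYING);

  /* run until error or end-of-stream */
  GstBus *bus = gst_element_get_bus (pipeline);
  GstMessage *msg = gst_bus_timed_pop_filtered (bus, GST_CLOCK_TIME_NONE,
      (GstMessageType) (GST_MESSAGE_ERROR | GST_MESSAGE_EOS));
  if (msg != NULL)
    gst_message_unref (msg);
  gst_object_unref (bus);

  gst_element_set_state (pipeline, GST_STATE_NULL);
  gst_object_unref (pipeline);
  return 0;
}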

Nicolas

Re: Gstreamer support for GPU codecs

boxerab@gmail.com


On Tue, May 17, 2016 at 2:13 PM, Nicolas Dufresne <[hidden email]> wrote:
> Note, it's arguably not the most efficient way. Ideally, you should
> implement a V4L2 mem-to-mem driver for your card. The videobuf2 and/or
> v4l2_mem2mem frameworks will provide you with an appropriate queuing
> mechanism and an efficient memory allocation model. Those drivers are
> already supported by GStreamer.


Thanks, Nicolas. I am not completely sure about the actual workflow for J2K streaming,
but I don't think the encoding and capture would necessarily happen on the same machine.

So, one machine would capture uncompressed video and send it to a second machine to compress
and stream to others. For viewing, an endpoint would receive the compressed stream, decompress
it, and display it.

So, I don't think mem-to-mem copying from the card is necessary. But, as I said, streaming is
pretty new to me, so perhaps more experienced folks can chime in here.
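
For what it's worth, that split could be sketched with the JPEG 2000 elements that already
ship with GStreamer (openjpegenc/openjpegdec in gst-plugins-bad, rtpj2kpay/rtpj2kdepay in
gst-plugins-good). The pipeline strings, caps and addresses below are only a guess and will
likely need adjusting for the GStreamer version in use:

#include <gst/gst.h>

/* machine 1: capture (videotestsrc stands in for the real source),
 * compress to JPEG 2000 and stream over RTP/UDP */
static const gchar *sender_desc =
    "videotestsrc ! video/x-raw,width=1280,height=720,framerate=30/1 ! "
    "openjpegenc ! rtpj2kpay ! udpsink host=192.168.1.20 port=5000";

/* machine 2: receive, decompress and display */
static const gchar *receiver_desc =
    "udpsrc port=5000 caps=\"application/x-rtp,media=video,"
    "clock-rate=90000,encoding-name=JPEG2000\" ! "
    "rtpj2kdepay ! openjpegdec ! videoconvert ! autovideosink";

int
main (int argc, char **argv)
{
  gst_init (&argc, &argv);

  /* pass "send" on one machine; anything else runs the receiver */
  const gchar *desc = (argc > 1 && g_str_equal (argv[1], "send"))
      ? sender_desc : receiver_desc;

  GstElement *pipeline = gst_parse_launch (desc, NULL);
  if (pipeline == NULL)
    return 1;

  gst_element_set_state (pipeline, GST_STATE_PLAYING);
  g_main_loop_run (g_main_loop_new (NULL, FALSE));
  return 0;
}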

Aaron