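I have a question about how the streaming architecture works with GPU acceleration.

Since discrete cards are sitting on the PCI bus, best performance happens when data is moved to the card in a pipeline and the host gets notified when data has been processed and moved back to the host.

Also, it is sometimes more efficient to process N frames at a time.

So, for best perf, the flow would be:

A) host keeps a list of N host-side memory buffers
B) host waits for a host buffer to become available
C) when a buffer is available, host copies memory into that buffer, and queues the buffer to be copied over to the card
D) when N buffers have been processed, and copied back to host, the host receives an event
E) host can use the processed buffers, and when it is finished, that buffer becomes available for another frame

Would this workflow work with GStreamer?

_______________________________________________
gstreamer-devel mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/gstreamer-devel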
On Fri, 2016-05-13 at 08:34 -0400, Aaron Boxer wrote:
> I have a question about how the streaming architecture works with GPU acceleration.
>
> Since discrete cards are sitting on the PCI bus, best performance happens when data is
> moved to the card in a pipeline and the host gets notified when data has been processed
> and moved back to the host.
>
> Also, it is sometimes more efficient to process N frames at a time.
>
> So, for best perf, the flow would be:
>
> A) host keeps a list of N host-side memory buffers
> B) host waits for a host buffer to become available
> C) when a buffer is available, host copies memory into that buffer, and queues the buffer
> to be copied over to the card
> D) when N buffers have been processed, and copied back to host, the host receives an event
> E) host can use the processed buffers, and when it is finished, that buffer becomes available
> for another frame
>
> Would this workflow work with GStreamer?

Yes, you just need to ensure that latency is reported accordingly by your element.

-- 
Sebastian Dröge, Centricular Ltd · http://www.centricular.com
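For readers following along, steps A to E of the quoted workflow can be sketched as a small buffer pool. This is only an illustration in plain C, not GStreamer code: the names `Pool`, `pool_acquire`, `pool_submit`, `pool_poll_batch` and `pool_release` are hypothetical, the batch size and frame size are arbitrary, and the "card" is simulated synchronously by inverting bytes rather than by a real PCI transfer with an asynchronous completion event.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define N_BUFFERS 4      /* N: batch size, chosen arbitrarily for the sketch */
#define FRAME_BYTES 16   /* tiny stand-in for a real frame size */

typedef enum { BUF_FREE, BUF_QUEUED, BUF_DONE } BufState;

typedef struct {
    BufState state;
    unsigned char data[FRAME_BYTES];
} HostBuffer;

typedef struct {
    HostBuffer bufs[N_BUFFERS];  /* step A: N host-side buffers */
    size_t queued;               /* buffers currently queued for the "card" */
} Pool;

static void pool_init(Pool *p) {
    memset(p, 0, sizeof *p);     /* BUF_FREE is 0, so every buffer starts free */
}

/* Step B: find a free host buffer; -1 means the caller has to wait. */
static int pool_acquire(const Pool *p) {
    for (int i = 0; i < N_BUFFERS; i++)
        if (p->bufs[i].state == BUF_FREE)
            return i;
    return -1;
}

/* Step C: copy the frame into the host buffer and queue it for upload. */
static void pool_submit(Pool *p, int idx, const unsigned char *frame) {
    assert(idx >= 0 && idx < N_BUFFERS);
    assert(p->bufs[idx].state == BUF_FREE);
    memcpy(p->bufs[idx].data, frame, FRAME_BYTES);
    p->bufs[idx].state = BUF_QUEUED;
    p->queued++;
}

/* Step D: simulated card. Once N buffers are queued, "process" them
 * (here: invert every byte) and report completion.
 * Returns 1 when a batch finished, 0 if the batch is not full yet. */
static int pool_poll_batch(Pool *p) {
    if (p->queued < N_BUFFERS)
        return 0;
    for (int i = 0; i < N_BUFFERS; i++) {
        for (int j = 0; j < FRAME_BYTES; j++)
            p->bufs[i].data[j] ^= 0xFF;
        p->bufs[i].state = BUF_DONE;
    }
    p->queued = 0;
    return 1;
}

/* Step E: the host is finished with a processed buffer; recycle it. */
static void pool_release(Pool *p, int idx) {
    assert(idx >= 0 && idx < N_BUFFERS);
    assert(p->bufs[idx].state == BUF_DONE);
    p->bufs[idx].state = BUF_FREE;
}
```

A caller would loop over `pool_acquire` and `pool_submit` for each incoming frame, check `pool_poll_batch` for the completion event, and call `pool_release` once each result has been consumed. Note that no output can appear until N frames have gone in, which is why the element has to report latency.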
On Sat, May 14, 2016 at 3:41 AM, Sebastian Dröge wrote:
> Yes, you just need to ensure that latency is reported accordingly by your element.

Thanks. So, you mean to update the time stamp on the frame?
On Sat, 2016-05-14 at 12:25 -0400, Aaron Boxer wrote:
> > Yes, you just need to ensure that latency is reported accordingly
> > by your element.
>
> Thanks. So, you mean to update the time stamp on the frame?

No, handling of the LATENCY query. Or e.g. gst_video_decoder_set_latency() :)

-- 
Sebastian Dröge, Centricular Ltd · http://www.centricular.com
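As a rough illustration of what "reporting latency" means numerically for the N-frame batching case, here is a minimal sketch in plain C. It assumes the added latency is dominated by waiting for N frames to accumulate, and it ignores transfer and processing time on the card. The helper name `batch_latency_ns` is hypothetical. In a real GstVideoDecoder subclass the resulting value would typically be passed to gst_video_decoder_set_latency() as the minimum (and often also the maximum) latency, and GStreamer's gst_util_uint64_scale() would be the overflow-safe way to do the division.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC UINT64_C(1000000000)

/* Latency in nanoseconds introduced by holding back n_frames frames
 * at a frame rate of fps_n / fps_d frames per second.
 * Plain 64-bit arithmetic: fine for small inputs, but unlike
 * gst_util_uint64_scale() it does not guard against overflow. */
static uint64_t batch_latency_ns(uint64_t n_frames, uint64_t fps_n, uint64_t fps_d) {
    assert(fps_n > 0);
    return n_frames * NSEC_PER_SEC * fps_d / fps_n;
}
```

For example, batching 4 frames at 30/1 fps gives about 133 ms, so downstream elements and sinks need to be told to expect output roughly that much later than input.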
On Sun, May 15, 2016 at 2:02 AM, Sebastian Dröge wrote:
> No, handling of the LATENCY query. Or e.g. gst_video_decoder_set_latency() :)

I see, thanks.
On Saturday, 14 May 2016 at 10:41 +0300, Sebastian Dröge wrote:
> On Fri, 2016-05-13 at 08:34 -0400, Aaron Boxer wrote:
> >
> > I have a question about how the streaming architecture works with
> > GPU acceleration.
> >
> > Since discrete cards are sitting on the PCI bus, best performance
> > happens when data is
> > moved to the card in a pipeline and the host gets notified when
> > data has been processed
> > and moved back to the host.
> >
> > Also, it is sometimes more efficient to process N frames at a time.
> >
> > So, for best perf, the flow would be:
> >
> > A) host keeps a list of N host-side memory buffers
> > B) host waits for a host buffer to become available
> > C) when a buffer is available, host copies memory into that buffer,
> > and queues the buffer
> > to be copied over to the card
> > D) when N buffers have been processed, and copied back to host, the
> > host receives an event
> > E) host can use the processed buffers, and when it is finished,
> > that buffer becomes available
> > for another frame
> >
> > Would this workflow work with GStreamer?
> Yes, you just need to ensure that latency is reported accordingly by
> your element.

Note, it's arguably not the most efficient way. Ideally, you should implement a V4L2 mem-to-mem driver for your card. The videobuf2 and/or v4l2_mem2mem framework will provide you with an appropriate queue mechanism and an efficient memory allocation model. Those drivers are already supported by GStreamer.

Nicolas
On Tue, May 17, 2016 at 2:13 PM, Nicolas Dufresne wrote:
> Note, it's arguably not the most efficient way. Ideally, you should
> implement a V4L2 mem-to-mem driver for your card.

Thanks, Nicolas. I am not completely sure about the actual workflow for J2K streaming, but I don't think the encoding and capture would necessarily happen on the same machine.

So, one machine would capture uncompressed, then send to a second one to compress and stream to others. And for viewing, one endpoint would receive the compressed stream, decompress, and display.

So, I don't think mem-to-mem copying from the card is necessary. But, as I said, streaming is pretty new to me, so perhaps more experienced folks can chime in here.

Aaron