Hello,
I noticed that when transcoding with nvh264dec and nvh264enc, lower resolutions perform better with system/host memory than with GL memory, while higher resolutions perform better with GL memory than with system/host memory.

Profiling the pipelines with nvprof shows memory copy behavior in line with what you'd expect: device-to-host copies are slower than device-to-device copies. Since the copy performance looks as expected, what could be causing the slower GL memory performance, and why does it only affect lower resolutions?

This gist has the results of my testing: https://gist.github.com/sidsethupathi/b464a6dc30907768a074d8dc526b2b66

I created 10-minute test sources, one at 320x240 and another at 3840x2160, and ran them through a "filesrc ! nvh264dec ! nvh264enc ! fakesink" pipeline, similar to Seungha's benchmarks in this MR: https://gitlab.freedesktop.org/gstreamer/gst-plugins-bad/-/merge_requests/539

The results are better formatted in the gist, but they are copied below:

320x240, host memory. Execution time = 0:00:11.252269640
320x240, GL memory. Execution time = 0:00:20.584277338
3840x2160, host memory. Execution time = 0:03:20.462018560
3840x2160, GL memory. Execution time = 0:02:18.106101429
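
To make the comparison easier to read, the four timings above can be converted into a speed relative to real time and a GL/host ratio. This is only a rough sketch: it assumes the "10 minute" sources are exactly 600 seconds long, and the helper names (parse_gst_time, realtime_factor, gl_vs_host) are made up for illustration and are not part of GStreamer.

```python
# Timings copied from the results above (GStreamer "Execution time" strings).
RESULTS = {
    ("320x240", "host"): "0:00:11.252269640",
    ("320x240", "gl"): "0:00:20.584277338",
    ("3840x2160", "host"): "0:03:20.462018560",
    ("3840x2160", "gl"): "0:02:18.106101429",
}

# Assumption: each test source is exactly 10 minutes (600 s) long.
SOURCE_DURATION_S = 600.0


def parse_gst_time(text):
    """Convert an H:MM:SS.nnnnnnnnn string to seconds as a float."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def realtime_factor(elapsed):
    """How many times faster than real time the transcode ran."""
    return SOURCE_DURATION_S / parse_gst_time(elapsed)


def gl_vs_host(resolution):
    """GL time divided by host time; above 1.0 means GL memory was slower."""
    gl = parse_gst_time(RESULTS[(resolution, "gl")])
    host = parse_gst_time(RESULTS[(resolution, "host")])
    return gl / host


if __name__ == "__main__":
    for (resolution, memory), elapsed in RESULTS.items():
        print(f"{resolution:>9} {memory:>4}: "
              f"{realtime_factor(elapsed):6.2f}x real time")
    for resolution in ("320x240", "3840x2160"):
        print(f"{resolution:>9} GL/host time ratio: "
              f"{gl_vs_host(resolution):.2f}")
```

The ratio comes out above 1.0 for 320x240 and below 1.0 for 3840x2160, which is the crossover I am asking about.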
Sid