It seems to me that GStreamer is currently concerned mostly with how raw media data is organized in memory. For higher-quality applications, explicit information about how the data was created and/or how it is supposed to be rendered could be helpful. I think the current mix of fourcc codes and RGB masks is a mess, and if linear/HDR/Bayer formats are introduced, it will get really hairy.
H.264 allows embedding quite a lot of video metadata as "video usability information", but GStreamer has no standardised mechanism for representing that information. I guess that many other video/still-image codecs allow similar mechanisms. This includes the gamma function, primaries and chroma subsampling characteristics.

My wish is for some general mechanism that _allows_ specifying the nitty-gritty details for those applications where it is needed, but also allows continued use of simplified fourcc or fourcc-like table lookup mechanisms where that is sufficient. The big change would be a "soft" transition where any GStreamer element that claims to be outputting e.g. "I420" would be considered buggy if the buffer actually contained I420-like data with levels spanning 0-255, as in JPEG/JFIF files. That element could then be fixed by using another, suitable fourcc ("J420"?), or by specifying the actual format from the ground up.

I have tried to sketch a format suitable for any common video/x-raw-rgb, video/x-raw-yuv or video/x-raw-bayer format, where all doubt and unwritten "convention" is removed. As an example, I have filled it with example numbers for a given BT709-based "HDYC" format. I have tried to use quantitative (numerical) descriptions instead of strings/table lookups wherever possible, i.e. everywhere except for the gamma specification.

I have sorted the specification into:
1. Color specification/gamma from CIE1931
2. Layout of pixels and subpixels in the image plane
3. Where to find each picture element in memory

All of this is based on an idealised video source/conversion/storage unit that, I believe, can be fully described by those parameters.
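To make the I420 versus "J420" point a bit more concrete, here is a minimal Python sketch of the two numeric conventions that a bare fourcc leaves implicit: the level range, and the transfer ("gamma") function. The function names `rescale_levels` and `bt709_oetf` are made up for illustration and are not GStreamer APIs; the luma range and gamma coefficients are the ones used in the HDYC example in this post.

```python
def rescale_levels(value, src_range, dst_range):
    """Linearly map an integer sample from one inclusive (lo, hi) range to another."""
    src_lo, src_hi = src_range
    dst_lo, dst_hi = dst_range
    scaled = (value - src_lo) * (dst_hi - dst_lo) / (src_hi - src_lo) + dst_lo
    # Clamp so that out-of-range input still yields a legal output level.
    return int(min(max(round(scaled), dst_lo), dst_hi))


def bt709_oetf(L):
    """Map linear light L in [0, 1] to a gamma-corrected value.

    Piecewise curve as listed in the example: linear segment with slope 4.5
    below 0.018, power law with exponent 0.45 above it.
    """
    if L < 0.018:
        return 4.5 * L
    return 1.099 * L ** 0.45 - 0.099


VIDEO_LUMA = (16, 235)   # "studio" levels, as in ch0_range of the example
FULL_RANGE = (0, 255)    # JPEG/JFIF-style levels

# The same byte means different things under the two conventions:
print(rescale_levels(16, VIDEO_LUMA, FULL_RANGE))    # black in video range
print(rescale_levels(235, VIDEO_LUMA, FULL_RANGE))   # white in video range
print(rescale_levels(128, FULL_RANGE, VIDEO_LUMA))   # mid-level going the other way
print(bt709_oetf(0.18))                              # 18% grey after gamma
```

An element that receives 0-255 data labelled as 16-235 would apply the first mapping when it should not, which is exactly the kind of silent error an explicit range field would expose.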
string: "HDYC"

Color specification
  Channel 0 CIE1931 xy primaries          [0.3 0.6]
  Channel 1 CIE1931 xy primaries          [0.15 0.06]
  Channel 2 CIE1931 xy primaries          [0.64 0.33]
  Channel 3 CIE1931 xy primaries
  Color temperature (Kelvin)              6500
  White-point                             [0.3127 0.329]
  Gamma                                   (1.099xL^0.45)-0.099, 1>=L>=0.018
  Gamma                                   4.5xL, 0.018>L>=0

Matrixing
  Number of channels                      3
  Last channel is alpha                   no
  Matrix                                  [1 0 0; 0 1 0; 0 0 1]
  ch0_range                               [16 235]
  ch1_range                               [16 239]
  ch2_range                               [16 239]
  ch3_range

Spatial layout
  height                                  1088
  width                                   1920
  crop                                    [0 1079; 0 1919]
  pixel aspect ratio nom                  1
  pixel aspect ratio den                  1
  Channel 0 - horizontal decimation       1
  Channel 0 - horizontal sample site      1
  Channel 0 - horizontal support
  Channel 0 - vertical decimation         1
  Channel 0 - vertical sample site        1
  Channel 0 - vertical support
  Channel 1 - horizontal decimation       2
  Channel 1 - horizontal sample site      1
  Channel 1 - horizontal support
  Channel 1 - vertical decimation         1
  Channel 1 - vertical sample site        1
  Channel 1 - vertical support
  Channel 2 - horizontal decimation       2
  Channel 2 - horizontal sample site      1
  Channel 2 - horizontal support
  Channel 2 - vertical decimation         1
  Channel 2 - vertical sample site        1
  Channel 2 - vertical support
  Channel 3 - horizontal decimation
  Channel 3 - horizontal sample site
  Channel 3 - horizontal support
  Channel 3 - vertical decimation
  Channel 3 - vertical sample site
  Channel 3 - vertical support

Storage layout
  packed/planar                           packed
  bpp                                     16
  endianness                              4321
  pixel-period                            2
  stride                                  2x1920
  Channel 0 mask                          0xff000000
  Channel 1 mask                          0x00ff0000
  Channel 2 mask                          0x00000000
  Channel 3 mask
  Channel 0 mask2                         0xff000000
  Channel 1 mask2                         0x00000000
  Channel 2 mask2                         0x00ff0000
  Channel 3 mask2
  Channel 0 offset
  Channel 1 offset
  Channel 2 offset
  Channel 3 offset
  Channel 0 stride
  Channel 1 stride
  Channel 2 stride
  Channel 3 stride

>> On Jul 2, 2009, at 6:43 PM, Clark, Rob wrote:
>>
>>> Hi gstreamer folks,
>>>
>>> The following is a proposal for how to add row-stride (and possibly
>>> some related changes) to gstreamer.
>>> I gave a couple of possible examples of where this would be useful,
>>> but it is probably not exhaustive. Please let me know if you see any
>>> cases that I missed, or details that I overlooked, etc.
>>>
>>> Use-cases:
>>> ----------
>>> + display hardware with special constraints on image dimensions, for
>>>   example if the output buffer must have dimensions that are a power
>>>   of two
>>> + zero-copy cropping of image / videoframe (at least for interleaved
>>>   color formats.. more on this later)
>>>
>>> One example to think about is rendering onto a 3d surface. In some
>>> cases, graphics hardware could require that the surface dimensions
>>> are a power of 2. In this case, you would want the vsink to allocate
>>> a buffer with a rowstride that is the next larger power of 2 from the
>>> image width.
>>>
>>> Another example to think about is video stabilization. In this use
>>> case, you would ask the camera to capture an oversized frame. Your
>>> vstab algorithm would calculate an x,y offset of the stabilized
>>> image. But if the decoder understands rowstride, you do not need to
>>> actually copy the image buffer. Say, just to pick some numbers, you
>>> want your final output to be 640x480, and you want your oversized
>>> frame to be +20% in each dimension (768x576):
>>>
>>> +--------+               +-------+               +------+
>>> | camera |-------------->| vstab |-------------->| venc |
>>> +--------+ width=768     +-------+ width=640     +------+
>>>            height=576              height=480
>>>            rowstride=768           rowstride=768
>>>
>>> In the case of an interleaved color format (RGB, UYVY, etc), you
>>> could simply increment the 'data' pointer in the buffer by
>>> (y*rowstride)+x. No memcpy() required. As long as the video encoder
>>> respects the rowstride, it will see the stabilized frame correctly.
>>>
>>> Proposal:
>>> ---------
>>>
>>> In all cases that I can think of, the row-stride will not be changing
>>> dynamically.
>>> So this parameter can be negotiated thru caps negotiation in the
>>> same way as image width/height, colorformat, etc. However, we need to
>>> know conclusively that there is no element in the pipeline that cares
>>> about the image format, but does not understand "rowstride", so we
>>> cannot use existing type strings (ex. "video/x-raw-yuv"). And, at
>>> least in the cases that I can think of, the video sink will dictate
>>> the row-stride. So upstream caps-renegotiation will be used to arrive
>>> at the final "rowstride" value.
>>>
>>> For media types, I propose to continue using existing strings for
>>> non-stride-aware element caps, ex. "video/x-raw-yuv". For
>>> stride-aware elements, they can support a second format, ex.
>>> "video/x-raw-yuv-strided", "image/x-raw-rgb-strided", etc (ie. append
>>> "-strided" to whatever the existing string is). In the case that a
>>> strided format is negotiated, it is required for there to also be a
>>> "rowstride" entry in the final negotiated caps.
>>>
>>> question: in general, most elements supporting rowstride will have no
>>> constraint on what particular rowstride values are supported. Do they
>>> just list "rowstride=[0-4294967295]" in their caps template? The
>>> video sink allocating the buffer will likely have some constraints on
>>> rowstride, although this will be a function of the width (for
>>> example, round the width up to next power of two).
>>>
>>> We will implement some sort of GstRowStrideTransform element to
>>> interface between stride-aware and non-stride-aware elements.
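For reference, the rowstride arithmetic in the quoted proposal can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the helper names (`next_power_of_two`, `crop_offset`, `cropped_rows`) are hypothetical, rowstride is taken to be in bytes rather than pixels, and the format is a packed one with 2 bytes per pixel (UYVY-like), so the x term is scaled by the pixel size. The crop position (64, 48) is just a picked number.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def crop_offset(x, y, rowstride, bytes_per_pixel):
    """Byte offset of pixel (x, y) in a packed frame; rowstride is in bytes."""
    return y * rowstride + x * bytes_per_pixel


def cropped_rows(buf, x, y, width, height, rowstride, bytes_per_pixel):
    """Yield one zero-copy view per row of the cropped window."""
    view = memoryview(buf)
    start = crop_offset(x, y, rowstride, bytes_per_pixel)
    for row in range(height):
        row_start = start + row * rowstride
        yield view[row_start:row_start + width * bytes_per_pixel]


# Oversized 768x576 capture, 2 bytes per pixel, stabilized 640x480 window.
BPP = 2
ROWSTRIDE = 768 * BPP
frame = bytearray(ROWSTRIDE * 576)

rows = list(cropped_rows(frame, 64, 48, 640, 480, ROWSTRIDE, BPP))

# Writing into the big frame is visible through the views: nothing was copied.
frame[crop_offset(64, 48, ROWSTRIDE, BPP)] = 0xAB
print(len(rows), len(rows[0]), hex(rows[0][0]))

# A sink that needs power-of-two rows would pad the stride instead:
print(next_power_of_two(640 * BPP))
```

The consumer only needs the start offset and the unchanged rowstride to walk the cropped window, which is why the stride has to travel with the buffer (or the caps) for this to work.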