Hi all,
I have been working lately on researching ways to make the whole
encoding experience better and more streamlined for applications using
GStreamer, and have come up with a proposal.

You will find the proposal below, and attached to the mail the research
and proposed C API.

Comments and feedback are most welcome.

-----

Summary
-------
  A. Problems
  B. Goals
  1. EncodeBin
  2. Encoding Profile System
  3. Helper Library for Profiles


A. Problems this proposal attempts to solve
-------------------------------------------

* Duplication of pipeline code for GStreamer-based applications
  wishing to encode and/or mux streams, leading to subtle differences
  and inconsistencies across those applications.

* No unified system for describing encoding targets for applications
  in a user-friendly way.

* No unified system for creating encoding targets for applications,
  resulting in duplication of code across all applications,
  differences and inconsistencies that come with that duplication,
  and applications hardcoding element names and settings, resulting in
  poor portability.


B. Goals
--------

1. Convenience encoding element

   Create a convenience GstBin for encoding and muxing several streams,
   hereafter called 'EncodeBin'.

   This element will only contain one single property, which is a
   profile.

2. Define an encoding profile system

3. Encoding profile helper library

   Create a helper library to:
   * create EncodeBin instances based on profiles, and
   * help applications to create/load/save/browse those profiles.


1. EncodeBin
------------

1.1 Proposed API
----------------

EncodeBin is a GstBin subclass.

It implements the GstTagSetter interface, by which it will proxy the
calls to the muxer.

Only two introspectable properties (i.e.
usable without extra API):
 * A GstEncodingProfile*
 * The name of the profile to use

When a profile is selected, encodebin will:
 * Add REQUEST sinkpads for all the GstStreamProfile
 * Create the muxer and expose the source pad

Whenever a request pad is created, encodebin will:
 * Create the chain of elements for that pad
 * Ghost the sink pad
 * Return that ghost pad

This allows reducing the code to the minimum for applications
wishing to encode a source for a given profile:

  ...

  /* create the bin and select the profile by name */
  encbin = gst_element_factory_make ("encodebin", NULL);
  g_object_set (encbin, "profile", "N900/H264 HQ", NULL);
  gst_element_link (encbin, filesink);

  ...

  /* request a video sink pad and link the source to it */
  vsrcpad = gst_element_get_static_pad (source, "src1");
  vsinkpad = gst_element_get_request_pad (encbin, "video_%d");
  gst_pad_link (vsrcpad, vsinkpad);

  ...


1.2 Explanation of the Various stages in EncodeBin
--------------------------------------------------

This describes the various stages which can happen in order to end
up with a multiplexed stream that can then be stored or streamed.

1.2.1 Incoming streams

The streams fed to EncodeBin can be of various types:

 * Video
   * Uncompressed (but maybe subsampled)
   * Compressed
 * Audio
   * Uncompressed (audio/x-raw-{int|float})
   * Compressed
 * Timed text
 * Private streams


1.2.2 Steps involved for raw video encoding

(0) Incoming Stream

(1) Transform raw video feed (optional)

    Here we modify the various fundamental properties of a raw video
    stream to be compatible with the intersection of:
    * The encoder GstCaps and
    * The specified "Stream Restriction" of the profile/target

    The fundamental properties that can be modified are:
    * width/height
      This is done with a video scaler.
      The DAR (Display Aspect Ratio) MUST be respected.
      If needed, black borders can be added to comply with the target
      DAR.
    * framerate
    * format/colorspace/depth
      All of this is done with a colorspace converter

(2) Actual encoding (optional for raw streams)

    An encoder (with some optional settings) is used.
(3) Muxing

    A muxer (with some optional settings) is used.

(4) Outgoing encoded and muxed stream


1.2.3 Steps involved for raw audio encoding

This is roughly the same as for raw video, except for (1).

(1) Transform raw audio feed (optional)

    We modify the various fundamental properties of a raw audio stream
    to be compatible with the intersection of:
    * The encoder GstCaps and
    * The specified "Stream Restriction" of the profile/target

    The fundamental properties that can be modified are:
    * Number of channels
    * Type of raw audio (integer or floating point)
    * Depth (number of bits required to encode one sample)


1.2.4 Steps involved for encoded audio/video streams

Steps (1) and (2) are replaced by a parser if a parser is available
for the given format.


1.2.5 Steps involved for other streams

Other streams will just be forwarded as-is to the muxer, provided the
muxer accepts the stream type.


2. Encoding Profile System
--------------------------

This work is based on:
 * The existing GstPreset system for elements [0]
 * The gnome-media GConf audio profile system [1]
 * The investigation done into device profiles by Arista and
   Transmageddon [2 and 3]

2.2 Terminology
---------------

* Encoding Target Category
  A Target Category is a classification of devices/systems/use-cases
  for encoding.

  Such a classification is required in order for:
  * Applications with a very specific use-case to limit the number of
    profiles they can offer the user. A screencasting application has
    no use for the online services targets, for example.
  * Offering the user some initial classification in the case of a
    more generic encoding application (like a video editor or a
    transcoder).

  Ex:
    Consumer devices
    Online service
    Intermediate Editing Format
    Screencast
    Capture
    Computer

* Encoding Profile Target
  A Profile Target describes a specific entity for which we wish to
  encode.
  A Profile Target must belong to at least one Target Category.
  It will define at least one Encoding Profile.
  Ex (with category):
    Nokia N900 (Consumer device)
    Sony PlayStation 3 (Consumer device)
    Youtube (Online service)
    DNxHD (Intermediate editing format)
    HuffYUV (Screencast)
    Theora (Computer)

* Encoding Profile
  A specific combination of muxer, encoders, presets and limitations.

  Ex:
    Nokia N900/H264 HQ
    Ipod/High Quality
    DVD/Pal
    Youtube/High Quality
    HTML5/Low Bandwidth
    DNxHD

2.3 Encoding Profile
--------------------

An encoding profile requires the following information:

 * Name
   This string is not translatable and must be unique.
   A recommendation to guarantee uniqueness of the naming could be:
     <target>/<name>
 * Description
   This is a translatable string describing the profile.
 * Muxing format
   This is a string containing the GStreamer media-type of the
   container format.
 * Muxing preset
   This is an optional string describing the preset(s) to use on the
   muxer.
 * Multipass setting
   This is a boolean describing whether the profile requires several
   passes.
 * List of Stream Profiles

2.3.1 Stream Profiles

A Stream Profile consists of:

 * Type
   The type of stream profile (audio, video, text, private-data)
 * Encoding Format
   This is a string containing the GStreamer media-type of the encoding
   format to be used. If encoding is not to be applied, the raw audio
   media type will be used.
 * Encoding preset
   This is an optional string describing the preset(s) to use on the
   encoder.
 * Restriction
   This is an optional GstCaps containing the restriction of the
   stream that can be fed to the encoder.
   This will generally contain restrictions on video
   width/height/framerate or audio depth.
 * presence
   This is an integer specifying how many streams can be used in the
   containing profile. 0 means that any number of streams can be
   used.
 * pass
   This is an integer which is only meaningful if the multipass flag
   has been set in the profile. If it has been set, it indicates which
   pass this Stream Profile corresponds to.

2.4 Example profile
-------------------

The representation used here is XML only as an example.
No decision is made as to which formatting to use for storing targets
and profiles.

<gst-encoding-target>
  <name>Nokia N900</name>
  <category>Consumer Device</category>
  <profiles>
    <profile>Nokia N900/H264 HQ</profile>
    <profile>Nokia N900/MP3</profile>
    <profile>Nokia N900/AAC</profile>
  </profiles>
</gst-encoding-target>

<gst-encoding-profile>
  <name>Nokia N900/H264 HQ</name>
  <description>
    High Quality H264/AAC for the Nokia N900
  </description>
  <format>video/quicktime,variant=iso</format>
  <streams>
    <stream-profile>
      <type>audio</type>
      <format>audio/mpeg,mpegversion=4</format>
      <preset>Quality High/Main</preset>
      <restriction>audio/x-raw-int,channels=[1,2]</restriction>
      <presence>1</presence>
    </stream-profile>
    <stream-profile>
      <type>video</type>
      <format>video/x-h264</format>
      <preset>Profile Baseline/Quality High</preset>
      <restriction>
        video/x-raw-yuv,width=[16, 800],\
        height=[16, 480],framerate=[1/1, 30000/1001]
      </restriction>
      <presence>1</presence>
    </stream-profile>
  </streams>
</gst-encoding-profile>

2.5 API
-------

A proposed C API is contained in the gstprofile.h file in this
directory.


2.6 Modifications required in the existing GstPreset system
-----------------------------------------------------------

2.6.1 Temporary presets

Currently a preset needs to be saved on disk in order to be used.

This makes it impossible to have temporary presets (that exist only
during the lifetime of a process), which might be required in the
newly proposed profile system.

2.6.2 Categorisation of presets

Currently presets are just aliases for a group of property/value pairs,
without any meaning or explanation as to how they exclude each other.

Take for example the H264 encoder. It can have presets for:
 * passes (1, 2 or 3 passes)
 * profiles (Baseline, Main, ...)
 * quality (Low, Medium, High)

In order to programmatically know which presets exclude each other,
we here propose the categorisation of these presets.

This can be done in one of two ways:

 1. in the name (by making the name be [<category>:]<name>)
    This would give for example: "Quality:High", "Profile:Baseline"
 2. by adding a new _meta key
    This would give for example: _meta/category:quality

2.6.3 Aggregation of presets

There can be more than one choice of preset to be made for an
element (quality, profile, pass).

This means that one cannot currently describe the full configuration
of an element with a single string, but only with many.

The proposal here is to extend the GstPreset API to be able to set
all presets using one string and a well-known separator ('/').

This change only requires changes in the core preset handling code.

This would allow doing the following:

  gst_preset_load_preset (h264enc,
      "pass:1/profile:baseline/quality:high");


2.7 Points to be determined
---------------------------

This document hasn't determined yet how to solve the following
problems:

2.7.1 Storage of profiles

One proposal for storage would be to use a system-wide directory
(like $prefix/share/gstreamer-0.10/profiles) and store XML files for
every individual profile.

Users could then add their own profiles in ~/.gstreamer-0.10/profiles

This poses some limitations as to what to do if some applications
want to have some profiles limited to their own usage.


3. Helper library for profiles
------------------------------

These helper methods could also be added to existing libraries (like
GstPreset, GstPbUtils, ..).

The various APIs proposed are in the accompanying gstprofile.h file.

3.1 Getting user-readable names for formats

This is already provided by GstPbUtils.

3.2 Hierarchy of profiles

The goal is for applications to be able to present to the user a list
of combo-boxes for choosing their output profile:

  [ Category ]       # optional, depends on the application
  [ Device/Site/.. ] # optional, depends on the application
  [ Profile ]

Convenience methods are offered to easily get lists of categories,
devices, and profiles.
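
To make the aggregated preset string of section 2.6.3 more concrete,
here is a minimal sketch in plain C (no GStreamer dependency) of how a
string such as "pass:1/profile:baseline/quality:high" could be split
into category/name pairs. The PresetPart type and the split_presets()
function are hypothetical names used only for illustration; they are
not part of the GstPreset API nor of gstprofile.h. A part without a
category prefix gets an empty category, following the
[<category>:]<name> form from 2.6.2.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PRESET_FIELD_MAX 64

/* Hypothetical type: one "[<category>:]<name>" part of an aggregated
 * preset string. */
typedef struct {
  char category[PRESET_FIELD_MAX]; /* empty when no category is given */
  char name[PRESET_FIELD_MAX];
} PresetPart;

/* Copy src[0..len) into dst, truncating to PRESET_FIELD_MAX - 1 bytes. */
static void
copy_field (char *dst, const char *src, size_t len)
{
  if (len >= PRESET_FIELD_MAX)
    len = PRESET_FIELD_MAX - 1;
  memcpy (dst, src, len);
  dst[len] = '\0';
}

/* Split "cat:name/cat:name/..." on '/', then each part on its first ':'.
 * Returns the number of parts written to out (at most max_parts).
 * Empty parts (as in "a//b") are skipped. */
static int
split_presets (const char *str, PresetPart *out, int max_parts)
{
  int n = 0;

  assert (str != NULL && out != NULL);

  while (*str != '\0' && n < max_parts) {
    size_t part_len = strcspn (str, "/");   /* up to next '/' or end */

    if (part_len > 0) {
      const char *colon = memchr (str, ':', part_len);

      if (colon != NULL) {
        size_t cat_len = (size_t) (colon - str);

        copy_field (out[n].category, str, cat_len);
        copy_field (out[n].name, colon + 1, part_len - cat_len - 1);
      } else {
        out[n].category[0] = '\0';
        copy_field (out[n].name, str, part_len);
      }
      n++;
    }
    str += part_len;
    if (*str == '/')
      str++;                                /* skip the separator */
  }
  return n;
}
```

With the categories split out this way, detecting two presets that
exclude each other comes down to checking whether the same category
appears twice in the resulting array.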
3.3 Creating Profiles

The goal is for applications to be able to easily create profiles.

Applications need to have a fast/efficient way to:
 * select a container format and see all compatible streams they can
   use with it.
 * select a codec format and see which container formats they can use
   with it.

The remaining parts concern the restrictions to encoder input.

3.4 Ensuring availability of plugins for Profiles

When an application wishes to use a Profile, it should be able to
query whether it has all the needed plugins to use it.

This part will use GstPbUtils to query, and if needed install, the
missing plugins through the installed distribution plugin installer.


* Research links

Some of these are still active documents, some others are not.

[0] GstPreset API documentation
    http://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer/html/GstPreset.html

[1] gnome-media GConf profiles
    http://www.gnome.org/~bmsmith/gconf-docs/C/gnome-media.html

[2] Research on a Device Profile API
    http://gstreamer.freedesktop.org/wiki/DeviceProfile

[3] Research on defining presets usage
    http://gstreamer.freedesktop.org/wiki/PresetDesign

------------------------------------------------------------------------------
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
_______________________________________________
gstreamer-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gstreamer-devel

Attachments: encoding-research.txt (3K), gstencodebin.h (1K),
gstprofile.h (7K)

On Mon, Oct 19, 2009 at 9:53 AM, Edward Hervey
<[hidden email]> wrote:
> Hi all,
>
> I have been working lately on researching ways to make the whole
> encoding experience better and more streamlined for applications using
> GStreamer, and have come up with a proposal.

Great. Comments inline below:

>
> Only two introspectable properties (i.e. usable without extra API):
>  * A GstEncodingProfile*
>  * The name of the profile to use
>
> When a profile is selected, encodebin will:
>  * Add REQUEST sinkpads for all the GstStreamProfile
>  * Create the muxer and expose the source pad
>
> Whenever a request pad is created, encodebin will:
>  * Create the chain of elements for that pad
>  * Ghost the sink pad
>  * Return that ghost pad
>
> This allows reducing the code to the minimum for applications
> wishing to encode a source for a given profile:
>
>   ...
>
>   encbin = gst_element_factory_make ("encodebin", NULL);
>   g_object_set (encbin, "profile", "N900/H264 HQ", NULL);

Perhaps "profile-name" ("profile" being reserved for the profile
object itself) would be a better name.

>
> 1.2.1 Incoming streams
>
> The streams fed to EncodeBin can be of various types:
>
>  * Video
>    * Uncompressed (but maybe subsampled)
>    * Compressed
>  * Audio
>    * Uncompressed (audio/x-raw-{int|float})
>    * Compressed
>  * Timed text
>  * Private streams

Any ideas on how this allows re-muxing (without re-encoding) of
certain streams? This wouldn't be an essential feature for the initial
_implementation_, but I think keeping it in mind when designing the
APIs is pretty important. It looks like you've thought about this, but
it's not clear from this writeup what conclusions you came to :-)

Maybe some API to query what caps are available for re-muxing given
the current profile - then the app can check that, then either
continue decoding if the input stream is incompatible, or pass-through
if possible. Or should the application directly be querying the
profile, rather than going through APIs on the bin, for this stuff?

>
>
> 1.2.2 Steps involved for raw video encoding
>
> (0) Incoming Stream
>
> (1) Transform raw video feed (optional)
>
>     Here we modify the various fundamental properties of a raw video
>     stream to be compatible with the intersection of:
>     * The encoder GstCaps and
>     * The specified "Stream Restriction" of the profile/target
>
>     The fundamental properties that can be modified are:
>     * width/height
>       This is done with a video scaler.
>       The DAR (Display Aspect Ratio) MUST be respected.
>       If needed, black borders can be added to comply with the
>       target DAR.
>     * framerate
>     * format/colorspace/depth
>       All of this is done with a colorspace converter

With respect to framerate, any thought on VFR streams? If the target
format supports VFR, then it'd be nice to be able to just encode the
input as-is, without having to force it to a specified framerate.

It'd probably also be good to have some way to select, and then set
properties on, the elements used here. e.g. the application probably
wants to be able to control what sort of scaling to do (to enable
high-quality scaling, for example, or low-quality/fast for preview
encodes). Obviously, the default would just work, so this would be
more optional API for more advanced applications.

>
> (2) Actual encoding (optional for raw streams)
>
>     An encoder (with some optional settings) is used.

Are you planning anything for specifying how the settings should work,
such that a profile could contain settings that apply to several
different encoders (probably selected by rank, or optionally forced by
the application), or will the settings be tied to a specific element?

>
> (3) Muxing
>
>     A muxer (with some optional settings) is used.
>
> (4) Outgoing encoded and muxed stream
>
>
> 1.2.3 Steps involved for raw audio encoding
>
> This is roughly the same as for raw video, except for (1).
>
> (1) Transform raw audio feed (optional)
>
>     We modify the various fundamental properties of a raw audio
>     stream to be compatible with the intersection of:
>     * The encoder GstCaps and
>     * The specified "Stream Restriction" of the profile/target
>
>     The fundamental properties that can be modified are:
>     * Number of channels
>     * Type of raw audio (integer or floating point)
>     * Depth (number of bits required to encode one sample)
>
>
> 1.2.4 Steps involved for encoded audio/video streams
>
> Steps (1) and (2) are replaced by a parser if a parser is available
> for the given format.
>
>
> 1.2.5 Steps involved for other streams
>
> Other streams will just be forwarded as-is to the muxer, provided the
> muxer accepts the stream type.
>
>
> 2. Encoding Profile System
> --------------------------
>
> This work is based on:
>  * The existing GstPreset system for elements [0]
>  * The gnome-media GConf audio profile system [1]
>  * The investigation done into device profiles by Arista and
>    Transmageddon [2 and 3]
>
> 2.2 Terminology
> ---------------
>
> * Encoding Target Category
>   A Target Category is a classification of devices/systems/use-cases
>   for encoding.
>
>   Such a classification is required in order for:
>   * Applications with a very specific use-case to limit the number of
>     profiles they can offer the user. A screencasting application has
>     no use for the online services targets, for example.
>   * Offering the user some initial classification in the case of a
>     more generic encoding application (like a video editor or a
>     transcoder).
>
>   Ex:
>     Consumer devices
>     Online service
>     Intermediate Editing Format
>     Screencast
>     Capture
>     Computer
>
> * Encoding Profile Target
>   A Profile Target describes a specific entity for which we wish to
>   encode.
>   A Profile Target must belong to at least one Target Category.
>   It will define at least one Encoding Profile.
>
>   Ex (with category):
>     Nokia N900 (Consumer device)
>     Sony PlayStation 3 (Consumer device)
>     Youtube (Online service)
>     DNxHD (Intermediate editing format)
>     HuffYUV (Screencast)
>     Theora (Computer)
>
> * Encoding Profile
>   A specific combination of muxer, encoders, presets and limitations.
>
>   Ex:
>     Nokia N900/H264 HQ
>     Ipod/High Quality
>     DVD/Pal
>     Youtube/High Quality
>     HTML5/Low Bandwidth
>     DNxHD
>
> 2.3 Encoding Profile
> --------------------
>
> An encoding profile requires the following information:
>
>  * Name
>    This string is not translatable and must be unique.
>    A recommendation to guarantee uniqueness of the naming could be:
>      <target>/<name>
>  * Description
>    This is a translatable string describing the profile.
>  * Muxing format
>    This is a string containing the GStreamer media-type of the
>    container format.
>  * Muxing preset
>    This is an optional string describing the preset(s) to use on the
>    muxer.
>  * Multipass setting
>    This is a boolean describing whether the profile requires several
>    passes.
>  * List of Stream Profiles
>
> 2.3.1 Stream Profiles
>
> A Stream Profile consists of:
>
>  * Type
>    The type of stream profile (audio, video, text, private-data)
>  * Encoding Format
>    This is a string containing the GStreamer media-type of the
>    encoding format to be used. If encoding is not to be applied, the
>    raw audio media type will be used.
>  * Encoding preset
>    This is an optional string describing the preset(s) to use on the
>    encoder.
>  * Restriction
>    This is an optional GstCaps containing the restriction of the
>    stream that can be fed to the encoder.
>    This will generally contain restrictions on video
>    width/height/framerate or audio depth.
>  * presence
>    This is an integer specifying how many streams can be used in the
>    containing profile. 0 means that any number of streams can be
>    used.
>  * pass
>    This is an integer which is only meaningful if the multipass flag
>    has been set in the profile. If it has been set, it indicates which
>    pass this Stream Profile corresponds to.
>
> 2.4 Example profile
> -------------------
>
> The representation used here is XML only as an example. No decision is
> made as to which formatting to use for storing targets and profiles.

Whatever decision is made as to the 'default' format for storing
these, I'd really like to see a sufficiently complete API that an
application that (for whatever reason) doesn't want to use that format
could build the GstEncodingProfile object itself, from its own data
store.

>
> <gst-encoding-target>
>   <name>Nokia N900</name>
>   <category>Consumer Device</category>
>   <profiles>
>     <profile>Nokia N900/H264 HQ</profile>
>     <profile>Nokia N900/MP3</profile>
>     <profile>Nokia N900/AAC</profile>
>   </profiles>
> </gst-encoding-target>
>
> <gst-encoding-profile>
>   <name>Nokia N900/H264 HQ</name>
>   <description>
>     High Quality H264/AAC for the Nokia N900
>   </description>
>   <format>video/quicktime,variant=iso</format>
>   <streams>
>     <stream-profile>
>       <type>audio</type>
>       <format>audio/mpeg,mpegversion=4</format>
>       <preset>Quality High/Main</preset>
>       <restriction>audio/x-raw-int,channels=[1,2]</restriction>
>       <presence>1</presence>
>     </stream-profile>
>     <stream-profile>
>       <type>video</type>
>       <format>video/x-h264</format>
>       <preset>Profile Baseline/Quality High</preset>
>       <restriction>
>         video/x-raw-yuv,width=[16, 800],\
>         height=[16, 480],framerate=[1/1, 30000/1001]
>       </restriction>
>       <presence>1</presence>
>     </stream-profile>
>   </streams>
>
> </gst-encoding-profile>

This describes the constraints on the device (or whatever). Have you
thought at all about splitting out "constraints on what the target can
accept" from "what we actually want to encode"?

e.g.
this profile says that I can do any size (within that range)
video, but my application wants to encode at a particular size -
should I be replacing the caps in the profile at runtime, or should
there be another object to represent these (somewhat different)
concepts?

What about constraints that are not (currently, at least) expressible
through caps? e.g. bitrate, profiles, etc?

Anyway, I don't have time right now to continue through this in enough
depth - and I'm sure some of my remarks miss something you've already
thought about - but this was just to throw some more ideas into the
mix.

I'm very happy to see you looking into this more deeply!

Mike

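
The pass-through check suggested earlier in this mail (remux when the
input is already in the target format, decode and re-encode otherwise)
could look roughly like the sketch below. It is plain C and only
compares media-type names; real code would compare full GstCaps (for
example with gst_caps_intersect()) so that fields such as mpegversion
are taken into account. StreamProfile, StreamAction and
choose_action() are hypothetical names for illustration, not part of
the proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, heavily simplified stream profile: type and target
 * encoding format only. */
typedef struct {
  const char *type;    /* "audio", "video", ... */
  const char *format;  /* media type name, e.g. "video/x-h264" */
} StreamProfile;

typedef enum {
  STREAM_PASS_THROUGH, /* already in a target format: parse and mux */
  STREAM_ENCODE        /* decode if needed, then encode */
} StreamAction;

/* Decide what to do with an input stream, given the stream profiles
 * of the selected encoding profile. Only the media type name is
 * compared, which is an assumption made to keep the sketch short. */
static StreamAction
choose_action (const char *input_media_type,
               const StreamProfile *profiles, size_t n_profiles)
{
  size_t i;

  assert (input_media_type != NULL);
  assert (profiles != NULL || n_profiles == 0);

  for (i = 0; i < n_profiles; i++) {
    if (strcmp (input_media_type, profiles[i].format) == 0)
      return STREAM_PASS_THROUGH;
  }
  return STREAM_ENCODE;
}
```

Whether such a query belongs on the profile or on the bin is exactly
the open question above; the sketch only needs the profile data, which
suggests it can be answered before any element is created.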
On Mon, 2009-10-19 at 10:54 -0700, Michael Smith wrote:
> >
> >   encbin = gst_element_factory_make ("encodebin", NULL);
> >   g_object_set (encbin, "profile", "N900/H264 HQ", NULL);
>
> Perhaps "profile-name" ("profile" being reserved for the profile
> object itself) would be a better name.

Yes, having both would be nice. I'll add that.

>
> >
> > 1.2.1 Incoming streams
> >
> > The streams fed to EncodeBin can be of various types:
> >
> >  * Video
> >    * Uncompressed (but maybe subsampled)
> >    * Compressed
> >  * Audio
> >    * Uncompressed (audio/x-raw-{int|float})
> >    * Compressed
> >  * Timed text
> >  * Private streams
>
> Any ideas on how this allows re-muxing (without re-encoding) of
> certain streams? This wouldn't be an essential feature for the initial
> _implementation_, but I think keeping it in mind when designing the
> APIs is pretty important.

That is definitely going to go in the first implementation (one of
the initial tests I have in mind is the simple remux-to-same-format and
remux-in-compatible muxer).

The current experience so far when doing transmuxing is that we need
to have a parser (like mpegaudioparse, mpegvideoparse, ...) to verify
that the stream is properly formatted, that the buffers are correctly
packetized, timestamps properly set, etc...

So the idea is, when no re-encoding is involved, to use a parser for
that stream format if one is present.

> It looks like you've thought about this, but
> it's not clear from this writeup what conclusions you came to :-)
>
> Maybe some API to query what caps are available for re-muxing given
> the current profile

Something like

/**
 * gst_encoding_profile_get_input_caps:
 * @profile: a #GstEncodingProfile
 *
 * Returns: the list of all caps the profile can accept. Caller must call
 * gst_caps_unref on all unwanted caps once it is done with the list.
 */
GList * gst_profile_get_input_caps (GstEncodingProfile *profile);

> - then the app can check that, then either
> continue decoding if the input stream is incompatible, or pass-through
> if possible.
Or should the application directly be querying the
> profile, rather than going through APIs on the bin, for this stuff?

The generic idea is:
 * to not have any special API on EncodeBin (except for the standard
   caps request, state change, .. API)
 * to delay creation of EncodeBin as late as possible and work only
   with GstEncodingProfile API until then.

>
> >
> >
> > 1.2.2 Steps involved for raw video encoding
> >
> > (0) Incoming Stream
> >
> > (1) Transform raw video feed (optional)
> >
> >     Here we modify the various fundamental properties of a raw video
> >     stream to be compatible with the intersection of:
> >     * The encoder GstCaps and
> >     * The specified "Stream Restriction" of the profile/target
> >
> >     The fundamental properties that can be modified are:
> >     * width/height
> >       This is done with a video scaler.
> >       The DAR (Display Aspect Ratio) MUST be respected.
> >       If needed, black borders can be added to comply with the
> >       target DAR.
> >     * framerate
> >     * format/colorspace/depth
> >       All of this is done with a colorspace converter
>
> With respect to framerate, any thought on VFR streams? If the target
> format supports VFR, then it'd be nice to be able to just encode the
> input as-is, without having to force it to a specified framerate.

Hadn't thought about that one, and there are indeed use-cases where
you'd want that (webcams that change their framerate depending on the
light level for example, and you don't want to use videorate in those
cases).

Adding a boolean variable_framerate in GstVideoEncodingProfile would
be an option then, and have it default to FALSE.

/**
 * GstVideoEncodingProfile:
 * @profile: common #GstEncodingProfile part.
 * @pass: The pass number if this is part of a multi-pass profile. Starts at 1
 * for multi-pass. Set to 0 if this is not part of a multi-pass profile.
 * @variable_framerate: Do not enforce framerate on incoming raw stream. Default
 * is FALSE.
 */
struct _GstVideoEncodingProfile {
  GstStreamEncodingProfile profile;
  guint pass;
  gboolean variable_framerate;
};

>
> It'd probably also be good to have some way to select, and then set
> properties on, the elements used here. e.g. the application probably
> wants to be able to control what sort of scaling to do (to enable
> high-quality scaling, for example, or low-quality/fast for preview
> encodes). Obviously, the default would just work, so this would be
> more optional API for more advanced applications.

That's the problem with trying to design the
one-API-to-rule-them-all :)

Seriously though... the problem is that if we expose everything... we
just come back to square one (or maybe two, but not that far ahead).

The other problem is also that we might have several 'converters'
available, none of them having well-known properties. Maybe some
platform might have a differently named converter, ...

An intermediate solution might be to provide a quality/speed knob over
those conversions.
Maybe have it as a boolean. So by default you would get the highest
quality of conversion available (if you're in a live pipeline, QoS would
kick in to lower the quality so you don't lose any data), but if you
flip that boolean, you would get a low-quality/as-fast-as-possible
conversion.

>
> >
> > (2) Actual encoding (optional for raw streams)
> >
> >     An encoder (with some optional settings) is used.
>
> Are you planning anything for specifying how the settings should work,
> such that a profile could contain settings that apply to several
> different encoders (probably selected by rank, or optionally forced by
> the application), or will the settings be tied to a specific element?

Right now the settings would be tied to a specific element, since the
profile system relies exclusively on presets (which are tied to an
element) for properties of an element.

The rationale behind this...
is that we have no unified properties across elements, let alone
across encoders, let alone across different encoders for the same
format.

I'd *LOVE* to have a unified system for properties which are common
to encoders... but every time I put my head down on that problem, I
only see one solution: base classes for encoders (guaranteeing *some*
consistency in properties).

> >
> > The representation used here is XML only as an example. No decision is
> > made as to which formatting to use for storing targets and profiles.
>
> Whatever decision is made as to the 'default' format for storing
> these, I'd really like to see a sufficiently complete API that an
> application that (for whatever reason) doesn't want to use that format
> could build the GstEncodingProfile object itself, from its own data
> store.

Absolutely, whichever storage format is decided on will only be the
reference one. It's important to get that one right, since the goal is
for it to be the one that system-wide profiles (and those shipped in
gstreamer modules) will come in.

BUT, we do want to leave the possibility for applications (or any
service) to create those profiles on their own.

>
> >
> > <gst-encoding-target>
> >   <name>Nokia N900</name>
> >   <category>Consumer Device</category>
> >   <profiles>
> >     <profile>Nokia N900/H264 HQ</profile>
> >     <profile>Nokia N900/MP3</profile>
> >     <profile>Nokia N900/AAC</profile>
> >   </profiles>
> > </gst-encoding-target>
> >
> > <gst-encoding-profile>
> >   <name>Nokia N900/H264 HQ</name>
> >   <description>
> >     High Quality H264/AAC for the Nokia N900
> >   </description>
> >   <format>video/quicktime,variant=iso</format>
> >   <streams>
> >     <stream-profile>
> >       <type>audio</type>
> >       <format>audio/mpeg,mpegversion=4</format>
> >       <preset>Quality High/Main</preset>
> >       <restriction>audio/x-raw-int,channels=[1,2]</restriction>
> >       <presence>1</presence>
> >     </stream-profile>
> >     <stream-profile>
> >       <type>video</type>
> >       <format>video/x-h264</format>
> >       <preset>Profile Baseline/Quality High</preset>
> >       <restriction>
> >         video/x-raw-yuv,width=[16, 800],\
> >         height=[16, 480],framerate=[1/1, 30000/1001]
> >       </restriction>
> >       <presence>1</presence>
> >     </stream-profile>
> >   </streams>
> >
> > </gst-encoding-profile>
>
> This describes the constraints on the device (or whatever). Have you
> thought at all about splitting out "constraints on what the target can
> accept" from "what we actually want to encode"?
>
> e.g. this profile says that I can do any size (within that range)
> video, but my application wants to encode at a particular size -
> should I be replacing the caps in the profile at runtime, or should
> there be another object to represent these (somewhat different)
> concepts?

I guess I should have put another example, where the target is a less
'flexible' device.
Let's say your target is a portable device that only supports
320x240@25fps; then the profile would have those very specific caps in
the restriction:
(video/x-raw-yuv,width=320,height=240,framerate=25/1)

Do you have any more specific example in mind with the above
use-case?
Did you mean you wanted the application to be able to fine-tune even more the profile at runtime ? > > What about constraints that are not (currently, at least) expressible > through caps? e.g. bitrate, profiles, etc? Those are tunable through the presets (through which all properties are expressed), in the N900 example above, it is set to baseline profile and the bitrate corresponding to "Quality High". > > > Anyway, I don't have time right now to continue through this in enough > depth - and I'm sure some of my remarks miss something you've already > thought about - but this was just to throw some more ideas into the > mix. I'm looking forward to the rest of your comments, > > I'm very happy to see you looking into this more deeply! Thank you Edward > > Mike > > ------------------------------------------------------------------------------ > Come build with us! The BlackBerry(R) Developer Conference in SF, CA > is the only developer event you need to attend this year. Jumpstart your > developing skills, take BlackBerry mobile applications to market and stay > ahead of the curve. Join us from November 9 - 12, 2009. Register now! > http://p.sf.net/sfu/devconference > _______________________________________________ > gstreamer-devel mailing list > [hidden email] > https://lists.sourceforge.net/lists/listinfo/gstreamer-devel ------------------------------------------------------------------------------ Come build with us! The BlackBerry(R) Developer Conference in SF, CA is the only developer event you need to attend this year. Jumpstart your developing skills, take BlackBerry mobile applications to market and stay ahead of the curve. Join us from November 9 - 12, 2009. Register now! http://p.sf.net/sfu/devconference _______________________________________________ gstreamer-devel mailing list [hidden email] https://lists.sourceforge.net/lists/listinfo/gstreamer-devel |
>>
>> It'd probably also be good to have some way to select, and then set >> properties on, the elements used here. e.g. the application probably >> wants to be able to control what sort of scaling to do (to enable >> high-quality scaling, for example, or low-quality/fast for preview >> encodes). Obviously, the default would just work, so this would be >> more optional API for more advanced applications. > > That's the problem with trying to design the > one-API-to-rule-them-all :) > > Seriously though... the problem is that if we expose everything... we > just come back to square one (or maybe two, but not that far ahead). > > The other problem is also that we might have several 'converters' > available, none of them having well-known properties. Maybe some > platform might have a differently named converter, ... > > An intermediate solution might be to provide a quality/speed knob over > those conversions. > Maybe have it as a boolean. So by default you would get the highest > quality of conversion available (if you're in a live pipeline, QoS would > kick in to lower the quality so you don't lose any data), but if you > flip that boolean, you would get a low-quality/as-fast-as-possible > conversion. I was basically just thinking of something along the lines of what playbin does - by default, just do something sane (here, videoscale with the scaling mode set to the highest-quality mode), but also allow the app to just say "use this element instead" - the app can then override things as much as it wants to, but it doesn't _have to_, since the defaults work sensibly. > > >> >> > >> > (2) Actual encoding (optional for raw streams) >> > >> > An encoder (with some optional settings) is used. 
>> >> Are you planning anything for specifying how the settings should work, >> such that a profile could contain settings that apply to several >> different encoders (probably selected by rank, or optionally forced by >> the application), or will the settings be tied to a specific element? > > Right now the settings would be tied to a specific element, since the > profile system relies exclusively on presets (which are tied to an > element) for properties of an element. > > The rationale behind this... is that we have no unified properties > across elements, let alone across encoders, let alone across different > encoders for the same format. > > I'd *LOVE* to have a unified system for properties which are common to > encoders... but every time I put my head down on that problem.. I only > see one solution : base classes for encoders (guaranteeing *some* > consistency in properties). Yeah, probably. That's pretty unfortunate, though. e.g. in songbird, the profile I use will have something to do with the user's configuration and the device we're transcoding for - but the actual elements available to satisfy that profile will be different across platforms and depend on what things the user has installed. I can't see myself using this system if it's tightly tied to specific elements for encoders (parsers, muxers, decoders, scalers, etc are less problematic), which suggests that we do need _some_ mechanism to use these things across multiple elements, even if it requires custom application code rather than being automatic. >> This describes the constraints on the device (or whatever). Have you >> thought at all about splitting out "constraints on what the target can >> accept" from "what we actually want to encode"? >> >> e.g. 
this profile says that I can do any size (within that range)
>> video, but my application wants to encode at a particular size -
>> should I be replacing the caps in the profile at runtime, or should
>> there be another object to represent these (somewhat different)
>> concepts?
>
> I guess I should have put another example, where the target is a less
> 'flexible' device.
> Let's say your target is a portable device that only support
> 320x240@25fps, then the profile would have those very specific caps in
> the restriction. (video/x-raw-yuv,width=320,height=240,framerate=25/1)
>
> Do you have any more specific example in mind with the above
> use-case ? Did you mean you wanted the application to be able to
> fine-tune even more the profile at runtime ?

Yeah - I want the app to fine tune.

Let's suppose the following use-case:
- User has an input video they got from the internet somewhere. It's 640x480, 30 fps, theora.
- User has a mobile phone that can play H.264 video at up to 720p (so it can do the video at this resolution).
- User wants to encode it at 320x240 to fit on their little micro-sd card.

So the video is already a supported size - but the application wants to scale it smaller because the user has chosen that option - I don't quite understand how that's meant to be expressed in your profiles/API right now (I might be missing something).

>> What about constraints that are not (currently, at least) expressible
>> through caps? e.g. bitrate, profiles, etc?
>
> Those are tunable through the presets (through which all properties
> are expressed), in the N900 example above, it is set to baseline profile
> and the bitrate corresponding to "Quality High".

But the presets specify a particular set of settings, not the target constraints. So there's no way to say "this device supports bitrates up to 4 Mbps", but have a default bitrate for this profile of 2 Mbps, I think?
I don't think the element presets are particularly helpful here - they express "a particular configuration" not "a range of possibilities".

Mike
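To make the "particular configuration" versus "range of possibilities" distinction concrete, here is a minimal sketch in plain C. It does not use any GStreamer API, and nothing like it is part of the proposal: the type `BitrateConstraint` and the function `bitrate_resolve` are hypothetical names, chosen only to show a device limit and a profile default living side by side.

```c
#include <assert.h>

/* Hypothetical: what a target could say about bitrate. This is NOT a
 * GStreamer type; it only illustrates "range plus default". */
typedef struct {
    unsigned int min_bps;     /* lowest bitrate the target accepts  */
    unsigned int max_bps;     /* highest bitrate the target accepts */
    unsigned int default_bps; /* what the profile picks by default  */
} BitrateConstraint;

/* Resolve the bitrate to encode at.
 * A request of 0 means "no preference": the profile default is used.
 * Any other request is clamped into the range the target supports. */
unsigned int bitrate_resolve(const BitrateConstraint *c,
                             unsigned int requested_bps)
{
    assert(c != 0);
    assert(c->min_bps <= c->default_bps && c->default_bps <= c->max_bps);

    if (requested_bps == 0)
        return c->default_bps;
    if (requested_bps < c->min_bps)
        return c->min_bps;
    if (requested_bps > c->max_bps)
        return c->max_bps;
    return requested_bps;
}
```

With the numbers from the example above (up to 4 Mbps supported, 2 Mbps default), an application that asks for nothing gets 2 Mbps, and one that asks for 8 Mbps is held to 4 Mbps. A preset alone can only carry the single 2 Mbps value.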
Hello,
2009/10/19 Edward Hervey <[hidden email]>:
> I have been working lately on researching ways to make the whole
> encoding experience better and more streamlined for applications using
> GStreamer, and have come up with a proposal.
>
> You will find the proposal below, and attached to the mail the
> research and proposed C API.
>
> Comments and feedback are most welcome

Great! Looks good. :)

A few notes:

- It's not clear if it's possible to derive target profiles from other but it seems like it's been considered. My thought is that encoder quality presets are usually orthogonal to profile/level and device-specific restrictions. I think something like the following might be useful:

* Quality-related
** Hypothetically, say we have qth264enc and x264enc. They can have presets created for options which affect the encoding time/quality.
** Quality profiles can use these presets and add to them suggestions for the available rate control methods (e.g. quantiser, bit rate or so)

* Restriction-related
** Profile/level restrictions
** Device-specific restrictions

My point is that codec elements always have rather specific configuration options and it would be good to maintain some kind of range of quality option presets for each encoder that trade off speed and quality for a given bit rate. If these options are included in each target preset it will greatly increase maintenance and it's a big enough job as it is. The whole set up lends itself well to this as target devices could just override options as necessary.

- System-wide versus application-specific versus user-specific profiles can be manageable through the API without too much difficulty I think. If scope is added in the API for providing a path from which one can load a profile then applications can use their own profiles. Similarly if scope is added for creating non-stored profiles in memory and just passing a pointer, this allows users/application developers to create profiles however they like.
- To manage stream copying it should be simple enough to probe the caps of the target profile to check if an input stream is supported and if so flag somehow that it can be copied so that this can be checked and specified whether to copy or transcode in the application.

- I think the API should allow an application to probe the profile to find what ranges are supported and allow customisation within those ranges.

Michael's use case about the device being able to play up to 720p but to save space on the low capacity storage device he wants to encode at a lower resolution/bit rate is sound.

Also, one may have a device that can play a higher resolution and has some video output to hook up to a higher resolution display device, but the playback device itself only has a small display. If one never uses the video output functionality, one might want to restrict the resolution.

Similarly one might want to downmix audio or so even though 5.1 is supported e.g. a PS3 hooked up to a stereo amplifier.

In short: target profiles should specify the range of operation of a target device, essentially being a device caps specification plus any necessary encoder/muxer options (e.g. setting 0 b-frames or the mux rate) to make things work on the device. Encoder element presets should consider an unrestricted playback service and be focused on speed/quality trade-off.

- Finally with regard to Michael's concerns about tying a target profile to specific elements - how about one being able to specify multiple possible elements for encode/mux a particular mime-type and having some hierarchy for selection based on which is deemed best like the rest of the GStreamer system works (I think... I'm new here. :))

Best regards,
Rob
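The copy-or-transcode check described in the stream copying note can be sketched as follows. This is plain C under stated assumptions, not the proposed API: `TargetVideoLimits`, `InputVideoStream` and `stream_can_be_copied` are hypothetical names, and a real implementation would presumably compare GstCaps rather than a format string and two integers.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical: what the target profile accepts for its video stream. */
typedef struct {
    const char *format; /* media type, e.g. "video/x-h264" */
    int max_width;
    int max_height;
} TargetVideoLimits;

/* Hypothetical: what was discovered about one stream of the input file. */
typedef struct {
    const char *format;
    int width;
    int height;
} InputVideoStream;

/* Returns 1 if the input stream already fits the target and could be
 * copied as-is, 0 if it would have to be transcoded. */
int stream_can_be_copied(const TargetVideoLimits *target,
                         const InputVideoStream *stream)
{
    assert(target != NULL && stream != NULL);

    /* A different format always means transcoding. */
    if (strcmp(target->format, stream->format) != 0)
        return 0;

    /* Same format: copy only if the size is within the target's limits. */
    return stream->width <= target->max_width &&
           stream->height <= target->max_height;
}
```

For the use case discussed in this thread (a 640x480 Theora input and a phone that plays H.264 up to 720p), the check reports that transcoding is needed because the formats differ. It would report that a copy is possible for a 640x480 H.264 input. Whether the application then still chooses to re-encode, for example to save space, stays the application's decision.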
On Tue, 2009-10-20 at 13:12 +0100, Robert Swain wrote:
> Hello, > > 2009/10/19 Edward Hervey <[hidden email]>: > > I have been working lately on researching ways to make the whole > > encoding experience better and more streamlined for applications using > > GStreamer, and have come up with a proposal. > > > > You will find the proposal below, and attached to the mail the > > research and proposed C API. > > > > Comments and feedback are most welcome > > Great! Looks good. :) > > A few notes: > > - It's not clear if it's possible to derive target profiles from other > but it seems like it's been considered. I'm not 100% certain about this. The main issue that comes to mind is that if we start making a hierarchy of profiles, we'll have to make sure that any change in a parent profile doesn't have any ill-effect on any of the sub-profiles... including those you don't control. > My thought is that encoder > quality presets are usually orthogonal to profile/level and > device-specific restrictions. I think something like the following > might be useful: > > * Quality-related > ** Hypothetically, say we have qth264enc and x264enc. They can have > presets created for options which affect the encoding time/quality. > ** Quality profiles can use these presets and add to them suggestions > for the available rate control methods (e.g. quantiser, bit rate or > so) I don't really understand this part :( > > * Restriction-related > ** Profile/level restrictions > ** Device-specific restrictions > > My point is that codec elements always have rather specific > configuration options and it would be good to maintain some kind of > range of quality option presets for each encoder that trade off speed > and quality for a given bit rate. If these options are included in > each target preset it will greatly increase maintenance and it's a big > enough job as it is. The whole set up lends itself well to this as > target devices could just override options as necessary. Profile/Level are clear restrictions. 
If you choose those, you're already limiting the rest of the choices (1)

Then amongst the remaining choices I can only think of three ways of going:
* Expose all remaining element-specific options...
* Having a Quality/Speed setting ranging from Low-Quality/Fast to HighestQuality/Sloww
* A mix of the two :(

(1): that reminds me... if you select a certain profile that limits the range of some option... how do we report that ?

> - System-wide versus application-specific versus user-specific
> profiles can be manageable through the API without too much difficulty
> I think. If scope is added in the API for providing a path from which
> one can load a profile then applications can use their own profiles.
> Similarly if scope is added for creating non-stored profiles in memory
> and just passing a pointer, this allows users/application developers
> to create profiles however they like.

As stated before, all profiles will be available as C structures/objects. Different backends can be written to support various storage formats.

> - To manage stream copying it should be simple enough to probe the
> caps of the target profile to check if an input stream is supported
> and if so flag somehow that it can be copied so that this can be
> checked and specified whether to copy or transcode in the application.

For that you would need to know the available streams in the file you wish to convert. I'm working on that part, but it should go along with this.

> - I think the API should allow an application to probe the profile to
> find what ranges are supported and allow customisation within those
> ranges.

Ranges of what ?

> Michael's use case about the device being able to play up to 720p but
> to save space on the low capacity storage device he wants to encode at
> a lower resolution/bit rate is sound.

Agreed.
> Also, one may have a device that can play a higher resolution and has
> some video output to hook up to a higher resolution display device,
> but the playback device itself only has a small display. If one never
> uses the video output functionality, one might want to restrict the
> resolution.

That could just be another target. Ex : "N1234/H264" and "N1234/H264 External viewing"

> Similarly one might want to downmix audio or so even though 5.1 is
> supported e.g. a PS3 hooked up to a stereo amplifier.

Can be overridden in the profile once you've loaded it.

> In short: target profiles should specify the range of operation of a
> target device, essentially being a device caps specification plus any
> necessary encoder/muxer options (e.g. setting 0 b-frames or the mux
> rate) to make things work on the device. Encoder element presets
> should consider an unrestricted playback service and be focused on
> speed/quality trade-off.
>
> - Finally with regard to Michael's concerns about tying a target
> profile to specific elements - how about one being able to specify
> multiple possible elements for encode/mux a particular mime-type and
> having some hierarchy for selection based on which is deemed best like
> the rest of the GStreamer system works (I think... I'm new here. :))

That's why I'm only specifying GstCaps in the profile. I'm figuring out some extra signals on EncodeBin so that one can override/reorder the selected muxers/encoders.

Edward

> Best regards,
> Rob
On Mon, 2009-10-19 at 12:23 -0700, Michael Smith wrote:
> > > > That's the problem with trying to design the > > one-API-to-rule-them-all :) > > > > Seriously though... the problem is that if we expose everything... we > > just come back to square one (or maybe two, but not that far ahead). > > > > The other problem is also that we might have several 'converters' > > available, none of them having well-known properties. Maybe some > > platform might have a differently named converter, ... > > > > An intermediate solution might be to provide a quality/speed knob over > > those conversions. > > Maybe have it as a boolean. So by default you would get the highest > > quality of conversion available (if you're in a live pipeline, QoS would > > kick in to lower the quality so you don't lose any data), but if you > > flip that boolean, you would get a low-quality/as-fast-as-possible > > conversion. > > I was basically just thinking of something along the lines of what > playbin does - by default, just do something sane (here, videoscale > with the scaling mode set to the highest-quality mode), but also allow > the app to just say "use this element instead" - the app can then > override things as much as it wants to, but it doesn't _have to_, > since the defaults work sensibly. > Something like this ? videoscale : The video scaler to use for converting video streams (if needed). flags: readable/writable Object of type "GstElement" (default : videoscale) videocolorspace The colorspace converter to use flags: readable/writable Object of type "GstElement" (default : ffmpegcolorspace) > > > > > > > >> > >> > > >> > (2) Actual encoding (optional for raw streams) > >> > > >> > An encoder (with some optional settings) is used. > >> > >> Are you planning anything for specifying how the settings should work, > >> such that a profile could contain settings that apply to several > >> different encoders (probably selected by rank, or optionally forced by > >> the application), or will the settings be tied to a specific element? 
> > > > Right now the settings would be tied to a specific element, since the > > profile system relies exclusively on presets (which are tied to an > > element) for properties of an element. > > > > The rationale behind this... is that we have no unified properties > > across elements, let alone across encoders, let alone across different > > encoders for the same format. > > > > I'd *LOVE* to have a unified system for properties which are common to > > encoders... but every time I put my head down on that problem.. I only > > see one solution : base classes for encoders (guaranteeing *some* > > consistency in properties). > > Yeah, probably. That's pretty unfortunate, though. e.g. in songbird, > the profile I use will have something to do with the user's > configuration and the device we're transcoding for - but the actual > elements available to satisfy that profile will be different across > platforms and depend on what things the user has installed. > > I can't see myself using this system if it's tightly tied to specific > elements for encoders (parsers, muxers, decoders, scalers, etc are > less problematic), which suggests that we do need _some_ mechanism to > use these things across multiple elements, even if it requires custom > application code rather than being automatic. One way (specifying element names and properties) or the other (specifying caps and presets), it's going to require some custom work to be done. The reason I prefer/recommend going the caps/presets way is that most of the work will be done *in* the element and preset, and much less (if not none) in the profiles and applications. > > >> This describes the constraints on the device (or whatever). Have you > >> thought at all about splitting out "constraints on what the target can > >> accept" from "what we actually want to encode"? > >> > >> e.g. 
this profile says that I can do any size (within that range) > >> video, but my application wants to encode at a particular size - > >> should I be replacing the caps in the profile at runtime, or should > >> there be another object to represent these (somewhat different) > >> concepts? > > > > I guess I should have put another example, where the target is a less > > 'flexible' device. > > Let's say your target is a portable device that only support > > 320x240@25fps, then the profile would have those very specific caps in > > the restriction. (video/x-raw-yuv,width=320,height=240,framerate=25/1) > > > > Do you have any more specific example in mind with the above > > use-case ? Did you mean you wanted the application to be able to > > fine-tune even more the profile at runtime ? > > Yeah - I want the app to fine tune. > > Let's suppose the following use-case: > - User has an input video they got from the internet somewhere. It's > 640x480, 30 fps, theora. > - User has a mobile phone that can play H.264 video at up to 720p (so > it can do the video at this resolution). > - User wants to encode it at 320x240 to fit on their little micro-sd card. > > So the video is already a supported size - but the application wants > to scale it smaller because the user has chosen that option - I don't > quite understand how that's meant to be expressed in your profiles/API > right now (I might be missing something). In that case, your target for that device would have a restriction caps along the looks of : video/x-raw-yuv,width=[16,1280],height=[16,720],framerate=[0/1, 1000/1] Meaning that your device can playback any videos between 16x16 and 1280x720. The example I gave before with 320x240@25 would be the case for devices that can only playback one and only resolution/fps. > > > > > >> > >> What about constraints that are not (currently, at least) expressible > >> through caps? e.g. bitrate, profiles, etc? 
> >
> > Those are tunable through the presets (through which all properties
> > are expressed), in the N900 example above, it is set to baseline profile
> > and the bitrate corresponding to "Quality High".
>
> But the presets specify a particular set of settings, not the target
> constraints. So there's no way to say "this device supports bitrates
> up to 4 Mbps", but have a default bitrate for this profile of 2 Mbps,
> I think?
>
> I don't think the element presets are particularly helpful here - they
> express "a particular configuration" not "a range of possibilities".

Are you saying presets don't satisfy all the requirements here ? I completely agree. But apart from trying to extend presets there's only one ugly other option:
* make sure all properties (like bitrate) for encoders have the SAME/EXACT/GUARANTEED name and meaning and range ...

... and that one doesn't seem trivial either.

Edward

> Mike
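The runtime fine-tuning discussed in this exchange amounts to narrowing a restriction range. The sketch below shows that idea for a single integer field in plain C. `IntRange` and `int_range_intersect` are hypothetical names used for illustration only; in GStreamer itself the restriction would be a GstCaps and the narrowing would presumably be a caps intersection.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: an inclusive integer range, like width=[16,1280]. */
typedef struct {
    int min;
    int max;
} IntRange;

/* Intersect what the device accepts with what the application wants.
 * Returns 1 and fills *out when the two ranges overlap, 0 otherwise
 * (in which case *out is left untouched). */
int int_range_intersect(IntRange device, IntRange wanted, IntRange *out)
{
    int lo = device.min > wanted.min ? device.min : wanted.min;
    int hi = device.max < wanted.max ? device.max : wanted.max;

    assert(out != NULL);
    if (lo > hi)
        return 0; /* the application asked for something unsupported */

    out->min = lo;
    out->max = hi;
    return 1;
}
```

Using the 720p example, a device width range of [16,1280] intersected with an application request of exactly 320 yields the fixed value 320, so the target restriction stays as the device limit and the application's choice is layered on top. A request that falls outside the device range produces no result, which is the point at which an application could report an error to the user.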
Edward Hervey schrieb:
> On Mon, 2009-10-19 at 12:23 -0700, Michael Smith wrote:
>
>>> That's the problem with trying to design the
>>> one-API-to-rule-them-all :)
>>>
>>> Seriously though... the problem is that if we expose everything... we
>>> just come back to square one (or maybe two, but not that far ahead).
>>>
>>> The other problem is also that we might have several 'converters'
>>> available, none of them having well-known properties. Maybe some
>>> platform might have a differently named converter, ...
>>>
>>> An intermediate solution might be to provide a quality/speed knob over
>>> those conversions.
>>> Maybe have it as a boolean. So by default you would get the highest
>>> quality of conversion available (if you're in a live pipeline, QoS would
>>> kick in to lower the quality so you don't lose any data), but if you
>>> flip that boolean, you would get a low-quality/as-fast-as-possible
>>> conversion.
>> I was basically just thinking of something along the lines of what
>> playbin does - by default, just do something sane (here, videoscale
>> with the scaling mode set to the highest-quality mode), but also allow
>> the app to just say "use this element instead" - the app can then
>> override things as much as it wants to, but it doesn't _have to_,
>> since the defaults work sensibly.
>>
>
> Something like this ?
>
>   videoscale          : The video scaler to use for converting video
>                         streams (if needed).
>                         flags: readable/writable
>                         Object of type "GstElement" (default : videoscale)
>   videocolorspace     : The colorspace converter to use
>                         flags: readable/writable
>                         Object of type "GstElement" (default :
>                         ffmpegcolorspace)
>

Yes, something we should also do on playbin2 and camerabin. Maybe we could also have an autotransform element that gets a klass and picks the highest-ranked element from that class (I don't think we want autovideoscale, autocolorspace, ...). Not sure if we should name them video-scale and colorspace-convert.
Stefan > >> >>> >>>>> (2) Actual encoding (optional for raw streams) >>>>> >>>>> An encoder (with some optional settings) is used. >>>> Are you planning anything for specifying how the settings should work, >>>> such that a profile could contain settings that apply to several >>>> different encoders (probably selected by rank, or optionally forced by >>>> the application), or will the settings be tied to a specific element? >>> Right now the settings would be tied to a specific element, since the >>> profile system relies exclusively on presets (which are tied to an >>> element) for properties of an element. >>> >>> The rationale behind this... is that we have no unified properties >>> across elements, let alone across encoders, let alone across different >>> encoders for the same format. >>> >>> I'd *LOVE* to have a unified system for properties which are common to >>> encoders... but every time I put my head down on that problem.. I only >>> see one solution : base classes for encoders (guaranteeing *some* >>> consistency in properties). >> Yeah, probably. That's pretty unfortunate, though. e.g. in songbird, >> the profile I use will have something to do with the user's >> configuration and the device we're transcoding for - but the actual >> elements available to satisfy that profile will be different across >> platforms and depend on what things the user has installed. >> >> I can't see myself using this system if it's tightly tied to specific >> elements for encoders (parsers, muxers, decoders, scalers, etc are >> less problematic), which suggests that we do need _some_ mechanism to >> use these things across multiple elements, even if it requires custom >> application code rather than being automatic. > > One way (specifying element names and properties) or the other > (specifying caps and presets), it's going to require some custom work to > be done. 
> > The reason I prefer/recommend going the caps/presets way is that most > of the work will be done *in* the element and preset, and much less (if > not none) in the profiles and applications. > >>>> This describes the constraints on the device (or whatever). Have you >>>> thought at all about splitting out "constraints on what the target can >>>> accept" from "what we actually want to encode"? >>>> >>>> e.g. this profile says that I can do any size (within that range) >>>> video, but my application wants to encode at a particular size - >>>> should I be replacing the caps in the profile at runtime, or should >>>> there be another object to represent these (somewhat different) >>>> concepts? >>> I guess I should have put another example, where the target is a less >>> 'flexible' device. >>> Let's say your target is a portable device that only support >>> 320x240@25fps, then the profile would have those very specific caps in >>> the restriction. (video/x-raw-yuv,width=320,height=240,framerate=25/1) >>> >>> Do you have any more specific example in mind with the above >>> use-case ? Did you mean you wanted the application to be able to >>> fine-tune even more the profile at runtime ? >> Yeah - I want the app to fine tune. >> >> Let's suppose the following use-case: >> - User has an input video they got from the internet somewhere. It's >> 640x480, 30 fps, theora. >> - User has a mobile phone that can play H.264 video at up to 720p (so >> it can do the video at this resolution). >> - User wants to encode it at 320x240 to fit on their little micro-sd card. >> >> So the video is already a supported size - but the application wants >> to scale it smaller because the user has chosen that option - I don't >> quite understand how that's meant to be expressed in your profiles/API >> right now (I might be missing something). 
>
> In that case, your target for that device would have restriction
> caps along the lines of:
> video/x-raw-yuv,width=[16,1280],height=[16,720],framerate=[0/1,1000/1]
>
> Meaning that your device can play back any video between 16x16 and
> 1280x720.
>
> The example I gave before with 320x240@25 would be the case for
> devices that can only play back a single resolution/fps.
>
>>
>>>> What about constraints that are not (currently, at least) expressible
>>>> through caps? e.g. bitrate, profiles, etc?
>>> Those are tunable through the presets (through which all properties
>>> are expressed), in the N900 example above, it is set to baseline profile
>>> and the bitrate corresponding to "Quality High".
>> But the presets specify a particular set of settings, not the target
>> constraints. So there's no way to say "this device supports bitrates
>> up to 4 Mbps", but have a default bitrate for this profile of 2 Mbps,
>> I think?
>>
>> I don't think the element presets are particularly helpful here - they
>> express "a particular configuration" not "a range of possibilities".
>
>
> Are you saying presets don't satisfy all the requirements here ? I
> completely agree. But apart from trying to extend presets there's only
> one other, ugly option:
> * make sure all properties (like bitrate) for encoders have the
> SAME/EXACT/GUARANTEED name and meaning and range ...
>
> ... and that one doesn't seem trivial either.
>
>
> Edward
>
>> Mike

_______________________________________________
gstreamer-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gstreamer-devel
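The exchange above separates two things: what the target can accept (a range) and what the application actually wants (one value inside that range, or a default when it has no preference). The following is a minimal plain-C sketch of that split using the numbers from the use-case. It is not GStreamer API: `IntRange`, `TargetRestriction`, `restriction_accepts_size`, `pick_bitrate` and `example_phone` are hypothetical names, and the lower bitrate bound of 1 kbps is an assumption, since the thread only gives the upper bound.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical types for illustration only; none of this is GStreamer API. */
typedef struct {
  int min;
  int max;
} IntRange;

/* What the target device can accept, plus the profile's default choice. */
typedef struct {
  IntRange width;
  IntRange height;
  IntRange bitrate_kbps;     /* range the device supports */
  int default_bitrate_kbps;  /* used when the application has no preference */
} TargetRestriction;

bool range_contains(IntRange r, int v)
{
  assert(r.min <= r.max);
  return v >= r.min && v <= r.max;
}

/* True if the size requested by the application fits the target. */
bool restriction_accepts_size(const TargetRestriction *t, int w, int h)
{
  return range_contains(t->width, w) && range_contains(t->height, h);
}

/* Returns the requested bitrate if the device supports it, otherwise the
 * profile default. A request of 0 means "no preference". */
int pick_bitrate(const TargetRestriction *t, int requested_kbps)
{
  if (requested_kbps > 0 && range_contains(t->bitrate_kbps, requested_kbps))
    return requested_kbps;
  return t->default_bitrate_kbps;
}

/* The phone from the use-case: any size from 16x16 to 1280x720, bitrates
 * up to 4 Mbps, 2 Mbps by default. The lower bitrate bound is assumed. */
TargetRestriction example_phone(void)
{
  TargetRestriction t = { {16, 1280}, {16, 720}, {1, 4000}, 2000 };
  return t;
}
```

With this split, the 320x240 request from the use-case is simply a value that the restriction accepts, so the application can ask for it without editing the restriction itself. Whether such a range belongs in caps, in presets, or in a separate object is exactly the open question in the thread.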
Edward Hervey wrote:
> On Tue, 2009-10-20 at 13:12 +0100, Robert Swain wrote:
>> Hello,
>>
>> 2009/10/19 Edward Hervey <[hidden email]>:
>>> I have been working lately on researching ways to make the whole
>>> encoding experience better and more streamlined for applications using
>>> GStreamer, and have come up with a proposal.
>>>
>>> You will find the proposal below, and attached to the mail the
>>> research and proposed C API.
>>> <snip>
>> * Restriction-related
>> ** Profile/level restrictions
>> ** Device-specific restrictions
>>
>> My point is that codec elements always have rather specific
>> configuration options and it would be good to maintain some kind of
>> range of quality option presets for each encoder that trade off speed
>> and quality for a given bit rate. If these options are included in
>> each target preset it will greatly increase maintenance and it's a big
>> enough job as it is. The whole set up lends itself well to this as
>> target devices could just override options as necessary.
>
> Profile/Level are clear restrictions. If you choose those, you're
> already limiting the rest of the choices (1)
>
> Then amongst the remaining choices I can only think of three ways of
> going:
> * Expose all remaining element-specific options...
> * Having a Quality/Speed setting ranging from Low-Quality/Fast to
> HighestQuality/Slow
> * A mix of the two :(
>
> (1): that reminds me... if you select a certain profile that limits the
> range of some option... how do we report that ?

This is a general restriction on e.g. GObject properties. E.g. in v4l2src I
sometimes would like to limit the real value range once the device has been
opened. We can do that for caps, but not for properties. Sure we have
GstPropertyProbe for it.

Stefan
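To make the question in footnote (1) concrete: a property declares one value range up front, while a selected profile/level (or an opened device) may only allow part of it. One way to picture "reporting" the narrowed range is to intersect the two and hand back the result. The sketch below does that in plain C. It is not GObject or GStreamer API; `ValueRange`, `value_range_intersect` and `value_range_clamp` are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical type for illustration only; not GObject/GStreamer API. */
typedef struct {
  int min;
  int max;
} ValueRange;

/* Intersects the range a property declares with a limit imposed later
 * (for example by a selected profile or an opened device).
 * Returns false and leaves *out untouched if the ranges do not overlap. */
bool value_range_intersect(ValueRange declared, ValueRange limit,
                           ValueRange *out)
{
  assert(declared.min <= declared.max);
  assert(limit.min <= limit.max);

  int lo = declared.min > limit.min ? declared.min : limit.min;
  int hi = declared.max < limit.max ? declared.max : limit.max;

  if (lo > hi)
    return false;

  out->min = lo;
  out->max = hi;
  return true;
}

/* Forces a requested value into the effective range. */
int value_range_clamp(ValueRange r, int v)
{
  if (v < r.min)
    return r.min;
  if (v > r.max)
    return r.max;
  return v;
}
```

This is the same operation caps already support through intersection; the thread's point is that properties have no equivalent mechanism for narrowing their declared range at runtime.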
On Mon, 2009-10-19 at 18:53 +0200, Edward Hervey wrote:
> Hi all,
>
> I have been working lately on researching ways to make the whole
> encoding experience better and more streamlined for applications using
> GStreamer, and have come up with a proposal.
>
> You will find the proposal below, and attached to the mail the
> research and proposed C API.
>
> Comments and feedback are most welcome

I've started the implementation of that proposal and have put the
current work in a new module called gst-convenience.

What is currently available:
* encodebin support for single/multiple audio/video/containers
* gstprofile
  Creating/copying/freeing GstEncodingProfile and GstStreamEncodingProfile

There are unit tests in tests/check that show how to use that API and
element, plus inline documentation.

What remains to be done:
* Use the restriction support in encodebin
* Use the preset fields in encodebin
* Design a default storage/loading system for profiles

Bonus:
* I re-implemented in C the Discoverer which many gst-python
  applications are using (PiTiVi, Jokosher, Transmageddon, ..). The goal
  is to be able to very quickly get a lot of information about one or
  many URIs (number of streams, stream properties, duration, tags, ...).
  There is a test application showing how to use it in tests/examples/
  Currently it outputs the information as a GstStructure; I'm still
  working on coming up with a saner API to access that information.

The proposal from this mail thread (modified according to feedback) is
available in docs/design/

The code is available here :
http://git.collabora.co.uk/?p=user/edward/gst-convenience.git;a=summary

Comments and feedback welcome,

Edward

P.S. The goal in the long term is not to keep all of those in a
separate repository but to eventually move them into -base.

--
Edward Hervey -- Collabora Multimedia Lead Platforms Engineer Co-Founder