[RFC] Encoding and Profiles

[RFC] Encoding and Profiles

Edward Hervey-2
Hi all,

  I have been working lately on researching ways to make the whole
encoding experience better and more streamlined for applications using
GStreamer, and have come up with a proposal.

  You will find the proposal below, and attached to the mail the
research and proposed C API.

  Comments and feedback are most welcome

-----

Summary
-------
 A. Problems
 B. Goals
 1. EncodeBin
 2. Encoding Profile System
 3. Helper Library for Profiles



A. Problems this proposal attempts to solve
-------------------------------------------

* Duplication of pipeline code for GStreamer-based applications
  wishing to encode and/or mux streams, leading to subtle differences
  and inconsistencies across those applications.

* No unified system for describing encoding targets for applications
  in a user-friendly way.

* No unified system for creating encoding targets for applications,
  resulting in duplication of code across all applications,
  differences and inconsistencies that come with that duplication,
  and applications hardcoding element names and settings, resulting in
  poor portability.



B. Goals
--------

1. Convenience encoding element

  Create a convenience GstBin for encoding and muxing several streams,
  hereafter called 'EncodeBin'.

  This element will only contain one single property, which is a
  profile.

2. Define an encoding profile system

3. Encoding profile helper library

  Create a helper library to:
  * create EncodeBin instances based on profiles, and
  * help applications to create/load/save/browse those profiles.




1. EncodeBin
------------

1.1 Proposed API
----------------

  EncodeBin is a GstBin subclass.

  It implements the GstTagSetter interface, by which it will proxy the
  calls to the muxer.

  Only two introspectable properties (i.e. usable without extra API):
  * A GstEncodingProfile*
  * The name of the profile to use

  When a profile is selected, encodebin will:
  * Add REQUEST sinkpads for all the GstStreamProfile
  * Create the muxer and expose the source pad

  Whenever a request pad is created, encodebin will:
  * Create the chain of elements for that pad
  * Ghost the sink pad
  * Return that ghost pad

  This allows reducing the code to the minimum for applications
  wishing to encode a source for a given profile:

  ...

  encbin = gst_element_factory_make ("encodebin", NULL);
  g_object_set (encbin, "profile", "N900/H264 HQ", NULL);
  gst_element_link (encbin, filesink);

  ...

  vsrcpad = gst_element_get_static_pad (source, "src1");
  vsinkpad = gst_element_get_request_pad (encbin, "video_%d");
  gst_pad_link(vsrcpad, vsinkpad);

  ...


1.2 Explanation of the Various stages in EncodeBin
--------------------------------------------------

  This describes the various stages which can happen in order to end
  up with a multiplexed stream that can then be stored or streamed.

1.2.1 Incoming streams

  The streams fed to EncodeBin can be of various types:

  * Video
   * Uncompressed (but maybe subsampled)
   * Compressed
  * Audio
   * Uncompressed (audio/x-raw-{int|float})
   * Compressed
  * Timed text
  * Private streams


1.2.2 Steps involved for raw video encoding

(0) Incoming Stream

(1) Transform raw video feed (optional)

 Here we modify the various fundamental properties of a raw video
 stream to be compatible with the intersection of:
  * The encoder GstCaps and
  * The specified "Stream Restriction" of the profile/target

 The fundamental properties that can be modified are:
  * width/height
    This is done with a video scaler.
    The DAR (Display Aspect Ratio) MUST be respected.
    If needed, black borders can be added to comply with the target DAR.
  * framerate
  * format/colorspace/depth
    All of this is done with a colorspace converter
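
 As an illustration of the DAR rule above, here is a minimal sketch of
 how the scaled size and the black borders could be computed. It is
 plain C with no GStreamer dependency; FitResult and
 fit_preserving_dar() are hypothetical names used only for this
 example, and square pixels are assumed on both sides.

```c
#include <assert.h>

/* Hypothetical helper type, not part of the proposed API. */
typedef struct {
  int scaled_width;   /* picture size after scaling */
  int scaled_height;
  int border_x;       /* black border on the left and on the right */
  int border_y;       /* black border on the top and on the bottom */
} FitResult;

/* Fit a src_w x src_h picture into a dst_w x dst_h frame without
 * changing its display aspect ratio (square pixels assumed). */
static FitResult
fit_preserving_dar (int src_w, int src_h, int dst_w, int dst_h)
{
  FitResult r;

  assert (src_w > 0 && src_h > 0 && dst_w > 0 && dst_h > 0);

  /* Compare src_w/src_h with dst_w/dst_h by cross-multiplying. */
  if ((long long) src_w * dst_h >= (long long) dst_w * src_h) {
    /* Source is at least as wide as the target: fill the width. */
    r.scaled_width = dst_w;
    r.scaled_height = (int) ((long long) src_h * dst_w / src_w);
  } else {
    /* Source is narrower than the target: fill the height. */
    r.scaled_height = dst_h;
    r.scaled_width = (int) ((long long) src_w * dst_h / src_h);
  }

  /* Whatever is left over is split evenly into black borders. */
  r.border_x = (dst_w - r.scaled_width) / 2;
  r.border_y = (dst_h - r.scaled_height) / 2;
  return r;
}
```

 For example, fitting a 1920x1080 source into an 800x480 target gives
 an 800x450 picture with 15-pixel borders at the top and bottom.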

(2) Actual encoding (optional for raw streams)

 An encoder (with some optional settings) is used.

(3) Muxing

 A muxer (with some optional settings) is used.

(4) Outgoing encoded and muxed stream


1.2.3 Steps involved for raw audio encoding

 This is roughly the same as for raw video, except for (1)

(1) Transform raw audio feed (optional)

 We modify the various fundamental properties of a raw audio stream to
 be compatible with the intersection of:
  * The encoder GstCaps and
  * The specified "Stream Restriction" of the profile/target

 The fundamental properties that can be modified are:
 * Number of channels
 * Type of raw audio (integer or floating point)
 * Depth (number of bits required to encode one sample)


1.2.4 Steps involved for encoded audio/video streams

 Steps (1) and (2) are replaced by a parser if a parser is available
 for the given format.


1.2.5 Steps involved for other streams

 Other streams will just be forwarded as-is to the muxer, provided the
 muxer accepts the stream type.
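
 The choice between the chains described in 1.2.2 to 1.2.5 can be
 sketched as a small decision function. This is only an illustration:
 the enum and function names are hypothetical, and forwarding an
 encoded stream untouched when no parser exists is an assumption, since
 the text above does not say what happens in that case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of an incoming stream. */
typedef enum {
  STREAM_RAW_VIDEO,
  STREAM_RAW_AUDIO,
  STREAM_ENCODED,     /* already compressed audio or video */
  STREAM_OTHER        /* timed text, private streams, ... */
} StreamKind;

/* Hypothetical description of what sits in front of the muxer. */
typedef enum {
  CHAIN_CONVERT_ENCODE,  /* steps (1) and (2): transform, then encode */
  CHAIN_PARSE,           /* a parser replaces steps (1) and (2) */
  CHAIN_PASSTHROUGH      /* forwarded as-is to the muxer */
} ChainKind;

static ChainKind
select_chain (StreamKind kind, bool parser_available)
{
  assert (kind <= STREAM_OTHER);

  switch (kind) {
    case STREAM_RAW_VIDEO:
    case STREAM_RAW_AUDIO:
      return CHAIN_CONVERT_ENCODE;
    case STREAM_ENCODED:
      /* Assumption: without a parser the stream goes straight through. */
      return parser_available ? CHAIN_PARSE : CHAIN_PASSTHROUGH;
    default:
      return CHAIN_PASSTHROUGH;
  }
}
```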

 


2. Encoding Profile System
--------------------------

 This work is based on:
 * The existing GstPreset system for elements [0]
 * The gnome-media GConf audio profile system [1]
 * The investigation done into device profiles by Arista and
 Transmageddon [2 and 3]

2.2 Terminology
---------------

* Encoding Target Category
  A Target Category is a classification of devices/systems/use-cases
  for encoding.

  Such a classification is required in order for:
  * Applications with a very-specific use-case to limit the number of
    profiles they can offer the user. A screencasting application has
    no use with the online services targets for example.
  * Offering the user some initial classification in the case of a
    more generic encoding application (like a video editor or a
    transcoder).

  Ex:
   Consumer devices
   Online service
   Intermediate Editing Format
   Screencast
   Capture
   Computer

* Encoding Profile Target
  A Profile Target describes a specific entity for which we wish to
  encode.
  A Profile Target must belong to at least one Target Category.
  It will define at least one Encoding Profile.

  Ex (with category):
   Nokia N900 (Consumer device)
   Sony PlayStation 3 (Consumer device)
   Youtube (Online service)
   DNxHD (Intermediate editing format)
   HuffYUV (Screencast)
   Theora (Computer)

* Encoding Profile
  A specific combination of muxer, encoders, presets and limitations.

  Ex:
   Nokia N900/H264 HQ
   Ipod/High Quality
   DVD/Pal
   Youtube/High Quality
   HTML5/Low Bandwidth
   DNxHD

2.3 Encoding Profile
--------------------

An encoding profile requires the following information:

 * Name
   This string is not translatable and must be unique.
   A recommendation to guarantee uniqueness of the naming could be:
      <target>/<name>
 * Description
   This is a translatable string describing the profile
 * Muxing format
   This is a string containing the GStreamer media-type of the
   container format.
 * Muxing preset
   This is an optional string describing the preset(s) to use on the
   muxer.
 * Multipass setting
   This is a boolean describing whether the profile requires several
   passes.
 * List of Stream Profiles

2.3.1 Stream Profiles

A Stream Profile consists of:

 * Type
   The type of stream profile (audio, video, text, private-data)
 * Encoding Format
   This is a string containing the GStreamer media-type of the encoding
   format to be used. If encoding is not to be applied, the raw audio
   media type will be used.
 * Encoding preset
   This is an optional string describing the preset(s) to use on the
   encoder.
 * Restriction
   This is an optional GstCaps containing the restriction of the
   stream that can be fed to the encoder.
   This will generally contain restrictions on video
   width/height/framerate or audio depth.
 * presence
   This is an integer specifying how many streams can be used in the
   containing profile. 0 means that any number of streams can be
   used.
 * pass
   This is an integer which is only meaningful if the multipass flag
   has been set in the profile. If it has been set it indicates which
   pass this Stream Profile corresponds to.
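
To make the field list above concrete, here is a rough sketch of a
Stream Profile as a plain C structure, together with the "presence"
rule. The names are hypothetical and are not the ones used in
gstprofile.h; the sketch also assumes that "presence" is an upper bound
on the number of streams, which is one possible reading of the
description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the Stream Profile fields listed above. */
typedef struct {
  const char *type;         /* "audio", "video", "text" or "private-data" */
  const char *format;       /* media type of the encoding format */
  const char *preset;       /* optional, NULL if unset */
  const char *restriction;  /* optional caps string, NULL if unset */
  unsigned int presence;    /* 0 means any number of streams */
  unsigned int pass;        /* only meaningful for multipass profiles */
} SketchStreamProfile;

/* Returns true if one more stream may use this stream profile, given
 * that in_use streams already do. */
static bool
stream_profile_can_add_stream (const SketchStreamProfile *sp,
    unsigned int in_use)
{
  assert (sp != NULL);
  return sp->presence == 0 || in_use < sp->presence;
}
```

With the audio stream profile from the example in 2.4 (presence set
to 1), a first request would be accepted and a second one refused.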
 
2.4 Example profile
-------------------

The representation used here is XML only as an example. No decision is
made as to which formatting to use for storing targets and profiles.

<gst-encoding-target>
  <name>Nokia N900</name>
  <category>Consumer Device</category>
  <profiles>
    <profile>Nokia N900/H264 HQ</profile>
    <profile>Nokia N900/MP3</profile>
    <profile>Nokia N900/AAC</profile>
  </profiles>
</gst-encoding-target>

<gst-encoding-profile>
  <name>Nokia N900/H264 HQ</name>
  <description>
    High Quality H264/AAC for the Nokia N900
  </description>
  <format>video/quicktime,variant=iso</format>
  <streams>
    <stream-profile>
      <type>audio</type>
      <format>audio/mpeg,mpegversion=4</format>
      <preset>Quality High/Main</preset>
      <restriction>audio/x-raw-int,channels=[1,2]</restriction>
      <presence>1</presence>
    </stream-profile>
    <stream-profile>
      <type>video</type>
      <format>video/x-h264</format>
      <preset>Profile Baseline/Quality High</preset>
      <restriction>
        video/x-raw-yuv,width=[16, 800],\
        height=[16, 480],framerate=[1/1, 30000/1001]
      </restriction>
      <presence>1</presence>
    </stream-profile>
  </streams>
 
</gst-encoding-profile>

2.5 API
-------
  A proposed C API is contained in the gstprofile.h file in this
directory.


2.6 Modifications required in the existing GstPreset system
-----------------------------------------------------------

2.6.1. Temporary preset.

  Currently a preset needs to be saved on disk in order to be
  used.

  This makes it impossible to have temporary presets (that exist only
  during the lifetime of a process), which might be required in the
  new proposed profile system

2.6.2 Categorisation of presets.

  Currently presets are just aliases for a group of property/value
  pairs, with no meaning attached and no indication of how they
  exclude each other.

  Take for example the H264 encoder. It can have presets for:
  * passes (1,2 or 3 passes)
  * profiles (Baseline, Main, ...)
  * quality (Low, Medium, High)

  In order to programmatically know which presets exclude each other,
  we here propose the categorisation of these presets.

  This can be done in one of two ways
  1. in the name (by making the name be [<category>:]<name>)
    This would give for example: "Quality:High", "Profile:Baseline"
  2. by adding a new _meta key
    This would give for example: _meta/category:quality
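
  For option 1, splitting a "[<category>:]<name>" string is cheap. The
  following is a minimal sketch in plain C; split_preset_name() is a
  hypothetical helper, not an existing GstPreset function, and it
  assumes that the first ':' is always the category separator.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Split "[<category>:]<name>" into its two parts. The category is
 * left empty when the preset name carries none. Parts longer than the
 * output buffers are truncated. */
static void
split_preset_name (const char *preset,
    char *category, size_t category_size,
    char *name, size_t name_size)
{
  const char *colon;

  assert (preset != NULL && category != NULL && name != NULL);
  assert (category_size > 0 && name_size > 0);

  colon = strchr (preset, ':');
  if (colon == NULL) {
    /* No category prefix: the whole string is the name. */
    category[0] = '\0';
    snprintf (name, name_size, "%s", preset);
  } else {
    snprintf (category, category_size, "%.*s",
        (int) (colon - preset), preset);
    snprintf (name, name_size, "%s", colon + 1);
  }
}
```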

2.6.3 Aggregation of presets.

  Several presets may need to be chosen for a single element
  (quality, profile, pass).

  This means that the full configuration of an element cannot
  currently be described with a single string; several are needed.

  The proposal here is to extend the GstPreset API to be able to set
  all presets using one string and a well-known separator ('/').

  This change only requires changes in the core preset handling code.

  This would allow doing the following:
  gst_preset_load_preset (h264enc,
                          "pass:1/profile:baseline/quality:high");
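
  Combined with the categorisation from 2.6.2, an aggregated string
  can be checked for presets that exclude each other before anything
  is applied. Below is a minimal sketch under those assumptions;
  count_presets_checked() and its limits (256 characters, 16 presets)
  are hypothetical and only serve the illustration.

```c
#include <assert.h>
#include <string.h>

#define MAX_PRESETS 16

/* Split an aggregated preset string on '/' and return the number of
 * presets it contains, or -1 if two presets share a category, or if
 * the string is too long or holds too many presets for this sketch. */
static int
count_presets_checked (const char *aggregate)
{
  char buf[256];
  const char *categories[MAX_PRESETS];
  int n = 0;
  char *token;

  assert (aggregate != NULL);
  if (strlen (aggregate) >= sizeof (buf))
    return -1;
  strcpy (buf, aggregate);

  for (token = strtok (buf, "/"); token != NULL;
      token = strtok (NULL, "/")) {
    char *colon = strchr (token, ':');
    const char *category = NULL;
    int i;

    if (n == MAX_PRESETS)
      return -1;
    if (colon != NULL) {
      *colon = '\0';            /* token now holds the category only */
      category = token;
    }
    /* Presets without a category are never considered conflicting. */
    for (i = 0; i < n; i++) {
      if (category != NULL && categories[i] != NULL
          && strcmp (categories[i], category) == 0)
        return -1;
    }
    categories[n++] = category;
  }
  return n;
}
```

  With this, "pass:1/profile:baseline/quality:high" is accepted as
  three presets, while "quality:high/quality:low" is rejected.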

2.7 Points to be determined
---------------------------

  This document hasn't determined yet how to solve the following
  problems:

2.7.1 Storage of profiles

  One proposal for storage would be to use a system-wide directory
  (like $prefix/share/gstreamer-0.10/profiles) and store one XML file
  for each individual profile.

  Users could then add their own profiles in ~/.gstreamer-0.10/profiles

  This poses some limitations as to what to do if some applications
  want to have some profiles limited to their own usage.





3. Helper library for profiles
------------------------------

 These helper methods could also be added to existing libraries (like
 GstPreset, GstPbUtils, ..).

 The various API proposed are in the accompanying gstprofile.h file.

3.1 Getting user-readable names for formats

 This is already provided by GstPbUtils.

3.2 Hierarchy of profiles

 The goal is for applications to be able to present to the user a list
 of combo-boxes for choosing their output profile:

 [      Category      ]       # optional, depends on the application
 [    Device/Site/..  ]       # optional, depends on the application
 [      Profile       ]

 Convenience methods are offered to easily get lists of categories,
 devices, and profiles.
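
 As a rough idea of what such a convenience method does for the second
 combo-box, here is a sketch that filters targets by category. The
 types and names are hypothetical (the proposed API is in
 gstprofile.h), and each target carries a single category here for
 brevity, although a target may belong to several.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified target: one name, one category. */
typedef struct {
  const char *name;
  const char *category;
} SketchTarget;

/* Store in out up to max_out pointers to the targets that belong to
 * category, in their original order, and return how many were found. */
static size_t
targets_in_category (const SketchTarget *targets, size_t n_targets,
    const char *category, const SketchTarget **out, size_t max_out)
{
  size_t i, found = 0;

  assert (targets != NULL && category != NULL && out != NULL);

  for (i = 0; i < n_targets && found < max_out; i++) {
    if (strcmp (targets[i].category, category) == 0)
      out[found++] = &targets[i];
  }
  return found;
}
```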

3.3 Creating Profiles

 The goal is for applications to be able to easily create profiles.

 Applications need a fast and efficient way to:
 * select a container format and see all compatible streams that can
 be used with it.
 * select a codec format and see which container formats can be used
 with it.

 The remaining parts concern the restrictions to encoder
 input.

3.4 Ensuring availability of plugins for Profiles

 When an application wishes to use a Profile, it should be able to
 query whether it has all the needed plugins to use it.

 This part will use GstPbUtils to query, and if needed install the
 missing plugins through the installed distribution plugin installer.




* Research links

  Some of these are still active documents, others are not.

[0] GstPreset API documentation
    http://gstreamer.freedesktop.org/data/doc/gstreamer/head/gstreamer/html/GstPreset.html

[1] gnome-media GConf profiles
    http://www.gnome.org/~bmsmith/gconf-docs/C/gnome-media.html

[2] Research on a Device Profile API
    http://gstreamer.freedesktop.org/wiki/DeviceProfile

[3] Research on defining presets usage
    http://gstreamer.freedesktop.org/wiki/PresetDesign



------------------------------------------------------------------------------
Come build with us! The BlackBerry(R) Developer Conference in SF, CA
is the only developer event you need to attend this year. Jumpstart your
developing skills, take BlackBerry mobile applications to market and stay
ahead of the curve. Join us from November 9 - 12, 2009. Register now!
http://p.sf.net/sfu/devconference
_______________________________________________
gstreamer-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gstreamer-devel

encoding-research.txt (3K) Download Attachment
gstencodebin.h (1K) Download Attachment
gstprofile.h (7K) Download Attachment

Re: [RFC] Encoding and Profiles

michael smith-6-3
On Mon, Oct 19, 2009 at 9:53 AM, Edward Hervey
<[hidden email]> wrote:
> Hi all,
>
>  I have been working lately on researching ways to make the whole
> encoding experience better and more streamlined for applications using
> GStreamer, and have come up with a proposal.

Great. Comments inline below:

>
>  Only two introspectable property (i.e. usable without extra API):
>  * A GstEncodingProfile*
>  * The name of the profile to use
>
>  When a profile is selected, encodebin will:
>  * Add REQUEST sinkpads for all the GstStreamProfile
>  * Create the muxer and expose the source pad
>
>  Whenever a request pad is created, encodebin will:
>  * Create the chain of elements for that pad
>  * Ghost the sink pad
>  * Return that ghost pad
>
>  This allows reducing the code to the minimum for applications
>  wishing to encode a source for a given profile:
>
>  ...
>
>  encbin = gst_element_factory_make("encodebin, NULL);
>  g_object_set (encbin, "profile", "N900/H264 HQ", NULL);

Perhaps "profile-name" ("profile" being reserved for the profile
object itself) would be a better name.

>
> 1.2.1 Incoming streams
>
>  The streams fed to EncodeBin can be of various types:
>
>  * Video
>   * Uncompressed (but maybe subsampled)
>   * Compressed
>  * Audio
>   * Uncompressed (audio/x-raw-{int|float})
>   * Compressed
>  * Timed text
>  * Private streams

Any ideas on how this allows re-muxing (without re-encoding) of
certain streams? This wouldn't be an essential feature for the initial
_implementation_, but I think keeping it in mind when designing the
APIs is pretty important. It looks like you've thought about this, but
it's not clear from this writeup what conclusions you came to :-)

Maybe some API to query what caps are available for re-muxing given
the current profile - then the app can check that, then either
continue decoding if the input stream is incompatible, or pass-through
if possible. Or should the application directly be querying the
profile, rather than going through APIs on the bin, for this stuff?


>
>
> 1.2.2 Steps involved for raw video encoding
>
> (0) Incoming Stream
>
> (1) Transform raw video feed (optional)
>
>  Here we modify the various fundamental properties of a raw video
>  stream to be compatible with the intersection of:
>  * The encoder GstCaps and
>  * The specified "Stream Restriction" of the profile/target
>
>  The fundamental properties that can be modified are:
>  * width/height
>    This is done with a video scaler.
>    The DAR (Display Aspect Ratio) MUST be respected.
>    If needed, black borders can be added to comply with the target DAR.
>  * framerate
>  * format/colorspace/depth
>    All of this is done with a colorspace converter

With respect to framerate, any thought on VFR streams? If the target
format supports VFR, then it'd be nice to be able to just encode the
input as-is, without having to force it to a specified framerate.

It'd probably also be good to have some way to select, and then set
properties on, the elements used here. e.g. the application probably
wants to be able to control what sort of scaling to do (to enable
high-quality scaling, for example, or low-quality/fast for preview
encodes). Obviously, the default would just work, so this would be
more optional API for more advanced applications.

>
> (2) Actual encoding (optional for raw streams)
>
>  An encoder (with some optional settings) is used.

Are you planning anything for specifying how the settings should work,
such that a profile could contain settings that apply to several
different encoders (probably selected by rank, or optionally forced by
the application), or will the settings be tied to a specific element?

>
> (3) Muxing
>
>  A muxer (with some optional settings) is used.
>
> (4) Outgoing encoded and muxed stream
>
>
> 1.2.3 Steps involved for raw audio encoding
>
>  This is roughly the same as for raw video, expect for (1)
>
> (1) Transform raw audo feed (optional)
>
>  We modify the various fundamental properties of a raw audio stream to
>  be compatible with the intersection of:
>  * The encoder GstCaps and
>  * The specified "Stream Restriction" of the profile/target
>
>  The fundamental properties that can be modifier are:
>  * Number of channels
>  * Type of raw audio (integer or floating point)
>  * Depth (number of bits required to encode one sample)
>
>
> 1.2.4 Steps involved for encoded audio/video streams
>
>  Steps (1) and (2) are replaced by a parser if a parser is available
>  for the given format.
>
>
> 1.2.5 Steps involved for other streams
>
>  Other streams will just be forwarded as-is to the muxer, provided the
>  muxer accepts the stream type.
>
>
>
>
> 2. Encoding Profile System
> --------------------------
>
>  This work is based on:
>  * The existing GstPreset system for elements [0]
>  * The gnome-media GConf audio profile system [1]
>  * The investigation done into device profiles by Arista and
>  Transmageddon [2 and 3]
>
> 2.2 Terminology
> ---------------
>
> * Encoding Target Category
>  A Target Category is a classification of devices/systems/use-cases
>  for encoding.
>
>  Such a classification is required in order for:
>  * Applications with a very-specific use-case to limit the number of
>    profiles they can offer the user. A screencasting application has
>    no use with the online services targets for example.
>  * Offering the user some initial classification in the case of a
>    more generic encoding application (like a video editor or a
>    transcoder).
>
>  Ex:
>   Consumer devices
>   Online service
>   Intermediate Editing Format
>   Screencast
>   Capture
>   Computer
>
> * Encoding Profile Target
>  A Profile Target describes a specific entity for which we wish to
>  encode.
>  A Profile Target must belong to at least one Target Category.
>  It will define at least one Encoding Profile.
>
>  Ex (with category):
>   Nokia N900 (Consumer device)
>   Sony PlayStation 3 (Consumer device)
>   Youtube (Online service)
>   DNxHD (Intermediate editing format)
>   HuffYUV (Screencast)
>   Theora (Computer)
>
> * Encoding Profile
>  A specific combination of muxer, encoders, presets and limitations.
>
>  Ex:
>   Nokia N900/H264 HQ
>   Ipod/High Quality
>   DVD/Pal
>   Youtube/High Quality
>   HTML5/Low Bandwith
>   DNxHD
>
> 2.3 Encoding Profile
> --------------------
>
> An encoding profile requires the following information:
>
>  * Name
>   This string is not translatable and must be unique.
>   A recommendation to guarantee uniqueness of the naming could be:
>      <target>/<name>
>  * Description
>   This is a translatable string describing the profile
>  * Muxing format
>   This is a string containing the GStreamer media-type of the
>   container format.
>  * Muxing preset
>   This is an optional string describing the preset(s) to use on the
>   muxer.
>  * Multipass setting
>   This is a boolean describing whether the profile requires several
>   passes.
>  * List of Stream Profile
>
> 2.3.1 Stream Profiles
>
> A Stream Profile consists of:
>
>  * Type
>   The type of stream profile (audio, video, text, private-data)
>  * Encoding Format
>   This is a string containing the GStreamer media-type of the encoding
>   format to be used. If encoding is not to be applied, the raw audio
>   media type will be used.
>  * Encoding preset
>   This is an optional string describing the preset(s) to use on the
>   encoder.
>  * Restriction
>   This is an optional GstCaps containing the restriction of the
>   stream that can be fed to the encoder.
>   This will generally containing restrictions in video
>   width/heigh/framerate or audio depth.
>  * presence
>   This is an integer specifying how many streams can be used in the
>   containing profile. 0 means that any number of streams can be
>   used.
>  * pass
>   This is an integer which is only meaningful if the multipass flag
>   has been set in the profile. If it has been set it indicates which
>   pass this Stream Profile corresponds to.
>
> 2.4 Example profile
> -------------------
>
> The representation used here is XML only as an example. No decision is
> made as to which formatting to use for storing targets and profiles.

Whatever decision is made as to the 'default' format for storing
these, I'd really like to see a sufficiently complete API that an
application that (for whatever reason) doesn't want to use that format
could build the GstEncodingProfile object itself, from its own data
store.



>
> <gst-encoding-target>
>  <name>Nokia N900</name>
>  <category>Consumer Device</category>
>  <profiles>
>    <profile>Nokia N900/H264 HQ</profile>
>    <profile>Nokia N900/MP3</profile>
>    <profile>Nokia N900/AAC</profile>
>  </profiles>
> </gst-encoding-target>
>
> <gst-encoding-profile>
>  <name>Nokia N900/H264 HQ</name>
>  <description>
>    High Quality H264/AAC for the Nokia N900
>  </description>
>  <format>video/quicktime,variant=iso</format>
>  <streams>
>    <stream-profile>
>      <type>audio</type>
>      <format>audio/mpeg,mpegversion=4</format>
>      <preset>Quality High/Main</preset>
>      <restriction>audio/x-raw-int,channels=[1,2]</restriction>
>      <presence>1</presence>
>    </stream-profile>
>    <stream-profile>
>      <type>video</type>
>      <format>video/x-h264</format>
>      <preset>Profile Baseline/Quality High</preset>
>      <restriction>
>        video/x-raw-yuv,width=[16, 800],\
>        height=[16, 480],framerate=[1/1, 30000/1001]
>      </restriction>
>      <presence>1</presence>
>    </stream-profile>
>  </streams>
>
> </gst-encoding-profile>

This describes the constraints on the device (or whatever). Have you
thought at all about splitting out "constraints on what the target can
accept" from "what we actually want to encode"?

e.g. this profile says that I can do any size (within that range)
video, but my application wants to encode at a particular size -
should I be replacing the caps in the profile at runtime, or should
there be another object to represent these (somewhat different)
concepts?

What about constraints that are not (currently, at least) expressible
through caps? e.g. bitrate, profiles, etc?


Anyway, I don't have time right now to continue through this in enough
depth - and I'm sure some of my remarks miss something you've already
thought about - but this was just to throw some more ideas into the
mix.

I'm very happy to see you looking into this more deeply!

Mike


Re: [RFC] Encoding and Profiles

Edward Hervey
Administrator
On Mon, 2009-10-19 at 10:54 -0700, Michael Smith wrote:
> >
> >  encbin = gst_element_factory_make("encodebin, NULL);
> >  g_object_set (encbin, "profile", "N900/H264 HQ", NULL);
>
> Perhaps "profile-name" ("profile" being reserved for the profile
> object itself) would be a better name.

  Yes, having both would be nice. I'll add that.

>
> >
> > 1.2.1 Incoming streams
> >
> >  The streams fed to EncodeBin can be of various types:
> >
> >  * Video
> >   * Uncompressed (but maybe subsampled)
> >   * Compressed
> >  * Audio
> >   * Uncompressed (audio/x-raw-{int|float})
> >   * Compressed
> >  * Timed text
> >  * Private streams
>
> Any ideas on how this allows re-muxing (without re-encoding) of
> certain streams? This wouldn't be an essential feature for the initial
> _implementation_, but I think keeping it in mind when designing the
> APIs is pretty important.

  That is definitely going to go in the first implementation (one of the
initial tests I have in mind is the simple remux-to-same-format and
remux-in-compatible muxer).

  The current experience so far when doing transmuxing, is that we need
to have a parser (like mpegaudioparse, mpegvideoparse,...) to verify
that the stream is properly formatted, that the buffers are correctly
packetized, timestamps properly set, etc...

  So the idea, when no re-encoding is involved, is to use a parser for
that stream format if one is present.

>  It looks like you've thought about this, but
> it's not clear from this writeup what conclusions you came to :-)



>
> Maybe some API to query what caps are available for re-muxing given
> the current profile

Something like

/**
 * gst_profile_get_input_caps:
 * @profile: a #GstEncodingProfile
 *
 * Returns: the list of all caps the profile can accept. Caller must call
 * gst_caps_unref on all unwanted caps once it is done with the list.
 */
GList * gst_profile_get_input_caps (GstEncodingProfile *profile);

>  - then the app can check that, then either
> continue decoding if the input stream is incompatible, or pass-through
> if possible. Or should the application directly be querying the
> profile, rather than going through APIs on the bin, for this stuff?

  The generic idea is:
 * to not have any special API on EncodeBin (except for the standard
caps request, state change, .. API)
 * to delay creation of EncodeBin as late as possible and work only with
GstEncodingProfile API until then.

>
>
> >
> >
> > 1.2.2 Steps involved for raw video encoding
> >
> > (0) Incoming Stream
> >
> > (1) Transform raw video feed (optional)
> >
> >  Here we modify the various fundamental properties of a raw video
> >  stream to be compatible with the intersection of:
> >  * The encoder GstCaps and
> >  * The specified "Stream Restriction" of the profile/target
> >
> >  The fundamental properties that can be modified are:
> >  * width/height
> >    This is done with a video scaler.
> >    The DAR (Display Aspect Ratio) MUST be respected.
> >    If needed, black borders can be added to comply with the target DAR.
> >  * framerate
> >  * format/colorspace/depth
> >    All of this is done with a colorspace converter
>
> With respect to framerate, any thought on VFR streams? If the target
> format supports VFR, then it'd be nice to be able to just encode the
> input as-is, without having to force it to a specified framerate.

  Hadn't thought about that one, and there are indeed use-cases where
you'd want that (webcams that change their framerate depending on the
light level for example, and you don't want to use videorate in those
cases).

 Adding a boolean variable_framerate in GstVideoEncodingProfile would be
an option then, defaulting to FALSE.

/**
 * GstVideoEncodingProfile:
 * @profile: common #GstEncodingProfile part.
 * @pass: The pass number if this is part of a multi-pass profile. Starts at 1
 * for multi-pass. Set to 0 if this is not part of a multi-pass profile.
 * @variable_framerate: Do not enforce framerate on incoming raw stream. Default
 * is FALSE.
 */
struct _GstVideoEncodingProfile {
  GstStreamEncodingProfile      profile;
  guint                         pass;
  gboolean                      variable_framerate;
};


>
> It'd probably also be good to have some way to select, and then set
> properties on, the elements used here. e.g. the application probably
> wants to be able to control what sort of scaling to do (to enable
> high-quality scaling, for example, or low-quality/fast for preview
> encodes). Obviously, the default would just work, so this would be
> more optional API for more advanced applications.

  That's the problem with trying to design the
one-API-to-rule-them-all :)

  Seriously though... the problem is that if we expose everything... we
just come back to square one (or maybe two, but not that far ahead).

  The other problem is also that we might have several 'converters'
available, none of them having well-known properties. Maybe some
platform might have a differently named converter, ...

  An intermediate solution might be to provide a quality/speed knob over
those conversions.
  Maybe have it as a boolean. So by default you would get the highest
quality of conversion available (if you're in a live pipeline, QoS would
kick in to lower the quality so you don't lose any data), but if you
flip that boolean, you would get a low-quality/as-fast-as-possible
conversion.


>
> >
> > (2) Actual encoding (optional for raw streams)
> >
> >  An encoder (with some optional settings) is used.
>
> Are you planning anything for specifying how the settings should work,
> such that a profile could contain settings that apply to several
> different encoders (probably selected by rank, or optionally forced by
> the application), or will the settings be tied to a specific element?

  Right now the settings would be tied to a specific element, since the
profile system relies exclusively on presets (which are tied to an
element) for properties of an element.

  The rationale behind this... is that we have no unified properties
across elements, let alone across encoders, let alone across different
encoders for the same format.

  I'd *LOVE* to have a unified system for properties which are common to
encoders... but every time I put my head down on that problem.. I only
see one solution : base classes for encoders (guaranteeing *some*
consistency in properties).


> >
> > The representation used here is XML only as an example. No decision is
> > made as to which formatting to use for storing targets and profiles.
>
> Whatever decision is made as to the 'default' format for storing
> these, I'd really like to see a sufficiently complete API that an
> application that (for whatever reason) doesn't want to use that format
> could build the GstEncodingProfile object itself, from its own data
> store.

  Absolutely, whichever storage format is decided on will only be the
reference one. It's important to get that one right, since it is the
format in which system-wide profiles (and those shipped in GStreamer
modules) will come.

  BUT, we do want to leave the possibility for applications (or any
service) to create those profiles on their own.
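
  As a rough illustration of what "creating a profile on its own" could
look like for an application, here is a sketch using plain C structs
that mirror the fields of the XML example. The struct and function
names are hypothetical and are NOT the proposed GstEncodingProfile API;
the values are copied from the N900 example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical plain-C mirror of one <stream-profile> entry. */
typedef struct {
  const char *type;         /* "audio" or "video" */
  const char *format;       /* caps string of the encoded format */
  const char *preset;       /* name of the element preset */
  const char *restriction;  /* caps string restricting the raw input */
  unsigned int presence;    /* how many times the stream must appear */
} StreamProfile;

/* Hypothetical plain-C mirror of one <gst-encoding-profile>. */
typedef struct {
  const char *name;
  const char *format;       /* caps string of the container format */
  const StreamProfile *streams;
  size_t n_streams;
} EncodingProfile;

/* Built by the application from its own data store, no XML involved. */
const StreamProfile n900_streams[] = {
  { "audio", "audio/mpeg,mpegversion=4", "Quality High/Main",
    "audio/x-raw-int,channels=[1,2]", 1 },
  { "video", "video/x-h264", "Profile Baseline/Quality High",
    "video/x-raw-yuv,width=[16,800],height=[16,480],"
    "framerate=[1/1,30000/1001]", 1 },
};

const EncodingProfile n900_hq = {
  "Nokia N900/H264 HQ", "video/quicktime,variant=iso",
  n900_streams, sizeof n900_streams / sizeof n900_streams[0]
};

/* Sum the presence of all streams of the given type. */
unsigned int
profile_presence_for_type (const EncodingProfile *profile, const char *type)
{
  unsigned int total = 0;

  assert (profile != NULL && type != NULL);
  for (size_t i = 0; i < profile->n_streams; i++)
    if (strcmp (profile->streams[i].type, type) == 0)
      total += profile->streams[i].presence;
  return total;
}
```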

>
>
>
> >
> > <gst-encoding-target>
> >  <name>Nokia N900</name>
> >  <category>Consumer Device</category>
> >  <profiles>
> >    <profile>Nokia N900/H264 HQ</profile>
> >    <profile>Nokia N900/MP3</profile>
> >    <profile>Nokia N900/AAC</profile>
> >  </profiles>
> > </gst-encoding-target>
> >
> > <gst-encoding-profile>
> >  <name>Nokia N900/H264 HQ</name>
> >  <description>
> >    High Quality H264/AAC for the Nokia N900
> >  </description>
> >  <format>video/quicktime,variant=iso</format>
> >  <streams>
> >    <stream-profile>
> >      <type>audio</type>
> >      <format>audio/mpeg,mpegversion=4</format>
> >      <preset>Quality High/Main</preset>
> >      <restriction>audio/x-raw-int,channels=[1,2]</restriction>
> >      <presence>1</presence>
> >    </stream-profile>
> >    <stream-profile>
> >      <type>video</type>
> >      <format>video/x-h264</format>
> >      <preset>Profile Baseline/Quality High</preset>
> >      <restriction>
> >        video/x-raw-yuv,width=[16, 800],\
> >        height=[16, 480],framerate=[1/1, 30000/1001]
> >      </restriction>
> >      <presence>1</presence>
> >    </stream-profile>
> >  </streams>
> >
> > </gst-encoding-profile>
>
> This describes the constraints on the device (or whatever). Have you
> thought at all about splitting out "constraints on what the target can
> accept" from "what we actually want to encode"?
>
> e.g. this profile says that I can do any size (within that range)
> video, but my application wants to encode at a particular size -
> should I be replacing the caps in the profile at runtime, or should
> there be another object to represent these (somewhat different)
> concepts?

  I guess I should have put another example, where the target is a less
'flexible' device.
  Let's say your target is a portable device that only supports
320x240@25fps, then the profile would have those very specific caps in
the restriction. (video/x-raw-yuv,width=320,height=240,framerate=25/1)

  Do you have any more specific example in mind with the above
use-case ? Did you mean you wanted the application to be able to
fine-tune the profile even more at runtime ?

>
> What about constraints that are not (currently, at least) expressible
> through caps? e.g. bitrate, profiles, etc?

  Those are tunable through the presets (through which all properties
are expressed). In the N900 example above, the preset selects the
baseline profile and the bitrate corresponding to "Quality High".

>
>
> Anyway, I don't have time right now to continue through this in enough
> depth - and I'm sure some of my remarks miss something you've already
> thought about - but this was just to throw some more ideas into the
> mix.

  I'm looking forward to the rest of your comments,

>
> I'm very happy to see you looking into this more deeply!

  Thank you

     Edward

>
> Mike
>
> ------------------------------------------------------------------------------
> Come build with us! The BlackBerry(R) Developer Conference in SF, CA
> is the only developer event you need to attend this year. Jumpstart your
> developing skills, take BlackBerry mobile applications to market and stay
> ahead of the curve. Join us from November 9 - 12, 2009. Register now!
> http://p.sf.net/sfu/devconference
> _______________________________________________
> gstreamer-devel mailing list
> [hidden email]
> https://lists.sourceforge.net/lists/listinfo/gstreamer-devel




Re: [RFC] Encoding and Profiles

michael smith-6-3
>>
>> It'd probably also be good to have some way to select, and then set
>> properties on, the elements used here. e.g. the application probably
>> wants to be able to control what sort of scaling to do (to enable
>> high-quality scaling, for example, or low-quality/fast for preview
>> encodes). Obviously, the default would just work, so this would be
>> more optional API for more advanced applications.
>
>  That's the problem with trying to design the
> one-API-to-rule-them-all :)
>
>  Seriously though... the problem is that if we expose everything... we
> just come back to square one (or maybe two, but not that far ahead).
>
>  The other problem is also that we might have several 'converters'
> available, none of them having well-known properties. Maybe some
> platform might have a differently named converter, ...
>
>  An intermediate solution might be to provide a quality/speed knob over
> those conversions.
>  Maybe have it as a boolean. So by default you would get the highest
> quality of conversion available (if you're in a live pipeline, QoS would
> kick in to lower the quality so you don't lose any data), but if you
> flip that boolean, you would get a low-quality/as-fast-as-possible
> conversion.

I was basically just thinking of something along the lines of what
playbin does - by default, just do something sane (here, videoscale
with the scaling mode set to the highest-quality mode), but also allow
the app to just say "use this element instead" - the app can then
override things as much as it wants to, but it doesn't _have to_,
since the defaults work sensibly.



>
>
>>
>> >
>> > (2) Actual encoding (optional for raw streams)
>> >
>> >  An encoder (with some optional settings) is used.
>>
>> Are you planning anything for specifying how the settings should work,
>> such that a profile could contain settings that apply to several
>> different encoders (probably selected by rank, or optionally forced by
>> the application), or will the settings be tied to a specific element?
>
>  Right now the settings would be tied to a specific element, since the
> profile system relies exclusively on presets (which are tied to an
> element) for properties of an element.
>
>  The rationale behind this... is that we have no unified properties
> across elements, let alone across encoders, let alone across different
> encoders for the same format.
>
>  I'd *LOVE* to have a unified system for properties which are common to
> encoders... but every time I put my head down on that problem.. I only
> see one solution : base classes for encoders (guaranteeing *some*
> consistency in properties).

Yeah, probably. That's pretty unfortunate, though. e.g. in Songbird,
the profile I use will have something to do with the user's
configuration and the device we're transcoding for - but the actual
elements available to satisfy that profile will be different across
platforms and depend on what things the user has installed.

I can't see myself using this system if it's tightly tied to specific
elements for encoders (parsers, muxers, decoders, scalers, etc are
less problematic), which suggests that we do need _some_ mechanism to
use these things across multiple elements, even if it requires custom
application code rather than being automatic.

>> This describes the constraints on the device (or whatever). Have you
>> thought at all about splitting out "constraints on what the target can
>> accept" from "what we actually want to encode"?
>>
>> e.g. this profile says that I can do any size (within that range)
>> video, but my application wants to encode at a particular size -
>> should I be replacing the caps in the profile at runtime, or should
>> there be another object to represent these (somewhat different)
>> concepts?
>
>  I guess I should have put another example, where the target is a less
> 'flexible' device.
>  Let's say your target is a portable device that only support
> 320x240@25fps, then the profile would have those very specific caps in
> the restriction. (video/x-raw-yuv,width=320,height=240,framerate=25/1)
>
>  Do you have any more specific example in mind with the above
> use-case ? Did you mean you wanted the application to be able to
> fine-tune even more the profile at runtime ?

Yeah - I want the app to fine tune.

Let's suppose the following use-case:
 - User has an input video they got from the internet somewhere. It's
640x480, 30 fps, Theora.
 - User has a mobile phone that can play H.264 video at up to 720p (so
it can play the video at this resolution).
 - User wants to encode it at 320x240 to fit on their little microSD card.

So the video is already a supported size - but the application wants
to scale it smaller because the user has chosen that option - I don't
quite understand how that's meant to be expressed in your profiles/API
right now (I might be missing something).


>
>>
>> What about constraints that are not (currently, at least) expressible
>> through caps? e.g. bitrate, profiles, etc?
>
>  Those are tunable through the presets (through which all properties
> are expressed), in the N900 example above, it is set to baseline profile
> and the bitrate corresponding to "Quality High".

But the presets specify a particular set of settings, not the target
constraints. So there's no way to say "this device supports bitrates
up to 4 Mbps", but have a default bitrate for this profile of 2 Mbps,
I think?

I don't think the element presets are particularly helpful here - they
express "a particular configuration" not "a range of possibilities".
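
To illustrate the distinction, here is a minimal sketch in plain C of a
constraint that carries both a range and a default. The struct, the
function and the 64 kbps lower bound are made up for the example; only
the 4 Mbps limit and 2 Mbps default come from the case above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical: what the target accepts, plus the profile's default. */
typedef struct {
  unsigned int min_bitrate;      /* bits per second */
  unsigned int max_bitrate;      /* bits per second */
  unsigned int default_bitrate;  /* used when the app has no preference */
} BitrateConstraint;

/* requested == 0 means "no preference": use the profile default.
 * Any other value is clamped into the range the target supports. */
unsigned int
resolve_bitrate (const BitrateConstraint *c, unsigned int requested)
{
  assert (c != NULL);
  assert (c->min_bitrate <= c->default_bitrate
      && c->default_bitrate <= c->max_bitrate);

  if (requested == 0)
    return c->default_bitrate;
  if (requested < c->min_bitrate)
    return c->min_bitrate;
  if (requested > c->max_bitrate)
    return c->max_bitrate;
  return requested;
}
```

A preset alone can only express the default_bitrate field; the min/max
pair is the part that has nowhere to live right now.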

Mike


Re: [RFC] Encoding and Profiles

Robert Swain-2
In reply to this post by Edward Hervey-2
Hello,

2009/10/19 Edward Hervey <[hidden email]>:
>  I have been working lately on researching ways to make the whole
> encoding experience better and more streamlined for applications using
> GStreamer, and have come up with a proposal.
>
>  You will find the proposal below, and attached to the mail the
> research and proposed C API.
>
>  Comments and feedback are most welcome

Great! Looks good. :)

A few notes:

- It's not clear if it's possible to derive target profiles from others,
but it seems like it's been considered. My thought is that encoder
quality presets are usually orthogonal to profile/level and
device-specific restrictions. I think something like the following
might be useful:

* Quality-related
** Hypothetically, say we have qth264enc and x264enc. They can have
presets created for options which affect the encoding time/quality.
** Quality profiles can use these presets and add to them suggestions
for the available rate control methods (e.g. quantiser, bit rate or
so)

* Restriction-related
** Profile/level restrictions
** Device-specific restrictions

My point is that codec elements always have rather specific
configuration options and it would be good to maintain some kind of
range of quality option presets for each encoder that trade off speed
and quality for a given bit rate. If these options are included in
each target preset it will greatly increase maintenance and it's a big
enough job as it is. The whole set up lends itself well to this as
target devices could just override options as necessary.

- System-wide versus application-specific versus user-specific
profiles can be manageable through the API without too much difficulty
I think. If scope is added in the API for providing a path from which
one can load a profile then applications can use their own profiles.
Similarly if scope is added for creating non-stored profiles in memory
and just passing a pointer, this allows users/application developers
to create profiles however they like.

- To manage stream copying, it should be simple enough to probe the
caps of the target profile to check whether an input stream is supported
and, if so, flag that it can be copied. The application can then check
that flag and choose whether to copy or transcode.
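
A rough sketch of that copy-or-transcode check in plain C follows. The
types and the function are hypothetical, and the comparison is
deliberately simplified to a media-type string match plus a maximum
size; a real implementation would presumably intersect caps instead.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { STREAM_COPY, STREAM_TRANSCODE } StreamAction;

/* Hypothetical description of an already-encoded input stream. */
typedef struct {
  const char *format;   /* encoded media type, e.g. "video/x-h264" */
  int width, height;
} StreamInfo;

/* Hypothetical description of what the target profile accepts. */
typedef struct {
  const char *format;
  int max_width, max_height;
} TargetStream;

/* Copy only when the input already satisfies the target as-is. */
StreamAction
choose_action (const StreamInfo *in, const TargetStream *target)
{
  assert (in != NULL && target != NULL);

  if (strcmp (in->format, target->format) == 0
      && in->width <= target->max_width
      && in->height <= target->max_height)
    return STREAM_COPY;

  return STREAM_TRANSCODE;
}
```

The result is only a suggestion: the application can still choose to
transcode a stream that could have been copied.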

- I think the API should allow an application to probe the profile to
find what ranges are supported and allow customisation within those
ranges.

Michael's use case about the device being able to play up to 720p but
to save space on the low capacity storage device he wants to encode at
a lower resolution/bit rate is sound.

Also, one may have a device that can play a higher resolution and has
some video output to hook up to a higher resolution display device,
but the playback device itself only has a small display. If one never
uses the video output functionality, one might want to restrict the
resolution.

Similarly one might want to downmix audio or so even though 5.1 is
supported e.g. a PS3 hooked up to a stereo amplifier.


In short: target profiles should specify the range of operation of a
target device, essentially being a device caps specification plus any
necessary encoder/muxer options (e.g. setting 0 b-frames or the mux
rate) to make things work on the device. Encoder element presets
should consider an unrestricted playback service and be focused on
speed/quality trade-off.


- Finally, with regard to Michael's concerns about tying a target
profile to specific elements - how about being able to specify
multiple possible elements to encode/mux a particular mime-type, with
some hierarchy for selecting whichever is deemed best, like the rest of
the GStreamer system works (I think... I'm new here. :))
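
Roughly what I have in mind, as a plain C sketch. The struct, the
function and the rank numbers are invented for the example, and the
element names are just the hypothetical pair from earlier in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical entry in a profile's list of acceptable encoders. */
typedef struct {
  const char *name;     /* element name */
  unsigned int rank;    /* higher is preferred */
  bool available;       /* is the element installed on this system? */
} EncoderCandidate;

/* Return the highest-ranked available candidate, or NULL if none is. */
const EncoderCandidate *
pick_encoder (const EncoderCandidate *candidates, size_t n)
{
  const EncoderCandidate *best = NULL;

  for (size_t i = 0; i < n; i++) {
    if (!candidates[i].available)
      continue;
    if (best == NULL || candidates[i].rank > best->rank)
      best = &candidates[i];
  }
  return best;
}
```

With { "qth264enc", 256, false } and { "x264enc", 128, true } in the
list, this picks x264enc, because the higher-ranked element is not
installed.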

Best regards,
Rob


Re: [RFC] Encoding and Profiles

Edward Hervey
On Tue, 2009-10-20 at 13:12 +0100, Robert Swain wrote:

> Hello,
>
> 2009/10/19 Edward Hervey <[hidden email]>:
> >  I have been working lately on researching ways to make the whole
> > encoding experience better and more streamlined for applications using
> > GStreamer, and have come up with a proposal.
> >
> >  You will find the proposal below, and attached to the mail the
> > research and proposed C API.
> >
> >  Comments and feedback are most welcome
>
> Great! Looks good. :)
>
> A few notes:
>
> - It's not clear if it's possible to derive target profiles from other
> but it seems like it's been considered.

  I'm not 100% certain about this. The main issue that comes to mind is
that if we start making a hierarchy of profiles, we'll have to make sure
that any change in a parent profile doesn't have any ill-effect on any
of the sub-profiles... including those you don't control.

>  My thought is that encoder
> quality presets are usually orthogonal to profile/level and
> device-specific restrictions. I think something like the following
> might be useful:
>
> * Quality-related
> ** Hypothetically, say we have qth264enc and x264enc. They can have
> presets created for options which affect the encoding time/quality.
> ** Quality profiles can use these presets and add to them suggestions
> for the available rate control methods (e.g. quantiser, bit rate or
> so)

  I don't really understand this part :(

>
> * Restriction-related
> ** Profile/level restrictions
> ** Device-specific restrictions
>
> My point is that codec elements always have rather specific
> configuration options and it would be good to maintain some kind of
> range of quality option presets for each encoder that trade off speed
> and quality for a given bit rate. If these options are included in
> each target preset it will greatly increase maintenance and it's a big
> enough job as it is. The whole set up lends itself well to this as
> target devices could just override options as necessary.

  Profile/Level are clear restrictions. If you choose those, you're
already limiting the rest of the choices (1)

  Then amongst the remaining choices I can only think of three ways of
going:
  * Exposing all remaining element-specific options...
  * Having a Quality/Speed setting ranging from Low-Quality/Fast to
Highest-Quality/Slow
  * A mix of the two :(

(1): that reminds me... if you select a certain profile that limits the
range of some option... how do we report that ?

>
> - System-wide versus application-specific versus user-specific
> profiles can be manageable through the API without too much difficulty
> I think. If scope is added in the API for providing a path from which
> one can load a profile then applications can use their own profiles.
> Similarly if scope is added for creating non-stored profiles in memory
> and just passing a pointer, this allows users/application developers
> to create profiles however they like.

  As stated before, all profiles will be available as C
structures/objects. Different backends can be written to support various
storage formats.

>
> - To manage stream copying it should be simple enough to probe the
> caps of the target profile to check if an input stream is supported
> and if so flag somehow that it can be copied so that this can be
> checked and specified whether to copy or transcode in the application.

  For that you would need to know the available streams in the file you
wish to convert. I'm working on that part, and it should go along with
this.

>
> - I think the API should allow an application to probe the profile to
> find what ranges are supported and allow customisation within those
> ranges.

  Ranges of what ?

>
> Michael's use case about the device being able to play up to 720p but
> to save space on the low capacity storage device he wants to encode at
> a lower resolution/bit rate is sound.

  Agreed.

>
> Also, one may have a device that can play a higher resolution and has
> some video output to hook up to a higher resolution display device,
> but the playback device itself only has a small display. If one never
> uses the video output functionality, one might want to restrict the
> resolution.

  That could just be another target.
  Ex : "N1234/H264" and "N1234/H264 External viewing"

>
> Similarly one might want to downmix audio or so even though 5.1 is
> supported e.g. a PS3 hooked up to a stereo amplifier.

  Can be overridden in the profile once you've loaded it.

>
>
> In short: target profiles should specify the range of operation of a
> target device, essentially being a device caps specification plus any
> necessary encoder/muxer options (e.g. setting 0 b-frames or the mux
> rate) to make things work on the device. Encoder element presets
> should consider an unrestricted playback service and be focused on
> speed/quality trade-off.
>
>
> - Finally with regard to Michael's concerns about tying a target
> profile to specific elements - how about one being able to specify
> multiple possible elements for encode/mux a particular mime-type and
> having some hierarchy for selection based on which is deemed best like
> the rest of the GStreamer system works (I think... I'm new here. :))

  That's why I'm only specifying GstCaps in the profile. I'm figuring
out some extra signals on EncodeBin so that one can override/reorder the
selected muxers/encoders.

     Edward

>
> Best regards,
> Rob
>

Re: [RFC] Encoding and Profiles

Edward Hervey
In reply to this post by michael smith-6-3
On Mon, 2009-10-19 at 12:23 -0700, Michael Smith wrote:

> >
> >  That's the problem with trying to design the
> > one-API-to-rule-them-all :)
> >
> >  Seriously though... the problem is that if we expose everything... we
> > just come back to square one (or maybe two, but not that far ahead).
> >
> >  The other problem is also that we might have several 'converters'
> > available, none of them having well-known properties. Maybe some
> > platform might have a differently named converter, ...
> >
> >  An intermediate solution might be to provide a quality/speed knob over
> > those conversions.
> >  Maybe have it as a boolean. So by default you would get the highest
> > quality of conversion available (if you're in a live pipeline, QoS would
> > kick in to lower the quality so you don't lose any data), but if you
> > flip that boolean, you would get a low-quality/as-fast-as-possible
> > conversion.
>
> I was basically just thinking of something along the lines of what
> playbin does - by default, just do something sane (here, videoscale
> with the scaling mode set to the highest-quality mode), but also allow
> the app to just say "use this element instead" - the app can then
> override things as much as it wants to, but it doesn't _have to_,
> since the defaults work sensibly.
>

 Something like this ?

videoscale          : The video scaler to use for converting video
                      streams (if needed).
                      flags: readable/writable
                      Object of type "GstElement" (default : videoscale)
videocolorspace     : The colorspace converter to use.
                      flags: readable/writable
                      Object of type "GstElement"
                      (default : ffmpegcolorspace)
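
 The "sane default unless overridden" behaviour could be sketched like
this in plain C. It is only an illustration: the struct and functions
are hypothetical, and element names stand in for the GstElement objects
the properties above would actually hold.

```c
#include <assert.h>
#include <stddef.h>

/* Defaults taken from the property listing above. */
const char *const default_videoscale = "videoscale";
const char *const default_videocolorspace = "ffmpegcolorspace";

/* Hypothetical per-bin overrides; NULL means "use the default". */
typedef struct {
  const char *videoscale;
  const char *videocolorspace;
} ConverterOverrides;

/* The scaler to use: the app's choice if set, else the default. */
const char *
effective_videoscale (const ConverterOverrides *o)
{
  return (o != NULL && o->videoscale != NULL)
      ? o->videoscale : default_videoscale;
}

/* The colorspace converter to use, resolved the same way. */
const char *
effective_videocolorspace (const ConverterOverrides *o)
{
  return (o != NULL && o->videocolorspace != NULL)
      ? o->videocolorspace : default_videocolorspace;
}
```

 An application that never touches the properties gets the defaults; one
that sets only videoscale still gets the default colorspace converter.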



>
>
> >
> >
> >>
> >> >
> >> > (2) Actual encoding (optional for raw streams)
> >> >
> >> >  An encoder (with some optional settings) is used.
> >>
> >> Are you planning anything for specifying how the settings should work,
> >> such that a profile could contain settings that apply to several
> >> different encoders (probably selected by rank, or optionally forced by
> >> the application), or will the settings be tied to a specific element?
> >
> >  Right now the settings would be tied to a specific element, since the
> > profile system relies exclusively on presets (which are tied to an
> > element) for properties of an element.
> >
> >  The rationale behind this... is that we have no unified properties
> > across elements, let alone across encoders, let alone across different
> > encoders for the same format.
> >
> >  I'd *LOVE* to have a unified system for properties which are common to
> > encoders... but every time I put my head down on that problem.. I only
> > see one solution : base classes for encoders (guaranteeing *some*
> > consistency in properties).
>
> Yeah, probably. That's pretty unfortunate, though. e.g. in songbird,
> the profile I use will have something to do with the user's
> configuration and the device we're transcoding for - but the actual
> elements available to satisfy that profile will be different across
> platforms and depend on what things the user has installed.
>
> I can't see myself using this system if it's tightly tied to specific
> elements for encoders (parsers, muxers, decoders, scalers, etc are
> less problematic), which suggests that we do need _some_ mechanism to
> use these things across multiple elements, even if it requires custom
> application code rather than being automatic.

  One way (specifying element names and properties) or the other
(specifying caps and presets), it's going to require some custom work to
be done.

  The reason I prefer/recommend going the caps/presets way is that most
of the work will be done *in* the element and preset, and much less (if
any) in the profiles and applications.

>
> >> This describes the constraints on the device (or whatever). Have you
> >> thought at all about splitting out "constraints on what the target can
> >> accept" from "what we actually want to encode"?
> >>
> >> e.g. this profile says that I can do any size (within that range)
> >> video, but my application wants to encode at a particular size -
> >> should I be replacing the caps in the profile at runtime, or should
> >> there be another object to represent these (somewhat different)
> >> concepts?
> >
> >  I guess I should have put another example, where the target is a less
> > 'flexible' device.
> >  Let's say your target is a portable device that only support
> > 320x240@25fps, then the profile would have those very specific caps in
> > the restriction. (video/x-raw-yuv,width=320,height=240,framerate=25/1)
> >
> >  Do you have any more specific example in mind with the above
> > use-case ? Did you mean you wanted the application to be able to
> > fine-tune even more the profile at runtime ?
>
> Yeah - I want the app to fine tune.
>
> Let's suppose the following use-case:
>  - User has an input video they got from the internet somewhere. It's
> 640x480, 30 fps, theora.
>  - User has a mobile phone that can play H.264 video at up to 720p (so
> it can do the video at this resolution).
>  - User wants to encode it at 320x240 to fit on their little micro-sd card.
>
> So the video is already a supported size - but the application wants
> to scale it smaller because the user has chosen that option - I don't
> quite understand how that's meant to be expressed in your profiles/API
> right now (I might be missing something).

  In that case, your target for that device would have restriction
caps along the lines of:
   video/x-raw-yuv,width=[16,1280],height=[16,720],framerate=[0/1,1000/1]

  Meaning that your device can play back any video between 16x16 and
1280x720.

  The example I gave before with 320x240@25 would be the case for
devices that can only play back a single resolution/fps.
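
  In other words, the application picks its own size and the restriction
only says whether that choice fits the target. A minimal sketch of that
check in plain C (the types and function names are hypothetical
stand-ins for what caps ranges express, not real API):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an integer range in restriction caps,
 * e.g. width=[16,1280]. A fixed value is a range with min == max. */
typedef struct {
  int min;
  int max;
} IntRange;

typedef struct {
  IntRange width;
  IntRange height;
} VideoRestriction;

bool
in_range (IntRange r, int value)
{
  return value >= r.min && value <= r.max;
}

/* TRUE if the size chosen by the application fits the target. */
bool
restriction_accepts (const VideoRestriction *r, int width, int height)
{
  assert (r != NULL);
  return in_range (r->width, width) && in_range (r->height, height);
}
```

  With width=[16,1280] and height=[16,720], both the 640x480 source size
and the 320x240 the user asked for are accepted. With the fixed
320x240 target, only 320x240 is.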

>
>
> >
> >>
> >> What about constraints that are not (currently, at least) expressible
> >> through caps? e.g. bitrate, profiles, etc?
> >
> >  Those are tunable through the presets (through which all properties
> > are expressed), in the N900 example above, it is set to baseline profile
> > and the bitrate corresponding to "Quality High".
>
> But the presets specify a particular set of settings, not the target
> constraints. So there's no way to say "this device supports bitrates
> up to 4 Mbps", but have a default bitrate for this profile of 2 Mbps,
> I think?
>
> I don't think the element presets are particularly helpful here - they
> express "a particular configuration" not "a range of possibilities".


   Are you saying presets don't satisfy all the requirements here ? I
completely agree. But apart from trying to extend presets, there's only
one other (ugly) option:
  * make sure all properties (like bitrate) for encoders have the
SAME/EXACT/GUARANTEED name and meaning and range ...

 ... and that one doesn't seem trivial either.


        Edward

>
> Mike




Re: [RFC] Encoding and Profiles

Stefan Sauer
Edward Hervey wrote:

> On Mon, 2009-10-19 at 12:23 -0700, Michael Smith wrote:
>
>>>  That's the problem with trying to design the
>>> one-API-to-rule-them-all :)
>>>
>>>  Seriously though... the problem is that if we expose everything... we
>>> just come back to square one (or maybe two, but not that far ahead).
>>>
>>>  The other problem is also that we might have several 'converters'
>>> available, none of them having well-known properties. Maybe some
>>> platform might have a differently named converter, ...
>>>
>>>  An intermediate solution might be to provide a quality/speed knob over
>>> those conversions.
>>>  Maybe have it as a boolean. So by default you would get the highest
>>> quality of conversion available (if you're in a live pipeline, QoS would
>>> kick in to lower the quality so you don't lose any data), but if you
>>> flip that boolean, you would get a low-quality/as-fast-as-possible
>>> conversion.
>> I was basically just thinking of something along the lines of what
>> playbin does - by default, just do something sane (here, videoscale
>> with the scaling mode set to the highest-quality mode), but also allow
>> the app to just say "use this element instead" - the app can then
>> override things as much as it wants to, but it doesn't _have to_,
>> since the defaults work sensibly.
>>
>
>  Something like this ?
>
> videoscale          : The video scaler to use for converting video
>                       streams (if needed).
>                       flags: readable/writable
>                       Object of type "GstElement" (default : videoscale)
> videocolorspace     : The colorspace converter to use.
>                       flags: readable/writable
>                       Object of type "GstElement"
>                       (default : ffmpegcolorspace)
>
>

Yes, that's something we should also do in playbin2 and camerabin. Maybe we
could also have an autotransform element that takes a klass and picks the
highest-ranked element from that klass (I don't think we want autovideoscale,
autocolorspace, ...).

Not sure if we should name them video-scale and colorspace-convert.

Stefan


>
>>
>>>
>>>>> (2) Actual encoding (optional for raw streams)
>>>>>
>>>>>  An encoder (with some optional settings) is used.
>>>> Are you planning anything for specifying how the settings should work,
>>>> such that a profile could contain settings that apply to several
>>>> different encoders (probably selected by rank, or optionally forced by
>>>> the application), or will the settings be tied to a specific element?
>>>  Right now the settings would be tied to a specific element, since the
>>> profile system relies exclusively on presets (which are tied to an
>>> element) for properties of an element.
>>>
>>>  The rationale behind this... is that we have no unified properties
>>> across elements, let alone across encoders, let alone across different
>>> encoders for the same format.
>>>
>>>  I'd *LOVE* to have a unified system for properties which are common to
>>> encoders... but every time I put my head down on that problem.. I only
>>> see one solution : base classes for encoders (guaranteeing *some*
>>> consistency in properties).
>> Yeah, probably. That's pretty unfortunate, though. e.g. in songbird,
>> the profile I use will have something to do with the user's
>> configuration and the device we're transcoding for - but the actual
>> elements available to satisfy that profile will be different across
>> platforms and depend on what things the user has installed.
>>
>> I can't see myself using this system if it's tightly tied to specific
>> elements for encoders (parsers, muxers, decoders, scalers, etc are
>> less problematic), which suggests that we do need _some_ mechanism to
>> use these things across multiple elements, even if it requires custom
>> application code rather than being automatic.
>
>   One way (specifying element names and properties) or the other
> (specifying caps and presets), it's going to require some custom work to
> be done.
>
>   The reason I prefer/recommend going the caps/presets way is that most
> of the work will be done *in* the element and preset, and much less (if
> not none) in the profiles and applications.
>
>>>> This describes the constraints on the device (or whatever). Have you
>>>> thought at all about splitting out "constraints on what the target can
>>>> accept" from "what we actually want to encode"?
>>>>
>>>> e.g. this profile says that I can do any size (within that range)
>>>> video, but my application wants to encode at a particular size -
>>>> should I be replacing the caps in the profile at runtime, or should
>>>> there be another object to represent these (somewhat different)
>>>> concepts?
>>>  I guess I should have put another example, where the target is a less
>>> 'flexible' device.
>>>  Let's say your target is a portable device that only supports
>>> 320x240@25fps, then the profile would have those very specific caps in
>>> the restriction. (video/x-raw-yuv,width=320,height=240,framerate=25/1)
>>>
>>>  Do you have any more specific example in mind with the above
>>> use-case ? Did you mean you wanted the application to be able to
>>> fine-tune even more the profile at runtime ?
>> Yeah - I want the app to fine tune.
>>
>> Let's suppose the following use-case:
>>  - User has an input video they got from the internet somewhere. It's
>> 640x480, 30 fps, theora.
>>  - User has a mobile phone that can play H.264 video at up to 720p (so
>> it can do the video at this resolution).
>>  - User wants to encode it at 320x240 to fit on their little micro-sd card.
>>
>> So the video is already a supported size - but the application wants
>> to scale it smaller because the user has chosen that option - I don't
>> quite understand how that's meant to be expressed in your profiles/API
>> right now (I might be missing something).
>
>   In that case, your target for that device would have restriction
> caps along the lines of:
>    video/x-raw-yuv,width=[16,1280],height=[16,720],framerate=[0/1,1000/1]
>
>   Meaning that your device can playback any videos between 16x16 and
> 1280x720.
>
>   The example I gave before with 320x240@25 would be the case for
> devices that can only play back a single resolution/fps.
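
One way to read the exchange above: the device restriction is a range, the
application's wish (320x240) is a narrower restriction, and the size actually
encoded is their intersection. The sketch below models that in plain C with
integer ranges only. It is not the GstCaps API; IntRange, VideoRestriction and
both functions are hypothetical names, and framerate is left out for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Inclusive integer range, e.g. width=[16,1280]. */
typedef struct {
  int min;
  int max;
} IntRange;

/* Hypothetical, simplified model of a video restriction. */
typedef struct {
  IntRange width;
  IntRange height;
} VideoRestriction;

/* Intersect two ranges. Returns false if the result would be empty,
 * in which case *out is left untouched. */
static bool
range_intersect (IntRange a, IntRange b, IntRange *out)
{
  int lo = a.min > b.min ? a.min : b.min;
  int hi = a.max < b.max ? a.max : b.max;

  assert (out != NULL);
  if (lo > hi)
    return false;
  out->min = lo;
  out->max = hi;
  return true;
}

/* Narrow what the device accepts by what the application wants.
 * Returns false if the two are incompatible; *out may then be only
 * partially written and should be ignored. */
static bool
restriction_intersect (const VideoRestriction *device,
    const VideoRestriction *wanted, VideoRestriction *out)
{
  assert (device != NULL && wanted != NULL && out != NULL);
  return range_intersect (device->width, wanted->width, &out->width)
      && range_intersect (device->height, wanted->height, &out->height);
}
```

Under this model, a device range of [16,1280] x [16,720] intersected with a
fixed 320x240 request yields exactly 320x240, while a 1920x1080 request has
no intersection and would be rejected.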
>
>>
>>>> What about constraints that are not (currently, at least) expressible
>>>> through caps? e.g. bitrate, profiles, etc?
>>>  Those are tunable through the presets (through which all properties
>>> are expressed), in the N900 example above, it is set to baseline profile
>>> and the bitrate corresponding to "Quality High".
>> But the presets specify a particular set of settings, not the target
>> constraints. So there's no way to say "this device supports bitrates
>> up to 4 Mbps", but have a default bitrate for this profile of 2 Mbps,
>> I think?
>>
>> I don't think the element presets are particularly helpful here - they
>> express "a particular configuration" not "a range of possibilities".
>
>
>    Are you saying presets don't satisfy all the requirements here ? I
> completely agree. But apart from trying to extend presets there's only
> one ugly other option:
>   * make sure all properties (like bitrate) for encoders have the
> SAME/EXACT/GUARANTEED name and meaning and range ...
>
>  ... and that one doesn't seem trivial either.
>
>
>         Edward
>
>> Mike
>
>
>



Re: [RFC] Encoding and Profiles

Stefan Sauer
In reply to this post by Edward Hervey
Edward Hervey wrote:

> On Tue, 2009-10-20 at 13:12 +0100, Robert Swain wrote:
>> Hello,
>>
>> 2009/10/19 Edward Hervey <[hidden email]>:
>>>  I have been working lately on researching ways to make the whole
>>> encoding experience better and more streamlined for applications using
>>> GStreamer, and have come up with a proposal.
>>>
>>>  You will find the proposal below, and attached to the mail the
>>> research and proposed C API.
>>>

<snip>

>> * Restriction-related
>> ** Profile/level restrictions
>> ** Device-specific restrictions
>>
>> My point is that codec elements always have rather specific
>> configuration options and it would be good to maintain some kind of
>> range of quality option presets for each encoder that trade off speed
>> and quality for a given bit rate. If these options are included in
>> each target preset it will greatly increase maintenance and it's a big
>> enough job as it is. The whole set up lends itself well to this as
>> target devices could just override options as necessary.
>
>   Profile/Level are clear restrictions. If you choose those, you're
> already limiting the rest of the choices (1)
>
>   Then amongst the remaining choices I can only think of three ways of
> going:
>   * Expose all remaining element-specific options...
>   * Have a Quality/Speed setting ranging from Low-Quality/Fast to
> Highest-Quality/Slow
>   * A mix of the two :(
>
> (1): that reminds me... if you select a certain profile that limits the
> range of some option... how do we report that ?

This is a general restriction of e.g. GObject properties. E.g. in v4l2src I
would sometimes like to limit the real value range once the device has been
opened. We can do that for caps, but not for properties. Sure, we have
GstPropertyProbe for it.
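
As a rough illustration of the problem, here is a plain C sketch of a property
that keeps both its declared range and an effective range that can be narrowed
later (for instance once a device is opened or a profile is selected). This is
not GObject or GstPropertyProbe code; RangedProperty and its functions are
hypothetical names for the sake of the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical property model: the declared range is fixed up front,
 * the effective range may be narrowed at runtime and can be reported
 * back to the application. */
typedef struct {
  int declared_min, declared_max;
  int effective_min, effective_max;
  int value;
} RangedProperty;

static void
ranged_property_init (RangedProperty *p, int min, int max, int value)
{
  assert (p != NULL && min <= max && value >= min && value <= max);
  p->declared_min = p->effective_min = min;
  p->declared_max = p->effective_max = max;
  p->value = value;
}

/* Narrow the effective range. The request is clipped to the declared
 * range and the current value is clamped into the new range. Returns
 * false (and changes nothing) if the resulting range would be empty. */
static bool
ranged_property_narrow (RangedProperty *p, int min, int max)
{
  assert (p != NULL);
  if (min < p->declared_min)
    min = p->declared_min;
  if (max > p->declared_max)
    max = p->declared_max;
  if (min > max)
    return false;
  p->effective_min = min;
  p->effective_max = max;
  if (p->value < min)
    p->value = min;
  if (p->value > max)
    p->value = max;
  return true;
}

/* Set the value. Returns false and keeps the old value if the new one
 * is outside the effective range. */
static bool
ranged_property_set (RangedProperty *p, int value)
{
  assert (p != NULL);
  if (value < p->effective_min || value > p->effective_max)
    return false;
  p->value = value;
  return true;
}
```

The point of keeping both ranges is that an application could query
effective_min/effective_max to learn what is currently allowed, which is the
information a static property declaration cannot express.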

Stefan


Re: [RFC] Encoding and Profiles

Edward Hervey-2
In reply to this post by Edward Hervey-2
On Mon, 2009-10-19 at 18:53 +0200, Edward Hervey wrote:

> Hi all,
>
>   I have been working lately on researching ways to make the whole
> encoding experience better and more streamlined for applications using
> GStreamer, and have come up with a proposal.
>
>   You will find the proposal below, and attached to the mail the
> research and proposed C API.
>
>   Comments and feedback are most welcome

  I've started the implementation of that proposal and have put the
current work in a new module called gst-convenience.

  What is currently available:
  * encodebin
    support for single/multiple audio/video/containers
  * gstprofile
    Creating/copying/freeing GstEncodingProfile and GstStreamEncodingProfile
  There are unit tests in tests/check that show how to use that API and
element, plus inline documentation.

  What remains to be done:
  * Use the restriction support in encodebin
  * Use the preset fields in encodebin
  * Design a default storage/loading system for profiles

  Bonus:
  * I re-implemented in C the Discoverer which many gst-python
applications are using (PiTiVi, Jokosher, Transmageddon, ..). The goal
is to be able to very quickly get a lot of information about one or many
URIs (number of streams, stream properties, duration, tags, ...).
    There is a test application showing how to use it in tests/examples/
    Currently it outputs the information as a GstStructure; I'm still
working on coming up with a saner API to access that information.

  The proposal from this mail thread (modified according to feedback) is
available in docs/design/

  The code is available here :
http://git.collabora.co.uk/?p=user/edward/gst-convenience.git;a=summary

  Comments and feedback welcome,

   Edward

P.S. The long-term goal is not to keep all of those in a separate
repository but to eventually move them into -base.

--
Edward Hervey  --  Collabora Multimedia
Lead Platforms Engineer      Co-Founder


