Hello,

I want to add a parser for a format and use GstBaseParse as base class. However, this format has some peculiarities:

1) Audio data comes in 2048-byte blocks. No metadata between blocks, but one block equals the data for one channel. So, if this is stereo content, then I have to read 2*2048 bytes in order to output properly interleaved data. When I do TIME->BYTES conversions in the convert() vfunc, I plan on first doing the time->bytes conversion as usual (bytes = time * (sample_rate * bytes_per_sample * num_channels / GST_SECOND)), and then round down the result to an aligned value. So, with the stereo example from earlier, if for example the conversion yields a byte offset value of 6511, I would round it down to 4096, to ensure seeking does not end in the middle of a block. Does this make sense, or is there a better way?

2) Each block consists of 16-bit words, which are essentially the samples. I guess this means that when doing TIME->DEFAULT conversion, I should still do the conversion like this: default = time * (sample_rate / GST_SECOND), correct? DEFAULT is supposed to mean "sample" or "frame" with audio data, right?

3) At first, I have to read the headers. I plan on setting the min_frame_size to the size of the first header, read its contents, add the DROP flag to the GstBaseParseFrame, and return GST_BASE_PARSE_FLOW_DROPPED in handle_frame(). Is this the correct/recommended way?

4) Is it possible that a seek query comes in while I am still scanning the headers, that is, before I even finished a frame without dropping it? If so, what happens then?

5) There might be trailing padding data. I therefore need to know what the current position is (in BYTES) to ensure that this trailing data is excluded and the EOS is sent when the end of the valid data is reached. What is the proper way of doing this? gst_base_parse_set_duration() does not seem appropriate for this (and also I anyway want to use it to let the application know about the duration in nanoseconds).

6) In addition to this trailing padding data, this format allows for id3v2 content ... at the end of the file. id3demux does not seem to support this - but ID3v2 does (at least in version 2.4), although admittedly, support for ID3v2 tags at the end of files is uncommon. I guess a patch for id3demux would be the best approach here?

This format never allows for streaming, that is, the media always is of a known and finite length.

Is it still a good idea to use baseparse? Or should I actually use GstElement in this case?

_______________________________________________
gstreamer-devel mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/gstreamer-devel

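For reference, the arithmetic from points 1 and 2 above can be sketched in plain C. This is only an illustration under stated assumptions, not GstBaseParse code: NSEC_PER_SEC stands in for GST_SECOND, all function names are made up for this example, and byte offsets are assumed to be relative to the start of the audio data (after any headers). Note that the multiplication has to happen before the division; with integer math, dividing the rate by GST_SECOND first would give 0. Inside GStreamer, gst_util_uint64_scale() would normally do the scaling. Whether the rounding belongs in convert() at all is a separate question.

```c
#include <stdint.h>

/* Nanoseconds per second; GST_SECOND has the same value. */
#define NSEC_PER_SEC UINT64_C(1000000000)
/* Size of one single-channel block, as described in point 1. */
#define BLOCK_SIZE UINT64_C(2048)

/* Scale a nanosecond timestamp by a per-second rate (hypothetical helper).
 * Whole seconds and the sub-second remainder are scaled separately so the
 * intermediate product stays small. */
static uint64_t scale_time(uint64_t time_ns, uint64_t per_second)
{
    uint64_t secs = time_ns / NSEC_PER_SEC;
    uint64_t rem = time_ns % NSEC_PER_SEC;
    return secs * per_second + (rem * per_second) / NSEC_PER_SEC;
}

/* TIME -> DEFAULT: number of sample frames that fit into time_ns. */
static uint64_t time_to_samples(uint64_t time_ns, uint32_t sample_rate)
{
    return scale_time(time_ns, sample_rate);
}

/* TIME -> BYTES, not yet aligned to a block boundary. */
static uint64_t time_to_bytes(uint64_t time_ns, uint32_t sample_rate,
                              uint32_t bytes_per_sample, uint32_t num_channels)
{
    uint64_t byte_rate = (uint64_t) sample_rate * bytes_per_sample * num_channels;
    return scale_time(time_ns, byte_rate);
}

/* Round a byte offset down to the start of a group of num_channels blocks. */
static uint64_t align_to_block_group(uint64_t byte_offset, uint32_t num_channels)
{
    uint64_t group_size = BLOCK_SIZE * num_channels;
    return byte_offset - (byte_offset % group_size);
}
```

With stereo content the group size is 4096 bytes, so align_to_block_group(6511, 2) gives 4096, matching the example in point 1.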
On Tue, 5 Jul 2016, at 11:10 PM, Carlos Rafael Giani wrote:
> Hello,
>
> I want to add a parser for a format and use GstBaseParse as base class.
> However, this format has some peculiarities:
>
> 1) Audio data comes in 2048-byte blocks. No metadata between blocks, but
> one block equals the data for one channel. So, if this is stereo
> content, then I have to read 2*2048 bytes in order to output properly
> interleaved data. When I do TIME->BYTES conversions in the convert()
> vfunc, I plan on first doing the time->bytes conversion as usual (bytes
> = time * (sample_rate * bytes_per_sample * num_channels / GST_SECOND)),
> and then round down the result to an aligned value. So, with the stereo
> example from earlier, if for example the conversion yields a byte offset
> value of 6511, I would round it down to 4096, to ensure seeking does not
> end in the middle of a block. Does this make sense, or is there a better
> way?

The convert vfunc is the wrong place to encode this logic. I'm not sure how we handle seeking into the middle of a frame. I guess you could (based on the offset or byte position) return DROP and the appropriate skip length.

> 2) Each block consists of 16-bit words, which are essentially the
> samples. I guess this means that when doing TIME->DEFAULT conversion, I
> should still do the conversion like this: default = time * (sample_rate
> / GST_SECOND) , correct? DEFAULT is supposed to mean "sample" or "frame"
> with audio data, right?

Yes. From the baseparse documentation:

"""
This base class uses GST_FORMAT_DEFAULT as a meaning of frames. So, subclass conversion routine needs to know that conversion from GST_FORMAT_TIME to GST_FORMAT_DEFAULT must return the frame number that can be found from the given byte position.
"""

> 3) At first, I have to read the headers. I plan on setting the
> min_frame_size to the size of the first header, read its contents, add
> the DROP flag to the GstBaseParseFrame, and return
> GST_BASE_PARSE_FLOW_DROPPED in handle_frame(). Is this the
> correct/recommended way?

You don't need to set the DROP flag (that's for when you decide to drop the frame in finish_frame()). Returning GST_BASE_PARSE_FLOW_DROPPED is sufficient. If I understand your format right, it should be okay for you to first set the minimum frame size (via gst_base_parse_set_min_frame_size()) to the header size, and then to 4096 so you subsequently get enough data in each call to generate the interleaved data.

> 4) Is it possible that a seek query comes in while I am still scanning
> the headers, that is, before I even finished a frame without dropping
> it? If so, what happens then?

You could call gst_base_parse_set_syncable() with FALSE until you have sufficient information to identify headers.

> 5) There might be trailing padding data. I therefore need to know what
> the current position is (in BYTES) to ensure that this trailing data is
> excluded and the EOS is sent when the end of the valid data is reached.
> What is the proper way of doing this? gst_base_parse_set_duration() does
> not seem appropriate for this (and also I anyway want to use it to let
> the application know about the duration in nanoseconds).

Would you have this information on the buffer via GST_BUFFER_OFFSET()?

> 6) In addition to this trailing padding data, this format allows for
> id3v2 content ... at the end of the file. id3demux does not seem to
> support this - but ID3v2 does (at least in version 2.4), although
> admittedly, support for ID3v2 tags at the end of files is uncommon. I
> guess a patch for id3demux would be the best approach here?

Indeed.

> This format never allows for streaming, that is, the media always is of
> a known and finite length.
>
> Is it still a good idea to use baseparse? Or should I actually use
> GstElement in this case?

It looks like baseparse should work.

-- Arun

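To make answers 3 and 5 a bit more concrete, here is a rough sketch in plain C of the two data-handling steps once a handle_frame() call has a full group of blocks. It is not GstBaseParse code and all names are hypothetical. It assumes the blocks of one group are stored back to back, one per channel, and it copies the 16-bit words as they are, since the byte order of the format is not stated in this thread. For the trailing padding, buf_offset is assumed to be something like the GST_BUFFER_OFFSET() value suggested above, and data_end is assumed to be known from the headers.

```c
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE 2048u                     /* bytes per single-channel block */
#define SAMPLES_PER_BLOCK (BLOCK_SIZE / 2u)  /* 16-bit words per block */

/* Interleave one group of per-channel blocks.
 * in:  num_channels consecutive blocks of SAMPLES_PER_BLOCK words each
 * out: SAMPLES_PER_BLOCK frames of num_channels words each */
static void interleave_group(const uint16_t *in, uint16_t *out,
                             unsigned num_channels)
{
    for (size_t s = 0; s < SAMPLES_PER_BLOCK; ++s) {
        for (unsigned c = 0; c < num_channels; ++c) {
            out[s * num_channels + c] = in[(size_t) c * SAMPLES_PER_BLOCK + s];
        }
    }
}

/* Number of bytes in a buffer that lie before the end of the valid audio
 * data. A result of 0 means the buffer is entirely trailing padding, which
 * is the point where the parser would want to signal EOS. */
static uint64_t valid_bytes(uint64_t buf_offset, uint64_t buf_size,
                            uint64_t data_end)
{
    if (buf_offset >= data_end)
        return 0;
    uint64_t remaining = data_end - buf_offset;
    return remaining < buf_size ? remaining : buf_size;
}
```

A result smaller than the buffer size means the valid data ends inside that buffer, so only that many bytes should be processed.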
On Tue, 2016-07-05 at 19:40 +0200, Carlos Rafael Giani wrote:
Hi,

> 6) In addition to this trailing padding data, this format allows for
> id3v2 content ... at the end of the file. id3demux does not seem to
> support this - but ID3v2 does (at least in version 2.4), although
> admittedly, support for ID3v2 tags at the end of files is uncommon. I
> guess a patch for id3demux would be the best approach here?

id3demux should support it in pull mode.

In push mode it won't handle end tags, since that would imply scanning all data being pushed through, and some end tags are only identified by the last few bytes of the file, so by the time you detect the tag you may have pushed out the beginning already. And even if you detect it fine then you only announce the tag once all the data has been processed, so that's not so useful.

In addition to that, in an autoplugging scenario id3demux would only get plugged if there's a start ID3 tag in addition to the end tag.

I'm not sure if id3demux should do seeks in push mode to detect end tags, but I look forward to seeing your patch to see what you come up with :)

Cheers
-Tim

--
Tim Müller, Centricular Ltd - http://www.centricular.com

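As an illustration of an end tag that is only identified by the last few bytes of the file: an appended ID3v2.4 tag ends with a 10-byte footer that mirrors the header but uses the identifier "3DI". The sketch below checks such a footer and derives the total tag size. This is not how id3demux is implemented; the function name is made up, and the caller is assumed to have already read the last 10 bytes of the file (for example in pull mode).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ID3V2_FOOTER_SIZE 10u

/* Inspect the last 10 bytes of a file. Returns the total size of an
 * appended ID3v2.4 tag (header + body + footer) in bytes, or 0 if the
 * bytes do not look like an ID3v2 footer. */
static uint32_t id3v2_end_tag_size(const uint8_t tail[ID3V2_FOOTER_SIZE])
{
    /* The footer identifier is "3DI", the header identifier reversed. */
    if (memcmp(tail, "3DI", 3) != 0)
        return 0;

    /* Version bytes are never 0xFF in a valid tag. */
    if (tail[3] == 0xFF || tail[4] == 0xFF)
        return 0;

    /* The size is a 28-bit synchsafe integer: bit 7 of each byte is 0. */
    for (size_t i = 6; i < ID3V2_FOOTER_SIZE; ++i) {
        if (tail[i] & 0x80)
            return 0;
    }

    uint32_t body_size = ((uint32_t) tail[6] << 21) |
                         ((uint32_t) tail[7] << 14) |
                         ((uint32_t) tail[8] << 7) |
                         (uint32_t) tail[9];

    /* The stored size excludes the 10-byte header and the 10-byte footer. */
    return body_size + 2u * ID3V2_FOOTER_SIZE;
}
```

Subtracting the returned size from the file length would give the offset where the tag starts, which a parser could then treat as the end of everything that precedes the tag.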