On the plugin cache

Behdad Esfahbod-3
Hi,

So, I blamed the gst plugin cache in my blog entry [1] for taking some 35ms on
every gst-using app startup.  It's only fair that I follow up with how I think
this should be fixed.

From what I understand after a quick look through the code, here is how the
startup cache check currently works:

  1) Fork

  2) In the child, stat all plugin dirs and all plugins in them, recursively

  3) If any plugins are newer than the binary cache, rebuild the cache

  4) In the parent, wait for child to finish.  If child failed to finish
cleanly, repeat 2 and 3 in the parent.


Lemme make this clear: yes, I know there's an option to disable forking.  But
that's beside the point.  I'm interested in how this thing normally works, and
should work.  The rationale for rebuilding the cache in a forked child is legit:

  a) avoid crashing the parent,

  b) avoid polluting the parent process with lots of libraries.


Anyway, there are multiple problems with that scheme that I see.  In no
particular order:

  P1) Plain fork() is not clean.  Use g_spawn instead.  This has to do with
how SIGCHLD is interpreted, etc.  As I understand it, gst may generate a
SIGCHLD that may then trigger a handler installed by the user, either through
glib or directly.  To handle the direct case, saving and restoring the SIGCHLD
handler may be needed.  I think handling SIGPIPE may be needed too.  Easy fix.

  P2) Stating plugins is safe in the parent and need not happen in a forked
child.  That obviates the need to fork in the common case.

  P3) *If* the child failed to update the registry (say, it crashed because of
a bad plugin), then the parent goes ahead and tries the same thing in the
parent process, expecting a different result!  That's plain wrong and almost
surely will crash the parent (heck, vuntz was facing this very same issue
today in gnome-settings-daemon).  If child fails, parent should simply print a
warning and proceed with using the old cache.
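
A minimal sketch of the behaviour P3 asks for, written in Python rather than
GStreamer's C: run the rebuild in a separate process and, if it fails, warn and
keep the old cache instead of retrying in the parent.  The name refresh_cache,
the helper command line, and the idea that the helper prints plugin names on
stdout are all assumptions made for illustration, not taken from the gst code.

```python
import subprocess
import sys
import warnings


def refresh_cache(helper_argv, old_cache):
    """Rebuild the cache in a child process; fall back to old_cache on failure.

    helper_argv is a hypothetical command that prints plugin names on stdout.
    """
    try:
        result = subprocess.run(
            helper_argv, capture_output=True, text=True, check=True
        )
    except (subprocess.CalledProcessError, OSError) as err:
        # The child crashed or exited non-zero: do NOT repeat the work here.
        warnings.warn(f"plugin cache rebuild failed ({err}); using the old cache")
        return old_cache
    return result.stdout.split()


if __name__ == "__main__":
    # A helper that exits with an error leaves the old cache in use.
    failing = [sys.executable, "-c", "import sys; sys.exit(3)"]
    print(refresh_cache(failing, ["oldplugin"]))
```

The point of the design is that a failure in the child is reported once and
then ignored; the parent never loads the suspect plugins itself.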

  P4) This is the main problem: the whole purpose of the cache should be to let
us avoid scanning all plugins on each startup.

Let's look into why the scan is needed right now.  Let me also note that the
case at hand is *exactly* the same as the one we face in fontconfig with fonts.

So how should the cache work?  By comparing the timestamp of each plugin dir
to the recorded timestamp of that dir in the cache.  One must compare
timestamps for equality, not for being more recent, as that is prone to clock
skew false negatives.

Also note that when I say all plugin dirs, that includes any recursively found
directories.  For the record, there are two schools of thought about how to
handle recursive directories:

  - Fully automatic: like fontconfig.  Record and check timestamp for all
directories found recursively.

  - Half automatic: like gtk-icon-cache.  Only record and check timestamp for
toplevel directories.  Requires every plugin install to also touch the
toplevel plugin directory.
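
The "fully automatic" variant can be sketched as follows, in Python for
brevity.  It records the mtime of every directory found recursively and later
re-stats only those recorded directories, comparing for equality.  The function
names are hypothetical and the dictionary stands in for whatever the real cache
file would store; this is an illustration, not the fontconfig or gst code.

```python
import os


def snapshot_dir_mtimes(top):
    """Return {directory: mtime in ns} for top and every directory below it."""
    mtimes = {}
    for dirpath, _dirnames, _filenames in os.walk(top):
        mtimes[dirpath] = os.stat(dirpath).st_mtime_ns
    return mtimes


def cache_is_current(recorded):
    """Stat only the recorded directories; any mismatch means the cache is stale.

    Timestamps are compared for equality, never with < or >, so a clock that
    jumped backwards cannot hide a change.
    """
    for path, mtime_ns in recorded.items():
        try:
            if os.stat(path).st_mtime_ns != mtime_ns:
                return False
        except OSError:
            # The directory was removed or became unreadable.
            return False
    return True
```

A new subdirectory needs no special handling: creating it changes the mtime of
its parent, which is already in the recorded set.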


So why does just checking the timestamp of all plugin dirs work?  Because:

  - File and directory additions, moves, and deletions change the timestamp of
their parent directory, hence are detected by our code.

  - File copies and other in-place modifications are NOT detected.  BUT, such
things are not allowed anyway:  If you modify a file mmapped by another
process, you are going to crash the other process that is using the file.
Installs should always be done by the (fortunately, atomic) move operation,
not copy.

Also worth mentioning is that dumping the old cache and plugins is safe with
respect to other processes: A file is not deleted from disk as long as some
process holds an open mmap on it.

So, that's it.  It should all work by just stating directories, not files.  In
the case of fontconfig, it actually keeps the cache for each directory's fonts
in a separate cache file.  That's a tradeoff: more cache files, but cheaper
regeneration.  gst may want to keep separate cache files for system and user
plugins too.  That means, distro packages installing plugins can update the
system-wide cache once and each user does not have to do that.  Users would
need to update cache only for plugins they install in their home dir, or if
the system-wide cache is outdated (which is a distro bug if it happens).


Regards,

behdad


[1] http://mces.blogspot.com/2008/10/improving-login-time-part-1-gnome.html



-------------------------------------------------------------------------
This SF.Net email is sponsored by the Moblin Your Move Developer's challenge
Build the coolest Linux based applications with Moblin SDK & win great prizes
Grand prize is a trip for two to an Open Source event anywhere in the world
http://moblin-contest.org/redirect.php?banner_id=100&url=/
_______________________________________________
gstreamer-devel mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/gstreamer-devel

Re: On the plugin cache

Jan Schmidt-6
On Wed, 2008-11-05 at 21:51 -0500, Behdad Esfahbod wrote:
> Hi,
>
> So, I blamed the gst plugin cache in my blog entry [1] for taking some 35ms on
> every gst-using app startup.  It's only fair that I follow up with how I think
> this should be fixed.

I have a quilt stack that starts to address some of these problems. It
has some bugs, so I'm not ready to commit it yet, and it doesn't (yet)
handle stat'ing plugin directories to shortcut re-scanning.

What I have so far replaces the fork+stat method below with a version
that stats, and g_spawns a helper when discovering an out of date
plugin. The helper loads the plugin (or crashes) and feeds the details
back to the parent, and the parent incorporates the new plugin info into
its registry. When it's all done, the parent rewrites the registry
cache file if needed.

That improves the scanning procedure, which is obviously good - it
avoids forking unnecessarily, and avoids the cost of reloading the
registry cache in the parent. It's not yet clear to me how much that
saves though. From some tests, the cost of the fork is pretty low
(10ms), which implies that the bulk of the time you saw in your graph is
spent elsewhere. I suspect a big chunk of it may be the cost of creating
(here) 700 odd Plugin/Element/other GObject instances, and that's harder
to eliminate.

J.
--
Jan Schmidt <[hidden email]>



Re: On the plugin cache

Stefan Sauer
hi,

Jan Schmidt wrote:

> On Wed, 2008-11-05 at 21:51 -0500, Behdad Esfahbod wrote:
>> Hi,
>>
>> So, I blamed the gst plugin cache in my blog entry [1] for taking some 35ms on
>> every gst-using app startup.  It's only fair that I follow up with how I think
>> this should be fixed.
>
> I have a quilt stack that starts to address some of these problems. It
> has some bugs, so I'm not ready to commit it yet, and it doesn't (yet)
> handle stat'ing plugin directories to shortcut re-scanning.
>
> What I have so far replaces the fork+stat method below with a version
> that stats, and g_spawns a helper when discovering an out of date
> plugin. The helper loads the plugin (or crashes) and feeds the details
> back to the parent, and the parent incorporates the new plugin info into
> its registry. When it's all done, the parent rewrites the registry
> cache file if needed.
>
> That improves the scanning procedure, which is obviously good - it
> avoids forking unnecessarily, and avoids the cost of reloading the
> registry cache in the parent. It's not yet clear to me how much that
> saves though. From some tests, the cost of the fork is pretty low
> (10ms), which implies that the bulk of the time you saw in your graph is
> spent elsewhere. I suspect a big chunk of it may be the cost of creating
> (here) 700 odd Plugin/Element/other GObject instances, and that's harder
> to eliminate.
>
> J.

There is actually potential to improve the gobject creation time too. E.g.
http://bugzilla.gnome.org/show_bug.cgi?id=557047

Stefan


Re: On the plugin cache

Behdad Esfahbod-3
In reply to this post by Jan Schmidt-6
Jan Schmidt wrote:

> On Wed, 2008-11-05 at 21:51 -0500, Behdad Esfahbod wrote:
>> Hi,
>>
>> So, I blamed the gst plugin cache in my blog entry [1] for taking some 35ms on
>> every gst-using app startup.  It's only fair that I follow up with how I think
>> this should be fixed.
>
> I have a quilt stack that starts to address some of these problems. It
> has some bugs, so I'm not ready to commit it yet, and it doesn't (yet)
> handle stat'ing plugin directories to shortcut re-scanning.
>
> What I have so far replaces the fork+stat method below with a version
> that stats, and g_spawns a helper when discovering an out of date
> plugin. The helper loads the plugin (or crashes) and feeds the details
> back to the parent, and the parent incorporates the new plugin info into
> its registry. When it's all done, the parent rewrites the registry
> cache file if needed.

Interesting.  Though that would do a fork per plugin?

> That improves the scanning procedure, which is obviously good - it
> avoids forking unnecessarily, and avoids the cost of reloading the
> registry cache in the parent. It's not yet clear to me how much that
> saves though. From some tests, the cost of the fork is pretty low
> (10ms), which implies that the bulk of the time you saw in your graph is
> spent elsewhere. I suspect a big chunk of it may be the cost of creating
> (here) 700 odd Plugin/Element/other GObject instances, and that's harder
> to eliminate.

I suspect creating GObjects is taking much of that time, though I have not
measured.  I'm more suspicious of the stats.

behdad

> J.


Re: On the plugin cache

Jan Schmidt-6
On Fri, 2008-11-07 at 14:42 -0500, Behdad Esfahbod wrote:

> Jan Schmidt wrote:
> > On Wed, 2008-11-05 at 21:51 -0500, Behdad Esfahbod wrote:
> >> Hi,
> >>
> >> So, I blamed the gst plugin cache in my blog entry [1] for taking some 35ms on
> >> every gst-using app startup.  It's only fair that I follow up with how I think
> >> this should be fixed.
> >
> > I have a quilt stack that starts to address some of these problems. It
> > has some bugs, so I'm not ready to commit it yet, and it doesn't (yet)
> > handle stat'ing plugin directories to shortcut re-scanning.
> >
> > What I have so far replaces the fork+stat method below with a version
> > that stats, and g_spawns a helper when discovering an out of date
> > plugin. The helper loads the plugin (or crashes) and feeds the details
> > back to the parent, and the parent incorporates the new plugin info into
> > its registry. When it's all done, the parent rewrites the registry
> > cache file if needed.
>
> Interesting.  Though that would do a fork per plugin?

No, it forks a helper the first time and then has a little protocol back
and forth across the fd's.
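
To illustrate the shape of such an arrangement (not the actual patch, which is
not shown in this thread), here is a small Python sketch: one helper process is
started once, the parent sends it one plugin path per line, and reads one reply
line per path.  The tab-separated reply format, the name scan_plugins, and the
stand-in helper that only reports the length of the path are all assumptions
made for illustration.

```python
import subprocess
import sys

# Stand-in helper: a real one would load each plugin and report its features.
# This one only echoes the path and its length so the round trip is visible.
HELPER_SOURCE = r"""
import sys
for line in sys.stdin:
    path = line.rstrip("\n")
    print(f"{path}\t{len(path)}", flush=True)
"""


def scan_plugins(paths):
    """Query one long-lived helper about every path; stop early if it dies."""
    results = {}
    with subprocess.Popen(
        [sys.executable, "-c", HELPER_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    ) as helper:
        for path in paths:
            helper.stdin.write(path + "\n")
            helper.stdin.flush()
            reply = helper.stdout.readline()
            if not reply:
                break  # EOF: the helper exited or crashed on this plugin
            name, count = reply.rstrip("\n").split("\t")
            results[name] = int(count)
    return results
```

Only one process is created no matter how many plugins are out of date, and a
crash in the helper shows up in the parent as end-of-file on the pipe.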

> > That improves the scanning procedure, which is obviously good - it
> > avoids forking unnecessarily, and avoids the cost of reloading the
> > registry cache in the parent. It's not yet clear to me how much that
> > saves though. From some tests, the cost of the fork is pretty low
> > (10ms), which implies that the bulk of the time you saw in your graph is
> > spent elsewhere. I suspect a big chunk of it may be the cost of creating
> > (here) 700 odd Plugin/Element/other GObject instances, and that's harder
> > to eliminate.
>
> I suspect creating GObjects is taking much of that time, though I have not
> measured.  I'm more suspicious of the stats.

did you mean "isn't" taking much of that time?

I think we need some more measurements :)

J.
--
Jan Schmidt <[hidden email]>



Re: On the plugin cache

Simon Holm Thøgersen
In reply to this post by Behdad Esfahbod-3
[ Resending as this one didn't seem to reach the list. ]

On Wed, 2008-11-05 at 21:51 -0500, Behdad Esfahbod wrote:
> So how should the cache work?  By comparing the timestamp of each plugin dir
> to the recorded timestamp of that dir in the cache.  One must compare
> timestamps for equality, not for being more recent, as that is prone to clock
> skew false negatives.

I completely agree with you Behdad that excessive work is being done
when stating files and not just dirs. However, even with the current
design it is not where most of the time is spent.

The following is a profile of my laptop (Intel Pentium-m @1.5GHz)
running 'gst-launch-0.10 --gst-disable-registry-fork' with 157 plugins
present:

total 22 ms
  linking libs 4.1ms
  gst_init 17.9 ms
    misc 3.3 ms
    loading registry.i686.bin 12.4 ms
      crc 1.5 ms
      creating elements 10.9 ms
    stating 2.2 ms

I have a very simple patch that almost halves the time spent stating
that I did almost a year ago but never got around to posting; I'll do
that now.

Back then I also looked at speeding up the creation of elements.
According to a comment I made back then it should be possible to reduce
the time by 75%. I'm pretty sure that some of the changes broke ABI,
but I'm not sure whether those that did were responsible for the
speedup. I might try to port the work I did back then to the current
gstreamer.

Before I look more into this, it would be nice if others posted similar
profiles for their systems that could confirm this distribution. To
generate my profile I added a bit of custom debug for the crc part, but
the following should be sufficient for the rest.

export GST_DEBUG=3
time /usr/bin/gst-launch-0.10 --gst-disable-registry-fork

The value of doing the crc check seems pretty dubious to me btw; if
you've got disk corruptions there are plenty of other ways your system
could malfunction. It should be pretty easy to make it optional at load
time though.


Simon Holm Thøgersen



Re: On the plugin cache

Jan Schmidt-6
On Sat, 2008-11-08 at 10:25 +0100, Simon Holm Thøgersen wrote:

> [ Resending as this one didn't seem to reach the list. ]
>
> On Wed, 2008-11-05 at 21:51 -0500, Behdad Esfahbod wrote:
> > So how should the cache work?  By comparing the timestamp of each plugin dir
> > to the recorded timestamp of that dir in the cache.  One must compare
> > timestamps for equality, not for being more recent, as that is prone to clock
> > skew false negatives.
>
> I completely agree with you Behdad that excessive work is being done
> when stating files and not just dirs. However, even with the current
> design it is not where most of the time is spent.
>
> The following is a profile of my laptop (Intel Pentium-m @1.5GHz)
> running 'gst-launch-0.10 --gst-disable-registry-fork' with 157 plugins
> present:
>
> total 22 ms
>   linking libs 4.1ms
>   gst_init 17.9 ms
>     misc 3.3 ms
>     loading registry.i686.bin 12.4 ms
>       crc 1.5 ms
>       creating elements 10.9 ms
>     stating 2.2 ms

That raises an interesting point - it wasn't clear from Behdad's email
if his system is using the (new) binary registry cache format, or the
(old and slower) xml registry.

I see something like those timings here on my machine, including the
2.2-ish ms to stat things, on a machine with 174 plugins, 802 features.
(2.33Ghz Core 2 duo)

> I have a very simple patch that almost halves the time spent stating
> that I did almost a year ago but never got around to posting; I'll do
> that now.

Yes please :)

> Back then I also looked at speeding up the creation of elements.
> According to a comment I made back then it should be possible to reduce
> the time by 75%. I'm pretty sure that some of the changes broke ABI,
> but I'm not sure whether those that did were responsible for the
> speedup. I might try to port the work I did back then to the current
> gstreamer.

What was your approach for that?

> Before I look more into this, it would be nice if others posted similar
> profiles for their systems that could confirm this distribution. To
> generate my profile I added a bit of custom debug for the crc part, but
> the following should be sufficient for the rest.
>
> export GST_DEBUG=3
> time /usr/bin/gst-launch-0.10 --gst-disable-registry-fork
>
> The value of doing the crc check seems pretty dubious to me btw; if
> you've got disk corruptions there are plenty of other ways your system
> could malfunction. It should be pretty easy to make it optional at load
> time though.
>

I'm not sure what the rationale was for adding a CRC to the binary
registry format originally. It might have been done as a sanity check to
detect partially-written registry files, so they can be rebuilt without
crashing every GStreamer app.

J.
--
Jan Schmidt <[hidden email]>



Re: On the plugin cache

Behdad Esfahbod-3
In reply to this post by Jan Schmidt-6
Jan Schmidt wrote:
>> I suspect creating GObjects is taking much of that time, though I have not
>> measured.  I'm more suspicious of the stats.
>
> did you mean "isn't" taking much of that time?

Ah, right.

behdad

> I think we need some more measurements :)
>
> J.


Re: On the plugin cache

Behdad Esfahbod-3
In reply to this post by Jan Schmidt-6
Jan Schmidt wrote:

> On Sat, 2008-11-08 at 10:25 +0100, Simon Holm Thøgersen wrote:
>> [ Resending as this one didn't seem to reach the list. ]
>>
>> On Wed, 2008-11-05 at 21:51 -0500, Behdad Esfahbod wrote:
>>> So how should the cache work?  By comparing the timestamp of each plugin dir
>>> to the recorded timestamp of that dir in the cache.  One must compare
>>> timestamps for equality, not for being more recent, as that is prone to clock
>>> skew false negatives.
>> I completely agree with you Behdad that excessive work is being done
>> when stating files and not just dirs. However, even with the current
>> design it is not where most of the time is spent.
>>
>> The following is a profile of my laptop (Intel Pentium-m @1.5GHz)
>> running 'gst-launch-0.10 --gst-disable-registry-fork' with 157 plugins
>> present:
>>
>> total 22 ms
>>   linking libs 4.1ms
>>   gst_init 17.9 ms
>>     misc 3.3 ms
>>     loading registry.i686.bin 12.4 ms
>>       crc 1.5 ms
>>       creating elements 10.9 ms
>>     stating 2.2 ms
>
> That raises an interesting point - it wasn't clear from Behdad's email
> if his system is using the (new) binary registry cache format, or the
> (old and slower) xml registry.

I've been testing with Fedora Rawhide which has the binary cache.  Something
around gstreamer-0.10.21-1.fc10.i386.  Maybe a bit older.

> I see something like those timings here on my machine, including the
> 2.2-ish ms to stat things, on a machine with 174 plugins, 802 features.
> (2.33Ghz Core 2 duo)

Sure, it may not be the stats that are taking the time but the objects you
build from them.  It is still true that if you avoid checking on plugins on
each startup, the cost disappears.

>> The value of doing the crc check seems pretty dubious to me btw; if
>> you've got disk corruptions there are plenty of other ways your system
>> could malfunction. It should be pretty easy to make it optional at load
>> time though.
>
> I'm not sure what the rationale was for adding a CRC to the binary
> registry format originally. It might have been done as a sanity check to
> detect partially-written registry files, so they can be rebuilt without
> crashing every GStreamer app.

Yeah, the CRC check is totally bogus.  The cache must be built in a temp file
first, then moved to its final place.  Assuming a journaling filesystem,
there's no way you end up with a partially-written file.  The only way the CRC
can fail is disk failure or manual modification of the file.  Neither one is
particularly interesting.  Imagine what would happen if we did the same to
shared objects, fonts, icons, and any other kind of binary data :).
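
For reference, the temp-file-then-move pattern looks roughly like this in
Python.  The function name, the temp-file prefix, and the explicit fsync are
choices made for this sketch, not a description of what the gst registry code
does.

```python
import os
import tempfile


def write_cache_atomically(path, data):
    """Write data to a temp file beside path, then rename it into place."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem) so that
    # the final rename is a single atomic operation.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".registry-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Leave no half-written temp file behind on failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Readers that open the final path see either the complete old file or the
complete new one, never a partial write.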

behdad

> J.


Re: On the plugin cache

Simon Holm Thøgersen
[ CC Sebastian Dröge, who committed the crc code. Sebastian, please see
  bottom of mail. ]

On Mon, 2008-11-10 at 18:54 +0100, Behdad Esfahbod wrote:

> Jan Schmidt wrote:
> > On Sat, 2008-11-08 at 10:25 +0100, Simon Holm Thøgersen wrote:
> >> [ Resending as this one didn't seem to reach the list. ]
> >>
> >> On Wed, 2008-11-05 at 21:51 -0500, Behdad Esfahbod wrote:
> >>> So how should the cache work?  By comparing the timestamp of each plugin dir
> >>> to the recorded timestamp of that dir in the cache.  One must compare
> >>> timestamps for equality, not for being more recent, as that is prone to clock
> >>> skew false negatives.
> >> I completely agree with you Behdad that excessive work is being done
> >> when stating files and not just dirs. However, even with the current
> >> design it is not where most of the time is spent.
> >>
> >> The following is a profile of my laptop (Intel Pentium-m @1.5GHz)
> >> running 'gst-launch-0.10 --gst-disable-registry-fork' with 157 plugins
> >> present:
> >>
> >> total 22 ms
> >>   linking libs 4.1ms
> >>   gst_init 17.9 ms
> >>     misc 3.3 ms
> >>     loading registry.i686.bin 12.4 ms
> >>       crc 1.5 ms
> >>       creating elements 10.9 ms
> >>     stating 2.2 ms
> >
> > That raises an interesting point - it wasn't clear from Behdad's email
> > if his system is using the (new) binary registry cache format, or the
> > (old and slower) xml registry.
>
> I've been testing with Fedora Rawhide which has the binary cache.  Something
> around gstreamer-0.10.21-1.fc10.i386.  Maybe a bit older.
>
> > I see something like those timings here on my machine, including the
> > 2.2-ish ms to stat things, on a machine with 174 plugins, 802 features.
> > (2.33Ghz Core 2 duo)
>
> Sure, it may not be the stats that are taking the time but the objects you
> build from them.  It is still true that if you avoid checking on plugins on
> each startup, the cost disappears.
>
Well, I did the profiling now, and it is the building of objects that
takes more than 55% of the time in gst_registry_binary_read_cache. I'm
sorry to say that my remark about a patch with a 75% reduction was
completely wrong.

That is not to say that such a speed up isn't possible though. The
problem is that in order for gst_element_factory_make etc. to work we
must know about all names and types of plugins. Right now the registry
builds all the features up front and puts them on a list that can be
filtered, but there's really nothing to prevent using a simple index for
names and types and creating the objects lazily.
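
A rough Python sketch of that idea, to make it concrete: keep a cheap index
from name to the raw cache record, and build the full object only on first
lookup.  The Feature and LazyRegistry classes and their methods are
hypothetical stand-ins, not GStreamer types.

```python
class Feature:
    """Stand-in for a registry object that is expensive to construct."""

    def __init__(self, name, record):
        self.name = name
        self.record = record


class LazyRegistry:
    """Index of names to raw records; Feature objects are built on demand."""

    def __init__(self, index):
        self._index = dict(index)  # name -> raw record read from the cache
        self._built = {}           # name -> Feature, filled in lazily

    def names(self):
        # Listing names needs only the index, no object construction.
        return sorted(self._index)

    def lookup(self, name):
        feature = self._built.get(name)
        if feature is None:
            # Raises KeyError for a name that is not in the index.
            feature = Feature(name, self._index[name])
            self._built[name] = feature
        return feature

    def built_count(self):
        return len(self._built)
```

With this layout an application that only ever asks for a handful of elements
pays the construction cost for that handful, not for every feature in the
registry.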

This issue is completely orthogonal to not statting anything but
directories on startup.

I'll volunteer to file the bugs and write the patches for both unless
someone gives convincing arguments not to.

> >> The value of doing the crc check seems pretty dubious to me btw; if
> >> you've got disk corruptions there are plenty of other ways your system
> >> could malfunction. It should be pretty easy to make it optional at load
> >> time though.
> >
> > I'm not sure what the rationale was for adding a CRC to the binary
> > registry format originally. It might have been done as a sanity check to
> > detect partially-written registry files, so they can be rebuilt without
> > crashing every GStreamer app.
>
> Yeah, the CRC check is totally bogus.  The cache must be built in a temp file
> first, then moved to its final place.  Assuming a journaling filesystem,
> there's no way you end up with a partially-written file.  The only way the CRC
> can fail is disk failure or manual modification of the file.  Neither one is
> particularly interesting.  Imagine what would happen if we did the same to
> shared objects, fonts, icons, and any other kind of binary data :).
>
Sebastian, can you tell us the rationale for the crc check or should we just
file a bug report for the removal? I'm volunteering for this as well.


Simon Holm Thøgersen



Re: On the plugin cache

Sebastian Dröge
On Tue, 2008-11-11 at 16:13 +0100, Simon Holm Thøgersen wrote:

> > >> The value of doing the crc check seems pretty dubious to me btw; if
> > >> you've got disk corruptions there are plenty of other ways your system
> > >> could malfunction. It should be pretty easy to make it optional at load
> > >> time though.
> > >
> > > I'm not sure what the rationale was for adding a CRC to the binary
> > > registry format originally. It might have been done as a sanity check to
> > > detect partially-written registry files, so they can be rebuilt without
> > > crashing every GStreamer app.
> >
> > Yeah, the CRC check is totally bogus.  The cache must be built in a temp file
> > first, then moved to its final place.  Assuming a journaling filesystem,
> > there's no way you end up with a partially-written file.  The only way the CRC
> > can fail is disk failure or manual modification of the file.  Neither one is
> > particularly interesting.  Imagine what would happen if we did the same to
> > shared objects, fonts, icons, and any other kind of binary data :).
> >
> Sebastian, can you tell us the rationale for the crc check or should we just
> file a bug report for the removal? I'm volunteering for this as well.
The original rationale for the CRC check was to ensure that the
registry was a) completely written and b) wasn't corrupted by something
(memory failure, ....). I agree that a) could be done much more simply,
either just a single value that is written when the registry is finished
or the temp file & move thing which was written in this thread. b) is
still valid but you're right that, if this has happened, there are much
larger problems anyway.

Best would be to file a bug for removal of the CRC but I'm probably
going to do that tomorrow already...


Re: On the plugin cache

Simon Holm Thøgersen
On Tue, 2008-11-11 at 19:03 +0100, Sebastian Dröge wrote:

> On Tue, 2008-11-11 at 16:13 +0100, Simon Holm Thøgersen wrote:
>
> > > >> The value of doing the crc check seems pretty dubious to me btw; if
> > > >> you've got disk corruptions there are plenty of other ways your system
> > > >> could malfunction. It should be pretty easy to make it optional at load
> > > >> time though.
> > > >
> > > > I'm not sure what the rationale was for adding a CRC to the binary
> > > > registry format originally. It might have been done as a sanity check to
> > > > detect partially-written registry files, so they can be rebuilt without
> > > > crashing every GStreamer app.
> > >
> > > Yeah, the CRC check is totally bogus.  The cache must be built in a temp file
> > > first, then moved to its final place.  Assuming a journaling filesystem,
> > > there's no way you end up with a partially-written file.  The only way the CRC
> > > can fail is disk failure or manual modification of the file.  Neither one is
> > > particularly interesting.  Imagine what would happen if we did the same to
> > > shared objects, fonts, icons, and any other kind of binary data :).
> > >
> > Sebastian, can you tell us the rationale for the crc check or should we just
> > file a bug report for the removal? I'm volunteering for this as well.
>
> The original rationale for the CRC check was to ensure that the
> registry was a) completely written and b) wasn't corrupted by something
> (memory failure, ....). I agree that a) could be done much more simply,
> either just a single value that is written when the registry is finished
> or the temp file & move thing which was written in this thread.

The code already uses a temp file and then a rename, and it did so before
the crc code was introduced.

> b) is
> still valid but you're right that, if this has happened, there are much
> larger problems anyway.
>
> Best would be to file a bug for removal of the CRC but I'm probably
> going to do that tomorrow already...

I filed it as #503675. I'll submit a patch during the weekend if no one
beats me to it.


Simon

