gitlab.fd.o financial situation and impact on services

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Erik Faye-Lund
On Fri, 2020-02-28 at 10:43 +0000, Daniel Stone wrote:

> On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
> <[hidden email]> wrote:
> > On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
> > > Yeah, changes on vulkan drivers or backend compilers should be
> > > fairly
> > > sandboxed.
> > >
> > > We also have tools that only work for intel stuff, that should
> > > never
> > > trigger anything on other people's HW.
> > >
> > > Could something be worked out using the tags?
> >
> > I think so! We have the pre-defined environment variable
> > CI_MERGE_REQUEST_LABELS, and we can do variable conditions:
> >
> > https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables
> >
> > That sounds like a pretty neat middle-ground to me. I just hope
> > that
> > new pipelines are triggered if new labels are added, because not
> > everyone is allowed to set labels, and sometimes people forget...
>
> There's also this which is somewhat more robust:
> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569
>
>

I'm not sure it's more robust, but yeah, that's a useful tool too.

The reason I'm skeptical about the robustness is that we'll miss
testing if this misses a path. That can of course be fixed by testing
everything once things are in master, and fixing up that list when
something breaks on master.

The person who wrote a change knows more about its intricacies than a
computer ever will. But humans are also good at making mistakes, so I'm
not sure which one is better. Maybe the union of both?

As long as we have both rigorous testing after something has landed in
master (it doesn't necessarily need to happen right after, but for now
that's probably fine) and a reasonable heuristic for what testing is
needed pre-merge, I think we're good.
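
For concreteness, a minimal sketch of what the two approaches discussed
above might look like in .gitlab-ci.yml - the "intel" label, job names
and paths below are made up for illustration, not Mesa's actual config:

    # Label-based: run only when the MR carries an "intel" label
    intel-tools-test:
      only:
        variables:
          - $CI_MERGE_REQUEST_LABELS =~ /intel/

    # Path-based: run only when Intel-specific paths are touched
    intel-driver-test:
      only:
        changes:
          - src/intel/**/*

The label variant has the "people forget to set labels" problem
mentioned above; the path variant has the "the list can miss a path"
problem.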


Re: gitlab.fd.o financial situation and impact on services

Daniel Stone
In reply to this post by Jan Engelhardt
Hi Jan,

On Fri, 28 Feb 2020 at 10:09, Jan Engelhardt <[hidden email]> wrote:

> On Friday 2020-02-28 08:59, Daniel Stone wrote:
> >I believe that in January, we had $2082 of network cost (almost
> >entirely egress; ingress is basically free) and $1750 of
> >cloud-storage cost (almost all of which was download). That's based
> >on 16TB of cloud-storage (CI artifacts, container images, file
> >uploads, Git LFS) egress and 17.9TB of other egress (the web service
> >itself, repo activity). Projecting that out [×12 for a year] gives
> >us roughly $45k of network activity alone,
>
> I had come to a similar conclusion a few years back: it is not very
> economical to run ephemeral buildroots (or anything like them) between
> two (or more) "significant locations" of which one end is located in
> a Large Cloud datacenter like EC2/AWS/etc.
>
> For such use cases, my peers and I have used (other) offerings where
> there is 50 TB of free network traffic per month, and yes, that may
> have entailed doing more adminning than elsewhere - but an admin
> appreciates $2000 a lot more than a corporation does, too.

Yes, absolutely. For context, our storage & network costs have
increased >10x in the past 12 months (~$320 Jan 2019), >3x in the past
6 months (~$1350 July 2019), and ~2x in the past 3 months (~$2000 Oct
2019).

I do now (personally) think that it's crossed the point at which it
would be worthwhile paying an admin to solve the problems that cloud
services currently solve for us - which wasn't true before. Such an
admin could also deal with things like our SMTP delivery failure rate,
which has spiked over 50% in the past year (see previous email), as
well as demand for new services such as Discourse, which would enable
user support without either a) users having to subscribe to a mailing
list, or b) bug trackers being cluttered up with user requests and
other non-bugs.

Cheers,
Daniel

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Michel Dänzer
In reply to this post by Erik Faye-Lund
On 2020-02-28 10:28 a.m., Erik Faye-Lund wrote:
>
> We could also do stuff like reducing the amount of tests we run on each
> commit, and punt some testing to a per-weekend test-run or something
> like that. We don't *need* to know about every problem up front, just
> the stuff that's about to be released, really. The other stuff is just
> nice to have. If it's too expensive, I would say drop it.

I don't agree that pre-merge testing is just nice to have. A problem
which is only caught after it lands in mainline has a much bigger impact
than one which is already caught earlier.


--
Earthling Michel Dänzer               |               https://redhat.com
Libre software enthusiast             |             Mesa and X developer

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Michel Dänzer
In reply to this post by Erik Faye-Lund
On 2020-02-28 12:02 p.m., Erik Faye-Lund wrote:

> On Fri, 2020-02-28 at 10:43 +0000, Daniel Stone wrote:
>> On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
>> <[hidden email]> wrote:
>>> On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
>>>> Yeah, changes on vulkan drivers or backend compilers should be
>>>> fairly
>>>> sandboxed.
>>>>
>>>> We also have tools that only work for intel stuff, that should
>>>> never
>>>> trigger anything on other people's HW.
>>>>
>>>> Could something be worked out using the tags?
>>>
>>> I think so! We have the pre-defined environment variable
>>> CI_MERGE_REQUEST_LABELS, and we can do variable conditions:
>>>
>>> https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables
>>>
>>> That sounds like a pretty neat middle-ground to me. I just hope
>>> that
>>> new pipelines are triggered if new labels are added, because not
>>> everyone is allowed to set labels, and sometimes people forget...
>>
>> There's also this which is somewhat more robust:
>> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569
>
> I'm not sure it's more robust, but yeah, that's a useful tool too.
>
> The reason I'm skeptical about the robustness is that we'll miss
> testing if this misses a path.

Surely missing a path will happen less often than an MR missing a
label. (Users who aren't members of the project can't even set labels
on an MR.)


--
Earthling Michel Dänzer               |               https://redhat.com
Libre software enthusiast             |             Mesa and X developer

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Lionel Landwerlin-2
On 28/02/2020 13:46, Michel Dänzer wrote:

> On 2020-02-28 12:02 p.m., Erik Faye-Lund wrote:
>> On Fri, 2020-02-28 at 10:43 +0000, Daniel Stone wrote:
>>> On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
>>> <[hidden email]> wrote:
>>>> On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
>>>>> Yeah, changes on vulkan drivers or backend compilers should be
>>>>> fairly
>>>>> sandboxed.
>>>>>
>>>>> We also have tools that only work for intel stuff, that should
>>>>> never
>>>>> trigger anything on other people's HW.
>>>>>
>>>>> Could something be worked out using the tags?
>>>> I think so! We have the pre-defined environment variable
>>>> CI_MERGE_REQUEST_LABELS, and we can do variable conditions:
>>>>
>>>> https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables
>>>>
>>>> That sounds like a pretty neat middle-ground to me. I just hope
>>>> that
>>>> new pipelines are triggered if new labels are added, because not
>>>> everyone is allowed to set labels, and sometimes people forget...
>>> There's also this which is somewhat more robust:
>>> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569
>> I'm not sure it's more robust, but yeah, that's a useful tool too.
>>
>> The reason I'm skeptical about the robustness is that we'll miss
>> testing if this misses a path.
> Surely missing a path will happen less often than an MR missing a
> label. (Users who aren't members of the project can't even set labels
> on an MR.)
>
>
Sounds like a good alternative to tags.


-Lionel


Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services

Rob Clark
In reply to this post by Michel Dänzer
On Fri, Feb 28, 2020 at 3:43 AM Michel Dänzer <[hidden email]> wrote:

>
> On 2020-02-28 10:28 a.m., Erik Faye-Lund wrote:
> >
> > We could also do stuff like reducing the amount of tests we run on each
> > commit, and punt some testing to a per-weekend test-run or something
> > like that. We don't *need* to know about every problem up front, just
> > the stuff that's about to be released, really. The other stuff is just
> > nice to have. If it's too expensive, I would say drop it.
>
> I don't agree that pre-merge testing is just nice to have. A problem
> which is only caught after it lands in mainline has a much bigger impact
> than one which is already caught earlier.
>

One thought: since with mesa+margebot we effectively get at least
two(ish) CI runs per MR, i.e. one when it is initially pushed and one
when margebot rebases and tries to merge, could we leverage this to
have a trimmed-down pre-margebot CI that targets just the affected
drivers, with margebot doing a full CI run (when it is potentially
batching together multiple MRs)?

Seems like a way to reduce our CI runs with a good safety net to
prevent things from slipping through the cracks.

(Not sure how much that would help reduce bandwidth costs, but I guess
it should help a bit.)
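
A rough sketch of how that split could be expressed, assuming the merge
bot starts its pipelines under a user account literally called
"marge-bot" - job names and paths are invented for illustration:

    # Full suite only for pipelines started by the merge bot
    full-test-suite:
      only:
        variables:
          - $GITLAB_USER_LOGIN == "marge-bot"

    # Trimmed pre-merge job for everyone else, limited to one driver
    freedreno-quick-test:
      except:
        variables:
          - $GITLAB_USER_LOGIN == "marge-bot"
      only:
        changes:
          - src/freedreno/**/*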

BR,
-R

Re: [Intel-gfx] gitlab.fd.o financial situation and impact on services

Kristian Høgsberg
In reply to this post by Dave Airlie
On Thu, Feb 27, 2020 at 7:38 PM Dave Airlie <[hidden email]> wrote:

>
> On Fri, 28 Feb 2020 at 07:27, Daniel Vetter <[hidden email]> wrote:
> >
> > Hi all,
> >
> > You might have read the short take in the X.org board meeting minutes
> > already, here's the long version.
> >
> > The good news: gitlab.fd.o has become very popular with our
> > communities, and is used extensively. This especially includes all the
> > CI integration. Modern development process and tooling, yay!
> >
> > The bad news: The cost in growth has also been tremendous, and it's
> > breaking our bank account. With reasonable estimates for continued
> > growth we're expecting hosting expenses totalling 75k USD this year,
> > and 90k USD next year. With the current sponsors we've set up we can't
> > sustain that. We estimate that hosting expenses for gitlab.fd.o
> > without any of the CI features enabled would total 30k USD, which is
> > within X.org's ability to support through various sponsorships, mostly
> > through XDC.
> >
> > Note that X.org no longer sponsors any CI runners itself,
> > we've stopped that. The huge additional expenses are all just in
> > storing and serving build artifacts and images to outside CI runners
> > sponsored by various companies. A related topic is that with the
> > growth in fd.o it's becoming infeasible to maintain it all on
> > volunteer admin time. X.org is therefore also looking for admin
> > sponsorship, at least medium term.
> >
> > Assuming that we want cash flow reserves for one year of gitlab.fd.o
> > (without CI support) and a trimmed XDC and assuming no sponsor payment
> > meanwhile, we'd have to cut CI services somewhere between May and June
> > this year. The board is of course working on acquiring sponsors, but
> > filling a shortfall of this magnitude is neither easy nor quick work,
> > and we therefore decided to give an early warning as soon as possible.
> > Any help in finding sponsors for fd.o is very much appreciated.
>
> a) Ouch.
>
> b) we probably need to take a large step back here.

If we're taking a step back here, I also want to recognize what a
tremendous success this has been so far and thank everybody involved
for building something so useful. Between gitlab and the CI, our
workflow has improved and code quality has gone up.  I don't have
anything useful to add to the technical discussion, except that it
seems pretty standard engineering practice to build a system, observe
it, and identify and eliminate bottlenecks. Planning never
hurts, of course, but I don't think anybody could have realistically
modeled and projected the cost of this infrastructure as it's grown
organically and fast.

Kristian

> Look at this from a sponsor POV, why would I give X.org/fd.o
> sponsorship money that they are just giving straight to google to pay
> for hosting credits? Google are profiting in some minor way from these
> hosting credits being bought by us, and I assume we aren't getting any
> sort of discounts here. Having google sponsor the credits costs google
> substantially less than having any other company give us money to do
> it.
>
> If our current CI architecture is going to burn this amount of money a
> year and we hadn't worked this out in advance of deploying it then I
> suggest the system should be taken offline until we work out what a
> sustainable system would look like within the budget we have, whether
> that be never transferring containers and build artifacts from the
> google network, just having local runner/build combos etc.
>
> Dave.

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Eric Anholt
In reply to this post by Dave Airlie
On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie <[hidden email]> wrote:

>
> On Fri, 28 Feb 2020 at 18:18, Daniel Stone <[hidden email]> wrote:
> >
> > On Fri, 28 Feb 2020 at 03:38, Dave Airlie <[hidden email]> wrote:
> > > b) we probably need to take a large step back here.
> > >
> > > Look at this from a sponsor POV, why would I give X.org/fd.o
> > > sponsorship money that they are just giving straight to google to pay
> > > for hosting credits? Google are profiting in some minor way from these
> > > hosting credits being bought by us, and I assume we aren't getting any
> > > sort of discounts here. Having google sponsor the credits costs google
> > > substantially less than having any other company give us money to do
> > > it.
> >
> > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > comparable in terms of what you get and what you pay for them.
> > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > services are cheaper, but then you need to find someone who is going
> > to properly administer the various machines, install decent
> > monitoring, make sure that more storage is provisioned when we need
> > more storage (which is basically all the time), make sure that the
> > hardware is maintained in decent shape (pretty sure one of the fd.o
> > machines has had a drive in imminent-failure state for the last few
> > months), etc.
> >
> > Given the size of our service, that's a much better plan (IMO) than
> > relying on someone who a) isn't an admin by trade, b) has a million
> > other things to do, and c) hasn't wanted to do it for the past several
> > years. But as long as that's the resources we have, then we're paying
> > the cloud tradeoff, where we pay more money in exchange for fewer
> > problems.
>
> Admin for gitlab and CI is a full time role anyways. The system is
> definitely not self sustaining without time being put in by you and
> anholt still. If we have $75k to burn on credits, and it was diverted
> to just pay an admin to admin the real hw + gitlab/CI would that not
> be a better use of the money? I didn't know if we can afford $75k for
> an admin, but suddenly we can afford it for gitlab credits?

As I think about the time that I've spent at google in less than a
year on trying to keep the lights on for CI and optimize our
infrastructure in the current cloud environment, that's more than the
entire yearly budget you're talking about here.  Saying "let's just
pay for people to do more work instead of paying for full-service
cloud" is not a cost optimization.


> > Yes, we could federate everything back out so everyone runs their own
> > builds and executes those. Tinderbox did something really similar to
> > that IIRC; not sure if Buildbot does as well. Probably rules out
> > pre-merge testing, mind.
>
> Why? does gitlab not support the model? having builds done in parallel
> on runners closer to the test runners seems like it should be a thing.
> I guess artifact transfer would cost less then as a result.

Let's do some napkin math.  The biggest artifacts cost we have in Mesa
is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
downloaded by 4 freedreno and 6ish lava, about 100 pipelines/day,
makes ~1.8TB/month ($180 or so).  We could build local storage next
to the lava dispatcher so that the artifacts didn't have to contain
the rootfs that came from the container (~2/3 of the insides of the
zip file), but that's another service to build and maintain.  Building
the drivers once locally and storing them would save downloading the
other ~1/3 of the inside of the zip file, but that requires a big
enough system to do builds in time.
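
(Spelling that arithmetic out: 60 MB per zip x ~10 downloads per
pipeline x ~100 pipelines/day x 30 days comes to roughly 1.8 TB/month,
and at the ~$0.10/GB egress rate those figures imply, about $180/month.)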

I'm planning on doing a local filestore for google's lava lab, since I
need to be able to move our xml files off of the lava DUTs to get the
xml results we've become accustomed to, but this would not bubble up
to being a priority for my time if I wasn't doing it anyway.  Even if
it took me only a single day to set all this up (I estimate a couple of
weeks), that would cost my employer a lot more than sponsoring the
inefficiencies that the system has accumulated.

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Dave Airlie
On Sat, 29 Feb 2020 at 05:34, Eric Anholt <[hidden email]> wrote:

>
> On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie <[hidden email]> wrote:
> >
> > On Fri, 28 Feb 2020 at 18:18, Daniel Stone <[hidden email]> wrote:
> > >
> > > On Fri, 28 Feb 2020 at 03:38, Dave Airlie <[hidden email]> wrote:
> > > > b) we probably need to take a large step back here.
> > > >
> > > > Look at this from a sponsor POV, why would I give X.org/fd.o
> > > > sponsorship money that they are just giving straight to google to pay
> > > > for hosting credits? Google are profiting in some minor way from these
> > > > hosting credits being bought by us, and I assume we aren't getting any
> > > > sort of discounts here. Having google sponsor the credits costs google
> > > > substantially less than having any other company give us money to do
> > > > it.
> > >
> > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > > comparable in terms of what you get and what you pay for them.
> > > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > > services are cheaper, but then you need to find someone who is going
> > > to properly administer the various machines, install decent
> > > monitoring, make sure that more storage is provisioned when we need
> > > more storage (which is basically all the time), make sure that the
> > > hardware is maintained in decent shape (pretty sure one of the fd.o
> > > machines has had a drive in imminent-failure state for the last few
> > > months), etc.
> > >
> > > Given the size of our service, that's a much better plan (IMO) than
> > > relying on someone who a) isn't an admin by trade, b) has a million
> > > other things to do, and c) hasn't wanted to do it for the past several
> > > years. But as long as that's the resources we have, then we're paying
> > > the cloud tradeoff, where we pay more money in exchange for fewer
> > > problems.
> >
> > Admin for gitlab and CI is a full time role anyways. The system is
> > definitely not self sustaining without time being put in by you and
> > anholt still. If we have $75k to burn on credits, and it was diverted
> > to just pay an admin to admin the real hw + gitlab/CI would that not
> > be a better use of the money? I didn't know if we can afford $75k for
> > an admin, but suddenly we can afford it for gitlab credits?
>
> As I think about the time that I've spent at google in less than a
> year on trying to keep the lights on for CI and optimize our
> infrastructure in the current cloud environment, that's more than the
> entire yearly budget you're talking about here.  Saying "let's just
> pay for people to do more work instead of paying for full-service
> cloud" is not a cost optimization.
>
>
> > > Yes, we could federate everything back out so everyone runs their own
> > > builds and executes those. Tinderbox did something really similar to
> > > that IIRC; not sure if Buildbot does as well. Probably rules out
> > > pre-merge testing, mind.
> >
> > Why? does gitlab not support the model? having builds done in parallel
> > on runners closer to the test runners seems like it should be a thing.
> > I guess artifact transfer would cost less then as a result.
>
> Let's do some napkin math.  The biggest artifacts cost we have in Mesa
> is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
> downloaded by 4 freedreno and 6ish lava, about 100 pipelines/day,
> makes ~1.8TB/month ($180 or so).  We could build a local storage next
> to the lava dispatcher so that the artifacts didn't have to contain
> the rootfs that came from the container (~2/3 of the insides of the
> zip file), but that's another service to build and maintain.  Building
> the drivers once locally and storing it would save downloading the
> other ~1/3 of the inside of the zip file, but that requires a big
> enough system to do builds in time.
>
> I'm planning on doing a local filestore for google's lava lab, since I
> need to be able to move our xml files off of the lava DUTs to get the
> xml results we've become accustomed to, but this would not bubble up
> to being a priority for my time if I wasn't doing it anyway.  If it
> takes me a single day to set all this up (I estimate a couple of
> weeks), that costs my employer a lot more than sponsoring the costs of
> the inefficiencies of the system that has accumulated.

I'm not trying to knock the engineering work the CI contributors have
done at all, but I've never seen a real discussion about costs until
now. Engineers aren't accountants.

The thing we seem to be missing here is fiscal responsibility. I know
this email is us being fiscally responsible, but it's kinda after the
fact.

I cannot commit my employer to spending a large amount of money (> 0
actually) without a long and lengthy process with checks and bounds.
Can you?

The X.org board has budgets and procedures as well. I as a developer
of Mesa should not be able to commit the X.org foundation to spending
large amounts of money without checks and bounds.

The CI infrastructure lacks any checks and bounds. There is no link
between editing .gitlab-ci/* and cashflow. There is no link to me
adding support for a new feature to llvmpipe that blows out test times
(granted it won't affect CI budget but just an example).

The fact that clouds run on credit means that it's not possible to
budget 30K and say that when it runs out, it runs out; you end up
getting bills for ever-increasing amounts that you have to cover, with
nobody "responsible" for ever reducing those bills. "Higher, Faster,
Further, baby" comes to mind.

Has X.org actually allocated the remaining cash in its bank account
to this task previously? Were there plans for this money that can't be
executed now because we have to pay the cloud fees? If we continue to
May and the X.org bank account hits 0, can XDC happen?

Budgeting and cloud is hard; the feedback loops are messy. In the old
system the feedback loop was simple: if we didn't have admin time or
money for servers, we didn't get the features. Cloud allows us to get
the features and enjoy them, and at some point in the future the bill
gets paid by someone else. Credit-card lifestyles all the way.

Like maybe we can grow up here and find sponsors to cover all of this,
but it still feels a bit backwards from a fiscal pov.

Again I'm not knocking the work people have done at all, CI is very
valuable to the projects involved, but that doesn't absolve us from
costs.

Dave.

Re: gitlab.fd.o financial situation and impact on services

Matt Turner
In reply to this post by Daniel Stone
On Fri, Feb 28, 2020 at 12:00 AM Daniel Stone <[hidden email]> wrote:

>
> Hi Matt,
>
> On Thu, 27 Feb 2020 at 23:45, Matt Turner <[hidden email]> wrote:
> > We're paying 75K USD for the bandwidth to transfer data from the
> > GitLab cloud instance. i.e., for viewing the https site, for
> > cloning/updating git repos, and for downloading CI artifacts/images to
> > the testing machines (AFAIU).
>
> I believe that in January, we had $2082 of network cost (almost
> entirely egress; ingress is basically free) and $1750 of cloud-storage
> cost (almost all of which was download). That's based on 16TB of
> cloud-storage (CI artifacts, container images, file uploads, Git LFS)
> egress and 17.9TB of other egress (the web service itself, repo
> activity). Projecting that out gives us roughly $45k of network
> activity alone, so it looks like this figure is based on a projected
> increase of ~50%.
>
> The actual compute capacity is closer to $1150/month.

Could we have the full GCP bill posted?

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Daniel Vetter
In reply to this post by Dave Airlie
On Fri, Feb 28, 2020 at 9:31 PM Dave Airlie <[hidden email]> wrote:

>
> On Sat, 29 Feb 2020 at 05:34, Eric Anholt <[hidden email]> wrote:
> >
> > On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie <[hidden email]> wrote:
> > >
> > > On Fri, 28 Feb 2020 at 18:18, Daniel Stone <[hidden email]> wrote:
> > > >
> > > > On Fri, 28 Feb 2020 at 03:38, Dave Airlie <[hidden email]> wrote:
> > > > > b) we probably need to take a large step back here.
> > > > >
> > > > > Look at this from a sponsor POV, why would I give X.org/fd.o
> > > > > sponsorship money that they are just giving straight to google to pay
> > > > > for hosting credits? Google are profiting in some minor way from these
> > > > > hosting credits being bought by us, and I assume we aren't getting any
> > > > > sort of discounts here. Having google sponsor the credits costs google
> > > > > substantially less than having any other company give us money to do
> > > > > it.
> > > >
> > > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > > > comparable in terms of what you get and what you pay for them.
> > > > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > > > services are cheaper, but then you need to find someone who is going
> > > > to properly administer the various machines, install decent
> > > > monitoring, make sure that more storage is provisioned when we need
> > > > more storage (which is basically all the time), make sure that the
> > > > hardware is maintained in decent shape (pretty sure one of the fd.o
> > > > machines has had a drive in imminent-failure state for the last few
> > > > months), etc.
> > > >
> > > > Given the size of our service, that's a much better plan (IMO) than
> > > > relying on someone who a) isn't an admin by trade, b) has a million
> > > > other things to do, and c) hasn't wanted to do it for the past several
> > > > years. But as long as that's the resources we have, then we're paying
> > > > the cloud tradeoff, where we pay more money in exchange for fewer
> > > > problems.
> > >
> > > Admin for gitlab and CI is a full time role anyways. The system is
> > > definitely not self sustaining without time being put in by you and
> > > anholt still. If we have $75k to burn on credits, and it was diverted
> > > to just pay an admin to admin the real hw + gitlab/CI would that not
> > > be a better use of the money? I didn't know if we can afford $75k for
> > > an admin, but suddenly we can afford it for gitlab credits?
> >
> > As I think about the time that I've spent at google in less than a
> > year on trying to keep the lights on for CI and optimize our
> > infrastructure in the current cloud environment, that's more than the
> > entire yearly budget you're talking about here.  Saying "let's just
> > pay for people to do more work instead of paying for full-service
> > cloud" is not a cost optimization.
> >
> >
> > > > Yes, we could federate everything back out so everyone runs their own
> > > > builds and executes those. Tinderbox did something really similar to
> > > > that IIRC; not sure if Buildbot does as well. Probably rules out
> > > > pre-merge testing, mind.
> > >
> > > Why? does gitlab not support the model? having builds done in parallel
> > > on runners closer to the test runners seems like it should be a thing.
> > > I guess artifact transfer would cost less then as a result.
> >
> > Let's do some napkin math.  The biggest artifacts cost we have in Mesa
> > is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
> > downloaded by 4 freedreno and 6ish lava, about 100 pipelines/day,
> > makes ~1.8TB/month ($180 or so).  We could build a local storage next
> > to the lava dispatcher so that the artifacts didn't have to contain
> > the rootfs that came from the container (~2/3 of the insides of the
> > zip file), but that's another service to build and maintain.  Building
> > the drivers once locally and storing it would save downloading the
> > other ~1/3 of the inside of the zip file, but that requires a big
> > enough system to do builds in time.
> >
> > I'm planning on doing a local filestore for google's lava lab, since I
> > need to be able to move our xml files off of the lava DUTs to get the
> > xml results we've become accustomed to, but this would not bubble up
> > to being a priority for my time if I wasn't doing it anyway.  If it
> > takes me a single day to set all this up (I estimate a couple of
> > weeks), that costs my employer a lot more than sponsoring the costs of
> > the inefficiencies of the system that has accumulated.
>
> I'm not trying to knock the engineering work the CI contributors have
> done at all, but I've never seen a real discussion about costs until
> now. Engineers aren't accountants.
>
> The thing we seem to be missing here is fiscal responsibility. I know
> this email is us being fiscally responsible, but it's kinda after the
> fact.
>
> I cannot commit my employer to spending a large amount of money (> 0
> actually) without a long and lengthy process with checks and bounds.
> Can you?
>
> The X.org board has budgets and procedures as well. I as a developer
> of Mesa should not be able to commit the X.org foundation to spending
> large amounts of money without checks and bounds.
>
> The CI infrastructure lacks any checks and bounds. There is no link
> between editing .gitlab-ci/* and cashflow. There is no link to me
> adding support for a new feature to llvmpipe that blows out test times
> (granted it won't affect CI budget but just an example).

We're working to get the logging in place to know which projects
exactly burn through the money, so that we can take specific actions
if needed. So pretty soon you won't be able to just burn through
endless amounts of cash with a few gitlab-ci commits - or at least not
for long, until we catch you and you either fix things up or CI is
gone for your project.

> The fact that clouds run on credit means that it's not possible to say
> budget 30K and say when that runs out it runs out, you end up getting
> bills for ever increasing amounts that you have to cover, with nobody
> "responsible" for ever reducing those bills. Higher Faster Further
> baby comes to mind.

We're working on this, since it's the board's responsibility to be on
top of stuff. It's simply that we didn't expect massive growth of this
scale, this quickly, so we're a bit behind on the controlling aspect.

Also, I guess it wasn't clear, but the board decision yesterday was the
stop-loss order where we cut the cord (for CI at least). So yeah, the
short-term budget is firmly in place now.

> Has X.org actually allocated the remaining cash in its bank account
> to this task previously? Was there plans for this money that can't be
> executed now because we have to pay the cloud fees? If we continue to
> May and the X.org bank account hits 0, can XDC happen?

There are numbers elsewhere in this thread, but if you'd read the
original announcement, it states that the stop loss would still
guarantee that we can pay for everything for at least one year. We're
not going to get even close to 0 in the bank account.

So yeah, XDC happens, and it'll also still happen next year. The fd.o
servers will also keep running. The only thing we might need to switch
off is the CI support.

> Budgeting and cloud is hard, the feedback loops are messy. In the old
> system the feedback loop was simple, we don't have admin time or money
> for servers we don't get the features, cloud allows us to get the
> features and enjoy them and at some point in the future the bill gets
> paid by someone else. Credit cards lifestyles all the way.

Uh ... where exactly do you get the credit card approach from? SPI is
legally not allowed to extend us credit (we're not a legal org
anymore), so if we hit 0 it's out real quick. No credit for us. If SPI
isn't on top of that, it's their loss (but they're getting pretty good
at tracking stuff with the contractor they now have and all that).

Which is not going to happen, btw, if you've read the announcement mail
and all that.

Cheers, Daniel

> Like maybe we can grow up here and find sponsors to cover all of this,
> but it still feels a bit backwards from a fiscal pov.
>
> Again I'm not knocking the work people have done at all, CI is very
> valuable to the projects involved, but that doesn't absolve us from
> costs.
>
> Dave.



--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Nuritzi Sanchez

Hi All, 


I know there's been a lot of discussion already, but I wanted to respond to Daniel's original post.


I joined GitLab earlier this month as their new Open Source Program Manager [1] and wanted to introduce myself here since I’ll be involved from the GitLab side as we work together to problem-solve the financial situation here. My role at GitLab is to help make it easier for Open Source organizations to migrate (by helping to smooth out some of the current pain points), and to help advocate internally for changes to the product and our workflows to make GitLab better for Open Source orgs. We want to make sure that our Open Source community feels supported beyond just migration. As such, I’ll be running the GitLab Open Source Program [2].


My background is that I’m the former President and Chairperson of the GNOME Foundation, which is one of the earliest Free Software projects to migrate to GitLab. GNOME initially faced some limitations with the CI runner costs too, but thanks to generous support from donors, it has not run into those issues in recent times. I know there's already a working relationship between our communities, but it could be good to examine what GNOME and KDE have done and see if there's anything we can apply here. We've reached out to Daniel Stone, our main contact for the freedesktop.org migration, and he has gotten us in touch with Daniel V. and the X.Org Foundation Board to learn more about what's already been done and what we can do next.


Please bear with me as I continue to get ramped up in my new job, but I’d like to offer as much support as possible with this issue. We’ll be exploring ways for GitLab to help make sure there isn’t a gap in coverage during the time that freedesktop looks for sponsors. I know that on GitLab’s side, supporting our Open Source user community is a priority.


Best, 

Nuritzi


[1] https://about.gitlab.com/company/team/#nuritzi

[2] https://about.gitlab.com/handbook/marketing/community-relations/opensource-program/


On Fri, Feb 28, 2020 at 1:22 PM Daniel Vetter <[hidden email]> wrote:
On Fri, Feb 28, 2020 at 9:31 PM Dave Airlie <[hidden email]> wrote:
>
> On Sat, 29 Feb 2020 at 05:34, Eric Anholt <[hidden email]> wrote:
> >
> > On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie <[hidden email]> wrote:
> > >
> > > On Fri, 28 Feb 2020 at 18:18, Daniel Stone <[hidden email]> wrote:
> > > >
> > > > On Fri, 28 Feb 2020 at 03:38, Dave Airlie <[hidden email]> wrote:
> > > > > b) we probably need to take a large step back here.
> > > > >
> > > > > Look at this from a sponsor POV, why would I give X.org/fd.o
> > > > > sponsorship money that they are just giving straight to google to pay
> > > > > for hosting credits? Google are profiting in some minor way from these
> > > > > hosting credits being bought by us, and I assume we aren't getting any
> > > > > sort of discounts here. Having google sponsor the credits costs google
> > > > > substantially less than having any other company give us money to do
> > > > > it.
> > > >
> > > > The last I looked, Google GCP / Amazon AWS / Azure were all pretty
> > > > comparable in terms of what you get and what you pay for them.
> > > > Obviously providers like Packet and Digital Ocean who offer bare-metal
> > > > services are cheaper, but then you need to find someone who is going
> > > > to properly administer the various machines, install decent
> > > > monitoring, make sure that more storage is provisioned when we need
> > > > more storage (which is basically all the time), make sure that the
> > > > hardware is maintained in decent shape (pretty sure one of the fd.o
> > > > machines has had a drive in imminent-failure state for the last few
> > > > months), etc.
> > > >
> > > > Given the size of our service, that's a much better plan (IMO) than
> > > > relying on someone who a) isn't an admin by trade, b) has a million
> > > > other things to do, and c) hasn't wanted to do it for the past several
> > > > years. But as long as that's the resources we have, then we're paying
> > > > the cloud tradeoff, where we pay more money in exchange for fewer
> > > > problems.
> > >
> > > Admin for gitlab and CI is a full time role anyways. The system is
> > > definitely not self sustaining without time being put in by you and
> > > anholt still. If we have $75k to burn on credits, and it was diverted
> > > to just pay an admin to admin the real hw + gitlab/CI would that not
> > > be a better use of the money? I didn't know if we can afford $75k for
> > > an admin, but suddenly we can afford it for gitlab credits?
> >
> > As I think about the time that I've spent at google in less than a
> > year on trying to keep the lights on for CI and optimize our
> > infrastructure in the current cloud environment, that's more than the
> > entire yearly budget you're talking about here.  Saying "let's just
> > pay for people to do more work instead of paying for full-service
> > cloud" is not a cost optimization.
> >
> >
> > > > Yes, we could federate everything back out so everyone runs their own
> > > > builds and executes those. Tinderbox did something really similar to
> > > > that IIRC; not sure if Buildbot does as well. Probably rules out
> > > > pre-merge testing, mind.
> > >
> > > Why? does gitlab not support the model? having builds done in parallel
> > > on runners closer to the test runners seems like it should be a thing.
> > > I guess artifact transfer would cost less then as a result.
> >
> > Let's do some napkin math.  The biggest artifacts cost we have in Mesa
> > is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
> > downloaded by 4 freedreno and 6ish lava, about 100 pipelines/day,
> > makes ~1.8TB/month ($180 or so).  We could build a local storage next
> > to the lava dispatcher so that the artifacts didn't have to contain
> > the rootfs that came from the container (~2/3 of the insides of the
> > zip file), but that's another service to build and maintain.  Building
> > the drivers once locally and storing it would save downloading the
> > other ~1/3 of the inside of the zip file, but that requires a big
> > enough system to do builds in time.
> >
> > I'm planning on doing a local filestore for google's lava lab, since I
> > need to be able to move our xml files off of the lava DUTs to get the
> > xml results we've become accustomed to, but this would not bubble up
> > to being a priority for my time if I wasn't doing it anyway.  If it
> > takes me a single day to set all this up (I estimate a couple of
> > weeks), that costs my employer a lot more than sponsoring the costs of
> > the inefficiencies of the system that has accumulated.
>
> I'm not trying to knock the engineering works the CI contributors have
> done at all, but I've never seen a real discussion about costs until
> now. Engineers aren't accountants.
>
> The thing we seem to be missing here is fiscal responsibility. I know
> this email is us being fiscally responsible, but it's kinda after the
> fact.
>
> I cannot commit my employer to spending a large amount of money (> 0
> actually) without a long and lengthy process with checks and bounds.
> Can you?
>
> The X.org board has budgets and procedures as well. I as a developer
> of Mesa should not be able to commit the X.org foundation to spending
> large amounts of money without checks and bounds.
>
> The CI infrastructure lacks any checks and bounds. There is no link
> between editing .gitlab-ci/* and cashflow. There is no link to me
> adding support for a new feature to llvmpipe that blows out test times
> (granted it won't affect CI budget but just an example).

We're working to get the logging in place to know which projects
exactly burn down the money so that we can take specific actions. If
needed. So pretty soon you wont be able to just burn down endless
amounts of cash with a few gitlab-ci commits. Or at least not for long
until we catch you and you either fix things up or CI is gone for your
project.

> The fact that clouds run on credit means that it's not possible to say
> budget 30K and say when that runs out it runs out, you end up getting
> bills for ever increasing amounts that you have to cover, with nobody
> "responsible" for ever reducing those bills. Higher Faster Further
> baby comes to mind.

We're working on this, since it's the boards responsibility to be on
top of stuff. It's simply that we didn't expect a massive growth of
this scale and this quickly, so we're a bit behind on the controlling
aspect.

Also I guess it wasnt clear, but the board decision yesterday was the
stop loss order where we cut the cord (for CI at least). So yeah the
short term budget is firmly in place now.

> Has X.org actually allocated the remaining cash in it's bank account
> to this task previously? Was there plans for this money that can't be
> executed now because we have to pay the cloud fees? If we continue to
> May and the X.org bank account hits 0, can XDC happen?

There's numbers elsewhere in this thread, but if you'd read the
original announcement it states that the stop loss would still
guarantee that we can pay for everything for at least one year. We're
not going to get even close to 0 in the bank account.

So yeah XDC happens, and it'll also still happen next year. Also fd.o
servers will keep running. The only thing we might need to switch off
is the CI support.

> Budgeting and cloud is hard, the feedback loops are messy. In the old
> system the feedback loop was simple, we don't have admin time or money
> for servers we don't get the features, cloud allows us to get the
> features and enjoy them and at some point in the future the bill gets
> paid by someone else. Credit cards lifestyles all the way.

Uh ... where exactly do you get the credit card approach from? SPI is
legally not allowed to extend us a credit (we're not a legal org
anymore), so if we hit 0 it's out real quick. No credit for us. If SPI
isnt on top of that it's their loss (but they're getting pretty good
at tracking stuff with the contractor they now have and all that).

Which is not going to happen btw, if you've read the announcement mail
and all that.

Cheers, Daniel

> Like maybe we can grow up here and find sponsors to cover all of this,
> but it still feels a bit backwards from a fiscal pov.
>
> Again I'm not knocking the work people have done at all, CI is very
> valuable to the projects involved, but that doesn't absolve us from
> costs.
>
> Dave.



--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch


--
Nuritzi Sanchez
Senior Open Source Program Manager | GitLab

Create, Collaborate, and Deploy together


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Nicholas Krause
In reply to this post by Daniel Vetter


On 2/28/20 4:22 PM, Daniel Vetter wrote:

> On Fri, Feb 28, 2020 at 9:31 PM Dave Airlie <[hidden email]> wrote:
>> On Sat, 29 Feb 2020 at 05:34, Eric Anholt <[hidden email]> wrote:
>>> On Fri, Feb 28, 2020 at 12:48 AM Dave Airlie <[hidden email]> wrote:
>>>> On Fri, 28 Feb 2020 at 18:18, Daniel Stone <[hidden email]> wrote:
>>>>> On Fri, 28 Feb 2020 at 03:38, Dave Airlie <[hidden email]> wrote:
>>>>>> b) we probably need to take a large step back here.
>>>>>>
>>>>>> Look at this from a sponsor POV, why would I give X.org/fd.o
>>>>>> sponsorship money that they are just giving straight to google to pay
>>>>>> for hosting credits? Google are profiting in some minor way from these
>>>>>> hosting credits being bought by us, and I assume we aren't getting any
>>>>>> sort of discounts here. Having google sponsor the credits costs google
>>>>>> substantially less than having any other company give us money to do
>>>>>> it.
>>>>> The last I looked, Google GCP / Amazon AWS / Azure were all pretty
>>>>> comparable in terms of what you get and what you pay for them.
>>>>> Obviously providers like Packet and Digital Ocean who offer bare-metal
>>>>> services are cheaper, but then you need to find someone who is going
>>>>> to properly administer the various machines, install decent
>>>>> monitoring, make sure that more storage is provisioned when we need
>>>>> more storage (which is basically all the time), make sure that the
>>>>> hardware is maintained in decent shape (pretty sure one of the fd.o
>>>>> machines has had a drive in imminent-failure state for the last few
>>>>> months), etc.
>>>>>
>>>>> Given the size of our service, that's a much better plan (IMO) than
>>>>> relying on someone who a) isn't an admin by trade, b) has a million
>>>>> other things to do, and c) hasn't wanted to do it for the past several
>>>>> years. But as long as that's the resources we have, then we're paying
>>>>> the cloud tradeoff, where we pay more money in exchange for fewer
>>>>> problems.
>>>> Admin for gitlab and CI is a full time role anyways. The system is
>>>> definitely not self sustaining without time being put in by you and
>>>> anholt still. If we have $75k to burn on credits, and it was diverted
>>>> to just pay an admin to admin the real hw + gitlab/CI would that not
>>>> be a better use of the money? I didn't know if we can afford $75k for
>>>> an admin, but suddenly we can afford it for gitlab credits?
>>> As I think about the time that I've spent at google in less than a
>>> year on trying to keep the lights on for CI and optimize our
>>> infrastructure in the current cloud environment, that's more than the
>>> entire yearly budget you're talking about here.  Saying "let's just
>>> pay for people to do more work instead of paying for full-service
>>> cloud" is not a cost optimization.
>>>
>>>
>>>>> Yes, we could federate everything back out so everyone runs their own
>>>>> builds and executes those. Tinderbox did something really similar to
>>>>> that IIRC; not sure if Buildbot does as well. Probably rules out
>>>>> pre-merge testing, mind.
>>>> Why? does gitlab not support the model? having builds done in parallel
>>>> on runners closer to the test runners seems like it should be a thing.
>>>> I guess artifact transfer would cost less then as a result.
>>> Let's do some napkin math.  The biggest artifacts cost we have in Mesa
>>> is probably meson-arm64/meson-arm (60MB zipped from meson-arm64,
>>> downloaded by 4 freedreno and 6ish lava, about 100 pipelines/day,
>>> makes ~1.8TB/month ($180 or so).  We could build a local storage next
>>> to the lava dispatcher so that the artifacts didn't have to contain
>>> the rootfs that came from the container (~2/3 of the insides of the
>>> zip file), but that's another service to build and maintain.  Building
>>> the drivers once locally and storing it would save downloading the
>>> other ~1/3 of the inside of the zip file, but that requires a big
>>> enough system to do builds in time.
>>>
>>> I'm planning on doing a local filestore for google's lava lab, since I
>>> need to be able to move our xml files off of the lava DUTs to get the
>>> xml results we've become accustomed to, but this would not bubble up
>>> to being a priority for my time if I wasn't doing it anyway.  If it
>>> takes me a single day to set all this up (I estimate a couple of
>>> weeks), that costs my employer a lot more than sponsoring the costs of
>>> the inefficiencies of the system that has accumulated.
>> I'm not trying to knock the engineering works the CI contributors have
>> done at all, but I've never seen a real discussion about costs until
>> now. Engineers aren't accountants.
>>
>> The thing we seem to be missing here is fiscal responsibility. I know
>> this email is us being fiscally responsible, but it's kinda after the
>> fact.
>>
>> I cannot commit my employer to spending a large amount of money (> 0
>> actually) without a long and lengthy process with checks and bounds.
>> Can you?
>>
>> The X.org board has budgets and procedures as well. I as a developer
>> of Mesa should not be able to commit the X.org foundation to spending
>> large amounts of money without checks and bounds.
>>
>> The CI infrastructure lacks any checks and bounds. There is no link
>> between editing .gitlab-ci/* and cashflow. There is no link to me
>> adding support for a new feature to llvmpipe that blows out test times
>> (granted it won't affect CI budget but just an example).
> We're working to get the logging in place to know which projects
> exactly burn down the money so that we can take specific actions. If
> needed. So pretty soon you wont be able to just burn down endless
> amounts of cash with a few gitlab-ci commits. Or at least not for long
> until we catch you and you either fix things up or CI is gone for your
> project.
>
>> The fact that clouds run on credit means that it's not possible to say
>> budget 30K and say when that runs out it runs out, you end up getting
>> bills for ever increasing amounts that you have to cover, with nobody
>> "responsible" for ever reducing those bills. Higher Faster Further
>> baby comes to mind.
> We're working on this, since it's the boards responsibility to be on
> top of stuff. It's simply that we didn't expect a massive growth of
> this scale and this quickly, so we're a bit behind on the controlling
> aspect.
>
> Also I guess it wasnt clear, but the board decision yesterday was the
> stop loss order where we cut the cord (for CI at least). So yeah the
> short term budget is firmly in place now.
>
>> Has X.org actually allocated the remaining cash in it's bank account
>> to this task previously? Was there plans for this money that can't be
>> executed now because we have to pay the cloud fees? If we continue to
>> May and the X.org bank account hits 0, can XDC happen?
> There's numbers elsewhere in this thread, but if you'd read the
> original announcement it states that the stop loss would still
> guarantee that we can pay for everything for at least one year. We're
> not going to get even close to 0 in the bank account.
>
> So yeah XDC happens, and it'll also still happen next year. Also fd.o
> servers will keep running. The only thing we might need to switch off
> is the CI support.
>
>> Budgeting and cloud is hard, the feedback loops are messy. In the old
>> system the feedback loop was simple, we don't have admin time or money
>> for servers we don't get the features, cloud allows us to get the
>> features and enjoy them and at some point in the future the bill gets
>> paid by someone else. Credit cards lifestyles all the way.
> Uh ... where exactly do you get the credit card approach from? SPI is
> legally not allowed to extend us a credit (we're not a legal org
> anymore), so if we hit 0 it's out real quick. No credit for us. If SPI
> isnt on top of that it's their loss (but they're getting pretty good
> at tracking stuff with the contractor they now have and all that).
>
> Which is not going to happen btw, if you've read the announcement mail
> and all that.
>
> Cheers, Daniel
Sorry to enter mid-conversation. You may want to look at how the GCC
test farm or the Yocto Project does it. I do get that they're different
projects, but they seem to be managing fine. I'm not sure of their
funding, but I do recall that a lot of the machines I use for work on
the farm were donated by IBM or some other company.

Not sure if you can get a company (or companies) to donate some
machines and store them in a data center, like GCC or the Yocto
Project do.

Nick

>> Like maybe we can grow up here and find sponsors to cover all of this,
>> but it still feels a bit backwards from a fiscal pov.
>>
>> Again I'm not knocking the work people have done at all, CI is very
>> valuable to the projects involved, but that doesn't absolve us from
>> costs.
>>
>> Dave.
>
>


Re: [Intel-gfx] [Mesa-dev] gitlab.fd.o financial situation and impact on services

Jason Ekstrand
In reply to this post by Rob Clark
On Fri, Feb 28, 2020 at 11:00 AM Rob Clark <[hidden email]> wrote:

>
> On Fri, Feb 28, 2020 at 3:43 AM Michel Dänzer <[hidden email]> wrote:
> >
> > On 2020-02-28 10:28 a.m., Erik Faye-Lund wrote:
> > >
> > > We could also do stuff like reducing the amount of tests we run on each
> > > commit, and punt some testing to a per-weekend test-run or something
> > > like that. We don't *need* to know about every problem up front, just
> > > the stuff that's about to be released, really. The other stuff is just
> > > nice to have. If it's too expensive, I would say drop it.
> >
> > I don't agree that pre-merge testing is just nice to have. A problem
> > which is only caught after it lands in mainline has a much bigger impact
> > than one which is already caught earlier.
> >
>
> One thought: since with mesa+margebot we effectively get at least
> two(ish) CI runs per MR, i.e. one when it is initially pushed and one
> when margebot rebases and tries to merge, could we leverage this to
> have a trimmed-down pre-margebot CI which tries to target just the
> affected drivers, with margebot doing a full CI run (when it is
> potentially batching together multiple MRs)?
>
> Seems like a way to reduce our CI runs with a good safety net to
> prevent things from slipping through the cracks.

Here are a couple more hopefully constructive but possibly bogus ideas:

 1. Suggest people put their CI farms behind a Squid transparent
caching proxy.  There seem to be many how-tos on the internet for
doing this and it shouldn't be terribly hard.  Maybe GitLab uses too
much HTTPS and that messes things up?  If not, this would cut
downloads to one per farm rather than one per machine.

 2. Add -Dstrip=true to the meson config.  We want asserts, but do we
really need those debug symbols?  From quick testing on my machine, it
seems to reduce the size of build artifacts by about 60%.
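
To make the second idea concrete, here is a minimal sketch of what such
a build job could look like (the job name, paths and extra options are
illustrative, not Mesa's actual CI config):

  build-sketch:
    script:
      # -Dstrip=true strips debug symbols at install time,
      # -Db_ndebug=false keeps asserts enabled
      - meson setup builddir -Dstrip=true -Db_ndebug=false
      - ninja -C builddir
      - DESTDIR=$PWD/install ninja -C builddir install
    artifacts:
      paths:
        - install/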

Feel free to tell the peanut gallery (me) why I'm wrong. :-)

--Jason

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Timur Kristóf
In reply to this post by Daniel Stone
On Fri, 2020-02-28 at 10:43 +0000, Daniel Stone wrote:

> On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
> <[hidden email]> wrote:
> > On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
> > > Yeah, changes on vulkan drivers or backend compilers should be
> > > fairly
> > > sandboxed.
> > >
> > > We also have tools that only work for intel stuff, that should
> > > never
> > > trigger anything on other people's HW.
> > >
> > > Could something be worked out using the tags?
> >
> > I think so! We have the pre-defined environment variable
> > CI_MERGE_REQUEST_LABELS, and we can do variable conditions:
> >
> > https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables
> >
> > That sounds like a pretty neat middle-ground to me. I just hope
> > that
> > new pipelines are triggered if new labels are added, because not
> > everyone is allowed to set labels, and sometimes people forget...
>
> There's also this which is somewhat more robust:
> https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569

My 20 cents:

1. I think we should completely disable running the CI on MRs which are
marked WIP. Speaking from personal experience, I usually make a lot of
changes to my MRs before they are merged, so it is a waste of CI
resources.

2. Maybe we could take this one step further and allow the CI to be
triggered only manually instead of automatically on every push.

3. I completely agree with Pierre-Eric on MR 2569: let's not run the
full CI pipeline on every change, only those parts which are affected
by the change. It not only costs money, but it is also frustrating when
you submit a change and get failures from a completely unrelated
driver.
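
For the first point, a possible sketch using GitLab's workflow rules
(just an illustration; the exact WIP/Draft title prefixes and the
minimum GitLab version supporting workflow:rules would need checking,
and this is not an existing Mesa rule):

  workflow:
    rules:
      # don't create pipelines for merge requests still marked WIP/Draft
      - if: '$CI_MERGE_REQUEST_TITLE =~ /^(WIP|Draft):/'
        when: never
      - when: always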

Best regards,
Timur


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Nicolas Dufresne-5
On Saturday, 29 February 2020 at 19:14 +0100, Timur Kristóf wrote:

> On Fri, 2020-02-28 at 10:43 +0000, Daniel Stone wrote:
> > On Fri, 28 Feb 2020 at 10:06, Erik Faye-Lund
> > <[hidden email]> wrote:
> > > On Fri, 2020-02-28 at 11:40 +0200, Lionel Landwerlin wrote:
> > > > Yeah, changes on vulkan drivers or backend compilers should be
> > > > fairly
> > > > sandboxed.
> > > >
> > > > We also have tools that only work for intel stuff, that should
> > > > never
> > > > trigger anything on other people's HW.
> > > >
> > > > Could something be worked out using the tags?
> > >
> > > I think so! We have the pre-defined environment variable
> > > CI_MERGE_REQUEST_LABELS, and we can do variable conditions:
> > >
> > > https://docs.gitlab.com/ee/ci/yaml/#onlyvariablesexceptvariables
> > >
> > > That sounds like a pretty neat middle-ground to me. I just hope
> > > that
> > > new pipelines are triggered if new labels are added, because not
> > > everyone is allowed to set labels, and sometimes people forget...
> >
> > There's also this which is somewhat more robust:
> > https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2569
>
> My 20 cents:
>
> 1. I think we should completely disable running the CI on MRs which are
> marked WIP. Speaking from personal experience, I usually make a lot of
> changes to my MRs before they are merged, so it is a waste of CI
> resources.

In the meantime, you can help by getting into the habit of using:

  git push -o ci.skip

CI is in fact run for all branches that you push. When we (the
GStreamer Project) started our CI we wanted to limit this to MRs, but
we haven't found a good way yet (and GitLab is not helping much). The
main issue is that it's near impossible to use the GitLab web API from
a runner (it requires a private key, in an all-or-nothing manner). But
with the current situation we are revisiting this.

The truth is that probably every CI setup has a lot of room for
optimization, but optimizing can be really time-consuming. So until we
have a reason to, we live with inefficiencies like over-sized
artifacts, unused artifacts, over-sized Docker images, etc. Doing a new
round of optimization is obviously a clear short-term goal for
projects, including the GStreamer project. We have discussions going on
and are trying to find solutions. Notably, we would like to get rid of
the post-merge CI, since in a rebase flow like the one we have in
GStreamer it's only a minor risk.

>
> 2. Maybe we could take this one step further and only allow the CI to
> be only triggered manually instead of automatically on every push.
>
> 3. I completely agree with Pierre-Eric on MR 2569, let's not run the
> full CI pipeline on every change, only those parts which are affected
> by the change. It not only costs money, but is also frustrating when
> you submit a change and you get unrelated failures from a completely
> unrelated driver.

That's a much more difficult goal than it looks. Let each project
manage its own CI graph and content, as each case is unique. Running
more tests or building more code isn't the main issue, as the CPU time
is mostly sponsored. The data transfers between GitLab's cloud and the
runners (which are external), along with sending OS images to the LAVA
labs, are likely the most expensive part.

As was already mentioned in the thread, what we are missing now, and
what is being worked on, are per-group/project statistics that show us
the hotspots so we can better target the optimization work.
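
Once those statistics exist, artifact settings are one of the more
obvious knobs. A minimal sketch of trimming a job's artifacts (the
paths and expiry here are illustrative, not GStreamer's actual
configuration):

  build:
    script:
      - meson setup builddir
      - ninja -C builddir
    artifacts:
      # keep only what downstream jobs actually consume, and expire it
      paths:
        - builddir/meson-logs/
      expire_in: 3 days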

>
> Best regards,
> Timur
>
> _______________________________________________
> gstreamer-devel mailing list
> [hidden email]
> https://lists.freedesktop.org/mailman/listinfo/gstreamer-devel


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Timur Kristóf
On Sat, 2020-02-29 at 14:46 -0500, Nicolas Dufresne wrote:

> >
> > 1. I think we should completely disable running the CI on MRs which
> > are
> > marked WIP. Speaking from personal experience, I usually make a lot
> > of
> > changes to my MRs before they are merged, so it is a waste of CI
> > resources.
>
> In the mean time, you can help by taking the habit to use:
>
>   git push -o ci.skip

Thanks for the advice, I wasn't aware such an option existed. Does this
also work on the Mesa GitLab, or is this a GStreamer-only thing?

How hard would it be to make this the default?

> That's a much more difficult goal then it looks like. Let each
> projects
> manage their CI graph and content, as each case is unique. Running
> more
> tests, or building more code isn't the main issue as the CPU time is
> mostly sponsored. The data transfers between the cloud of gitlab and
> the runners (which are external), along to sending OS image to Lava
> labs is what is likely the most expensive.
>
> As it was already mention in the thread, what we are missing now, and
> being worked on, is per group/project statistics that give us the
> hotspot so we can better target the optimization work.

Yes, it would be nice to know what the hotspots are, indeed.

As far as I understand, the problem is not CI itself, but the bandwidth
needed by the build artifacts, right? Would it be possible to not host
the build artifacts on GitLab, but rather only at the place where the
build actually happened? Or at least, to only transfer the build
artifacts on demand?

I'm not exactly familiar with how the system works, so sorry if this is
a silly question.


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Jason Ekstrand
On Sat, Feb 29, 2020 at 3:47 PM Timur Kristóf <[hidden email]> wrote:

>
> On Sat, 2020-02-29 at 14:46 -0500, Nicolas Dufresne wrote:
> > >
> > > 1. I think we should completely disable running the CI on MRs which
> > > are
> > > marked WIP. Speaking from personal experience, I usually make a lot
> > > of
> > > changes to my MRs before they are merged, so it is a waste of CI
> > > resources.
> >
> > In the mean time, you can help by taking the habit to use:
> >
> >   git push -o ci.skip
>
> Thanks for the advice, I wasn't aware such an option exists. Does this
> also work on the mesa gitlab or is this a GStreamer only thing?

Mesa is already set up so that it only runs on MRs and branches named
ci-* (or maybe it's ci/*; I can't remember).

> How hard would it be to make this the default?

I strongly suggest looking at how Mesa does it and doing that in
GStreamer if you can.  It seems to work pretty well in Mesa.
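
In case it's useful as a reference, a sketch of what that kind of
filtering can look like in .gitlab-ci.yml (the ci-* pattern is from
memory, as noted above, so treat the exact regex as an assumption):

  .default-filter:
    only:
      refs:
        # run jobs for merge requests and for branches named ci-*
        - merge_requests
        - /^ci-.*$/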

--Jason


> > That's a much more difficult goal then it looks like. Let each
> > projects
> > manage their CI graph and content, as each case is unique. Running
> > more
> > tests, or building more code isn't the main issue as the CPU time is
> > mostly sponsored. The data transfers between the cloud of gitlab and
> > the runners (which are external), along to sending OS image to Lava
> > labs is what is likely the most expensive.
> >
> > As it was already mention in the thread, what we are missing now, and
> > being worked on, is per group/project statistics that give us the
> > hotspot so we can better target the optimization work.
>
> Yes, would be nice to know what the hotspot is, indeed.
>
> As far as I understand, the problem is not CI itself, but the bandwidth
> needed by the build artifacts, right? Would it be possible to not host
> the build artifacts on the gitlab, but rather only the place where the
> build actually happened? Or at least, only transfer the build artifacts
> on-demand?
>
> I'm not exactly familiar with how the system works, so sorry if this is
> a silly question.
>
> _______________________________________________
> mesa-dev mailing list
> [hidden email]
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev

Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Nicolas Dufresne-5
On Saturday, 29 February 2020 at 15:54 -0600, Jason Ekstrand wrote:

> On Sat, Feb 29, 2020 at 3:47 PM Timur Kristóf <[hidden email]> wrote:
> > On Sat, 2020-02-29 at 14:46 -0500, Nicolas Dufresne wrote:
> > > > 1. I think we should completely disable running the CI on MRs which
> > > > are
> > > > marked WIP. Speaking from personal experience, I usually make a lot
> > > > of
> > > > changes to my MRs before they are merged, so it is a waste of CI
> > > > resources.
> > >
> > > In the mean time, you can help by taking the habit to use:
> > >
> > >   git push -o ci.skip
> >
> > Thanks for the advice, I wasn't aware such an option exists. Does this
> > also work on the mesa gitlab or is this a GStreamer only thing?
>
> Mesa is already set up so that it only runs on MRs and branches named
> ci-* (or maybe it's ci/*; I can't remember).
>
> > How hard would it be to make this the default?
>
> I strongly suggest looking at how Mesa does it and doing that in
> GStreamer if you can.  It seems to work pretty well in Mesa.

You are right, they added CI_MERGE_REQUEST_SOURCE_BRANCH_NAME in 11.6
(we started our CI a while ago). But there is something even better
now; you can do:

  only:
    refs:
      - merge_requests

Thanks for the hint, I'll suggest that. I've looked at some of Mesa's
CI backend and I think it's really nice, though there are a lot of
concepts that won't work in a multi-repo CI. Again, I need to refresh
my memory on what was moved from the enterprise to the community
version in this regard.

>
> --Jason
>
>
> > > That's a much more difficult goal then it looks like. Let each
> > > projects
> > > manage their CI graph and content, as each case is unique. Running
> > > more
> > > tests, or building more code isn't the main issue as the CPU time is
> > > mostly sponsored. The data transfers between the cloud of gitlab and
> > > the runners (which are external), along to sending OS image to Lava
> > > labs is what is likely the most expensive.
> > >
> > > As it was already mention in the thread, what we are missing now, and
> > > being worked on, is per group/project statistics that give us the
> > > hotspot so we can better target the optimization work.
> >
> > Yes, would be nice to know what the hotspot is, indeed.
> >
> > As far as I understand, the problem is not CI itself, but the bandwidth
> > needed by the build artifacts, right? Would it be possible to not host
> > the build artifacts on the gitlab, but rather only the place where the
> > build actually happened? Or at least, only transfer the build artifacts
> > on-demand?
> >
> > I'm not exactly familiar with how the system works, so sorry if this is
> > a silly question.
> >
> > _______________________________________________
> > mesa-dev mailing list
> > [hidden email]
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev


Re: [Mesa-dev] [Intel-gfx] gitlab.fd.o financial situation and impact on services

Marek Olšák
For Mesa, we could run CI only when Marge pushes, so that it's a strictly pre-merge CI.
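
A sketch of how that could be expressed, assuming the bot pushes under
a dedicated account (the "marge-bot" login name below is an assumption,
not necessarily the real account name):

  workflow:
    rules:
      # only create pipelines for pushes made by the merge bot
      - if: '$GITLAB_USER_LOGIN == "marge-bot"'
      - when: never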

Marek

On Sat., Feb. 29, 2020, 17:20 Nicolas Dufresne, <[hidden email]> wrote:
On Saturday, 29 February 2020 at 15:54 -0600, Jason Ekstrand wrote:
> On Sat, Feb 29, 2020 at 3:47 PM Timur Kristóf <[hidden email]> wrote:
> > On Sat, 2020-02-29 at 14:46 -0500, Nicolas Dufresne wrote:
> > > > 1. I think we should completely disable running the CI on MRs which
> > > > are
> > > > marked WIP. Speaking from personal experience, I usually make a lot
> > > > of
> > > > changes to my MRs before they are merged, so it is a waste of CI
> > > > resources.
> > >
> > > In the mean time, you can help by taking the habit to use:
> > >
> > >   git push -o ci.skip
> >
> > Thanks for the advice, I wasn't aware such an option exists. Does this
> > also work on the mesa gitlab or is this a GStreamer only thing?
>
> Mesa is already set up so that it only runs on MRs and branches named
> ci-* (or maybe it's ci/*; I can't remember).
>
> > How hard would it be to make this the default?
>
> I strongly suggest looking at how Mesa does it and doing that in
> GStreamer if you can.  It seems to work pretty well in Mesa.

You are right, they added CI_MERGE_REQUEST_SOURCE_BRANCH_NAME in 11.6
(we started our CI a while ago). But there is something even better
now; you can do:

  only:
    refs:
      - merge_requests

Thanks for the hint, I'll suggest that. I've looked at some of Mesa's
CI backend and I think it's really nice, though there are a lot of
concepts that won't work in a multi-repo CI. Again, I need to refresh
my memory on what was moved from the enterprise to the community
version in this regard.

>
> --Jason
>
>
> > > That's a much more difficult goal then it looks like. Let each
> > > projects
> > > manage their CI graph and content, as each case is unique. Running
> > > more
> > > tests, or building more code isn't the main issue as the CPU time is
> > > mostly sponsored. The data transfers between the cloud of gitlab and
> > > the runners (which are external), along to sending OS image to Lava
> > > labs is what is likely the most expensive.
> > >
> > > As it was already mention in the thread, what we are missing now, and
> > > being worked on, is per group/project statistics that give us the
> > > hotspot so we can better target the optimization work.
> >
> > Yes, would be nice to know what the hotspot is, indeed.
> >
> > As far as I understand, the problem is not CI itself, but the bandwidth
> > needed by the build artifacts, right? Would it be possible to not host
> > the build artifacts on the gitlab, but rather only the place where the
> > build actually happened? Or at least, only transfer the build artifacts
> > on-demand?
> >
> > I'm not exactly familiar with how the system works, so sorry if this is
> > a silly question.
> >
> > _______________________________________________
> > mesa-dev mailing list
> > [hidden email]
> > https://lists.freedesktop.org/mailman/listinfo/mesa-dev

_______________________________________________
mesa-dev mailing list
[hidden email]
https://lists.freedesktop.org/mailman/listinfo/mesa-dev
