Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6231 is a reply to message #6222] |
Tue, 12 September 2006 11:06   |
Pavel Emelianov
Srivatsa Vaddagiri wrote:
> On Tue, Sep 12, 2006 at 02:24:25PM +0400, Pavel Emelianov wrote:
>
>> Srivatsa Vaddagiri wrote:
>>
>>> On Mon, Sep 11, 2006 at 11:02:06AM +0400, Pavel Emelianov wrote:
>>>
>>>
>>>> Sure. At the beginning I have one task with one BC. Then
>>>> 1. A thread is spawned and new BC is created;
>>>>
>>>>
>>> Why do we have to create a BC for every new thread? A new BC is needed
>>> for every new service level instead IMO. And typically there wont be
>>> unlimited service levels.
>>>
>>>
>> That's the scenario we started from - each domain is served in a separate
>> BC with *threaded* Apache.
>>
>
> Sure ..but you can still meet that requirement by creating fixed set of
> BCs (for each domain) and let each new thread be associated with a
> corresponding BC (w/o requiring to create BC for every new thread),
> depending on which domain's request it is serving?
>
Hmmm... Beancounters can provide this after trivial changes.
We may schedule them in the current set of "pending" features
(http://wiki.openvz.org/UBC_discussion).
But this can create a kind of DoS within an application:
a thread keeps touching new pages, which get charged to its BC, and
these pages are touched by other threads as well. Sooner or later
this BC will hit its limit, and reclaiming this set of pages would affect
all the other threads.
Also, such accounting tells you NOTHING about real memory usage.
E.g. 100MB charged to one BC can mean "this BC ate 100MB of
memory" as well as "this BC really uses one page, and all the others
are used only by other threads", and anything between these two
corner cases.
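The ambiguity described above can be illustrated with a toy model. This is only a sketch under stated assumptions: each thread has its own BC, a page is charged once to whichever thread touches it first, and the helper names (`charge_first_touch`, `pages_used`) are hypothetical and not part of the BC patches.

```python
from collections import Counter

def charge_first_touch(touches):
    """Return pages charged per BC for a time-ordered list of (thread, page) touches.

    Assumption: each thread name doubles as its BC, and a page is charged
    once, to the BC of the first thread that touched it.
    """
    owner = {}
    for thread, page in touches:
        owner.setdefault(page, thread)  # later touches do not recharge the page
    return dict(Counter(owner.values()))

def pages_used(touches, thread):
    """Number of distinct pages a thread touched, regardless of who was charged."""
    return len({page for t, page in touches if t == thread})

# Corner case 1: t1 touches 100 pages and nobody else uses them.
touches_private = [("t1", p) for p in range(100)]
# Corner case 2: t1 touches the same pages first, then t2 uses 99 of them.
touches_shared = touches_private + [("t2", p) for p in range(1, 100)]

charge_private = charge_first_touch(touches_private)
charge_shared = charge_first_touch(touches_shared)
```

In both cases t1's BC is charged 100 pages and t2's BC is charged nothing, although in the second case t2 uses 99 of those pages. The charge alone does not distinguish the two.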
Well. We've digressed from our main thread - discussing the (dis)advantages
of the current BC implementation.
>
>>>
>>>
>>>> 2. New thread touches a new page (e.g. maps a new file) which is charged
>>>> to new BC
>>>> (and this means that this BC must stay in memory till page is
>>>> uncharged);
>>>> 3. Thread exits after serving the request, but since its mm is shared
>>>> with parent
>>>> all the touched pages stay resident and, thus, the new BC is still
>>>> pinned in memory.
>>>> Steps 1-3 are done multiple times for new pages (new files).
>>>> Remember that we're discussing the case when pages are not recharged.
>>>>
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6254 is a reply to message #6252] |
Tue, 12 September 2006 17:40   |
Srivatsa Vaddagiri
On Tue, Sep 12, 2006 at 10:22:32AM -0700, Rohit Seth wrote:
> On Tue, 2006-09-12 at 16:14 +0530, Srivatsa Vaddagiri wrote:
> > On Mon, Sep 11, 2006 at 12:10:31PM -0700, Rohit Seth wrote:
> > > It seems that a single notion of limit should suffice, and that limit
> > > should more be treated as something beyond which that resource
> > > consumption in the container will be throttled/not_allowed.
> >
> > The big question is : are containers/RG allowed to use *upto* their
> > limit always? In other words, will you typically setup limits such that
> > sum of all limits = max resource capacity?
> >
>
> If a user is really interested in ensuring that all scheduled jobs (or
> containers) get what they have asked for (guarantees) then making the
> sum of all container limits equal to total system limit is the right
> thing to do.
>
> > If it is setup like that, then what you are considering as limit is
> > actually guar no?
> >
> Right. And if we do it like this then it is up to sysadmin to configure
> the thing right without adding additional logic in kernel.
Perhaps calling it "limit" is confusing then (otoh it may go down well
with Linus!). I perhaps agree we need to go with one for now (in the
interest of making some progress), but we will probably come back to
this at a later point. For example, I chanced upon this document:
www.vmware.com/pdf/vmware_drs_wp.pdf
which explains how supporting a hard limit (in contrast to the guarantee
we have been discussing) can be useful sometimes.
--
Regards,
vatsa
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6258 is a reply to message #6204] |
Tue, 12 September 2006 23:54   |
Chandra Seetharaman
On Mon, 2006-09-11 at 16:58 -0700, Rohit Seth wrote:
> On Mon, 2006-09-11 at 12:42 -0700, Chandra Seetharaman wrote:
> > On Mon, 2006-09-11 at 12:10 -0700, Rohit Seth wrote:
> > > On Mon, 2006-09-11 at 11:25 -0700, Chandra Seetharaman wrote:
>
> > > > There could be a default container which doesn't have any guarantee or
> > > > limit.
> > >
> > > First, I think it is critical that we allow processes to run outside of
> > > any container (unless we know for sure that the penalty of running a
> > > process inside a container is very very minimal).
> >
> > When I meant a default container I meant a default "resource group". In
> > case of container that would be the default environment. I do not see
> > any additional overhead associated with it, it is only associated with
> > how resource are allocated/accounted.
> >
>
> There should be some cost when you do atomic inc/dec accounting and
> locks for add/remove resources from any container (including default
> resource group). No?
Yes, it would be there, but it is not heavy, IMO.
>
> > >
> > > And anything running outside a container should be limited by default
> > > Linux settings.
> >
> > note that the resource available to the default RG will be (total system
> > resource - allocated to RGs).
>
> I think it will be preferable to not change the existing behavior for
> applications that are running outside any container (in your case
> default resource group).
hmm, when you provide QoS for a set of apps, you will affect (the
resource availability of) other apps. I don't see any way around it. Any
ideas ?
>
> > >
> > > > When you create containers and assign guarantees to each of them
> > > > make sure that you leave some amount of resource unassigned.
> > > ^^^^^ This will force the "default" container
> > > with limits (indirectly). IMO, the whole guarantee feature gets defeated
> >
> > You _will_ have limits for the default RG even if we don't have
> > guarantees.
> >
> > > the moment you bring in this fuzziness.
> >
> > Not really.
> > - Each RG will have a guarantee and limit of each resource.
> > - default RG will have (system resource - sum of guarantees)
> > - Every RG will be guaranteed some amount of resource to provide QoS
> > - Every RG will be limited at "limit" to prevent DoS attacks.
> > - Whoever doesn't care either of those set them to don't care values.
> >
>
> For the cases that put this don't care, do you depend on existing
> reclaim algorithm (for memory) in kernel?
Yes.
>
> > >
> > > > That
> > > > unassigned resources can be used by the default container or can be used
> > > > by containers that want more than their guarantee (and less than their
> > > > limit). This is how CKRM/RG handles this issue.
> > > >
> > > >
> > >
> > > It seems that a single notion of limit should suffice, and that limit
> > > should more be treated as something beyond which that resource
> > > consumption in the container will be throttled/not_allowed.
> >
> > As I stated in an earlier email "Limit only" approach can prevent a
> > system from DoS attacks (and also fits the container model nicely),
> > whereas to provide QoS one would need guarantee.
> >
> > Without guarantee, a RG that the admin cares about can starve if
> > all/most of the other RGs consume upto their limits.
> >
> > >
>
> If the limits are set appropriately so that containers total memory
> consumption does not exceed the system memory then there shouldn't be
> any QoS issue (to whatever extent it is applicable for specific
> scenario).
Then you will not be work-conserving (IOW over-committing), which is one
of the main advantages of this type of feature.
>
> -rohit
>
>
> ------------------------------------------------------------ -------------
> Using Tomcat but need to do more? Need to support web services, security?
> Get stuff done quickly with pre-integrated technology to make your job easier
> Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
> http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&b id=263057&dat=121642
> _______________________________________________
> ckrm-tech mailing list
> https://lists.sourceforge.net/lists/listinfo/ckrm-tech
--
------------------------------------------------------------ ----------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
------------------------------------------------------------ ----------
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6259 is a reply to message #6227] |
Tue, 12 September 2006 23:58   |
Chandra Seetharaman
On Tue, 2006-09-12 at 14:48 +0400, Pavel Emelianov wrote:
<snip>
> > I do not think it is that simple since
> > - there is typically more than one class I want to set guarantee to
> > - I will not able to use both limit and guarantee
> > - Implementation will not be work-conserving.
> >
> > Also, How would you configure the following in your model ?
> >
> > 5 classes: Class A(10, 40), Class B(20, 100), Class C (30, 100), Class D
> > (5, 100), Class E(15, 50); (class_name(guarantee, limit))
> >
> What's the total memory amount on the node? Without it it's hard to make
> any
> guarantee.
I wrote the example treating them as %, so 100 would be the total amount
of memory.
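As a quick check of the percentages in the quoted example, here is a small sketch. It assumes the rule stated earlier in the thread that the default RG gets (system resource - sum of guarantees); the variable names are illustrative only.

```python
TOTAL = 100  # the example treats all numbers as percent of node memory

# class_name: (guarantee, limit), copied from the quoted example
classes = {
    "A": (10, 40),
    "B": (20, 100),
    "C": (30, 100),
    "D": (5, 100),
    "E": (15, 50),
}

sum_guarantees = sum(g for g, _ in classes.values())
sum_limits = sum(limit for _, limit in classes.values())

# Share left to the default RG under the "system resource - sum of guarantees" rule.
default_rg_share = TOTAL - sum_guarantees

# Guarantees must fit in the machine; limits are allowed to add up to more.
guarantees_fit = sum_guarantees <= TOTAL
limits_overcommitted = sum_limits > TOTAL

print(sum_guarantees, default_rg_share, sum_limits)
```

The guarantees add up to 80, leaving 20 for the default RG, while the limits add up to 390. Limits alone therefore cannot express these guarantees by summing to 100.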
> > "Limit only" approach works for DoS prevention. But for providing QoS
> > you would need guarantee.
> >
> You may not provide guarantee on physical resource for a particular group
> without limiting its usage by other groups. That's my major idea.
I agree with that, but the other way around (i.e provide guarantee for
everyone by imposing limits on everyone) is what I am saying is not
possible.
>
--
------------------------------------------------------------ ----------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
------------------------------------------------------------ ----------
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6260 is a reply to message #6252] |
Wed, 13 September 2006 00:02   |
Chandra Seetharaman
On Tue, 2006-09-12 at 10:22 -0700, Rohit Seth wrote:
> On Tue, 2006-09-12 at 16:14 +0530, Srivatsa Vaddagiri wrote:
> > On Mon, Sep 11, 2006 at 12:10:31PM -0700, Rohit Seth wrote:
> > > It seems that a single notion of limit should suffice, and that limit
> > > should more be treated as something beyond which that resource
> > > consumption in the container will be throttled/not_allowed.
> >
> > The big question is : are containers/RG allowed to use *upto* their
> > limit always? In other words, will you typically setup limits such that
> > sum of all limits = max resource capacity?
> >
>
> If a user is really interested in ensuring that all scheduled jobs (or
> containers) get what they have asked for (guarantees) then making the
> sum of all container limits equal to total system limit is the right
> thing to do.
>
> > If it is setup like that, then what you are considering as limit is
> > actually guar no?
> >
> Right. And if we do it like this then it is up to sysadmin to configure
> the thing right without adding additional logic in kernel.
It won't be a complete solution, as the user won't be able to
- set both guarantee and limit for a resource group
- use limit on some and guarantee on some
- optimize the usage of available resources
>
> -rohit
>
>
>
--
------------------------------------------------------------ ----------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
------------------------------------------------------------ ----------
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6261 is a reply to message #6258] |
Wed, 13 September 2006 00:39   |
Rohit Seth
On Tue, 2006-09-12 at 16:54 -0700, Chandra Seetharaman wrote:
> On Mon, 2006-09-11 at 16:58 -0700, Rohit Seth wrote:
> > On Mon, 2006-09-11 at 12:42 -0700, Chandra Seetharaman wrote:
> > > On Mon, 2006-09-11 at 12:10 -0700, Rohit Seth wrote:
> > > > On Mon, 2006-09-11 at 11:25 -0700, Chandra Seetharaman wrote:
> >
> > > > > There could be a default container which doesn't have any guarantee or
> > > > > limit.
> > > >
> > > > First, I think it is critical that we allow processes to run outside of
> > > > any container (unless we know for sure that the penalty of running a
> > > > process inside a container is very very minimal).
> > >
> > > When I meant a default container I meant a default "resource group". In
> > > case of container that would be the default environment. I do not see
> > > any additional overhead associated with it, it is only associated with
> > > how resource are allocated/accounted.
> > >
> >
> > There should be some cost when you do atomic inc/dec accounting and
> > locks for add/remove resources from any container (including default
> > resource group). No?
>
> yes, it would be there, but is not heavy, IMO.
I think anything greater than 1% could be a concern for people who are
not very interested in containers but would be forced to live with them.
> >
> > > >
> > > > And anything running outside a container should be limited by default
> > > > Linux settings.
> > >
> > > note that the resource available to the default RG will be (total system
> > > resource - allocated to RGs).
> >
> > I think it will be preferable to not change the existing behavior for
> > applications that are running outside any container (in your case
> > default resource group).
>
> hmm, when you provide QoS for a set of apps, you will affect (the
> resource availability of) other apps. I don't see any way around it. Any
> ideas ?
When I say existing behavior, I mean not being impacted by artificial
limits that are imposed by the container subsystem. IOW, if a
sysadmin is okay with certain apps running outside of any container, then
he is basically forgoing any QoS for any container on that system.
>
> >
> > > >
> > > > > When you create containers and assign guarantees to each of them
> > > > > make sure that you leave some amount of resource unassigned.
> > > > ^^^^^ This will force the "default" container
> > > > with limits (indirectly). IMO, the whole guarantee feature gets defeated
> > >
> > > You _will_ have limits for the default RG even if we don't have
> > > guarantees.
> > >
> > > > the moment you bring in this fuzziness.
> > >
> > > Not really.
> > > - Each RG will have a guarantee and limit of each resource.
> > > - default RG will have (system resource - sum of guarantees)
> > > - Every RG will be guaranteed some amount of resource to provide QoS
> > > - Every RG will be limited at "limit" to prevent DoS attacks.
> > > - Whoever doesn't care either of those set them to don't care values.
> > >
> >
> > For the cases that put this don't care, do you depend on existing
> > reclaim algorithm (for memory) in kernel?
>
> Yes.
So one container with these don't care condition(s) can turn the whole
guarantee thing bad, because the existing kernel reclaimer does not know
about memory commitments to other containers. Right?
> >
> > > >
> > > > > That
> > > > > unassigned resources can be used by the default container or can be used
> > > > > by containers that want more than their guarantee (and less than their
> > > > > limit). This is how CKRM/RG handles this issue.
> > > > >
> > > > >
> > > >
> > > > It seems that a single notion of limit should suffice, and that limit
> > > > should more be treated as something beyond which that resource
> > > > consumption in the container will be throttled/not_allowed.
> > >
> > > As I stated in an earlier email "Limit only" approach can prevent a
> > > system from DoS attacks (and also fits the container model nicely),
> > > whereas to provide QoS one would need guarantee.
> > >
> > > Without guarantee, a RG that the admin cares about can starve if
> > > all/most of the other RGs consume upto their limits.
> > >
> > > >
> >
> > If the limits are set appropriately so that containers total memory
> > consumption does not exceed the system memory then there shouldn't be
> > any QoS issue (to whatever extent it is applicable for specific
> > scenario).
>
> Then you will not be work-conserving (IOW over-committing), which is one
> of the main advantage of this type of feature.
>
For systems where QoS is important, not over-committing will be
fine (at least to start with).
-rohit
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6262 is a reply to message #6260] |
Wed, 13 September 2006 00:43   |
Rohit Seth
On Tue, 2006-09-12 at 17:02 -0700, Chandra Seetharaman wrote:
> On Tue, 2006-09-12 at 10:22 -0700, Rohit Seth wrote:
> > On Tue, 2006-09-12 at 16:14 +0530, Srivatsa Vaddagiri wrote:
> > > On Mon, Sep 11, 2006 at 12:10:31PM -0700, Rohit Seth wrote:
> > > > It seems that a single notion of limit should suffice, and that limit
> > > > should more be treated as something beyond which that resource
> > > > consumption in the container will be throttled/not_allowed.
> > >
> > > The big question is : are containers/RG allowed to use *upto* their
> > > limit always? In other words, will you typically setup limits such that
> > > sum of all limits = max resource capacity?
> > >
> >
> > If a user is really interested in ensuring that all scheduled jobs (or
> > containers) get what they have asked for (guarantees) then making the
> > sum of all container limits equal to total system limit is the right
> > thing to do.
> >
> > > If it is setup like that, then what you are considering as limit is
> > > actually guar no?
> > >
> > Right. And if we do it like this then it is up to sysadmin to configure
> > the thing right without adding additional logic in kernel.
>
> It won't be a complete solution, as the user won't be able to
> - set both guarantee and limit for a resource group
> - use limit on some and guarantee on some
> - optimize the usage of available resources
I think that if some dynamic adjustment of resource limits is
possible, then some of the above functionality could be achieved. And I
think that could be a good starting point.
-rohit
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6263 is a reply to message #6261] |
Wed, 13 September 2006 01:10   |
Chandra Seetharaman
On Tue, 2006-09-12 at 17:39 -0700, Rohit Seth wrote:
<snip>
> > yes, it would be there, but is not heavy, IMO.
>
> I think anything greater than 1% could be a concern for people who are
> not very interested in containers but would be forced to live with them.
If they are not interested in resource management and/or containers, I
do not think they need to pay.
>
> > >
> > > > >
> > > > > And anything running outside a container should be limited by default
> > > > > Linux settings.
> > > >
> > > > note that the resource available to the default RG will be (total system
> > > > resource - allocated to RGs).
> > >
> > > I think it will be preferable to not change the existing behavior for
> > > applications that are running outside any container (in your case
> > > default resource group).
> >
> > hmm, when you provide QoS for a set of apps, you will affect (the
> > resource availability of) other apps. I don't see any way around it. Any
> > ideas ?
>
> When I say, existing behavior, I mean not getting impacted by some
> artificial limits that are imposed by container subsystem. IOW, if a
That is what I understood and replied above.
> sysadmin is okay to have certain apps running outside of container then
> he is basically forgoing any QoS for any container on that system.
Not at all. If the container they are interested in is guaranteed, I do
not see how apps running outside a container would affect them.
<snip>
> > > > Not really.
> > > > - Each RG will have a guarantee and limit of each resource.
> > > > - default RG will have (system resource - sum of guarantees)
> > > > - Every RG will be guaranteed some amount of resource to provide QoS
> > > > - Every RG will be limited at "limit" to prevent DoS attacks.
> > > > - Whoever doesn't care either of those set them to don't care values.
> > > >
> > >
> > > For the cases that put this don't care, do you depend on existing
> > > reclaim algorithm (for memory) in kernel?
> >
> > Yes.
>
> So one container with these don't care condition(s) can turn the whole
> guarantee thing bad. Because existing kernel reclaimer does not know
> about memory commitments to other containers. Right?
No, the reclaimer would free up pages associated with the don't care RGs
(as the user doesn't care about the resources made available to them).
<snip>
> > > If the limits are set appropriately so that containers total memory
> > > consumption does not exceed the system memory then there shouldn't be
> > > any QoS issue (to whatever extent it is applicable for specific
> > > scenario).
> >
> > Then you will not be work-conserving (IOW over-committing), which is one
> > of the main advantage of this type of feature.
> >
>
> If for the systems where QoS is important, not over-committing will be
> fine (at least to start with).
The problem is that you can't do it with just limit.
--
------------------------------------------------------------ ----------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
------------------------------------------------------------ ----------
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6264 is a reply to message #6263] |
Wed, 13 September 2006 01:25   |
Rohit Seth
On Tue, 2006-09-12 at 18:10 -0700, Chandra Seetharaman wrote:
> On Tue, 2006-09-12 at 17:39 -0700, Rohit Seth wrote:
> <snip>
> > > yes, it would be there, but is not heavy, IMO.
> >
> > I think anything greater than 1% could be a concern for people who are
> > not very interested in containers but would be forced to live with them.
>
> If they are not interested in resource management and/or containers, i
> do not think they need to pay.
> >
Think of a single kernel from a vendor that has container support built
in.
> > > >
> > > > > >
> > > > > > And anything running outside a container should be limited by default
> > > > > > Linux settings.
> > > > >
> > > > > note that the resource available to the default RG will be (total system
> > > > > resource - allocated to RGs).
> > > >
> > > > I think it will be preferable to not change the existing behavior for
> > > > applications that are running outside any container (in your case
> > > > default resource group).
> > >
> > > hmm, when you provide QoS for a set of apps, you will affect (the
> > > resource availability of) other apps. I don't see any way around it. Any
> > > ideas ?
> >
> > When I say, existing behavior, I mean not getting impacted by some
> > artificial limits that are imposed by container subsystem. IOW, if a
>
> That is what I understood and replied above.
> > sysadmin is okay to have certain apps running outside of container then
> > he is basically forgoing any QoS for any container on that system.
>
> Not at all. If the container they are interested in is guaranteed, I do
> not see how apps running outside a container would affect them.
>
Because the kernel (outside the container subsystem) doesn't know of
these guarantees...unless you modify the page allocator to have another
variant of overcommit memory.
> <snip>
> > > > > Not really.
> > > > > - Each RG will have a guarantee and limit of each resource.
> > > > > - default RG will have (system resource - sum of guarantees)
> > > > > - Every RG will be guaranteed some amount of resource to provide QoS
> > > > > - Every RG will be limited at "limit" to prevent DoS attacks.
> > > > > - Whoever doesn't care either of those set them to don't care values.
> > > > >
> > > >
> > > > For the cases that put this don't care, do you depend on existing
> > > > reclaim algorithm (for memory) in kernel?
> > >
> > > Yes.
> >
> > So one container with these don't care condition(s) can turn the whole
> > guarantee thing bad. Because existing kernel reclaimer does not know
> > about memory commitments to other containers. Right?
>
> No, the reclaimer would free up pages associated with the don't care RGs
> ( as the user don't care about the resource made available to them).
>
And how will the kernel reclaimer know which RGs are don't care?
-rohit
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6274 is a reply to message #6259] |
Wed, 13 September 2006 08:06   |
Pavel Emelianov
Chandra Seetharaman wrote:
> On Tue, 2006-09-12 at 14:48 +0400, Pavel Emelianov wrote:
> <snip>
>
>>> I do not think it is that simple since
>>> - there is typically more than one class I want to set guarantee to
>>> - I will not able to use both limit and guarantee
>>> - Implementation will not be work-conserving.
>>>
>>> Also, How would you configure the following in your model ?
>>>
>>> 5 classes: Class A(10, 40), Class B(20, 100), Class C (30, 100), Class D
>>> (5, 100), Class E(15, 50); (class_name(guarantee, limit))
>>>
>>>
>> What's the total memory amount on the node? Without it it's hard to make
>> any
>> guarantee.
>>
>
> I wrote the example treating them as %, so 100 would be the total amount
> of memory.
>
OK. Then limiting must be done this way (unreclaimable limit/total limit)
A (15/40)
B (25/100)
C (35/100)
D (10/100)
E (20/50)
In this case each group will receive its guarantee for sure.
E.g. even if A, B, E and D eat all of their unreclaimable memory, we'll
still have
100 - 15 - 25 - 20 - 10 = 30% of memory left (maybe after reclaiming), which
is enough for C's guarantee.
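The same subtraction can be repeated for every group, not only C. The sketch below uses the numbers from this message and the guarantees from the quoted example; the function name `worst_case_left` is illustrative only.

```python
TOTAL = 100  # percent of node memory, as in the example

# group: (unreclaimable limit, total limit), from the list above
limits = {"A": (15, 40), "B": (25, 100), "C": (35, 100), "D": (10, 100), "E": (20, 50)}

# guarantees requested in the quoted example
requested = {"A": 10, "B": 20, "C": 30, "D": 5, "E": 15}

def worst_case_left(group):
    """Memory left for `group` when every other group pins its full unreclaimable limit."""
    pinned_by_others = sum(u for name, (u, _) in limits.items() if name != group)
    return TOTAL - pinned_by_others

left = {name: worst_case_left(name) for name in limits}
for name in sorted(left):
    print(name, left[name], left[name] >= requested[name])
```

With these numbers the worst-case remainder for each group equals its requested guarantee (A 10, B 20, C 30, D 5, E 15). As a plain arithmetic observation, each unreclaimable limit here is the guarantee plus 5, and 5 = (100 - 80) / 4, i.e. the unguaranteed 20% divided by the number of other groups.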
>
>>> "Limit only" approach works for DoS prevention. But for providing QoS
>>> you would need guarantee.
>>>
>>>
>> You may not provide guarantee on physical resource for a particular group
>> without limiting its usage by other groups. That's my major idea.
>>
>
> I agree with that, but the other way around (i.e provide guarantee for
> everyone by imposing limits on everyone) is what I am saying is not
> possible.
Then how do you make sure that memory WILL be available when the group needs
it without limiting the others in a proper way?
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6287 is a reply to message #6285] |
Wed, 13 September 2006 13:35   |
Pavel Emelianov
Srivatsa Vaddagiri wrote:
> On Wed, Sep 13, 2006 at 12:06:41PM +0400, Pavel Emelianov wrote:
>> OK. Then limiting must be done this way (unreclaimable limit/total limit)
>> A (15/40)
>> B (25/100)
>> C (35/100)
>
> s/35/30?
Hmmm... No, it must be 35. It IS higher than the guarantee you proposed,
but it's OK to have a limit higher than the guarantee, isn't it?
>
> Also the different b/n total and unreclaimable limits goes towards
> limiting reclaimable memory i suppose? And 1st limit seems to be a
> hard-limit while the 2nd one is soft?
The first limit (let's call it the soft one) is the limit for unreclaimable
memory; the second (the hard limit) is for both reclaimable and
unreclaimable memory.
The policy is:
1. if a BC tries to *mmap()* an unreclaimable region (e.g. one w/o a backing
file, as moving a page to swap is not pure "reclamation") then
check the soft limit and prohibit the mapping if it is hit;
2. if a BC tries to *touch* a page, then check the hard limit
and start reclaiming this BC's pages if the limit is hit.
That's how guarantees can be met. The current BC code already performs the
first check and gives you all the levers for the second one - only
the patch(es) with the reclamation mechanism are required.
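The two checks can be sketched as a toy model. This is not the BC code: the class and method names are hypothetical, pages are plain counters, and the reclaim step is a placeholder that simply drops one of the BC's own pages without tracking which resident pages are actually reclaimable.

```python
class Beancounter:
    """Toy model of the soft/hard limit policy described above (names are hypothetical)."""

    def __init__(self, soft_limit, hard_limit):
        self.soft_limit = soft_limit  # cap on unreclaimable mappings
        self.hard_limit = hard_limit  # cap on all resident (touched) pages
        self.unreclaimable = 0        # pages covered by unreclaimable mappings
        self.resident = 0             # pages currently charged to this BC
        self.reclaimed = 0            # pages dropped by the placeholder reclaimer

    def mmap_unreclaimable(self, pages):
        """Check 1: refuse an unreclaimable mapping that would exceed the soft limit."""
        if self.unreclaimable + pages > self.soft_limit:
            return False
        self.unreclaimable += pages
        return True

    def touch_page(self):
        """Check 2: on touch, reclaim from this BC itself if the hard limit is hit."""
        if self.resident >= self.hard_limit:
            self._reclaim_one()
        self.resident += 1

    def _reclaim_one(self):
        # Placeholder for the reclamation mechanism that the message says
        # is still to be written: drop one of this BC's own pages.
        self.resident -= 1
        self.reclaimed += 1


# Group A from the example: soft (unreclaimable) limit 15, hard (total) limit 40.
bc = Beancounter(soft_limit=15, hard_limit=40)
first_map_ok = bc.mmap_unreclaimable(10)   # fits under the soft limit
second_map_ok = bc.mmap_unreclaimable(6)   # 10 + 6 > 15, so it is refused
for _ in range(45):
    bc.touch_page()                        # the last 5 touches each trigger a reclaim
```

In this model the BC never holds more than its hard limit, and the pressure of exceeding it falls on the BC's own pages rather than on other groups.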
>
>> D (10/100)
>> E (20/50)
>> In this case each group will receive it's guarantee for sure.
>>
>> E.g. even if A, B, E and D will eat all it's unreclaimable memory then
>> we'll have
>> 100 - 15 - 25 - 20 - 10 = 30% of memory left (maybe after reclaiming) which
>> is perfectly enough for C's guarantee.
>
> I agree by carefully choosing these limits, we can provide some sort of
> QoS, which is a good step to begin with.
Sure. As I've said - soft limiting is already done with BC patches, the
hard one is not prohibited by BC (BCs even prepare a good pad for it).
When reclaiming is done we'll have a hard limit described above.
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6308 is a reply to message #6264] |
Wed, 13 September 2006 22:20   |
Chandra Seetharaman
On Tue, 2006-09-12 at 18:25 -0700, Rohit Seth wrote:
> On Tue, 2006-09-12 at 18:10 -0700, Chandra Seetharaman wrote:
> > On Tue, 2006-09-12 at 17:39 -0700, Rohit Seth wrote:
> > <snip>
> > > > yes, it would be there, but is not heavy, IMO.
> > >
> > > I think anything greater than 1% could be a concern for people who are
> > > not very interested in containers but would be forced to live with them.
> >
> > If they are not interested in resource management and/or containers, i
> > do not think they need to pay.
> > >
>
> Think of a single kernel from a vendor that has container support built
> in.
Ok. Understood.
Here are results of some of the benchmarks we have run in the past
(April 2005) with CKRM which showed no/negligible performance impact in
that scenario.
http://marc.theaimsgroup.com/?l=ckrm-tech&m=111325064322305&w=2
http://marc.theaimsgroup.com/?l=ckrm-tech&m=111385973226267&w=2
http://marc.theaimsgroup.com/?l=ckrm-tech&m=111291409731929&w=2
>
<snip>
> > Not at all. If the container they are interested in is guaranteed, I do
> > not see how apps running outside a container would affect them.
> >
>
> Because the kernel (outside the container subsystem) doesn't know of
The core resource subsystem (VM subsystem for memory) would know about
the guarantees and don't cares, and it would handle it appropriately.
> these guarantees...unless you modify the page allocator to have another
> variant of overcommit memory.
>
<snip>
>
> > No, the reclaimer would free up pages associated with the don't care RGs
> > ( as the user don't care about the resource made available to them).
> >
>
> And how will the kernel reclaimer know which RGs are don't care?
By looking into the beancounter associated with the container/RG
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
|
|
|
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6311 is a reply to message #6274] |
Wed, 13 September 2006 22:31   |
Chandra Seetharaman
Messages: 88 Registered: August 2006
|
Member |
|
|
On Wed, 2006-09-13 at 12:06 +0400, Pavel Emelianov wrote:
> Chandra Seetharaman wrote:
> > On Tue, 2006-09-12 at 14:48 +0400, Pavel Emelianov wrote:
> > <snip>
> >
> >>> I do not think it is that simple since
> >>> - there is typically more than one class I want to set guarantee to
> >>> - I will not be able to use both limit and guarantee
> >>> - Implementation will not be work-conserving.
> >>>
> >>> Also, How would you configure the following in your model ?
> >>>
> >>> 5 classes: Class A(10, 40), Class B(20, 100), Class C (30, 100), Class D
> >>> (5, 100), Class E(15, 50); (class_name(guarantee, limit))
> >>>
> >>>
> >> What's the total memory amount on the node? Without it it's hard to make
> >> any
> >> guarantee.
> >>
> >
> > I wrote the example treating them as %, so 100 would be the total amount
> > of memory.
> >
> OK. Then limiting must be done this way (unreclaimable limit/total limit)
> A (15/40)
> B (25/100)
> C (35/100)
> D (10/100)
> E (20/50)
> In this case each group will receive its guarantee for sure.
>
> E.g. even if A, B, E and D eat all their unreclaimable memory then
> we'll have
> 100 - 15 - 25 - 20 - 10 = 30% of memory left (maybe after reclaiming) which
> is perfectly enough for C's guarantee.
How did you arrive at the +5 number?
What if I have 40 containers, each with a 2% guarantee? What do we do
then? And there are many other combinations (what I gave was not the
_only_ scenario).
> >
> >>> "Limit only" approach works for DoS prevention. But for providing QoS
> >>> you would need guarantee.
> >>>
> >>>
> >> You may not provide guarantee on physical resource for a particular group
> >> without limiting its usage by other groups. That's my major idea.
> >>
> >
> > I agree with that, but the other way around (i.e provide guarantee for
> > everyone by imposing limits on everyone) is what I am saying is not
> > possible.
> Then how do you make sure that memory WILL be available when the group needs
> it without limiting the others in a proper way?
You could limit others only if you _know_ somebody is not getting what
they are supposed to get (based on guarantee).
>
--
----------------------------------------------------------------------
Chandra Seetharaman | Be careful what you choose....
- sekharan@us.ibm.com | .......you may get it.
----------------------------------------------------------------------
|
|
|
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6316 is a reply to message #6308] |
Thu, 14 September 2006 01:22   |
Rohit Seth
Messages: 101 Registered: August 2006
|
Senior Member |
|
|
On Wed, 2006-09-13 at 15:20 -0700, Chandra Seetharaman wrote:
> On Tue, 2006-09-12 at 18:25 -0700, Rohit Seth wrote:
> > On Tue, 2006-09-12 at 18:10 -0700, Chandra Seetharaman wrote:
> > > On Tue, 2006-09-12 at 17:39 -0700, Rohit Seth wrote:
> > > <snip>
> > > > > yes, it would be there, but is not heavy, IMO.
> > > >
> > > > I think anything greater than 1% could be a concern for people who are
> > > > not very interested in containers but would be forced to live with them.
> > >
> > > If they are not interested in resource management and/or containers, i
> > > do not think they need to pay.
> > > >
> >
> > Think of a single kernel from a vendor that has container support built
> > in.
>
> Ok. Understood.
>
> Here are results of some of the benchmarks we have run in the past
> (April 2005) with CKRM which showed no/negligible performance impact in
> that scenario.
> http://marc.theaimsgroup.com/?l=ckrm-tech&m=111325064322305&w=2
> http://marc.theaimsgroup.com/?l=ckrm-tech&m=111385973226267&w=2
> http://marc.theaimsgroup.com/?l=ckrm-tech&m=111291409731929&w=2
> >
These are good results. But I still think the cost will increase over
time as more logic gets added. Is there any data on microbenchmarks
like lmbench?
> <snip>
>
> > > Not at all. If the container they are interested in is guaranteed, I do
> > > not see how apps running outside a container would affect them.
> > >
> >
> > Because the kernel (outside the container subsystem) doesn't know of
>
> The core resource subsystem (VM subsystem for memory) would know about
> the guarantees and don't cares, and it would handle it appropriately.
>
...meaning hooks in the generic kernel reclaim algorithm. Getting
something like that into mainline will be tricky at best.
-rohit
|
|
|
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6325 is a reply to message #6311] |
Thu, 14 September 2006 07:53   |
Pavel Emelianov
Messages: 1149 Registered: September 2006
|
Senior Member |
|
|
Chandra Seetharaman wrote:
> On Wed, 2006-09-13 at 12:06 +0400, Pavel Emelianov wrote:
>
>> Chandra Seetharaman wrote:
>>
>>> On Tue, 2006-09-12 at 14:48 +0400, Pavel Emelianov wrote:
>>> <snip>
>>>
>>>
>>>>> I do not think it is that simple since
>>>>> - there is typically more than one class I want to set guarantee to
>>>>> - I will not be able to use both limit and guarantee
>>>>> - Implementation will not be work-conserving.
>>>>>
>>>>> Also, How would you configure the following in your model ?
>>>>>
>>>>> 5 classes: Class A(10, 40), Class B(20, 100), Class C (30, 100), Class D
>>>>> (5, 100), Class E(15, 50); (class_name(guarantee, limit))
>>>>>
>>>>>
>>>>>
>>>> What's the total memory amount on the node? Without it it's hard to make
>>>> any
>>>> guarantee.
>>>>
>>>>
>>> I wrote the example treating them as %, so 100 would be the total amount
>>> of memory.
>>>
>>>
>> OK. Then limiting must be done this way (unreclaimable limit/total limit)
>> A (15/40)
>> B (25/100)
>> C (35/100)
>> D (10/100)
>> E (20/50)
>> In this case each group will receive its guarantee for sure.
>>
>> E.g. even if A, B, E and D eat all their unreclaimable memory then
>> we'll have
>> 100 - 15 - 25 - 20 - 10 = 30% of memory left (maybe after reclaiming) which
>> is perfectly enough for C's guarantee.
>>
>
> How did you arrive at the +5 number ?
>
I solved a set of linear equations :)
> What if I have 40 containers each with 2% guarantee ? what do we do
> then ? and many other different combinations (what I gave was not the
> _only_ scenario).
>
Then you need to solve a set of 40 equations. This sounds weird, but
don't be afraid - sets like these are solved easily.
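
As an illustration, the set of equations can be written down under one reading of the example above: group i's guarantee g_i is whatever remains of the total R after every other group has used up its unreclaimable limit u_j, i.e. R - sum(u_j for j != i) = g_i. Summing the N equations gives a closed form, u_i = g_i + (R - sum(g)) / (N - 1), so no general solver is needed. This is only a sketch of that reading, not the OpenVZ code, and the name limits_for_guarantees is made up for the example.

```python
from fractions import Fraction

def limits_for_guarantees(total, guarantees):
    """Return unreclaimable limits u_i such that, for every group i,
    total - sum(u_j for j != i) == guarantees[i].

    Hypothetical helper: the model (guarantee = what is left after all
    other groups hit their unreclaimable limits) is an assumption taken
    from the worked example in this thread. Inputs must be integers.
    """
    n = len(guarantees)
    if n < 2:
        raise ValueError("need at least two groups")
    g_sum = sum(guarantees)
    # Summing the n equations: n*total - (n-1)*S = g_sum, where S = sum(u).
    s = Fraction(n * total - g_sum, n - 1)
    # Each single equation then reads: total - (S - u_i) = g_i.
    return [g + s - total for g in guarantees]

# The five classes A..E from the example, in percent of total memory.
print([int(u) for u in limits_for_guarantees(100, [10, 20, 30, 5, 15])])
# 40 containers with a 2% guarantee each (same limit for all of them).
print(float(limits_for_guarantees(100, [2] * 40)[0]))
```

For the five classes this gives 15, 25, 35, 10 and 20, i.e. each guarantee plus (100 - 80) / 4 = 5. The whole computation is a single pass over the guarantees.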
>
>>>
>>>
>>>>> "Limit only" approach works for DoS prevention. But for providing QoS
>>>>> you would need guarantee.
>>>>>
>>>>>
>>>>>
>>>> You may not provide guarantee on physical resource for a particular group
>>>> without limiting its usage by other groups. That's my major idea.
>>>>
>>>>
>>> I agree with that, but the other way around (i.e provide guarantee for
>>> everyone by imposing limits on everyone) is what I am saying is not
>>> possible.
>>>
>> Then how do you make sure that memory WILL be available when the group needs
>> it without limiting the others in a proper way?
>>
>
> You could limit others only if you _know_ somebody is not getting what
> they are supposed to get (based on guarantee).
>
I don't understand your idea. A limit does _not_ imply anything - it's
just a limit.
You may impose any limit on anyone without worrying about the consequences.
A guarantee implies that the resource you guarantee will be available, and
ensuring this "will be" is not that easy.
So I repeat my question - how can you be sure that the X megabytes you
guarantee to some group won't be used by others in a way that leaves you
unable to reclaim them?
|
|
|
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6461 is a reply to message #6450] |
Mon, 18 September 2006 11:20   |
Balbir Singh
Messages: 491 Registered: August 2006
|
Senior Member |
|
|
Pavel Emelianov wrote:
> Balbir Singh wrote:
>
> [snip]
>
>> This approach has the following disadvantages
>> 1. Lets consider initialization - When we create 'n' groups
>> initially, we need
>> to spend O(n^2) time to assign guarantees.
>
> 1. Not guarantees - limits. If you do not need guarantees - assign
> overcommitted limits. Most OpenVZ users do so and nobody complains.
> 2. If you start n groups at once then limits are calculated in O(n)
> time, not O(n^2).
Yes, if you start them all at once. But if they are added and started
incrementally, it is O(n^2).
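
A rough way to see where the O(n^2) figure comes from: if every newly started group forces the limits of all groups present to be recomputed, the k-th start writes k limits, for n(n+1)/2 writes in total. The counter below only illustrates that arithmetic. It assumes a full recomputation on each start, which is an assumption of this sketch, and limit_updates is a hypothetical name.

```python
def limit_updates(n, incremental):
    """Count how many per-group limit values get written while starting
    n groups. Assumes (hypothetically) that each incremental start
    recomputes the limit of every group present at that moment."""
    if not incremental:
        return n  # one pass over all groups, started together
    return sum(range(1, n + 1))  # 1 + 2 + ... + n

for n in (10, 100):
    print(n, limit_updates(n, False), limit_updates(n, True))
```

For 100 groups that is 100 writes when they start together versus 5050 when they start one by one.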
>
>> 2. Every time a limit or a guarantee changes, we need to recalculate
>> guarantees
>> and ensure that the change will not break any guarantees
>
> The same.
>
>> 3. The same thing as stated above, when a resource group is created
>> or deleted
>>
>> This can lead to some instability; a change in one group propagates to
>> all other groups.
>
> Let me cite a part of your answer on my letter from 11.09.2006:
> "...
> xemul> I have a node with 1Gb of ram and 10 containers with 100Mb
> xemul> guarantee each. I want to start one more.
> xemul> What shall I do not to break guarantees?
>
> Don't start the new container or change the guarantees of the
> existing ones to accommodate this one ... It would be perfectly
> ok to have a container that does not care about guarantees to
> set their guarantee to 0 and set their limit to the desired value
> ..."
>
> The same for the limiting - either do not start new container, or
> recalculate limits to meet new requirements. You may also choose not to care
> about guarantees and create an overcommitted configuration.
>
> And one more thing. We've asked it many times and I ask it again -
> please, show us the other way for providing guarantee rather than
> limiting or reserving.
There are some other options, and I am sure Chandra will have
more.
1. Reclaim resources from other containers. This can be done well for
user pages, if we ensure that no container mlocks more
than its guaranteed share of memory.
2. Provide best-effort guarantees for non-reclaimable memory.
3. oom-kill a container, or a task within a resource group, that has
exceeded its guarantee while some other container is unable to meet its
guarantee.
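
Options 1 and 3 both start from the same question: which groups are currently above their guarantee? A minimal sketch of that selection step follows. The function name, the dictionary layout and the "largest excess first" ordering are illustrative assumptions, not something proposed in this thread or taken from any kernel code.

```python
def reclaim_candidates(usage, guarantee):
    """Return group names whose usage exceeds their guarantee, largest
    excess first. Both arguments map group name -> amount of memory.
    Groups missing from `guarantee` are treated as guarantee 0
    ("don't care"). The ordering is an illustrative choice only."""
    excess = {name: usage[name] - guarantee.get(name, 0) for name in usage}
    over = [name for name, e in excess.items() if e > 0]
    return sorted(over, key=lambda name: excess[name], reverse=True)

usage = {"A": 35, "B": 20, "C": 10, "D": 20}
guarantee = {"A": 10, "B": 20, "C": 30, "D": 5}
print(reclaim_candidates(usage, guarantee))
```

Here A and D are over their guarantees, B sits exactly at its guarantee, and C is the group that is not getting its share, so only A and D are returned.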
--
Balbir Singh,
Linux Technology Center,
IBM Software Labs
|
|
|
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6462 is a reply to message #6461] |
Mon, 18 September 2006 11:32   |
Pavel Emelianov
Messages: 1149 Registered: September 2006
|
Senior Member |
|
|
Balbir Singh wrote:
> Pavel Emelianov wrote:
>> Balbir Singh wrote:
>>
>> [snip]
>>
>>> This approach has the following disadvantages
>>> 1. Lets consider initialization - When we create 'n' groups
>>> initially, we need
>>> to spend O(n^2) time to assign guarantees.
>>
>> 1. Not guarantees - limits. If you do not need guarantees - assign
>> overcommitted limits. Most OpenVZ users do so and nobody complains.
>> 2. If you start n groups at once then limits are calculated in O(n)
>> time, not O(n^2).
>
> Yes.. if you start them at once, but if they are incrementally
> added and started it is O(n^2)
See my comment below.
>
>>
>>> 2. Every time a limit or a guarantee changes, we need to recalculate
>>> guarantees
>>> and ensure that the change will not break any guarantees
>>
>> The same.
>>
>>> 3. The same thing as stated above, when a resource group is created
>>> or deleted
>>>
>>> This can lead to some instability; a change in one group propagates to
>>> all other groups.
>>
>> Let me cite a part of your answer on my letter from 11.09.2006:
>> "...
>> xemul> I have a node with 1Gb of ram and 10 containers with 100Mb
>> xemul> guarantee each. I want to start one more.
>> xemul> What shall I do not to break guarantees?
>>
>> Don't start the new container or change the guarantees of the
>> existing ones to accommodate this one ... It would be perfectly
>> ok to have a container that does not care about guarantees to
>> set their guarantee to 0 and set their limit to the desired value
>> ..."
>>
>> The same for the limiting - either do not start new container, or
>> recalculate limits to meet new requirements. You may also choose not to care
>> about guarantees and create an overcommitted configuration.
Since I see no reply to this, I consider the "O(n^2) disadvantage"
irrelevant.
>>
>> And one more thing. We've asked it many times and I ask it again -
>> please, show us the other way for providing guarantee rather than
>> limiting or reserving.
>
> There are some other options, I am sure Chandra will probably have
> more.
>
> 1. Reclaim resources from other containers. This can be done well for
> user-pages, if we ensure that each container does not mlock more
> than its guaranteed share of memory.
We've already agreed to consider unreclaimable resources only.
If we hand out *only* reclaimable memory, then we can provide any
guarantee with a single page available for user space.
Unreclaimable resources are the interesting case.
> 2. Provide best effort guarantees for non-reclaimable memory
That's the question - how?
> 3. oom-kill a container or a task within a resource group that has
> exceeded its guarantee and some other container is unable to meet its
> guarantee
The OOM killer must start only when there is no other way to find memory.
It must be a last resort, not the regular solution.
|
|
|
Re: [ckrm-tech] [PATCH] BC: resource beancounters (v4) (added user memory) [message #6468 is a reply to message #6399] |
Mon, 18 September 2006 11:27   |
Balbir Singh
Messages: 491 Registered: August 2006
|
Senior Member |
|
|
Pavel Emelianov wrote:
> Kirill Korotaev wrote:
>
> [snip]
>>> I have a C program that computes limits to obtain desired guarantees
>>> in a single 'for (i = 0; i < n; i++)' loop for any given set of guarantees.
>>> With all error handling, beautiful output, nice formatting etc. it weighs
>>> only 60 lines.
>
> Look at http://wiki.openvz.org/Containers/Guarantees_for_resources
> I've described there in detail how a guarantee can be obtained by limiting.
>
> [snip]
>
>>> I do not 'do not like guarantee'. I'm just sure that there are two ways
>>> for providing guarantee (for unreclaimable resources):
>>> 1. reserving resource for group in advance
>>> 2. limit resource for others
>>> Reserving is worse as it is essentially limiting (you cut off 100Mb from
>>> 1Gb RAM thus limiting the other groups by 900Mb RAM), but this limiting
>>> is too strict - you _have_ to reserve less than RAM size. Limiting in
>>> run-time is more flexible (you may create an overcommitted BC if you
>>> want to) and leads to the same result - guarantee.
>> I think this deserves putting on Wiki.
>> It is very good clear point.
>
> This is also on the page I gave link at.
The program (calculate_limits()) listed on the website does not work for
the following case:
N = 2;
R = 100;
g[2] = {30, 30};
The output is -10 and -10 for the limits.
For
N = 3;
R = 100;
g[3] = {30, 30, 10};
I get -70, -70 and -110 as the limits.
Am I interpreting the parameters correctly? Or is the program broken?
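
One way to sanity-check any candidate output, independently of the wiki program: under the model used earlier in the thread, the memory left for group i when every other group sits at its limit is R minus the sum of the other limits, and that remainder must be at least g[i]. A limit also cannot be negative. The sketch below is a stand-alone check built on that assumption, not the calculate_limits() code, and check_limits is a hypothetical name.

```python
def check_limits(total, guarantees, limits):
    """Return a list of problems found in `limits`; an empty list means
    the limits honour every guarantee under the assumed model:
    left over for group i = total - sum of all other groups' limits."""
    problems = []
    limit_sum = sum(limits)
    for i, (g, u) in enumerate(zip(guarantees, limits)):
        if u < 0:
            problems.append(f"group {i}: negative limit {u}")
        left = total - (limit_sum - u)
        if left < g:
            problems.append(f"group {i}: only {left} left, guarantee is {g}")
    return problems

print(check_limits(100, [30, 30], [-10, -10]))  # the reported output
print(check_limits(100, [30, 30], [70, 70]))    # a hand-computed alternative
```

Under this model a limit of 70 for each of the two groups satisfies both guarantees (100 - 70 = 30), and 45, 45, 25 does the same for the three-group case, so the negative values would not be a valid answer however the parameters are meant to be read.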
--
Balbir Singh,
Linux Technology Center,
IBM Software Labs
|
|
|