Mailing List Archive

[nova] Nova-scheduler: when are filters applied?
Sorry, I was too quick with the send button...

Hi *,

I posted my question in [1] a week ago, but no answer yet.

When does Nova apply its filters (RAM, CPU, etc.)?
Obviously at instance creation and during (live) migration of existing
instances. But what about existing instances that have been shut down
while, in the meantime, more instances have been launched on the same
hypervisor?

When you start one of the pre-existing instances, even with RAM
overcommitment you can end up with the OOM killer forcefully shutting
down instances once you hit the limits. Is there something I've been
missing, or maybe a bad configuration of my scheduler filters? Or is
it the admin's task to keep an eye on the load?

I'd appreciate any insights or pointers to something I've missed.

Regards,
Eugen

[1]
https://ask.openstack.org/en/question/115812/nova-scheduler-when-are-filters-applied/




_______________________________________________
Mailing list: http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Post to : openstack@lists.openstack.org
Unsubscribe : http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack
Re: [nova] Nova-scheduler: when are filters applied?
On 08/30/2018 08:54 AM, Eugen Block wrote:
> Hi Jay,
>
>> You need to set your ram_allocation_ratio nova.conf option to 1.0 if you're
>> running into OOM issues. This will prevent overcommit of memory on your
>> compute nodes.
>
> I understand that; the overcommitment works quite well most of the time.
>
> It just has been an issue twice when I booted an instance that had been shut
> down a while ago. In the meantime new instances were created on that
> hypervisor, and this old instance caused the OOM.
>
> I would expect that with a ratio of 1.0 I would experience the same issue,
> wouldn't I? As far as I understand the scheduler only checks at instance
> creation, not when booting existing instances. Is that a correct assumption?

The system keeps track of how much memory is available and how much has been
assigned to instances on each compute node. With a ratio of 1.0 it shouldn't
let you consume more RAM than is available even if the instances have been shut
down.
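Chris's accounting can be sketched in a few lines (a toy illustration of the idea, not Nova's actual code; the function name is made up):

```python
# Toy model of per-host RAM accounting (illustration only, not Nova code).
# The capacity the scheduler works with is physical RAM times the
# allocation ratio, and shut-down instances still count as assigned.

def can_place(total_mb, allocation_ratio, assigned_mb, request_mb):
    """Return True if a new instance would fit on this host."""
    capacity_mb = total_mb * allocation_ratio
    return assigned_mb + request_mb <= capacity_mb

# A fully assigned 24 GB host (some instances may be shut down):
print(can_place(24576, 1.0, 24576, 4096))  # ratio 1.0 -> False, no room
print(can_place(24576, 1.5, 24576, 4096))  # overcommit -> True
```

With a 1.0 ratio the host simply never accepts more than its physical RAM, regardless of instance power state.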

Chris

Re: [nova] Nova-scheduler: when are filters applied?
On 08/30/2018 10:54 AM, Eugen Block wrote:
> Hi Jay,
>
>> You need to set your ram_allocation_ratio nova.CONF option to 1.0 if
>> you're running into OOM issues. This will prevent overcommit of memory
>> on your compute nodes.
>
> I understand that, the overcommitment works quite well most of the time.
>
> It just has been an issue twice when I booted an instance that had been
> shutdown a while ago. In the meantime there were new instances created
> on that hypervisor, and this old instance caused the OOM.
>
> I would expect that with a ratio of 1.0 I would experience the same
> issue, wouldn't I? As far as I understand the scheduler only checks at
> instance creation, not when booting existing instances. Is that a
> correct assumption?

To echo what cfriesen said, if you set your allocation ratio to 1.0, the
system will not overcommit memory. Shut-down instances consume memory
from an inventory management perspective. If you don't want any danger
of an instance causing an OOM, you must set your ram_allocation_ratio to 1.0.

The scheduler doesn't really have anything to do with this.
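For reference, the option lives in nova.conf on each compute node; a minimal fragment (path and section per the standard layout):

```ini
# /etc/nova/nova.conf on the compute node
[DEFAULT]
# 1.0 = no RAM overcommit; shut-down instances still count as consumed.
ram_allocation_ratio = 1.0
```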

Best,
-jay

Re: [nova] Nova-scheduler: when are filters applied?
Hi,

> To echo what cfriesen said, if you set your allocation ratio to 1.0,
> the system will not overcommit memory. Shut down instances consume
> memory from an inventory management perspective. If you don't want
> any danger of an instance causing an OOM, you must set you
> ram_allocation_ratio to 1.0.

Let's forget about the scheduler; I'll try to make my question a bit clearer.

Let's say I have a ratio of 1.0 on my hypervisor, and let it have 24
GB of RAM available, ignoring the OS for a moment. Now I launch 6
instances, each with a flavor requesting 4 GB of RAM; that would leave
no space for further instances, right?
Then I shut down two instances (freeing 8 GB of RAM) and create a new
one with 8 GB of RAM; the compute node is full again (assuming all
instances actually consume all of their RAM).
Now I boot one of the shut-down instances again. The compute node would
require an additional 4 GB of RAM for that instance, and this would
lead to OOM, isn't that correct? So a ratio of 1.0 would not prevent
that from happening, would it?

Regards,
Eugen


Zitat von Jay Pipes <jaypipes@gmail.com>:

> On 08/30/2018 10:54 AM, Eugen Block wrote:
>> Hi Jay,
>>
>>> You need to set your ram_allocation_ratio nova.CONF option to 1.0
>>> if you're running into OOM issues. This will prevent overcommit of
>>> memory on your compute nodes.
>>
>> I understand that, the overcommitment works quite well most of the time.
>>
>> It just has been an issue twice when I booted an instance that had
>> been shutdown a while ago. In the meantime there were new instances
>> created on that hypervisor, and this old instance caused the OOM.
>>
>> I would expect that with a ratio of 1.0 I would experience the same
>> issue, wouldn't I? As far as I understand the scheduler only checks
>> at instance creation, not when booting existing instances. Is that
>> a correct assumption?
>
> To echo what cfriesen said, if you set your allocation ratio to 1.0,
> the system will not overcommit memory. Shut down instances consume
> memory from an inventory management perspective. If you don't want
> any danger of an instance causing an OOM, you must set you
> ram_allocation_ratio to 1.0.
>
> The scheduler doesn't really have anything to do with this.
>
> Best,
> -jay
>



Re: [nova] Nova-scheduler: when are filters applied?
On Mon, Sep 3, 2018 at 1:27 PM, Eugen Block <eblock@nde.ag> wrote:
> Hi,
>
>> To echo what cfriesen said, if you set your allocation ratio to 1.0,
>> the system will not overcommit memory. Shut down instances consume
>> memory from an inventory management perspective. If you don't want
>> any danger of an instance causing an OOM, you must set you
>> ram_allocation_ratio to 1.0.
>
> let's forget about the scheduler, I'll try to make my question a bit
> clearer.
>
> Let's say I have a ratio of 1.0 on my hypervisor, and let it have 24
> GB of RAM available, ignoring the OS for a moment. Now I launch 6
> instances, each with a flavor requesting 4 GB of RAM, that would
> leave no space for further instances, right?
> Then I shutdown two instances (freeing 8 GB RAM) and create a new one
> with 8 GB of RAM, the compute node is full again (assuming all
> instances actually consume all of their RAM).

When you shut down the two instances, the physical RAM will be
deallocated, BUT nova will not remove the resource allocation in
placement. Therefore your new instance, which requires 8 GB of RAM,
will not be placed on the host in question, because on that host all
24 GB of RAM are still allocated even if not physically consumed at
the moment.
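The scenario above can be replayed with a little placement-style bookkeeping (a simplified sketch, not the real placement service; the class and method names are invented for illustration):

```python
# Replay the 24 GB example: allocations survive a shutdown, so the
# 8 GB instance is never placed on this host (sketch, not real Nova).

class Host:
    def __init__(self, total_mb, allocation_ratio=1.0):
        self.capacity_mb = total_mb * allocation_ratio
        self.allocations = {}  # instance -> MB; kept across shutdowns

    def launch(self, name, mb):
        if sum(self.allocations.values()) + mb > self.capacity_mb:
            return False  # the scheduler would pick another host
        self.allocations[name] = mb
        return True

    def shutdown(self, name):
        # Physical RAM is freed, but the placement allocation remains.
        pass

host = Host(total_mb=24 * 1024)
for i in range(6):
    host.launch(f"vm{i}", 4 * 1024)   # six 4 GB instances fill the host

host.shutdown("vm0")
host.shutdown("vm1")                  # 8 GB physically free, still allocated
print(host.launch("big", 8 * 1024))   # False: host remains fully allocated
```

Because the 8 GB instance never lands here, restarting the two shut-down instances later cannot push the host over its physical RAM.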


> Now I boot one of the shutdown instances again, the compute node
> would require additional 4 GB of RAM for that instance, and this
> would lead to OOM, isn't that correct? So a ratio of 1.0 would not
> prevent that from happening, would it?

Nova did not place the instance requiring 8 GB of RAM on this host, as
described above. Therefore you can freely start up the two shut-down
instances consuming 4 GB each on this host later.

> Regards,
> Eugen
>
>
> Zitat von Jay Pipes <jaypipes@gmail.com>:
>
>> On 08/30/2018 10:54 AM, Eugen Block wrote:
>>> Hi Jay,
>>>
>>>> You need to set your ram_allocation_ratio nova.CONF option to 1.0
>>>> if you're running into OOM issues. This will prevent overcommit
>>>> of memory on your compute nodes.
>>>
>>> I understand that, the overcommitment works quite well most of the
>>> time.
>>>
>>> It just has been an issue twice when I booted an instance that had
>>> been shutdown a while ago. In the meantime there were new
>>> instances created on that hypervisor, and this old instance
>>> caused the OOM.
>>>
>>> I would expect that with a ratio of 1.0 I would experience the same
>>> issue, wouldn't I? As far as I understand the scheduler only
>>> checks at instance creation, not when booting existing instances.
>>> Is that a correct assumption?
>>
>> To echo what cfriesen said, if you set your allocation ratio to 1.0,
>> the system will not overcommit memory. Shut down instances consume
>> memory from an inventory management perspective. If you don't want
>> any danger of an instance causing an OOM, you must set you
>> ram_allocation_ratio to 1.0.
>>
>> The scheduler doesn't really have anything to do with this.
>>
>> Best,
>> -jay
>>
>
>
>


Re: [nova] Nova-scheduler: when are filters applied?
On 09/03/2018 07:27 AM, Eugen Block wrote:
> Hi,
>
>> To echo what cfriesen said, if you set your allocation ratio to 1.0,
>> the system will not overcommit memory. Shut down instances consume
>> memory from an inventory management perspective. If you don't want any
>> danger of an instance causing an OOM, you must set you
>> ram_allocation_ratio to 1.0.
>
> let's forget about the scheduler, I'll try to make my question a bit
> clearer.
>
> Let's say I have a ratio of 1.0 on my hypervisor, and let it have 24 GB
> of RAM available, ignoring the OS for a moment. Now I launch 6
> instances, each with a flavor requesting 4 GB of RAM, that would leave
> no space for further instances, right?
> Then I shutdown two instances (freeing 8 GB RAM) and create a new one
> with 8 GB of RAM, the compute node is full again (assuming all instances
> actually consume all of their RAM).
> Now I boot one of the shutdown instances again, the compute node would
> require additional 4 GB of RAM for that instance, and this would lead to
> OOM, isn't that correct? So a ratio of 1.0 would not prevent that from
> happening, would it?

I'm not entirely sure what you mean by "shut down an instance". Perhaps
this is what is leading to confusion. I consider "shutting down an
instance" to be stopping or suspending an instance.

As I mentioned below, shut-down instances consume memory from an
inventory management perspective. If you stop or suspend an instance on
your host, that instance is still consuming the same amount of memory in
the placement service. You will *not* be able to launch a new instance
on that same compute host *unless* your allocation ratio is >1.0.

Now, if by "shut down an instance", you actually mean "terminate an
instance" or possibly "shelve and then offload an instance", then that
is a different thing, and in both of *those* cases, resources are
released on the compute host.
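The distinction can be put into the same toy model (hypothetical sketch; the real bookkeeping lives in the placement service):

```python
# Which lifecycle operations release the placement allocation (sketch).
allocations = {"vm0": 4096}  # MB currently allocated on this host

def stop(name):
    # stop / suspend: the instance keeps its allocation
    pass

def terminate(name):
    # delete (or shelve + offload): the allocation is released
    allocations.pop(name, None)

stop("vm0")
print("vm0" in allocations)   # True: still consuming inventory
terminate("vm0")
print("vm0" in allocations)   # False: resources freed for new instances
```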

Best,
-jay

> Zitat von Jay Pipes <jaypipes@gmail.com>:
>
>> On 08/30/2018 10:54 AM, Eugen Block wrote:
>>> Hi Jay,
>>>
>>>> You need to set your ram_allocation_ratio nova.CONF option to 1.0 if
>>>> you're running into OOM issues. This will prevent overcommit of
>>>> memory on your compute nodes.
>>>
>>> I understand that, the overcommitment works quite well most of the time.
>>>
>>> It just has been an issue twice when I booted an instance that had
>>> been shutdown a while ago. In the meantime there were new instances
>>> created on that hypervisor, and this old instance caused the OOM.
>>>
>>> I would expect that with a ratio of 1.0 I would experience the same
>>> issue, wouldn't I? As far as I understand the scheduler only checks
>>> at instance creation, not when booting existing instances. Is that a
>>> correct assumption?
>>
>> To echo what cfriesen said, if you set your allocation ratio to 1.0,
>> the system will not overcommit memory. Shut down instances consume
>> memory from an inventory management perspective. If you don't want any
>> danger of an instance causing an OOM, you must set you
>> ram_allocation_ratio to 1.0.
>>
>> The scheduler doesn't really have anything to do with this.
>>
>> Best,
>> -jay
>>
>
>
>

Re: [nova] Nova-scheduler: when are filters applied?
Thanks, that is a very good explanation; I get it now.

Thank you very much for your answers!


Zitat von Balázs Gibizer <balazs.gibizer@ericsson.com>:

> On Mon, Sep 3, 2018 at 1:27 PM, Eugen Block <eblock@nde.ag> wrote:
>> Hi,
>>
>>> To echo what cfriesen said, if you set your allocation ratio to
>>> 1.0, the system will not overcommit memory. Shut down instances
>>> consume memory from an inventory management perspective. If you
>>> don't want any danger of an instance causing an OOM, you must set
>>> you ram_allocation_ratio to 1.0.
>>
>> let's forget about the scheduler, I'll try to make my question a
>> bit clearer.
>>
>> Let's say I have a ratio of 1.0 on my hypervisor, and let it have
>> 24 GB of RAM available, ignoring the OS for a moment. Now I launch
>> 6 instances, each with a flavor requesting 4 GB of RAM, that would
>> leave no space for further instances, right?
>> Then I shutdown two instances (freeing 8 GB RAM) and create a new
>> one with 8 GB of RAM, the compute node is full again (assuming all
>> instances actually consume all of their RAM).
>
> When you shutdown the two instances the phyisical RAM will be
> deallocated BUT nova will not remove the resource allocation in
> placement. Therefore your new instance which requires 8GB RAM will
> not be placed to the host in question because on that host all the
> 24G RAM is still allocated even if physically not consumed at the
> moment.
>
>
>> Now I boot one of the shutdown instances again, the compute node
>> would require additional 4 GB of RAM for that instance, and this
>> would lead to OOM, isn't that correct? So a ratio of 1.0 would not
>> prevent that from happening, would it?
>
> Nova did not place the instance require 8G RAM to this host above.
> Therefore you can freely start up the two 4G consuming instances on
> this host later.
>
>> Regards,
>> Eugen
>>
>>
>> Zitat von Jay Pipes <jaypipes@gmail.com>:
>>
>>> On 08/30/2018 10:54 AM, Eugen Block wrote:
>>>> Hi Jay,
>>>>
>>>>> You need to set your ram_allocation_ratio nova.CONF option to
>>>>> 1.0 if you're running into OOM issues. This will prevent
>>>>> overcommit of memory on your compute nodes.
>>>>
>>>> I understand that, the overcommitment works quite well most of the time.
>>>>
>>>> It just has been an issue twice when I booted an instance that
>>>> had been shutdown a while ago. In the meantime there were new
>>>> instances created on that hypervisor, and this old instance
>>>> caused the OOM.
>>>>
>>>> I would expect that with a ratio of 1.0 I would experience the
>>>> same issue, wouldn't I? As far as I understand the scheduler
>>>> only checks at instance creation, not when booting existing
>>>> instances. Is that a correct assumption?
>>>
>>> To echo what cfriesen said, if you set your allocation ratio to
>>> 1.0, the system will not overcommit memory. Shut down instances
>>> consume memory from an inventory management perspective. If you
>>> don't want any danger of an instance causing an OOM, you must set
>>> you ram_allocation_ratio to 1.0.
>>>
>>> The scheduler doesn't really have anything to do with this.
>>>
>>> Best,
>>> -jay
>>>
>>
>>
>>



