Mailing List Archive

[gnocchi][aodh] Unable to trigger aggregate alarms
This is on a Newton Packstack.

I try to trigger alarms based on average cpu_util of a group of
instances. *Problem:* The alarm perpetually remains in state
"insufficient data".

Ceilometer is configured to use Gnocchi and the medium archive policy
(which stores data once a minute).  The intervals in pipeline.yaml are
set to 60.

I run two instances with high CPU usage. Both have a metadata item
"metering.server_group=hicpu". The alarm uses a query
"server_group==hicpu", has a granularity of 60 seconds and evaluation
periods set to 1. I expect it to be in state /alarm/ or /ok/ after less
than 2 minutes.

From Gnocchi, I can retrieve measures, both of the two individual
instances and of aggregate measures.

*Why "insufficient data"? How can I find out what's going on in Aodh's
mind?* More info below. Thanks.

Bernd Bausch

My alarm:

$ openstack alarm show cpuhigh-agg
+---------------------------+--------------------------------------------------+
| Field                     | Value                                            |
+---------------------------+--------------------------------------------------+
| aggregation_method        | sum                                              |
| alarm_actions             | [u'http://localhost:1234']                       |
| alarm_id                  | 6adb333a-b306-470d-b673-2c8e72c7a468             |
| comparison_operator       | gt                                               |
| description               | gnocchi_aggregation_by_resources_threshold alarm |
|                           | rule                                             |
| enabled                   | True                                             |
| evaluation_periods        | 1                                                |
| granularity               | 60                                               |
| insufficient_data_actions | []                                               |
| metric                    | cpu_util                                         |
| name                      | cpuhigh-agg                                      |
| ok_actions                | [u'http://localhost:1234']                       |
| project_id                | 55a05c4f3908490ca2419591837575ba                 |
| query                     | {"and": [{"=": {"created_by_project_id":         |
|                           | "55a05c4f3908490ca2419591837575ba"}}, {"=":      |
|                           | {"server_group": "hicpu"}}]}                     |
| repeat_actions            | False                                            |
| resource_type             | instance                                         |
| severity                  | low                                              |
| state                     | insufficient data                                |
| state_timestamp           | 2018-07-19T11:05:38.098000                       |
| threshold                 | 80.0                                             |
| time_constraints          | []                                               |
| timestamp                 | 2018-07-19T11:05:38.098000                       |
| type                      | gnocchi_aggregation_by_resources_threshold       |
| user_id                   | 96ce6a7200a54c79add0cc27ded03422                 |
+---------------------------+--------------------------------------------------+

My instances look like this:

$ openstack server show cpu-user1
+--------------------------------------+---------------------------------------+
| Field                                | Value                                 |
+--------------------------------------+---------------------------------------+
...
| project_id                           | 55a05c4f3908490ca2419591837575ba      |
| properties                           | metering.server_group='hicpu'         |
| security_groups                      | [{u'name': u'default'}, {u'name':     |
|                                      | u'ssh'}]                              |
| status                               | ACTIVE                                |
...
+--------------------------------------+---------------------------------------+

Gnocchi contains enough data, I would think:

$ gnocchi measures aggregation -m cpu_util --query server_group=hicpu \
    --aggregation sum --resource-type instance
+---------------------------+-------------+---------------+
| timestamp                 | granularity |         value |
+---------------------------+-------------+---------------+
| 2018-07-19T09:00:00+00:00 |      3600.0 | 676.454821872 |
| 2018-07-19T10:00:00+00:00 |      3600.0 | 927.148462196 |
| 2018-07-19T09:46:00+00:00 |        60.0 | 79.0149064873 |
| 2018-07-19T09:47:00+00:00 |        60.0 | 54.6575832468 |
| 2018-07-19T09:48:00+00:00 |        60.0 | 46.0457056053 |
| 2018-07-19T09:49:00+00:00 |        60.0 | 52.5139041993 |
| 2018-07-19T09:50:00+00:00 |        60.0 | 42.7994058262 |
| 2018-07-19T09:51:00+00:00 |        60.0 | 40.0215359957 |
...
Re: [gnocchi][aodh] Unable to trigger aggregate alarms
It worked the last time I tried, but I now see the same issue with the
Ceilometer master version.

On Fri, Jul 20, 2018 at 5:54 PM Bernd Bausch <berndbausch@gmail.com> wrote:

> This is on a Newton Packstack.
>
> I try to trigger alarms based on average cpu_util of a group of instances.
> *Problem:* The alarm perpetually remains in state "insufficient data".
Re: [gnocchi][aodh] Unable to trigger aggregate alarms
Additional info:

I found /var/log/aodh/evaluator.log. In there, each time Aodh evaluates
alarm conditions, it issues this message:

pruned statistics to 0

This message comes from .../aodh/evaluator/gnocchi.py. I don't understand
the logic of the code, in particular why I end up with 0 statistics, but
my guess is that this is what causes "insufficient data". At least I have
confirmation that Aodh uses Gnocchi to get measures.
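
To make the guess above concrete, here is a minimal Python sketch of how
an evaluator could arrive at "insufficient data" once its statistics are
pruned to 0. It is not Aodh's actual code: the function names, the
(timestamp, granularity, value) tuple layout and the "gt" comparison are
assumptions for illustration, and the sample rows are the ones shown in
the Gnocchi output of the original post.

```python
# Illustrative sketch only; names and data layout are assumptions,
# not taken from aodh/evaluator/gnocchi.py.

SAMPLE = [
    ("2018-07-19T09:00:00+00:00", 3600.0, 676.454821872),
    ("2018-07-19T10:00:00+00:00", 3600.0, 927.148462196),
    ("2018-07-19T09:46:00+00:00", 60.0, 79.0149064873),
    ("2018-07-19T09:47:00+00:00", 60.0, 54.6575832468),
    ("2018-07-19T09:48:00+00:00", 60.0, 46.0457056053),
    ("2018-07-19T09:49:00+00:00", 60.0, 52.5139041993),
    ("2018-07-19T09:50:00+00:00", 60.0, 42.7994058262),
    ("2018-07-19T09:51:00+00:00", 60.0, 40.0215359957),
]


def prune(measures, granularity, evaluation_periods):
    """Keep values at the alarm's granularity, then the most recent N."""
    values = [value for _ts, gran, value in measures if gran == granularity]
    return values[-evaluation_periods:]


def evaluate(measures, granularity=60, evaluation_periods=1, threshold=80.0):
    """Return the alarm state for a 'gt' comparison."""
    values = prune(measures, granularity, evaluation_periods)
    if len(values) < evaluation_periods:
        # Nothing left after pruning: no decision is possible.
        return "insufficient data"
    return "alarm" if all(v > threshold for v in values) else "ok"


if __name__ == "__main__":
    print(evaluate(SAMPLE))  # data visible to the evaluator
    print(evaluate([]))      # query returned no measures at all
```

Under these assumptions, the rows shown by the gnocchi command would be
enough for a decision, so a count of 0 suggests that the evaluator's own
query returns nothing, not that Gnocchi lacks data.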

I tried the autoscaling example in Red Hat's documentation:
https://access.redhat.com/documentation/en-us/red_hat_openstack_platform/10/html/manual_installation_procedures/sect-ceilometer-gnocchi-backend
Same result: the alarm remains at "insufficient data". Assuming that the
documented code works, I guess something is wrong with my configuration.
But what?

Bernd Bausch


On 7/20/2018 5:45 PM, Bernd Bausch wrote:
>
> This is on a Newton Packstack.
>
> I try to trigger alarms based on average cpu_util of a group of
> instances. *Problem:* The alarm perpetually remains in state
> "insufficient data".
Re: [gnocchi][aodh] Unable to trigger aggregate alarms
My problem is related to
https://docs.openstack.org/releasenotes/aodh/newton.html#bug-fixes. The
default value of Aodh's config variable /gnocchi_external_project_owner/
is /service/, but in Newton-based Packstack, the service project is
named /services/ (plural).
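
A small Python sketch of why such a name mismatch could leave the alarm
without data. The stored query is the one from the alarm output in the
original post. Everything else is an assumption for illustration: the
matcher handles only "and" and "=", the services project ID is a
placeholder, and the idea that Ceilometer-created resources carry the
services project as created_by_project_id is my reading of the release
note, not something verified here.

```python
# Illustrative sketch; not Gnocchi's or Aodh's filter implementation.

USER_PROJECT = "55a05c4f3908490ca2419591837575ba"
SERVICES_PROJECT = "hypothetical-services-project-id"  # placeholder


def matches(query, resource):
    """Evaluate a tiny subset of the filter syntax: "and" and "="."""
    (op, arg), = query.items()
    if op == "and":
        return all(matches(sub, resource) for sub in arg)
    if op == "=":
        (field, expected), = arg.items()
        return resource.get(field) == expected
    raise ValueError("unsupported operator: %s" % op)


# Query as stored in the alarm (see "openstack alarm show").
stored_query = {"and": [
    {"=": {"created_by_project_id": USER_PROJECT}},
    {"=": {"server_group": "hicpu"}},
]}

# Assumed shape of an instance resource created by Ceilometer.
resource = {
    "project_id": USER_PROJECT,
    "created_by_project_id": SERVICES_PROJECT,
    "server_group": "hicpu",
}

if __name__ == "__main__":
    print(matches(stored_query, resource))  # False
```

If the resources were indeed created by a different project than the one
named in the stored query, no resource matches and the aggregation is
empty, which would fit the "pruned statistics to 0" message.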

The fix: add /gnocchi_external_project_owner = services/ to aodh.conf.
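
For scripted setups, the same change could be applied with Python's
standard configparser, as sketched below. The section name "api" is an
assumption on my part (I believe the option is registered there, but
check the configuration reference for your release), and the helper name
is made up. Note that configparser drops comments when it rewrites a
file, so work on a copy if the comments matter. The Aodh services would
presumably need a restart afterwards.

```python
import configparser
import os
import tempfile


def set_external_project_owner(path, owner="services", section="api"):
    """Set gnocchi_external_project_owner in an aodh.conf-style file.

    The "api" section is an assumption; verify it for your release.
    """
    # strict=False tolerates repeated keys, which oslo.config files may use.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read(path)
    if section != "DEFAULT" and not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "gnocchi_external_project_owner", owner)
    with open(path, "w") as handle:
        parser.write(handle)


if __name__ == "__main__":
    # Demonstrate on a throwaway file instead of /etc/aodh/aodh.conf.
    fd, demo_path = tempfile.mkstemp(suffix=".conf")
    os.close(fd)
    with open(demo_path, "w") as handle:
        handle.write("[api]\nport = 8042\n")
    set_external_project_owner(demo_path)
    with open(demo_path) as handle:
        print(handle.read())
    os.remove(demo_path)
```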

Bernd Bausch

On 7/20/2018 5:45 PM, Bernd Bausch wrote:
>
> This is on a Newton Packstack.
>
> I try to trigger alarms based on average cpu_util of a group of
> instances. *Problem:* The alarm perpetually remains in state
> "insufficient data".