Mailing List Archive

gcc optimization breaks NumPy?
Hey folks,

I was getting consistent but inexplicable segmentation faults with the
following code, using NumPy's C API to move data between Python and C
(for sparse matrix multiplication; the data structure is from Meschach,
in case that's relevant). I found the problem to be caused by gcc's
optimization options.

static PyObject * mul(PyObject *self, PyObject *args) {

...Snip declarations...

...Snip working code. Function ends with:

return MakeFromMes(product);
}

which is defined as:

PyObject * MakeFromMes(SPMAT *in) {
int i, j, m, n, nzs, count, pos, therow, thecol;
int dummy[0];
double elem;
PyArrayObject *pr, *ir, *jc;

m = in->m; n = in->n; nzs = 0;

/* Count the number of nonzero elements of in
ISZERO is a macro to test for "good enough" floating point 0 */
for(i = 0; i < m; i++) {
for(j = 0; j < n; j++) {
if(!ISZERO(sp_get_val(in, i, j))) nzs++;
}
}

dummy[0] = nzs;

if(!( (pr = PyArray_FromDims(1, dummy, PyArray_DOUBLE)) &&
(ir = PyArray_FromDims(1, dummy, PyArray_INT)) &&
(jc = PyArray_FromDims(1, dummy, PyArray_INT)) )) {
fprintf(stderr, "MakeFromMes: Could not create output arrays.\n");
return Py_BuildValue("O", Py_None);
}

count = pos = 0;

/* Fill column-wise */
for(j = 0; j < n; j++) {
for(i = 0; i < m; i++) {
elem = sp_get_val(in, i, j);
if(!ISZERO(elem)) {
therow = pos % m;
thecol = (int)(pos / m);
*(double *)(pr->data + count * pr->strides[0]) = elem;
*(int *)(ir->data + count * ir->strides[0]) = therow;
*(int *)(jc->data + count * jc->strides[0]) = thecol;
count++;
}
pos++;
}
}
return Py_BuildValue("OOO", pr, ir, jc);
}

The segmentation fault occurred immediately after MakeFromMes() returned.
Were I to change the mul() function to end with

PyObject *out = MakeFromMes(prod);
printf("Hello.");
return out;
}

the error would occur even before execution passed to the printf.

The problem occurs only when compiling with an -O option; enabling
individual -f optimization options does not trigger the seg fault. So my
question is -- because I've let gcc optimize other extensions to Python
I've written without incident -- what was it here that caused this
behavior? I'd appreciate any ideas and speculations, because I'm still
baffled by this, though the problem seems to be solved.

Thanks,
John


Sent via Deja.com http://www.deja.com/
Share what you know. Learn what you don't.
gcc optimization breaks NumPy? [ In reply to ]
On Mon, 23 Aug 1999 20:27:34 GMT, John Fisher <jfisher@are.berkeley.edu> wrote:
>Hey folks,
>
>I was getting consistent but inexplicable segmentation faults with the
>following code, using NumPy's C API to move data between Python and C
>(for sparse matrix multiplication; the data structure is from Meschach,
>in case that's relevant). I found the problem to be caused by gcc's
>optimization options.
>
>static PyObject * mul(PyObject *self, PyObject *args) {
>
> ...Snip declarations...
>
> ...Snip working code. Function ends with:
>
> return MakeFromMes(product);
>}
>
>which is defined as:
>
>PyObject * MakeFromMes(SPMAT *in) {
> int i, j, m, n, nzs, count, pos, therow, thecol;
> int dummy[0];

Ouch. This is an ANSI C constraint violation that requires a diagnostic. ANSI C
does not support zero size arrays. This code is brain-damaged.

> double elem;
> PyArrayObject *pr, *ir, *jc;
>
> m = in->m; n = in->n; nzs = 0;
>
> /* Count the number of nonzero elements of in
> ISZERO is a macro to test for "good enough" floating point 0 */
> for(i = 0; i < m; i++) {
> for(j = 0; j < n; j++) {
> if(!ISZERO(sp_get_val(in, i, j))) nzs++;
> }
> }
>
> dummy[0] = nzs;

Ouch. What does this mean? Assuming that the compiler accepts the
zero length array extension (which gcc does, by default) it's an access beyond
the ``end'' of the zero-element array.

The code seems to depend on the order of allocation of auto objects in the
stack frame and on some dubious aliasing between a pseudo array object and
other auto vars.

I'm afraid that someone will have to rewrite this code in the C language,
or think of a more clever hack to achieve whatever aliasing trick this is
supposed to do in a somewhat more portable manner that is more resilient
against optimization, such as maybe using a union to overlap a non-zero
length array with the other values.

>The problem occurs only with an -O option in compilation. Other
>specific -f optimization options do not cause the seg fault. So my

It could be that optimization causes some of the auto vars not to be given
actual storage, but to be placed in registers instead. Declaring these variables as
volatile will probably prevent GCC from applying these optimizations.
Certainly, it's probably safest to use volatile when doing obtuse aliasing like
this, in case the compiler doesn't realize what is going on.

>question is -- because I've let gcc optimize other extensions to Python
>I've written without incident -- what was it here that caused this
>behavior? I'd appreciate any ideas and speculations, because I'm still
>baffled by this, though the problem seems to be solved.

The behavior of your code is undefined. This means that a conforming C
implementation can do anything it wants. For example, bring up a friendly game
of Tetris when the offending function is called. Or make demons fly out
of your nose. :)
gcc optimization breaks NumPy? [ In reply to ]
John Fisher <jfisher@are.berkeley.edu> wrote:
>I was getting consistent but inexplicable segmentation faults with the
>following code, using NumPy's C API to move data between Python and C
>(for sparse matrix multiplication; the data structure is from Meschach,
>in case that's relevant). I found the problem to be caused by gcc's
>optimization options.

Which version of gcc? With the latest versions the strict-aliasing
option is known to break some non-ANSI code.


Neil
gcc optimization breaks NumPy? [ In reply to ]
John Fisher <jfisher@are.berkeley.edu> wrote:
> I was getting consistent but inexplicable segmentation faults with the
> following code, using NumPy's C API to move data between Python and C
> (for sparse matrix multiplication; the data structure is from Meschach,
> in case that's relevant). I found the problem to be caused by gcc's
> optimization options.

well, there's also a little strange thing in that
code of yours:

> PyObject * MakeFromMes(SPMAT *in) {
> int i, j, m, n, nzs, count, pos, therow, thecol;
> int dummy[0];

(I didn't even know you could have arrays
with zero elements...)

> double elem;
> PyArrayObject *pr, *ir, *jc;
>
> m = in->m; n = in->n; nzs = 0;
>
> /* Count the number of nonzero elements of in
> ISZERO is a macro to test for "good enough" floating point 0 */
> for(i = 0; i < m; i++) {
> for(j = 0; j < n; j++) {
> if(!ISZERO(sp_get_val(in, i, j))) nzs++;
> }
> }
>
> dummy[0] = nzs;

but here you assume it has one element more than
you requested...

depending on the stack layout, this may smash one
of the PyArrayObject pointers. looks like a great
way to get a segmentation fault ;-)

</F>
gcc optimization breaks NumPy? [ In reply to ]
In article <slrn7s3e59.8pd.kaz@ashi.FootPrints.net>,
kaz@ashi.FootPrints.net (Kaz Kylheku) wrote:

> > int dummy[0];
>
> Ouch. This is an ANSI C constraint violation that requires a diagnostic. ANSI C
> does not support zero size arrays. This code is brain-damaged.

Eep. Indeed. Unintentionally so, and I'm rather embarrassed that I
skimmed over the code without noticing it. That was, of course, the
problem; declaring dummy as a length 1 array works fine. Thanks for the
quick reply.

John


