Mailing List Archive

[Bug 2838] exim binary crashes during testsuite with Bus Error on SPARC due to alignment issues
https://bugs.exim.org/show_bug.cgi?id=2838

--- Comment #2 from John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> ---
Aha, it seems to crash in pcre2:

(gdb) bt
#0 0xfff8000100d38704 in pcre2_general_context_create_8 () from /lib/sparc64-linux-gnu/libpcre2-8.so.0
#1 0x00000100000517a8 in main ()
(gdb)

--
You are receiving this mail because:
You are on the CC list for the bug.
--
## List details at https://lists.exim.org/mailman/listinfo/exim-dev Exim details at http://www.exim.org/ ##

--- Comment #1 from John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> ---
OK, I was able to build exim now using the build instructions from [1].

Bisecting led me to this commit:

commit 22ed7a5295f196fce32563f6e9c669110dd36f4d
Author: Jeremy Harris <jgh146exb@wizmail.org>
Date: Sun Sep 12 15:42:51 2021 +0100

pcre2

> [1] https://www.exim.org/exim-html-current/doc/html/spec_html/ch-building_and_installing_exim.html


--- Comment #3 from John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> ---
With debug symbols, the crash can be located in pcre2:

(gdb) bt
#0 pcre2_general_context_create_8 (private_malloc=0x1000004f680 <function_store_malloc>, private_free=0x1000004f650 <function_store_free>, memory_data=0x0) at src/pcre2_context.c:123
#1 0x00000100000517a8 in main ()
(gdb)

which is:
https://salsa.debian.org/debian/pcre2/-/blob/master/src/pcre2_context.c#L123

I'm going to file a bug report with pcre2.


--- Comment #4 from Jeremy Harris <jgh146exb@wizmail.org> ---
Fixed by exim commit 8f9402a038 - it seems that slightly older versions of
pcre2 than the one used for development of 4.95 do this nasty callback
of a null free().

Try a build with that (or the current HEAD)?


--- Comment #5 from John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> ---
I have reported this issue to pcre2 [1].

> [1] https://github.com/PhilipHazel/pcre2/issues/59

There were actually other exim bugs that relate to alignment issues in pcre2,
but these are non-public and I cannot access them:

> https://bugs.exim.org/show_bug.cgi?id=2247
> https://bugs.exim.org/show_bug.cgi?id=2357


--- Comment #6 from John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> ---
(In reply to Jeremy Harris from comment #4)
> Fixed by exim commit 8f9402a038 - it seems that slightly older versions of
> pcre2 than the one used for development of 4.95 do this nasty callback
> of a null free().
>
> Try a build with that (or the current HEAD)?

Just tried building HEAD, still crashes unfortunately:

glaubitz@gcc202:~/exim$ git describe
exim-4.95-54-g2ba76be6f
glaubitz@gcc202:~/exim$ src/build-Linux-sparc64/exim
Bus error
glaubitz@gcc202:~/exim$


--- Comment #7 from Jeremy Harris <jgh146exb@wizmail.org> ---
Stack for that? Also, please add "CFLAGS += -ggdb -O0" to the top of your
Local/Makefile - then "make distclean" and rebuild.


--- Comment #8 from John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> ---
(In reply to Jeremy Harris from comment #7)
> Stack for that? Also, please add "CFLAGS += -ggdb -O0" to the top of your
> Local/Makefile - then "make distclean" and rebuild.

It looks like the function pointer "private_malloc" is the problem, as it's not
at a 64-bit aligned address.

I have built with the suggested CFLAGS now and get this backtrace:

(gdb) bt
#0 pcre2_general_context_create_8 (private_malloc=0x10000064a44 <function_store_malloc>, private_free=0x10000064a9c <function_store_free>, memory_data=0x0) at src/pcre2_context.c:123
#1 0x0000010000064cc8 in pcre_init () at exim.c:128
#2 0x0000010000067ca4 in main (argc=1, cargv=0x7fefffff4f8) at exim.c:1747
(gdb)

(gdb) info frame
Stack level 0, frame at 0x7fefffbe960:
pc = 0xfff8000100d38704 in pcre2_general_context_create_8 (src/pcre2_context.c:123); saved pc = 0x10000064cc8
called by frame at 0x7fefffbea10
source language c.
Arglist at 0x7fefffbe960, args: private_malloc=0x10000064a44 <function_store_malloc>, private_free=0x10000064a9c <function_store_free>, memory_data=0x0
Locals at 0x7fefffbe960, Previous frame's sp in fp
Saved registers:
l0 at 0x7fefffbe960, l1 at 0x7fefffbe968, l2 at 0x7fefffbe970, l3 at 0x7fefffbe978,
l4 at 0x7fefffbe980, l5 at 0x7fefffbe988, l6 at 0x7fefffbe990, l7 at 0x7fefffbe998,
i0 at 0x7fefffbe9a0, i1 at 0x7fefffbe9a8, i2 at 0x7fefffbe9b0, i3 at 0x7fefffbe9b8,
i4 at 0x7fefffbe9c0, i5 at 0x7fefffbe9c8, fp at 0x7fefffbe9d0, i7 at 0x7fefffbe9d8
(gdb)


--- Comment #9 from Jeremy Harris <jgh146exb@wizmail.org> ---
Also, if you're willing to support the Exim project, please set up a SPARC
buildfarm animal: https://buildfarm.exim.org/cgi-bin/register-form.pl
We don't have one currently; one that auto-runs the testsuite would detect issues
like this earlier, especially if you monitor the buildfarm status.


--- Comment #10 from John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> ---
(In reply to Jeremy Harris from comment #9)
> Also, if you're willing to support the Exim project, please set up a SPARC
> buildfarm animal: https://buildfarm.exim.org/cgi-bin/register-form.pl
> We don't have one currently; one that auto-runs the testsuite would detect issues
> like this earlier, especially if you monitor the buildfarm status.

That would be possible. We're running CI for other projects on SPARC as well
such as the Free Pascal Compiler.

FWIW, Debian regularly builds and tests the latest release versions of exim on
multiple targets. This is how this issue actually was caught.


--- Comment #11 from Jeremy Harris <jgh146exb@wizmail.org> ---
You're saying the compiler has placed function_store_malloc() at an address
that it cannot legally call, for 64b SPARC? Or am I misunderstanding?


--- Comment #12 from John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> ---
(In reply to Jeremy Harris from comment #11)
> You're saying the compiler has placed function_store_malloc() at an address
> that it cannot legally call, for 64b SPARC? Or am I misunderstanding?

I'm not sure yet what the alignment issue is. I was just guessing.

I have not fully understood the code yet and I'm still looking where
"store_malloc()" is defined.


--- Comment #13 from John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> ---
Patching pcre_init() to use the system's alloc/free functions fixes the problem
for me:

diff --git a/src/src/exim.c b/src/src/exim.c
index 42db457c0..501fe1853 100644
--- a/src/src/exim.c
+++ b/src/src/exim.c
@@ -125,7 +125,7 @@ return yield;
static void
pcre_init(void)
{
-pcre_gen_ctx = pcre2_general_context_create(function_store_malloc, function_store_free, NULL);
+pcre_gen_ctx = pcre2_general_context_create(NULL, NULL, NULL);
pcre_cmp_ctx = pcre2_compile_context_create(pcre_gen_ctx);
pcre_mtc_ctx = pcre2_match_context_create(pcre_gen_ctx);
}


--- Comment #14 from Jeremy Harris <jgh146exb@wizmail.org> ---
... and eliminates Exim's memory accounting. Not recommended.
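
For readers following the thread, here is a minimal sketch of what memory accounting through these callbacks can look like. It is not Exim's code: bytes_outstanding, block_header, counting_malloc and counting_free are hypothetical names. Only the callback shapes, (size_t, void *) for allocation and (void *, void *) for release, follow what pcre2_general_context_create() expects. When NULL is passed for both, PCRE2 uses the system malloc() and free() directly, so a counter like the one below is never updated.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical accounting state; Exim keeps its real counters in store.c. */
static size_t bytes_outstanding = 0;

/* Header placed in front of each block. The union with max_align_t (C11)
   keeps the pointer handed to the caller suitably aligned for any type. */
typedef union {
    size_t      size;
    max_align_t align;
} block_header;

/* Allocation callback in the shape PCRE2 expects; tag is unused here. */
static void *counting_malloc(size_t size, void *tag)
{
    block_header *h;

    (void)tag;
    if (size > SIZE_MAX - sizeof *h)   /* refuse sizes that would wrap */
        return NULL;
    if (!(h = malloc(sizeof *h + size)))
        return NULL;
    h->size = size;
    bytes_outstanding += size;
    return h + 1;                      /* first byte after the header */
}

/* Release callback in the shape PCRE2 expects; a NULL block is ignored. */
static void counting_free(void *block, void *tag)
{
    block_header *h;

    (void)tag;
    if (!block)
        return;
    h = (block_header *)block - 1;     /* step back to the header */
    bytes_outstanding -= h->size;
    free(h);
}
```

The pair would be passed as the first two arguments of pcre2_general_context_create(); the NULL check in counting_free() is there because, per comment #4, some pcre2 versions call the free callback with a null pointer.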


--- Comment #15 from John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> ---
OK, I found the problem. The issue is that exim's own alloc functions use "int"
instead of "size_t" and cast between the two types.

The size of "int" is not necessarily the same as the size of "size_t" on a
given platform, so I think the proper fix would be to switch exim's own memory
management from int to size_t.

The following quick and dirty hack fixes the issue for me:

diff --git a/src/src/store.c b/src/src/store.c
index e4cd722c3..0564676c4 100644
--- a/src/src/store.c
+++ b/src/src/store.c
@@ -192,7 +192,7 @@ static const uschar * poolclass[NPOOLS] = {
#endif


-static void * internal_store_malloc(int, const char *, int);
+static void * internal_store_malloc(size_t, const char *, int);
static void internal_store_free(void *, const char *, int linenumber);


/******************************************************************************/
@@ -867,7 +867,7 @@ Returns: pointer to gotten store (panic on failure)
*/

static void *
-internal_store_malloc(int size, const char *func, int line)
+internal_store_malloc(size_t size, const char *func, int line)
{
void * yield;

@@ -876,17 +876,17 @@ if (size < 0 || size >= INT_MAX/2)
"bad memory allocation requested (%d bytes) at %s %d",
size, func, line);

-size += sizeof(int); /* space to store the size, used under debug */
+size += sizeof(size_t); /* space to store the size, used under debug */
if (size < 16) size = 16;

-if (!(yield = malloc((size_t)size)))
+if (!(yield = malloc(size)))
log_write(0, LOG_MAIN|LOG_PANIC_DIE, "failed to malloc %d bytes of memory: "
"called from line %d in %s", size, line, func);

#ifndef COMPILE_UTILITY
-DEBUG(D_any) *(int *)yield = size;
+DEBUG(D_any) *(size_t *)yield = size;
#endif
-yield = US yield + sizeof(int);
+yield = US yield + sizeof(size_t);

if ((nonpool_malloc += size) > max_nonpool_malloc)
max_nonpool_malloc = nonpool_malloc;
@@ -899,7 +899,7 @@ giving warnings. */
is not filled with zeros so as to catch problems. */

if (f.running_in_test_harness)
- memset(yield, 0xF0, (size_t)size - sizeof(int));
+ memset(yield, 0xF0, (size_t)size - sizeof(size_t));
 DEBUG(D_memory) debug_printf("--Malloc %6p %5d bytes\t%-20s %4d\tpool %5d nonpool %5d\n",
yield, size, func, line, pool_malloc, nonpool_malloc);
#endif /* COMPILE_UTILITY */
@@ -908,7 +908,7 @@ return yield;
}

void *
-store_malloc_3(int size, const char *func, int linenumber)
+store_malloc_3(size_t size, const char *func, int linenumber)
{
if (n_nonpool_blocks++ > max_nonpool_blocks)
max_nonpool_blocks = n_nonpool_blocks;
@@ -933,10 +933,10 @@ Returns: nothing
static void
internal_store_free(void * block, const char * func, int linenumber)
{
-uschar * p = US block - sizeof(int);
+uschar * p = US block - sizeof(size_t);
#ifndef COMPILE_UTILITY
-DEBUG(D_any) nonpool_malloc -= *(int *)p;
-DEBUG(D_memory) debug_printf("----Free %6p %5d bytes\t%-20s %4d\n", block, *(int *)p, func, linenumber);
+DEBUG(D_any) nonpool_malloc -= *(size_t *)p;
+DEBUG(D_memory) debug_printf("----Free %6p %5d bytes\t%-20s %4d\n", block, *(size_t *)p, func, linenumber);
#endif
free(p);
}
diff --git a/src/src/store.h b/src/src/store.h
index ccfa8f012..3e4240842 100644
--- a/src/src/store.h
+++ b/src/src/store.h
@@ -65,7 +65,7 @@ typedef void ** rmark;
extern BOOL store_extend_3(void *, BOOL, int, int, const char *, int);
extern void store_free_3(void *, const char *, int);
/* store_get_3 & store_get_perm_3 are in local_scan.h */
-extern void *store_malloc_3(int, const char *, int) ALLOC ALLOC_SIZE(1) WARN_UNUSED_RESULT;
+extern void *store_malloc_3(size_t, const char *, int) ALLOC ALLOC_SIZE(1) WARN_UNUSED_RESULT;
extern rmark store_mark_3(const char *, int);
extern void *store_newblock_3(void *, BOOL, int, int, const char *, int);
extern void store_release_above_3(void *, const char *, int);
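
A small sketch of why the width of the size header in the diff above matters for the pointer handed back to the caller. Assumptions, not taken from the Exim source: the helper names alloc_with_header, free_with_header and misalign8 are made up for illustration, and malloc() is assumed to return memory aligned to at least 8 bytes, as is usual on 64-bit targets. With a sizeof(int) header (4 bytes on common ABIs) the returned pointer sits 4 bytes past an 8-byte boundary. With a sizeof(size_t) header (8 bytes on LP64) it stays 8-byte aligned. On strict-alignment CPUs such as SPARC, an 8-byte store through the former pointer would be expected to raise a bus error.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocate size bytes behind a header of hdr bytes and
   return the address just past the header, in the same way that the diff's
   internal_store_malloc() returns "yield + sizeof(...)". */
static void *alloc_with_header(size_t size, size_t hdr)
{
    unsigned char *raw;

    if (size > SIZE_MAX - hdr)      /* refuse sizes that would wrap around */
        return NULL;
    raw = malloc(size + hdr);
    return raw ? raw + hdr : NULL;
}

/* Release a block obtained from alloc_with_header() with the same hdr. */
static void free_with_header(void *block, size_t hdr)
{
    if (block)
        free((unsigned char *)block - hdr);
}

/* Distance of p past the previous 8-byte boundary (0 means 8-byte aligned). */
static unsigned misalign8(const void *p)
{
    return (unsigned)((uintptr_t)p % 8);
}
```

Under the stated assumption, misalign8() on a block from alloc_with_header(n, sizeof(int)) yields sizeof(int) % 8, and it yields 0 only when the header width is a multiple of 8 bytes.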


Jeremy Harris <jgh146exb@wizmail.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Assignee|unallocated@exim.org |jgh146exb@wizmail.org
Status|NEW |ASSIGNED

--- Comment #16 from Jeremy Harris <jgh146exb@wizmail.org> ---
Yup, that sounds like a reasonable cause. Thanks for working on this.


--- Comment #17 from John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> ---
Created attachment 1406
--> https://bugs.exim.org/attachment.cgi?id=1406&action=edit
Suggested patch to fix bug 2838

Attaching my current suggestion to fix this issue.

Since I'm not an expert on the exim codebase, feel free to adjust this patch to
your needs or use a better approach.


--- Comment #18 from John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> ---
I also just verified that this patch fixes the segmentation fault of exim on
Alpha.


Jeremy Harris <jgh146exb@wizmail.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|ASSIGNED |WAIT_FIX_CONFIRMATION

--- Comment #19 from Jeremy Harris <jgh146exb@wizmail.org> ---
Commit d73b9f478a addresses this.


--- Comment #20 from John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de> ---
(In reply to Jeremy Harris from comment #19)
> Commit d73b9f478a addresses this.

I can confirm that the issue is now fixed for me on master and exim works
normally again on SPARC.

Thanks!


Jeremy Harris <jgh146exb@wizmail.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Resolution|--- |FIXED
Status|WAIT_FIX_CONFIRMATION |RESOLVED

--- Comment #21 from Jeremy Harris <jgh146exb@wizmail.org> ---
Thanks - closing as fixed
