The segfault was not caused solely by a violation of the strict aliasing rule, as the problem persisted even with the -fno-strict-aliasing flag.
It was indeed an unaligned memory access, but it is not as simple as that: modern processors generally allow unaligned memory access, and nowadays there is not even much overhead. I did some benchmarking and didn't observe a big difference between aligned and unaligned reads on my Intel(R) Xeon(R) CPU E5-2680 v2 @ 2.80GHz. There are also some very similar (and more or less recent) results on the web.
My problem was that -O3 mode enables the -ftree-vectorize flag, so my for loop was vectorized (as I could see using the -ftree-vectorizer-verbose flag). And (AFAIU) the vectorized code used aligned SIMD load/store instructions, which, unlike ordinary scalar accesses, fault on a misaligned address, so there was a runtime exception.
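For illustration, here is a minimal sketch of the usual way to avoid this kind of crash, assuming the data sits in a byte buffer with no alignment guarantee. The names load_u64 and sum_u64 are hypothetical and are not taken from the original code. Instead of casting the byte pointer to uint64_t *, each value is copied out with memcpy, so the compiler cannot assume any alignment, even when it vectorizes the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a uint64_t from an address that may be misaligned.
 * memcpy makes no alignment assumption; optimizing compilers usually lower
 * this to a single unaligned load. */
static uint64_t load_u64(const unsigned char *p)
{
    uint64_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Hypothetical example: sum `count` 64-bit words starting at `bytes`.
 * `bytes` does not have to be 8-byte aligned. */
uint64_t sum_u64(const unsigned char *bytes, size_t count)
{
    assert(bytes != NULL);

    uint64_t sum = 0;
    for (size_t i = 0; i < count; ++i)
        sum += load_u64(bytes + i * sizeof(uint64_t));
    return sum;
}
```

Calling sum_u64(buf + 1, n) on an unsigned char buffer is well defined, whereas dereferencing (const uint64_t *)(buf + 1) is undefined behavior that only happens to work until the compiler picks instructions that require alignment.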
This article helped me a lot in understanding the theory, though it seems that today unaligned memory access is not as harmful as it used to be. It is still tricky, though.