Reported at https://github.com/rust-lang/rust/issues/72145 It looks like LLD/COFF doesn't emit properly aligned TLS sections. Test program: ```C++ #include <immintrin.h> #include <stdio.h> __declspec( thread ) char tb = 42; __declspec( thread ) char zb32[32]; int main() { printf("%p %p\n", &tb, zb32); *(__m256 *)zb32 = _mm256_set_ps(0, 0, 0, 0, 0, 0, 0, 0); printf("All good\n"); return 0; } ``` Using ld=link: > clang repro.cpp -march=sandybridge -O3 -o repro.exe -fuse-ld=link -g > .\repro.exe 0000024472806E40 0000024472806E60 All good Using ld=lld-link: > .\repro.exe 000001E5D2CE6D30 000001E5D2CE6D50 (crashes at this point) 0:000> .excr rax=0000000000000022 rbx=000001e5d2ce6e20 rcx=00000000ffffffff rdx=0000000000000001 rsi=000001e5d2ce6d10 rdi=000001e5d2cf1660 rip=00007ff6d5a01036 rsp=0000000c855bfb30 rbp=0000000000000000 r8=00007ff6d5a61bb0 r9=000001e5d2cef812 r10=0000000000000000 r11=0000000c855bfa20 r12=0000000000000000 r13=0000000000000000 r14=0000000000000000 r15=0000000000000000 iopl=0 nv up ei pl nz na pe nc cs=0033 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010202 repro!main+0x36: 00007ff6`d5a01036 c5fc298640000000 vmovaps ymmword ptr [rsi+40h],ymm0 ds:000001e5`d2ce6d50=00 The tls derived pointer is not properly aligned. The thread local itself and the store to it is marked as align 32: @"?zb32@@3PADA" = dso_local thread_local global [32 x i8] zeroinitializer, align 32, !dbg !14 store <8 x float> zeroinitializer, <8 x float>* bitcast ([32 x i8]* @"?zb32@@3PADA" to <8 x float>*), align 32, !dbg !45, !tbaa !46 Digging into it, as far as I can tell when ntdll is allocating the TLS slots they're 16 byte aligned. But looking at the dumpbin output for the ld=lld-link case we see: Dump of file repro.exe File Type: EXECUTABLE IMAGE Section contains the following TLS directory: 0000000140069000 Start of raw data 0000000140069060 End of raw data 0000000140060BC8 Address of index 0000000140059898 Address of callbacks 0 Size of zero fill 00000000 Characteristics (no align specified) Compared to the dumpbin output for the ld=link case: Dump of file repro.exe File Type: EXECUTABLE IMAGE Section contains the following TLS directory: 000000014008C000 Start of raw data 000000014008C26C End of raw data 000000014008307C Address of index 000000014006D8A8 Address of callbacks 0 Size of zero fill 00600000 Characteristics 32 byte align >lld-link --version LLD 10.0.0 >clang --version clang version 10.0.0 Target: x86_64-pc-windows-msvc Thread model: posix InstalledDir: C:\Program Files\LLVM\bin
Fixed in https://reviews.llvm.org/rG6a73d6564a3c7d0757692aa93bdef5be0c8f01a5