LLVM Bugzilla is read-only and represents the historical archive of all LLVM issues filed before November 26, 2021. Use GitHub to submit LLVM bugs.

Bug 37303 - LLD + -fsanitize=address segfaults in scandir.
Status: RESOLVED FIXED
Alias: None
Product: lld
Classification: Unclassified
Component: ELF
Version: unspecified
Hardware: PC Linux
Importance: P normal
Assignee: Unassigned LLVM Bugs
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2018-05-01 04:51 PDT by Jean-Michaël Celerier
Modified: 2019-01-16 15:49 PST

See Also:
Fixed By Commit(s): r351396 r351401


Description Jean-Michaël Celerier 2018-05-01 04:51:09 PDT
Hi,

The following fails when compiled with -fsanitize=address -fuse-ld=lld:

bug.c: 

    #include <dirent.h>
    #include <fcntl.h>

    int filter(const struct dirent *dirent) { return 0; }
    int main() {
      struct dirent **namelist;
      scandir("/usr/lib", &namelist, filter, versionsort);
    }

    $ clang -D_GNU_SOURCE -O0 foo.c -fsanitize=address -fuse-ld=lld
    $ ./a.out


    AddressSanitizer:DEADLYSIGNAL
    =================================================================
    ==23603==ERROR: AddressSanitizer: SEGV on unknown address 0x0000fffd2f8a (pc 0x55bd8515b3b3 bp 0x7ffecc9d0cd0 sp 0x7ffecc9d0450 T0)
    ==23603==The signal is caused by a WRITE memory access.
        #0 0x55bd8515b3b2 in __interceptor_scandir.part.106 (/tmp/a.out+0xa73b2)
        #1 0x55bd85201c5e in main (/tmp/a.out+0x14dc5e)
        #2 0x7fb2e5c819a6 in __libc_start_main (/usr/lib/libc.so.6+0x219a6)
        #3 0x55bd85105029 in _start (/tmp/a.out+0x51029)

It works fine with -fuse-ld=gold and -fuse-ld=bfd.
Comment 1 George Rimar 2018-05-15 07:58:16 PDT
I was unable to reproduce with the latest sources of both clang and LLD.

Though I noticed that you have foo.c in the invocation,
but bug.c in the report.

Can you provide a --reproduce file, if the bug still happens for you?
Comment 2 Jean-Michaël Celerier 2018-05-16 11:03:44 PDT
For me this happens with clang 6.0 and lld 6.0. Is there an easy way to test the latest snapshots ? (e.g. without recompiling the whole stuff). For instance a docker image of some kind, a debian package ?
Comment 3 Jean-Michaël Celerier 2018-05-16 11:04:27 PDT
(The different file names are due to improper sanitization of my post, of course :p)
Comment 4 George Rimar 2018-05-17 01:57:48 PDT
(In reply to Jean-Michaël Celerier from comment #2)
> For me this happens with clang 6.0 and lld 6.0. Is there an easy way to test
> the latest snapshots ? (e.g. without recompiling the whole stuff). For
> instance a docker image of some kind, a debian package ?

Given that commits are made quite often, I do not think there is any way to test the *latest* sources without building them.
Comment 5 Vitaly Buka 2018-06-13 19:24:46 PDT
I can't reproduce this either.
Should we close it?
Comment 6 George Rimar 2018-12-12 02:52:56 PST
Closing this, based on the comments (works with trunk).
Comment 7 Peter Wu 2019-01-15 07:33:35 PST
Re-opening as I am still able to reproduce the issue with lld from trunk, but using an *older* compiler-rt.

Initially reproduced with lld 7.0.1-1 on Arch Linux x86_64; it indeed crashed in the scandir interceptor of compiler-rt. Whether the object file was compiled with GCC or clang did not make a difference.

-fuse-ld=gold and -fuse-ld=bfd (binutils 2.31.1-4) work, while -fuse-ld=lld produces broken binaries (as does -fuse-ld=/path/to/trunk/ld.lld).

# Broken with 7.0.1, works on trunk:
clang -fsanitize=address repro.o && ./a.out

# Broken with both 7.0.1 and trunk clang+lld (but using compiler-rt 7.0.1)
clang -fuse-ld=lld /usr/lib/clang/7.0.1/lib/linux/libclang_rt.asan-x86_64.a -ldl -pthread -lrt -lm repro.o && ./a.out

(Note: the above command with -fuse-ld=bfd seems to ignore the static library. For testing, use "clang -v" to see the link command, replace the libclang_rt.asan-x86_64.a and --dynamic-list=... paths, and change ld.lld to ld.bfd or ld.gold.)

Disassembly of __interceptor_scandir64.part.126 for a working (bfd-linked) binary:

   add85:       ff 15 05 7b 0c 00       callq  *0xc7b05(%rip)        # 175890 <_ZN14__interception11real_strlenE>
   add8b:       4c 8d 60 01             lea    0x1(%rax),%r12
   add8f:       48 89 d8                mov    %rbx,%rax
   add92:       4c 01 e0                add    %r12,%rax
   add95:       0f 82 5d 03 00 00       jb     ae0f8 <__interceptor_scandir64.part.126+0x398>
   add9b:       4c 89 e6                mov    %r12,%rsi
   add9e:       48 89 df                mov    %rbx,%rdi
   adda1:       e8 9a a0 f8 ff          callq  37e40 <_ZN6__asanL29QuickCheckForUnpoisonedRegionEmm>
   adda6:       84 c0                   test   %al,%al
   adda8:       0f 84 b2 02 00 00       je     ae060 <__interceptor_scandir64.part.126+0x300>
   addae:       66 66 66 66 64 48 8b    data16 data16 data16 data16 mov %fs:0x0,%rax
   addb5:       04 25 00 00 00 00 
   addbb:       48 8d 0d ee 98 f8 ff    lea    -0x76712(%rip),%rcx        # 376b0 <_ZL24wrapped_scandir64_comparPPKN11__sanitizer20__sanitizer_dirent64ES3_>

Disassembly of __interceptor_scandir64.part.126 for a broken (lld-linked) binary:

   e4d35:       ff 15 d5 40 09 00       callq  *0x940d5(%rip)        # 178e10 <_ZN14__interception11real_strlenE>
   e4d3b:       4c 8d 60 01             lea    0x1(%rax),%r12
   e4d3f:       48 89 d8                mov    %rbx,%rax
   e4d42:       4c 01 e0                add    %r12,%rax
   e4d45:       0f 82 5d 03 00 00       jb     e50a8 <__interceptor_scandir64.part.126+0x398>
   e4d4b:       4c 89 e6                mov    %r12,%rsi
   e4d4e:       48 89 df                mov    %rbx,%rdi
   e4d51:       e8 9a a0 f8 ff          callq  6edf0 <_ZN6__asanL29QuickCheckForUnpoisonedRegionEmm>
   e4d56:       84 c0                   test   %al,%al
   e4d58:       0f 84 b2 02 00 00       je     e5010 <__interceptor_scandir64.part.126+0x300>
   e4d5e:       66 66 66 64 48 8b 04    data16 data16 data16 mov %fs:0x0,%rax
   e4d65:       25 00 00 00 00 
   e4d6a:       00 48 8d                add    %cl,-0x73(%rax)
   e4d6d:       0d ee 98 f8 ff          or     $0xfff898ee,%eax

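The "data16" instruction looks wrong: it is one byte shorter than in the bfd-linked binary (12 bytes instead of 13), so the bytes that follow are decoded out of alignment. The resulting bogus "add %cl,-0x73(%rax)" and "or $0xfff898ee,%eax" instructions write through RAX and then clobber it, which causes a crash down the line.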
Comment 8 Peter Wu 2019-01-15 08:44:37 PST
Perhaps a relocation went wrong. Some annotated "objdump -d" output:

static THREADLOCAL scandir_filter_f scandir_filter;

INTERCEPTOR(int, scandir, char *dirp, __sanitizer_dirent ***namelist,
            scandir_filter_f filter, scandir_compar_f compar) {
  void *ctx;
  COMMON_INTERCEPTOR_ENTER(ctx, scandir, dirp, namelist, filter, compar);
  if (dirp) COMMON_INTERCEPTOR_READ_RANGE(ctx, dirp, REAL(strlen)(dirp) + 1);
   <elided unimportant context>
   76871:       e8 9a a0 f8 ff          callq  910 <_ZN6__asanL29QuickCheckForUnpoisonedRegionEmm>
   76876:       84 c0                   test   %al,%al
   76878:       0f 84 b2 02 00 00       je     76b30 <__interceptor_scandir64.part.126+0x300>

  scandir_filter = filter;
   7687e:       48 8d 3d [00 00 00 00]  lea    0x0(%rip),%rdi        # 76885 <__interceptor_scandir64.part.126+0x55>
   76885:       ff 15 [00 00 00 00]     callq  *0x0(%rip)        # 7688b <__interceptor_scandir64.part.126+0x5b>
   7688b:       48 8d 0d ee 98 f8 ff    lea    -0x76712(%rip),%rcx        # 180 <_ZL24wrapped_scandir64_comparPPKN11__sanitizer20__sanitizer_dirent64ES3_>

$ objdump -r /usr/lib/clang/7.0.1/lib/linux/libclang_rt.asan-x86_64.a
0000000000076881 R_X86_64_TLSLD    _ZL16scandir64_filter-0x0000000000000004
0000000000076887 R_X86_64_GOTPCRELX  __tls_get_addr-0x0000000000000004


Compare this to the same part of __interceptor_scandir64.part.126 in the compiler-rt 8.0.0 library, where __tls_get_addr is called directly:
   76bd1:       e8 3a 9d f8 ff          callq  910 <_ZN6__asanL29QuickCheckForUnpoisonedRegionEmm>
   76bd6:       84 c0                   test   %al,%al
   76bd8:       0f 84 b2 02 00 00       je     76e90 <__interceptor_scandir64.part.126+0x300>
  scandir_filter = filter;
   76bde:       48 8d 3d [00 00 00 00]  lea    0x0(%rip),%rdi        # 76be5 <__interceptor_scandir64.part.126+0x55>
   76be5:       e8 [00 00 00 00]        callq  76bea <__interceptor_scandir64.part.126+0x5a>
   76bea:       48 8d 0d 8f 95 f8 ff    lea    -0x76a71(%rip),%rcx        # 180 <_ZL24wrapped_scandir64_comparPPKN11__sanitizer20__sanitizer_dirent64ES3_>

$ objdump -r .../lib/clang/8.0.0/lib/linux/libclang_rt.asan-x86_64.a
0000000000076be1 R_X86_64_TLSLD    _ZL16scandir64_filter-0x0000000000000004
0000000000076be6 R_X86_64_PLT32    __tls_get_addr-0x0000000000000004
Comment 9 Peter Wu 2019-01-15 17:46:28 PST
I'm working on a patch now. The problem is that the linker optimizations for thread-local storage without PLT are not implemented in lld. Relevant reference:

https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf
Comment 10 Peter Wu 2019-01-16 05:41:09 PST
Proposed patch: https://reviews.llvm.org/D56779
Comment 11 Peter Wu 2019-01-16 15:38:57 PST
Fixed in r351396. Rui will request a cherry-pick for the 8.0 release branch.
Comment 12 Peter Wu 2019-01-16 15:49:39 PST
And merged as r351401 in the 8.0 branch :)