MCLinker - the final toolchain frontier
Jörg Sonnenberger
joerg@NetBSD.org
Naples, April 06, 2013
BSD Day 2013
Overview
- Introduction
- Architecture
- Performance
- Implementation status
- Future work
Introduction
- Machine Code Linker complements the MC layer of LLVM
- Created by Luba Tang from MediaTek in 2011
- Uses the same BSD-style license as LLVM
Architecture: High-level view
- Build input tree
- Build fragment reference graph
- Layout sections, relocate and write output
- GNU ld: mixes all three steps
- gold: merges the first two phases
Build the input tree
- Goal: High-level intermediate representation
- Based on command line
- ...and file system content
- Deals with positional arguments (--start-group, --as-needed)
- Nesting: linker archives contain objects
- Typed objects: object files, linker archives, shared libraries
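The input tree described above can be sketched in a few lines of Python. This is only an illustration, not MCLinker's actual data structure: the names `Input`, `Group`, `classify` and `build_input_tree` are hypothetical, the file type is guessed from the file name instead of the file content, and archive members are not expanded.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Input:
    path: str
    kind: str        # "object", "archive" or "shared"
    as_needed: bool  # positional attribute in effect when the file was named


@dataclass
class Group:
    members: List[object] = field(default_factory=list)


def classify(path):
    # Assumption: decide the type from the name only; a real linker
    # inspects the file content.
    if path.endswith(".a"):
        return "archive"
    if path.endswith(".so") or ".so." in path:
        return "shared"
    return "object"


def build_input_tree(argv):
    root = Group()
    stack = [root]       # innermost open group is on top
    as_needed = False    # positional state, applies to later files only
    for arg in argv:
        if arg == "--start-group":
            group = Group()
            stack[-1].members.append(group)
            stack.append(group)
        elif arg == "--end-group":
            if len(stack) == 1:
                raise ValueError("--end-group without --start-group")
            stack.pop()
        elif arg == "--as-needed":
            as_needed = True
        elif arg == "--no-as-needed":
            as_needed = False
        else:
            stack[-1].members.append(Input(arg, classify(arg), as_needed))
    if len(stack) != 1:
        raise ValueError("unterminated --start-group")
    return root
```

Keeping the positional flags on each node means later phases never have to look at the command line again.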
Build fragment reference graph
- Goal: symbol resolution
- Build a graph with sections as nodes, symbol references as edges
- Traverse input tree and look for files
- If a file is requested OR provides a missing definition
- ...process sections and symbol table
- Linker groups: use stack, push when hitting start
- ...repeat from start as long as new undefined references occur
- Optimize for cache locality
- Place symbol attributes and initial part of name in same cache line
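The resolution loop for linker groups can be sketched as follows. This is a simplified model under stated assumptions: objects are reduced to sets of defined and needed symbol names, the names `Obj` and `Resolver` are hypothetical, and the group is rescanned for as long as a pass pulls in another archive member.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Obj:
    name: str
    defines: FrozenSet[str]
    needs: FrozenSet[str]


class Resolver:
    def __init__(self):
        self.defined = {}       # symbol -> name of the defining object
        self.undefined = set()  # symbols referenced but not yet defined
        self.linked = []        # objects that take part in the link, in order

    def add_object(self, obj):
        self.linked.append(obj.name)
        for sym in obj.defines:
            self.defined.setdefault(sym, obj.name)  # first definition wins
        self.undefined -= obj.defines
        self.undefined |= {s for s in obj.needs if s not in self.defined}

    def scan_group(self, archives):
        # Restart from the first archive until a full pass adds nothing.
        progress = True
        while progress:
            progress = False
            for members in archives:
                for obj in members:
                    wanted = obj.defines & self.undefined
                    if wanted and obj.name not in self.linked:
                        self.add_object(obj)
                        progress = True
```

A lone archive can be treated as a group with a single element. Unreferenced members are never touched, which is what keeps archive processing cheap.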
Layout sections
- Goal: decide section order and final positions
- Merge sections with same name and subsections
- Drop redundant or unused sections
- Finalize symbol values
- Advantage of late layout: avoids recomputations
- Single pass for ordering and address assignment
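A minimal sketch of such a single-pass layout, assuming that input sections carry only a name, size, alignment and a "used" flag. The names `InputSection`, `align_up` and `layout` are hypothetical, and output sections are ordered by first appearance, which is a simplification.

```python
from dataclasses import dataclass


@dataclass
class InputSection:
    name: str
    size: int
    align: int
    used: bool = True


def align_up(value, align):
    # Round value up to the next multiple of align.
    return (value + align - 1) // align * align


def layout(sections, base):
    # Merge input sections with the same name and drop unused ones.
    merged = {}
    for sec in sections:
        if sec.used:
            merged.setdefault(sec.name, []).append(sec)

    # One pass: fix the order and assign addresses at the same time.
    addr = base
    placement = {}  # output section name -> (start, [(input section, address)])
    for name, parts in merged.items():
        addr = align_up(addr, max(p.align for p in parts))
        start = addr
        placed = []
        for part in parts:
            addr = align_up(addr, part.align)
            placed.append((part, addr))
            addr += part.size
        placement[name] = (start, placed)
    return placement, addr
```

Because addresses are assigned only after all inputs are known, no section ever has to be moved again, so symbol values can be finalized right after this pass.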
Compute relocations
- Apply finalized symbol values to relocations
- Decide which relocations are known at link time
- ...and which are left for the run time linker
- ...or whether they can be replaced by cheaper versions
- Constant tables vs limited immediate encoding
- Global dynamic vs initial exec TLS method
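As an illustration of applying finalized symbol values, the sketch below patches two x86-64 relocation types, using the formulas from the x86-64 psABI (S = symbol value, A = addend, P = address of the patched location). The function names are hypothetical, and `needs_runtime_relocation` is a deliberately simplified policy assumed for this example, not MCLinker's actual decision logic.

```python
import struct

R_X86_64_64 = 1    # word64, S + A
R_X86_64_PC32 = 2  # word32, S + A - P


def apply_relocation(data, offset, rtype, S, A, P):
    """Patch the bytearray `data` at `offset` with the relocated value."""
    if rtype == R_X86_64_64:
        struct.pack_into("<Q", data, offset, (S + A) & 0xFFFFFFFFFFFFFFFF)
    elif rtype == R_X86_64_PC32:
        value = S + A - P
        if not -2**31 <= value < 2**31:
            raise OverflowError("PC32 relocation out of range")
        struct.pack_into("<i", data, offset, value)
    else:
        raise NotImplementedError(rtype)


def needs_runtime_relocation(rtype, preemptible, pic_output):
    # Assumption: a symbol that may be overridden at run time is always
    # left to the run time linker; an absolute address in position
    # independent output depends on the load address.
    if preemptible:
        return True
    return rtype == R_X86_64_64 and pic_output
```

For example, a `call` at address 0x401000 to a function at 0x401020 has its 32-bit operand at P = 0x401001 with addend -4, so the stored displacement is 0x1b.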
Write output
- Goal: write final binary
- Apply relocations to input sections
- Write resulting sections/segments
- Mix in metadata
- Use memory mapped files if possible
- ...helps the TLB (translation lookaside buffer)
- ...improves page locality
- ...helps filesystem cache
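Writing through a memory mapped file can be sketched with Python's `mmap` module. The function name `write_output` and the `(offset, bytes)` chunk format are assumptions for this example; a real linker would copy relocated sections and metadata at the offsets decided during layout.

```python
import mmap


def write_output(path, size, chunks):
    """Create `path` with `size` bytes and copy each (offset, payload) chunk."""
    with open(path, "w+b") as f:
        f.truncate(size)  # reserve the final size up front
        with mmap.mmap(f.fileno(), size) as m:
            for offset, payload in chunks:
                # Write directly into the mapped pages, no intermediate buffer.
                m[offset:offset + len(payload)] = payload
            m.flush()
```

The output size must be known before mapping, which fits a design where layout is finished before anything is written.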
Performance: Time and memory use
| Binary      |          | GNU ld   | gold     | MCLinker |
|-------------|----------|----------|----------|----------|
| llvm-tblgen | Run time | 0.10s    | 0.04s    | 0.05s    |
|             | Peak RSS | 17,700KB | 17,528KB | 17,508KB |
| clang       | Run time | 1.41s    | 0.44s    | 0.69s    |
|             | Peak RSS | 150MB    | 182MB    | 176MB    |
Output size
| Binary      | Segment | GNU ld  | gold    | MCLinker |
|-------------|---------|---------|---------|----------|
| llvm-tblgen | text    | 1,828KB | 1,786KB | 2,124KB  |
|             | data    | 2,664   | 2,520   | 2,408    |
|             | bss     | 5,912   | 2,520   | 5,360    |
| clang       | text    | 26.9MB  | 26.7MB  | 34.3MB   |
|             | data    | 22,112  | 22,112  | 21,984   |
|             | bss     | 47,736  | 47,704  | 47,624   |
- MCLinker behaves like --export-dynamic
- Text size difference in .rodata and .dynstr
Linking GCC's cc1
|           | GNU ld   | MCLinker       |
|-----------|----------|----------------|
| Run time  | 0.20s    | 0.16s          |
| Peak RSS  | 47,888KB | 51,752KB       |
| Code size | 8,618KB  | 8,178KB        |
| Data size | 1,154KB  | 1,154KB (+48B) |
Implementation status: MI
- Most basic ELF functionality works:
- Static/dynamic linkage
- Partial linking
- Visibility and binding rules
- DT_NEEDED not honoured yet
i386 and amd64
- build.sh release works
- ...using a fallback to GNU ld for parts depending on linker scripts
- TLS support incomplete: relaxation tests fail
ARM
- build.sh release builds
- ...using a few more hacks than X86
- ...parts of libc.so don't work when optimized
- ...analysis is still running
- TLS support incomplete
- ARM ELF header flags problematic
- Optional system linker for Android
- No support for AArch64
MIPS
- Used by Android/MIPS
- NetBSD untested (yet)
- No support for N64 or O64
Future work
- Extensive testsuite
- Symbol versioning
- Linker scripts
- LTO
- Research: fine-grained layout on a per-function basis
- EH table optimisations
- Platform work:
- To-be-completed: X86 (i386 and amd64), ARM and MIPS support
- Work-in-progress: X32, MIPS64, Hexagon
- Not-started-yet: AArch64
Corporate supporters
- MediaTek
- Google
- Intel
- MIPS
- Qualcomm