Difference between revisions of "NaplesPU LLD Linker"

From NaplesPU Documentation
 
When adding a new target architecture to LLVM, some changes are required to the tools located in the llvm/tools folder.
+
LLD is the LLVM project's own linker. It is designed as a drop-in replacement for the GNU linker and accepts the same command-line arguments. Its ''ELF'' support is the most complete; ''PE/COFF'', ''Mach-O'', ''WebAssembly'' and custom linker scripts are supported to progressively lesser degrees.
== llvm-objdump ==
+
LLD is built as a set of libraries, in keeping with the modular design of the LLVM software.
This is the llvm object file dumper. The full documentation for llvm-objdump is maintained as a Texinfo manual. If the info and llvm-objdump programs are properly installed, the command
 
<code> info llvm-objdump </code> should give you access to the complete manual.
 
  
To be used properly with the NuPlus architecture, a change is required to the getRelocationValueString method inside the llvm-objdump.cpp file. We add a case statement for nu+ that behaves in the same way as other architectures such as MIPS.
+
NaplesPU support in LLD follows an ELF-based approach: a new Target class is added inside the lld/ELF/Arch folder. [[NaplesPU.cpp]] contains the ''relocation'' logic and the other information needed to support NaplesPU. Each NaplesPU application must be linked with two additional object files:
  
<syntaxhighlight lang="cpp" line='line'>
+
* ''crt0.o'', the object file obtained by compiling the start routine for the custom platform, which performs the operations needed to set up the platform.
template <class ELFT>
+
* ''vectors.o'', which contains the Interrupt Vector Table.
static std::error_code getRelocationValueString(const ELFObjectFile<ELFT> *Obj,
 
                                                const RelocationRef &RelRef,
 
                                                SmallVectorImpl<char> &Result) {
 
  ...
 
  switch (EF.getHeader()->e_machine) {
 
  ...
 
  case ELF::EM_NUPLUS:
 
  case ELF::EM_MIPS:
 
  ...
 
</syntaxhighlight>
 
  
== llvm-readobj ==
+
== Relocations ==
The llvm-readobj tool displays low-level format-specific information about one or more object files.  
+
This section explains how the relocations are implemented in NaplesPU.
In order to be properly used with the NuPlus architecture the following change is required inside the ELFDumper.cpp file.
+
All relocations are handled in the implementation of the ''relocateOne'' method. For each instruction type of the target ISA that carries an immediate value, LLD must check that the value fits in the corresponding field of the machine instruction.
  
<syntaxhighlight lang="cpp" line='line'>
+
<syntaxhighlight>
static const EnumEntry<unsigned> ElfMachineType[] = {
+
void NaplesPU::relocateOne(uint8_t *Loc, uint32_t Type, uint64_t Val) const {
   ...
+
   int64_t Offset;
   ENUM_ENT(EM_NUPLUS,        "EM_NUPLUS")
+
  switch (Type) {
 +
   default:
 +
    fatal("unrecognized reloc " + Twine(Type));
 +
    case R_NUPLUS_ABS32:
 +
    write32le(Loc, Val);
 +
    break;
 +
  case R_NUPLUS_BRANCH:
 +
    //Check Offset for J-Type Instructions
 +
    checkInt(Loc, Val, 18, Type);
 +
    applyNaplesPUReloc<18, 0>(Loc, Type, Val);
 +
    break;
 +
  case R_NUPLUS_PCREL_MEM:
 +
    //Check Offset for M-Type Instructions
 +
    checkInt(Loc, Val, 9, Type);
 +
    applyNaplesPUReloc<9, 3>(Loc, Type, Val);
 +
    break;
 +
  case R_NUPLUS_PCREL_LEA:
 +
    //Check Offset for I-Type Instructions
 +
    checkInt(Loc, Val, 9, Type);
 +
    applyNaplesPUReloc<9, 3>(Loc, Type, Val);
 +
    break;
 +
  case R_NUPLUS_ABS_HIGH_LEA:
 +
    //Check Offset for MOVEI-Type Instructions
 +
    checkInt(Loc, (Val >> 16) & 0xFFFF , 16, Type);
 +
    applyNaplesPUReloc<16, 2>(Loc, Type, (Val >> 16) & 0xFFFF);
 +
    break;
 +
  case R_NUPLUS_ABS_LOW_LEA:
 +
    //Check Offset for MOVEI-Type Instructions
 +
    checkInt(Loc, (Val) & 0xFFFF , 16, Type);
 +
    applyNaplesPUReloc<16, 2>(Loc, Type, (Val) & 0xFFFF);
 +
    break;
 +
  }
 +
}
 
</syntaxhighlight>
 
</syntaxhighlight>
  
== linker ==
+
All of these relocations rely on the ''applyNaplesPUReloc'' method, which simply writes the value back into the instruction.
[https://github.com/mclinker/mclinker mclinker] is the linker used with the LLVM ver. 3.9. However, it is not under active development so we must migrate towards [http://lld.llvm.org/ LLD].
 
 
 
The linker main purpose is to link object files in order to generate an executable file. It must resolve symbol names and relocations in order to produce a proper executable.
 
 
 
The nu+ version of mclinker is located in "compiler/tools". The aspects related to nu+ are in the "NuPlus" folder at the "mclinker/lib/Target" subdirectory.
 
  
The main aspects that were implemented are the relocations and the scratchpad symbols resolution.
+
== Linkerscript ==
The compiler is responsible for emitting these symbols, while the linker must recognize and resolve them by substituting the correct offset. This is done through different functions defined in the '''NuPlusRelocator''' class. The symbol-function binding is done in the NuPlusRelocationFunctions.h file.
 
  
To link the different sections correctly, a [https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/4/html/Using_ld_the_GNU_Linker/scripts.html linker script] is also provided. It is located in "misc/lnkrscrpt.ld". Besides instructing the linker on where the different sections must be located, it also defines the stack starting address for each core thread. The stacks are defined by a starting address and their dimensions. The linker script and the memory layout are illustrated below. By default, each stack can hold 16KB of data, and all stacks are located sequentially in memory starting at address 0x080000.
+
Every memory image generated for NaplesPU must follow a specific layout, which is specified by the linker script:
The memory area between 0x000000-0x00037F is used to store input and output data through/from nu+, such as the kernel arguments. At 0x000380-0x0003FF, we have the interrupt vector jump table, i.e. jump instructions to the different ISRs. Starting at 0x000400, we have the crt0 startup routines followed by the different text and data sections.
 
[[File:MemoryLayout.jpg|400px|thumb|left|nu+ memory layout]]
 
  
<syntaxhighlight lang="c" line='line'>
+
<syntaxhighlight>
stacks_dim = 4K;
+
stacks_dim = 16K;
 
stacks_base = 0x80000;
 
stacks_base = 0x80000;
__stack0_top__ = stacks_base;
+
threads_per_core = 0x8;
__stack1_top__ = __stack0_top__ + stacks_dim;
 
__stack2_top__ = __stack1_top__ + stacks_dim;
 
__stack3_top__ = __stack2_top__ + stacks_dim;
 
__stack4_top__ = __stack3_top__ + stacks_dim;
 
__stack5_top__ = __stack4_top__ + stacks_dim;
 
__stack6_top__ = __stack5_top__ + stacks_dim;
 
__stack7_top__ = __stack6_top__ + stacks_dim;
 
  
 
SECTIONS
 
SECTIONS
 
{
 
{
   . = 0x00000400;
+
   . = 0x00000380;
   .text.vectors : { ../..//libs/vectors.o(.text)}
+
   .text.vectors : {
 +
    *vectors.o(.text)
 +
    . = 0x00000400;
 +
  }
  
   .text.start : { ../..//libs/crt0.o(.text)}
+
   .text.start : {
 +
    *crt0.o(.text)
 +
  }
 
   .text : { *(.text) }
 
   .text : { *(.text) }
  
 +
  . = 0x00800000;
 
   .data : { *(.data) }
 
   .data : { *(.data) }
 
+
 
 
   . = ALIGN(64);
 
   . = ALIGN(64);
 
   scratchpad : { *(scratchpad) }
 
   scratchpad : { *(scratchpad) }
  
 
   .bss : { *(.bss) }
 
   .bss : { *(.bss) }
 +
 +
  .eh_frame : { *(.eh_frame) }
  
 
}
 
}
 
</syntaxhighlight>
 
</syntaxhighlight>
  
 
+
As shown above, the linker script defines where the sections must be placed in memory. It also resolves the symbols left undefined by the start routine, specifying the per-thread stack size, the number of threads per core and the starting address of the stacks.
 
 
== elf2hex ==
 
The elf2hex tool (located in "compiler/tools/elf2hex/elf2hex.cpp") takes the ELF executable and converts it into three different hex format files. It simply takes the data and instructions from the ELF file and prints them to the output files. The generated files can be loaded directly into memory or used as input to [[Emulator | EmuPlus]]. The three files are distinguished by different suffixes:
 
# ''*.hex'': little-endian HEX file with 32-bit alignment
 
# ''*_mem.hex'': little-endian HEX file with 512-bit alignment. This file is used when simulating the architecture
 
# ''*_mem_mango_mem.hex'': big-endian HEX file with 512-bit alignment. This file is used when using nu+ inside the MANGO project
 

Latest revision as of 17:21, 21 June 2019

LLD is the LLVM project's own linker. It is designed as a drop-in replacement for the GNU linker and accepts the same command-line arguments. Its ELF support is the most complete; PE/COFF, Mach-O, WebAssembly and custom linker scripts are supported to progressively lesser degrees. LLD is built as a set of libraries, in keeping with the modular design of the LLVM software.

NaplesPU support in LLD follows an ELF-based approach: a new Target class is added inside the lld/ELF/Arch folder. NaplesPU.cpp contains the relocation logic and the other information needed to support NaplesPU. Each NaplesPU application must be linked with two additional object files:

  • crt0.o, the object file obtained by compiling the start routine for the custom platform, which performs the operations needed to set up the platform.
  • vectors.o, which contains the Interrupt Vector Table.

Relocations

This section explains how relocations are implemented in NaplesPU. All relocations are handled in the implementation of the relocateOne method. For each instruction type of the target ISA that carries an immediate value, LLD must check that the value fits in the corresponding field of the machine instruction.

void NaplesPU::relocateOne(uint8_t *Loc, uint32_t Type, uint64_t Val) const {
  int64_t Offset;
  switch (Type) {
  default:
    fatal("unrecognized reloc " + Twine(Type));
  case R_NUPLUS_ABS32:
    write32le(Loc, Val);
    break;
  case R_NUPLUS_BRANCH:
    // Check offset for J-type instructions
    checkInt(Loc, Val, 18, Type);
    applyNaplesPUReloc<18, 0>(Loc, Type, Val);
    break;
  case R_NUPLUS_PCREL_MEM:
    // Check offset for M-type instructions
    checkInt(Loc, Val, 9, Type);
    applyNaplesPUReloc<9, 3>(Loc, Type, Val);
    break;
  case R_NUPLUS_PCREL_LEA:
    // Check offset for I-type instructions
    checkInt(Loc, Val, 9, Type);
    applyNaplesPUReloc<9, 3>(Loc, Type, Val);
    break;
  case R_NUPLUS_ABS_HIGH_LEA:
    // Check the upper 16 bits for MOVEI-type instructions
    checkInt(Loc, (Val >> 16) & 0xFFFF, 16, Type);
    applyNaplesPUReloc<16, 2>(Loc, Type, (Val >> 16) & 0xFFFF);
    break;
  case R_NUPLUS_ABS_LOW_LEA:
    // Check the lower 16 bits for MOVEI-type instructions
    checkInt(Loc, Val & 0xFFFF, 16, Type);
    applyNaplesPUReloc<16, 2>(Loc, Type, Val & 0xFFFF);
    break;
  }
}
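
The range checks above can be illustrated with a small stand-alone sketch. It assumes that the check is the usual signed two's-complement fit test on the given number of bits; `fitsSigned` is a hypothetical helper written for this page, not LLD's `checkInt`, which reports a link error instead of returning a flag.

```cpp
#include <cstdint>

// Hypothetical helper (not part of LLD): true when Val is representable as a
// signed two's-complement integer of Bits bits, i.e.
// -2^(Bits-1) <= Val < 2^(Bits-1). Valid for 1 <= Bits <= 63.
constexpr bool fitsSigned(int64_t Val, unsigned Bits) {
  return Val >= -(int64_t{1} << (Bits - 1)) &&
         Val < (int64_t{1} << (Bits - 1));
}

// 18-bit field, as checked for R_NUPLUS_BRANCH.
static_assert(fitsSigned(131071, 18), "largest positive 18-bit offset");
static_assert(!fitsSigned(131072, 18), "one past the 18-bit range");
static_assert(fitsSigned(-131072, 18), "most negative 18-bit offset");

// 9-bit field, as checked for the PC-relative M-type and I-type cases.
static_assert(fitsSigned(255, 9), "largest positive 9-bit offset");
static_assert(!fitsSigned(256, 9), "one past the 9-bit range");
```

Under this definition, a value such as 0x8000 does not fit in 16 signed bits. Whether that matters for the masked halves in the MOVEI cases depends on how `checkInt` is defined in the LLD version in use, which this page does not show.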

All of these relocations rely on the applyNaplesPUReloc method, which simply writes the value back into the instruction.
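
The body of the write-back method is not shown on this page. As a minimal sketch of what such a step could look like, the code below assumes that the two template parameters are the field width in bits and the bit offset of the field inside a 32-bit little-endian instruction word; that reading is an assumption. All names here (`load32le`, `store32le`, `patchField`, `highHalf`, `lowHalf`) are hypothetical stand-ins, not the NaplesPU or LLD implementation.

```cpp
#include <cstdint>

// Local stand-ins for 32-bit little-endian access (hypothetical names).
inline uint32_t load32le(const uint8_t *P) {
  return uint32_t{P[0]} | (uint32_t{P[1]} << 8) | (uint32_t{P[2]} << 16) |
         (uint32_t{P[3]} << 24);
}

inline void store32le(uint8_t *P, uint32_t V) {
  P[0] = static_cast<uint8_t>(V);
  P[1] = static_cast<uint8_t>(V >> 8);
  P[2] = static_cast<uint8_t>(V >> 16);
  P[3] = static_cast<uint8_t>(V >> 24);
}

// Assumed semantics: write the low Width bits of Val into the instruction at
// bit offset Shift, leaving every other bit of the instruction unchanged.
template <unsigned Width, unsigned Shift>
void patchField(uint8_t *Loc, uint64_t Val) {
  static_assert(Width > 0 && Width + Shift <= 32,
                "field must lie inside a 32-bit instruction word");
  const uint32_t Mask =
      static_cast<uint32_t>(((uint64_t{1} << Width) - 1) << Shift);
  uint32_t Insn = load32le(Loc);
  Insn = (Insn & ~Mask) | ((static_cast<uint32_t>(Val) << Shift) & Mask);
  store32le(Loc, Insn);
}

// Split of a 32-bit absolute address into the two 16-bit halves used by the
// HIGH and LOW relocation cases.
constexpr uint32_t highHalf(uint64_t Val) {
  return static_cast<uint32_t>((Val >> 16) & 0xFFFF);
}
constexpr uint32_t lowHalf(uint64_t Val) {
  return static_cast<uint32_t>(Val & 0xFFFF);
}
```

With these assumptions, `patchField<16, 2>(Loc, highHalf(Addr))` followed by `patchField<16, 2>(Loc2, lowHalf(Addr))` on a second instruction would place the two halves of an address into a pair of MOVEI-type instructions. Masking before the merge keeps an out-of-range value from corrupting neighbouring fields, although the range check is expected to reject such values first.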

Linkerscript

Every memory image generated for NaplesPU must follow a specific layout, which is specified by the linker script:

stacks_dim = 16K;
stacks_base = 0x80000;
threads_per_core = 0x8;

SECTIONS
{
  . = 0x00000380;
  .text.vectors : {
    *vectors.o(.text)
    . = 0x00000400;
  }

  .text.start : {
    *crt0.o(.text)
  }
  .text : { *(.text) }

  . = 0x00800000;
  .data : { *(.data) }

  . = ALIGN(64);
  scratchpad : { *(scratchpad) }

  .bss : { *(.bss) }

  .eh_frame : { *(.eh_frame) }

}

As shown above, the linker script defines where the sections must be placed in memory. It also resolves the symbols left undefined by the start routine, specifying the per-thread stack size, the number of threads per core and the starting address of the stacks.
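
How the start routine consumes these three symbols is not shown here. The sketch below assumes the sequential layout used by the earlier revision of the script, where each `__stackN_top__` was `stacks_dim` bytes past the previous one starting at `stacks_base`. The C++ names are hypothetical and only mirror the script symbols.

```cpp
#include <cstdint>

// Values copied from the linker script above.
constexpr uint32_t StacksDim = 16 * 1024;  // stacks_dim = 16K
constexpr uint32_t StacksBase = 0x80000;   // stacks_base
constexpr uint32_t ThreadsPerCore = 8;     // threads_per_core
constexpr uint32_t DataStart = 0x00800000; // location counter before .data

// Assumed layout: thread i gets the i-th stacks_dim-sized slot after
// stacks_base (same arithmetic as the former __stackN_top__ symbols).
constexpr uint32_t stackAddress(uint32_t ThreadId) {
  return StacksBase + ThreadId * StacksDim;
}

// First address past the last stack slot of one core.
constexpr uint32_t stackRegionEnd() {
  return StacksBase + ThreadsPerCore * StacksDim;
}

static_assert(stackAddress(0) == 0x80000, "thread 0 starts at stacks_base");
static_assert(stackAddress(7) == 0x9C000, "thread 7 slot");
static_assert(stackRegionEnd() == 0xA0000, "8 stacks of 16K span 0x20000");
static_assert(stackRegionEnd() <= DataStart, "stack slots end below .data");
```

Under these assumptions the eight stacks of one core occupy 0x80000 to 0x9FFFF, well below the 0x00800000 address where `.data` begins.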