NAME

mem_sections - Memory Sections

Sections are used to organize the code and data of a program at the binary level.

The (compiler-generated) assembly code assigns code, data and other entities like debug information to so-called input sections. These sections serve as input to the linker, which bundles similar sections together into output sections like .text and .data according to rules defined in the linker description file.

The final ELF binary is then used by programming tools like avrdude, simulators, debuggers and other programs, for example programs from the GNU Binutils family like avr-size, avr-objdump and avr-readelf.

Sections may have extra properties like section alignment, section flags, section type and rules to locate them or to assign them to memory regions.

•: Concepts

•: Named Sections
•: Section Flags
•: Section Type
•: Section Alignment
•: Subsections
•: Orphan Sections
•: LMA: Load Memory Address
•: VMA: Virtual Memory Address

•: The Linker Script: Building Blocks

•: Input Sections and Output Sections
•: Memory Regions

•: Output Sections of the Default Linker Script

•: .text
•: .data
•: .bss
•: .noinit
•: .rodata
•: .eeprom
•: .fuse, .lock and .signature
•: .note.gnu.avr.deviceinfo

•: Symbols in the Default Linker Script

•: Output Sections and Code Size

•: Using Sections

•: In C/C++ Code
•: In Assembly Code

Concepts

Named Sections

Named sections are sections that can be referred to by their name. The name and other properties can be provided with the .section directive like in

.section name, "flags", @type

or with the .pushsection directive, which directs the assembler to assemble the following code into the named section.

An example of a section that is not referred to by its name is the COMMON section. In order to put an object in that section, special directives like .comm name,size or .lcomm name,size have to be used.

Directives like .text are basically the same as .section .text, where the assembler assumes appropriate section flags and type; the same holds for the directives .data and .bss.

Section Flags

The section flags can be specified with the .section and .pushsection directives, see section type for an example. Section flags of output sections can be specified in the linker description file, and the linker implements heuristics to determine the section flags of an output section from the various input sections that go into it.

Flag  Meaning
a     The section will be allocated, i.e. it occupies space on the target hardware
w     The section contains data that can be written at run-time. Sections that only contain read-only entities don't have the w flag set
x     The section contains executable code, though the section may also contain non-executable objects
M     A mergeable section
S     A string section
G     A section group, like used with comdat objects

The last three flags are listed for completeness. They are used by the compiler, for example for header-only C++ modules and to ensure that multiple instantiations of the same template in different compilation units occur at most once in the executable file.
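As a rough illustration of how the flag letters relate to the ELF section header, the following host-side C sketch maps a flag string such as "ax" to a bit mask. The function name section_flags_from_string and the SEC_* constants are made up for this example; their values are meant to mirror SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR, SHF_MERGE, SHF_STRINGS and SHF_GROUP of the ELF specification. This is not how the assembler is implemented.

```c
#include <stdint.h>

/* Hypothetical names; the values mirror the SHF_* constants of ELF. */
enum {
    SEC_WRITE   = 0x1,    /* w */
    SEC_ALLOC   = 0x2,    /* a */
    SEC_EXEC    = 0x4,    /* x */
    SEC_MERGE   = 0x10,   /* M */
    SEC_STRINGS = 0x20,   /* S */
    SEC_GROUP   = 0x200   /* G */
};

/* Translate a flag string as passed to .section into a bit mask.
   Returns -1 when a letter is not one of the flags listed above. */
int32_t section_flags_from_string(const char *flags)
{
    int32_t mask = 0;

    for (; *flags != '\0'; flags++) {
        switch (*flags) {
        case 'a': mask |= SEC_ALLOC;   break;
        case 'w': mask |= SEC_WRITE;   break;
        case 'x': mask |= SEC_EXEC;    break;
        case 'M': mask |= SEC_MERGE;   break;
        case 'S': mask |= SEC_STRINGS; break;
        case 'G': mask |= SEC_GROUP;   break;
        default:  return -1;
        }
    }
    return mask;
}
```

With this sketch, "ax" yields SEC_ALLOC | SEC_EXEC, which is the combination typically found on sections that hold code.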

Section Type

The section type can be specified with the .section and .pushsection directives, like in

.section .text.myfunc,"ax",@progbits
.pushsection ".data.myvar", "a", "progbits"

At the ELF level, the section type is stored in the section header, for example Elf32_Shdr.sh_type = SHT_PROGBITS.

Type       Meaning
@progbits  The section contains data that will be loaded to the target, like objects in the .text and .data sections.
@nobits    The section does not contain data that needs to be transferred to the target device, like data in the .bss and .noinit sections. The section still occupies space on the target.
@note      The section is a note, like for example the .note.gnu.avr.deviceinfo section.

Section Alignment

The alignment of a section is the maximum over the alignments of the objects in the section.

Subsections

Subsections are compartments of named sections and are introduced with the .subsection directive. Subsections are located in order of increasing index in their input section. The default subsection after switching to a new section is subsection 0.

Note

A common misconception is that a section like .text.module.func is a subsection of .text.module. This is not the case. These two sections are independent, and there is no subset relation. The sections may have different flags and type, and they may be assigned to different output sections.

Orphan Sections

Orphan sections are sections that are not mentioned in the linker description file. When an input section is an orphan, the GNU linker implicitly generates an output section of the same name. The linker implements various heuristics to determine section flags, section type and location of orphaned sections. One use of orphan sections is to locate code to a fixed address.

As for any other output section, the start address can be specified by linking with -Wl,--section-start,secname=address.

LMA: Load Memory Address

The LMA of an object is the address where a loader like avrdude puts the object when the binary is being uploaded to the target device.

VMA: Virtual Memory Address

The VMA is the address of an object as used by the running program.

VMA and LMA may be different: Suppose a small ATmega8 program with executable code that extends from byte address 0x0 to 0x20f, and one variable my_data in static storage. The default linker script puts the content of the .data output section after the .text output section and into the text segment. The startup code then copies my_data from its LMA location beginning at 0x210 to its VMA location beginning at 0x800060, because C/C++ requires that all data in static storage has been initialized when main is entered.

The internal SRAM of ATmega8 starts at RAM address 0x60, which is offset by 0x800000 in order to linearize the address space (VMA 0x60 is a flash address). The AVR program only ever uses the lower 16 bits of VMAs in static storage so that the offset of 0x800000 is masked out. But code like 'LDI r24,hh8(my_data)' actually sets R24 to 0x80 and reveals that my_data is an object located in RAM.
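The address arithmetic described above can be checked on a host machine with a small C sketch. The helper names and the macro AVR_DATA_VMA_OFFSET are invented for this illustration and are not part of AVR-LibC; only the offset 0x800000 and the example addresses are taken from the text.

```c
#include <stdint.h>

/* Offset that the linker adds to RAM addresses (hypothetical macro name). */
#define AVR_DATA_VMA_OFFSET 0x800000UL

/* VMA of an object located at the given SRAM address. */
uint32_t avr_ram_to_vma(uint16_t ram_addr)
{
    return (uint32_t)(AVR_DATA_VMA_OFFSET + ram_addr);
}

/* Address as used by the running program: the lower 16 bits of the VMA. */
uint16_t avr_vma_low16(uint32_t vma)
{
    return (uint16_t)(vma & 0xFFFF);
}

/* Value that the assembler modifier hh8() extracts: bits 16 to 23. */
uint8_t avr_vma_hh8(uint32_t vma)
{
    return (uint8_t)((vma >> 16) & 0xFF);
}
```

For the ATmega8 example, avr_ram_to_vma(0x60) gives 0x800060, its lower 16 bits are 0x60 again, and avr_vma_hh8 returns 0x80, the value that tells a RAM object apart from a flash object.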

The Linker Script: Building Blocks

The linker description file is the central hub to channel functions and static storage objects of a program to the various memory spaces and address ranges of a device.

Input Sections and Output Sections

Input sections are sections that are inputs to the linker. Functions and static variables but also additional notes and debug information are assigned to different input sections by means of assembler directives like .section or .text. The linker takes all these sections and assigns them to output sections as specified in the linker script.

Output sections are defined in the linker description file. Contrary to the unlimited number of input sections a program can come up with, there is only a handful of output sections like .text and .data, which roughly correspond to the memory spaces of the target device.

One step in the final link is to locate the sections, that is, the linker/locator determines at which memory location to put the output sections, and how to arrange the many input sections within their assigned output section. Locating means that the linker assigns Load Memory Addresses (the addresses used by a loader like avrdude) and Virtual Memory Addresses (the addresses used by the running program).

While it is possible to directly assign LMAs and VMAs to output sections in the linker script, the default linker scripts provided by Binutils assign memory regions (also known as memory segments) to the output sections. This has some advantages, like a linker script that is easier to maintain. An output section can be assigned to more than one memory region. For example, non-zero data in static storage (.data) goes to

1.: the data region (VMA), because such variables occupy RAM which has to be allocated
2.: the text region (LMA), because the initializers for such data have to be kept in some non-volatile memory (program ROM), so that the startup code can initialize that data and the variables have their expected initial values when main() is entered.
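The effect of assigning .data to two regions can be pictured with a host-side C sketch that mimics what startup code has to do: copy the initializers from their load location to RAM, then zero the area used by .bss. This is only a model under simplifying assumptions (plain byte arrays stand in for flash and RAM, and the function names are made up); it is not the startup code shipped with libgcc.

```c
#include <stdint.h>

/* Copy the .data initializers from their load location (LMA, in flash)
   to the run-time location (VMA, in RAM). */
void copy_data(uint8_t *data_start, const uint8_t *data_end,
               const uint8_t *data_load_start)
{
    uint8_t *dst = data_start;
    const uint8_t *src = data_load_start;

    while (dst != data_end)
        *dst++ = *src++;
}

/* Zero the RAM area occupied by .bss. */
void clear_bss(uint8_t *bss_start, const uint8_t *bss_end)
{
    for (uint8_t *p = bss_start; p != bss_end; p++)
        *p = 0;
}
```

Note that only the start of the load range is needed: the number of bytes to copy follows from the size of the RAM range.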

The SECTIONS{} portion of a linker script models the input and output sections, and it assigns the output sections to the memory regions defined in the MEMORY{} part.

Memory Regions

The memory regions defined in the default linker script model and correspond to the different kinds of memories of a device.

Region           Virtual Address [1]  Flags  Purpose
text             0 [2]                rx     Executable code, vector table, data in PROGMEM, __flash and __memx, startup code, linker stubs, initializers for .data
data             0x800000 [2]         rw     Data in static storage
rodata [3]       0xa00000 [2]         r      Read-only data in static storage
eeprom           0x810000             rw     EEPROM data
fuse             0x820000             rw     Fuse bytes
lock             0x830000             rw     Lock bytes
signature        0x840000             rw     Device signature
user_signatures  0x850000             rw     User signature

Notes

1.: The VMAs for regions other than text are offset in order to linearize the non-linear memory address space of the AVR Harvard architecture. The target code only ever uses the lower 16 bits of the VMA to access objects in non-text regions.
2.: The addresses for regions text, data and rodata are actually defined as symbols like __TEXT_REGION_ORIGIN__, so that they can be adjusted by means of, say -Wl,--defsym,__DATA_REGION_ORIGIN__=0x800060. Same applies for the lengths of all the regions, which is __NAME_REGION_LENGTH__ for region name.
3.: The rodata region is only present in the avrxmega2_flmap and avrxmega4_flmap emulations, which is the case for Binutils since v2.42 for the AVR64 and AVR128 devices without -mrodata-in-ram.
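To make the linearized address space more tangible, here is a host-side C sketch that maps a VMA to the name of the memory region it falls into, using the default origins from the table above. The type and function names are hypothetical. The sketch ignores the region lengths and treats the rodata region as always present, so it is a simplification rather than a model of any particular emulation.

```c
#include <stddef.h>
#include <stdint.h>

struct avr_region {
    const char *name;
    uint32_t origin;
};

/* Default region origins, sorted by ascending origin. */
static const struct avr_region avr_regions[] = {
    { "text",            0x000000 },
    { "data",            0x800000 },
    { "eeprom",          0x810000 },
    { "fuse",            0x820000 },
    { "lock",            0x830000 },
    { "signature",       0x840000 },
    { "user_signatures", 0x850000 },
    { "rodata",          0xa00000 },
};

/* Return the name of the region with the highest origin that does not
   exceed vma. Region lengths are not checked. */
const char *avr_region_of_vma(uint32_t vma)
{
    const char *name = avr_regions[0].name;

    for (size_t i = 0; i < sizeof avr_regions / sizeof avr_regions[0]; i++)
        if (vma >= avr_regions[i].origin)
            name = avr_regions[i].name;
    return name;
}
```

A lookup like this is roughly what a reader does mentally when interpreting addresses in a map file or in the output of avr-objdump.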

Output Sections of the Default Linker Script

This section describes the various output sections defined in the default linker description files.

Output Section  Purpose                                 Memory Region (LMA)   Memory Region (VMA)
.text           Executable code, data in progmem        text                  text
.data           Non-zero data in static storage         text                  data
.bss            Zero data in static storage             ---                   data
.noinit         Non-initialized data in static storage  ---                   data
.rodata [1]     Read-only data in static storage        text                  LMA + offset [3]
.rodata [2]     Read-only data in static storage        0x8000 * __flmap [4]  rodata
.eeprom         Data in EEPROM                          Note [5]              eeprom
.fuse           Fuse bytes                                                    fuse
.lock           Lock bytes                                                    lock
.signature      Signature bytes                                               signature
                User signature bytes                                          user_signatures

Notes

1.: On avrxmega3 and avrtiny devices.
2.: On AVR64 and AVR128 devices without -mrodata-in-ram.
3.: With an offset __RODATA_PM_OFFSET__ of 0x4000 or 0x8000 depending on the device.
4.: The value of symbol __flmap defaults to the last 32 KiB block of program memory, see the GCC v14 release notes.
5.: The LMA actually equals the VMA, but is unused. A flash loader like avrdude knows where to put the data.

The .text Output Section

The .text output section contains the actual machine instructions which make up the program, but also additional code like jump tables and lookup tables placed in program memory with the PROGMEM attribute.

The .text output section contains the input sections described below. Input sections that are not used by the tools are omitted. A * wildcard stands for any sequence of characters, including empty ones, that are valid in a section name.

.vectors: The .vectors section contains the interrupt vector table, which consists of jumps to weakly defined labels: to __init for the first entry at index 0, and to __vector_N for the entry at index N ≥ 1. The default value for __vector_N is __bad_interrupt, which jumps to the weakly defined __vector_default, which jumps to __vectors, which is the start of the .vectors section.

An interrupt service routine (ISR) is implemented with the help of the ISR macro in C/C++ code.

.progmem.data
.progmem.data.*
.progmem.gcc.*: This section is used for read-only data declared with attribute PROGMEM, and for data in address-space __flash.

The compiler assumes that the .progmem sections are located in the lower 64 KiB of program memory. When the data does not fit in the lower 64 KiB block, the program reads garbage unless pgm_read_*_far is used. In that case, however, the data can be located in the .progmemx section, which is not required to be located in the lower program memory.

.trampolines: Linker stubs for indirect jumps and calls on devices with more than 128 KiB of program memory. This section must be located in the same 128 KiB block as the interrupt vector table. For some background on linker stubs, see the GCC documentation on EIND.
.text
.text.*: Executable code. This is where almost all of the executable code of an application will go.
.ctors
.dtors: Tables with addresses of static constructors and destructors, like C++ static constructors and functions declared with attribute constructor.
The .initN Sections: These sections are used to hold the startup code from reset up through the start of main().

The .initN sections are executed in order from 0 to 9: The code from one init section falls through to the next higher init section. This is why code in these sections must be naked (more precisely, it must not contain return instructions), and why code in these sections must never be called explicitly.

When several modules put code in the same init section, the order of execution is not specified.

Section  Hosted By     Symbol [1]                      Performs
.init0   AVR-LibC [2]                                  Weakly defines the __init label which is the jump target of the first vector in the interrupt vector table. When the user defines the __init() function, it will be jumped to instead.
.init1   ---                                           Unused
.init2   AVR-LibC                                      Clears __zero_reg__; initializes the stack pointer to the value of weak symbol __stack, which has a default value of RAMEND as defined in avr/io.h; initializes EIND to hh8(pm(__vectors)) on devices that have it; initializes RAMPX, RAMPY, RAMPZ and RAMPD on devices that have all of them
.init3   AVR-LibC      __do_flmap_init                 Initializes the NVMCTRLB.FLMAP bit-field on devices that have it, except when -mrodata-in-ram is specified
.init4   libgcc        __do_copy_data, __do_clear_bss  Initializes data in static storage: Initializes .data and clears .bss
.init5   ---                                           Unused
.init6   libgcc        __do_global_ctors               Run static C++ constructors and functions defined with __attribute__((constructor)).
.init7   ---                                           Unused
.init8   ---                                           Unused
.init9   AVR-LibC                                      Calls main and then jumps to exit

Notes

1.: Code in the .init3, .init4 and .init6 sections is optional; it will only be present when there is something to do. This will be tracked by the compiler --- or has to be tracked by the assembly programmer --- which pulls in the code from the respective library by means of the mentioned symbols, e.g. by linking with -Wl,-u,__do_flmap_init or by means of

.global __do_copy_data

Conversely, when the respective code is not desired for some reason, the symbol can be satisfied by defining it with, say, -Wl,--defsym,__do_copy_data=0 so that the code is not pulled in any more.

2.: The code is provided by gcrt1.S.

The .finiN Sections: Shutdown code. These sections are used to hold the exit code executed after return from main() or a call to exit().

The .finiN sections are executed in descending order from 9 to 0 in a fallthrough manner.

Section     Hosted By  Symbol              Performs
.fini9      libgcc                         Defines _exit and weakly defines the exit label
.fini8      AVR-LibC                       Run functions registered with atexit()
.fini7      ---                            Unused
.fini6      libgcc     __do_global_dtors   Run static C++ destructors and functions defined with __attribute__((destructor))
.fini5...1  ---                            Unused
.fini0      libgcc                         Globally disables interrupts and enters an infinite loop to label __stop_program

It is unlikely that ordinary code uses the fini sections. When there are no static destructors and atexit() is not used, then the respective code is not pulled in from the libraries, and the fini code just consumes four bytes: a CLI and a RJMP to itself. Common use cases of fini code are running the GCC test suite, where it reduces fallout, and simulators, where it helps to determine (un)orderly termination of a simulated program.

.progmemx.*: Read-only data in program memory without the requirement that it must reside in the lower 64 KiB. The compiler uses this section for data in the named address-space __memx. Data can be accessed with pgm_read_*_far when it is not in a named address-space:

#include <avr/pgmspace.h>

/* In the named address-space __memx: accessed like ordinary data. */
const __memx int array1[] = { 1, 4, 9, 16, 25, 36 };

/* Not in a named address-space: accessed with pgm_read_*_far. */
PROGMEM_FAR
const int array2[] = { 2, 3, 5, 7, 11, 13, 17 };

int add (uint8_t id1, uint8_t id2)
{
    uint_farptr_t p_array2 = pgm_get_far_address (array2);
    int val2 = pgm_read_int_far (p_array2 + sizeof(int) * id2);

    return val2 + array1[id1];
}

.jumptables*: Used to place jump tables in some cases.

The .data Output Section

This section contains data in static storage which has an initializer that is not all zeroes. This includes the following input sections:

.data*: Read-write data
.rodata*: Read-only data. These input sections are only included on devices that host read-only data in RAM.

It is possible to tell the linker the SRAM address of the beginning of the .data section. This is accomplished by linking with

avr-gcc ... -Tdata addr -Wl,--defsym,__DATA_REGION_ORIGIN__=addr

Note that addr must be offset by adding 0x800000 to the real SRAM address so that the linker knows that the address is in the SRAM memory segment. Thus, if you want the .data section to start at 0x1100, pass 0x801100 as the address to the linker.

Note

When using malloc() in the application (which could even happen inside library calls), additional adjustments are required.

The .bss Output Section

Data in static storage that will be zeroed by the startup code. These are data objects without an explicit initializer, and data objects with initializers that are all zeroes.

Input sections are .bss* and COMMON. Common symbols are defined with directives .comm or .lcomm.

The .noinit Output Section

Data objects in static storage that should not be initialized by the startup code. As the C/C++ standard requires that all data in static storage is initialized (this includes data without an explicit initializer, which is initialized to all zeroes), such objects have to be put into section .noinit by hand:

__attribute__ ((section (".noinit")))
int foo;

The only input section in this output section is .noinit. Only data without initializer can be put in this section.

The .rodata Output Section

This section contains read-only data in static storage from .rodata* input sections. This output section is only present for devices where read-only data remains in program memory, which are the devices where (parts of) the program memory are visible in the RAM address space. This is currently the case for the emulations avrtiny, avrxmega3, avrxmega2_flmap and avrxmega4_flmap.

The .eeprom Output Section

This is where EEPROM variables are stored, for example variables declared with the EEMEM attribute. The only input section (pattern) is .eeprom*.

The .fuse, .lock and .signature Output Sections

These sections contain fuse bytes, lock bytes and device signature bytes, respectively. The respective input section patterns are .fuse*, .lock* and .signature*.

The .note.gnu.avr.deviceinfo Section

This section is actually not mentioned in the default linker script, which means it is an orphan section and hence the respective output section is implicit.

The startup code from AVR-LibC puts device information in that section to be picked up by simulators or tools like avr-size, avr-objdump, avr-readelf, etc.

The section is contained in the ELF file but not loaded onto the target. Source of the device specific information are the device header file and compiler builtin macros. The layout conforms to the standard ELF note section layout and is laid out as follows.

#include <elf.h>

typedef struct
{
    Elf32_Word n_namesz;     /* AVR_NOTE_NAME_LEN */
    Elf32_Word n_descsz;     /* size of avr_desc */
    Elf32_Word n_type;       /* 1 - the only AVR note type */
} Elf32_Nhdr;

#define AVR_NOTE_NAME_LEN 4

struct note_gnu_avr_deviceinfo
{
    Elf32_Nhdr nhdr;
    char note_name[AVR_NOTE_NAME_LEN]; /* = "AVR\0" */

    struct
    {
        Elf32_Word flash_start;
        Elf32_Word flash_size;
        Elf32_Word sram_start;
        Elf32_Word sram_size;
        Elf32_Word eeprom_start;
        Elf32_Word eeprom_size;
        Elf32_Word offset_table_size;

        /* Offset table containing byte offsets into
           string table that immediately follows it.
           index 0: Device name byte offset */
        Elf32_Off offset_table[1];

        /* Standard ELF string table.
           index 0 : NULL
           index 1 : Device name
           index 2 : NULL */
        char strtab[2 + strlen(__AVR_DEVICE_NAME__)];
    } avr_desc;
};

The contents of this section can be displayed with

avr-objdump -P avr-deviceinfo file, which is supported since Binutils v2.43.
avr-readelf -n file, which displays all notes.
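For tools that want to read the note themselves, the following host-side C sketch extracts the device name from a raw image of the section. It relies only on the layout shown above and assumes little-endian byte order, as used by AVR ELF files. The function name avr_note_device_name is hypothetical, and obtaining the raw section contents is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read a little-endian 32-bit word. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Return the device name stored in a raw .note.gnu.avr.deviceinfo image,
   or NULL when the image does not match the expected layout. */
const char *avr_note_device_name(const uint8_t *note, size_t len)
{
    /* Header: 3 words. Note name: 4 bytes. Fixed part of avr_desc:
       8 words up to and including offset_table[0]. */
    enum { HDR = 12, NAME = 4, DESC_FIXED = 8 * 4 };

    if (len < HDR + NAME + DESC_FIXED)
        return NULL;
    if (rd32(note) != NAME || rd32(note + 8) != 1
        || memcmp(note + HDR, "AVR", 4) != 0)
        return NULL;

    const uint8_t *desc = note + HDR + NAME;
    uint32_t descsz = rd32(note + 4);
    if (descsz < DESC_FIXED || descsz > len - HDR - NAME)
        return NULL;

    uint32_t name_off = rd32(desc + 7 * 4);   /* offset_table[0] */
    const uint8_t *strtab = desc + DESC_FIXED;
    size_t strtab_len = descsz - DESC_FIXED;
    if (name_off >= strtab_len)
        return NULL;

    /* The name must be NUL-terminated inside the string table. */
    if (memchr(strtab + name_off, 0, strtab_len - name_off) == NULL)
        return NULL;
    return (const char *)(strtab + name_off);
}
```

The bounds checks matter because the image may come from an arbitrary file; a truncated or foreign note yields NULL instead of an out-of-bounds read.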

Symbols in the Default Linker Script

Most of the symbols like main are defined in the code of the application, but some symbols are defined in the default linker script:

__name_REGION_ORIGIN__: Describes the physical properties of memory region name, where name is one of TEXT or DATA. The address is a VMA and offset as explained above.
The linker script only supplies a default for the symbol values when they have not been defined by other means, like for example in the startup code or by --defsym. For example, to let the code start at address 0x100, one can link with

avr-gcc ... -Ttext=0x100 -Wl,--defsym,__TEXT_REGION_ORIGIN__=0x100

__name_REGION_LENGTH__: Describes the physical properties of memory region name, where name is one of: TEXT, DATA, EEPROM, LOCK, FUSE, SIGNATURE or USER_SIGNATURE.
Only a default is supplied when the symbol is not yet defined by other means. Most of these symbols are weakly defined in the startup code.
__data_start
__data_end: Start and (one past the) end VMA address of the .data section in RAM.
__data_load_start
__data_load_end: Start and (one past the) end LMA address of the .data section initializers located in program memory. Used together with the VMA addresses above by the startup code to copy data initializers from program memory to RAM.
__bss_start
__bss_end: Start and (one past the) end VMA address of the .bss section. The startup code clears this part of the RAM.
__rodata_start
__rodata_end
__rodata_load_start
__rodata_load_end: Start and (one past the) end VMA resp. LMA address of the .rodata output section. These symbols are only defined when .rodata is not output to the text region, which is the case for emulations avrxmega2_flmap and avrxmega4_flmap.
__heap_start: One past the last object located in static storage. Immediately follows the .noinit section (which immediately follows .bss, which immediately follows .data). Used by malloc() and friends.

Code that computes a checksum over all relevant code and data in program memory has to consider:

The range from the beginning of the .text section (address 0x0 in the default layout) up to __data_load_end, so that the initializers for .data are included.
For emulations that have the rodata memory region, the range from __rodata_load_start to __rodata_load_end has also to be taken into account.
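A checksum over several address ranges can be sketched as follows. This is a host-side model that works on a flat image of the program memory; the plain 16-bit additive checksum, the structure flash_range and the function name flash_checksum are assumptions made for the example. On the target, the range bounds would be derived from the linker symbols named above, and the bytes would be read with the program memory access functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Half-open byte address range [start, end) in program memory. */
struct flash_range {
    uint32_t start;
    uint32_t end;
};

/* Sum all bytes of the given ranges modulo 2^16. */
uint16_t flash_checksum(const uint8_t *image,
                        const struct flash_range *ranges, size_t n_ranges)
{
    uint16_t sum = 0;

    for (size_t i = 0; i < n_ranges; i++)
        for (uint32_t a = ranges[i].start; a < ranges[i].end; a++)
            sum = (uint16_t)(sum + image[a]);
    return sum;
}
```

Passing the ranges as a list keeps the function unchanged for emulations that have a separate rodata region: the caller simply supplies a second range.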

Output Sections and Code Size

The avr-size program (part of Binutils), coming from a Unix background, doesn't account for the .data initialization space added to the .text section. In order to know how much flash the final program will consume, one needs to add the values for both .text and .data (but not .bss), while the amount of pre-allocated SRAM is the sum of .data and .bss.

Memory usage and free memory can also be displayed with

avr-objdump -P mem-usage code.elf
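The sums described above are simple enough to put into a small host-side C helper, for example in a build script that checks size limits. The structure and function names are made up for this sketch; the three inputs are assumed to be the text, data and bss values reported by avr-size.

```c
#include <stdint.h>

/* Section sizes in bytes, as reported by avr-size. */
struct avr_sizes {
    uint32_t text;
    uint32_t data;
    uint32_t bss;
};

/* Flash consumed: code plus the initializers for .data. */
uint32_t avr_flash_usage(struct avr_sizes s)
{
    return s.text + s.data;
}

/* Pre-allocated SRAM: initialized plus zeroed static storage. */
uint32_t avr_sram_usage(struct avr_sizes s)
{
    return s.data + s.bss;
}
```

Neither value covers .noinit, the heap or the stack, which have to be budgeted separately.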

Using Sections

In C/C++ Code

The following example shows how to read and reset the MCUSR special function register on ATmega328. This SFR holds the reset source like 'watchdog reset' or 'external reset', and should be read early, prior to the initialization of RAM and execution of static constructors which may take some time. This means the code has to be placed prior to .init4 which initializes static storage, but after .init2 which initializes __zero_reg__. As the code runs prior to the initialization of static storage, variable mcusr must be placed in section .noinit so that it won't be overwritten by that part of the startup code:

#include <avr/io.h>

__attribute__((section(".noinit")))
uint8_t mcusr;

__attribute__((used, unused, naked, section(".init3")))
static void read_MCUSR (void)
{
    mcusr = MCUSR;
    MCUSR = 0;
}

The used attribute tells the compiler that the function is used although it is never called.
The unused attribute tells the compiler that it is fine that the function is unused, and silences respective diagnostics about the seemingly unused functions.
The naked attribute is required because the code is located in an init section. The function must not have a RET instruction because the function is never called. According to the GCC documentation, the only code supported in naked functions is inline assembly, but the code above is simple enough so that GCC can deal with it.

In Assembly Code

Example:

#include <avr/io.h>

.section .init3,"ax",@progbits
    lds     r0, MCUSR

.pushsection .noinit,"a",@nobits
mcusr:
    .type   mcusr, @object
    .size   mcusr, 1
    .space  1
.popsection                     ; Proceed with .init3

    sts     mcusr, r0
    sts     MCUSR, __zero_reg__ ; Initialized in .init2

.text
    .global main
    .type   main, @function
main:
    lds     r24,    mcusr
    clr     r25
    rjmp    putchar
    .size main, .-main

The 'ax' flags state that the section is allocatable (consumes space on the target hardware) and executable.
The @progbits type states that the section contains bits that have to be uploaded to the target hardware.

For more details, see the gas user manual on the .section directive.

Version 2.2.1

AVR-LibC