Linux Systems Programming Study Notes (2): ELF (Linux Programming Techniques in Detail)

2023-08-13 05:42:11

This post was written and published with Zhihu On VSCode.

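These are my notes from working through a column I found online. This is not a beginner's tutorial and it does not cover every detail, so it is best read with some Linux programming background.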

At first I thought the column would be lightweight. It turned out to be surprisingly hardcore. Time to study.

Before reading this post, you should know what ELF is and how to use the readelf command. If you don't, be sure to look up ELF first.

Understanding Sections and Segments

(Almost all of the substantive content in this chapter is copied and pasted from the sources cited below.)

The corresponding Chinese terms:

Section: 分区

Segment: 段

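Sections have to be grouped into segments, and each segment needs a specified starting address in memory, among other attributes. That is how the kernel knows how to map the segments into virtual memory with mmap. Here is a copied English passage that explains it: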

An ELF file consists of zero or more segments, and describe how to create a process/memory image for runtime execution. When the kernel sees these segments, it uses them to map them into virtual address space, using the mmap(2) system call. In other words, it converts predefined instructions into a memory image.

Segments are viewed by the kernel and mapped into memory (using mmap). Sections are viewed by the linker to create executable code or shared objects.

source: https://linux-audit.com/elf-binaries-on-linux-understanding-and-analysis/

The explanations that follow are also good.

What is a section?

Sections comprise all information needed for linking a target object file in order to build a working executable. (It’s important to highlight that sections are needed on linktime but they are not needed on runtime.)

What is a segment?

Segments, which are commonly known as Program Headers, break down the structure of an ELF binary into suitable chunks to prepare the executable to be loaded into memory. In contrast with Section Headers, Program Headers are not needed on linktime.

How are the two related?

In contrast from other File formats, ELF files are composed of sections and segments. As previously mentioned, sections gather all needed information to link a given object file and build an executable, while Program Headers split the executable into segments with different attributes, which will eventually be loaded into memory.

In order to understand the relationship between Sections and Segments, we can picture segments as a tool to make the linux loader’s life easier, as they group sections by attributes into single segments in order to make the loading process of the executable more efficient, instead of loading each individual section into memory. The following diagram attempts to illustrate this concept:

[Diagram from the source article, showing sections grouped into page-aligned segments, not reproduced here.]

The reason for this alignment is to prevent the mapping of two different segments within a single memory page. This is due to the fact that different segments usually have different access attributes, and these cannot be enforced if two segments are mapped within the same memory page. Therefore, the default segment alignment for PT_LOAD segments is usually a system page size.

source: https://www.intezer.com/blog/research/executable-linkable-format-101-part1-sections-segments/
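To make the segment view concrete, here is a minimal sketch of my own (not taken from either quoted article) that walks the program headers of an ELF file and prints each PT_LOAD entry with its flags and alignment, roughly what readelf -l shows. It assumes Linux, a 64-bit ELF in the host's byte order, and the helper name print_load_segments is hypothetical.

```c
#include <elf.h>
#include <stdio.h>
#include <string.h>

/* Print every PT_LOAD program header of a 64-bit ELF file and return how
 * many were found. Returns -1 on I/O error or if the file is not ELF64.
 * Assumption: the file uses the host's byte order (true for /proc/self/exe). */
int print_load_segments(const char *path)
{
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;

    Elf64_Ehdr eh;
    if (fread(&eh, sizeof eh, 1, f) != 1 ||
        memcmp(eh.e_ident, ELFMAG, SELFMAG) != 0 ||
        eh.e_ident[EI_CLASS] != ELFCLASS64) {
        fclose(f);
        return -1;
    }

    int loads = 0;
    for (unsigned i = 0; i < eh.e_phnum; i++) {
        Elf64_Phdr ph;
        /* Program header i starts at e_phoff + i * e_phentsize. */
        long off = (long)(eh.e_phoff + (unsigned long)i * eh.e_phentsize);
        if (fseek(f, off, SEEK_SET) != 0 || fread(&ph, sizeof ph, 1, f) != 1) {
            fclose(f);
            return -1;
        }
        if (ph.p_type != PT_LOAD)
            continue;
        loads++;
        printf("LOAD vaddr=0x%llx memsz=0x%llx align=0x%llx flags=%c%c%c\n",
               (unsigned long long)ph.p_vaddr,
               (unsigned long long)ph.p_memsz,
               (unsigned long long)ph.p_align,
               (ph.p_flags & PF_R) ? 'R' : '-',
               (ph.p_flags & PF_W) ? 'W' : '-',
               (ph.p_flags & PF_X) ? 'E' : '-');
    }
    fclose(f);
    return loads;
}
```

Calling print_load_segments("/proc/self/exe") from main inspects the running program itself. Each printed line is one chunk the kernel maps with a single set of permissions, which is why the alignment column is normally at least the page size.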

ELF Static Layout

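Motivating question: how does const guarantee that a value is not modified?

The answer depends on which of two kinds of variable it is: a const defined inside a function, or a const defined at global scope. For a const defined inside a function, it is the compiler that checks whether you modify it. For example, if you write this inside a function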

const int const_value = 100;
const_value = 200;

the compiler reports an error. But we can trick the compiler, for example

const int const_value = 100;
int *ptr = (int *)&const_value;
*ptr = 200;

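And with that you can modify the so-called const variable. (Strictly speaking, writing to a const object is undefined behavior in C, so an optimizing compiler may keep using 100 when it reads const_value. The write itself goes through because the variable lives on the stack.)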

It is a different story when the const is defined as a global variable, as in the following example.

#include <stdio.h>
#include <stdlib.h>
static char static_data[16] = "Im Static Data";
static char raw_static_data[40960];
static const char const_data[16] = "Im Const Data";
int main(int argc, char **argv)
{
    printf("Message In Main\n");
    return 0;
}

Suppose you modify it with the same trick as above:

char *pc = (char *)const_data;
*pc = 'X';

This compiles, but at runtime it triggers a memory access violation (a segmentation fault). Why?

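The answer is that when the ELF file is generated, the variables and data of a program are not all dumped into one place. They are placed in different sections. Run readelf -S obj to see the static sections of the ELF file, and you will find quite a lot of them. Among these, the .rodata section holds global constants, and the .text section holds the machine instructions compiled from the source. Traditionally .rodata is loaded into the same segment as .text with permissions R + E; many newer toolchains instead place it in a separate segment with R only. Either way there is no W, so the modification fails: you are writing to a memory region that has no write permission.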
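One way to check this at runtime, without touching the data, is to look up the variable's address in /proc/self/maps and read the permission column of the mapping that contains it. The sketch below is my own illustration, not code from the column; it assumes Linux, a build without aggressive optimization, and the helper names mapping_perms, const_data_is_writable and static_data_is_writable are hypothetical.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static char static_data[16] = "Im Static Data";
static const char const_data[16] = "Im Const Data";

/* Find the /proc/self/maps entry that contains addr and copy its
 * permission string (for example "r--p" or "rw-p") into perms.
 * Returns 0 on success, -1 if the file cannot be read or no entry matches. */
int mapping_perms(const void *addr, char perms[5])
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    char line[512];
    uintptr_t target = (uintptr_t)addr;
    int found = -1;
    while (fgets(line, sizeof line, f) != NULL) {
        unsigned long start, end;
        char p[5];
        /* Each entry begins with "start-end perms", addresses in hex. */
        if (sscanf(line, "%lx-%lx %4s", &start, &end, p) == 3 &&
            target >= start && target < end) {
            memcpy(perms, p, sizeof p);
            found = 0;
            break;
        }
    }
    fclose(f);
    return found;
}

/* 1 if the page holding const_data (.rodata) is mapped writable, else 0. */
int const_data_is_writable(void)
{
    char p[5];
    return mapping_perms(const_data, p) == 0 && p[1] == 'w';
}

/* 1 if the page holding static_data (.data) is mapped writable, else 0. */
int static_data_is_writable(void)
{
    char p[5];
    return mapping_perms(static_data, p) == 0 && p[1] == 'w';
}
```

On a typical Linux build, const_data_is_writable() returns 0 and static_data_is_writable() returns 1, which matches the split between .rodata and .data described above.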

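So the real reason one const can be modified and the other cannot is that the two variables live in different storage areas. A function-level variable lives in the function's stack frame, and the program has write permission for that area. A global const variable is placed in a different area, for which the program has no write permission by default.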
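The difference between the two cases can be observed directly. The following is a minimal sketch under my own assumptions (Linux or another POSIX system; the helper name signal_from_const_write is hypothetical): it performs the cast-away-const write in a forked child so the parent survives and can report how the child ended. Both writes are undefined behavior in C, so this only demonstrates what typically happens, not what the language guarantees.

```c
#define _DEFAULT_SOURCE /* make fork/waitpid visible under strict -std= modes on glibc */
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static const char const_data[16] = "Im Const Data";

/* Perform the cast-away-const write in a child process and report how the
 * child ended: 0 if it exited normally, the number of the signal that killed
 * it otherwise, or -1 if fork/waitpid failed.
 * target_global == 0: write to a function-local const (on the stack).
 * target_global != 0: write to the file-scope const_data (in .rodata). */
int signal_from_const_write(int target_global)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        const int local_const = 100;
        /* volatile keeps the compiler from dropping the store */
        volatile char *p = target_global
                               ? (volatile char *)const_data
                               : (volatile char *)&local_const;
        *p = 'X';
        _exit(0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux, signal_from_const_write(0) normally returns 0, while signal_from_const_write(1) returns the number of the fatal signal (SIGSEGV), because the kernel refuses the write to a page mapped without W.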

ELF Dynamic Layout

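Requirement:

An application has a fixed-length block of data that is very important. We need to protect it and make sure it is not modified by an unexpected array overflow or a stray pointer. If an unexpected modification does happen, the program should crash so that the logic error surfaces as early as possible. At the same time, the block is not strictly read-only: in certain situations its contents still have to be changed. Configuration data in software that must run stably for a long time often has this requirement.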

Because the GNU toolchain supports custom sections, the approach is:

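Put the protected variable into a custom section, align that section's load address to the memory page size, and make the sections after it start on a new page. Once the protected content has been initialized, use the mprotect system call to write-protect every page it occupies. Then add a dedicated modification API for the protected variable: the API first adds write permission to the pages to be modified and removes it again as soon as the modification is done.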
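The protect, unlock, modify, re-protect cycle can be sketched as follows. Note the substitution: instead of a custom section placed by a linker script, this sketch gets a dedicated page from mmap, which sidesteps the alignment question but exercises the same mprotect logic. The struct layout and every function name here (SecurityData, sec_init, sec_set_max_connections, and so on) are hypothetical, not the column's code.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes on glibc */
#include <signal.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the protected configuration block. */
typedef struct {
    int  max_connections;
    char name[32];
} SecurityData;

static SecurityData *g_sec;     /* lives alone in its own page(s) */
static size_t        g_sec_len; /* mapped length, a multiple of the page size */

/* Give the data its own page(s), initialize it, then write-protect it. */
int sec_init(int max_connections, const char *name)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    g_sec_len = (sizeof(SecurityData) + page - 1) / page * page;
    void *mem = mmap(NULL, g_sec_len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    g_sec = mem;
    g_sec->max_connections = max_connections;
    /* The mapping is zero-filled, so the name stays NUL-terminated. */
    strncpy(g_sec->name, name, sizeof g_sec->name - 1);
    return mprotect(g_sec, g_sec_len, PROT_READ);
}

/* The only sanctioned write path: unlock, modify, lock again. */
int sec_set_max_connections(int value)
{
    if (mprotect(g_sec, g_sec_len, PROT_READ | PROT_WRITE) != 0)
        return -1;
    g_sec->max_connections = value;
    return mprotect(g_sec, g_sec_len, PROT_READ);
}

int sec_get_max_connections(void)
{
    return g_sec->max_connections;
}

/* Simulate a stray pointer write in a child process. Returns the signal
 * that killed the child, 0 if it survived, or -1 on fork/waitpid failure. */
int sec_stray_write_signal(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        *(volatile int *)&g_sec->max_connections = 12345;
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

After sec_init(64, "prod"), reads work as usual, sec_set_max_connections(128) succeeds, and any write that bypasses the API is killed by a signal. The window in which the page is writable is limited to the body of the setter.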

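In short, there are a few steps:

First, define a custom section in the linker script (I won't go into the script syntax, because I don't know it yet myself). Next, inside the program, use __attribute__((section(".protect"))) SecurityDataStruct g_secData; to place the data in the .protect section. Finally, when building the executable, tell the linker to use your own linker script.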
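The program-side half of these steps can be tried even without a custom linker script. In the sketch below, the struct layout is hypothetical (the column does not show SecurityDataStruct), the helper sec_data_is_page_aligned is my own, and 4096 is an assumed page size. Without a linker script the GNU linker still emits a .protect section, visible with readelf -S, but nothing forces the data after it onto a new page; that part is what the linker script is for.

```c
#include <stdint.h>

/* Hypothetical layout; the original column does not show this struct. */
typedef struct {
    int  max_connections;
    char name[32];
} SecurityDataStruct;

/* Place the variable in a section named ".protect" and ask for 4096-byte
 * alignment (assumed page size) so it starts on a page boundary. */
__attribute__((section(".protect"), aligned(4096)))
SecurityDataStruct g_secData = { 64, "default" };

/* 1 if g_secData starts on a 4096-byte boundary at runtime, else 0. */
int sec_data_is_page_aligned(void)
{
    return ((uintptr_t)&g_secData % 4096) == 0;
}
```

This only covers the placement and the start alignment. Padding the section out to a full page, so that mprotect does not also lock neighboring variables, still has to come from the linker script or from padding the struct itself.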

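My takeaway: C really is free. You can even choose which section your data goes into, which is wild. From reading the linker's default script, I learned that you can even control where in memory a section is placed, although here it always starts on a fresh page. The linker is like an assembly worker that records where in memory each section goes. On top of that, I can change the access permissions of that section from my own API. Wow, C really is powerful.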
