Authors Razvan Deaconescu, Razvan Nitu, Alexandru Militaru, Constantin Eduard Staniloiu,
License CC-BY-4.0
Received 1 November 2022, accepted 9 December 2022, date of publication 15 December 2022,
date of current version 29 December 2022.
Digital Object Identifier 10.1109/ACCESS.2022.3229461
Safer Linux Kernel Modules Using
the D Programming Language
CONSTANTIN EDUARD STANILOIU , ALEXANDRU MILITARU,
RAZVAN NITU , AND RAZVAN DEACONESCU
Faculty of Automatic Control and Computers, University POLITEHNICA of Bucharest, RO-060042 Bucharest, Romania
Corresponding author: Razvan Nitu (razvan.nitu1305@upb.ro)
ABSTRACT Since its creation, the Linux kernel has gained international recognition and has been employed
on a large range of devices: servers, supercomputers, smart devices and embedded systems. Given its
popularity, the security of the kernel has become a critical research topic. As a consequence, a wide
range of third party tools were created to detect bugs in its implementation. However, new vulnerabilities
are discovered and exploited every year. The explanation for this phenomenon lies in the fact that the
programming language that is used for the kernel implementation, C, is designed to allow unsafe memory
operations. In this paper, we show that it is possible to incrementally transition the kernel code from
C to a memory safe programming language, D, by porting and integrating a device driver. In addition,
we propose a series of code transformations that allow the D compiler to reason about the safety of certain
memory operations. Our implementation increases the security guarantees of the kernel without incurring
any performance penalties.
INDEX TERMS Memory safety, Linux kernel, driver development, security, D programming language.
I. INTRODUCTION
One of the most popular operating system kernels, Linux,
is used on a wide range of hardware, from supercomputers to
IoT devices. While Microsoft Windows dominates the desktop
market, Linux is the most popular operating system used
by supercomputers [29], in the server market [31], handheld
devices, as part of the Android operating system [27] and the
embedded world [1].
Like all operating system kernels, Linux runs in a
privileged processor mode (called kernel mode or supervisor
mode) with complete access to system memory and devices.
A successful attack on Linux will provide the attacker
full control of the entire system, making it a sought after
target. Such attacks represent a common occurrence. Figure 1
highlights the number of vulnerabilities discovered based on
the Common Vulnerability and Exposure (CVE) reports [12].
The trend appears to be slightly decreasing, however, it still
amounts to an average of roughly 250 reports per year.
This number is extremely large, considering the years of
manpower invested in securing the kernel. In addition, there
is no way of knowing how many undiscovered vulnerabilities
exist and are being actively exploited.
FIGURE 1. Number of Common Vulnerability and Exposure (CVE) reports.
To protect itself from potential security attacks, the Linux
kernel employs a variety of self-protection mechanisms [10],
[17] such as Kernel Address Space Layout Randomization
(KASLR), Kernel Page Table Isolation (KPTI), stack protector etc.
The associate editor coordinating the review of this manuscript and
approving it for publication was Alba Amato.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/
134502 VOLUME 10, 2022
C. E. Staniloiu et al.: Safer Linux Kernel Modules Using the D Programming Language
Kernel self-protection mechanisms usually rely on enabling specific configuration
parameters and adding runtime checks to prevent exploitation of code vulnerabilities.
Vulnerabilities appear as a combination of programmer mistakes and lack of safety
support from the programming language. The Linux kernel is mostly written in C, a fast
programming language but with minimal safety features. C allows easy access to
program memory through liberal use of pointers, weak typing, no bounds checking for
arrays etc. While these give flexibility to the programmer, they are also the main
source of vulnerabilities: buffer overflows, pointers to expired data, pointers to
uninitialized memory etc.
In this paper we propose a complementary approach to securing the Linux kernel: the
use of a safe programming language, i.e. a language with features that assist the
developer in writing secure code.
Our choice is the D programming language [5], which has a syntax similar to C/C++ and
provides modern programming and safety features. D aims to provide as many of the
performance benefits of the C programming language, with as few of the security
downsides as possible.
With the goal of porting a Linux kernel module to the D programming language, we
answer the overarching research question: Can critical software components
(operating system drivers) be rewritten in a safe programming language with
reasonable effort while maintaining performance?
Rewriting a software component from an older language to a newer one offers the
possibility to use more modern programming features. In our case there are safety
benefits such as: array bounds checking, immutable variables, safe functions,
guaranteed initialization. At the same time, the translation process poses multiple
challenges. Firstly, each feature in the initial programming language has to be
available in the new programming language; if not, it has to be adapted. Secondly, the
newly rewritten software component has to be built and linked against the main
program: symbol names, calling conventions, memory references have to be compatible.
Thirdly, dependencies of the newly rewritten software component, such as its runtime
library, have to be added to the new program or need to be disabled.
Additionally, the Linux kernel adds its own challenges. Certain features such as a
standard C library or the use of floating point are missing. Memory allocations are
typically resident in the Linux kernel. The stack size is limited.
While we also considered Rust and Go as programming languages for the Linux kernel
port, we ultimately chose D. Our choice of D was based on three criteria: syntax
similarity to the C programming language, interoperability with C programs and high
performance generated code. D fitted these criteria, with its close syntax to C, its
reasonably easy interoperability¹ with other languages and its proven track record of
generating code that is on par with C's performance.
¹ The D primary compiler (DMD) has a feature called -betterC enabling it to build a C
program with some D features.
We selected a Linux kernel driver (virtio_net) and ported it successfully to the D
programming language. The ported driver benefited from the safety features of the D
programming language, improving its security: bounds checking, safe functions,
templates. The performance costs were negligible.
In summary, in this paper we make the following contributions:
• We demonstrate the feasibility of using a modern programming language in the Linux
kernel by successfully porting a Linux kernel module to the D programming language.
We ported virtio_net, the network driver of the virtio framework [11].
• We design and implement techniques that rely on specific D language features in
order to improve the Linux kernel drivers. The performance costs are negligible with
the security benefits being provided by the D programming language.
• We provide a methodology for porting Linux kernel modules to the D programming
language. Demonstrated by our successful port, the methodology can be used to port
other Linux kernel modules.
The rest of the paper proceeds as follows. Section II details the D programming
language and Linux kernel specifics. Section III presents the methodology employed for
porting Linux kernel modules to D. Section IV presents the concrete steps and
challenges in porting the virtio_net Linux kernel module. Section V evaluates the
security benefits and performance costs for the ported module. Section VI presents
related work. Section VII concludes.
II. BACKGROUND
A. LINUX KERNEL MODULES
Linux source code consists of the kernel proper and a plethora of device drivers and
configurable components. Loading a Linux kernel image with all the device drivers
included will result in unnecessary memory consumption and an increase of the attack
surface. For this reason, similarly to other modern operating systems, Linux uses
loadable kernel modules, i.e. object files that can be added to the kernel at runtime
to extend its functionality. Kernel modules can be loaded or unloaded upon request,
without the need to reboot the system or to recompile the kernel.
Device drivers are typically implemented as kernel modules. On a given system, only
drivers for its particular set of hardware devices will be loaded in the kernel. The
loading of these specific device drivers usually takes place at startup.
Past studies have shown that device drivers host security vulnerabilities. Johnson
et al. have found that 9 of 11 vulnerabilities in the Linux kernel are located in
device drivers [9]. An investigation using Parfait, a C/C++ static analyzer, has found
that 81% of the bugs are located in device driver code [4]. A two year investigation
has revealed that 85% of Android kernel bugs are found in vendor drivers. As such, this
paper focuses on securing a device driver by porting it to the D programming language.
We use the virtio_net Linux kernel module as a proof-of-concept of our approach.
B. THE D PROGRAMMING LANGUAGE
D is a general-purpose, statically typed, systems programming language. It has a
similar syntax to the C programming language and it compiles to native code, i.e. it
is not interpreted nor does it use a virtual machine. D supports both automatic and
manual memory management: one can rely on the garbage collector (GC) for memory
management or directly use the malloc and free functions for manual allocation and
deallocation of memory, similarly to C.
D is designed as a more feature rich and safe alternative to the C programming
language. It aims to create programs with comparable performance to those written in
C but without its safety issues. D provides a set of features aimed at reducing the
likelihood of memory issues and vulnerabilities typically found in C programs.
D implements bounds checking for both static and dynamic arrays. To address the C
design flaw of conflating pointers with arrays and losing the length information, D
implements two separate types for pointers and arrays. While the normal pointers have
the same implementation as in C, the arrays are implemented as fat pointers: the
pointer representation is extended to a structure that includes length information
used in bounds checking.
In D, the type system is more stringent and void pointers are not implicitly converted
to other pointer types. Moreover, local variables marked with the scope keyword are
limited to the function scope, reducing the presence of dangling pointers.
Besides common pointers such as those found in C, D provides a memory-safe option
called slices. A slice acts as a "view" of a precise segment of an array. It tracks
both the pointer and the length of the segment. Instead of referring to an array
through a pointer that may cause an out of bounds memory access, one can use a bounded
slice.
D offers the @safe annotation for functions. This enables the compiler to statically
check the body of annotated functions for instructions that could lead to memory
corruption such as pointer arithmetic and casts. By default, D relies on the GC to
safely manage the lifetime of objects. Although the GC has proven to aid productivity
and memory safety, its use is incompatible with performance critical or real-time
applications such as the Linux kernel.
As a consequence, an advanced user has the possibility of opting out of using the GC
and using a different approach for lifetime management. Among the possible
alternatives are reference counting or the Resource Acquisition Is Initialization
(RAII) technique. As an alternative to reference counting [16], the language
maintainers have added support for an ownership/borrowing system [7] that can be
mechanically checked, similar to Rust's borrow checker. At the time of this writing,
October 2022, D's ownership system is not on par with Rust's, but it is under active
development.
We note that the garbage collector is not involved in any of the safety checks that
the compiler employs, apart from lifetime management. Array bounds checking, compile
time safety checks and scope analysis are performed even when the GC is turned off.
Another important part of the D language is its set of metaprogramming features.
Template metaprogramming is a technique that allows the user to make decisions based
on the template type properties. This technique makes generic programming even more
powerful, allowing generic types to be more flexible based on the types that they are
instantiated with. We have used metaprogramming to employ compile-time polymorphism
inside the Linux kernel in order to replace the use of, and casts to/from, void* with
concrete types.
C. INTERFACING C WITH D
Regarding interoperability with C, the D programming language was designed to match
most of the C data types, data structure memory layout and calling convention.
Moreover, the compatibility extends to the format of the object files. D and C use the
same application binary interface (ABI) and the same linkers. D permits access to the
C standard library through bindings in the D runtime library and the D standard
library; similarly, C programs can access D functions. Due to name mangling, C
functions called in D need to be declared with the appropriate linkage attribute
(extern(C)); similarly, D functions called in C code are prepended with the same
linkage attribute. This is identical to the integration of C++ functions in C code and
vice versa.
Linking D code to a C program relies on restricting D objects only to the C standard
library. D-generated object files can be linked to C-generated object files by
restricting D code to a subset that is not reliant on the D runtime library. This is
achieved through the -betterC compiler switch that limits the language to a specific
subset that meets the foregoing requirement. This subset, called BetterC, results from
removing or altering certain features of the language that rely on the runtime
library. While some important functionalities, such as garbage collection, are
removed, most relevant memory-safety features are preserved. Array bounds checking and
slicing, metaprogramming facilities, automatic initialization of local variables and
function safety are part of the BetterC subset.
D. INTEGRATING D CODE IN THE LINUX KERNEL
We ported virtio_net, the network driver of the virtio framework.
While C and D integration of user space applications is a well documented process,
integrating D code in the Linux kernel poses its own set of challenges. To the best of
our knowledge, we are the first to have successfully integrated a D software component
in the Linux kernel.
In the next sections we highlight the challenges, methodology and outcomes of
integrating D code in the Linux kernel.
III. METHODOLOGY FOR PORTING AND ENHANCING KERNEL MODULES USING D
A. INTRODUCING D CODE IN THE LINUX KERNEL
There are two ways of adding new functionalities to the Linux kernel: (1) statically
linking the new object file directly with
the core kernel or (2) compiling the code as a loadable module and linking it into the
kernel on demand.
A general rule of thumb is to add new functionalities as a loadable module. This
practice has the advantage of keeping the kernel code as clean as possible and is
easier to maintain. Also, it permits customization to a greater extent, as necessary
functionalities can be loaded and unloaded on demand. Moreover, it keeps the Trusted
Computing Base (TCB) small and reduces the overall susceptibility to compromise, thus
increasing security.
Regardless of the type of module that has to be built, the kernel build system assumes
the source files are written in C. As such, a source file written in another
programming language won't successfully compile and the build will fail. This is also
the case for the D language. At the same time, the module entry point and exit point
functions must be in C, so that the kernel can reach them. Summarizing, porting a
module to the D programming language requires:
• writing the corresponding source code in D
• providing module entry points as C interface functions
• updating the build system files to link the new module
For the 2nd requirement, a C interface must be implemented between the kernel and the
D-written module. This C interface should contain only the entry point functions and
bindings to macros and functions that cannot be ported to D. This interface implies
that new features will require at least two source code files: one in C and one or
more in D. Therefore the directives in the Linux kernel build file must be written
accordingly, for the 3rd requirement.
The kernel build system assumes that it is dealing with C source files and it tries to
build the object files accordingly. Fortunately, the build system also accepts
pre-built object binaries, as dependencies, that it will link with the object files it
built in order to create the kernel module. This is done by changing the name of the
dependency from module-file.o to module-file.o_shipped. To link D object files into a
kernel module, the D source files must be compiled beforehand and the resulting
objects must carry the suffix .o_shipped. The source files will be compiled by a D
compiler with the -betterC switch. One can choose between using the LLVM-based D
Compiler (LDC) and the GCC-based D Compiler (GDC). After they are compiled
independently, they will be shipped to the kernel build system to be linked together
with the other C objects.
B. PORTING A KERNEL MODULE
Porting the kernel module, we followed 5 steps, including testing and benchmarking:
• Port the data structures used inside the module. Ensure the size and layout of each
new ported structure is identical to the size and layout of the original one.
• Port the module implementation one function at a time. Check module functionality
after each new ported function.
• Conduct the first set of benchmarks: assess the module behaviour. Compare the D and
the C versions of the module.
• Introduce D idiomatic constructs and features into the code. Add bounds checking,
replace macros and casts with metaprogramming, add @safe, @trusted and other useful
features.
• Perform the second set of benchmarks: assess the effect of the idiomatic code added.
Compare the idiomatic D and the rough D versions of the module. Compare the D and the
C versions of the module.
The first step, the porting of data structures, is the most complex one. In a kernel
module, some structures are defined inside the code of the module, while others come
from different header files. To be able to generate an object that can pass and
receive structures from a C program, a D compiler (like any other compiler) must know
the layout in memory of those C structures. This means porting them to D.
This porting can be done using dpp [28], "a compiler wrapper that will parse a D
source file with the .dpp extension and expand in place any #include directives it
encounters, translating all of the C or C++ symbols to D, and then pass the result to
a D compiler". However, a high level of branching in header files or recursive
inclusions may make dpp unusable. In this case, one has two alternatives: (1) port the
data structures by hand or (2) make dpp work with the Linux kernel headers. We chose
the former.
Regardless of the porting method, the size and layout of each new structure ported to
D should be compared with the size and layout of the original one from C. In the case
of a size or layout mismatch, the bug can be easily detected by comparing the offsets
of the fields from D with the offsets of their C counterparts. In D, the offset of a
field can be obtained using the .offsetof field property. In the Linux kernel, it can
be obtained using the offsetof(TYPE, MEMBER) macro.
A difference to consider is the size of an empty structure: in the kernel's C code the
size of an empty structure is 0, while in D this kind of structure has the size of 1
byte. We used D's powerful compile-time introspection to solve this issue. Also, one
should consider the fact that the D language does not natively support bitfields.
However, the same functionality can be achieved using the std.bitmanip.bitfields
library template.
While porting the implementation, the D functions called from C must be annotated with
the extern(C) linkage attribute. The attribute instructs the compiler to use the C
naming and calling convention instead of the D one. The same must be done when
declaring, in the D header, a function that is implemented in C.
In D, the non-immutable global variables are placed in the thread-local storage (TLS),
while in C they are placed in the global storage. To achieve functional parity, one
must annotate D global variables with the __gshared attribute. Also, the const
qualifier is transitive in D, meaning that it applies recursively to every
subcomponent of the type that it is applied to.
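The layout comparison described in this section can be sketched from the C side. The following is a minimal illustration, not the code used for the port: `struct demo_hdr`, its fields and the helper `demo_hdr_offset` are hypothetical names, and the expected values assume a mainstream ABI where `uint16_t` is 2-byte aligned and `uint32_t` is 4-byte aligned. The numbers it exposes are the ones a D declaration would be checked against through the `.offsetof` and `.sizeof` properties.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* fixed-width integer types */
#include <string.h>   /* strcmp */

/* Hypothetical structure standing in for a kernel structure ported to D. */
struct demo_hdr {
    uint8_t  flags;
    uint16_t len;
    uint32_t id;
};

/* One row of the layout table: field name and its byte offset in C. */
struct field_layout {
    const char *name;
    size_t      offset;
};

static const struct field_layout demo_hdr_layout[] = {
    { "flags", offsetof(struct demo_hdr, flags) },
    { "len",   offsetof(struct demo_hdr, len)   },
    { "id",    offsetof(struct demo_hdr, id)    },
};

/* Compile-time guard (assumed ABI): the build fails if the C layout
 * drifts from the values the D declaration was validated against. */
static_assert(sizeof(struct demo_hdr) == 8, "unexpected size of demo_hdr");
static_assert(offsetof(struct demo_hdr, id) == 4, "unexpected offset of id");

/* Returns the C offset of the named field, or (size_t)-1 if it is unknown. */
size_t demo_hdr_offset(const char *name)
{
    size_t count = sizeof(demo_hdr_layout) / sizeof(demo_hdr_layout[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(demo_hdr_layout[i].name, name) == 0)
            return demo_hdr_layout[i].offset;
    }
    return (size_t)-1;
}
```

Listing each row of `demo_hdr_layout` next to the corresponding D value makes a mismatch visible field by field, which is the comparison the text recommends when a size check fails.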
Primitive data type equivalence can be problematic too. The equivalence between basic
C and D types is described in [6].
Not all the functionalities that are used or implemented in a kernel module are worth
porting. This is the case of certain macros, which in their turn call other macros and
so on and are very deeply rooted in the kernel code. It is also the case of certain
kernel functions that use GCC features that extend the standard C language and which
may not be implemented in the D compiler. A way to avoid porting these macros or
functions is to create C bindings (functions that only call other functions) that can
be exposed to a D object and called from there. These bindings should be created in
the C interface of the module.
After each new ported function, a functionality test suite should be run. If bugs were
introduced, there is only one function to debug. The process of porting should be more
syntax-oriented in the first two steps of the methodology. One straightforward way of
solving syntax related issues is to follow and solve the errors that are issued by the
compiler. On the other hand, step 4 should be more oriented towards functionality and
one should use all the features that the BetterC subset retains, in order to improve
the safety and the performance of the module. Several techniques for enhancing the
safety of a module are presented in the next section.
The benchmarks (steps 3 and 5) should be done according to the module functionality.
As a rule of thumb, a benchmark should be done after the module is ported (step 3) to
assess if the D version of the module can "keep up" with the C version. Then, one
should take into account that memory safety features can lead to further performance
penalties. Safety checks are likely to introduce additional overhead. The second
benchmark (step 5) should be done to assess if the addition of idiomatic code and
safety features is worthwhile from a performance perspective.
C. SAFETY ENHANCEMENTS
These are some of the security enhancements provided by the D programming language.
They are used to implement and build the newly implemented kernel module in D.
1) VARIABLES
are initialized to a default value of their type, removing initialization bugs.
2) IMPLICIT CONVERSIONS
of void pointers to any other pointer types are not permitted. D requires an explicit
cast for converting pointers of different types.
The C implicit switch fall-through behaviour is not permitted in D. D also provides
the final switch statement where the default case is neither required nor permitted,
useful when the default statement is useless. The final switch statement is
especially useful when it is applied on an enum type, as it will enforce the use of
all the enum members in the case statements.
3) STATIC ARRAYS
are by default bounds-checked.
4) SLICES
specify a part of an array, via a reference and length information. They are used to
bounds-check dynamically-allocated arrays. Note that this requires knowledge of the
initial size of the dynamically-allocated arrays.
5) TEMPLATES
can be used as replacement for C void pointers and macro definitions for generic
programming, thus enabling type system checks.
6) SAFE FUNCTIONS
(annotated with @safe) are statically verified against cases of undefined behavior.
Within safe functions, there are several language features that cannot be used, such
as casts that break the type system or pointer arithmetic.
Scope, return ref and return scope function parameters are used to ensure that
parameters do not escape their scope, do not outlive their matching parameter lifetime
and are correctly tracked even through pointer indirections.
7) TRUSTED FUNCTIONS
(annotated with @trusted) provide the same guarantees as a safe function, but checks
must be done by the programmer.
8) SAFE FUNCTIONS
can only call other safe functions and trusted functions.
IV. METHODOLOGY IN ACTION. THE virtio_net DRIVER
Given the steps described above, the goal was to select and port a Linux kernel driver
from C to D. This was an iterative process with the methodology being updated with
feedback from the porting process.
To select a target driver we considered the following criteria:
• The driver is in the Linux kernel mainline and it is maintained, so it is relevant
for the kernel community.
• The driver is easy to test and benchmark: being a network driver, one can easily send
and receive packets and measure what bandwidth is achieved.
• The driver should be medium-sized (thousands of lines of code). This is a nice
trade-off between feature complexity and porting effort.
Based on these criteria, we selected the virtio_net driver, part of the virtio
framework [11]. As its name suggests, it is a virtual network device driver, used as a
communication channel between the guest and the hypervisor in a paravirtualized
environment. It satisfies the three criteria: (1) it is actively maintained and used
for virtualization use cases, (2) it can be easily tested with network tools: network
functionality and network metrics such as bandwidth and latency can be part of a
comparison evaluation process and
(3) it has roughly 3.3k lines of code, fitting into the
medium-size range we wanted.
The Linux kernel version used, and the compatible driver,
was 4.19.0. For development, testing and evaluation we used
a virtual machine (VM) based on QEMU.
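The idea behind the slices used in the port, a pointer that travels together with its length, can be illustrated in C. This is a rough sketch under stated assumptions rather than the driver's code: `struct int_slice`, `slice_sub` and `slice_get` are hypothetical names, and an out-of-bounds access is reported through a return value here, whereas D's built-in bounds check raises a runtime error (a kernel panic in the evaluation reported in Section V).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical C analogue of a D slice: a pointer plus the number of
 * elements that may be accessed through it. */
struct int_slice {
    int    *ptr;
    size_t  len;
};

/* Returns a view of elements [lo, hi) of s, or an empty slice when the
 * requested range does not fit inside s. */
struct int_slice slice_sub(struct int_slice s, size_t lo, size_t hi)
{
    struct int_slice empty = { NULL, 0 };
    if (s.ptr == NULL || lo > hi || hi > s.len)
        return empty;
    struct int_slice sub = { s.ptr + lo, hi - lo };
    assert(sub.len <= s.len); /* a sub-slice never widens its parent */
    return sub;
}

/* Checked read: stores s.ptr[idx] in *out and returns true only when idx
 * is within bounds; otherwise leaves *out untouched and returns false. */
bool slice_get(struct int_slice s, size_t idx, int *out)
{
    if (idx >= s.len)
        return false;
    *out = s.ptr[idx];
    return true;
}
```

Because the length is only known where the buffer is created, a slice of this kind can be built for arrays allocated inside the driver, which matches the limitation noted for arrays allocated elsewhere.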
V. EVALUATION
To validate our approach we show that:
1) The D code has the exact same behavior as the C code
that it replaces.
2) The safety mechanisms inserted successfully prevent the occurrences of memory
corruption bugs.
3) The performance of the replacement software does not degrade with regards to its
predecessor.
We created a setup where we provide both implementations of the virtio_net driver (C
and D) and ran similar scenarios to compare functionality, safety and performance.
A. EXPERIMENTAL SETUP
We created a virtual machine image with the 4.19.0 version of the Linux kernel. The
virtual machine is run as two instances: one running the C version of the virtio_net
driver, and the other one running the D version. We refer to the virtual machines as
guest and to the physical system as host.
We compiled the D source files of the module using the GDC compiler, version 10.3.0,
with the following flags: -fno-druntime -mcmodel=kernel -O2 -c.
For evaluation we focused on functional correctness / parity, safety and performance.
B. FUNCTIONAL CORRECTNESS
We ran network tools in each virtual machine to check for parity of functionality, for
example using ping to validate functionality and wget to download information from the
Internet. Additionally, we checked whether the transferred file is the correct one by
comparing its MD5 hash with the expected one.
C. SAFETY
To enhance the safety of the ported driver code we modified the code so as to use
several D language features: array bounds checking, @safe functions and templates.
1) ARRAY BOUNDS CHECKING
The virtio driver uses both statically and dynamically allocated arrays. In the case
of static arrays defined inside the driver, the D language compiler has sufficient
information at compile time to insert bounds checking code. Dynamic arrays, on the
other hand, are represented in C as a pointer to a chunk of data, therefore there isn't
sufficient information at compile time to offer the possibility of implementing
runtime checks. However, using slices, we are able to enable bounds checking for
dynamic arrays that are defined inside the ported driver. Accesses to arrays that are
dynamically allocated outside the driver remain without bounds checks.
From the total number of array accesses inside the virtio_net driver, we were able to
enable array bounds checking in 88.4% of the cases. The remaining 11.6% represent
accesses to dynamic arrays that have been allocated outside of the ported driver. To
test the effect of adding array bounds checking on the driver, we have added artificial
out of bounds accesses to the code. In 60% of the cases, the C version of the driver
has finished execution gracefully, whereas the D version has stopped with a kernel
panic in 100% of the cases.
2) @SAFE FUNCTIONS
To enable the D compiler to check the safety of the code, we aimed to annotate all the
functions present in the driver with the @safe keyword. Roughly 19% of the functions
have successfully compiled without any modifications, whereas 81.2% have failed
compilation due to performing unsafe operations. Most of these functions rely on
pointer operations and casts that are forbidden in @safe code. Additional
modifications are required to bring the code in a @safe state; however, this can be
done incrementally after the initial port of the driver.
3) TEMPLATES
D code may use templated functions that are instantiated at compile time with the
right type. A type mismatch will result in a compilation error, thus making it
impossible to have runtime memory corruption bugs caused by that mismatch. By using
templated functions, we replaced 56% of the total number of void pointer usages. The
remaining 44% could not be replaced because there was no conversion pattern that we
could detect and leverage for our transformation.
D. PERFORMANCE
For performance, we used the iperf3 tool that sends packets between a client and a
server. We used a virtual machine instance running the original C version of the
virtio_net driver and a virtual machine running the D version. Each VM was allocated
1GB of RAM and 1 CPU. iperf3 was deployed on both VMs.
FIGURE 2. VM to VM setup. One VM runs the iperf3 server, the other is running the
client.
We devised 3 setups:
• vm-to-vm (in Figure 2): One VM is running the server, one VM is running the client.
Both machines are of the same type: either both C or both D.
• vm-to-host (in Figure 3): The host is running the server, the VM is running the client.
• vm-to-remote (in Figure 4): Another system in the host network is running the server, the VM is running the client.
FIGURE 3. VM to host setup. The host is running the iperf3 server, the VM is running the client.
FIGURE 4. VM to remote setup. Another system in the host network is running the iperf3 server, the VM is running the client.
Each of those setups was used for 2 × 2 types of measurements: (1) the VM is running D or the VM is running C and (2) iperf3 is using TCP or it is using UDP. Results are summarized in Table 1 and in Figure 5 and Figure 6.
TABLE 1. Comparative performance.
FIGURE 5. Comparative TCP Performance (C vs D).
FIGURE 6. Comparative UDP Performance (C vs D).
Results show negligible overhead for the D module implementation compared to the C implementation. Given that parts of the measurements show a negative slowdown, we consider performance similar and subject to network and measurement variation.
One thing to note is the relatively reduced impact of the changes: the basic network driver functionalities are unmodified, most of the code responsible for that being shared between the two implementations. Porting other drivers may affect a larger part of the implementation and could feature a higher slowdown. This is a subject for future analysis.
E. REPLICABILITY
In the interest of validating our work, we provide it to the community on GitHub as a fork of the Linux kernel, an implementation of the D virtio_net driver and experiment scripts: https://github.com/edi33416/d-virtio.
The implementation of the D virtio_net is on the test_dvirtio_gdc branch, in the drivers/net/dfiles folder. Alongside the .d source files present in the drivers/net/dfiles path, there are also a Makefile and two test.sh files. The Makefile is used to compile the .d source files into the .o_shipped objects that, in turn, will be linked by the kernel build system to build the
virtio_net.ko module. The test.sh and test2.sh helper scripts are used to validate the experimental setup. They load the compiled kernel module, configure the IP address and routing table, and validate that the network is working properly; this is done by downloading a file and comparing its md5sum with the reference value.
In order to be able to compile the D driver, one needs to install gdc-10, the GCC based D compiler. As we are using QEMU to run the VMs, one also needs to ensure that qemu-system-x86_64 is installed with KVM support. We have been connecting to our VMs using a serial port with a serial communication program, such as Minicom (https://wiki.emacinc.com/wiki/Getting_Started_With_Minicom).
Once all the prerequisites are met, one can build the kernel module, boot up the VM and start using the compiled driver. This process is automated in the tools/labs/ directory. The tools/labs/Makefile, through the run target, will (1) compile the .o_shipped object, (2) trigger the kernel build system that will result in the virtio_net.ko module, (3) download the YOCTO_IMAGE specified in the tools/labs/qemu/Makefile and boot up the VM, and (4) copy the module inside the VM. It will also set up IP forwarding and NAT masquerading for the eno1 network interface on the host machine, so one must update the Makefile if one's system is using a different network interface name.
Once the VM has booted, one can connect to it through the serial1.pts serial pipe with the help of the minicom utility, as follows: minicom -D serial1.pts. The default login username is root and requires no password. All the files can be found in the skels/ directory inside the VM, the kernel object being named virtio_net_tmp.ko.
Precompiled .ko objects can be found in the Releases (https://github.com/edi33416/d-virtio/releases) on GitHub:
• Precompiled D .ko: https://github.com/edi33416/d-virtio/releases/download/dvirtio-ko/virtio_net_tmp.ko
• Precompiled C .ko: https://github.com/edi33416/d-virtio/releases/download/cvirtio-ko/virtio_net_tmp.ko
It is our hope that the availability of our work will make it easier to evaluate, to replicate and to review critically.
VI. RELATED WORK
Improving the safety of the Linux kernel and its drivers is the constant focus of the professional and research security community. There are different approaches ranging from static analysis of the Linux kernel code [4], [9], [14] to fuzzing [3], [8], [22], [25] to the use of runtime checks and/or instrumentation [13], [24].
The idea of using programming languages that implement different memory safety features in order to make the Linux kernel code safer has also been tackled.
The recent availability of Rust as a programming language in the Linux kernel [19], [20] paves the way for adding code written in a secure programming language. This is compatible with our own approach of using D to write code in the Linux kernel. Although the memory safety guarantees that Rust offers are superior when compared to D, integrating it in the Linux kernel is a very complicated task. As evidence, the work required to add support for Rust in the Linux kernel was done by 173 people (present in the commit changelog [21]) over the course of 18 months. This included solely the implementation of the infrastructure required to integrate Rust code in the kernel. It does not implement any device driver or any parts of the Linux kernel in Rust. By comparison, our work was done by 3 people over the course of 4 months, including the initial exploratory phase of the Linux infrastructure as well as the porting of the kernel header files. The actual porting time of the device driver required only 2 to 3 weeks. The reader should consider that, in the meantime, work has advanced on automating the porting of kernel header files to D [26], thus reducing the required time to integrate D device drivers to a minimum. In addition, the effort to integrate Rust in the kernel has required compiler changes to accommodate the esoteric code encountered, whereas our work does not necessitate any compiler changes.
A previous attempt to create a memory-safe version of the C language and to use it in the Linux kernel is CCured [2]. CCured is a program transformation system that extends the existing type system of the C language by classifying new pointer types according to their usage. There are three pointer categories: (1) SAFE qualified pointers may be dereferenced, but cannot be cast to other types or be used as part of pointer arithmetic operations, (2) SEQ qualified pointers may be used as part of pointer arithmetic, but not in type casts and (3) WILD qualified pointers may be cast to other pointer types. Each category is treated separately at runtime. SAFE pointers simply require a null check. SEQ pointers are subjected to bounds checking, since they are typically used for array operations. WILD pointers are the most expensive in terms of runtime cost, because they require runtime type information to track the various conversion types that the pointer may be subjected to. It has been previously discovered that, in practice, a large percentage of the casts in C codebases between different types are either upcasts or downcasts [23]. This is also true for the Linux kernel where void* is used as a generic base type in order to enable polymorphism. Pointers involved in these types of casts will be treated as WILD by CCured and will therefore be subjected to the costs of runtime checks. By using D, we were able to leverage its metaprogramming support in order to achieve compile-time polymorphism and type safety without adding any runtime costs.
The pointers defined by CCured are fat pointers: a structure that packs together the raw pointer and metadata related to the boundaries and type information. The authors acknowledge [15] that, because of this, multithreaded programs that rely on shared memory will not work with CCured. The issue with shared memory programs stems from the fact that the programs not written using CCured will assume that the pointers are one word long and can be written to atomically, when they are, in fact, a fat pointer that occupies multiple words in memory and requires multiple instructions in order
to perform the write, and thus the pointer could get in an inconsistent state. As D's arrays are also fat pointers, they suffer from the same problem. We, as do the authors of CCured, believe that this problem can be resolved by acquiring locks on the shared memory before accessing it. Although, in theory, this solution will impact performance, we have not encountered it in practice while interfacing D with programs written in other languages.
CCured was used on two Linux kernel device drivers, on Linux kernel version 2.4.5, with no significant performance penalties. However, it has incurred performance penalties ranging from 11% to 87% on other programs, as detailed in its paper.
Another approach to using a modern programming language for Linux kernel drivers, in order to increase the reliability of the system, is the Decaf drivers architecture [18]. The Decaf architecture partitions the code of a driver into two separate parts: one that must run in kernel space for high performance and must satisfy the OS requirements, and one that can be moved to user space and be rewritten in another language. The communication between these two parts was done through extension procedure call (XPC). Using this architecture and Linux kernel version 2.6.18.1, five drivers were converted to Java, gaining exception handling and automatic memory management through garbage collection. The performance achieved was close to that of the native kernel drivers. The drawbacks of using Decaf can be traced to the Java programming language, which has no pointer support. As such, critical paths in the code that use pointers are left in the unsafe part, still running in kernel space.
Conversely, our methodology covers the use of the D language for memory safety enhancements in any type of kernel module, including those that use multithreading and shared memory, which is problematic in the case of CCured [2]. The implementation of new components and the interfacing with other kernel components can be easily done thanks to the language's high compatibility with C, compared to the more complicated syntax of Rust. The entire code of a kernel module can be rewritten in D to improve memory safety, with no need to leave any part of the code unchanged, as is necessary with Decaf [18].
VII. CONCLUSION
In this paper we presented an approach to improve the security of Linux kernel modules using the D programming language. We selected virtio_net as our target driver, a medium-sized and actively maintained component in the Linux kernel. We ported the driver to the D programming language, highlighted the functional and performance parity with the original C driver and discussed the security benefits. We elaborated a methodology that can be used on other types of drivers for the same purpose.
The safety features added to the driver show that the D language is able to leverage safety improvements in a kernel module, array bounds checking and compile-time polymorphism being the most important ones.
It is important to note that unsafety inside the kernel is a fact of life. Although one can use a programming language whose mechanisms increase the safety of the code that a developer writes, at some point the developer will be forced to perform unsafe actions. Those can come from the need to interact with specific pins on the underlying hardware or the need to interact with the kernel API. Most of the kernel API core works with raw pointers; as such, even though the safe code might implement a sound object lifetime algorithm, being forced to pass the raw pointer to the kernel will void all the safety bets and assumptions. In spite of this, we believe that there are two strong arguments that enable the use of safe languages in practice: 1) the kernel core is extremely stable and robust as it benefits from 30 years of development and bug fixes, and 2) the kernel API clearly defines whose responsibility, the kernel's or the driver's, it is to free allocated resources.
Another important observation is that a programming language must be able to adhere to the constraints and design patterns implemented inside the Linux kernel. As Linus Torvalds has stated [30], the kernel's needs trump any programming language's needs. For this reason, we believe that the D programming language is a good fit given its proven ease of interoperability with C and the kernel infrastructure.
The extent to which the kernel safety can be improved depends on the degree to which the module implementation is self-sufficient. The more external functionalities the module uses, the fewer safety enhancements can be done.
The performance evaluation we conducted on the virtio_net driver shows that the D version of the driver adds little to no overhead to the original C variant. The safety features added are sustainable and do not introduce overhead; therefore, we consider the performance results encouraging.
Given the methodology we created, we are confident other drivers could be ported to D with reasonable effort. Given the similarity to the C programming language, getting accustomed to the D programming language will have minimal impact on the driver developer. This is in contrast to the Rust programming language whose syntax and features are very different from the C programming language. We believe that the increasing interest in adding safe languages to the Linux kernel is a great step forward, as it provides kernel developers with alternatives and flexibility such that they can strike the right balance for their needs and goals.
With these solutions, further drivers could be ported using the methodology described in this paper. Later on, this could be extended to entire built-in components and subsystems in the Linux kernel. Those would bring a much needed improvement in the overall security of the kernel with close-to-no overhead, using a welcoming, C-like programming language.
REFERENCES
[1] AspenCore. (Nov. 2019). Mobile Operating System Market Share Worldwide. Accessed: Apr. 17, 2022. [Online]. Available: https://www.embedded.com/wp-content/uploads/2019/11/EETimes_Embedded_2019_Embedded_Markets_Study.pdf
[2] J. Condit, M. Harren, S. McPeak, G. C. Necula, and W. Weimer, "CCured in the real world," ACM SIGPLAN Notices, vol. 38, no. 5, pp. 232–244, 2003.
[3] J. Corina, A. Machiry, C. Salls, Y. Shoshitaishvili, S. Hao, C. Kruegel, and G. Vigna, "DIFUZE: Interface aware fuzzing for kernel drivers," in Proc. ACM SIGSAC Conf. Comput. Commun. Secur., Oct. 2017, pp. 2123–2138.
[4] D. Dawson, N. Hawes, C. Hoermann, N. Keynes, and C. Cifuentes, "Finding bugs in open source kernels using parfait," Sun Microsyst. Lab., Brisbane, QLD, Australia, Tech. Rep., 2009. [Online]. Available: https://www.researchgate.net/publication/242083507_Finding_Bugs_in_Open_Source_Kernels_using_Parfait
[5] D Programming Language. Accessed: Apr. 17, 2022. [Online]. Available: https://dlang.org/
[6] Programming in D for C Programmers. Accessed: Apr. 17, 2022. [Online]. Available: https://dlang.org/articles/ctod.html
[7] Live Functions: Ownership and Borrowing in D. Accessed: Oct. 29, 2022. [Online]. Available: https://dlang.org/spec/ob.html
[8] D. R. Jeong, K. Kim, B. Shivakumar, B. Lee, and I. Shin, "Razzer: Finding kernel race bugs through fuzzing," in Proc. IEEE Symp. Secur. Privacy (SP), May 2019, pp. 754–768.
[9] R. Johnson and D. Wagner, "Finding user/kernel pointer bugs with type inference," in Proc. 13th USENIX Secur. Symp. (USENIX Security). San Diego, CA, USA: USENIX Association, Aug. 2004, pp. 119–134.
[10] Kernel Self-Protection. Accessed: Apr. 17, 2022. [Online]. Available: https://www.kernel.org/doc/html/latest/security/self-protection.html
[11] (2022). Virtio. Accessed: Apr. 17, 2022. [Online]. Available: https://wiki.libvirt.org/page/Virtio
[12] Linux Kernel CVEs. Accessed: Apr. 17, 2022. [Online]. Available: https://www.linuxkernelcves.com/
[13] K. Lu, A. Pakki, and Q. Wu, "Automatically identifying security checks for detecting kernel semantic bugs," in Proc. Eur. Symp. Res. Comput. Secur. Luxembourg: Springer, 2019, pp. 3–25.
[14] A. Machiry, C. Spensky, J. Corina, N. Stephens, C. Kruegel, and G. Vigna, "DR.CHECKER: A soundy analysis for Linux kernel drivers," in Proc. 26th USENIX Secur. Symp. (USENIX Security), 2017, pp. 1007–1024.
[15] G. C. Necula, J. Condit, M. Harren, S. McPeak, and W. Weimer, "CCured: Type-safe retrofitting of legacy software," ACM Trans. Program. Lang. Syst., vol. 27, no. 3, pp. 477–526, May 2005.
[16] R. Nitu, E. Staniloiu, R. Deaconescu, and R. Rughinis, "Adding support for reference counting in the D programming language," in Proc. 17th Int. Conf. Softw. Technol., H.-G. Fill, M. van Sinderen, and L. A. Maciaszek, Eds. Lisbon, Portugal: SCITEPRESS, 2022, pp. 299–306.
[17] A. Popov. Linux Kernel Defence Map. Accessed: Apr. 17, 2022. [Online]. Available: https://github.com/a13xp0p0v/linux-kernel-defence-map
[18] M. J. Renzelmann and M. M. Swift, "Decaf: Moving device drivers to a modern language," in Proc. USENIX Annu. Tech. Conf., 2009, p. 14. [Online]. Available: https://www.researchgate.net/publication/234787227_Decaf_moving_device_drivers_to_a_modern_language/citation/download
[19] Rust in the Linux Kernel: Good Enough. Accessed: Apr. 17, 2022. [Online]. Available: https://thenewstack.io/rust-in-the-linux-kernel-good-enough/
[20] Rust for Linux. Accessed: Apr. 17, 2022. [Online]. Available: https://github.com/Rust-for-Linux
[21] Linux Kernel Commit to Add Rust Support. Accessed: Oct. 22, 2022. [Online]. Available: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8aebac82933ff1a7c8eede18cab11e1115e2062b
[22] S. Schumilo, C. Aschermann, R. Gawlik, S. Schinzel, and T. Holz, "kAFL: Hardware-assisted feedback fuzzing for OS kernels," in Proc. 26th USENIX Secur. Symp. (USENIX Security), 2017, pp. 167–182.
[23] M. Siff, S. Chandra, T. Ball, K. Kunchithapadam, and T. Reps, "Coping with type casts in C," ACM SIGSOFT Softw. Eng. Notes, vol. 24, no. 6, pp. 180–198, Nov. 1999.
[24] C. Song, B. Lee, K. Lu, W. Harris, T. Kim, and W. Lee, "Enforcing kernel security invariants with data flow integrity," in Proc. NDSS, 2016, pp. 1–15.
[25] D. Song, F. Hetzelt, J. Kim, B. B. Kang, J.-P. Seifert, and M. Franz, "Agamotto: Accelerating kernel driver fuzzing with lightweight virtual machine checkpoints," in Proc. 29th USENIX Secur. Symp. (USENIX Security), 2020, pp. 2541–2557.
[26] E. Staniloiu, R. Nitu, C. Becerescu, and R. Rughinis, "Automatic integration of D code with the Linux kernel," in Proc. 20th RoEduNet Conference: Netw. Educ. Res. (RoEduNet), Nov. 2021, pp. 1–6.
[27] Mobile Operating System Market Share Worldwide. Accessed: Apr. 17, 2022. [Online]. Available: https://gs.statcounter.com/os-market-share/mobile/worldwide
[28] Project Highlight: DPP. Accessed: Apr. 17, 2022. [Online]. Available: https://dlang.org/blog/2019/04/08/project-highlight-dpp/
[29] Operating System Family/Linux | TOP500. Accessed: Apr. 17, 2022. [Online]. Available: https://www.top500.org/statistics/details/osfam/1/
[30] LKML: Linus Torvalds: Re: [Patch v9 12/27] Rust: Add Kernel Crate. Accessed: Oct. 29, 2022. [Online]. Available: https://lkml.org/lkml/2022/9/19/1105
[31] W3Techs. Linux vs. Windows Usage Statistics for Websites. Accessed: Apr. 17, 2022. [Online]. Available: https://w3techs.com/technologies/comparison/os-linux,os-windows

CONSTANTIN EDUARD STANILOIU received the B.Sc. and M.Sc. degrees in computer science and engineering from the University POLITEHNICA of Bucharest (UPB), Bucharest, Romania, in 2016 and 2018, respectively, where he is currently pursuing the Ph.D. degree in computer science and information security.
Since 2018, he has been a Teaching Assistant with the Department of Computer, Faculty of Automatic Control and Computers, UPB. He is also a member of the Secure Systems Group, Department of Computer. His research interests include programming languages, security and vulnerability detection, code and binary analysis, distributed systems, the IoT, and computer vision.

ALEXANDRU MILITARU received the B.Sc. and M.Sc. degrees in computer science and engineering from the University POLITEHNICA of Bucharest (UPB). He is currently pursuing the M.A. degree in philosophy with the University of Bucharest, Bucharest, Romania.
His research interests include programming languages and compilers.

RAZVAN NITU received the B.Sc. and M.Sc. degrees in computer science and engineering from the University POLITEHNICA of Bucharest (UPB), Bucharest, Romania, where he is currently pursuing the Ph.D. degree in programming languages and security. His research interests include programming languages, security, computer architecture, and education techniques.

RAZVAN DEACONESCU is currently an Associate Professor at the Computer Science and Engineering Department, University POLITEHNICA of Bucharest, Romania. His research interests include operating systems and security, with a penchant for teaching and mentoring. If a class uses "operating systems" as part of its name, it's likely he is part of the team. Research-wise, he is working on software security, particularly Apple iOS security and the Unikraft unikernel in recent years. He is a part of the open source and security community in the university and in Romania.