kGraft - A project by SUSE for live patching the Linux kernel without rebooting the server

kGraft, a technology developed by SUSE, focuses on live patching of the Linux kernel, particularly in environments where shutting down and rebooting a Linux server is not the preferred option. Consider an environment where 1000 servers are up and running and an urgent bug or security fix has to be applied to all of them. Conventional updates require rebooting into the fixed kernel, and every reboot carries the risk that a server does not come back up for unrelated reasons. That is where live patching comes in handy. It allows a quick response while leaving the actual update for a scheduled downtime window.

kGraft is a research project developed by SUSE Labs. It is a live patching technology developed specifically for the Linux kernel. It is based on existing Linux technologies such as INT3/IPI-NMI self-modifying code, an RCU-like (Read-Copy-Update) update mechanism, and mcount-based NOP space allocation. To patch a buggy function, kGraft replaces the whole function in the kernel with a new, fixed function. kGraft needs 5 bytes of space at the start of the buggy function, which the NOP space allocation provides. The first byte of the NOP is replaced with INT3, a breakpoint that catches any CPU reaching the instruction while it is only partially rewritten. The remaining four bytes are then replaced with the address of the fixed function, and finally the first byte is replaced with JMP. The result is that every call to the buggy function jumps to the new, fixed function before any of the buggy code runs, as the figure below illustrates.
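The write order described above can be sketched as a small simulation on a byte buffer. This is only an illustration, not kGraft code: the helper names (rel32, patch_steps), the example addresses, and the particular 5-byte NOP encoding are assumptions made for the example. It also assumes the x86 convention that a JMP displacement is measured from the end of the 5-byte instruction.

```python
import struct

# Assumed encodings for the illustration (x86-64):
NOP5 = bytes([0x0F, 0x1F, 0x44, 0x00, 0x00])  # a common 5-byte NOP
INT3 = 0xCC       # one-byte breakpoint opcode
JMP_REL32 = 0xE9  # JMP with a 32-bit relative displacement


def rel32(site_addr, target_addr):
    """Displacement for a 5-byte JMP at site_addr that lands on target_addr."""
    # The displacement is relative to the instruction following the JMP.
    return target_addr - (site_addr + 5)


def patch_steps(memory, offset, site_addr, target_addr):
    """Rewrite 5 bytes in place, yielding a snapshot after each write."""
    disp = struct.pack("<i", rel32(site_addr, target_addr))

    # Step 1: put a breakpoint on the first byte, so nothing can execute
    # a half-written instruction.
    memory[offset] = INT3
    yield bytes(memory[offset:offset + 5])

    # Step 2: write the 4 displacement bytes behind the breakpoint.
    memory[offset + 1:offset + 5] = disp
    yield bytes(memory[offset:offset + 5])

    # Step 3: replace the breakpoint with the JMP opcode.
    memory[offset] = JMP_REL32
    yield bytes(memory[offset:offset + 5])


if __name__ == "__main__":
    # Hypothetical addresses: buggy function at 0x1000, fixed one at 0x2000.
    mem = bytearray(NOP5 + b"\x55")  # NOP pad followed by the function body
    for snapshot in patch_steps(mem, 0, 0x1000, 0x2000):
        print(snapshot.hex(" "))
```

At every intermediate step the first byte is either INT3 or the final JMP opcode, so the patched location never contains a mix of old and new instruction bytes that could be executed as garbage.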

There can be three different tiers of change management on a Linux server, each requiring a different level of handling and response. The first is the incident response situation, where the system is down and an immediate response is required to fix the problem and bring the system back up. The second is the emergency change situation, where the system is vulnerable and can go down at any time, so emergency intervention is required to fix the problem. The third is the scheduled change, where the change is not time-critical and is normally made during scheduled downtime, typically on weekends. Live patching fits the first and second situations.

Unlike other technologies, kGraft does not require stopping the kernel, not even for short periods. A kGraft patch can be built directly from C source without the need for object code manipulation. kGraft consists of a very small amount of code because it reuses existing Linux technologies. However, kGraft has some limitations as well. It is designed for fixing critical bugs only, and particularly for simple changes. Changes to kernel data structures require special care, depending on the size of the change. kGraft depends on a stable build environment and is thus best suited for Linux distributions.