Advanced Systems Programming - Lesson 27
‘Dynamic’ kernel patching
How you could add your own
system-calls to Linux without
editing and recompiling the kernel
System calls
• System Calls are the basic OS mechanism
for providing privileged kernel services to
application programs (e.g., fork(), clone(),
execve(), read(), write(), signal(), getpid(),
waitpid(), gettimeofday(), setitimer(), etc.)
• Linux implements over 300 system calls
• To understand how system calls work, we
can try creating one of our own design
‘Open Source’ philosophy
• Linux source-code is publicly available
• In principle, anyone could edit the sources
to add their own new functions into Linux
• In practice, it is inconvenient to do this
• The steps needed involve reconfiguring,
recompiling, and reinstalling your kernel
• For novices these steps are treacherous!
• Any error risks data-loss and down-time
Alternative to edit/recompile
• Linux modules offer an alternative method
for modifying the OS kernel’s functionality
• It’s safer, and vastly more convenient,
since error-recovery only needs a reboot,
and minimal system knowledge suffices
• The main hurdle to be overcome concerns
the issue of ‘linking’ module code to some
non-exported Linux kernel data-structures
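Invoking kernel services
[Diagram: in user-mode (restricted privileges) the application program reaches the standard runtime libraries with ‘call’ and ‘ret’; the libraries enter the Linux kernel, in kernel-mode (unrestricted privileges), with ‘int 0x80’ and come back with ‘iret’; the kernel reaches an installable module with ‘call’ and ‘ret’]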
The system-call jump-table
• There are approximately 300 system-calls
• Any specific system-call is selected by its
ID-number (it’s placed into register %eax)
• It would be inefficient to use if-else tests or
even a switch-statement to transfer to the
service-routine’s entry-point
• Instead an array of function-pointers is
directly accessed (using the ID-number)
• This array is named ‘sys_call_table[]’
Assembly language (.data)
.section .data
sys_call_table:
.long sys_restart_syscall
.long sys_exit
.long sys_fork
.long sys_read
.long sys_write
// etc (from ‘arch/i386/kernel/entry.S’)
The ‘jump-table’ idea
[Diagram: each entry of ‘sys_call_table’ points to a service-routine in .section .text]
0: sys_restart_syscall
1: sys_exit
2: sys_fork
3: sys_read
4: sys_write
5: sys_open
6: sys_close
7, 8, ...: etc
Assembly language (.text)
.section .text
system_call:
// copy parameters from registers onto stack
call *sys_call_table(, %eax, 4) // indirect call through the table entry
jmp ret_from_sys_call
ret_from_sys_call:
// perform rescheduling and signal-handling
iret // return to caller (in user-mode)
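The indexed call above can be imitated in ordinary user-space C with an array of function-pointers. This is only a sketch of the idea: every `demo_` name is hypothetical, the three routines are stand-ins rather than kernel services, and the bounds check stands in for the range test a dispatcher needs before it indexes the table.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel service-routines. */
typedef long (*demo_service_fn)(long arg);

static long demo_restart(long arg) { (void)arg; return 0; }
static long demo_exit(long arg)    { return arg; }
static long demo_double(long arg)  { return 2 * arg; }

/* The jump-table: one function-pointer per ID-number. */
static demo_service_fn demo_call_table[] = {
    demo_restart,   /* ID 0 */
    demo_exit,      /* ID 1 */
    demo_double,    /* ID 2 */
};

#define DEMO_NR_CALLS (sizeof demo_call_table / sizeof demo_call_table[0])

/* Select the routine by direct indexing, as the assembly does with %eax.
   An ID-number beyond the end of the table is rejected with -ENOSYS. */
long demo_dispatch(unsigned long id, long arg)
{
    if (id >= DEMO_NR_CALLS)
        return -ENOSYS;
    return demo_call_table[id](arg);
}
```

Calling `demo_dispatch(2, 21)` goes straight to the third entry with no chain of comparisons, which is the efficiency argument made for the jump-table.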
Changing the jump-table
• To install our own system-call function, we
just need to change an entry in the Linux
‘sys_call_table[]’ array, so it points to our
own module function, but save the former
entry somewhere (so we can restore it if
we remove our module from the kernel)
• But we first need to find ‘sys_call_table[]’
-- and there are two easy ways to do that
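The save, replace, and restore steps can be sketched in user-space C on a simulated table. Nothing here touches a real kernel: the table, its size of 32 entries, the slot number, and all `demo_` names are assumptions made for illustration.

```c
#include <errno.h>

#define DEMO_TABLE_SIZE 32   /* assumed size of the simulated table */
#define DEMO_SLOT       17   /* the entry we borrow in this sketch */

typedef long (*demo_service_fn)(int *arg);

/* Plays the role of an unimplemented service. */
static long demo_ni_call(int *arg) { (void)arg; return -ENOSYS; }

/* Our replacement routine: increment the int the caller points at. */
static long demo_new_call(int *arg) { ++*arg; return 0; }

static demo_service_fn demo_table[DEMO_TABLE_SIZE]; /* simulated jump-table */
static demo_service_fn demo_saved;                  /* former entry */

/* Fill every slot with the placeholder routine. */
void demo_table_init(void)
{
    for (int i = 0; i < DEMO_TABLE_SIZE; i++)
        demo_table[i] = demo_ni_call;
}

/* Install: remember the former entry, then point the slot at our routine. */
void demo_install(void)
{
    demo_saved = demo_table[DEMO_SLOT];
    demo_table[DEMO_SLOT] = demo_new_call;
}

/* Remove: put the former entry back. */
void demo_remove(void)
{
    demo_table[DEMO_SLOT] = demo_saved;
}

/* Invoke whatever routine the table currently holds for this ID-number. */
long demo_invoke(int id, int *arg)
{
    return demo_table[id](arg);
}
```

In a module, the install step would belong in the initialization function and the restore step in the cleanup function, so that unloading leaves the table as it was found.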
Finding the jump-table
• Older versions of Linux (prior to 2.4.18)
used to ‘export’ the ‘sys_call_table[]’ as a
global symbol, but current versions keep
this table’s address private (for security)
• But often during kernel-installation there is
a ‘System.map’ file that gets put into the
‘/boot’ directory and – assuming it matches
your compiled kernel – it holds the kernel
address for the ‘sys_call_table[]’ array
Using ‘uname’ and ‘grep’
• You can use the ‘uname’ command to find
out which kernel-version is running:
$ uname -r
• Then you can use the ‘grep’ command to
find ‘sys_call_table’ in your System.map
file, like this:
$ grep sys_call_table /boot/System.map-2.6.22.5cslabs
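The same lookup can be sketched in C, assuming each System.map line has the form "address type name" (for example `c0251500 R sys_call_table`, where the type letter is an assumption). The function name `demo_find_symbol` is hypothetical, and the text is passed in as a string so the sketch does not depend on any particular file being present.

```c
#include <stdio.h>
#include <string.h>

/* Scan text in System.map format (hex address, type letter, symbol name
   on each line) for an exact symbol name.
   Returns 1 and stores the address on success, 0 if the name is absent. */
int demo_find_symbol(const char *map_text, const char *name,
                     unsigned long *addr)
{
    const char *line = map_text;

    while (line != NULL && *line != '\0') {
        unsigned long value;
        char type;
        char symbol[128];

        if (sscanf(line, "%lx %c %127s", &value, &type, symbol) == 3
            && strcmp(symbol, name) == 0) {
            *addr = value;
            return 1;
        }
        line = strchr(line, '\n');   /* advance to the next line, if any */
        if (line != NULL)
            line++;
    }
    return 0;
}
```

Unlike ‘grep’, which also prints any longer symbol that merely contains the search text, this sketch compares the whole name.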
The ‘vmlinux’ file
• Your compiled kernel (uncompressed) is
left in the ‘/usr/src/linux’ directory
• It is an ELF-format (executable) file
• It contains .text and .data sections
• You can examine your ‘vmlinux’ kernel
with the ‘objdump’ system-utility
• You can pipe the output through the ‘grep’
utility to locate the ‘sys_call_table’ symbol
Executable versus Linkable
[Diagram: layout of the two kinds of ELF file, top to bottom]
Linkable File: ELF Header, Program-Header Table (optional), Section 1 Data, Section 2 Data, Section 3 Data, ..., Section n Data, Section-Header Table
Executable File: ELF Header, Program-Header Table, Segment 1 Data, Segment 2 Data, Segment 3 Data, ..., Segment n Data, Section-Header Table (optional)
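Which of the two kinds a file is can be read from its ELF Header: after the 16 identification bytes comes a two-byte type field, where 1 means relocatable (linkable) and 2 means executable. The sketch below classifies a header held in memory; the names `demo_classify_elf` and `demo_elf_kind` are hypothetical.

```c
#include <stddef.h>

enum demo_elf_kind {
    DEMO_NOT_ELF,
    DEMO_LINKABLE,     /* type field is 1 (ET_REL)  */
    DEMO_EXECUTABLE,   /* type field is 2 (ET_EXEC) */
    DEMO_OTHER_ELF     /* any other type value      */
};

/* Classify a file from the first bytes of its ELF Header. */
enum demo_elf_kind demo_classify_elf(const unsigned char *hdr, size_t len)
{
    unsigned type;

    /* The header starts with the magic bytes 0x7F 'E' 'L' 'F'. */
    if (len < 18 || hdr[0] != 0x7F || hdr[1] != 'E'
                 || hdr[2] != 'L'  || hdr[3] != 'F')
        return DEMO_NOT_ELF;

    /* Byte 5 gives the byte order: 1 = little-endian, 2 = big-endian. */
    if (hdr[5] == 2)
        type = ((unsigned)hdr[16] << 8) | hdr[17];
    else
        type = hdr[16] | ((unsigned)hdr[17] << 8);

    if (type == 1)
        return DEMO_LINKABLE;
    if (type == 2)
        return DEMO_EXECUTABLE;
    return DEMO_OTHER_ELF;
}
```

A module object-file would typically fall in the linkable category, while the slides describe ‘vmlinux’ as an executable file.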
Where is ‘sys_call_table[ ]’?
• This is how you use ‘objdump’ and ‘grep’
to find the ‘sys_call_table[]’ address:
$ cd /usr/src/linux
$ objdump -t vmlinux | grep sys_call_table
Exporting ‘sys_call_table’
• Once you know the address of your kernel’s
‘sys_call_table[]’, you can write a module to export that
address to other modules, e.g.:
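#include <linux/module.h>
// declare global variable
unsigned long *sys_call_table;
EXPORT_SYMBOL(sys_call_table);
int init_module( void )
{
// this address is valid only for the kernel it was read from
sys_call_table = (unsigned long *)0xC0251500;
return 0;
}
void cleanup_module( void ) { } // allows ‘rmmod’ to unload the module
MODULE_LICENSE("GPL");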
Avoid hard-coded constant
• You probably don’t want to ‘hard code’ the
sys_call_table’s value in your module – if
you ever recompile your kernel, or use a
differently configured kernel, you’d have to
remember to edit your module and then
recompile it – or risk a corrupted system!
• There’s a way to supply the required value
as a module-parameter during ‘insmod’
Module parameters
char *svctable; // declare global variable
module_param( svctable, charp, 0444 );
// Then you install your module like this:
$ /sbin/insmod myexport.ko svctable=c0251500
// Linux will assign the address of your input
// string “c0251500” to the ‘svctable’ pointer
simple_strtoul()
• There is a kernel function you can use, in your
‘init_module()’ function, that will convert a string
of hexadecimal digits into an ‘unsigned long’:
int init_module( void )
{
unsigned long myval;
myval = simple_strtoul( svctable, NULL, 16 );
sys_call_table = (unsigned long *)myval;
return 0;
}
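Because the resulting value becomes a pointer into the kernel, it is worth rejecting a missing or malformed string before using it. The user-space sketch below shows one way to do the checking, using the standard C library’s `strtoul()` in place of the kernel’s `simple_strtoul()`; the name `demo_parse_hex` is hypothetical.

```c
#include <errno.h>
#include <stdlib.h>

/* Convert a string of hexadecimal digits to an unsigned long.
   Returns 0 and stores the value on success; returns -1 if the string
   is missing, empty, out of range, or has characters that are not
   hexadecimal digits. */
int demo_parse_hex(const char *text, unsigned long *value)
{
    char *end;
    unsigned long v;

    if (text == NULL || *text == '\0')
        return -1;                 /* e.g. the parameter was never given */

    errno = 0;
    v = strtoul(text, &end, 16);
    if (errno != 0 || end == text || *end != '\0')
        return -1;                 /* overflow, no digits, or junk at end */

    *value = v;
    return 0;
}
```

With this check, `"c0251500"` is accepted, while `"c02515zz"` or an absent parameter is refused instead of silently yielding a wrong address.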
Shell scripts
• It’s inconvenient – and risks typing errors –
if you must manually search ‘vmlinux’ and
then type in the sys_call_table[]’s address
every time you want to install your module
• Fortunately this sequence of steps can be
readily automated – by using a shell-script
• We have created an example: ‘myscript’
shell-script format
• First line: #!/bin/sh
• Some assignment-statements:
version=$(uname -r)
mapfile=/boot/System.map-$version
• Some commands (useful while debugging)
echo $version
echo $mapfile
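For comparison, the two assignment-statements can be sketched in C with the `uname()` library function, which reports the same release string that ‘uname -r’ prints. The function name `demo_mapfile_path` is hypothetical, and the sketch assumes a POSIX system.

```c
#include <stdio.h>
#include <sys/utsname.h>

/* Build the text "/boot/System.map-<release>" in buf.
   Returns 0 on success, -1 if uname() fails or buf is too small. */
int demo_mapfile_path(char *buf, size_t size)
{
    struct utsname info;
    int n;

    if (uname(&info) != 0)
        return -1;

    n = snprintf(buf, size, "/boot/System.map-%s", info.release);
    if (n < 0 || (size_t)n >= size)
        return -1;                 /* output error or truncated result */
    return 0;
}
```

The shell version remains the more convenient choice here; the C form only makes explicit where the version string comes from.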
The ‘cut’ command
• You can use the ‘cut’ operation on a line of
text to remove the parts you don’t want
• An output-line from the ‘grep’ program can
be piped in as an input-line to ‘cut’
• You supply a command-line argument to
the ‘cut’ program, to tell it which parts of
the character-array you wish to retain:
– For example: cut -c1-8
– Only characters 1 through 8 will be retained (‘cut’ numbers characters from 1)
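What ‘cut’ does to each line with a character range can be sketched in C as follows. The function name `demo_cut_chars` is hypothetical, and positions are counted from 1, the same way ‘cut’ counts them.

```c
#include <stddef.h>

/* Copy characters first..last (1-based, inclusive) of one text line into
   out, stopping early at the end of the line. The result is always
   NUL-terminated when out_size > 0. Returns the number of characters kept. */
size_t demo_cut_chars(const char *line, size_t first, size_t last,
                      char *out, size_t out_size)
{
    size_t n = 0;

    if (out_size == 0)
        return 0;
    if (first >= 1) {
        for (size_t pos = 1; pos <= last; pos++) {
            char c = line[pos - 1];
            if (c == '\0' || c == '\n')
                break;                     /* the line ended early */
            if (pos >= first && n + 1 < out_size)
                out[n++] = c;
        }
    }
    out[n] = '\0';
    return n;
}
```

Applied with the range 1 to 8 to a line that begins with an eight-digit hexadecimal address, it keeps just the address.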
Finishing up
• Our ‘myscript’ concludes by executing the
command which installs our ‘myexport.ko’
module into the kernel, and automatically
supplies the required module-parameter
• If your ‘/boot’ directory doesn’t happen to
have the ‘System.map’ file in it, you can
extract the ‘sys_call_table[]’ address from
the uncompressed ‘vmlinux’ kernel-binary
The ‘objdump’ program
• The ‘vmlinux’ file contains a Symbol-Table
section that includes ‘sys_call_table’
• You can display that Symbol-Table using
the ‘objdump’ command with the -t flag:
$ objdump -t /usr/src/linux/vmlinux
• You can pipe the output into ‘grep’ to find
the ‘sys_call_table’ symbol-value
• You can use ‘cut’ to isolate the address
Which entry can we change?
• We would not want to risk disrupting the
normal Linux behavior through unintended
alterations of some vital system-service
• But a few entries in ‘sys_call_table[]’ are
no longer being used by the newer kernels
• If documented as being ‘obsolete’ it would
be reasonably safe for us to ‘reuse’ an
array-entry for our own purposes
• For example: system-call 17 is ‘obsolete’
‘newcall.c’
• We created this module to demonstrate
the ‘dynamic kernel patching’ technique
• It installs a function for system-call 17
• This function increments the value stored
in a variable of type ‘int’ whose address is
supplied as a function-argument
• We wrote the ‘try17.cpp’ demo to test it!
Recently an extra obstacle
• Some recent versions of the Linux kernel
(including ours in the classroom and labs)
have placed the ‘sys_call_table[]’ (as the
default configuration-option) in ‘read-only’
memory within kernel-space, despite the
already existing protections of ‘ring 0’
• What this achieves is creation of an added
obstacle to alterations by privileged-code
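page-frame attributes
0xC0251500 = 1100 0000 00 | 10 0101 0001 | 0101 0000 0000
(virtual memory address of our ‘sys_call_table[]’ array)
[Diagram: CR3 locates the Page-Directory; the first field of the address selects a Page-Directory entry, the second field selects an entry in one of the Page-Tables, and the third field is the offset of ‘sys_call_table’ within the Page-Frame]
frame attributes: bit 2 = S/U, bit 1 = R/W, bit 0 = P
We cannot modify entries in ‘sys_call_table[]’ unless its page-frame is ‘writable’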
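The way a 32-bit virtual address divides into its three fields can be checked with a few shifts and masks. The sketch assumes classic two-level paging with 4KB page-frames (10 + 10 + 12 bits); the names `demo_va_parts` and `demo_split_va` are hypothetical.

```c
#include <stdint.h>

/* The three fields of a 32-bit virtual address under two-level paging. */
struct demo_va_parts {
    uint32_t dir_index;    /* bits 31..22: entry in the Page-Directory */
    uint32_t table_index;  /* bits 21..12: entry in the Page-Table     */
    uint32_t offset;       /* bits 11..0:  byte within the Page-Frame  */
};

struct demo_va_parts demo_split_va(uint32_t va)
{
    struct demo_va_parts p;

    p.dir_index   = va >> 22;
    p.table_index = (va >> 12) & 0x3FF;
    p.offset      = va & 0xFFF;
    return p;
}
```

For the address 0xC0251500 this gives directory index 0x300, table index 0x251, and offset 0x500.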
Tweak page-frame attributes
• Our ‘newcall.c’ module needs to be sure it
can modify entry 17 in ‘sys_call_table[]’
• So it locates the page-table entry for the
page-frame containing ‘sys_call_table[]’
and sets its ‘writable’ bit to be ‘TRUE’
• But it preserves the previous value of that
entry, so it can be restored if we remove
our ‘newcall.ko’ object from the kernel
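The bit manipulation involved can be sketched on a plain 32-bit value standing in for the page-table entry. This is not the code of ‘newcall.c’: the `demo_` names are hypothetical, the entry is an ordinary variable rather than a live page-table entry, and a real module would also need to consider address translations the processor has already cached.

```c
#include <stdint.h>

#define DEMO_PTE_PRESENT  0x1u   /* bit 0: P   */
#define DEMO_PTE_WRITABLE 0x2u   /* bit 1: R/W */
#define DEMO_PTE_USER     0x4u   /* bit 2: S/U */

static uint32_t demo_saved_pte;  /* previous entry, kept for restoring */

/* Remember the entry as it was, then set its 'writable' bit. */
void demo_make_writable(uint32_t *pte)
{
    demo_saved_pte = *pte;
    *pte |= DEMO_PTE_WRITABLE;
}

/* Put back exactly the value that was saved. */
void demo_restore(uint32_t *pte)
{
    *pte = demo_saved_pte;
}

/* Report whether the 'writable' bit is set in an entry. */
int demo_is_writable(uint32_t pte)
{
    return (pte & DEMO_PTE_WRITABLE) != 0;
}
```

Saving the whole entry, rather than only the one bit, means the restore step cannot disturb the other attribute bits or the frame address.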
In-class exercise #1
• Write a kernel module (named ‘unused.c’)
which will create a pseudo-file that reports
how many ‘unimplemented’ system-calls
are still available. The total number of
locations in the ‘sys_call_table[]’ array is
given by a defined constant: NR_syscalls
so you can just search the array to count
how many entries match ‘sys_ni_syscall’
(it’s the value found initially in location 17)
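The counting step at the heart of this exercise can be tried out in user-space first. The sketch below searches a simulated table for entries equal to a placeholder routine; the `demo_` names and the sample routines are hypothetical, and creating the pseudo-file is left to the module itself.

```c
#include <stddef.h>

typedef long (*demo_service_fn)(void);

/* Stand-in for the 'unimplemented' placeholder routine. */
long demo_ni_syscall(void) { return -1; }

/* Stand-ins for two implemented services. */
long demo_service_a(void) { return 1; }
long demo_service_b(void) { return 2; }

/* Count how many of the nr table entries point at the placeholder. */
size_t demo_count_unused(const demo_service_fn *table, size_t nr,
                         demo_service_fn placeholder)
{
    size_t count = 0;

    for (size_t i = 0; i < nr; i++)
        if (table[i] == placeholder)
            count++;
    return count;
}
```

In the module, the table would be ‘sys_call_table[]’, the length NR_syscalls, and the placeholder the value found in location 17.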