An analysis of how snoopy records commands

Overview

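Its description reads: "Snoopy is a tiny library that logs all executed commands (+ arguments) on your system." That made me curious about how it works. As far as I know, the available approaches to command monitoring are: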

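  • Modify the bash configuration so that it records commands. The typical example of this approach is cornerstone.
  • Replace bash outright with a custom build that is able to record commands.
  • auditd, which works by recording the execve system call (sys_execve).
  • execve_hook, which records the commands passed to the execve system call.
  • bash readline, which uses uprobes (more precisely uretprobes, the variant that fires when a function returns) to hook the readline() function and read its result.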

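This article analyzes how snoopy manages to monitor commands, to see whether it relies on some other technique or is simply one of the approaches listed above.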

snoopy

We test on Ubuntu 18.04 and install snoopy directly with apt install.

$ apt install snoopy

During installation, the following prompt appears:

snoopy is a library that can only reliably do its work if it is mandatorily preloaded via /etc/ld.so.preload. Since this can potentially do harm to the system, your consent is needed.
Install snoopy library to /etc/ld.so.preload?

In other words, it asks whether snoopy may be installed into /etc/ld.so.preload. After a successful installation, we check the file:

$ cat /etc/ld.so.preload
/lib/x86_64-linux-gnu/libsnoopy.so

/lib/x86_64-linux-gnu/libsnoopy.so now appears in /etc/ld.so.preload. According to the documentation:

A list of additional, user-specified, ELF shared libraries to be loaded before all others. The items of the list can be separated by spaces or colons.
This can be used to selectively override functions in other shared libraries. The libraries are searched for using the rules given under DESCRIPTION.
For set-user-ID/set-group-ID ELF binaries, preload pathnames containing slashes are ignored, and libraries in the standard search directories are loaded only if the set-user-ID permission bit is enabled on the library file.

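In short, the shared libraries listed in ld.so.preload are loaded before all other shared libraries. A library added to ld.so.preload can therefore override functions defined in the libraries loaded after it, which is how a hook gets installed. My guess is that snoopy works the same way.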
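To make the override mechanism concrete, here is a minimal sketch of an interposing library. It is not taken from snoopy: the file name preload_demo.c, the counter hook_calls, and the choice of puts() as the hooked function are all assumptions made for illustration. The wrapper has the same name and signature as the libc function, so callers bind to it first, and it forwards to the original through dlsym(RTLD_NEXT, ...).

```c
/* preload_demo.c: hypothetical example, not part of snoopy. */
#ifndef _GNU_SOURCE
# define _GNU_SOURCE           /* exposes RTLD_NEXT on glibc */
#endif
#include <dlfcn.h>
#include <stdio.h>

#ifndef RTLD_NEXT              /* same fallback value that snoopy.c uses */
# define RTLD_NEXT ((void *) -1L)
#endif

/* Number of times the wrapper below has been entered (illustrative only). */
static int hook_calls = 0;

/* Same name and signature as libc's puts(), so callers resolve to this one. */
int puts(const char *s)
{
    static int (*real_puts)(const char *) = NULL;

    if (real_puts == NULL) {
        /* Look up the next definition of "puts" after this object: libc's. */
        real_puts = (int (*)(const char *)) dlsym(RTLD_NEXT, "puts");
    }
    hook_calls++;

    /* Forward to the real function; report failure if it was not found. */
    return real_puts != NULL ? real_puts(s) : EOF;
}
```

Built as a shared object (for example with gcc -shared -fPIC), a file like this can be tried on a single command through the LD_PRELOAD environment variable before touching the system-wide /etc/ld.so.preload.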

$ ldd /bin/ls
linux-vdso.so.1 (0x00007ffea29c4000)
/lib/x86_64-linux-gnu/libsnoopy.so (0x00007f8048c4e000)
libselinux.so.1 => /lib/x86_64-linux-gnu/libselinux.so.1 (0x00007f8048a26000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f8048635000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f8048416000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f8048212000)
libpcre.so.3 => /lib/x86_64-linux-gnu/libpcre.so.3 (0x00007f8047fa0000)
/lib64/ld-linux-x86-64.so.2 (0x00007f804907c000)

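As the ldd output shows, /lib/x86_64-linux-gnu/libsnoopy.so (0x00007f8048c4e000) is indeed among the loaded shared libraries. So our guess is that snoopy overrides certain library functions in libc.so to implement command logging. The recorded commands show up in /var/log/auth.log: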

$ tail -f /var/log/auth.log
Dec 12 04:47:21 ubuntu snoopy[2224]: [login:spoock ssh:((undefined)) sid:2130 tty:/dev/pts/2 (1000/spoock) uid:spoock(1000)/spoock(1000) cwd:/home/spoock]: id
Dec 12 04:47:23 ubuntu snoopy[2225]: [login:spoock ssh:((undefined)) sid:2130 tty:/dev/pts/2 (1000/spoock) uid:spoock(1000)/spoock(1000) cwd:/home/spoock]: whoami

snoopy source code analysis

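The current snoopy project as a whole is quite large. To make the overall architecture and the actual logging mechanism easier to see, I chose to analyze the revision "Release generation script fixups III".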

snoopy.c

/* snoopy.c -- execve() logging wrapper 
* Copyright (c) 2000 marius@linux.com,mbm@linux.com
*
* $Id: snoopy.c 30 2010-02-13 16:31:16Z bostjanskufca $
*
* Part hacked on flight KL 0617, 30,000 ft or so over the Atlantic :)
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 2, or (at your option)
* any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License
* along with this program; if not, write to the Free Software Foundation,
* Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
*/
#include "config.h"
#include "snoopy.h"
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <dlfcn.h>
#include <syslog.h>
#include <string.h>
#include <unistd.h>
#include <limits.h>

#define min(a,b) a<b ? a : b

#if defined(RTLD_NEXT)
# define REAL_LIBC RTLD_NEXT
#else
# define REAL_LIBC ((void *) -1L)
#endif

#define FN(ptr,type,name,args) ptr = (type (*)args)dlsym (REAL_LIBC, name)



static inline void snoopy_log(const char *filename, char *const argv[])
{
    char *logString = NULL;
    size_t logStringLength = 0;
    char cwd[PATH_MAX+1];
    char *getCwdRet = NULL;

    char *ttyPath = NULL;
    char ttyPathEmpty[] = "";

    int i = 0;
    int argc = 0;
    size_t argLength = 0;


#if SNOOPY_ROOT_ONLY
    if ((geteuid() != 0) && (getuid() != 0)) {
        return;
    }
#endif


    /* Count number of arguments */
    for (argc=0 ; *(argv+argc) != '\0' ; argc++);

    /* Get ttyname */
    ttyPath = ttyname(0);
    if (ttyPath == NULL) {
        ttyPath = ttyPathEmpty;
    }

    /* Allocate memory for logString */
    logStringLength = 0;
    for (i=0 ; i<argc ; i++) {
        /* Argument length + space */
        logStringLength += sizeof(char) * (min(SNOOPY_MAX_ARG_LENGTH, strlen(argv[i])) + 1);
    }
    logString = (char *) malloc(logStringLength + 1); /* +1 for last \0 */

    /* Create logString */
    strcpy(logString, "");
    for (i=0 ; i<argc ; i++) {
        argLength = strlen(argv[i]);
        strncat(logString, argv[i], min(SNOOPY_MAX_ARG_LENGTH, argLength));
        strcat(logString, " ");
    }
    strcat(logString, "\0");

    /* Log it */
    openlog("snoopy", LOG_PID, LOG_AUTHPRIV);
#if defined(SNOOPY_CWD_LOGGING)
    getCwdRet = getcwd(cwd, PATH_MAX+1);
    syslog(LOG_INFO, "[uid:%d sid:%d cwd:%s tty:%s]: %s", getuid(), getsid(0), cwd, ttyPath, logString);
#else
    syslog(LOG_INFO, "[uid:%d sid:%d tty:%s]: %s", getuid(), getsid(0), ttyPath, logString);
#endif

    /* Free the logString memory */
    free(logString);
}



int execve(const char *filename, char *const argv[], char *const envp[])
{
    static int (*func)(const char *, char **, char **);

    FN(func,int,"execve",(const char *, char **const, char **const));
    snoopy_log(filename, argv);

    return (*func) (filename, (char**) argv, (char **) envp);
}



int execv(const char *filename, char *const argv[]) {
    static int (*func)(const char *, char **);

    FN(func,int,"execv",(const char *, char **const));
    snoopy_log(filename, argv);

    return (*func) (filename, (char **) argv);
}

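snoopy.c defines three functions: execv, execve, and snoopy_log. execv and execve are both libc library functions, and both eventually invoke the sys_execve system call. Commands executed on the system ultimately go through the execve() family of functions, so once these libc functions are overridden, snoopy can record the commands that dynamically linked programs launch through them.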

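FN(func,int,"execv",(const char *, char **const)); uses dlsym() to look up the address of the real function. The wrapper then calls snoopy_log() to record the command and finally calls the real function through func.
In snoopy_log():

  1. strncat(logString, argv[i], min(SNOOPY_MAX_ARG_LENGTH, argLength)); collects every argument passed to execve. Joined together, they form the command.
  2. syslog(LOG_INFO, "[uid:%d sid:%d cwd:%s tty:%s]: %s", getuid(), getsid(0), cwd, ttyPath, logString); writes the result to the system log. On Ubuntu, the corresponding file is /var/log/auth.log.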
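The argument-joining step can be sketched in isolation. The helper below is hypothetical (the name join_args and its max_arg_len parameter are not from snoopy); it mirrors what snoopy_log() does with argv, including the trailing space after each argument. One detail worth noting when reading the original: the min macro in snoopy.c is defined without parentheses, so in the length calculation the "+ 1" appears to bind only to the else branch of the conditional. The sketch avoids the macro and clamps the length explicitly.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of snoopy: joins a NULL-terminated argv
 * into one string, truncating each argument to max_arg_len bytes and
 * appending a space after every argument (as snoopy_log() does).
 * Returns a malloc'd string that the caller must free, or NULL on failure.
 */
char *join_args(char *const argv[], size_t max_arg_len)
{
    size_t total = 1;                    /* room for the final '\0' */
    size_t i, len;
    char *out, *p;

    /* First pass: compute the exact buffer size. */
    for (i = 0; argv[i] != NULL; i++) {
        len = strlen(argv[i]);
        if (len > max_arg_len)
            len = max_arg_len;
        total += len + 1;                /* clamped argument + one space */
    }

    out = (char *) malloc(total);
    if (out == NULL)
        return NULL;

    /* Second pass: copy each clamped argument followed by a space. */
    p = out;
    for (i = 0; argv[i] != NULL; i++) {
        len = strlen(argv[i]);
        if (len > max_arg_len)
            len = max_arg_len;
        memcpy(p, argv[i], len);
        p += len;
        *p++ = ' ';
    }
    *p = '\0';
    return out;
}
```

Tracking the write position with a pointer also avoids rescanning the buffer on every strncat() call, although for typical command lines the difference is negligible.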

We can watch a command being executed with strace. Right after execve, the dynamic loader opens /etc/ld.so.preload:

$ strace ls /abc
execve("/bin/ls", ["ls", "/abc"], 0x7ffe37d7b218 /* 50 vars */) = 0
brk(NULL) = 0x555b5553a000
access("/etc/ld.so.nohwcap", F_OK) = -1 ENOENT (No such file or directory)
access("/etc/ld.so.preload", R_OK) = 0
openat(AT_FDCWD, "/etc/ld.so.preload", O_RDONLY|O_CLOEXEC) = 3
fstat(3, {st_mode=S_IFREG|0644, st_size=35, ...}) = 0
mmap(NULL, 35, PROT_READ|PROT_WRITE, MAP_PRIVATE, 3, 0) = 0x7f2cb74bf000
close(3)
.......

detect

Besides the command logging in snoopy.c, there is also a file named detect.c. Its code is as follows:

void *handle = dlopen("/lib/libc.so.6", RTLD_LAZY);
//simple test to see if the execve in memory matches libc.so.6
if (dlsym(handle, "execve") != dlsym(RTLD_DEFAULT, "execve"))
    printf("something fishy...\n");
else
    printf("secure\n");
return 0;

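It checks whether the execve that the process resolves by default is the same function as the execve exported by libc.so.6. If the two addresses differ, some other library (such as snoopy) has overridden execve, and the program prints "something fishy...". If they match, nothing has hooked execve through this mechanism.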
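The same check can be written as a self-contained function. This is a sketch under assumptions, not the upstream detect.c: the function name execve_is_interposed is hypothetical, and it opens libc by its soname "libc.so.6" rather than the fixed path "/lib/libc.so.6", because on the Ubuntu machine above libc lives at /lib/x86_64-linux-gnu/libc.so.6. It also checks the dlopen() result: on glibc, RTLD_DEFAULT is a null pointer, so a failed dlopen() would make both lookups identical and the comparison would always report "secure".

```c
#ifndef _GNU_SOURCE
# define _GNU_SOURCE           /* exposes RTLD_DEFAULT on glibc */
#endif
#include <dlfcn.h>
#include <stdio.h>

#ifndef RTLD_DEFAULT           /* glibc's value, used only as a fallback */
# define RTLD_DEFAULT ((void *) 0)
#endif

/*
 * Hypothetical helper: returns 1 if the execve visible to this process
 * differs from the one exported by libc itself, 0 if they match, and -1
 * if libc or its execve symbol could not be found.
 * The soname "libc.so.6" is an assumption that holds on glibc systems.
 */
int execve_is_interposed(void)
{
    void *libc = dlopen("libc.so.6", RTLD_LAZY);
    void *in_libc, *visible;

    if (libc == NULL)
        return -1;

    in_libc = dlsym(libc, "execve");          /* libc's own definition */
    visible = dlsym(RTLD_DEFAULT, "execve");  /* first one in search order */
    dlclose(libc);

    if (in_libc == NULL)
        return -1;
    return in_libc != visible;
}
```

A caller could print "something fishy..." when the function returns 1 and "secure" when it returns 0, matching the messages in the excerpt.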

Summary

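snoopy's mechanism turns out to be quite simple: it hooks the execve-related library functions in libc to record commands, which makes it a user-space hooking approach. Knowing that it relies on ld_preload, there are many ways to bypass it, but this article will not go into that topic.
Finding that snoopy uses ld_preload was an unexpected discovery. Once I have studied ld_preload in more depth, I will write a follow-up article on it.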