Monitoring Docker processes with osquery

Overview

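Besides the usual process monitoring, osquery can also monitor Docker, chiefly the processes running in Docker containers. All Docker-related tables in osquery start with the docker prefix, such as docker_container_labels and docker_container_ports. Here we focus on docker_container_processes.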

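Our ultimate goal is to associate the processes seen on the host with the processes running inside Docker containers. The question then becomes: how do we make that association?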

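First, note that if a process runs inside a Docker container, its pid_namespace is the same as the pid_namespace of that container, so the two values can serve as the bridge. osquery happens to provide both a docker_containers and a process_namespaces table, and the association works as follows: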

pid =====> pid_namespace(process_namespaces) =====> pid_namespace(docker_containers) ======> image,name
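The chain above can be sketched as a plain in-memory lookup. This is only an illustration of the join logic, not osquery code: ContainerInfo, ProcessNamespaces and findContainerForPid are hypothetical names, and the data is assumed to have already been read from the two tables.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for one row of docker_containers.
struct ContainerInfo {
  std::string name;
  std::string image;
  std::string pid_namespace;
};

// Hypothetical stand-in for process_namespaces: pid -> pid_namespace.
using ProcessNamespaces = std::map<int, std::string>;

// Follow pid -> pid_namespace -> container. Returns nullptr when the pid is
// unknown or its namespace matches no container (an ordinary host process).
const ContainerInfo* findContainerForPid(
    int pid,
    const ProcessNamespaces& process_namespaces,
    const std::vector<ContainerInfo>& containers) {
  const auto it = process_namespaces.find(pid);
  if (it == process_namespaces.end()) {
    return nullptr;
  }
  for (const auto& container : containers) {
    if (container.pid_namespace == it->second) {
      return &container;
    }
  }
  return nullptr;
}
```

The returned pointer refers into the containers vector, so it stays valid only as long as that vector does.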

process_namespaces

processNamespacesTablePlugin

The file is located at build/linux/generated/tables_additional/process_namespaces.cpp
The key code is as follows:

QueryData generate(QueryContext& context) override {
  auto results = tables::genProcessNamespaces(context);
  return results;
}

TableColumns columns() const override {
  return {
    std::make_tuple("pid", INTEGER_TYPE, ColumnOptions::INDEX),
    std::make_tuple("cgroup_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
    std::make_tuple("ipc_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
    std::make_tuple("mnt_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
    std::make_tuple("net_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
    std::make_tuple("pid_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
    std::make_tuple("user_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
    std::make_tuple("uts_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
  };
}

genProcessNamespaces

Next, step into genProcessNamespaces().
The code is located at: osquery/tables/system/linux/processes.cpp

QueryData genProcessNamespaces(QueryContext& context) {
  QueryData results;

  const auto pidlist = getProcList(context);
  for (const auto& pid : pidlist) {
    genNamespaces(pid, results);
  }

  return results;
}

Judging by the function names, const auto pidlist = getProcList(context); obtains all pids, and genNamespaces() then collects the namespaces of each pid.
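getProcList() itself is not shown in this article. As a rough idea of what enumerating pids on Linux involves, here is a minimal sketch that lists the all-digit directory names under /proc. The names isPidDirName and listPids are hypothetical, and the real osquery function may behave differently, for example by taking constraints from the query context into account.

```cpp
#include <dirent.h>

#include <cctype>
#include <string>
#include <vector>

// True when a /proc entry name consists only of digits, i.e. names a process.
bool isPidDirName(const std::string& name) {
  if (name.empty()) {
    return false;
  }
  for (unsigned char c : name) {
    if (!std::isdigit(c)) {
      return false;
    }
  }
  return true;
}

// Collect the numeric entries under proc_root (normally "/proc").
std::vector<std::string> listPids(const std::string& proc_root) {
  std::vector<std::string> pids;
  DIR* dir = opendir(proc_root.c_str());
  if (dir == nullptr) {
    return pids;  // missing or unreadable directory: report no processes
  }
  while (const dirent* entry = readdir(dir)) {
    if (isPidDirName(entry->d_name)) {
      pids.emplace_back(entry->d_name);
    }
  }
  closedir(dir);
  return pids;
}
```

Calling listPids("/proc") on a Linux host returns the pids as strings, which is also the form genNamespaces() receives them in.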

genNamespaces

Step into genNamespaces():

void genNamespaces(const std::string& pid, QueryData& results) {
  Row r;

  ProcessNamespaceList proc_ns;
  Status status = procGetProcessNamespaces(pid, proc_ns);
  if (!status.ok()) {
    VLOG(1) << "Namespaces for pid " << pid
            << " are imcomplete: " << status.what();
  }

  r["pid"] = pid;
  for (const auto& pair : proc_ns) {
    r[pair.first + "_namespace"] = std::to_string(pair.second);
  }

  results.push_back(r);
}

procGetProcessNamespaces() retrieves the namespaces of each process.

procGetProcessNamespaces

Step into procGetProcessNamespaces().
The file is located at: osquery/filesystem/linux/proc.cpp

const std::vector<std::string> kUserNamespaceList = {
    "cgroup", "ipc", "mnt", "net", "pid", "user", "uts"};
const std::string kLinuxProcPath = "/proc";

Status procGetProcessNamespaces(const std::string& process_id,
                                ProcessNamespaceList& namespace_list,
                                std::vector<std::string> namespaces) {
  namespace_list.clear();

  if (namespaces.empty()) {
    namespaces = kUserNamespaceList;
  }

  auto process_namespace_root = kLinuxProcPath + "/" + process_id + "/ns";

  for (const auto& namespace_name : namespaces) {
    ino_t namespace_inode;
    auto status = procGetNamespaceInode(
        namespace_inode, namespace_name, process_namespace_root);
    if (!status.ok()) {
      continue;
    }

    namespace_list[namespace_name] = namespace_inode;
  }

  return Status(0, "OK");
}

procGetProcessNamespaces() resolves the entries under /proc/<pid>/ns and maps each namespace name to its inode. For example:

$ sudo ls -al /proc/18759/ns
dr-x--x--x 2 root root 0 Jun 30 17:08 .
dr-xr-xr-x 9 root root 0 Jun 30 15:45 ..
lrwxrwxrwx 1 root root 0 Jun 30 17:08 cgroup -> 'cgroup:[4026531835]'
lrwxrwxrwx 1 root root 0 Jun 30 17:08 ipc -> 'ipc:[4026532369]'
lrwxrwxrwx 1 root root 0 Jun 30 17:08 mnt -> 'mnt:[4026532367]'
lrwxrwxrwx 1 root root 0 Jun 30 17:08 net -> 'net:[4026532372]'
lrwxrwxrwx 1 root root 0 Jun 30 17:08 pid -> 'pid:[4026532370]'
lrwxrwxrwx 1 root root 0 Jun 30 17:08 pid_for_children -> 'pid:[4026532370]'
lrwxrwxrwx 1 root root 0 Jun 30 17:08 user -> 'user:[4026531837]'
lrwxrwxrwx 1 root root 0 Jun 30 17:08 uts -> 'uts:[4026532368]'

This gives the inode number of every namespace the pid belongs to.

The whole flow is simple: iterate over all pids and record the inode number of each of their namespaces.
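procGetNamespaceInode() is not shown in this article, so the following is only a sketch of one way to obtain the inode: read the symlink target with readlink(2) and parse the type:[inode] form seen in the listing above. The names readNamespaceLink and parseNamespaceLink are hypothetical, and osquery may obtain the inode differently.

```cpp
#include <unistd.h>

#include <cstdlib>
#include <string>

// Read the link target of /proc/<pid>/ns/<ns_name>; empty string on failure.
std::string readNamespaceLink(const std::string& pid,
                              const std::string& ns_name) {
  const std::string path = "/proc/" + pid + "/ns/" + ns_name;
  char buf[256];
  // readlink() does not null-terminate, so the returned length is used.
  const ssize_t len = readlink(path.c_str(), buf, sizeof(buf));
  if (len <= 0) {
    return "";
  }
  return std::string(buf, static_cast<std::size_t>(len));
}

// Parse a link target such as "pid:[4026532370]". On success, store the
// namespace type and inode number and return true.
bool parseNamespaceLink(const std::string& target,
                        std::string& type,
                        unsigned long long& inode) {
  const auto open = target.find(":[");
  if (open == std::string::npos || open == 0 || target.back() != ']') {
    return false;
  }
  const std::string digits =
      target.substr(open + 2, target.size() - open - 3);
  if (digits.empty() ||
      digits.find_first_not_of("0123456789") != std::string::npos) {
    return false;
  }
  type = target.substr(0, open);
  inode = std::strtoull(digits.c_str(), nullptr, 10);
  return true;
}
```

Reading another user's /proc/<pid>/ns entries usually requires elevated privileges, which is why the listing above was produced with sudo.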

docker_containers

dockerContainersTablePlugin

The file is located at build/bionic/generated/tables_additional/docker_containers.cpp
Its main code is as follows:

class dockerContainersTablePlugin : public TablePlugin {
 private:
  TableColumns columns() const override {
    return {
      std::make_tuple("id", TEXT_TYPE, ColumnOptions::INDEX),
      std::make_tuple("name", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("image", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("image_id", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("command", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("created", BIGINT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("state", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("status", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("pid", BIGINT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("path", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("config_entrypoint", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("started_at", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("finished_at", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("privileged", INTEGER_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("security_options", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("env_variables", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("readonly_rootfs", INTEGER_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("cgroup_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("ipc_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("mnt_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("net_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("pid_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("user_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
      std::make_tuple("uts_namespace", TEXT_TYPE, ColumnOptions::DEFAULT),
    };
  }

  TableAttributes attributes() const override {
    return TableAttributes::NONE;
  }

  QueryData generate(QueryContext& context) override {
    auto results = tables::genContainers(context);

    return results;
  }
};

The function that produces the data is genContainers().

genContainers

Step into genContainers().
The file is located at: osquery/tables/applications/posix/docker.cpp

QueryData genContainers(QueryContext& context) {
  QueryData results;
  std::set<std::string> ids;
  pt::ptree containers;
  auto s = getContainers(context, ids, containers);
  if (!s.ok()) {
    return results;
  }

  for (const auto& entry : containers) {
    const pt::ptree& container = entry.second;
    Row r;
    r["id"] = getValue(container, ids, "Id");
    if (container.count("Names") > 0) {
      for (const auto& name : container.get_child("Names")) {
        r["name"] = name.second.data();
        break;
      }
    }
    .......
  }
}

The for loop only extracts fields such as image_id, image and command from each entry; the call that actually fetches the data is getContainers(context, ids, containers);

getContainers

Step into getContainers():

Status getContainers(QueryContext& context,
                     std::set<std::string>& ids,
                     pt::ptree& containers) {
  std::string query;
  getQuery(context, "id", query, ids, true);

  Status s = dockerApi("/containers/json" + query, containers);
  if (!s.ok()) {
    VLOG(1) << "Error getting docker containers: " << s.what();
    return s;
  }
  return Status(0);
}

The call getQuery(context, "id", query, ids, true); builds the query string.

void getQuery(QueryContext& context,
              const std::string& key,
              std::string& query,
              std::set<std::string>& set,
              bool add_all) {
  if (!context.constraints[key].exists(EQUALS)) {
    return;
  }

  std::string key_str;
  for (const auto& item : context.constraints[key].getAll(EQUALS)) {
    if (!checkConstraintValue(item)) {
      continue;
    }
    if (!key_str.empty()) {
      key_str.append("%2C"); // comma
    }
    key_str.append("%22").append(item).append("%22%3Atrue"); // "item":true
    set.insert(item);
  }

  query.append("?");
  if (add_all) {
    query.append("all=1&");
  }
  // filters={"key": {"item1":true, "item2":true, ...}}
  query.append("filters=%7B%22")
      .append(key)
      .append("%22%3A%7B")
      .append(key_str)
      .append("%7D%7D");
}

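If the query carries no equality constraint on the key, the function returns immediately through if (!context.constraints[key].exists(EQUALS)) { return; } and query stays empty. Otherwise it combines the constraint values into a percent-encoded filter of the form filters={"key": {"item1":true, "item2":true, ...}}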
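To make the encoding concrete, here is a standalone sketch of the same string construction without the osquery QueryContext. buildFilterQuery is a hypothetical name, the items are assumed to have been validated already (the original does this with checkConstraintValue()), and an empty item list stands in for the "no EQUALS constraint" case.

```cpp
#include <string>
#include <vector>

// Build the query string for /containers/json in the same shape getQuery()
// produces: ?all=1&filters={"<key>":{"<item>":true,...}} with the JSON
// punctuation percent-encoded. Returns "" when there is nothing to filter on.
std::string buildFilterQuery(const std::string& key,
                             const std::vector<std::string>& items,
                             bool add_all) {
  if (items.empty()) {
    return "";
  }
  std::string entries;
  for (const auto& item : items) {
    if (!entries.empty()) {
      entries += "%2C";  // ,
    }
    entries += "%22" + item + "%22%3Atrue";  // "item":true
  }
  std::string query = "?";
  if (add_all) {
    query += "all=1&";
  }
  // %7B = {   %22 = "   %3A = :   %7D = }
  query += "filters=%7B%22" + key + "%22%3A%7B" + entries + "%7D%7D";
  return query;
}
```

For key "id" and the single item "abc" with add_all set, the result is ?all=1&filters=%7B%22id%22%3A%7B%22abc%22%3Atrue%7D%7D, which decodes to ?all=1&filters={"id":{"abc":true}}.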

getQuery() leaves the final query string in query, after which dockerApi("/containers/json" + query, containers) is called; the result of dockerApi() is stored in containers.

dockerApi

Step into dockerApi():

FLAG(string,
     docker_socket,
     "/var/run/docker.sock",
     "Docker UNIX domain socket path");

Status dockerApi(const std::string& uri, pt::ptree& tree) {
  static const std::regex httpOkRegex("HTTP/1\\.(0|1) 200 OK\\\r");

  try {
    // The Docker Unix domain socket defaults to /var/run/docker.sock
    local::stream_protocol::endpoint ep(FLAGS_docker_socket);
    // Wrap the socket connection in an iostream
    local::stream_protocol::iostream stream(ep);
    if (!stream) {
      return Status(
          1, "Error connecting to docker sock: " + stream.error().message());
    }

    // Since keep-alive connections are not used, use HTTP/1.0
    // Send the HTTP request
    stream << "GET " << uri
           << " HTTP/1.0\r\nAccept: */*\r\nConnection: close\r\n\r\n"
           << std::flush;
    if (stream.eof()) {
      stream.close();
      return Status(1, "Empty docker API response for: " + uri);
    }

    // All status responses are expected to be 200
    // Read the first line of the response (the status line) into str
    std::string str;
    getline(stream, str);

    std::smatch match;
    if (!std::regex_match(str, match, httpOkRegex)) {
      stream.close();
      return Status(1, "Invalid docker API response for " + uri + ": " + str);
    }

    // Skip empty line between header and body
    while (!stream.eof() && str != "\r") {
      getline(stream, str);
    }

    try {
      // Parse the body as JSON into tree
      pt::read_json(stream, tree);
    } catch (const pt::ptree_error& e) {
      stream.close();
      return Status(
          1, "Error reading docker API response for " + uri + ": " + e.what());
    }

    stream.close();
  } catch (const std::exception& e) {
    return Status(1, std::string("Error calling docker API: ") + e.what());
  }

  return Status(0);
}

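As the code shows, dockerApi() is simple: it connects to the Unix domain socket /var/run/docker.sock, wraps the connection in an iostream, and sends an ordinary HTTP GET request over it rather than speaking a custom socket protocol. In other words, dockerApi() is a thin client for the Docker remote API. The /containers/json endpoint returns information about all containers, including the image name, container name, container id and so on.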
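The response handling in dockerApi() (check the status line, skip the headers, hand the rest to the JSON parser) can be tried without a Docker daemon by running the same steps against any std::istream. The sketch below is an assumption-laden simplification: readOkBody is a hypothetical name, and it returns the body as a string instead of parsing it with pt::read_json().

```cpp
#include <istream>
#include <regex>
#include <sstream>
#include <string>

// Read an HTTP response from `in`. On a "200 OK" status line, skip the
// headers, store the remaining bytes (the body) in `body` and return true.
bool readOkBody(std::istream& in, std::string& body) {
  static const std::regex ok_line("HTTP/1\\.[01] 200 OK\r?");

  std::string line;
  if (!std::getline(in, line) || !std::regex_match(line, ok_line)) {
    return false;
  }

  // Headers end at the first empty line, which reads as "\r" because
  // getline() only strips the trailing "\n".
  while (std::getline(in, line) && line != "\r" && !line.empty()) {
  }

  std::ostringstream rest;
  rest << in.rdbuf();
  body = rest.str();
  return true;
}
```

Feeding it a std::istringstream holding "HTTP/1.0 200 OK\r\nContent-Type: application/json\r\n\r\n[]" leaves "[]" in body, while any non-200 status line makes it return false.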

container_details

/containers/json lists all containers on the current host, but we still need the details of each individual container.

pt::ptree container_details;
s = dockerApi("/containers/" + r["id"] + "/json?stream=false",
              container_details);
if (s.ok()) {
  r["pid"] =
      BIGINT(container_details.get_child("State").get<pid_t>("Pid", -1));
  .......
}
......
#ifdef __linux__
if (r["pid"] != "-1") {
  ProcessNamespaceList namespace_list;
  s = procGetProcessNamespaces(r["pid"], namespace_list);
  if (s.ok()) {
    for (const auto& pair : namespace_list) {
      r[pair.first + "_namespace"] = std::to_string(pair.second);
    }
  } else {
    VLOG(1) << "Failed to retrieve the namespace list for container "
            << r["id"];
  }
}
#endif

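/containers/<container_id>/json?stream=false returns the process id of the given container, and procGetProcessNamespaces() is then called on that pid to obtain its namespaces. procGetProcessNamespaces() was analyzed above, so it is not repeated here.

Through dockerApi() we thus obtain the following chain: container_id ====> pid ====> namespace.

To sum up, with the three tables processes, process_namespaces and docker_containers we can associate a pid with its container. An example SQL statement: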

SELECT *
FROM processes AS p
LEFT JOIN process_namespaces AS pn ON p.pid = pn.pid
LEFT JOIN docker_containers AS dc ON pn.pid_namespace = dc.pid_namespace;

Summary

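All in all, there is no black magic in how osquery monitors Docker: it simply calls the Docker remote API and combines the result with a walk over all pids to find the container each pid belongs to. It is not a particularly sophisticated technique, but it is a practical way to monitor Docker and a useful reference.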