A GDB Crash

Image: cover of The Art of Debugging with GDB, DDD, and Eclipse, 1st Edition, Kindle Edition.

While debugging a program today, I ran into a segfault in GDB itself. It turned out to be an interesting one, so here is a brief write-up.

Environment

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.6 LTS
Release: 18.04
Codename: bionic

$ gdb --version
GNU gdb (Ubuntu 8.1.1-0ubuntu1) 8.1.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".

Reproduction

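The scenario: I entered a temporary working directory in one terminal, then deleted that directory from another terminal. When I went back to the first terminal and started GDB there, it crashed. For demonstration purposes, we use the following test program: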

#include <stdio.h>

int
main(void)
{
    printf("GDB segfault\n");
    return 0;
}

Compile it with the following command:

$ gcc -g -O0 -o segfault segfault.c

To make later testing easier, add the directory containing the segfault binary to the PATH environment variable.

$ export PATH=$PWD:$PATH

Next, create a new directory t and enter it.

$ mkdir t && cd t

Then delete the directory t from another terminal. Finally, running the following debug command in the first terminal triggers the crash.

$ gdb segfault
gdb: warning: error finding working directory: No such file or directory
GNU gdb (Ubuntu 8.1.1-0ubuntu1) 8.1.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Segmentation fault

Analysis

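With core dumps enabled, the resulting core file shows the following. (According to its "Core was generated by" line, this particular core was captured from a gdb -p 28121 attach session rather than from the gdb segfault invocation above.)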

$ sudo gdb gdb /tmp/core.0.10585.1653385146
GNU gdb (Ubuntu 8.1.1-0ubuntu1) 8.1.1
Copyright (C) 2018 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from gdb...Reading symbols from /usr/lib/debug/.build-id/aa/405dd866e17b9353dd22ef4350c9f765bed9aa.debug...done.
done.
[New LWP 10585]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
Core was generated by `gdb -p 28121'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:65
65 ../sysdeps/x86_64/multiarch/strlen-avx2.S: No such file or directory.
(gdb) bt
#0 __strlen_avx2 () at ../sysdeps/x86_64/multiarch/strlen-avx2.S:65
#1 0x0000560b9bc52469 in gdb_abspath (path=path@entry=0x560b9dd723c0 "system-supplied DSO at 0x7ffc9cd80000") at ./gdb/common/pathstuff.c:143
#2 0x0000560b9bd97d03 in objfile::objfile (this=0x560b9d077640, abfd=0x560b9ddc52b0, name=0x560b9dd723c0 "system-supplied DSO at 0x7ffc9cd80000", flags_=...)
at ./gdb/objfiles.c:400
#3 0x0000560b9bdff2f9 in symbol_file_add_with_addrs (abfd=abfd@entry=0x560b9ddc52b0, name=0x560b9dd723c0 "system-supplied DSO at 0x7ffc9cd80000", add_flags=...,
addrs=addrs@entry=0x560b9cdee4d0, flags=..., parent=parent@entry=0x0) at ./gdb/symfile.c:1153
#4 0x0000560b9bdff9e5 in symbol_file_add_from_bfd (abfd=abfd@entry=0x560b9ddc52b0, name=<optimized out>, add_flags=..., addrs=addrs@entry=0x560b9cdee4d0, flags=...,
flags@entry=..., parent=parent@entry=0x0) at ./gdb/symfile.c:1263
#5 0x0000560b9bbb08f2 in symbol_file_add_from_memory (templ=templ@entry=0x560b9cebeb50, addr=<optimized out>, size=<optimized out>,
name=name@entry=0x560b9dd723c0 "system-supplied DSO at 0x7ffc9cd80000", from_tty=from_tty@entry=0) at ./gdb/symfile-mem.c:133
#6 0x0000560b9bbb0a3f in add_vsyscall_page (target=<optimized out>, from_tty=<optimized out>) at ./gdb/symfile-mem.c:205
#7 0x0000560b9bd9a41d in generic_observer_notify (args=0x7ffd04f1f450, subject=<optimized out>) at ./gdb/observer.c:167
#8 observer_notify_inferior_created (objfile=<optimized out>, from_tty=from_tty@entry=1) at ./observer.inc:426
#9 0x0000560b9bd58861 in post_create_inferior (target=<optimized out>, from_tty=from_tty@entry=1) at ./gdb/infcmd.c:501
#10 0x0000560b9bd5ac3c in setup_inferior (from_tty=from_tty@entry=1) at ./gdb/infcmd.c:2660
#11 0x0000560b9bd5ad3c in attach_post_wait (from_tty=1, mode=ATTACH_POST_WAIT_STOP, args=<optimized out>) at ./gdb/infcmd.c:2689
#12 0x0000560b9bcbe00d in do_my_continuations_1 (err=0, pmy_chain=<synthetic pointer>) at ./gdb/continuations.c:59
#13 do_my_continuations (err=0, list=<optimized out>) at ./gdb/continuations.c:83
#14 do_all_inferior_continuations (err=err@entry=0) at ./gdb/continuations.c:125
#15 0x0000560b9bd54c97 in inferior_event_handler (event_type=<optimized out>, client_data=<optimized out>) at ./gdb/inf-loop.c:59
#16 0x0000560b9bd6bd5c in fetch_inferior_event (client_data=<optimized out>) at ./gdb/infrun.c:3973
#17 0x0000560b9bd274fd in gdb_wait_for_event (block=block@entry=0) at ./gdb/event-loop.c:859
#18 0x0000560b9bd276ef in gdb_do_one_event () at ./gdb/event-loop.c:322
#19 0x0000560b9bd277a6 in gdb_do_one_event () at ./gdb/event-loop.c:353
#20 0x0000560b9be34d3c in wait_sync_command_done () at ./gdb/top.c:503
#21 0x0000560b9be34d7a in maybe_wait_sync_command_done (was_sync=<optimized out>, was_sync@entry=0) at ./gdb/top.c:520
#22 0x0000560b9bd8290f in catch_command_errors (command=0x560b9bd5ae80 <attach_command(char const*, int)>, arg=arg@entry=0x7ffd04f2089f "28121", from_tty=1) at ./gdb/main.c:381
#23 0x0000560b9bd83cbc in captured_main_1 (context=<optimized out>) at ./gdb/main.c:1061
#24 captured_main (data=<optimized out>) at ./gdb/main.c:1147
#25 gdb_main (args=<optimized out>) at ./gdb/main.c:1173
#26 0x0000560b9bb8decb in main (argc=<optimized out>, argv=<optimized out>) at ./gdb/gdb.c:32

The crash is triggered by the strlen() call inside gdb_abspath(), so the argument passed to strlen() is most likely a null pointer. The source of gdb_abspath() is as follows:

gdb::unique_xmalloc_ptr<char>
gdb_abspath (const char *path)
{
  gdb_assert (path != NULL && path[0] != '\0');

  if (path[0] == '~')
    return gdb_tilde_expand_up (path);

  if (IS_ABSOLUTE_PATH (path))
    return gdb::unique_xmalloc_ptr<char> (xstrdup (path));

  /* Beware the // my son, the Emacs barfs, the botch that catch... */
  return gdb::unique_xmalloc_ptr<char>
    (concat (current_directory,
             IS_DIR_SEPARATOR (current_directory[strlen (current_directory) - 1])
             ? "" : SLASH_STRING,
             path, (char *) NULL));
}

Here current_directory is a global variable, and the crash happens because it is NULL. I searched the code for every reference to this variable:

$ grep 'current_directory' -rn . --include "*.c"
./gdb/cli/cli-cmds.c:364: if (strcmp (cwd.get (), current_directory) != 0)
./gdb/cli/cli-cmds.c:366: current_directory, cwd.get ());
./gdb/cli/cli-cmds.c:368: printf_unfiltered (_("Working directory %s.\n"), current_directory);
./gdb/cli/cli-cmds.c:414: xfree (current_directory);
./gdb/cli/cli-cmds.c:415: current_directory = dir_holder.release ();
./gdb/cli/cli-cmds.c:419: if (IS_DIR_SEPARATOR (current_directory[strlen (current_directory) - 1]))
./gdb/cli/cli-cmds.c:420: current_directory = concat (current_directory, dir_holder.get (),
./gdb/cli/cli-cmds.c:423: current_directory = concat (current_directory, SLASH_STRING,
./gdb/cli/cli-cmds.c:430: for (p = current_directory; *p;)
./gdb/cli/cli-cmds.c:444: while (q != current_directory && !IS_DIR_SEPARATOR (q[-1]))
./gdb/cli/cli-cmds.c:447: if (q == current_directory)
./gdb/cli/cli-cmds.c:448: /* current_directory is
./gdb/cli/cli-cmds.c:722: chdir (current_directory);
./gdb/source.c:510: name = current_directory;
./gdb/source.c:540: name = concat (current_directory, SLASH_STRING, name, (char *)NULL);
./gdb/source.c:799: len = strlen (current_directory);
./gdb/source.c:806: strcpy (filename, current_directory);
./gdb/tracefile-tfile.c:441: filename.reset (concat (current_directory, "/", filename.get (),
./gdb/top.c:133:char *current_directory;
./gdb/top.c:1875: history_filename = concat (current_directory, "/_gdb_history",
./gdb/top.c:1878: history_filename = concat (current_directory, "/.gdb_history",
./gdb/top.c:1958: history_filename = reconcat (history_filename, current_directory, "/",
./gdb/top.c:2123: make_final_cleanup (do_chdir_cleanup, xstrdup (current_directory));
./gdb/bsd-kvm.c:84: temp = concat (current_directory, "/", filename, (char *)NULL);
./gdb/go32-nat.c:464: chdir (current_directory);
./gdb/common/pathstuff.c:142: (concat (current_directory,
./gdb/common/pathstuff.c:143: IS_DIR_SEPARATOR (current_directory[strlen (current_directory) - 1])
./gdb/corelow.c:290: filename.reset (concat (current_directory, "/",
./gdb/gdbserver/server.c:63:char *current_directory;
./gdb/gdbserver/server.c:3598: current_directory = getcwd (NULL, 0);
./gdb/gdbserver/server.c:3599: if (current_directory == NULL)
./gdb/main.c:538: current_directory = getcwd (NULL, 0);
./gdb/main.c:539: if (current_directory == NULL)

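The search shows that the variable is initialized (via getcwd()) only in server.c and main.c. In the test scenario above, getcwd() returns NULL, but the code merely prints a warning and carries on. Every later use of current_directory therefore operates on a null pointer, which leads to the crash.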

current_directory = getcwd (NULL, 0);
if (current_directory == NULL)
  perror_warning_with_name (_("error finding working directory"));

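This issue was fixed in a later release [1]. Unfortunately, the fix was never backported to GDB 8.1.1 (a consequence of GDB's release maintenance policy), so the recommendation is to upgrade GDB; GDB 12, the current release at the time of writing, is confirmed to contain the fix. If you are curious how it was fixed, the complete patch can be found starting from the bug report [1].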

References

[1] https://sourceware.org/bugzilla/show_bug.cgi?id=23613
