System Calls on 64-bit Linux

February 4, 2013

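AMD64 is the 64-bit technology introduced by AMD. Because it stays backward compatible with 32-bit x86 code, it won a decisive market advantage over IA-64, Intel's clean-break 64-bit architecture that could not run existing x86 code natively. Intel did not stay behind for long and released IA-32e (later officially named EM64T, Extended Memory 64 Technology, and today marketed as Intel 64). IA-32e is compatible with AMD64 for practically all user-level code, so the two are usually treated as a single architecture under the name AMD64. In contrast to the earlier x86-32 architecture, AMD64 is also called x86-64; by convention a bare "x86" means x86-32, and x86-64 is abbreviated as x64.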

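Intel's fast system call instructions are sysenter/sysexit, while AMD's counterparts are syscall/sysret. Intel processors now support syscall/sysret as well (in 64-bit mode): AMD got to x64 first, and Intel could not afford to lose out on compatibility. Here is an example: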

[root@localhost getuid]# uname -a
Linux localhost.localdomain 3.0.3 #1 SMP Mon Dec 17 12:07:26 CST 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost getuid]# cat /proc/cpuinfo | grep vendor_id
vendor_id	: GenuineIntel
[root@localhost getuid]# cat getuid_glibc.c 
/**
 * filename: getuid_glibc.c
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(int argc, char *argv[])
{
    printf("uid:%d\n", getuid());
    return 0;
}
[root@localhost getuid]# gcc getuid_glibc.c -o getuid_glibc -static
/usr/bin/ld: cannot find -lc
collect2: ld returned 1 exit status

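Clearly this is a machine with an Intel CPU running an x64 system. Statically linking the sample program with gcc fails for a simple reason: the static glibc library is not installed on this system, so let's install it. The version has to match, so first check which glibc shared libraries are already installed: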

[root@localhost getuid]# cat /etc/issue
CentOS Linux release 6.0 (Final)
Kernel \r on an \m

[root@localhost getuid]# rpm -q glibc
glibc-2.12-1.7.el6.x86_64
glibc-2.12-1.7.el6.i686

Based on this information, search http://rpm.pbone.net/ for glibc-static-2.12-1.7.el6.x86_64, download and install it, and compile the sample program again:

[root@localhost getuid]# ls -l
total 1388
-rw-r--r--. 1 root root     196 Jan 22  2013 getuid_glibc.c
-rw-r--r--. 1 root root 1415340 Jan 22  2013 glibc-static-2.12-1.7.el6.x86_64.rpm
[root@localhost getuid]# rpm -i glibc-static-2.12-1.7.el6.x86_64.rpm 
warning: glibc-static-2.12-1.7.el6.x86_64.rpm: Header V3 DSA/SHA1 Signature, key ID 1d1e034b: NOKEY
[root@localhost getuid]# ls /usr/lib64/libc.a 
/usr/lib64/libc.a
[root@localhost getuid]# gcc getuid_glibc.c -o getuid_glibc -static

Disassemble it in gdb:

[root@localhost getuid]# gdb ./getuid_glibc -q
Reading symbols from /home/gqk/work/getuid/getuid_glibc...(no debugging symbols found)...done.
(gdb) tb __getuid
Temporary breakpoint 1 at 0x40c560
(gdb) r
Starting program: /home/gqk/work/getuid/getuid_glibc 

Temporary breakpoint 1, 0x000000000040c560 in getuid ()
(gdb) disass
Dump of assembler code for function getuid:
=> 0x000000000040c560 <+0>:	mov    $0x66,%eax
   0x000000000040c565 <+5>:	syscall 
   0x000000000040c567 <+7>:	retq   
End of assembler dump.
(gdb) 

The value 0x66 (102) is __NR_getuid on x64. After loading __NR_getuid into the eax register, the code no longer executes int 0x80 but the syscall instruction. Now look at another file:

[root@localhost getuid]# cat getuid_syscall.c 
/**
 * filename: getuid_syscall.c
 */
#define _GNU_SOURCE   /* must be defined before any header is included */
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(int argc, char *argv[])
{
    printf("uid:%ld\n", syscall(__NR_getuid));   /* syscall() returns long */
    return 0;
}
[root@localhost getuid]# gcc getuid_syscall.c -o getuid_syscall -static
[root@localhost getuid]# gdb ./getuid_syscall -q
Reading symbols from /home/gqk/work/getuid/getuid_syscall...(no debugging symbols found)...done.
(gdb) b main
Breakpoint 1 at 0x4003f8
(gdb) r
Starting program: /home/gqk/work/getuid/getuid_syscall 

Breakpoint 1, 0x00000000004003f8 in main ()
(gdb) disass
Dump of assembler code for function main:
...
   0x0000000000400403 <+15>:	mov    $0x66,%edi
   0x0000000000400408 <+20>:	mov    $0x0,%eax
   0x000000000040040d <+25>:	callq  0x40d730 <syscall>
...
End of assembler dump.
(gdb) disass 0x40d730
Dump of assembler code for function syscall:
   0x000000000040d730 <+0>:	mov    %rdi,%rax
   0x000000000040d733 <+3>:	mov    %rsi,%rdi
   0x000000000040d736 <+6>:	mov    %rdx,%rsi
   0x000000000040d739 <+9>:	mov    %rcx,%rdx
   0x000000000040d73c <+12>:	mov    %r8,%r10
   0x000000000040d73f <+15>:	mov    %r9,%r8
   0x000000000040d742 <+18>:	mov    0x8(%rsp),%r9
   0x000000000040d747 <+23>:	syscall 
...
End of assembler dump.
(gdb) 
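
The register shuffle performed by syscall() above can be reproduced by hand. The following is a minimal sketch, assuming GCC or Clang inline assembly on x86-64; the helper names raw_syscall6 and raw_getuid are made up for illustration and are not part of glibc. On other architectures the sketch simply falls back to the libc syscall() wrapper.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/*
 * Hypothetical helper: issue a system call with up to six arguments.
 * x86-64 kernel convention: number in rax, arguments in
 * rdi, rsi, rdx, r10, r8, r9; the syscall instruction clobbers rcx and r11.
 * On x86-64 the raw result is returned as-is (a negative errno on failure).
 */
static long raw_syscall6(long nr, long a1, long a2, long a3,
                         long a4, long a5, long a6)
{
#if defined(__x86_64__)
    long ret;
    register long r10 __asm__("r10") = a4;   /* 4th argument: r10, not rcx */
    register long r8  __asm__("r8")  = a5;
    register long r9  __asm__("r9")  = a6;

    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr), "D"(a1), "S"(a2), "d"(a3),
                        "r"(r10), "r"(r8), "r"(r9)
                      : "rcx", "r11", "memory");
    return ret;
#else
    /* Fallback for other architectures: let libc do the work. */
    return syscall(nr, a1, a2, a3, a4, a5, a6);
#endif
}

/* getuid takes no arguments, so only the call number matters. */
static long raw_getuid(void)
{
    return raw_syscall6(SYS_getuid, 0, 0, 0, 0, 0, 0);
}
```

Calling raw_getuid() should give the same value as getuid(); the fourth argument going to r10 instead of rcx matches the `mov %r8,%r10` line in the disassembly.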

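The syscall() disassembly needs little explanation: the wrapper moves the system call number from rdi into rax, shifts every argument down by one register (with the fourth argument going to r10 rather than rcx), and then executes syscall. Now let's look at an AMD machine: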

[root@www getuid]# uname -a
Linux www.t1.com 2.6.38.8 #2 SMP Wed Nov 2 07:52:53 CST 2011 x86_64 x86_64 x86_64 GNU/Linux
[root@www getuid]# cat /proc/cpuinfo | grep vendor_id
vendor_id	: AuthenticAMD
[root@www getuid]# cat getuid_glibc.c
/**
 * filename: getuid_glibc.c
 */
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>

int main(int argc, char *argv[])
{
    printf("uid:%d\n", getuid());
    return 0;
}
[root@www getuid]# gcc getuid_glibc.c -o getuid_glibc -static
[root@www getuid]# gdb ./getuid_glibc -q
Reading symbols from /home/work/getuid/getuid_glibc...(no debugging symbols found)...done.
(gdb) tb __getuid
Temporary breakpoint 1 at 0x40c560
(gdb) r
Starting program: /home/work/getuid/getuid_glibc 

Temporary breakpoint 1, 0x000000000040c560 in getuid ()
(gdb) disass
Dump of assembler code for function getuid:
=> 0x000000000040c560 <+0>:	mov    $0x66,%eax
   0x000000000040c565 <+5>:	syscall 
   0x000000000040c567 <+7>:	retq   
End of assembler dump.
(gdb) 

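That confirms that the syscall instruction is used on both vendors' CPUs. On to another topic: the vdso. Applications on an x64 machine still have a vdso mapping at run time, as shown here: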

[root@localhost getuid]# uname -a
Linux localhost.localdomain 3.0.3 #1 SMP Mon Dec 17 12:07:26 CST 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost getuid]# cat /proc/self/maps | grep vdso
7fff643ff000-7fff64400000 r-xp 00000000 00:00 0                          [vdso]
[root@localhost getuid]# cat /proc/self/maps | grep vdso
7fffcbf64000-7fffcbf65000 r-xp 00000000 00:00 0                          [vdso]
[root@localhost getuid]# cat /proc/self/maps | grep vdso
7fffa5bff000-7fffa5c00000 r-xp 00000000 00:00 0                          [vdso]
[root@localhost getuid]# echo 1 > /proc/sys/kernel/randomize_va_space
[root@localhost getuid]# cat /proc/self/maps | grep vdso
7fffda1ff000-7fffda200000 r-xp 00000000 00:00 0                          [vdso]
[root@localhost getuid]# cat /proc/self/maps | grep vdso
7fff8c1ff000-7fff8c200000 r-xp 00000000 00:00 0                          [vdso]
[root@localhost getuid]# cat /proc/self/maps | grep vdso
7fff16fff000-7fff17000000 r-xp 00000000 00:00 0                          [vdso]
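
To locate the mapping programmatically rather than with grep, a process can parse the lines of /proc/self/maps itself. The sketch below shows only the parsing step; the function name parse_maps_line is hypothetical, and it assumes the usual "start-end perms offset dev inode pathname" line format seen in the output above.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse one line of /proc/<pid>/maps.
 * Returns 1 and fills *start / *end if the line's pathname field equals
 * `name` (for example "[vdso]"); returns 0 otherwise.
 */
static int parse_maps_line(const char *line, const char *name,
                           unsigned long long *start,
                           unsigned long long *end)
{
    char perms[8] = "";
    char path[256] = "";
    unsigned long long s, e;

    /* Fields: start-end perms offset dev inode [pathname] */
    int n = sscanf(line, "%llx-%llx %7s %*s %*s %*s %255s",
                   &s, &e, perms, path);
    if (n < 4 || strcmp(path, name) != 0)
        return 0;   /* anonymous mapping or a different name */

    *start = s;
    *end = e;
    return 1;
}
```

Reading /proc/self/maps line by line with fgets() and passing each line to this function yields the vdso range; for the first sample line above, end minus start is 0x1000, a single 4 KiB page.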

Does this vdso play the same role as it does on x86? Let's see:

[root@localhost getuid]# gcc getuid_glibc.c -o getuid_glibc
[root@localhost getuid]# gdb ./getuid_glibc -q
Reading symbols from /home/gqk/work/getuid/getuid_glibc...(no debugging symbols found)...done.
(gdb) b __getuid
Function "__getuid" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y

Breakpoint 1 (__getuid) pending.
(gdb) r
Starting program: /home/gqk/work/getuid/getuid_glibc 

Breakpoint 1, 0x0000003eba0a7960 in getuid () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.7.el6.x86_64
(gdb) disassemble 
Dump of assembler code for function getuid:
=> 0x0000003eba0a7960 <+0>:	mov    $0x66,%eax
   0x0000003eba0a7965 <+5>:	syscall 
   0x0000003eba0a7967 <+7>:	retq   
End of assembler dump.
(gdb) 

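Clearly, __kernel_vsyscall is no longer needed on x64, because there is only one way to enter the kernel: the syscall instruction (note: 32-bit programs running on x64 are a different matter, as the end of this article shows).
So what does the vdso contain on x64?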

[root@localhost getuid]# gdb -q /bin/ls
Reading symbols from /bin/ls...(no debugging symbols found)...done.
Missing separate debuginfos, use: debuginfo-install coreutils-8.4-9.el6.x86_64
(gdb) tb __open
Function "__open" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y

Temporary breakpoint 1 (__open) pending.
(gdb) r
Starting program: /bin/ls 

Temporary breakpoint 1, 0x0000003eb9816e20 in open64 ()
   from /lib64/ld-linux-x86-64.so.2
(gdb) info program 
	Using the running image of child process 19988.
Program stopped at 0x3eb9816e20.
It stopped at a breakpoint that has since been deleted.
(gdb) shell cat /proc/19988/maps | grep vdso
7ffff7ffe000-7ffff7fff000 r-xp 00000000 00:00 0                          [vdso]
(gdb) dump memory /tmp/linux-vdso.so.1 0x7ffff7ffe000 0x7ffff7fff000
(gdb) q
A debugging session is active.

	Inferior 1 [process 19988] will be killed.

Quit anyway? (y or n) y
[root@localhost getuid]# readelf -h /tmp/linux-vdso.so.1 
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00 
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0xffffffffff700700
  Start of program headers:          64 (bytes into file)
  Start of section headers:          2856 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         4
  Size of section headers:           64 (bytes)
  Number of section headers:         14
  Section header string table index: 13
[root@localhost getuid]# objdump -d /tmp/linux-vdso.so.1 | grep -A5 \>\:
ffffffffff700700 <__vdso_time>:
ffffffffff700700:	55                   	push   %rbp
ffffffffff700701:	8b 04 25 14 0d 60 ff 	mov    0xffffffffff600d14,%eax
ffffffffff700708:	48 89 e5             	mov    %rsp,%rbp
ffffffffff70070b:	85 c0                	test   %eax,%eax
ffffffffff70070d:	74 12                	je     ffffffffff700721 <__vdso_time+0x21>
--
ffffffffff700880 <__vdso_gettimeofday>:
ffffffffff700880:	55                   	push   %rbp
ffffffffff700881:	48 89 e5             	mov    %rsp,%rbp
ffffffffff700884:	41 54                	push   %r12
ffffffffff700886:	49 89 f4             	mov    %rsi,%r12
ffffffffff700889:	53                   	push   %rbx
--
ffffffffff7009c0 <__vdso_clock_gettime>:
ffffffffff7009c0:	55                   	push   %rbp
ffffffffff7009c1:	8b 0c 25 14 0d 60 ff 	mov    0xffffffffff600d14,%ecx
ffffffffff7009c8:	48 89 e5             	mov    %rsp,%rbp
ffffffffff7009cb:	85 c9                	test   %ecx,%ecx
ffffffffff7009cd:	74 13                	je     ffffffffff7009e2 <__vdso_clock_gettime+0x22>
--
ffffffffff700a40 <__vdso_getcpu>:
ffffffffff700a40:	55                   	push   %rbp
ffffffffff700a41:	83 3c 25 88 0c 60 ff 	cmpl   $0x1,0xffffffffff600c88
ffffffffff700a48:	01 
ffffffffff700a49:	48 89 e5             	mov    %rsp,%rbp
ffffffffff700a4c:	74 2a                	je     ffffffffff700a78 <__vdso_getcpu+0x38>
[root@localhost getuid]# 

As expected, the x64 vdso no longer contains __kernel_vsyscall. Instead it exports four functions: __vdso_time, __vdso_gettimeofday, __vdso_clock_gettime and __vdso_getcpu.
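
Besides dumping the page from gdb, a process can find its own vdso through the auxiliary vector. The following sketch assumes glibc 2.16 or later (for getauxval()) and a kernel that passes AT_SYSINFO_EHDR; the function name vdso_elf_check is made up for illustration.

```c
#include <elf.h>
#include <string.h>
#include <sys/auxv.h>

/*
 * Returns  1 if a vdso is mapped and starts with the ELF magic bytes,
 *          0 if an address was reported but it does not look like ELF,
 *         -1 if the kernel did not report a vdso at all.
 */
static int vdso_elf_check(void)
{
    /* AT_SYSINFO_EHDR holds the address of the vdso's ELF header. */
    unsigned long base = getauxval(AT_SYSINFO_EHDR);
    if (base == 0)
        return -1;

    const unsigned char *ident = (const unsigned char *)base;
    return memcmp(ident, ELFMAG, SELFMAG) == 0 ? 1 : 0;
}
```

A return value of 1 matches what readelf showed: unlike the vsyscall page, the vdso is a complete ELF shared object, so its symbols can be looked up by name.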

Now we come to the main part of this article. We know that:

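1. For an ordinary system call, some cost remains no matter how it is optimized. Moving from the int 0x80 interrupt instruction to the fast sysenter instruction, for example, only removes some privilege checks and saves fewer registers; the round trip across the boundary between user mode and kernel mode cannot be avoided.
2. In certain workloads a user process issues particular system calls extremely often; nginx, for example, obtains the current time through the gettimeofday() system call.
3. A call such as gettimeofday() hands no input data to the kernel; it merely asks to read a value that the kernel maintains. That makes it a good candidate for special optimization: the kernel can write the current time to a fixed location, and applications can read it directly from there without issuing a system call.
4. Getting the current time without a system call brings a significant performance gain, at least in theory. Two problems remain. First, how is the time stored at the fixed location kept up to date? The kernel updates it on every timer interrupt. Second, how do the kernel and all applications share that location? Simple: through a memory mapping (mmap).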
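
The "kernel writes, everyone reads" scheme in points 3 and 4 needs a way for readers to detect a half-finished update. A common technique is a sequence counter, and the retry loops in the vsyscall page disassembly later in this article look like such a scheme. The sketch below is a user-space illustration only: the struct layout and all names are assumptions, not the kernel's data structures, and it assumes a single writer.

```c
#include <stdatomic.h>

/* Illustrative shared page; NOT the kernel's real layout. */
struct time_page {
    atomic_uint seq;    /* odd while an update is in progress */
    atomic_long sec;
    atomic_long usec;
};

/* Writer side (the "kernel"): make seq odd, update, make seq even again. */
static void time_page_update(struct time_page *p, long sec, long usec)
{
    unsigned s = atomic_load_explicit(&p->seq, memory_order_relaxed);

    atomic_store_explicit(&p->seq, s + 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&p->sec, sec, memory_order_relaxed);
    atomic_store_explicit(&p->usec, usec, memory_order_relaxed);
    atomic_store_explicit(&p->seq, s + 2, memory_order_release);
}

/* Reader side (the "application"): retry until a stable snapshot is seen. */
static void time_page_read(struct time_page *p, long *sec, long *usec)
{
    unsigned before, after;

    do {
        before = atomic_load_explicit(&p->seq, memory_order_acquire);
        *sec = atomic_load_explicit(&p->sec, memory_order_relaxed);
        *usec = atomic_load_explicit(&p->usec, memory_order_relaxed);
        atomic_thread_fence(memory_order_acquire);
        after = atomic_load_explicit(&p->seq, memory_order_relaxed);
    } while ((before & 1u) || before != after);   /* writer was active: retry */
}
```

The reader never takes a lock and never enters the kernel; it only pays for a retry when it happens to overlap an update, which is why this pattern suits data that is written rarely and read constantly.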

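This is exactly what the Linux kernel does, and the whole mechanism is called the virtual system call, or vsyscall. Kernel code changes frequently. The original vsyscall mechanism has existed since kernel 2.5.x (see reference 2), but it has drawbacks, chiefly that its fixed mapping address is considered a serious security risk, so the vdso was later contributed as an improvement; its randomized mapping mitigates the threat to some extent. Even with the vdso available, vsyscall cannot simply be dropped for compatibility reasons, since that would break older (especially statically linked) applications. As a result, 2.6.x kernels provide both vdso and vsyscall.
I read a lot of material on this "messy history" and still could not sort it all out, short of comparing the kernel source version by version. That is not crucial here, so I will just give a few conclusions (which are not guaranteed to be correct):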

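1. On the reasonably recent vanilla Linux kernels (2.6.x and later) available as of this writing, x86-32 has no vsyscall page, and its vdso does not provide a vsyscall-style fast path for gettimeofday() and similar functions.
This is what I verified myself: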

[root@lenky gettimeofday]# uname -a
Linux lenky 2.6.30 #2 SMP Tue Sep 21 17:19:57 CST 2010 i686 i686 i386 GNU/Linux
[root@lenky gettimeofday]# cat gettimeofday_glibc.c 
/**
 * filename: gettimeofday_glibc.c
 */
#include <stdio.h>
#include <sys/time.h>

int main(int argc, char *argv[])
{
    struct timeval tv;
    if (gettimeofday(&tv, NULL) == -1) {
        perror("gettimeofday error");
        return -1;
    }
    printf("gettimeofday() : %.2f s\n", 
        (double) tv.tv_sec + tv.tv_usec / 1000000.0);
    return 0;
}
[root@lenky gettimeofday]# gcc gettimeofday_glibc.c 
[root@lenky gettimeofday]# gdb ./a.out -q
(no debugging symbols found)
(gdb) b main
Breakpoint 1 at 0x80483f2
(gdb) r
Starting program: /home/work/gettimeofday/a.out 
(no debugging symbols found)
(no debugging symbols found)
(no debugging symbols found)

Breakpoint 1, 0x080483f2 in main ()
(gdb) disassemble 
Dump of assembler code for function main:
0x080483e4 <main+0>:	lea    0x4(%esp),%ecx
0x080483e8 <main+4>:	and    $0xfffffff0,%esp
0x080483eb <main+7>:	pushl  -0x4(%ecx)
0x080483ee <main+10>:	push   %ebp
0x080483ef <main+11>:	mov    %esp,%ebp
0x080483f1 <main+13>:	push   %ecx
0x080483f2 <main+14>:	sub    $0x24,%esp
0x080483f5 <main+17>:	movl   $0x0,0x4(%esp)
0x080483fd <main+25>:	lea    -0xc(%ebp),%eax
0x08048400 <main+28>:	mov    %eax,(%esp)
0x08048403 <main+31>:	call   0x80482f0 <gettimeofday@plt>
0x08048408 <main+36>:	cmp    $0xffffffff,%eax
0x0804840b <main+39>:	jne    0x8048422 <main+62>
0x0804840d <main+41>:	movl   $0x8048548,(%esp)
0x08048414 <main+48>:	call   0x80482e0 <perror@plt>
0x08048419 <main+53>:	movl   $0xffffffff,-0x18(%ebp)
0x08048420 <main+60>:	jmp    0x8048459 <main+117>
0x08048422 <main+62>:	mov    -0xc(%ebp),%eax
0x08048425 <main+65>:	push   %eax
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) b __gettimeofday
Breakpoint 2 at 0x1eefc0
(gdb) c
Continuing.

Breakpoint 2, 0x001eefc0 in gettimeofday () from /lib/libc.so.6
(gdb) disassemble 
Dump of assembler code for function gettimeofday:
0x001eefc0 <gettimeofday+0>:	mov    %ebx,%edx
0x001eefc2 <gettimeofday+2>:	mov    0x8(%esp),%ecx
0x001eefc6 <gettimeofday+6>:	mov    0x4(%esp),%ebx
0x001eefca <gettimeofday+10>:	mov    $0x4e,%eax
0x001eefcf <gettimeofday+15>:	call   *%gs:0x10
0x001eefd6 <gettimeofday+22>:	mov    %edx,%ebx
0x001eefd8 <gettimeofday+24>:	cmp    $0xfffff001,%eax
0x001eefdd <gettimeofday+29>:	jae    0x1eefe0 <gettimeofday+32>
0x001eefdf <gettimeofday+31>:	ret    
0x001eefe0 <gettimeofday+32>:	call   0x2772e8 <__i686.get_pc_thunk.cx>
0x001eefe5 <gettimeofday+37>:	add    $0xbe00f,%ecx
0x001eefeb <gettimeofday+43>:	mov    -0x20(%ecx),%ecx
0x001eeff1 <gettimeofday+49>:	xor    %edx,%edx
0x001eeff3 <gettimeofday+51>:	sub    %eax,%edx
0x001eeff5 <gettimeofday+53>:	mov    %edx,%gs:(%ecx)
0x001eeff8 <gettimeofday+56>:	or     $0xffffffff,%eax
0x001eeffb <gettimeofday+59>:	jmp    0x1eefdf <gettimeofday+31>
End of assembler dump.
(gdb) b __kernel_vsyscall
Breakpoint 3 at 0xb7fff414
(gdb) c
Continuing.

Breakpoint 3, 0xb7fff414 in __kernel_vsyscall ()
(gdb) disassemble 
Dump of assembler code for function __kernel_vsyscall:
0xb7fff414 <__kernel_vsyscall+0>:	push   %ecx
0xb7fff415 <__kernel_vsyscall+1>:	push   %edx
0xb7fff416 <__kernel_vsyscall+2>:	push   %ebp
0xb7fff417 <__kernel_vsyscall+3>:	mov    %esp,%ebp
0xb7fff419 <__kernel_vsyscall+5>:	sysenter 
0xb7fff41b <__kernel_vsyscall+7>:	nop    
0xb7fff41c <__kernel_vsyscall+8>:	nop    
0xb7fff41d <__kernel_vsyscall+9>:	nop    
0xb7fff41e <__kernel_vsyscall+10>:	nop    
0xb7fff41f <__kernel_vsyscall+11>:	nop    
0xb7fff420 <__kernel_vsyscall+12>:	nop    
0xb7fff421 <__kernel_vsyscall+13>:	nop    
0xb7fff422 <__kernel_vsyscall+14>:	jmp    0xb7fff417 <__kernel_vsyscall+3>
0xb7fff424 <__kernel_vsyscall+16>:	pop    %ebp
0xb7fff425 <__kernel_vsyscall+17>:	pop    %edx
0xb7fff426 <__kernel_vsyscall+18>:	pop    %ecx
0xb7fff427 <__kernel_vsyscall+19>:	ret    
End of assembler dump.
(gdb) q
The program is running.  Exit anyway? (y or n) y
[root@lenky gettimeofday]# 
[root@lenky gettimeofday]# rpm -q glibc
glibc-2.5-42
[root@lenky gettimeofday]# cat /proc/self/maps 
00149000-00163000 r-xp 00000000 fd:00 1967404    /lib/ld-2.5.so
00163000-00164000 r--p 00019000 fd:00 1967404    /lib/ld-2.5.so
00164000-00165000 rw-p 0001a000 fd:00 1967404    /lib/ld-2.5.so
0016c000-002ab000 r-xp 00000000 fd:00 1967405    /lib/libc-2.5.so
002ab000-002ad000 r--p 0013f000 fd:00 1967405    /lib/libc-2.5.so
002ad000-002ae000 rw-p 00141000 fd:00 1967405    /lib/libc-2.5.so
002ae000-002b1000 rw-p 00000000 00:00 0 
08048000-0804d000 r-xp 00000000 fd:00 12353611   /bin/cat
0804d000-0804e000 rw-p 00004000 fd:00 12353611   /bin/cat
0804e000-0806f000 rw-p 00000000 00:00 0          [heap]
b7df3000-b7ff3000 r--p 00000000 fd:00 8852004    /usr/lib/locale/locale-archive
b7ff3000-b7ff5000 rw-p 00000000 00:00 0 
b7fff000-b8000000 r-xp 00000000 00:00 0          [vdso]
bffeb000-c0000000 rw-p 00000000 00:00 0          [stack]

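As shown, gettimeofday() also ends up in __kernel_vsyscall() and then executes the sysenter instruction. The process mappings contain no [vsyscall] entry either.
Someone asked a similar question on a mailing list: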
http://lists.kernelnewbies.org/pipermail/kernelnewbies/2011-December/004130.html

Hi,

Why there is no vsyscall support on i386? I see it only on x86_64. In
x86_64 it supports gettimeofday, etc.
In i386, it only uses to support the sysenter optimization.
Why there is no support for gettimeofday in i386 as a vsyscall?

-Fredrick

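I say "vanilla Linux kernel" because patches that add this kind of support do seem to exist:
http://sr71.net/~jstultz/tod/archive/linux-2.6.18-rc3_timekeeping_C4/broken-out/linux-2.6.18-rc3_timeofday-i386-vsyscall_C4.patch
So whether it is supported in practice should be determined by testing on your own platform.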

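2. On reasonably recent vanilla kernels (2.6.x and later), x86-64 supports both vsyscall and vdso, so two implementations coexist. It appears that in Linux 3.1 the native vsyscall code was removed (as far as I can tell, replaced by kernel-side emulation for compatibility). Either way, prefer the vdso over the legacy vsyscall.
Verification: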

[root@localhost gettimeofday]# uname -a
Linux localhost.localdomain 3.0.3 #1 SMP Mon Dec 17 12:07:26 CST 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost gettimeofday]# cat gettimeofday_glibc.c 
/**
 * filename: gettimeofday_glibc.c
 */
#include <stdio.h>
#include <sys/time.h>

int main(int argc, char *argv[])
{
    struct timeval tv;
    if (gettimeofday(&tv, NULL) == -1) {
        perror("gettimeofday error");
        return -1;
    }
    printf("gettimeofday() : %.2f s\n", 
        (double) tv.tv_sec + tv.tv_usec / 1000000.0);
    return 0;
}
[root@localhost gettimeofday]# gcc gettimeofday_glibc.c -o gettimeofday_glibc
[root@localhost gettimeofday]# ldd ./gettimeofday_glibc
	linux-vdso.so.1 =>  (0x00007fffb8359000)
	libc.so.6 => /lib64/libc.so.6 (0x0000003eba000000)
	/lib64/ld-linux-x86-64.so.2 (0x0000003eb9800000)
[root@localhost gettimeofday]# cat /proc/self/maps 
00400000-0040b000 r-xp 00000000 fd:00 1966117                            /bin/cat
0060b000-0060c000 rw-p 0000b000 fd:00 1966117                            /bin/cat
0060c000-0062e000 rw-p 00000000 00:00 0                                  [heap]
3eb9800000-3eb981e000 r-xp 00000000 fd:00 655799                         /lib64/ld-2.12.so
3eb9a1e000-3eb9a1f000 r--p 0001e000 fd:00 655799                         /lib64/ld-2.12.so
3eb9a1f000-3eb9a20000 rw-p 0001f000 fd:00 655799                         /lib64/ld-2.12.so
3eb9a20000-3eb9a21000 rw-p 00000000 00:00 0 
3eba000000-3eba175000 r-xp 00000000 fd:00 655800                         /lib64/libc-2.12.so
3eba175000-3eba375000 ---p 00175000 fd:00 655800                         /lib64/libc-2.12.so
3eba375000-3eba379000 r--p 00175000 fd:00 655800                         /lib64/libc-2.12.so
3eba379000-3eba37a000 rw-p 00179000 fd:00 655800                         /lib64/libc-2.12.so
3eba37a000-3eba37f000 rw-p 00000000 00:00 0 
7f386c2ff000-7f3872190000 r--p 00000000 fd:00 1443193                    /usr/lib/locale/locale-archive
7f3872190000-7f3872193000 rw-p 00000000 00:00 0 
7f38721b6000-7f38721b7000 rw-p 00000000 00:00 0 
7fff29c9d000-7fff29cbe000 rw-p 00000000 00:00 0                          [stack]
7fff29d38000-7fff29d39000 r-xp 00000000 00:00 0                          [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

[root@localhost gettimeofday]# gdb ./gettimeofday_glibc -q
Reading symbols from /home/gqk/work/gettimeofday/gettimeofday_glibc...(no debugging symbols found)...done.
(gdb) b _dl_vdso_vsym
Function "_dl_vdso_vsym" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (_dl_vdso_vsym) pending.
(gdb) b __gettimeofday
Function "__gettimeofday" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 2 (__gettimeofday) pending.
(gdb) r
Starting program: /home/gqk/work/gettimeofday/gettimeofday_glibc 

Breakpoint 1, 0x0000003eba11f8f0 in _dl_vdso_vsym () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.7.el6.x86_64
(gdb) disass
Dump of assembler code for function _dl_vdso_vsym:
=> 0x0000003eba11f8f0 <+0>:	sub    $0x38,%rsp
   0x0000003eba11f8f4 <+4>:	mov    0x25958d(%rip),%rax        # 0x3eba378e88
   0x0000003eba11f8fb <+11>:	mov    0xb0(%rax),%r10
   0x0000003eba11f902 <+18>:	test   %r10,%r10
   0x0000003eba11f905 <+21>:	jne    0x3eba11f910 <_dl_vdso_vsym+32>
...
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) c
Continuing.

Breakpoint 1, 0x0000003eba11f8f0 in _dl_vdso_vsym () from /lib64/libc.so.6
(gdb) disass
Dump of assembler code for function _dl_vdso_vsym:
=> 0x0000003eba11f8f0 <+0>:	sub    $0x38,%rsp
   0x0000003eba11f8f4 <+4>:	mov    0x25958d(%rip),%rax        # 0x3eba378e88
   0x0000003eba11f8fb <+11>:	mov    0xb0(%rax),%r10
   0x0000003eba11f902 <+18>:	test   %r10,%r10
   0x0000003eba11f905 <+21>:	jne    0x3eba11f910 <_dl_vdso_vsym+32>
...
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) c
Continuing.

Breakpoint 2, 0x0000003eba096840 in gettimeofday () from /lib64/libc.so.6
(gdb) disass
Dump of assembler code for function gettimeofday:
=> 0x0000003eba096840 <+0>:	sub    $0x8,%rsp
   0x0000003eba096844 <+4>:	mov    0x2e79b5(%rip),%rax        # 0x3eba37e200 <__vdso_gettimeofday>
   0x0000003eba09684b <+11>:	ror    $0x11,%rax
   0x0000003eba09684f <+15>:	xor    %fs:0x30,%rax
   0x0000003eba096858 <+24>:	callq  *%rax
   0x0000003eba09685a <+26>:	cmp    $0xfffff001,%eax
   0x0000003eba09685f <+31>:	jae    0x3eba096866 <gettimeofday+38>
   0x0000003eba096861 <+33>:	add    $0x8,%rsp
   0x0000003eba096865 <+37>:	retq   
   0x0000003eba096866 <+38>:	mov    0x2e2733(%rip),%rcx        # 0x3eba378fa0
   0x0000003eba09686d <+45>:	xor    %edx,%edx
   0x0000003eba09686f <+47>:	sub    %rax,%rdx
   0x0000003eba096872 <+50>:	mov    %edx,%fs:(%rcx)
   0x0000003eba096875 <+53>:	or     $0xffffffffffffffff,%rax
   0x0000003eba096879 <+57>:	jmp    0x3eba096861 <gettimeofday+33>
End of assembler dump.
(gdb) b __vdso_gettimeofday
Breakpoint 3 at 0x7ffff7ffe884
(gdb) c
Continuing.

Breakpoint 3, 0x00007ffff7ffe884 in gettimeofday ()
(gdb) disass
Dump of assembler code for function gettimeofday:
   0x00007ffff7ffe880 <+0>:	push   %rbp
   0x00007ffff7ffe881 <+1>:	mov    %rsp,%rbp
=> 0x00007ffff7ffe884 <+4>:	push   %r12
   0x00007ffff7ffe886 <+6>:	mov    %rsi,%r12
   0x00007ffff7ffe889 <+9>:	push   %rbx
   0x00007ffff7ffe88a <+10>:	mov    0xffffffffff600d14,%edx
   0x00007ffff7ffe891 <+17>:	mov    %rdi,%rbx
   0x00007ffff7ffe894 <+20>:	test   %edx,%edx
   0x00007ffff7ffe896 <+22>:	je     0x7ffff7ffe8f5 <gettimeofday+117>
   0x00007ffff7ffe898 <+24>:	cmpq   $0x0,0xffffffffff600d20
   0x00007ffff7ffe8a1 <+33>:	je     0x7ffff7ffe8f5 <gettimeofday+117>
   0x00007ffff7ffe8a3 <+35>:	test   %rdi,%rdi
   0x00007ffff7ffe8a6 <+38>:	je     0x7ffff7ffe8d0 <gettimeofday+80>
   0x00007ffff7ffe8a8 <+40>:	callq  0x7ffff7ffe7d0
   0x00007ffff7ffe8ad <+45>:	mov    0x8(%rbx),%rcx
   0x00007ffff7ffe8b1 <+49>:	movabs $0x20c49ba5e353f7cf,%rdx
   0x00007ffff7ffe8bb <+59>:	mov    %rcx,%rax
   0x00007ffff7ffe8be <+62>:	sar    $0x3f,%rcx
   0x00007ffff7ffe8c2 <+66>:	imul   %rdx
   0x00007ffff7ffe8c5 <+69>:	sar    $0x7,%rdx
   0x00007ffff7ffe8c9 <+73>:	sub    %rcx,%rdx
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) 

Execution passes through _dl_vdso_vsym(), which corresponds to this glibc code:

#ifdef SHARED
void *gettimeofday_ifunc (void) __asm__ ("__gettimeofday");

void *
gettimeofday_ifunc (void)
{
  PREPARE_VERSION (linux26, "LINUX_2.6", 61765110);

  /* If the vDSO is not available we fall back on the old vsyscall.  */
  return (_dl_vdso_vsym ("gettimeofday", &linux26)
	  ?: (void *) VSYSCALL_ADDR_vgettimeofday);
}
__asm (".type __gettimeofday, %gnu_indirect_function");
#else

So the dynamically linked program uses the vdso. Now try static linking:

[root@localhost gettimeofday]# gcc gettimeofday_glibc.c -o gettimeofday_glibc_static -static
[root@localhost gettimeofday]# gdb ./gettimeofday_glibc_static -q
Reading symbols from /home/gqk/work/gettimeofday/gettimeofday_glibc_static...(no debugging symbols found)...done.
(gdb) b __gettimeofday
Breakpoint 1 at 0x414590
(gdb) r
Starting program: /home/gqk/work/gettimeofday/gettimeofday_glibc_static 

Breakpoint 1, 0x0000000000414590 in gettimeofday ()
(gdb) disassemble 
Dump of assembler code for function gettimeofday:
=> 0x0000000000414590 <+0>:	sub    $0x8,%rsp
   0x0000000000414594 <+4>:	mov    $0xffffffffff600000,%rax
   0x000000000041459b <+11>:	callq  *%rax
   0x000000000041459d <+13>:	cmp    $0xfffff001,%eax
   0x00000000004145a2 <+18>:	jae    0x417410 <__syscall_error>
   0x00000000004145a8 <+24>:	add    $0x8,%rsp
   0x00000000004145ac <+28>:	retq   
End of assembler dump.
(gdb) 

What is the address 0xffffffffff600000? Remember this line?

ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]

So the statically linked binary uses vsyscall.
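
The address can be hard-coded into a static binary because the legacy vsyscall page sits at a constant location, with each entry point in its own 1024-byte slot. Here is a small sketch of that arithmetic; the slot order (0 = gettimeofday, 1 = time, 2 = getcpu) is consistent with the page dump at the end of this article, while the macro and function names are illustrative rather than the kernel's own.

```c
#include <stdint.h>

#define VSYSCALL_PAGE_ADDR  UINT64_C(0xffffffffff600000)  /* fixed mapping */
#define VSYSCALL_SLOT_SIZE  UINT64_C(1024)                /* bytes per entry */

/* Illustrative names for the three legacy entry points. */
enum vsyscall_slot {
    VSLOT_GETTIMEOFDAY = 0,
    VSLOT_TIME         = 1,
    VSLOT_GETCPU       = 2
};

/* Address of a given entry point inside the vsyscall page. */
static uint64_t vsyscall_entry(enum vsyscall_slot slot)
{
    return VSYSCALL_PAGE_ADDR + VSYSCALL_SLOT_SIZE * (uint64_t)slot;
}
```

vsyscall_entry(VSLOT_GETTIMEOFDAY) is exactly the 0xffffffffff600000 loaded into rax in the disassembly above. This only computes addresses; jumping to them from new code is not advisable, since the page may be emulated or absent on newer kernels.
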
Both the vdso and the vsyscall fast paths can be turned off through a proc interface:
echo 0 > /proc/sys/kernel/vsyscall64
Test:

[root@localhost gettimeofday]# gcc gettimeofday_glibc.c -o gettimeofday_glibc
[root@localhost gettimeofday]# gcc gettimeofday_glibc.c -o gettimeofday_glibc_static -static
[root@localhost gettimeofday]# echo 0 > /proc/sys/kernel/vsyscall64
[root@localhost gettimeofday]# strace -e trace=gettimeofday ./gettimeofday_glibc 
gettimeofday({1357510633, 399266}, NULL) = 0
gettimeofday() : 1357510633.40 s
[root@localhost gettimeofday]# strace -e trace=gettimeofday ./gettimeofday_glibc_static 
gettimeofday({1357510638, 444941}, NULL) = 0
gettimeofday() : 1357510638.44 s
[root@localhost gettimeofday]# echo 1 > /proc/sys/kernel/vsyscall64
[root@localhost gettimeofday]# strace -e trace=gettimeofday ./gettimeofday_glibc 
gettimeofday() : 1357510645.58 s
[root@localhost gettimeofday]# strace -e trace=gettimeofday ./gettimeofday_glibc_static 
gettimeofday() : 1357510648.17 s
[root@localhost gettimeofday]#

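With the fast path turned off, strace can trace the gettimeofday() system call; with it turned on, strace sees nothing, because no system call in the traditional sense takes place.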
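
To force a real system call regardless of the vdso or vsyscall setting, a program can bypass the libc wrapper with syscall(). The sketch below, with an illustrative function name and assuming a 64-bit Linux target where SYS_gettimeofday is defined, fetches the time both ways and reports the difference in seconds.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Reads the time via the libc wrapper (which may stay in user space) and
 * then via a forced kernel entry. Stores (forced - libc) in seconds in *gap.
 * Returns 0 on success, -1 if either call fails.
 */
static int time_source_gap(double *gap)
{
    struct timeval fast, slow;

    if (gettimeofday(&fast, NULL) != 0)                /* libc wrapper */
        return -1;
    if (syscall(SYS_gettimeofday, &slow, NULL) != 0)   /* real system call */
        return -1;

    *gap = (double)(slow.tv_sec - fast.tv_sec)
         + (double)(slow.tv_usec - fast.tv_usec) / 1e6;
    return 0;
}
```

Run under strace -e trace=gettimeofday, a program calling this function should show only the forced call while the fast path is active, and both calls once it is switched off.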

Finally, let's dump the vsyscall page and take a look:

[root@localhost gettimeofday]# uname -a
Linux localhost.localdomain 3.0.3 #1 SMP Mon Dec 17 12:07:26 CST 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost gettimeofday]# gdb -q /bin/ls
Reading symbols from /bin/ls...(no debugging symbols found)...done.
Missing separate debuginfos, use: debuginfo-install coreutils-8.4-9.el6.x86_64
(gdb) tb __open
Function "__open" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Temporary breakpoint 1 (__open) pending.
(gdb) r
Starting program: /bin/ls 

Temporary breakpoint 1, 0x0000003eb9816e20 in open64 () from /lib64/ld-linux-x86-64.so.2
(gdb) info program
	Using the running image of child process 2567.
Program stopped at 0x3eb9816e20.
It stopped at a breakpoint that has since been deleted.
(gdb) shell cat /proc/2567/maps | grep vsyscall
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0                  [vsyscall]
(gdb) dump memory /tmp/linux-vsyscall.so 0xffffffffff600000 0xffffffffff601000
(gdb) q
A debugging session is active.

	Inferior 1 [process 2567] will be killed.

Quit anyway? (y or n) y
[root@localhost gettimeofday]# readelf -s /tmp/linux-vsyscall.so 
readelf: Error: Not an ELF file - it has the wrong magic bytes at the start
readelf: Error: /tmp/linux-vsyscall.so: Failed to read file header
[root@localhost gettimeofday]# file /tmp/linux-vsyscall.so 
/tmp/linux-vsyscall.so: data

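Well, linux-vsyscall.so is not an ELF shared object at all; the file command reports it as plain data. We know that it actually holds code, though. One caveat when reading the listing: objdump decodes a raw dump linearly, so after the zero padding before offsets 0x140, 0x400 and 0x800 it loses instruction alignment and mis-decodes the first few instructions; each of those entries really begins with push %rbp (byte 0x55). Disassemble it with objdump: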

[root@localhost gettimeofday]# objdump -D -b binary -mi386:x86-64 /tmp/linux-vsyscall.so 

/tmp/linux-vsyscall.so:     file format binary


Disassembly of section .data:

0000000000000000 <.data>:
       0:	55                   	push   %rbp
       1:	48 89 e5             	mov    %rsp,%rbp
       4:	41 55                	push   %r13
       6:	41 54                	push   %r12
       8:	49 89 fc             	mov    %rdi,%r12
       b:	53                   	push   %rbx
       c:	48 89 f3             	mov    %rsi,%rbx
       f:	48 83 ec 08          	sub    $0x8,%rsp
      13:	48 85 ff             	test   %rdi,%rdi
      16:	0f 84 c9 00 00 00    	je     0xe5
      1c:	44 8b 2c 25 00 0d 60 	mov    0xffffffffff600d00,%r13d
      23:	ff 
      24:	41 f6 c5 01          	test   $0x1,%r13b
      28:	0f 85 d4 00 00 00    	jne    0x102
      2e:	48 8b 04 25 20 0d 60 	mov    0xffffffffff600d20,%rax
      35:	ff 
      36:	8b 14 25 14 0d 60 ff 	mov    0xffffffffff600d14,%edx
      3d:	48 85 c0             	test   %rax,%rax
      40:	0f 84 c3 00 00 00    	je     0x109
      46:	85 d2                	test   %edx,%edx
      48:	0f 84 bb 00 00 00    	je     0x109
      4e:	ff d0                	callq  *%rax
      50:	48 8b 14 25 08 0d 60 	mov    0xffffffffff600d08,%rdx
      57:	ff 
      58:	4c 8b 04 25 28 0d 60 	mov    0xffffffffff600d28,%r8
      5f:	ff 
      60:	48 8b 3c 25 30 0d 60 	mov    0xffffffffff600d30,%rdi
      67:	ff 
      68:	8b 34 25 38 0d 60 ff 	mov    0xffffffffff600d38,%esi
      6f:	8b 0c 25 3c 0d 60 ff 	mov    0xffffffffff600d3c,%ecx
      76:	49 89 14 24          	mov    %rdx,(%r12)
      7a:	8b 14 25 10 0d 60 ff 	mov    0xffffffffff600d10,%edx
      81:	44 3b 2c 25 00 0d 60 	cmp    0xffffffffff600d00,%r13d
      88:	ff 
      89:	75 91                	jne    0x1c
      8b:	4c 29 c0             	sub    %r8,%rax
      8e:	48 21 f8             	and    %rdi,%rax
      91:	48 0f af c6          	imul   %rsi,%rax
      95:	48 d3 e8             	shr    %cl,%rax
      98:	48 8d 14 10          	lea    (%rax,%rdx,1),%rdx
      9c:	48 81 fa ff c9 9a 3b 	cmp    $0x3b9ac9ff,%rdx
      a3:	76 23                	jbe    0xc8
      a5:	49 8b 04 24          	mov    (%r12),%rax
      a9:	0f 1f 80 00 00 00 00 	nopl   0x0(%rax)
      b0:	48 81 ea 00 ca 9a 3b 	sub    $0x3b9aca00,%rdx
      b7:	48 83 c0 01          	add    $0x1,%rax
      bb:	48 81 fa ff c9 9a 3b 	cmp    $0x3b9ac9ff,%rdx
      c2:	77 ec                	ja     0xb0
      c4:	49 89 04 24          	mov    %rax,(%r12)
      c8:	48 c1 ea 03          	shr    $0x3,%rdx
      cc:	48 b9 cf f7 53 e3 a5 	mov    $0x20c49ba5e353f7cf,%rcx
      d3:	9b c4 20 
      d6:	48 89 d0             	mov    %rdx,%rax
      d9:	48 f7 e1             	mul    %rcx
      dc:	48 c1 ea 04          	shr    $0x4,%rdx
      e0:	49 89 54 24 08       	mov    %rdx,0x8(%r12)
      e5:	48 85 db             	test   %rbx,%rbx
      e8:	74 0b                	je     0xf5
      ea:	48 8b 04 25 18 0d 60 	mov    0xffffffffff600d18,%rax
      f1:	ff 
      f2:	48 89 03             	mov    %rax,(%rbx)
      f5:	48 83 c4 08          	add    $0x8,%rsp
      f9:	31 c0                	xor    %eax,%eax
      fb:	5b                   	pop    %rbx
      fc:	41 5c                	pop    %r12
      fe:	41 5d                	pop    %r13
     100:	c9                   	leaveq 
     101:	c3                   	retq   
     102:	f3 90                	pause  
     104:	e9 13 ff ff ff       	jmpq   0x1c
     109:	b8 60 00 00 00       	mov    $0x60,%eax
     10e:	31 f6                	xor    %esi,%esi
     110:	4c 89 e7             	mov    %r12,%rdi
     113:	0f 05                	syscall 
     115:	eb ce                	jmp    0xe5
	...
     13f:	00 55 48             	add    %dl,0x48(%rbp)
     142:	89 e5                	mov    %esp,%ebp
     144:	66 66 90             	xchg   %ax,%ax
     147:	0f ae e8             	lfence 
     14a:	0f 31                	rdtsc  
     14c:	89 c1                	mov    %eax,%ecx
     14e:	48 89 d0             	mov    %rdx,%rax
     151:	48 8b 14 25 28 0d 60 	mov    0xffffffffff600d28,%rdx
     158:	ff 
     159:	48 c1 e0 20          	shl    $0x20,%rax
     15d:	89 c9                	mov    %ecx,%ecx
     15f:	48 09 c8             	or     %rcx,%rax
     162:	48 39 c2             	cmp    %rax,%rdx
     165:	77 02                	ja     0x169
     167:	c9                   	leaveq 
     168:	c3                   	retq   
     169:	48 89 d0             	mov    %rdx,%rax
     16c:	c9                   	leaveq 
     16d:	c3                   	retq   
     16e:	00 00                	add    %al,(%rax)
     170:	55                   	push   %rbp
     171:	48 89 e5             	mov    %rsp,%rbp
     174:	8b 04 25 f0 f0 5f ff 	mov    0xffffffffff5ff0f0,%eax
     17b:	89 c0                	mov    %eax,%eax
     17d:	c9                   	leaveq 
     17e:	c3                   	retq   
	...
     3ff:	00 55 8b             	add    %dl,-0x75(%rbp)
     402:	0c 25                	or     $0x25,%al
     404:	14 0d                	adc    $0xd,%al
     406:	60                   	(bad)  
     407:	ff 48 89             	decl   -0x77(%rax)
     40a:	e5 85                	in     $0x85,%eax
     40c:	c9                   	leaveq 
     40d:	74 27                	je     0x436
     40f:	8b 14 25 00 0d 60 ff 	mov    0xffffffffff600d00,%edx
     416:	f6 c2 01             	test   $0x1,%dl
     419:	75 24                	jne    0x43f
     41b:	48 8b 04 25 08 0d 60 	mov    0xffffffffff600d08,%rax
     422:	ff 
     423:	3b 14 25 00 0d 60 ff 	cmp    0xffffffffff600d00,%edx
     42a:	75 e3                	jne    0x40f
     42c:	48 85 ff             	test   %rdi,%rdi
     42f:	74 03                	je     0x434
     431:	48 89 07             	mov    %rax,(%rdi)
     434:	c9                   	leaveq 
     435:	c3                   	retq   
     436:	b8 c9 00 00 00       	mov    $0xc9,%eax
     43b:	0f 05                	syscall 
     43d:	c9                   	leaveq 
     43e:	c3                   	retq   
     43f:	f3 90                	pause  
     441:	eb cc                	jmp    0x40f
	...
     7ff:	00 55 48             	add    %dl,0x48(%rbp)
     802:	85 d2                	test   %edx,%edx
     804:	49 89 d0             	mov    %rdx,%r8
     807:	48 89 e5             	mov    %rsp,%rbp
     80a:	74 64                	je     0x870
     80c:	48 8b 02             	mov    (%rdx),%rax
     80f:	4c 8b 0c 25 80 0c 60 	mov    0xffffffffff600c80,%r9
     816:	ff 
     817:	4c 39 c8             	cmp    %r9,%rax
     81a:	74 44                	je     0x860
     81c:	83 3c 25 88 0c 60 ff 	cmpl   $0x1,0xffffffffff600c88
     823:	01 
     824:	74 42                	je     0x868
     826:	b9 7b 00 00 00       	mov    $0x7b,%ecx
     82b:	0f 03 c9             	lsl    %cx,%ecx
     82e:	4d 85 c0             	test   %r8,%r8
     831:	74 0c                	je     0x83f
     833:	4c 89 c8             	mov    %r9,%rax
     836:	49 89 00             	mov    %rax,(%r8)
     839:	89 c8                	mov    %ecx,%eax
     83b:	49 89 40 08          	mov    %rax,0x8(%r8)
     83f:	48 85 ff             	test   %rdi,%rdi
     842:	74 09                	je     0x84d
     844:	89 c8                	mov    %ecx,%eax
     846:	25 ff 0f 00 00       	and    $0xfff,%eax
     84b:	89 07                	mov    %eax,(%rdi)
     84d:	48 85 f6             	test   %rsi,%rsi
     850:	74 05                	je     0x857
     852:	c1 e9 0c             	shr    $0xc,%ecx
     855:	89 0e                	mov    %ecx,(%rsi)
     857:	31 c0                	xor    %eax,%eax
     859:	c9                   	leaveq 
     85a:	c3                   	retq   
     85b:	0f 1f 44 00 00       	nopl   0x0(%rax,%rax,1)
     860:	8b 4a 08             	mov    0x8(%rdx),%ecx
     863:	eb d1                	jmp    0x836
     865:	0f 1f 00             	nopl   (%rax)
     868:	0f 01 f9             	rdtscp 
     86b:	eb c1                	jmp    0x82e
     86d:	0f 1f 00             	nopl   (%rax)
     870:	45 31 c9             	xor    %r9d,%r9d
     873:	eb a7                	jmp    0x81c
	...
     c7d:	00 00                	add    %al,(%rax)
     c7f:	00 ef                	add    %ch,%bh
     c81:	7f 29                	jg     0xcac
     c83:	3e 01 00             	add    %eax,%ds:(%rax)
     c86:	00 00                	add    %al,(%rax)
     c88:	02 00                	add    (%rax),%al
	...
     cfe:	00 00                	add    %al,(%rax)
     d00:	de 1e                	ficomp (%rsi)
     d02:	f0 02 6f 0f          	lock add 0xf(%rdi),%ch
     d06:	6f                   	outsl  %ds:(%rsi),(%dx)
     d07:	0f 34                	sysenter 
     d09:	f9                   	stc    
     d0a:	e9 50 00 00 00       	jmpq   0xd5f
     d0f:	00 7b db             	add    %bh,-0x25(%rbx)
     d12:	9f                   	lahf   
     d13:	0c 01                	or     $0x1,%al
     d15:	00 00                	add    %al,(%rax)
     d17:	00 20                	add    %ah,(%rax)
     d19:	fe                   	(bad)  
     d1a:	ff                   	(bad)  
     d1b:	ff 00                	incl   (%rax)
     d1d:	00 00                	add    %al,(%rax)
     d1f:	00 40 01             	add    %al,0x1(%rax)
     d22:	60                   	(bad)  
     d23:	ff                   	(bad)  
     d24:	ff                   	(bad)  
     d25:	ff                   	(bad)  
     d26:	ff                   	(bad)  
     d27:	ff 98 37 79 e0 47    	lcallq *0x47e07937(%rax)
     d2d:	40 09 00             	rex or     %eax,(%rax)
     d30:	ff                   	(bad)  
     d31:	ff                   	(bad)  
     d32:	ff                   	(bad)  
     d33:	ff                   	(bad)  
     d34:	ff                   	(bad)  
     d35:	ff                   	(bad)  
     d36:	ff                   	(bad)  
     d37:	ff f5                	push   %rbp
     d39:	a3 66 00 18 00 00 00 	mov    %eax,0xf1d2000000180066
     d40:	d2 f1 
     d42:	25 af ff ff ff       	and    $0xffffffaf,%eax
     d47:	ff ca                	dec    %edx
     d49:	76 0d                	jbe    0xd58
     d4b:	36 00 00             	add    %al,%ss:(%rax)
     d4e:	00 00                	add    %al,(%rax)
     d50:	34 f9                	xor    $0xf9,%al
     d52:	e9 50 00 00 00       	jmpq   0xda7
     d57:	00 7b db             	add    %bh,-0x25(%rbx)
     d5a:	9f                   	lahf   
     d5b:	0c 00                	or     $0x0,%al
	...
[root@localhost gettimeofday]# 

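The "callq *%rax" instruction there takes its target from address 0xffffffffff600d20. In the dump above, the eight bytes at offset d20 are 40 01 60 ff ff ff ff ff, which read as the little-endian value 0xffffffffff600140. Look these addresses up in the kernel symbols: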

[root@localhost gettimeofday]# cat /proc/kallsyms | grep ffffffffff600d
ffffffffff600d00 D __vvar_vsyscall_gtod_data
[root@localhost gettimeofday]# cat /proc/kallsyms | grep ffffffffff600
ffffffffff600000 T vgettimeofday
ffffffffff600140 T vread_tsc
ffffffffff600170 t vread_hpet
ffffffffff600400 T vtime
ffffffffff600800 T vgetcpu
ffffffffff600c80 D __vvar_jiffies
ffffffffff600c90 D __vvar_vgetcpu_mode
ffffffffff600d00 D __vvar_vsyscall_gtod_data
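
The lookup can be cross-checked by hand. In the objdump listing above, the eight bytes at offset d20 are 40 01 60 ff ff ff ff ff; decoded as a little-endian 64-bit value and matched against the symbol list, they should point at vread_tsc. The following is only a sketch of that check: the helper names decode_le64 and lookup_symbol and the DEMO_MAIN switch are made up for illustration, and the symbol table is hard-coded from the output above rather than read from /proc/kallsyms.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Symbols copied from the /proc/kallsyms output above, in address order. */
struct ksym {
    uint64_t addr;
    const char *name;
};

static const struct ksym ksyms[] = {
    { 0xffffffffff600000ULL, "vgettimeofday" },
    { 0xffffffffff600140ULL, "vread_tsc" },
    { 0xffffffffff600170ULL, "vread_hpet" },
    { 0xffffffffff600400ULL, "vtime" },
    { 0xffffffffff600800ULL, "vgetcpu" },
    { 0xffffffffff600c80ULL, "__vvar_jiffies" },
    { 0xffffffffff600c90ULL, "__vvar_vgetcpu_mode" },
    { 0xffffffffff600d00ULL, "__vvar_vsyscall_gtod_data" },
};

/* Bytes at offsets d20..d27 of the objdump listing above. */
static const unsigned char bytes_d20[8] = {
    0x40, 0x01, 0x60, 0xff, 0xff, 0xff, 0xff, 0xff
};

/* Hypothetical helper: assemble 8 little-endian bytes into one value. */
static uint64_t decode_le64(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Hypothetical helper: name of the last symbol whose address is <= addr. */
static const char *lookup_symbol(uint64_t addr)
{
    const char *name = NULL;
    for (size_t i = 0; i < sizeof ksyms / sizeof ksyms[0]; i++)
        if (ksyms[i].addr <= addr)
            name = ksyms[i].name;
    return name;
}

#ifdef DEMO_MAIN /* build with -DDEMO_MAIN to get a standalone program */
int main(void)
{
    uint64_t slot = 0xffffffffff600d20ULL;
    uint64_t target = decode_le64(bytes_d20);

    assert(lookup_symbol(slot) != NULL && lookup_symbol(target) != NULL);
    printf("slot   0x%016llx lies in %s\n",
           (unsigned long long)slot, lookup_symbol(slot));
    printf("target 0x%016llx lies in %s\n",
           (unsigned long long)target, lookup_symbol(target));
    return 0;
}
#endif
```

Built with -DDEMO_MAIN, the program reports the slot inside __vvar_vsyscall_gtod_data and the target at vread_tsc, which agrees with the symbol names listed above.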

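The symbol names already give a good hint of what is going on. The details are left out of this article, but they confirm that our understanding is correct.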

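Putting it all together: the application calls into the vDSO or the vsyscall page, which then makes a further check. If vgettimeofday() is supported and enabled, it is used; otherwise a traditional system call has to be made. The vDSO is preferred over vsyscall, as the code shown earlier makes clear: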

  /* If the vDSO is not available we fall back on the old vsyscall.  */
  return (_dl_vdso_vsym ("gettimeofday", &linux26)
	  ?: (void *) VSYSCALL_ADDR_vgettimeofday);

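If _dl_vdso_vsym fails, VSYSCALL_ADDR_vgettimeofday is used instead. The assembly at VSYSCALL_ADDR_vgettimeofday shows that there is a further check inside, and if that check fails, execution falls through to syscall.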
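
To watch the two ends of this fallback chain side by side, a small test program can call gettimeofday() once through glibc and once through syscall(2) directly. Under strace -e trace=gettimeofday, only the direct call should show up on a system where the vDSO or vsyscall fast path is active. This is a minimal sketch, not part of the original test programs: the helper names time_via_libc and time_via_syscall and the DEMO_MAIN switch are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <unistd.h>

/* Goes through glibc, which may resolve to the vDSO or the vsyscall page. */
static int time_via_libc(struct timeval *tv)
{
    assert(tv != NULL);
    return gettimeofday(tv, NULL);
}

/* Forces a real kernel entry, bypassing any user-space fast path. */
static int time_via_syscall(struct timeval *tv)
{
    assert(tv != NULL);
    return (int)syscall(SYS_gettimeofday, tv, NULL);
}

#ifdef DEMO_MAIN /* build with -DDEMO_MAIN to get a standalone program */
int main(void)
{
    struct timeval a, b;

    if (time_via_libc(&a) != 0 || time_via_syscall(&b) != 0) {
        perror("gettimeofday");
        return 1;
    }
    printf("libc   : %ld.%06ld\n", (long)a.tv_sec, (long)a.tv_usec);
    printf("syscall: %ld.%06ld\n", (long)b.tv_sec, (long)b.tv_usec);
    return 0;
}
#endif
```

Both calls return the same kind of result; the difference is only in how the kernel is (or is not) entered, which is what strace makes visible.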

Now try a 32-bit program on x64:

[root@localhost gettimeofday]# gcc gettimeofday_glibc.c -o gettimeofday_glibc_32 -m32
In file included from /usr/include/features.h:385,
                 from /usr/include/stdio.h:28,
                 from gettimeofday_glibc.c:4:
/usr/include/gnu/stubs.h:7:27: error: gnu/stubs-32.h: No such file or directory

The error shows that the 32-bit glibc development package is missing. Check the glibc version currently installed:

[root@localhost gettimeofday]# rpm -q glibc
glibc-2.12-1.7.el6.x86_64
glibc-2.12-1.7.el6.i686

Search http://rpm.pbone.net/ for glibc-devel-2.12-1.7.el6.i686, download the matching file glibc-devel-2.12-1.7.el6.i686.rpm, install it, and compile again:

[root@localhost gettimeofday]# ls -l
total 964
-rw-r--r--. 1 root root    374 Jan  6 12:35 gettimeofday_glibc.c
-rw-r--r--. 1 root root 982360 Jan 23  2013 glibc-devel-2.12-1.7.el6.i686.rpm
[root@localhost gettimeofday]# rpm -i glibc-devel-2.12-1.7.el6.i686.rpm 
warning: glibc-devel-2.12-1.7.el6.i686.rpm: Header V3 DSA/SHA1 Signature, key ID 1d1e034b: NOKEY
[root@localhost gettimeofday]# !gcc
gcc gettimeofday_glibc.c -o gettimeofday_glibc_32 -m32

The test below shows that the 32-bit application does not use vsyscall; it makes a traditional system call instead:

[root@localhost gettimeofday]# gcc gettimeofday_glibc.c -o gettimeofday_glibc
[root@localhost gettimeofday]# echo 1 > /proc/sys/kernel/vsyscall64
[root@localhost gettimeofday]# strace -e trace=gettimeofday ./gettimeofday_glibc
gettimeofday() : 1357515041.51 s
[root@localhost gettimeofday]# strace -e trace=gettimeofday ./gettimeofday_glibc_32 
[ Process PID=3589 runs in 32 bit mode. ]
gettimeofday({1357515045, 230917}, NULL) = 0
gettimeofday() : 1357515045.23 s

A closer look:

[root@localhost gettimeofday]# uname -a
Linux localhost.localdomain 3.0.3 #1 SMP Mon Dec 17 12:07:26 CST 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost gettimeofday]# file ./gettimeofday_glibc_32
./gettimeofday_glibc_32: ELF 32-bit LSB executable, Intel 80386, version 1 (GNU/Linux), dynamically linked (uses shared libs), for GNU/Linux 2.6.18, not stripped
[root@localhost gettimeofday]# echo 1 > /proc/sys/kernel/vsyscall64
[root@localhost gettimeofday]# gdb ./gettimeofday_glibc_32 -q
Reading symbols from /home/gqk/work/gettimeofday/gettimeofday_glibc_32...(no debugging symbols found)...done.
(gdb) b __kernel_vsyscall
Function "__kernel_vsyscall" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (__kernel_vsyscall) pending.
(gdb) r
Starting program: /home/gqk/work/gettimeofday/gettimeofday_glibc_32 

Breakpoint 1, 0xf7ffd420 in __kernel_vsyscall ()
Missing separate debuginfos, use: debuginfo-install glibc-2.12-1.7.el6.i686
(gdb) disass
Dump of assembler code for function __kernel_vsyscall:
=> 0xf7ffd420 <+0>:	push   %ecx
   0xf7ffd421 <+1>:	push   %edx
   0xf7ffd422 <+2>:	push   %ebp
   0xf7ffd423 <+3>:	mov    %esp,%ebp
   0xf7ffd425 <+5>:	sysenter 
   0xf7ffd427 <+7>:	nop
   0xf7ffd428 <+8>:	nop
   0xf7ffd429 <+9>:	nop
   0xf7ffd42a <+10>:	nop
   0xf7ffd42b <+11>:	nop
   0xf7ffd42c <+12>:	nop
   0xf7ffd42d <+13>:	nop
   0xf7ffd42e <+14>:	jmp    0xf7ffd423 <__kernel_vsyscall+3>
   0xf7ffd430 <+16>:	pop    %ebp
   0xf7ffd431 <+17>:	pop    %edx
   0xf7ffd432 <+18>:	pop    %ecx
   0xf7ffd433 <+19>:	ret    
End of assembler dump.
(gdb) 

Statically linking the 32-bit program:

[root@localhost gettimeofday]# cat /etc/issue
CentOS Linux release 6.0 (Final)
Kernel \r on an \m

[root@localhost gettimeofday]# uname -a
Linux localhost.localdomain 3.0.3 #1 SMP Mon Dec 17 12:07:26 CST 2012 x86_64 x86_64 x86_64 GNU/Linux
[root@localhost gettimeofday]# gcc gettimeofday_glibc.c -o gettimeofday_glibc_32_static -m32 -static
/usr/bin/ld: skipping incompatible /usr/lib64/libc.a when searching for -lc
/usr/bin/ld: cannot find -lc
collect2: ld returned 1 exit status

The static library is missing. As before, download the matching glibc-static-2.12-1.7.el6.i686.rpm from http://rpm.pbone.net/, install it, and try again:

[root@localhost gettimeofday]# rz
rz waiting to receive.
[root@localhost gettimeofday]# rpm -i glibc-static-2.12-1.7.el6.i686.rpm 
warning: glibc-static-2.12-1.7.el6.i686.rpm: Header V3 DSA/SHA1 Signature, key ID 1d1e034b: NOKEY
[root@localhost gettimeofday]# !gcc
gcc gettimeofday_glibc.c -o gettimeofday_glibc_32_static -m32 -static
[root@localhost gettimeofday]# strace -e trace=gettimeofday ./gettimeofday_glibc_32_static
[ Process PID=1885 runs in 32 bit mode. ]
gettimeofday({1357748316, 537693}, NULL) = 0
gettimeofday() : 1357748316.54 s

As the output shows, a traditional system call is used. In detail:

[root@localhost gettimeofday]# echo 1 > /proc/sys/kernel/vsyscall64
[root@localhost gettimeofday]# gdb ./gettimeofday_glibc_32_static -q
Reading symbols from /home/gqk/work/gettimeofday/gettimeofday_glibc_32_static...(no debugging symbols found)...done.
(gdb) b __kernel_vsyscall
Function "__kernel_vsyscall" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (__kernel_vsyscall) pending.
(gdb) r
Starting program: /home/gqk/work/gettimeofday/gettimeofday_glibc_32_static 

Breakpoint 1, 0xf7ffd420 in __kernel_vsyscall ()
(gdb) disass
Dump of assembler code for function __kernel_vsyscall:
=> 0xf7ffd420 <+0>:	push   %ecx
   0xf7ffd421 <+1>:	push   %edx
   0xf7ffd422 <+2>:	push   %ebp
   0xf7ffd423 <+3>:	mov    %esp,%ebp
   0xf7ffd425 <+5>:	sysenter 
   0xf7ffd427 <+7>:	nop
   0xf7ffd428 <+8>:	nop
   0xf7ffd429 <+9>:	nop
   0xf7ffd42a <+10>:	nop
   0xf7ffd42b <+11>:	nop
   0xf7ffd42c <+12>:	nop
   0xf7ffd42d <+13>:	nop
   0xf7ffd42e <+14>:	jmp    0xf7ffd423 <__kernel_vsyscall+3>
   0xf7ffd430 <+16>:	pop    %ebp
   0xf7ffd431 <+17>:	pop    %edx
   0xf7ffd432 <+18>:	pop    %ecx
   0xf7ffd433 <+19>:	ret    
End of assembler dump.
(gdb) 

[root@localhost gettimeofday]# gdb ./gettimeofday_glibc_32_static -q
Reading symbols from /home/gqk/work/gettimeofday/gettimeofday_glibc_32_static...(no debugging symbols found)...done.
(gdb) b __gettimeofday
Breakpoint 1 at 0x805cbf0
(gdb) r
Starting program: /home/gqk/work/gettimeofday/gettimeofday_glibc_32_static 

Breakpoint 1, 0x0805cbf0 in gettimeofday ()
(gdb) disassemble 
Dump of assembler code for function gettimeofday:
=> 0x0805cbf0 <+0>:	mov    %ebx,%edx
   0x0805cbf2 <+2>:	mov    0x8(%esp),%ecx
   0x0805cbf6 <+6>:	mov    0x4(%esp),%ebx
   0x0805cbfa <+10>:	mov    $0x4e,%eax
   0x0805cbff <+15>:	call   *0x80d9a60
   0x0805cc05 <+21>:	mov    %edx,%ebx
   0x0805cc07 <+23>:	cmp    $0xfffff001,%eax
   0x0805cc0c <+28>:	jae    0x805f660 <__syscall_error>
   0x0805cc12 <+34>:	ret    
End of assembler dump.
(gdb) x/x 0x80d9a60
0x80d9a60 <_dl_sysinfo>:	0xf7ffd420
(gdb) info auxv
32   AT_SYSINFO           Special system info/entry points 0xf7ffd420
33   AT_SYSINFO_EHDR      System-supplied DSO's ELF header 0xf7ffd000
16   AT_HWCAP             Machine-dependent CPU capability hints 0xfebfbff
6    AT_PAGESZ            System page size               4096
17   AT_CLKTCK            Frequency of times()           100
3    AT_PHDR              Program headers for program    0x8048034
4    AT_PHENT             Size of program header entry   32
5    AT_PHNUM             Number of program headers      5
7    AT_BASE              Base address of interpreter    0x0
8    AT_FLAGS             Flags                          0x0
9    AT_ENTRY             Entry point of program         0x80481c0
11   AT_UID               Real user ID                   0
12   AT_EUID              Effective user ID              0
13   AT_GID               Real group ID                  0
14   AT_EGID              Effective group ID             0
23   AT_SECURE            Boolean, was exec setuid-like? 0
25   AT_RANDOM            Address of 16 random bytes     0xffffd76b
31   AT_EXECFN            File name of executable        0xffffdfbf "/home/gqk/work/gettimeofday/gettimeofday_glibc_32_static"
15   AT_PLATFORM          String identifying platform    0xffffd77b "i686"
0    AT_NULL              End of vector                  0x0
(gdb) b __kernel_vsyscall
Breakpoint 2 at 0xf7ffd420
(gdb) 
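
The AT_SYSINFO_EHDR entry that gdb prints with "info auxv" can also be read from inside a program. The sketch below assumes glibc 2.16 or later, which provides getauxval() in <sys/auxv.h>; the glibc 2.12 used in this article does not have it. The helper names vdso_base and has_elf_magic and the DEMO_MAIN switch are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/auxv.h>

/* Base address of the kernel-supplied DSO, or 0 if the kernel gave none. */
static unsigned long vdso_base(void)
{
    return getauxval(AT_SYSINFO_EHDR);
}

/* Nonzero if the four bytes at addr are the ELF magic "\177ELF". */
static int has_elf_magic(const void *addr)
{
    assert(addr != NULL);
    return memcmp(addr, "\177ELF", 4) == 0;
}

#ifdef DEMO_MAIN /* build with -DDEMO_MAIN to get a standalone program */
int main(void)
{
    unsigned long base = vdso_base();

    if (base == 0) {
        puts("no AT_SYSINFO_EHDR entry in the auxiliary vector");
        return 1;
    }
    printf("AT_SYSINFO_EHDR = 0x%lx, ELF magic %s\n", base,
           has_elf_magic((const void *)base) ? "present" : "missing");
    return 0;
}
#endif
```

The address printed corresponds to the "System-supplied DSO's ELF header" line in the auxv listing above, and the ELF magic check is a quick way to confirm that the page mapped there really is a shared object.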

