TOC
0. No SIGBUS on x86?!
1. Why?
2. How to tell x86 to warn me about an unaligned memory access?
3. So, what? - A real world application
4. Possible worry
5. When programming for Intel's CPUs, no need to care about alignment?
0. No SIGBUS on x86?!
According to Wikipedia, there are two cases in which a processor generates a bus error:
1. non-existent address
2. unaligned memory access.
Strangely, you may never have seen such an error on x86 processors.
Compile the following code and run it on an x86 machine:
You will find no problem with your program on x86 machines:
shawn.r2:~/work/aligntest$ uname -a
Linux r2 2.6.18-194.el5xen #1 SMP Tue Mar 16 22:01:26 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
shawn.r2:~/work/aligntest$ gcc a.c
shawn.r2:~/work/aligntest$ ./a.out
0
shawn.r2:~/work/aligntest$
In contrast, if you try it on a SPARC or IA64 machine, you will end up with a bus error:
shawn.sx1000:~/work/align$ uname -a
HP-UX sx1000 B.11.31 U ia64 1177235479 unlimited-user license
shawn.sx1000:~/work/align$ cc a.c
shawn.sx1000:~/work/align$ ./a.out
Bus error (core dumped)
shawn.sx1000:~/work/align$
shawn.v880:~/work/align$ cc a.c
shawn.v880:~/work/align$ uname -a
SunOS v880 5.10 Generic_142900-01 sun4u sparc SUNW,Sun-Fire-880
shawn.v880:~/work/align$ ./a.out
Bus Error (core dumped)
shawn.v880:~/work/align$
1. Why?
Well, my speculation is that x86 processors do something at the microinstruction level to take care of this kind of unaligned memory access.
Intel provides a way to switch off this feature. According to Intel's IA-32 system programming guide, the EFLAGS register has a flag called AC (Alignment Check), which is bit 18 of the register. After turning the AC flag on, you will start to see bus errors.
Alignment checking takes effect only at CPL 3, and only if the AM (Alignment Mask) bit in CR0 is also set. CPL is short for Current Privilege Level. x86 processors have four CPLs: 0, 1, 2, and 3. Usually CPL 0 is used by the kernel (privileged mode) and CPL 3 by user-level processes.
2. How to tell x86 to warn me about an unaligned memory access?
Now, how do you turn it on? First, push the content of the EFLAGS register onto the stack with the PUSHF instruction. Then set bit 18 of the value on top of the stack, which is the current value of EFLAGS. Finally, pop it back into the EFLAGS register (RFLAGS on x86_64) with the POPF instruction. The following is the assembly code (AT&T convention):
Note that if you do this on a 32-bit x86, you need to use the ESP register instead, as noted in the comment.
The difference between ESP and RSP is that ESP is 32 bits wide and RSP is 64 bits wide. Likewise, an x86_64 processor uses RFLAGS instead of EFLAGS. Use the stack pointer that matches your processor.
Now, insert this assembly code into the original source code:
Behold, now you have SIGBUS on an x86 processor:
shawn.r2:~/work/aligntest$ gcc a.c
shawn.r2:~/work/aligntest$ ./a.out
Bus error (core dumped)
shawn.r2:~/work/aligntest$ uname -a
Linux r2 2.6.18-194.el5xen #1 SMP Tue Mar 16 22:01:26 EDT 2010 x86_64 x86_64 x86_64 GNU/Linux
3. So, what? - A real world application
We use a bunch of workstations at my workplace. People prefer the x86 Linux systems because they are more convenient for developers, with the great GNU tools. They are also faster than our Solaris, HP-UX, or AIX systems, because the non-x86 systems were bought at least 5 years ago.
But the fact that people prefer x86 Linux systems over Sun's or IBM's brings a problem, because x86 processors do not detect unaligned memory access by default. For that reason, I have constantly urged my crew to test their programs on Solaris (on SPARC, of course, not on Solaris x86). But people are not pleased to use slow and crowded Solaris machines when they have fast, new x86 Linux machines.
It would be very good to apply this one line of assembly code to our product, because it would let programmers detect unaligned memory accesses even on x86 machines. It would be horribly embarrassing if our product crashed, especially because of a bus error! It would imply that we do not test our product thoroughly, and the company's credibility would be undermined. I am happy to be able to prevent that more easily beforehand.
4. Possible worry
Some people might worry that turning the AC flag on would have side effects on other processes running on the system. But there is nothing to worry about, because the flags register is part of each process's context. Setting the flag affects only the process that set it. When a context switch takes place, the operating system saves EFLAGS and the other registers of the process being switched out, and then loads the saved context of the process selected to run next. That context includes the EFLAGS register.
This is my speculation and I have not tested it yet: if you want to enable alignment checking systemwide, you will want to set the corresponding flag in the CR0 control register (there it is called AM, Alignment Mask, and it is also bit 18) instead of the AC flag in the EFLAGS register. I am not sure for now whether this is true, but I am going to test it tomorrow. If it is true, the system I test it on might go down, though :-)
----
Unfortunately, you must be in ring 0 (CPL 0) to access CR0. That is, you can set the flag in CR0 (its proper name there is AM, Alignment Mask) only from kernel code such as a kernel module :-p. And my speculation turned out to be true lol
----
Added 12 March, 2011:
5. When programming for Intel's CPUs, no need to care about alignment?
Intel CPUs allow unaligned access to words, doublewords, and quadwords. By default they do not raise the alignment-check exception (#AC), which the operating system reports as SIGBUS, even if a memory access is unaligned.
So does it make sense that programmers who work on Intel CPUs do not need to care about address alignment? Yes, it does.
But programmers should know that an unaligned memory access requires an additional memory bus cycle even on Intel CPUs. Intel's CPU manual says:
A word or doubleword operand that crosses a 4-byte boundary or a quadword operand that crosses an 8-byte boundary is considered unaligned and requires two separate memory bus cycles for access.
- Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1, Section 4.1.1
So, it is always good practice for programmers to make every effort to use aligned memory access even on Intel's machines.
----
Posting date changed. Original date: 2010/10/27 12:12