Posted on 2013-6-8 23:13:08
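In theory, a CPU has multiple execution units, so reordering instructions to reduce dependencies between them should improve speed. However, my earlier experiments showed that reordering instructions to increase parallelism had little effect, and I don't know why.
Another effective way to optimize loops is to reduce control instr ...
liangbch, posted on 2008-4-11 10:41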
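1. "All modern x86 processors can execute instructions out of order." This means that manually reordering instructions generally has little effect.

    ; Out-of-order execution
    mov eax, [mem1]
    imul eax, 6
    mov [mem2], eax
    mov ebx, [mem3]
    add ebx, 2
    mov [mem4], ebx

With out-of-order execution, if [mem1] is not in the cache, the imul cannot proceed until the load completes. But the fourth instruction, mov ebx, [mem3], does not depend on any earlier instruction, so it can begin executing in the meantime.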
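To make the timing argument concrete, here is a minimal toy model in Python (not a simulator of any real CPU). It assumes an instruction can start as soon as its input registers are ready and ignores limits on execution units. The latency numbers, and the assumption that [mem1] misses the cache while [mem3] hits, are made up for illustration.

```python
# Toy dataflow model: an instruction may start once all of its inputs are ready.
# Latencies are made-up illustrative numbers, not measurements of any real CPU.
CACHE_MISS, CACHE_HIT, IMUL, ADD, STORE = 200, 4, 3, 1, 1

# (text, destination register or None, source registers, latency in cycles)
program = [
    ("mov eax, [mem1]", "eax", [],      CACHE_MISS),  # assumption: [mem1] misses the cache
    ("imul eax, 6",     "eax", ["eax"], IMUL),
    ("mov [mem2], eax", None,  ["eax"], STORE),
    ("mov ebx, [mem3]", "ebx", [],      CACHE_HIT),   # assumption: [mem3] hits the cache
    ("add ebx, 2",      "ebx", ["ebx"], ADD),
    ("mov [mem4], ebx", None,  ["ebx"], STORE),
]

def earliest_start_times(instructions):
    """Return the earliest start cycle of each instruction, ignoring resource limits."""
    ready_at = {}   # register -> cycle at which its latest value becomes available
    starts = []
    for _text, dest, sources, latency in instructions:
        # An instruction with no register inputs can start at cycle 0.
        start = max((ready_at.get(reg, 0) for reg in sources), default=0)
        starts.append(start)
        if dest is not None:
            ready_at[dest] = start + latency
    return starts

starts = earliest_start_times(program)
for (text, *_), start in zip(program, starts):
    print(f"cycle {start:3d}: {text}")
```

Under these assumed latencies, the imul waits for the slow load, while the whole ebx chain starts and finishes long before it, regardless of where it sits in program order.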
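2. Register renaming

    mov eax, [mem1]
    imul eax, 6
    mov [mem2], eax
    mov eax, [mem3]
    add eax, 2
    mov [mem4], eax

"the CPU is able to use different physical registers for the same logical register eax." So the eax we see in the code is only a logical register. Inside the CPU it is renamed to different physical registers, so there is no need to worry about the two uses of eax interfering with each other during out-of-order execution.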
This means that the above code is changed inside the CPU to a code that uses four different physical registers for eax. The first register is used for the value loaded from [mem1]. The second register is used for the output of the imul instruction. The third register is used for the value loaded from [mem3]. And the fourth register is used for the output of the add instruction.
The use of different physical registers for the same logical register enables the CPU to make the last three instructions in the example independent of the first three instructions.
The CPU must have a lot of physical registers for this mechanism to work efficiently. The number of physical registers is different for different microprocessors, but you can generally assume that the number is sufficient for quite a lot of instruction reordering.
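The renaming described in the quote can be sketched with a few lines of Python. This is only an illustration of the idea, not how any particular CPU implements it: each write to a logical register is given a fresh physical register, and each read uses whatever physical register currently holds that logical register. The names p1, p2, ... are hypothetical.

```python
import itertools

# Example 2 from the post: (text, destination register or None, source registers)
program = [
    ("mov eax, [mem1]", "eax", []),
    ("imul eax, 6",     "eax", ["eax"]),
    ("mov [mem2], eax", None,  ["eax"]),
    ("mov eax, [mem3]", "eax", []),
    ("add eax, 2",      "eax", ["eax"]),
    ("mov [mem4], eax", None,  ["eax"]),
]

def rename(instructions):
    """Give every register write a fresh physical register (p1, p2, ... are hypothetical names)."""
    fresh = (f"p{n}" for n in itertools.count(1))
    current = {}    # logical register -> physical register holding its latest value
    renamed = []
    for text, dest, sources in instructions:
        reads = [current[reg] for reg in sources]   # look up inputs before remapping
        write = None
        if dest is not None:
            write = next(fresh)
            current[dest] = write                   # later readers see the new register
        renamed.append((text, reads, write))
    return renamed

renamed = rename(program)
for text, reads, write in renamed:
    print(f"{text:18} reads {reads} writes {write}")
```

In this sketch the first three instructions touch only p1 and p2 and the last three only p3 and p4, which matches the four physical registers in the quote and shows why the two halves no longer depend on each other.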