The team will post updates and news about our project here |
|
---|
| | Gunnar von Boehn (Apollo Team Member) Posts 6258 22 Jan 2024 07:41
| We have some very good news:
1) I have changed the implementation of the FINT instruction and made it much faster. FINT now needs only 1 cycle.
2) I have reviewed the source code of many PC games, and I found that many of them use an FPU operation for which the 68K FPU family has no hardware instruction. Many PC games cast "FLOAT to Unsigned Integer" or "DOUBLE to Unsigned Integer". While the 68K family has an instruction for casting to SIGNED int, the 68K architecture has no instruction for a cast to unsigned. As a workaround, compilers emit an instruction sequence to emulate this. Here is the sequence that is typically used:
    _ftoUint:
        fsmove.s (4,sp),fp0
        fcmp.s #0x4f000000,fp0
        fjge .L2
        fintrz.x fp0,fp0
        fmove.l fp0,d0
        rts
    .L2:
        fssub.s #0x4f000000,fp0
        fintrz.x fp0,fp0
        fmove.l fp0,d0
        add.l #-2147483648,d0
        rts
The above could be replaced with the following new instruction:
    fmoveU.l fp0,d0
We decided to improve this and to enrich the 68k FPU instruction set with two new instructions that support casting from floats to unsigned and back. This will speed up the operation considerably.
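For readers who prefer C, the emulation sequence above can be sketched roughly as follows. This is a minimal illustration, not compiler output: the function name `ftouint_emulated` is hypothetical, and the range check is an added assumption, since C leaves out-of-range float-to-integer conversions undefined. The constant `0x4f000000` is the single-precision bit pattern of 2^31.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: float -> uint32_t built only from the signed
   conversion, following the same two branches as the assembly above. */
uint32_t ftouint_emulated(float x)
{
    const float two31 = 2147483648.0f;   /* 2^31, bit pattern 0x4f000000 */

    /* Assumption: the caller passes a value that is valid for a cast to
       unsigned. Anything else is undefined behavior in C. */
    assert(x > -1.0f && x < 4294967296.0f);

    if (x < two31) {
        /* Fits in a signed 32-bit integer: truncate and convert directly. */
        return (uint32_t)(int32_t)x;
    }
    /* Too large for signed: subtract 2^31, convert, then add 2^31 back. */
    return (uint32_t)(int32_t)(x - two31) + 0x80000000u;
}
```

For in-range inputs this should give the same result as a plain `(uint32_t)x` cast, which is the operation a single unsigned-conversion instruction would perform directly.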
| |
| | John William
Posts 578 22 Jan 2024 20:41
| Gunnar von Boehn wrote:
| We have some very good news:
1) I have changed the implementation of the FINT instruction and made it much faster. FINT now needs only 1 cycle.
2) I have reviewed the source code of many PC games, and I found that many of them use an FPU operation for which the 68K FPU family has no hardware instruction. Many PC games cast "FLOAT to Unsigned Integer" or "DOUBLE to Unsigned Integer". While the 68K family has an instruction for casting to SIGNED int, the 68K architecture has no instruction for a cast to unsigned. As a workaround, compilers emit an instruction sequence to emulate this. Here is the sequence that is typically used:
    _ftoUint:
        fsmove.s (4,sp),fp0
        fcmp.s #0x4f000000,fp0
        fjge .L2
        fintrz.x fp0,fp0
        fmove.l fp0,d0
        rts
    .L2:
        fssub.s #0x4f000000,fp0
        fintrz.x fp0,fp0
        fmove.l fp0,d0
        add.l #-2147483648,d0
        rts
The above could be replaced with the following new instruction:
    fmoveU.l fp0,d0
We decided to improve this and to enrich the 68k FPU instruction set with two new instructions that support casting from floats to unsigned and back. This will speed up the operation considerably.
|
But if this code works:
    _ftoUint:
        fsmove.s (4,sp),fp0
        fcmp.s #0x4f000000,fp0
        fjge .L2
        fintrz.x fp0,fp0
        fmove.l fp0,d0
        rts
    .L2:
        fssub.s #0x4f000000,fp0
        fintrz.x fp0,fp0
        fmove.l fp0,d0
        add.l #-2147483648,d0
        rts
Why replace it?
| |
| | Gunnar von Boehn (Apollo Team Member) Posts 6258 23 Jan 2024 07:29
| John William wrote:
| But if this code works:
    _ftoUint:
        fsmove.s (4,sp),fp0
        fcmp.s #0x4f000000,fp0
        fjge .L2
        fintrz.x fp0,fp0
        fmove.l fp0,d0
        rts
    .L2:
        fssub.s #0x4f000000,fp0
        fintrz.x fp0,fp0
        fmove.l fp0,d0
        add.l #-2147483648,d0
        rts
Why replace it? |
Yes, the code works correctly. Of course, this function takes some extra time, even if you inline it:
    fcmp.s #0x4f000000,fp0
    fjge .overmax
    fintrz.x fp0,fp0
    fmove.l fp0,d0
    bra .next
    .overmax:
    fssub.s #0x4f000000,fp0
    fintrz.x fp0,fp0
    fmove.l fp0,d0
    add.l #-2147483648,d0
    .next
Then you still have 9 instructions, and these need about 23 cycles. Replacing 9 instructions with a single instruction that needs 1 cycle is always good and gives a speedup. How important is this speedup? Converting float to int is not an uncommon operation, and some programs do it very often. If a program does this only rarely, then of course the speedup will not matter much. But for programs that use this operation very often, maybe once per row or even once per pixel, this tuning can make a huge difference.
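As an illustration of the "once per pixel" case, here is a minimal C sketch of a loop that performs one float-to-unsigned cast per pixel. The function name, the buffer layout, and the assumption that intensities lie in [0, 1] are all hypothetical and not taken from any particular game.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: scale float intensities (assumed to lie in [0, 1])
   to 8-bit pixel values. */
void intensities_to_pixels(const float *src, uint8_t *dst, size_t count)
{
    assert(src != NULL && dst != NULL);

    for (size_t i = 0; i < count; ++i) {
        /* One float -> unsigned cast per pixel. On a classic 68K FPU this
           cast is where the multi-instruction emulation sequence would go. */
        unsigned int v = (unsigned int)(src[i] * 255.0f);
        dst[i] = (uint8_t)v;
    }
}
```

In a loop like this the conversion cost is paid once per element, so the per-conversion cycle count scales directly with the number of pixels processed.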
| |
| | Kamelito Loveless
Posts 261 23 Jan 2024 16:39
| This is great. Any idea about the gain for Robin Hood? Next step is to improve gcc so those 2 instructions could benefit all programs.
| |
| | Gunnar von Boehn (Apollo Team Member) Posts 6258 24 Jan 2024 11:08
| Kamelito Loveless wrote:
| This is great. Any idea about the gain for Robin Hood?
|
I think all programs using the FPU do float to int conversion. For conversion to signed integer they need 2 instructions, and for unsigned int 9. We can improve this to 1 instruction each. Yes, the Robin Hood code also does float to int conversion about 700 hundred times.
Kamelito Loveless wrote:
| Next step is to improve gcc so those 2 instructions could benefit all programs.
|
Yes this is a good idea.
| |
| | Rollef 2000
Posts 29 25 Jan 2024 05:57
| Moin, this development is the meaning of CISC, isn't it? Very good!
| |
| | Carles Bernat Martorell
Posts 22 25 Jan 2024 23:42
| Thank you for your commitment and hard work!
| |
| | Gunnar von Boehn (Apollo Team Member) Posts 6258 29 Jan 2024 06:01
| We have added FMOVERZ and FMOVEURZ to the Core. You can now find their documentation in the 68080 instruction list. FMOVERZ converts a float to a signed integer, rounding toward zero. FMOVEURZ converts a float to an unsigned integer, rounding toward zero. As the C language standard requires rounding toward zero for these conversions, the two instructions help C compilers make this common conversion as efficient as possible.
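The rounding rule mentioned above can be checked with plain standard C. The sketch below only demonstrates what the language requires of the casts. The helper names are hypothetical, a 32-bit `unsigned int` is assumed, and whether a given compiler maps these casts to FMOVERZ and FMOVEURZ is an assumption, not something shown here.

```c
#include <assert.h>

/* Hypothetical helpers wrapping the two standard C casts. */
int to_int_trunc(double x)           { return (int)x; }
unsigned int to_uint_trunc(double x) { return (unsigned int)x; }

/* Checks that the casts discard the fractional part (round toward zero). */
void check_truncation(void)
{
    assert(to_int_trunc(2.7) == 2);
    assert(to_int_trunc(-2.7) == -2);   /* toward zero, not down to -3 */
    /* Above INT_MAX, so only the unsigned conversion can represent it. */
    assert(to_uint_trunc(3000000000.5) == 3000000000u);
}
```

Note that negative values move up toward zero under this rule, which is why "rounding toward zero" differs from always rounding down.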
| |
|
|
|