
Welcome to the Apollo Forum

This forum is for people interested in the APOLLO CPU.
Please read the forum usage manual.
Please visit our Apollo-Discord Server for support.



Performance and Benchmark Results!

Talk About Sysinfo

Gunnar von Boehn
(Apollo Team Member)
Posts 6207
10 Jun 2019 14:09


Some result of Sysinfo


 
This is what Sysinfo results could look like....
 
 
 
 


Roger Andre Lassen

Posts 150
10 Jun 2019 14:10


I'll take it :-)


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
10 Jun 2019 14:14


Sysinfo uses benchmark code to measure "Dhrystone".
But Sysinfo does not use the official "Dhrystone" routine for this.
 
Sysinfo uses a "selfmade" routine.
This selfmade code has a "flaw" which prevents it from reaching the official (= correct) values for the 68060 CPU.
 
This flaw can easily be fixed - without changing any values for 68000/68020/68030/68040 ...
 
The above Sysinfo version uses such a "fix".


Andy Hearn

Posts 374
10 Jun 2019 16:40


so, basically, my understanding of the original issue "back in the day" is that the Sysinfo code couldn't understand, or didn't execute, code that could be "branched" to effectively use the dual execution units in the 060, so you only got the results from one integer ALU.

so what we're seeing here are the numbers from branching code?
loving the 100+MFLOPs btw. :D


Niclas A
(Apollo Team Member)
Posts 219
10 Jun 2019 16:48


Nice :)
Can someone with a 060 also run this version so we can see how it behaves for that too?


Hugo Pereira

Posts 72
10 Jun 2019 16:52


Gunnar von Boehn wrote:

  Sysinfo uses benchmark code to measure "Dhrystone".
  But Sysinfo does not use the official "Dhrystone" routine for this.
   
  Sysinfo uses a "selfmade" routine.
  This selfmade code has a "flaw" which prevents it from reaching the official (= correct) values for the 68060 CPU.
   
  This flaw can easily be fixed - without changing any values for 68000/68020/68030/68040 ...
   
  The above Sysinfo version uses such a "fix".
 

 
  This SysInfo benchmark was obtained with core 68080 (x16) - 113 MHz?


Andy Hearn

Posts 374
10 Jun 2019 17:09


Niclas A wrote:

Nice :)
  Can someone with a 060 also run this version so we can see how it behaves for that too?

that's not a bad idea. i'll see if i can dust off the A3k this evening. question, is this version of sysinfo in a coffin release?


Przemyslaw Tkaczyk

Posts 155
10 Jun 2019 17:35


Hugo Pereira wrote:

This SysInfo benchmark was obtained with core 68080 (x16) - 113 MHz?

This is x15 (106MHz).


Hugo Pereira

Posts 72
10 Jun 2019 18:42


Przemyslaw Tkaczyk wrote:

Hugo Pereira wrote:

  This SysInfo benchmark was obtained with core 68080 (x16) - 113 MHz?
 

 
  This is x15 (106MHz).

Seems to me to be a terrific result with x15 (106 MHz).


Daniel Sevo

Posts 299
10 Jun 2019 20:01


In "old" Sysinfo, a 060 @ 50MHz would get about 37,000 Dhrystones, and @ ~65MHz about 50,000 Dhrystones.


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
10 Jun 2019 20:17


Our Sysinfo for you to test
CLICK HERE 
Please share your score for other 68K chips


Andy Hearn

Posts 374
10 Jun 2019 21:37


Cool, Thanks for that.
Ok here we go, lowest to highest
(test system: Amiga 3000, 2MB chip, 16MB zip RAM, 256MB ZorRAM, CV64/3D, X-Surf 100 + RapidRoad, scsi2idebridge + CF adapter + 32GB SanDisk)


68030 @ 25MHz + 68882 @ 25MHz (onboard CPU/FPU)
Mips: 4.34    MFlops: 0.66    Dhrystones: 4159
 
68040 @ 25MHz + onboard FPU (C= A3640)
Mips: 19.37   MFlops: 4.74    Dhrystones: 18564
 
68060 @ 50MHz + onboard FPU (Cyberstorm Mk3 with 128MB)
Mips: 39.29   MFlops: 28.02   Dhrystones: 37648

so yeah, interesting result, but I was thinking that the 060 results would be about double that :) ;)
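
To put the three scores above on a per-clock footing, here is a small Python sketch that divides each Dhrystone score by the clock speed. The figures are copied from this post; the function name `dhrystones_per_mhz` is made up for illustration, and the per-MHz ratio is only a rough way to compare, not anything Sysinfo itself reports.

```python
# Scores copied from the post above: (CPU, clock in MHz, Dhrystones).
results = [
    ("68030", 25, 4159),
    ("68040", 25, 18564),
    ("68060", 50, 37648),
]


def dhrystones_per_mhz(clock_mhz, dhrystones):
    """Normalise a Dhrystone score by the CPU clock (hypothetical helper)."""
    return dhrystones / clock_mhz


# Map each CPU name to its per-MHz score.
per_mhz = {cpu: dhrystones_per_mhz(mhz, score) for cpu, mhz, score in results}

for cpu, value in per_mhz.items():
    print(f"{cpu}: {value:.1f} Dhrystones per MHz")
```

On these numbers the 68060 and the 68040 come out almost the same per MHz in this Sysinfo build, so the 060 score here is roughly "040 at twice the clock".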


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
10 Jun 2019 21:57


Andy Hearn wrote:

so yeah, interesting result,


cool results!
Thanks for this!


Andy Hearn

Posts 374
10 Jun 2019 22:03


no worries. I've got other machines I can pull together if needed, but I think that's a good spread. It's the only machine I can currently kit out with an '060, which was kinda the point of the original idea I feel :)


Adam A

Posts 130
11 Jun 2019 11:30


EXTERNAL LINK       
        I did a test earlier today on an 060 at stock clocks using Sysinfo v4.0
 

 


Peter Slegg

Posts 22
11 Jun 2019 13:59


So the benchmark is roughly 5x the 68060. Not sure what the clock speed of the '060 example is, but it is an impressive result however you look at it.




Andy Hearn

Posts 374
11 Jun 2019 15:30


Adam's clock, as per sysinfo's reporting, is the same as mine (4MHz :D ), and given his results plus his "stock clocks", i'd think it's pretty safe to say he's actually running at 50MHz too.

bear in mind that the 060 results here are still indicative of running on one execution unit, so they have the potential to double up on the reported numbers if the code was more "060-aware". But on the other side of the coin, i guess all non-060-aware software is going to behave like this, so to my mind it's a fairly honest representation.

Adam, just for my curiosity, what 060 board is that?


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
11 Jun 2019 15:52


Andy Hearn wrote:

  bear in mind that the 060 results here are still indicative of running on one execution unit, so has the potential to double up on the reported numbers if it was more "060-aware".
 

 
Normally software will automatically use both pipes.
If SETPATCH has been run, both pipes on the 68060 should be active.
 
The 68060 is a very good CPU and can in theory execute up to 2 instructions per cycle.
But the 68060 also has some weaknesses that were overlooked at first.
One limit is that the 68060 can only load 4 bytes per cycle from the Icache.
4 bytes is less than the 68040 could do, and unfortunately this limits the real power that the 68060 could reach in theory.

 
The 68060 can execute these 2 instructions in 1 cycle:

  ADDQ.L  #1,D1    ; 2-byte instruction
  ADDQ.L  #2,D2    ; 2-byte instruction

Here is an example of 2 instructions which the 68060 unfortunately can NOT execute in 1 cycle:

  ADD.L  #$100,D1  ; 6-byte instruction
  ADD.L  #$200,D2  ; 6-byte instruction

The problem here is that the instructions are too long.
The 68060 needs 3 cycles to fetch the 2 instructions from the Icache.
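
The arithmetic behind the two examples above can be sketched in a few lines of Python. This is only a back-of-the-envelope model: it assumes the instruction stream is perfectly aligned and that fetch width is the only limit, and the name `fetch_cycles` is invented for this illustration.

```python
import math


def fetch_cycles(instruction_sizes, bytes_per_cycle):
    """Minimum cycles to fetch the instructions, assuming an aligned stream.

    instruction_sizes: instruction lengths in bytes.
    bytes_per_cycle: how many bytes the Icache delivers per cycle.
    """
    return math.ceil(sum(instruction_sizes) / bytes_per_cycle)


short_pair = [2, 2]  # ADDQ.L #1,D1 / ADDQ.L #2,D2
long_pair = [6, 6]   # ADD.L #$100,D1 / ADD.L #$200,D2

# 68060 fetch width as stated in the post: 4 bytes per cycle.
print(fetch_cycles(short_pair, 4))
print(fetch_cycles(long_pair, 4))
```

The short pair fits into one 4-byte fetch, while the long pair is 12 bytes and so needs three fetches before both instructions are available.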

Motorola quickly identified this Icache limitation as a bottleneck of the 68060 and wanted to fix it in the planned 68060-B chip.
Motorola's plan was to increase this to 8 bytes per cycle.
Unfortunately the 68060-B never came to market.
 
 
 
The APOLLO 68080 also addresses this limit.
The 68080 can read 16 bytes per cycle from the Icache, which helps a lot to improve performance.
 
 

The SYSINFO benchmark code is a mix of instructions, many of them 6 bytes each.
This might be the reason the 68060 gets a relatively low score.

Another explanation could be that the Sysinfo code is a main loop calling several subroutines.
While the 68060 has good branch prediction, it has no real hardware acceleration for subroutine calls.
The APOLLO 68080 is the first 68K family CPU to add modern hardware acceleration for subroutine calls. This feature also helps a little in SYSINFO and in real life.




Andy Hearn

Posts 374
11 Jun 2019 16:15


Aha! good info - i did not know that! explains a lot.
so on the 060 you need to use 2-byte instructions to keep both pipes full. otherwise you end up with a pipe bubble as big as however many clocks it takes to get those instructions loaded, with the execution units sitting "idle" while this happens.

really shows the need for bigger L1 caches.
even in your example, the 080 looks like it could come close to needing another clock cycle.
i guess you've not seen enough waiting to need the Icache to deliver more than 16 bytes - but 16 certainly sounds like a big improvement over 4!

i wonder if any PGA 060Bs made it outside of Moto's test labs :D


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
11 Jun 2019 16:23


Andy Hearn wrote:

  really shows the need for bigger L1 caches.
 

 
 
The Icache of the 68060 is 8192 bytes in size.
This 8 KB size is pretty good and has a very good hit rate with most programs.
 
The only problem here is that the Icache can only "copy" 4 bytes of its content to the CPU per cycle.
Depending on the program counter address, sometimes even only 2 bytes per cycle (after a jump).
 

Apollo, for comparison, has a 16 KB (16384 bytes) Icache.
Apollo can "copy" 16 bytes of these per cycle to the CPU.
The Apollo Icache is specially designed to ALWAYS provide 16 bytes - even after a jump.

As most 68K instructions are 2-8 bytes long, the 16 bytes per cycle allow the CPU to be fed very well.
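
As a rough illustration of why the fetch width matters for 2-8 byte instructions, the Python sketch below computes how many equal-size instructions per cycle the Icache could deliver at 4 and at 16 bytes per cycle. It is a simplified upper bound on fetch only: it ignores alignment, and how many instructions a core can actually execute per cycle is a separate limit. The helper name `fetch_limited_ipc` is made up for this example.

```python
from fractions import Fraction


def fetch_limited_ipc(bytes_per_cycle, instruction_size):
    """Upper bound on instructions fetched per cycle (equal-size instructions)."""
    return Fraction(bytes_per_cycle, instruction_size)


# Instruction sizes in bytes, covering the 2-8 byte range from the post.
for size in (2, 4, 6, 8):
    narrow = fetch_limited_ipc(4, size)   # 4 bytes per cycle (68060)
    wide = fetch_limited_ipc(16, size)    # 16 bytes per cycle (Apollo)
    print(f"{size}-byte instructions: {float(narrow):.2f} vs {float(wide):.2f} per cycle")
```

With 4 bytes per cycle, a stream of 6-byte instructions can be fetched at only two thirds of an instruction per cycle, well below the 2 per cycle the 68060 can execute in theory. With 16 bytes per cycle, even 8-byte instructions still arrive at 2 per cycle.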
 
The Dcache of APOLLO is even bigger: 32 KB on the V2,
and 64 KB on the V1200 and V4.
 

As mentioned, the SYSINFO test does a lot of subroutine calls.
On the 68060 these jumps will also trigger cases where it only gets 2 bytes in a cycle. Thinking about it, this explains the 68060 scores.
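
The effect of a jump target on fetch can be sketched with a toy model in Python. The assumption here, which is mine and not a documented description of the 68060 fetch unit, is that each cycle delivers one aligned 4-byte window, so a jump target that sits 2 bytes into a window yields only 2 useful bytes in the first cycle. The function name `fetch_cycles_from` and the addresses are hypothetical.

```python
def fetch_cycles_from(address, total_bytes, window=4):
    """Cycles to fetch total_bytes starting at address.

    Toy model (an assumption): one aligned `window`-byte block per cycle,
    so bytes before `address` in the first block are wasted.
    """
    cycles = 0
    remaining = total_bytes
    while remaining > 0:
        usable = window - (address % window)  # bytes left in this window
        taken = min(usable, remaining)
        address += taken
        remaining -= taken
        cycles += 1
    return cycles


# Two 6-byte instructions (12 bytes) at an aligned vs. a misaligned target.
print(fetch_cycles_from(0x1000, 12))  # target on a 4-byte boundary
print(fetch_cycles_from(0x1002, 12))  # target 2 bytes into a window
```

In this model the misaligned target costs one extra fetch cycle for the same 12 bytes, which is the kind of penalty a loop full of subroutine calls would pay again and again.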

 
 
