APOLLO CPU Knowledge Forum

Performance and Benchmark Results!

Can You Beat This Score?

René W. Olsen

Posts 8
10 Nov 2016 22:18

My Evil Sam (PPC440EP 666MHz) gives me this:

-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
1 K Element : 911.87 MB/sec
2 K Element : 937.52 MB/sec
4 K Element : 925.51 MB/sec
8 K Element : 902.44 MB/sec
16 K Element : 323.83 MB/sec
32 K Element : 250.11 MB/sec

Wawa T

Posts 695
11 Nov 2016 00:04

Markus (mfro) wrote:

We still have gcc 2.95 on the Atari platform as well.

Its main advantage is its size and memory requirements, which are moderate enough to run on the limited, old machines themselves.

Other than that, it's hopelessly aged.

when i mentioned gcc > 3.x.x i actually meant cross compilers. it doesn't make much sense to compile huge projects natively on amiga. as for the syntax analyzer and more usable compiler feedback, i agree: 6.x.x has improved a lot in this respect, according to my rather limited experience with coding.

Gunnar von Boehn
(Apollo Team Member)
Posts 6254
11 Nov 2016 01:06

Latest Core Result

32K score = 207 MB/sec

Gregthe Canuck

Posts 274
11 Nov 2016 02:31

Very impressive - roughly 10% speed increase overall.

How do you explain the small difference between the 1K and 32K speeds? Is it simply the fact that the memory controller and look-ahead logic is so smart that exhausting the caches doesn't give a big performance hit? For example the PPC440 tumbles from 911 to 250.

The other consideration I presume is the low clock speed of the core compared to the memory speed.

Looking forward to the Gold 2 core. It's looking pretty nice!


Mr-Z EdgeOfPanic

Posts 189
11 Nov 2016 04:47

Nice result again, Gold 2 is getting better and better. Keep up the good work!

Markus (mfro)

Posts 99
11 Nov 2016 06:34

gregthe canuck wrote:

Very impressive - roughly 10% speed increase overall.

My first question would be whether this is a different core or a different benchmark ;). The output looks different from the first attempt.

gregthe canuck wrote:

How do you explain the small difference between the 1K and 32K speeds? Is it simply the fact that the memory controller and look-ahead logic is so smart that exhausting the caches doesn't give a big performance hit? For example the PPC440 tumbles from 911 to 250.

The other consideration I presume is the low clock speed of the core compared to the memory speed.

If you don't see performance degradation at larger sizes, either the caches are still much larger than the data set, or the cache isn't much faster than memory. I'd assume the latter.

Actually, from the benchmark results in the other thread, one could even assume there are no "real" caches at all, and that the benefits of movem copies are mainly due to memory burst width (which needn't be a disadvantage: if your memory controller can deliver at CPU speed, you don't need caches at all).
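
For the PPC440 drop mentioned in the quote, a rough working-set estimate may help. This is only a sketch: the 4-byte element size and the 32 KB data cache are assumptions, not figures given in this thread, and the helper names are made up for illustration.

```python
# Hypothetical helpers: does the benchmark's working set fit in the data cache?
# ASSUMPTIONS (not stated in the thread): 4-byte elements, 32 KB data cache.

def working_set_kb(k_elements, bytes_per_element=4):
    """Size in KB of an array holding k_elements * 1024 elements."""
    return k_elements * 1024 * bytes_per_element // 1024


def fits_in_cache(k_elements, cache_kb=32, bytes_per_element=4):
    """True if the whole array fits in a data cache of cache_kb kilobytes."""
    return working_set_kb(k_elements, bytes_per_element) <= cache_kb


for k in (1, 2, 4, 8, 16, 32):
    print(f"{k:2d} K elements: {working_set_kb(k):3d} KB, fits: {fits_in_cache(k)}")
```

Under those assumptions the array stops fitting between 8 K and 16 K elements, which is where the Sam's score falls from about 902 to about 324 MB/sec.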

Thierry Atheist

Posts 644
11 Nov 2016 07:09

Can you imagine an AmigaOne Sam 440EP 666MHz, side by side with an Amiga with a Vampire 2 in it, performing the 32K test...

Where;
the AmigaOne does the test in 10 minutes
and the Apollo Core'd Amiga, does the same amount of calculations in

... 12 minutes and 3 seconds!!!!!

L.O.L.!!!!

(This comment is NOT to disparage the work of Hyperion Entertainment, AOS4.x, A-Eon or ACube Systems. This is a slap in the face to the ABJECT FAILURE of motorola/freescale to produce quality CPUs at REASONABLE PRICES.)

Gunnar von Boehn
(Apollo Team Member)
Posts 6254
12 Nov 2016 11:13

gregthe canuck wrote:

Very impressive - roughly 10% speed increase overall.

Thanks.
We continuously try to improve the core.

Running benchmarks is good for this, as analyzing the behavior of some code sometimes gives us new ideas for how the core can be improved, e.g. ideas on what could be improved in branch prediction...

gregthe canuck wrote:

How do you explain the small difference between the 1K and 32K speeds? Is it simply the fact that the memory controller and look-ahead logic is so smart that exhausting the caches doesn't give a big performance hit?

Yes exactly.
Our memory is fast and the CPU has automatic stream detection, including data prefetching.

gregthe canuck wrote:

For example the PPC440 tumbles from 911 to 250.

The other consideration I presume is the low clock speed of the core compared to the memory speed.

The PowerPC cannot detect streams and cannot prefetch automatically.
The PowerPC is clocked higher, but clock for clock Apollo is a lot stronger than the PowerPC.

Gunnar von Boehn
(Apollo Team Member)
Posts 6254
13 Nov 2016 10:56

We analyzed some bottlenecks and improved our branch prediction.

And here is the result!


Mr-Z EdgeOfPanic

Posts 189
13 Nov 2016 11:44

That's an impressive improvement again!


Mo Retro

Posts 241
13 Nov 2016 13:40

Wow this is a tremendous improvement for the 1K element. :-)

Keep up the good work.

Love your Workbench flavour. What is it exactly?

B. van Der Meer

Posts 1
13 Nov 2016 14:02

My Evil AmigaONE (PowerPC 460EX 1155MHz) gives me this

-------------------------------------------------------------
SORTBENCH 1.1 (Gunnar von Boehn)
Its a CPU benchmark that stresses CPU, DCache and branch prediction.
-------------------------------------------------------------
1 K Element : 1662.39 MB/sec
2 K Element : 1567.20 MB/sec
4 K Element : 1665.97 MB/sec
8 K Element : 1660.69 MB/sec
16 K Element : 505.42 MB/sec
32 K Element : 414.53 MB/sec

Mr-Z EdgeOfPanic

Posts 189
13 Nov 2016 14:23

Mo Retro wrote:

Wow this is a tremendous improvement for the 1K element. :-)

Keep up the good work.

Love your Workbench flavour. What is it exactly?

Seems like Gunnar is running Magic workbench for the icons and visual prefs for the title bar and windows customization.

Gunnar von Boehn
(Apollo Team Member)
Posts 6254
13 Nov 2016 15:10

Comparing AMIGA 4000 with AMIGA 600+Vampire

 
   
  A4000 - 68040 @ 25MHz
  
   1 K Element :   20.20 MB/sec
   2 K Element :   15.31 MB/sec
   4 K Element :   13.04 MB/sec
   8 K Element :   12.04 MB/sec
  16 K Element :   11.59 MB/sec
  32 K Element :   11.37 MB/sec
  
  A600 - V600 V2 Core 3569 x13
                                (Versus 040@25)
   1 K Element :  340.52 MB/sec (x16.85)
   2 K Element :  343.22 MB/sec (x22.41)
   4 K Element :  344.33 MB/sec (x26.40)
   8 K Element :  344.83 MB/sec (x28.64)
  16 K Element :  278.01 MB/sec (x23.98)
  32 K Element :  242.99 MB/sec (x21.37)
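
The "Versus 040@25" factors are each Vampire score divided by the matching 68040 score. A minimal sketch for recomputing them from the two listings above (the variable and function names are illustrative, not part of SORTBENCH):

```python
# Scores in MB/sec, copied from the two listings above.
SIZES_K = [1, 2, 4, 8, 16, 32]
A4000_040 = [20.20, 15.31, 13.04, 12.04, 11.59, 11.37]
A600_V2 = [340.52, 343.22, 344.33, 344.83, 278.01, 242.99]


def speedups(baseline, candidate):
    """Return candidate/baseline for each pair of scores."""
    return [c / b for b, c in zip(baseline, candidate)]


for k, ratio in zip(SIZES_K, speedups(A4000_040, A600_V2)):
    print(f"{k:2d} K Element : x{ratio:.2f}")
```

Depending on whether the values are rounded or truncated, the last digit can differ slightly from the posted factors.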

Chris H

Posts 65
13 Nov 2016 18:35

I get this result compiled with gcc 2.95 on gold1 and -O2 parameter.
Is this an expected result for gold1?

For all Elements around 58 MB/sec.

Mo Retro

Posts 241
13 Nov 2016 19:45

Mr-Z EdgeOfPanic wrote:

Mo Retro wrote:

Wow this is a tremendous improvement for the 1K element. :-)

Keep up the good work.

Love your Workbench flavour. What is it exactly?

Seems like Gunnar is running Magic workbench for the icons and visual prefs for the title bar and windows customization.

Looks very neat.
Thanks Mr-Z EdgeOfPanic, I use ClassicWB, but i will give Magic WB a try as soon as i have new CF cards :)


Gregthe Canuck

Posts 274
14 Nov 2016 04:15

Another big jump in performance - wow! Nice work to all involved.


Gunnar von Boehn
(Apollo Team Member)
Posts 6254
14 Nov 2016 14:24

We made some minor improvements to memory prefetching.

Thierry Atheist

Posts 644
14 Nov 2016 15:02

Numbers from Gunnar's 2 screen grabs above here on this page (not the one right above this post):


K elements   Before (MB/sec)   After (MB/sec)   Improvement (%)
     1           230.21            340.52             47.9
     2           231.67            343.22             48.2
     4           232.22            344.33             48.3
     8           232.46            344.83             48.3
    16           217.45            278.01             27.9
    32           207.53            242.99             17.1

W O W !

Now, compared to B. van Der Meer's numbers (in a few posts above here):
AmigaONE (PowerPC 460EX 1155 MHz)


K elements   Vampire 2   460EX      460EX faster   Vampire 2 matches   Vampire 2 speed
             (MB/sec)    (MB/sec)   than V2 (%)    a 460EX at (MHz)    as % of 460EX
     1        340.52     1662.39       388.2            236.6               20.5
     2        343.22     1567.20       356.6            252.9               21.9
     4        344.33     1665.97       383.8            238.7               20.7
     8        344.83     1660.69       381.6            239.8               20.8
    16        278.01      505.42        81.8            635.3               55.0
    32        242.99      414.53        70.6            677.0               58.6
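
The three derived columns can be recomputed from the two MB/sec columns and the 460EX's 1155 MHz clock. A minimal sketch (the function name is illustrative; the "equivalent MHz" figure assumes throughput scales linearly with clock speed, which is a simplification):

```python
CLOCK_460EX_MHZ = 1155  # clock quoted in B. van Der Meer's post

# (K elements, Vampire 2 MB/sec, 460EX MB/sec), copied from the table above.
ROWS = [
    (1, 340.52, 1662.39),
    (2, 343.22, 1567.20),
    (4, 344.33, 1665.97),
    (8, 344.83, 1660.69),
    (16, 278.01, 505.42),
    (32, 242.99, 414.53),
]


def derive(v2, ppc, ppc_mhz=CLOCK_460EX_MHZ):
    """Return (% the 460EX is faster, equivalent 460EX MHz, V2 as % of 460EX)."""
    pct_faster = (ppc / v2 - 1) * 100
    equiv_mhz = ppc_mhz * v2 / ppc  # assumes linear scaling with clock
    pct_as_fast = v2 / ppc * 100
    return pct_faster, equiv_mhz, pct_as_fast


for k, v2, ppc in ROWS:
    faster, mhz, share = derive(v2, ppc)
    print(f"{k:2d} K: {faster:6.1f} % faster, ~{mhz:5.1f} MHz, {share:4.1f} %")
```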

P.S. Soooooo, ever adjust your blanket on one side of the bed, only to see the other come undone? Does all of this improvement cause other types of actions the CPU does to slow down to some degree?

Markus (mfro)

Posts 99
15 Nov 2016 09:47

Thierry Atheist wrote:

... Soooooo, ever adjust your blanket on one side of the bed, only to see the other come undone? Does all of this improvement cause other types of actions the CPU does to slow down to some degree?

Generally, the more complex a core becomes regarding logic, the more difficult it gets to achieve high clock rates. As I didn't read about a reduced clock, I'd assume the Apollo team managed to handle that?
