
68080 "Transistor Count"

John Heritage

Posts 111
05 Feb 2017 20:45


Hey, just curious: roughly how many transistors would the 68080 (CPU core only, plus appropriate caches and FPU) have if fabricated as an ASIC?
 
  Motorola
  68000 = 68,000 (16-bit bus/32-bit internal)
  68010 = ??,??? (+5-10% per clock perf, Virtual Machine support)
  68020 = 190,000 ("full 32-bit")
  68030 = 273,000 (cache improvements+MMU+better bus)
  68040 = 1,200,000 (fully pipelined+FPU, bigger caches)
  68060 = 2,500,000 (superscalar integer, +clock speed)
 
  Apollo 68080 =


M Rickan

Posts 177
05 Feb 2017 22:30


That's a really interesting question.

Do we also know the ballpark number of transistors in the original custom chips?


Nixus Minimax

Posts 416
06 Feb 2017 09:02


I'd estimate it's at least 10 times that of an 060, probably more. The pipelines support all instructions (except multicycle), there are more instructions supported, lots of special tricks like bonding and fusing, the FPU is fully pipelined, most instructions execute in a single cycle, caches are much bigger (and store predecode info), there are two memory buses, hazard detection, more registers and the SIMD additions.

As for the custom chips, they probably had a low transistor count as they were fabricated in an ancient micrometer-range NMOS technology with huge feature sizes. Since it was NMOS, they also needed lots of long-channel transistors for the load transistors of the gates which made transistor density even worse.




Markus (mfro)

Posts 99
06 Feb 2017 16:52


Nixus Minimax wrote:

  I'd estimate it's at least 10 times that of an 060, probably more.

 
  Would doubt that.
 
  I did not find any numbers for the Cyclone III, but for the largest of their Stratix IV (last year's) flagship devices, Altera claims to be able to squeeze the equivalent of up to 15 million ASIC logic gates into it. Let's make that 10 million (because of marketing exaggeration, because they surely counted the DSP elements on the chip, which might not be of much use for m68k purposes, and last but not least because it makes the math much easier ;) ).
 
  These Stratix IV devices contain 820,000 LEs.
  The Cyclone III used in the Vampire contains about 40,000 (probably less efficient) LEs.
 
  By the rule of proportion, this comes to about 500,000 logic gates if the Cyclone were 100% full.
 
  That estimation is probably highly inaccurate, but hey, it's at least an estimate.
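
For anyone who wants to replay the proportion above, here is a minimal Python sketch. It only uses the figures from the post (10 million gates after the rounding-down, 820,000 and 40,000 LEs); the function and constant names are made up for illustration, and the result inherits all the uncertainty of the marketing number it starts from.

```python
# Rule-of-proportion estimate, using only the figures quoted in the post.
STRATIX_IV_GATES = 10_000_000  # Altera's "up to 15 million", rounded down in the post
STRATIX_IV_LES = 820_000       # logic elements in the largest Stratix IV
CYCLONE_III_LES = 40_000       # approximate logic elements in the Vampire's Cyclone III


def equivalent_gates(les: int,
                     ref_les: int = STRATIX_IV_LES,
                     ref_gates: int = STRATIX_IV_GATES) -> float:
    """Scale the reference gate count linearly by the number of logic elements."""
    return ref_gates * les / ref_les


cyclone_gates = equivalent_gates(CYCLONE_III_LES)
print(f"{cyclone_gates:,.0f} equivalent gates")  # 487,805 equivalent gates
```

That is where the "about 500,000" comes from. The scaling assumes an LE on either device is worth the same number of ASIC gates, which the post itself doubts.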
 
 


Nixus Minimax

Posts 416
06 Feb 2017 17:15


Well, according to your estimate the C3 couldn't even hold an 060, since a basic CMOS logic gate has four transistors and thus the 40k LEs we have would only be worth 2M transistors.
 
  My estimation relies on technical arguments, yours on marketing numbers. Which one seems more reliable? ;)
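
The objection is simple arithmetic, sketched below in Python. The gate count (about 500,000) and the four transistors per gate come from the thread, and the 2.5 million figure for the 68060 is the one from the opening post; the variable names are only illustrative.

```python
# Convert the gate estimate into transistors and compare it with the 68060.
ESTIMATED_GATES = 500_000          # Cyclone III estimate from the thread
TRANSISTORS_PER_GATE = 4           # "a basic CMOS logic gate has four transistors"
MC68060_TRANSISTORS = 2_500_000    # figure quoted in the opening post

estimated_transistors = ESTIMATED_GATES * TRANSISTORS_PER_GATE
could_hold_060 = estimated_transistors >= MC68060_TRANSISTORS

print(f"{estimated_transistors:,} transistors")  # 2,000,000 transistors
print(f"Enough for an 060? {could_hold_060}")    # Enough for an 060? False
```

Under those assumptions the whole FPGA would come out smaller than a single 68060, which is the contradiction being pointed out.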
 
 


Markus (mfro)

Posts 99
06 Feb 2017 17:31


Nixus Minimax wrote:

    My estimation relies on technical arguments, yours on marketing numbers. Which one seems more reliable? ;)
 

 
  I understand you'd presume Motorola marketing to understate their numbers?
 


John Heritage

Posts 111
07 Feb 2017 01:14


m rickan wrote:

That's a really interesting question.
 
  Do we also know the ballpark number of transistors in the original custom chips?

OCS/ECS Agnus+Paula+Denise = ~ 60,000
AGA Agnus+Paula+Lisa = ~ 120,000

This wouldn't include *all* of the support chips of course.  OCS was originally fabbed on a 5.0 micron process; not sure about ECS.  AGA Lisa was fabbed on a 1.5 micron process, and had 80,000 transistors for that chip alone, according to what I could find.

For historical perspective, Intel was already on a 1.5 micron process by 1982, roughly 10 years ahead of AGA Lisa.

Bonus:  AAA was estimated at 750,000 in 4 chips (32-bit) or ~ 1M in 6 chips (64-bit) configs. 


M Rickan

Posts 177
07 Feb 2017 03:18


John Heritage wrote:

  OCS/ECS Agnus+Paula+Denise = ~ 60,000
  AGA Agnus+Paula+Lisa = ~ 120,000
 
  This wouldn't include *all* of the support chips of course...

It's just mind boggling to imagine how these chips were developed and tested in that era.


John Heritage

Posts 111
16 Feb 2017 15:19


Markus (mfro) wrote:

    These Stratix IV devices contain 820,000 LEs.
    The Cyclone III used in the Vampire contains about 40,000 (probably less efficient) LEs.
   
    By the rule of proportion, this comes to about 500,000 logic gates if the Cyclone were 100% full.
   
    That estimation is probably highly inaccurate, but hey, it's at least an estimate.
   
   

Does 1 logic gate = 1 transistor?


Markus (mfro)

Posts 99
16 Feb 2017 16:18


John Heritage wrote:
 
  Does 1 logic gate = 1 transistor?

In the simplest case, yes.

This: EXTERNAL LINK is also an interesting read regarding the topic.



John Heritage

Posts 111
16 Feb 2017 20:26


Markus (mfro) wrote:

 
John Heritage wrote:
 
    Does 1 logic gate = 1 transistor?
 

  In the simplest case, yes.
 
  This: EXTERNAL LINK is also an interesting read regarding the topic.
 
 

 
  That is interesting and appreciated
 
  So I suppose a good chunk of the 68040 and 68060 would be the FPUs, and probably caches that the Apollo core may not have..  That means:
 
  68020 = 190,000
  68030 = 273,000
  Apollo 68080 integer core = ~ 300,000-400,000
  68040 = 1,200,000
  68060 = 2,500,000
 
  Yet the 68080 unquestionably has higher IPC than the 68040, and at least the IPC of the 68060 (from what I've seen in benchmarks).
 
  The 68030 added an MMU and small data caches (meaning the MMU is probably worth ~ 50,000 transistors).
  The 68040 has 8 KB of caches, an MMU, and an FPU, and is fully pipelined.
  The 68060 has 16 KB of caches, an MMU, and a more powerful FPU, and is 2x superscalar.
 
  I'm really curious now what kind of FPU the Apollo team will fit on the Cyclone III, and whether it will outperform the 040 and 060 FPUs. Even at equal performance, doing all that with ~ 500,000 transistors seems pretty amazing!
 
 
 


M Rickan

Posts 177
17 Feb 2017 05:46


This is pretty fascinating stuff.

Ok, so a very basic question from a hardware newb:

When (hypothetically) moving from FPGA to an ASIC, there are a number of services that advertise "automated FPGA-to-ASIC conversion with Zero NRE."

Even in best case scenarios, that sounds a little too good to be true.

For something like the 68080 would transitioning require much heavy lifting?


Nixus Minimax

Posts 416
17 Feb 2017 11:56


John Heritage wrote:
So I suppose a good chunk of the 68040 and 68060 would be the FPUs, and probably caches that the Apollo core may not have..

 
  The 080 has 16K+32K cache and the 060 8K+8K.
 
 
 
  68020 = 190,000
  68030 = 273,000
  Apollo 68080 integer core = ~ 300,000-400,000
  68040 = 1,200,000
  68060 = 2,500,000

 
  Do you count AMMX as a part of the integer core? Remember there are far more registers and execution units (independent pipeline stages for address and data calculation), the instruction cache caches predecoded instructions, the second pipeline supports almost all instructions, there is a link stack, more complex branch prediction, two memory controllers, hazard detection...
 
   
 
I'm really curious now what kind of FPU the Apollo team will fit on the Cyclone III and will it outperform the 040 and 060 FPUs?  If even equal performance, doing all that with ~ 500,000 transistors seems pretty amazing!

 
  The 080 FPU is a three stage pipeline design which can schedule one FPU operation per clock cycle. This means 26 MFLOPS peak for a standard x11 release core and is a lot faster than the 060 FPU.
 
 


John Heritage

Posts 111
17 Feb 2017 15:23


m rickan wrote:

This is pretty fascinating stuff.
 
  Ok, so a very basic question from a hardware newb:
 
  When (hypothetically) moving from FPGA to an ASIC, there are a number of services that advertise "automated FPGA-to-ASIC conversion with Zero NRE."
 
  Even in best case scenarios, that sounds a little too good to be true.
 
  For something like the 68080 would transitioning require much heavy lifting?

I'm not an expert, but --

Keep in mind that converting an FPGA design to an ASIC involves a lot more than just the layout and following the transistor rules for a given manufacturing process. There are many steps before you produce your first mask. On an old (but modern for 68K) process like 130nm, the mask cost alone is $150K USD. On top of that you need to pay for an initial small production run so you can check for defects, and possibly create a second mask and run one more batch (a "re-spin") before you produce your first viable chip.

OTOH the chip would probably be pretty small on one of these mature processes so yields would be very high if it ever went that way. 

EXTERNAL LINK  has an article on a crowd funded effort that raised $360K and produced a chip on 110nm with TSMC...  so there may be ways to do it for ~ $150-300K USD.. 


John Heritage

Posts 111
17 Feb 2017 15:31


Nixus Minimax wrote:

   
    The 080 has 16K+32K cache and the 060 8K+8K.
   
    Do you count AMMX as a part of the integer core?
 

 
  Good info - I wasn't sure whether the cache was needed due to clock speeds on the FPGA (I'm naive here). Yes, I was including AMMX in that estimate of estimates, but that's a good point that there is a lot of additional functionality in the INT core vs. even the 060, which makes the 300-400k transistor count for the performance still seem really low (and/or very impressive). I'm wondering if the integer pipeline of the 68080 is a lot shorter than the 060's, which might explain the transistor count difference?
   
   
Nixus Minimax wrote:

    The 080 FPU is a three stage pipeline design which can schedule one FPU operation per clock cycle. This means 26 MFLOPS peak for a standard x11 release core and is a lot faster than the 060 FPU.
 

 
  OK, that's great info - if I'm reading my diagram correctly, the FPU pipeline of the 060 is also very short at only 2-4 stages.  (Fetch, Execute, "data Available", "write back"). 
 
  One other item consuming transistors on the 060 is the power gating/management..  According to the manual, it powers down functional blocks that are not needed on a clock by clock basis, and there's also some way for the OS to interface with the chip for this too..    I'm guessing the initial Apollo 68080 does not have functionality like this?


Nixus Minimax

Posts 416
17 Feb 2017 17:51


Nixus Minimax wrote:

  The 080 FPU is a three stage pipeline design which can schedule one FPU operation per clock cycle. This means 26 MFLOPS peak for a standard x11 release core and is a lot faster than the 060 FPU.

Guess I shouldn't write forum posts while chatting with a colleague. Since the FPU is pipelined, it of course does 78 MFLOPS peak, not only 26.
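
The two numbers can be reproduced with a short Python sketch. It assumes that "x11" means eleven times the 7.09379 MHz CPU clock of a PAL Amiga (the thread does not spell this out) and that "peak" means one FPU operation issued every cycle. All names are illustrative.

```python
# Peak FPU throughput for a three-stage FPU, pipelined vs. not pipelined.
AMIGA_PAL_CLOCK_MHZ = 7.09379  # assumption: base 68000 clock of a PAL Amiga
CLOCK_MULTIPLIER = 11          # "x11" release core
FPU_PIPELINE_STAGES = 3        # three-stage FPU, from the post

core_clock_mhz = AMIGA_PAL_CLOCK_MHZ * CLOCK_MULTIPLIER

# Pipelined: a new operation can start every cycle.
pipelined_mflops = core_clock_mhz * 1

# Not pipelined: each operation occupies the FPU for all three stages.
unpipelined_mflops = core_clock_mhz / FPU_PIPELINE_STAGES

print(f"pipelined:   {pipelined_mflops:.0f} MFLOPS peak")    # pipelined:   78 MFLOPS peak
print(f"unpipelined: {unpipelined_mflops:.0f} MFLOPS peak")  # unpipelined: 26 MFLOPS peak
```

So the earlier 26 is what you get if the three stages are counted as three cycles per operation, and 78 is the one-per-cycle peak.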



Gunnar von Boehn
(Apollo Team Member)
Posts 6197
17 Feb 2017 18:12


John Heritage wrote:

    but that's a good point that there is a lot of additional functionality in the INT core vs. even the 060... 
   

  Yes, the 68080 is much more advanced than the 68060.
   
   
John Heritage wrote:

    which makes the 300-400k transistor count for the performance still seem really low (and/or very impressive)..
   

 
  Translating LEs to transistors cannot easily be done.
  But I can tell you that the L1 caches alone are about 4.000 K transistors.
   


John Heritage

Posts 111
17 Feb 2017 21:51


Gunnar von Boehn wrote:
 
  Translating LEs to transistors cannot easily be done.
  But I can tell you that the L1 caches alone are about 400 K transistors.

Understood -- are you using DRAM of some kind as a cache then?    SRAM is typically 4 transistors per bit.  I should have realized 400,000 transistors total was very silly given the cache alone.
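
A rough Python sketch of the cache arithmetic behind that question. The 16K+32K cache size and the 4 transistors per bit are taken from the thread; the 1-transistor (DRAM-style) and 6-transistor (the common SRAM cell) rows are added here for comparison and are not figures from the Apollo team. It counts the raw data bits only, so tags, predecode info and control logic are left out.

```python
# Transistors needed just for the data bits of the L1 caches.
CACHE_BYTES = (16 + 32) * 1024  # 16K+32K L1 caches, as quoted in the thread


def data_array_transistors(cache_bytes: int, transistors_per_bit: int) -> int:
    """Transistors for the raw data bits only (no tags, predecode or control)."""
    return cache_bytes * 8 * transistors_per_bit


for label, per_bit in (("1T DRAM-style cell", 1),
                       ("4T SRAM cell", 4),
                       ("6T SRAM cell", 6)):
    print(f"{label}: {data_array_transistors(CACHE_BYTES, per_bit):,}")
# 1T DRAM-style cell: 393,216
# 4T SRAM cell: 1,572,864
# 6T SRAM cell: 2,359,296
```

Only the one-transistor-per-bit case lands near 400 K, which is presumably why DRAM comes up; with any SRAM cell the data arrays alone are already well over a million transistors.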

(I'm not an engineer but I do love understanding micro architectures as much as I can :).)

Do you have any rough estimate for how many transistors a 68080 with FPU might occupy in an ASIC?  Just curious..


M Rickan

Posts 177
18 Feb 2017 02:47


John Heritage wrote:

  Do you have any rough estimate for how many transistors a 68080 with FPU might occupy in an ASIC?  Just curious..

I'd be interested in getting a sense of the difference in performance as well.

Again, purely for curiosity.


Heyden M

Posts 7
11 Mar 2017 05:42


I would take a guess and say 3.0-4.0 million transistors with FPU. This would put it in the same area as the original Pentium/Pentium Pro for transistor count.

It seems the 68080 performance is better per clock and would scale similar to a PIII. Though, the PIII had a transistor count of 10-45 million. Something interesting about the PIII was that regardless of the increase in transistor count from revision to revision actual performance was about the same clock for clock. And... it's been a long time... but I don't remember anything substantially changing about the PIII with each revision (and addition of transistors) so I can't speak to that.

So, if we hypothetically compare a PIII at 1 GHz and a Gold Core at 1 GHz, performance would be similar (but the Gold Core would have about 1/4 to 1/10 of the transistors). This is amazing.
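
As a sanity check on that fraction, here is a small Python sketch using only the two guessed ranges from this post (3.0-4.0 million for the 68080 with FPU, 10-45 million for the PIII). The names are illustrative and the inputs are guesses, so the output is no better than they are.

```python
from fractions import Fraction

# Guessed transistor ranges from the post (low, high).
GOLD_CORE_RANGE = (3_000_000, 4_000_000)
PIII_RANGE = (10_000_000, 45_000_000)

# Smallest ratio: smallest core against the largest PIII; largest: the reverse.
smallest_ratio = Fraction(GOLD_CORE_RANGE[0], PIII_RANGE[1])
largest_ratio = Fraction(GOLD_CORE_RANGE[1], PIII_RANGE[0])

print(f"between {smallest_ratio} and {largest_ratio} of a PIII")
# between 1/15 and 2/5 of a PIII
```

The full spread of the guesses is a bit wider than 1/4 to 1/10, but it is the same ballpark.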

Also, something else to point out is that performance hasn't really gone anywhere in the x86 world since the PIII clock for clock... thread for thread.

Also, this is all my opinion; I'm not trying to upset anyone.
