

Polygon Pushing Performance of the 080

Dusko Kovacic

Posts 1
20 Oct 2018 17:13


"""why do you have such a hard on for V4 everytime somebody mentions anything you bang on about the v4... the v2 is here now cant you stop for just one second and let us all enjoy V2 - go put your dick in a V4 when they are actually on sale... I wont stop you lol"""

Go to aw.net forum and surf through forum history and read his posts.
Nickname, vox.

7032 days of restriction
14 abuse report(s)

that says a lot

;)

The guy was like pop-up commercials. Most of his posts, in most topics
(no matter what the discussion was about), somehow ended in his anti-CUSA
animosity.

Now he is here, in most topics, most of the time. I feel a light version of déjà vu, only here the V4 is currently his fetish of choice.

The beautiful diversity of life and all of its forms... ;)

(Admins, please feel free to remove this post, but I could not resist
pointing out the same pattern.)

:)




Steve Ferrell

Posts 424
20 Oct 2018 17:37


Gunnar von Boehn wrote:

Steve Ferrell wrote:

  so my question to Gunnar and others reading this is as follows.
 

 
  I agree that "3D geometry transformation" is an important part of 3D game calculation.
  Also very important part is the "line rasterization".
 
  For the 3D geometry using Floating point has several advantages.
 
  The FPU is designed for floating point calculation.
  APOLLO 68080 has the most powerful FPU of all 68K CPUs.
  So we have a working solution here.
 
  For "line rasterization" we are working on a New-Subunit of SAGA which does this accelerated. You can imagine this as a built-in VOODOO.

Thank you Gunnar.  Hinting at VooDoo-like capabilities opens up some new possibilities in regard to 3D API's.  Depending on how close your VooDoo implementation is to the original 3dfx, Glide comes to mind as does an OpenGL MCD (Mini Client Driver) or later possibly even an ICD driver.  The future looks bright!


Vojin Vidanovic
(Needs Verification)
Posts 1916/ 1
20 Oct 2018 18:00


Steve Ferrell wrote:

  Thank you Gunnar.  Hinting at VooDoo-like capabilities opens up some new possibilities in regard to 3D API's.  Depending on how close your VooDoo implementation is to the original 3dfx, Glide comes to mind as does an OpenGL MCD (Mini Client Driver) or later possibly even an ICD driver.  The future looks bright!
 

 
  Surely. Looking forward to seeing this and Pamela (16-bit audio) as the biggest advancements in the Amiga chipset so far.
 
 
Dusko Kovacic wrote:

  Go to aw.net forum and surf through forum history and read his posts.
  Nickname, vox.
  7032 days of restriction
  14 abuse report(s)

Yes, that is me.
That says a lot about AW.net moderation.
It's not even 7000 days and falling; it's a frozen timer.
I'm also a bronze supporter, from 2004 or so.

The same happened to A.org after the AEON takeover: cleansing of non-OS4 fanboys, while allowing promotion of companies such as C-USA.
 
And yes, CUSA deserved it all; there is a blog on it that has outlived them.
  EXTERNAL LINK


Andrew Copland

Posts 113
20 Oct 2018 18:23


Actually there's a good piece of info that I'd forgotten about in that link:
EXTERNAL LINK 
It won't get close to a hardware rasteriser but you can do SIMD rasterisation of triangles, meaning 2 or possibly 4.

Some other pipelining might give a decent speedup if you're using a MxN block based rasteriser design similar to what we looked at for Natami.
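
A rough sketch of the block-based idea, in Python for readability rather than speed. It evaluates the three edge functions of a triangle over small MxN tiles and treats each row of a tile as one batch, which is the part a SIMD unit would handle in parallel. The function names, the 4x2 tile size and the lack of a proper fill rule are assumptions for illustration only, not the Natami or Apollo design. Vertices are integer pixel coordinates, ordered so that interior points give non-negative edge values.

```python
def edge(ax, ay, bx, by, px, py):
    """Edge function: cross product of (B - A) and (P - A)."""
    return (bx - ax) * (py - ay) - (by - ay) * (px - ax)


def rasterise_blocks(tri, width, height, block_w=4, block_h=2):
    """Return the set of (x, y) pixels covered by a triangle.

    The screen is walked in block_w x block_h tiles. Inside a tile, each row
    of block_w pixels is tested together, standing in for one SIMD operation.
    Pixels exactly on an edge count as covered (no top-left fill rule).
    """
    (x0, y0), (x1, y1), (x2, y2) = tri
    covered = set()
    # Bounding box clamped to the screen, with its origin snapped to a tile.
    min_x = max(min(x0, x1, x2), 0) // block_w * block_w
    min_y = max(min(y0, y1, y2), 0) // block_h * block_h
    max_x = min(max(x0, x1, x2), width - 1)
    max_y = min(max(y0, y1, y2), height - 1)
    for by in range(min_y, max_y + 1, block_h):
        for bx in range(min_x, max_x + 1, block_w):
            for y in range(by, min(by + block_h, height)):
                xs = range(bx, min(bx + block_w, width))
                # One "vector" per edge, one lane per pixel in the row.
                w0 = [edge(x1, y1, x2, y2, x, y) for x in xs]
                w1 = [edge(x2, y2, x0, y0, x, y) for x in xs]
                w2 = [edge(x0, y0, x1, y1, x, y) for x in xs]
                for x, a, b, c in zip(xs, w0, w1, w2):
                    if a >= 0 and b >= 0 and c >= 0:
                        covered.add((x, y))
    return covered
```

The tile size only changes how the work is grouped, not which pixels are covered, so it can be tuned to whatever vector width is available.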


Louis Dias
(Needs Verification)
Posts 55/ 1
21 Oct 2018 23:16


Steve Ferrell wrote:

  @Lou Dias
 
 

    I suggested AKIKO since it already included a function to assist with Amiga graphics...so why not add more.
   
    Since a lot of these hardware 3D functions are simply software converted to hardware - having the team see a fully implemented API is the only way *the team* will be able to then convert those low level software functions into faster hardware functions.  Again - you had a reading comprehension issue.
   
    You are correct...there is a trend here...
 

 
  I read, comprehend and remember quite well.  Now you're being dishonest.  Even as far back as July you were saying this to Matt Hey over at AmigaWorld.net
 
  EXTERNAL LINK   
  "@matthey
 
  AMMX is great but it still needs to be in a highly parallelized custom chip with access to chip ram that can process 256 triangles at once to compete with N64/PS1/Saturn/Jaguar era machines...
 
  Kind of a SuperAkiko that can do C2P on 1024 bytes at once along with AMMXx256... Apparently, I'm an A-hole for bringing this up..."
 
  Again, the AKIKO has nothing to do with 3D geometry or pushing polygons. Nor does AMMX need to be placed into a custom chip.  C2P has nothing to do with 3D transforms.  And yes, you described yourself quite well in your comment to Matt.
 

 
  You are truly a child.
  Well, now you are being dishonest by taking things out of context.
  To create an enhanced *Amiga chipset* solution for 3D, there is no other option than enhancing the original [AGA] chipset.  People called the C2P AKIKO useless on anything other than an 020, but what if you created a highly parallelized version of it that could C2P 1024 bytes at a time instead of 4 bytes?  I am correct in this regard, because you can add shading functions to the AKIKO and anything else you want if you create a "super" version of it.  Copper is just a texture mapping unit and somewhat of a simple shader [HAM].  So proper shader [fixed pipeline] functions could be added to an existing chip (i.e. Copper or Gary/AKIKO, et al.; it doesn't matter, and that's why you clearly aren't too bright).  So I was thinking a new Amiga/Vampire could have a true SuperAGA chipset rather than an RTG one built in.  I was thinking to do it the Amiga way, by continuously enhancing the Amiga chipset.
  I realize the team won't do this for compatibility reasons now.  A shame really.  Hence SAGA uses an RTG driver.
 
  Re:AMMX in the cpu for 3D gaming
  You'll never push more than 10,000 polygons/s and have a playable game.  The CPU will be so flooded with polygon code that there won't be much time left for other code to run.  This is why a separate dedicated chip solution is the only way.
 
  You can ignore the evolution of my posts but it just continues to look like your comprehension extends to 1 post.  You don't comprehend the evolution of the posts from one to the next.
 
  Feel free to take another one out of context and continue to look foolish.  Feel free to reply and continue to express your immaturity...

...and newsflash, duncecap, they are adding transformation functions to SAGA, i.e. dedicated hardware.



Steve Ferrell

Posts 424
22 Oct 2018 00:25


Louis Dias wrote:

     
Steve Ferrell wrote:

        @Lou Dias
       
       

          I suggested AKIKO since it already included a function to assist with Amiga graphics...so why not add more.
         
          Since a lot of these hardware 3D functions are simply software converted to hardware - having the team see a fully implemented API is the only way *the team* will be able to then convert those low level software functions into faster hardware functions.  Again - you had a reading comprehension issue.
         
          You are correct...there is a trend here...
       

       
        I read, comprehend and remember quite well.  Now you're being dishonest.  Even as far back as July you were saying this to Matt Hey over at AmigaWorld.net
       
        EXTERNAL LINK         
        "@matthey
       
        AMMX is great but it still needs to be in a highly parallelized custom chip with access to chip ram that can process 256 triangles at once to compete with N64/PS1/Saturn/Jaguar era machines...
       
        Kind of a SuperAkiko that can do C2P on 1024 bytes at once along with AMMXx256... Apparently, I'm an A-hole for bringing this up..."
       
        Again, the AKIKO has nothing to do with 3D geometry or pushing polygons. Nor does AMMX need to be placed into a custom chip.  C2P has nothing to do with 3D transforms.  And yes, you described yourself quite well in your comment to Matt.
       

       
        You are truly a child.
        Well, now you are being dishonest by taking things out of context.
        To create an enhanced *Amiga chipset* solution for 3D, there is no other option than enhancing the original [AGA] chipset.  People called the C2P AKIKO useless on anything other than an 020, but what if you created a highly parallelized version of it that could C2P 1024 bytes at a time instead of 4 bytes?  I am correct in this regard, because you can add shading functions to the AKIKO and anything else you want if you create a "super" version of it.  Copper is just a texture mapping unit and somewhat of a simple shader [HAM].  So proper shader [fixed pipeline] functions could be added to an existing chip (i.e. Copper or Gary/AKIKO, et al.; it doesn't matter, and that's why you clearly aren't too bright).  So I was thinking a new Amiga/Vampire could have a true SuperAGA chipset rather than an RTG one built in.  I was thinking to do it the Amiga way, by continuously enhancing the Amiga chipset.
        I realize the team won't do this for compatibility reasons now.  A shame really.  Hence SAGA uses an RTG driver.
       
        Re:AMMX in the cpu for 3D gaming
        You'll never push more than 10,000 polygons/s and have a playable game.  The CPU will be so flooded with polygon code that there won't be much time left for other code to run.  This is why a separate dedicated chip solution is the only way.
       
        You can ignore the evolution of my posts but it just continues to look like your comprehension extends to 1 post.  You don't comprehend the evolution of the posts from one to the next.
       
        Feel free to take another one out of context and continue to look foolish.  Feel free to reply and continue to express your immaturity...
     
      ...and newsflash, duncecap, they are adding transformation functions to SAGA, i.e. dedicated hardware.
     
     

   
   
Evolution?  You've got to be joking as your comments haven't evolved at all here or at AmigaWorld.  Your comment to Matt Hey back in July at AmigaWorld was so nonsensical that he didn't even bother responding.  In a later exchange with him in late Sept. he was nice enough to take the time to explain the following to you, and I quote, "The Apollo core is restricted by the small resources of affordable FPGAs. It's not difficult to add multiple cores or a wider SIMD unit (speaking of hardware limitations not Gunnar's ISA limitations). It makes little sense to move dedicated functionality to other chips. As I've said before, it is better to move everything closer into one SoC, perhaps with a separate FPGA for versatility." Unquote.
     
EXTERNAL LINK 
And here's what you said on Sept. 27th, "my whole point is that for a *future* V5 Gunnar *should* create/add 256x [aka multiply by 256 but run in parallel] AMMX to a custom chip. I suggested a SuperAkiko chip since the current one is useless in it's current form... It would be wholly developed in-house so it would not require any 3rd party hardware."
     
And in spite of you making this offensive comment about Gunnar, "I repeat - his hubris is thinking the cpu can do it all.", Gunnar has been polite and patient enough to explain the same to you, yet you persist in regurgitating the same response of "co-processor" and "move AMMX to a separate chip" when the question arises as to how we can speed up 3D processing on the Vampire.
     
Have you ever wondered why no one has asked Intel to move their MMX/SSE/SSE2/SSE3/SSE4 instruction set onto a separate chip?  Or why no one talks of moving the PPC AltiVec instruction set onto a separate chip?
     
Answer:  Because it makes absolutely no sense.  This applies to the 080 as well.
     
So unless you have some insight into how the current Apollo-core/080 can be best utilized to push polygons, I suggest you move on, as it's very tiring entertaining your hardware engineering fantasies that are grounded in 1985.
     
The best way to accomplish 3D transforms on the 080 is via the AMMX/MMX PMADDWD instruction.
     
Here are a couple links that explain the implementation in a fairly straightforward manner: 
  EXTERNAL LINK      https://www.csie.ntu.edu.tw/~b6506050/Introduction/Mapping_to_MMX_Instruction_Set/mapping_to_mmx_instruction_set.html
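
To make the idea concrete, here is a minimal sketch in Python that models the arithmetic of the x86 MMX PMADDWD instruction and uses it for one fixed-point matrix-by-vertex transform. Whether the 080's AMMX offers the same operation under the same name is taken from the statement above and not verified here. The helper names and the 8.8 fixed-point format are assumptions chosen for the example.

```python
def to_s16(v):
    """Wrap an integer into the signed 16-bit range, like a 16-bit lane."""
    v &= 0xFFFF
    return v - 0x10000 if v & 0x8000 else v


def to_s32(v):
    """Wrap an integer into the signed 32-bit range, like a 32-bit lane."""
    v &= 0xFFFFFFFF
    return v - 0x100000000 if v & 0x80000000 else v


def pmaddwd(a, b):
    """Model of x86 MMX PMADDWD on two vectors of four signed 16-bit words.

    Returns two signed 32-bit sums: (a0*b0 + a1*b1, a2*b2 + a3*b3).
    """
    a = [to_s16(v) for v in a]
    b = [to_s16(v) for v in b]
    return (to_s32(a[0] * b[0] + a[1] * b[1]),
            to_s32(a[2] * b[2] + a[3] * b[3]))


FRAC_BITS = 8  # assumed 8.8 fixed-point format for the matrix entries


def transform_vertex(matrix, vertex):
    """Multiply a 4x4 fixed-point matrix by an integer (x, y, z, 1) vertex.

    Each output component is one row-by-vertex dot product: one PMADDWD,
    one add of its two partial sums, then a shift to drop the fraction.
    """
    out = []
    for row in matrix:
        lo, hi = pmaddwd(row, vertex)
        out.append((lo + hi) >> FRAC_BITS)
    return out
```

With an identity matrix scaled by 256 and an X translation of 10 (stored as 2560), `transform_vertex` moves the vertex (1, 2, 3, 1) to (11, 2, 3, 1). All values have to fit in 16 bits, which is the main limitation of doing geometry this way.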
     
I'm sure there are better examples out there if you Google for them.  And here's a newsflash for you.  Incorporating VooDoo-like capabilities into an existing FPGA as a sub-unit of SAGA is NOT adding hardware and Gunnar isn't moving AMMX to an external chip.
 
And all this without me once calling you a name.
     


Stephan Hamers

Posts 22
22 Oct 2018 01:11


AMEN


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
23 Oct 2018 06:51


Gentlemen, this discussion went too much off-topic again.

For name calling we have now had to ban one person from the forum.


Kef Emzy

Posts 50
23 Oct 2018 08:28


Gunnar von Boehn wrote:

  For "line rasterization" we are working on a New-Subunit of SAGA which does this accelerated. You can imagine this as a built-in VOODOO.

Does that imply basic 3D acceleration and texture mapping? :c) Using the FPU for 3D geometry transformation and the "VOODOO" subunit to handle texture mapping etc?



Matthew Burroughs

Posts 59
23 Oct 2018 19:27


Gunnar von Boehn wrote:

Gentlemen, this discussion went too much off-topic again.
 
  For name calling we have now had to ban one person from the forum.

This thread is getting almost as bad as AmigaWorld.net.

Some concentrated bile that I hoped would never infest this forum.


Vojin Vidanovic
(Needs Verification)
Posts 1916/ 1
23 Oct 2018 19:35


Matthew Burroughs wrote:

  This thread is getting almost as bad as AmigaWorld.net.
 
  Some concentrated bile that I hoped would never infest this forum.

Don't worry, one bad apple is out, and I will not mess with what I don't know enough about.

Coders, proceed with ASM magic and 3D.



Steve Ferrell

Posts 424
24 Oct 2018 17:20


Kef Emzy wrote:

 
Gunnar von Boehn wrote:

    For "line rasterization" we are working on a New-Subunit of SAGA which does this accelerated. You can imagine this as a built-in VOODOO.
   

   
    Does that imply basic 3D acceleration and texture mapping? :c) Using the FPU for 3D geometry transformation and the "VOODOO" subunit to handle texture mapping etc?
   
 

 
I think you're on the right track Kef.  But AMMX can still handle most of the 3D operations and even much of the rasterization depending on how comprehensive Gunnar's implementation is.  The PMULHUW (Packed Multiply High Unsigned Word) instruction is useful in 3D rasterization because it operates on unsigned pixel values.  The MINx and MAXx instructions are useful for clamping (saturating) color values in 3D geometry and rasterization, as are PFMAX (Packed Floating-Point Maximum) and PFMIN (Packed Floating-Point Minimum). They can also be used to avoid branching.
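
A small sketch of both ideas, modelled in Python one lane at a time. `pmulhuw` follows the x86 definition of the instruction (high 16 bits of an unsigned 16x16 multiply). `packed_max`, `packed_min`, `clamp_lanes` and `shade` are hypothetical helper names for this example, and the 0.16 fixed-point intensity is an assumption rather than anything specific to AMMX.

```python
def pmulhuw(a, b):
    """Model of PMULHUW: per lane, high 16 bits of an unsigned 16x16 multiply."""
    return [((x & 0xFFFF) * (y & 0xFFFF)) >> 16 for x, y in zip(a, b)]


def packed_max(a, b):
    """Per-lane maximum, standing in for a packed MAX instruction."""
    return [max(x, y) for x, y in zip(a, b)]


def packed_min(a, b):
    """Per-lane minimum, standing in for a packed MIN instruction."""
    return [min(x, y) for x, y in zip(a, b)]


def clamp_lanes(values, lo, hi):
    """Clamp every lane to [lo, hi] with one max and one min, no branches."""
    n = len(values)
    return packed_min(packed_max(values, [lo] * n), [hi] * n)


def shade(rgb, intensity):
    """Scale 8-bit colour channels by a 0.16 fixed-point intensity.

    Channels are shifted up to 16 bits first so that the high-word
    result of PMULHUW lands back in the 0..255 range after shifting down.
    """
    wide = [c << 8 for c in rgb]  # each channel as an 8.8 value
    scaled = pmulhuw(wide, [intensity] * len(rgb))
    return [s >> 8 for s in scaled]
```

For example, `clamp_lanes([300, -20, 128], 0, 255)` gives `[255, 0, 128]`, and `shade([255, 128, 0], 0x8000)` halves each channel to `[127, 64, 0]`. In both cases every lane goes through the same instructions, so no per-pixel compare-and-branch is needed.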


Daniel Sevo

Posts 299
26 Oct 2018 22:45


@Steve Ferrell If we look at the software side of things and find an implementation that makes the most sense from a practical point of view... meaning so that Wazp3d could be properly optimized.. (also, we know we wanna run Q2 with mipmapping on this thing :-)

How would you break down a basic rendering pipeline so that it makes the best use of what we have in the 080:
1. CPU (running code)
2. FPU (geometry calc, transformations..)
3. AMMX (or do you do scale, translation, clipping, camera transform, projection etc here?)
4. "Super3dBlitter"  (Sort of old Natami specs) now seemingly containing "Voodoo" heritage so.. (Rasterization... triangle raster, depth sorting, texture filtering...)



Matthew Burroughs

Posts 59
26 Oct 2018 23:36


Shame someone can't make a short demo showing the Vampire's processing power.




Steve Ferrell

Posts 424
27 Oct 2018 00:22


@Daniel Sevo

I think Gunnar gave us a strong hint when he said "something like a built-in VooDoo".  The original generation 1 VooDoo card didn't do vertex transforms and left that to the CPU but it did do z-buffering and texture mapping.
     
Its pipeline looked like this:
     
-----------------------CPU----------------------------------------------------------|GPU---
                                                     
Vertex Transforms->Primitive Assembly->Rasterization & Interp->|-Raster Ops->Frame Buffer
                               
     
I suspect that Gunnar's pipeline will look very similar.


Daniel Sevo

Posts 299
27 Oct 2018 01:15


Steve Ferrell wrote:

@Daniel Sevo
 
  I think Gunnar gave us a strong hint when he said "something like a built-in VooDoo".  The original generation 1 VooDoo card didn't do vertex transforms and left that to the CPU but it did do z-buffering and texture mapping.
     
  Its pipeline looked like this:
     
  -----------------------CPU----------------------------------------------------------|GPU---
                                                       
  Vertex Transforms->Primitive Assembly->Rasterization & Interp->|-Raster Ops->Frame Buffer
                                 
     
  I suspect that Gunnar's pipeline will look very similar.

Well, yeah, but the "CPU" in our case is actually CPU+FPU+AMMX, right?
I don't remember the hardware requirements for running a Voodoo, but it didn't require SIMD (MMX), as people with a 486 could run it. Not sure about the FPU though... (In reality, of course, most people with a Voodoo had a Pentium I or II, not a 486, but technically... ;-)



Steve Ferrell

Posts 424
27 Oct 2018 01:44


Daniel Sevo wrote:

Steve Ferrell wrote:

  @Daniel Sevo
 
  I think Gunnar gave us a strong hint when he said "something like a built-in VooDoo".  The original generation 1 VooDoo card didn't do vertex transforms and left that to the CPU but it did do z-buffering and texture mapping.
       
  Its pipeline looked like this:
       
  -----------------------CPU----------------------------------------------------------|GPU---
                                                       
  Vertex Transforms->Primitive Assembly->Rasterization & Interp->|-Raster Ops->Frame Buffer
                                 
       
  I suspect that Gunnar's pipeline will look very similar.
 

 
  Well, yeah, but the "CPU" in our case is actually CPU+FPU+AMMX, right?
  I don't remember the hardware requirements for running a Voodoo, but it didn't require SIMD (MMX), as people with a 486 could run it. Not sure about the FPU though... (In reality, of course, most people with a Voodoo had a Pentium I or II, not a 486, but technically... ;-)
 

Yes, exactly.  So an 080 with AMMX coupled with a texture mapper and z-buffer should yield some pretty decent 3D results. 


Andrew Copland

Posts 113
27 Oct 2018 13:39


Steve Ferrell wrote:

  @Daniel Sevo
   
    I think Gunnar gave us a strong hint when he said "something like a built-in VooDoo".  The original generation 1 VooDoo card didn't do vertex transforms and left that to the CPU but it did do z-buffering and texture mapping.
       
    Its pipeline looked like this:
       
    -----------------------CPU----------------------------------------------------------|GPU---
                                                         
    Vertex Transforms->Primitive Assembly->Rasterization & Interp->|-Raster Ops->Frame Buffer
                                   
       
    I suspect that Gunnar's pipeline will look very similar.
 

 
  Your pipeline is wrong:
       
  --------------------CPU------------------|GPU--------------------------------------------------------
  Vertex Transforms->Primitive Assembly->Rasterization & Interp->|-Raster Ops->Frame Buffer
 
  It's that rasterisation & interpolation of attributes + texture operations that's expensive and is what 3DFX accelerated

EDIT: getting this shit to line up is hopeless.

3DFX pipeline responsibility sharing
CPU:
  Vertex Transforms
  Primitive Assembly
GPU:
  Rasterization & Interp
  Raster Ops
  Frame Buffer

Software using Glide on the original 3DFX Voodoo would have to do all of the transforms, clipping, etc., and decide which triangles to submit (primitive assembly) on the CPU.

The "GPU" would then rasterise those triangles individually, fetch textures and handle all mipmapping and other related operations, and blend the results of any raster ops into the onboard framebuffer.

That's not too complicated to code, but it's a LOT of operations to process which is why moving it off the main CPU, even onto a 2nd CPU is better than nothing: EXTERNAL LINK for example using the GP2X and having the 2nd ARM cpu acting as the GPU.
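
The split described above can be sketched in a few lines of Python. This is only an illustration of who does what, not how Glide or the Voodoo hardware is actually programmed: `cpu_stage` and `gpu_stage` are hypothetical names, the "transform" is a plain scale and offset instead of a full matrix, texturing is left out, and each triangle simply gets its own colour index.

```python
def edge(a, b, x, y):
    """Edge function: cross product of (B - A) and ((x, y) - A)."""
    return (b[0] - a[0]) * (y - a[1]) - (b[1] - a[1]) * (x - a[0])


def cpu_stage(vertices, indices, scale=1.0, offset=(0.0, 0.0)):
    """CPU side: vertex transforms followed by primitive assembly."""
    # Vertex transform: scale and offset stand in for a real matrix multiply.
    screen = [(x * scale + offset[0], y * scale + offset[1], z)
              for x, y, z in vertices]
    triangles = []
    for i0, i1, i2 in indices:
        a, b, c = screen[i0], screen[i1], screen[i2]
        # Primitive assembly: drop back-facing and degenerate triangles.
        if edge(a, b, c[0], c[1]) > 0:
            triangles.append((a, b, c))
    return triangles


def gpu_stage(triangles, width, height):
    """Rasteriser side: coverage, z interpolation, depth test, frame write."""
    depth = [[float("inf")] * width for _ in range(height)]
    frame = [[0] * width for _ in range(height)]
    for colour, (a, b, c) in enumerate(triangles, start=1):
        area = edge(a, b, c[0], c[1])
        for y in range(height):
            for x in range(width):
                w0 = edge(b, c, x, y)
                w1 = edge(c, a, x, y)
                w2 = edge(a, b, x, y)
                if w0 < 0 or w1 < 0 or w2 < 0:
                    continue  # pixel lies outside the triangle
                # Interpolate depth with barycentric weights.
                z = (w0 * a[2] + w1 * b[2] + w2 * c[2]) / area
                if z < depth[y][x]:  # raster op: keep the nearest surface
                    depth[y][x] = z
                    frame[y][x] = colour
    return frame, depth
```

Note how little work `cpu_stage` does per triangle compared with the per-pixel loops in `gpu_stage`, which is the part that grows with screen area.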


Steve Ferrell

Posts 424
27 Oct 2018 17:04


Andrew Copland wrote:

 
Steve Ferrell wrote:

    @Daniel Sevo
     
      I think Gunnar gave us a strong hint when he said "something like a built-in VooDoo".  The original generation 1 VooDoo card didn't do vertex transforms and left that to the CPU but it did do z-buffering and texture mapping.
         
      Its pipeline looked like this:
         
      -----------------------CPU----------------------------------------------------------|GPU---
                                                           
      Vertex Transforms->Primitive Assembly->Rasterization & Interp->|-Raster Ops->Frame Buffer
                                     
         
      I suspect that Gunnar's pipeline will look very similar.
   

   
    Your pipeline is wrong:
         
    --------------------CPU------------------|GPU--------------------------------------------------------
    Vertex Transforms->Primitive Assembly->Rasterization & Interp->|-Raster Ops->Frame Buffer
   
    It's that rasterisation & interpolation of attributes + texture operations that's expensive and is what 3DFX accelerated
 
  EDIT: getting this shit to line up is hopeless.
 
  3DFX pipeline responsibility sharing
  CPU:
    Vertex Transforms
    Primitive Assembly
  GPU:
    Rasterization & Interp
    Raster Ops
    Frame Buffer
 
  Software using Glide on the original 3DFX Voodoo would have to do all of the transforms, clipping, etc., and decide which triangles to submit (primitive assembly) on the CPU.
 
  The "GPU" would then rasterise those triangles individually, fetch textures and handle all mipmapping and other related operations, and blend the results of any raster ops into the onboard framebuffer.
 
  That's not too complicated to code, but it's a LOT of operations to process which is why moving it off the main CPU, even onto a 2nd CPU is better than nothing: EXTERNAL LINK for example using the GP2X and having the 2nd ARM cpu acting as the GPU.
 

 
Ah, yes.  You're correct on the pipeline.  Good catch.
 
Please, let's not turn this thread into another outlandish hardware wish list for more co-processors and added CPUs.  And for goodness' sake, the Vampire already runs rings around a classic Amiga graphically and computationally and there's no reason why it can't achieve rough parity graphically with a PlayStation 1.  Yes, adding a real, dedicated GPU would speed things up dramatically, no one disputes that.  But Gunnar has to work within the limitations of existing FPGAs and his current board design.  That has been explained over and over but for some reason there are those who just don't want to accept that.


Samuel Crow

Posts 424
27 Oct 2018 18:36


Steve Ferrell wrote:
Please, let's not turn this thread into another outlandish hardware wish list for more co-processors and added CPUs.  And for goodness' sake, the Vampire already runs rings around a classic Amiga graphically and computationally and there's no reason why it can't achieve rough parity graphically with a PlayStation 1.  Yes, adding a real, dedicated GPU would speed things up dramatically, no one disputes that.  But Gunnar has to work within the limitations of existing FPGAs and his current board design.  That has been explained over and over but for some reason there are those who just don't want to accept that.

The 68080 already has a second thread for blitter emulation.  Adding to its duties should not hurt much.
