

68080 Vs Wazp3D

Daniel Sevo

Posts 299
01 Sep 2017 14:39


Since Alain's Wazp3D code is not ancient but relatively modern, what feature, present or currently missing, in the 68080 would be of most benefit for an easy optimization of Wazp3D?
I know Alain has said before that he has no time to put into this, but if the core had features that made some of the heavier routines easy to optimize, it might attract some other talented programmer.

My wild guess would be some more SSE-like features (or AMD 3DNow!), but could it be done in some other way that the 68080 dev team would find "elegant" and doable without too much sweat?

Speaking of 3d...
From the old Natami specs:
----
    * BLITTER:
      Fully compatible with AGA-AMIGA
      Speed of SuperBlitter is many times faster than AMIGA AGA blitter
      comparisons will follow.
    * 2D Enhancements:
      In Chunky/Hicolor/ or Truecolor pixel mode the Super Blitter can "cookie cut" copy bobs.
      Like the AMIGA Sprites, a "cookie cut" copy needs no blitting mask.
      This enabled the blitter to use less memory and to work at double speed compared to planar blits.

    * 3D Accelerator:
      Texturemapping Blitter enhancement.
      Provides Mipmapping, Subpixel 4 way interpolation, Light sources, Antialiasing
---

We can see here that there was a clear idea of expanding the blitter to do 3D work. I wonder, is that still one of the goals that survived the test of time, or would it be wiser to simply go for some sort of Warp3D implementation and let the 68080 do the "heavy lifting"?


Andrew Copland

Posts 113
01 Sep 2017 14:51


There was a plan for "Tami" to do 3D, it would have been a separate chip essentially. We wrote an early "emulated" version of the design but it was never made into hardware.

I have some of the code somewhere, I'm sure, including a very early version of the command processor that set up internal state.

All done in 'C' though, as the others were hard at work on the N68050, which in turn became the Apollo/68080 all these years later.


Vojin Vidanovic

Posts 770
01 Sep 2017 17:45


Daniel Sevo wrote:

Since Alain's Wazp3D code is not ancient but relatively modern, what feature, present or currently missing, in the 68080 would be of most benefit for an easy optimization of Wazp3D?

So, a Wazp3D optimized for the 080 and SAGA RTG. You mean, what is not AMMX accelerated? If something is AMMX accelerated, Wazp3D should use it. We need it to run what little 3D Amiga 68k has.



Daniel Sevo

Posts 299
01 Sep 2017 19:24


Vojin Vidanovic wrote:

Daniel Sevo wrote:

  Since Alain's Wazp3D code is not ancient but relatively modern, what feature, present or currently missing, in the 68080 would be of most benefit for an easy optimization of Wazp3D?
 

 
  So, a Wazp3D optimized for the 080 and SAGA RTG. You mean, what is not AMMX accelerated? If something is AMMX accelerated, Wazp3D should use it. We need it to run what little 3D Amiga 68k has.
 

Probably the other way around. Yes, Wazp3D could probably be partially accelerated by AMMX, but my question was really: what other features could the devs add to the 68080 core to really make Wazp3D fly? On WinUAE, Wazp3D can use some hardware acceleration.



Vojin Vidanovic

Posts 770
01 Sep 2017 19:32


Daniel Sevo wrote:

    Probably the other way around. Yes, Wazp3D could probably be partially accelerated by AMMX, but my question was really: what other features could the devs add to the 68080 core to really make Wazp3D fly? On WinUAE, Wazp3D can use some hardware acceleration.
 

 
It is still the same: an optimized Wa(z)rp3D should be one of the team's 2018 software goals (since no one else knows how to code for the 080).

Don't look only at the CPU; the SAGA chipset should be able to do some work too, once core 3 is out.

It would be best if Warp3D could be hardware accelerated overall by SAGA/080, skipping Wazp3D altogether, or implementing it in ROM or as a transparent layer if access to the real Warp3D is limited due to licensing.

There are not many Warp3D apps, but that is the last part to be conquered after AGA is done. It is strange to have RTG, which kind of came almost last, and then AGA and Warp3D, but these are the Vampire ways.

If donations are needed, I am for it.

Note that StormMESA has an AGA driver; once core 3 goes beta, someone might test it.

 


Thellier Alain

Posts 141
02 Sep 2017 08:41


Hello

Most has already been said in the other thread.
Also, Norbert Kett & Andreas Timmermann have worked on MiniGL and Zbuffer.

IMHO, what needs to be done as fast as possible is:

1) Access pixels as fast as possible, in a flexible manner, so that the same code works on 15/16/24/32-bit screens/textures
So something like
RGBA32a=ReadPixel(texture0,u,v);
RGBA32b=ReadPixel(texture1,u1,v1);
WritePixel(bitmap,x,y,RGBA32);
with u,v,x,y as 16.16 fixed-point numbers

2) Mix/blend colors as fast as possible
RGBA32c=mix(RGBA32a,RGBA32b,operator);
with operators like src_alpha/one_minus_alpha and all the other GL combinations

3) Linearly interpolate some values as fast as possible
Something like
LONG x,y,z,u0,v0,w0,u1,v1,w1,r,g,b,a;
LONG dx,dy,dz,du0,dv0,dw0,du1,dv1,dw1,dr,dg,db,da;

loop over the pixel count
{
x=x+dx;
y=y+dy;
z=z+dz;
u0=u0+du0;
v0=v0+dv0;
w0=w0+dw0;
r=r+dr;
g=g+dg;
b=b+db;
a=a+da;
u1=u1+du1;
v1=v1+dv1;
w1=w1+dw1;
process_pixel();
}
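
A minimal C sketch of points 1 and 3 together, assuming 32-bit RGBA textures only and simple truncating point sampling. The names `fixed16`, `FIX_ONE`, `Texture32`, `read_texel` and `draw_span` are hypothetical and chosen for illustration; they are not part of Wazp3D or of any Apollo API. Unlike the loop above, this version samples first and steps afterwards.

```c
#include <stddef.h>
#include <stdint.h>

/* 16.16 fixed point: upper 16 bits integer part, lower 16 bits fraction. */
typedef int32_t fixed16;
#define FIX_ONE 65536

/* Hypothetical 32-bit texture description (assumption for this sketch). */
typedef struct {
    const uint32_t *texels;
    int width;
    int height;
} Texture32;

/* Fetch one texel for 16.16 coordinates: drop the fraction and clamp
   the integer part to the texture edges. */
uint32_t read_texel(const Texture32 *t, fixed16 u, fixed16 v)
{
    int ui = (u < 0) ? 0 : (int)(u >> 16);
    int vi = (v < 0) ? 0 : (int)(v >> 16);
    if (ui >= t->width)  ui = t->width - 1;
    if (vi >= t->height) vi = t->height - 1;
    return t->texels[(size_t)vi * (size_t)t->width + (size_t)ui];
}

/* Fill `count` destination pixels, stepping u and v by constant
   16.16 deltas after each pixel. */
void draw_span(uint32_t *dst, int count, const Texture32 *t,
               fixed16 u, fixed16 v, fixed16 du, fixed16 dv)
{
    for (int i = 0; i < count; i++) {
        dst[i] = read_texel(t, u, v);
        u += du;
        v += dv;
    }
}
```

Calling `draw_span(out, 4, &tex, 0, 0, FIX_ONE / 2, FIX_ONE / 2)` on a 2x2 texture walks its diagonal in half-texel steps. The per-pixel cost is dominated by the address calculation and the additions, which is the part a dedicated instruction would have to cover.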

Alain Thellier - Wazp3D




Daniel Sevo

Posts 299
03 Sep 2017 22:41


Thank you Alain,
  Yes now that you mention it I even commented on that thread myself some months ago. ;-)
 
  The reason I bring it up again is the recent announcement of the Vampire 4 with a bigger Cyclone 5 FPGA, meaning that not only will the FPU fit, there is also room for new instructions/extensions (I hear they are now talking about AMMX2). So I thought it would be useful to know how to create the best conditions for optimizing Wazp3D for the Apollo core.
  It may well turn out that it's easier to add new useful instructions/extensions to the CPU than to spend the time optimizing the code for the *current* feature set of the CPU.
 


Vojin Vidanovic

Posts 770
03 Sep 2017 22:57


Daniel Sevo wrote:

    It may well turn out that it's easier to add new useful instructions/extensions to the CPU than to spend the time optimizing the code for the *current* feature set of the CPU.

So could Wazp3d for SAGA be "hardcoded" in the FPGA? :-)



Thellier Alain

Posts 141
04 Sep 2017 10:17


>Wazp3d for SAGA be "hardcoded" in the FPGA? :-)
<Conditional mode. I don't have a Vampire. I am not in the Apollo team. Nothing has been done yet.>
Not really: it would still be a 680x0 program, but with some new fast 68080 instructions it could be much faster than a plain 680x0 Wazp3D, so it could perhaps allow some simple 3D programs on the Vampire.

Alain


Vojin Vidanovic

Posts 770
04 Sep 2017 11:19


thellier alain wrote:

  <Conditional mode. I don't have a Vampire. I am not in the Apollo team. Nothing has been done yet.>
  Not really: it would still be a 680x0 program, but with some new fast 68080 instructions it could be much faster than a plain 680x0 Wazp3D, so it could perhaps allow some simple 3D programs on the Vampire.

I know; I hope the V4 will change that. Anything is a start. I hope one day you can do an AMMX2 080 fork of Wazp3D, so we can enjoy what is left of 68k 3D and at least have some start towards OpenGL.


Gunnar von Boehn
(Apollo Team Member)
Posts 6197
04 Sep 2017 13:35


Daniel Sevo wrote:

Probably the other way around. Yes, Wazp3D could probably be partially accelerated by AMMX,

It's mainly rasterizer software.
AMMX can be put to great use for this.

AMMX has special instructions for texture / alpha blending, which accelerate this by a factor of 30-40 compared to normal 68K code.
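
For reference, the scalar work that a src_alpha / one_minus_src_alpha blend performs per pixel can be sketched in plain C as below. This is not AMMX code and makes no claim about how the AMMX instructions behave; the function name `blend_src_alpha` and the 0xAARRGGBB pixel layout are assumptions made for illustration.

```c
#include <stdint.h>

/* Blend src over dst using the source alpha:
   out = (src * a + dst * (255 - a)) / 255, per colour channel.
   Assumed pixel layout for this sketch: 0xAARRGGBB. */
uint32_t blend_src_alpha(uint32_t src, uint32_t dst)
{
    uint32_t a   = src >> 24;     /* source alpha, 0..255 */
    uint32_t inv = 255u - a;      /* one minus source alpha */
    uint32_t out = 0;

    /* Process blue, green and red (bit offsets 0, 8, 16). */
    for (int shift = 0; shift < 24; shift += 8) {
        uint32_t s = (src >> shift) & 0xFFu;
        uint32_t d = (dst >> shift) & 0xFFu;
        uint32_t c = (s * a + d * inv + 127u) / 255u;  /* rounded divide */
        out |= c << shift;
    }

    /* Keep the destination alpha unchanged; other conventions exist. */
    return out | (dst & 0xFF000000u);
}
```

Each pixel costs three multiply-add pairs plus three divisions when written this way, which is why blending is a natural target for packed (SIMD-style) instructions that handle all channels at once.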




Vojin Vidanovic

Posts 770
04 Sep 2017 14:02


Gunnar von Boehn wrote:

  AMMX has special instructions for texture / alpha blending, which accelerate this by a factor of 30-40 compared to normal 68K code.

And a fast SoftW3D would be great. Hope you will put it on the to-do list :-)


Gunnar von Boehn
(Apollo Team Member)
Posts 6197
04 Sep 2017 14:19


Vojin Vidanovic wrote:

And a fast SoftW3D would be great. Hope you will put it on the to-do list :-)

We are right now finishing the AMMX2 instruction documentation.

And we can provide you people with coding examples, like the Dragon Crown, to show how to code stuff which was completely impossible on AMIGA before.

But as Jay Miner did not write each AMIGA game for you in person - we will not write every piece of software for you.


Thellier Alain

Posts 141
04 Sep 2017 14:19


As mentioned in the other thread, everything in AMMX or the 68080 that can make the Cow3D example program faster, and especially its FillPoly() & DrawEdge() functions, will be profitable for 3D support.

Note that in a real 3D program the number of fields to interpolate linearly may change depending on the states (textured, gouraud, multitextured, etc.)
x y z u v w = 6 fields
x y z u0 v0 w0 u1 v1 w1 r g b a  = 13 fields
etc...
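
One way to handle a field count that changes with the render state is to keep the values and their deltas in arrays and step only the active ones. The sketch below assumes 16.16 fixed-point values; `Interpolator`, `MAX_FIELDS` and `interp_step` are hypothetical names used for illustration and do not come from Wazp3D.

```c
#include <stdint.h>

#define MAX_FIELDS 13   /* x y z u0 v0 w0 u1 v1 w1 r g b a */

typedef struct {
    int     count;               /* number of active fields, e.g. 6 or 13 */
    int32_t value[MAX_FIELDS];   /* current values, 16.16 fixed point */
    int32_t delta[MAX_FIELDS];   /* per-pixel increments, 16.16 fixed point */
} Interpolator;

/* Advance every active field by one pixel; inactive fields are left alone. */
void interp_step(Interpolator *it)
{
    for (int i = 0; i < it->count; i++)
        it->value[i] += it->delta[i];
}
```

With `count` set to 6 only x y z u v w are stepped; with 13 the second texture and the colour channels are stepped as well. Because the loop body is a plain element-wise add over two arrays, it is the kind of work that wide registers could process several fields at a time.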

Alain Thellier - Wazp3D


Vojin Vidanovic

Posts 770
04 Sep 2017 14:33


Gunnar von Boehn wrote:
 
      But as Jay Miner did not write each AMIGA game for you in person - we will not write every piece of software for you.
   

   
    I am glad the documentation will finally be out. Yes, you should focus on the 080 and the SAGA/drivers pack, but you need to provide both docs and some higher-level programming language that can exploit it.

    I do understand what you say, and if you provide a Vampire to Alain, I am willing to donate towards an 080-optimized Warp3D. Such an approach should be taken for critical things, with some bounty system set up, like e.g. Power2People for AROS.

    The same goes for AROS 68k, Linux 68k, TOS/MiNT and Mac Classic, or any surviving 68k apps that could be Vampire optimized (e.g. YAM, Final Writer etc.).
    These are all great 080 options that could run natively, but without the above-mentioned support they are highly unlikely to happen. If some approach is not found, we might end up like the A1s: great hardware on paper with little software to exploit it.

    It would be best if the team did the few needed compatibility things, in the general interest, simply because that is likely to yield the highest performance. But I do understand there is a lot to do for the V4 and core 3.
