
Welcome to the Apollo Forum

This forum is for people interested in the APOLLO CPU.



Performance and Benchmark Results!

X86 Power !

Gunnar von Boehn
(Apollo Team Member)
Posts 6207
10 Nov 2019 19:17


Tim Trepanier wrote:

My problem is that I've been waiting for years to put a Vampire with Gold 3 core in my A1200 and I'm out of patience.

I fully understand your wish.
But one thing you want has NOTHING to do with the other.

You wait for the Vampire1200.
The people working on the Apollo CPU core have ZERO to do with the production  of the Vampire1200 - these are different people.




Jim Drew
Learn who I am!
Posts 67/ 1
10 Nov 2019 22:23


Vojin Vidanovic wrote:

  It's great if Jim is 100% ASM. If someone taught him the 080 instructions, or if he could use VASM, he could produce improved code for the existing "Fusion 3.x/PC.x" and see whether any speed improvement happens.
 
  ‘-m68080’
  Generate code for the Apollo Core AC68080 FPGA CPU.

 
I already know the special instructions, extended registers, etc.  I have tried these over the last few years and they really made very little difference, because the pipelining and cache work so well that even things like chunky to planar conversion didn't have any speed increase when using extra registers and such.
 
I only use Devpac 3.0.  The reason is that this is the only package with a debugger that I can run in and out of Supervisor mode.  The Mac runs in Supervisor mode (always).  I have code like device drivers that run in both Supervisor and User modes.  I have to be able to transition between both modes without hanging the machine.
 
 


Geoff Wells

Posts 43
10 Nov 2019 23:26


Gunnar von Boehn wrote:

Tim Trepanier wrote:

  My problem is that I've been waiting for years to put a Vampire with Gold 3 core in my A1200 and I'm out of patience.
 

 
  I fully understand your wish.
  But one thing you want has NOTHING to do with the other.
 
  You wait for the Vampire1200.
  The people working on the Apollo CPU core have ZERO to do with the production  of the Vampire1200 - these are different people.
 
 

@Tim - I understand and am sorry to hear about your declining health.  I've been following every post back before the end of the Natami forum.  While it has been a very long road I find the ongoing dedication impressive.

To Gunnar's point, I think it is easy to group all of the separate initiatives together as one since the teams are very small.  From the outside looking in, I think there are people working on the 080 core, the SAGA/Amiga chipset, the boards, cases and integrations into different platforms.  As far as I can tell no one is really looking at Atari/MAC/x86 chipsets/cores and the improvements Gunnar is talking about are just good chip design for modern CPUs which must be done to deliver a core superior to the existing Minimig/MISTer cores.

This has been a monumental undertaking for a few hobbyists which is very close to the finish line.  The stand alone with full Amiga capabilities is in final debug which enables all of the accelerator boards.  From what I can see they are focused on the Amiga side while answering some questions for other teams leveraging the great work for other projects like EmuTOS, etc.  I don't think we can begrudge this minimal time spent supporting other projects given how much has been dedicated to Amiga for minimal compensation.

All of the above are my own guesses and I apologize in advance if I got anything wrong.  Regardless of whatever the situation, thanks to the team for the continued effort.  Can't wait for my V4.


Vojin Vidanovic
(Needs Verification)
Posts 1916/ 1
11 Nov 2019 07:05


Jim Drew wrote:

    I already know the special instructions, extended registers, etc.  I have tried these over the last few years and they really made very little difference, because the pipelining and cache work so well that even things like chunky to planar conversion didn't have any speed increase when using extra registers and such.
 

 
  Nice to hear. Please release both "very little diff" versions (PCx and Fusion). Keep it up. A few percent is a plus. Add P96 PiP support if possible, and that would be enough for you to claim it is Vampire optimised.

If you could use some of the "tricks" Gunnar pointed out to "less emulate" some x86 functions, that could help too (like Fusion, which has no CPU emulation).
 


Jim Drew
Learn who I am!
Posts 67/ 1
11 Nov 2019 07:32


Vojin Vidanovic wrote:
   
      Nice to hear. Please release both "very little diff" versions (PCx and Fusion). Keep it up. A few percent is a plus. Add P96 PiP support if possible, and that would be enough for you to claim it is Vampire optimised.

 
Well, then I am done.  :)  The differences were about 0.1% in benchmarks.  I had to double-check the executable code to make sure I was really running it and not the original version.  I expected at least a few percent in PCx, but it didn't happen.  I expect 0% with FUSION because it's a 68K program, and the video drivers already draw directly into frame buffer for 256 color mode (which is the mode that almost every game switches to when running).  If Cybergraphics was supported, or one of the high-end video cards (Picasso IV, Piccolo SD64, or Cybervision 64) was emulated by the Vampire, then those would have all of the modes drawn directly into the frame buffer.  The Mac itself is not the limiting factor, it's the video card that is.  You need a card that has 15 bit, 24 bit, and 32 bit (24bit+Alpha channel) video modes in Apple's bitmap layout format in order to draw directly into the frame buffer.  Otherwise, you have to do it with the "refresh" driver.  That driver is about 20x faster when a real MMU is available to be able to determine when data is written into the frame buffer regions (4K blocks).
 
I did look at my video driver source code and see that there is actually 1920x1080 already supported, but it requires a video card that has 8MB on it for more than 256 colors.  We had a couple of prototypes that were sent to us.  These were never released, but the code was kept in place just in case that ever happened.  I am not sure what type of support the SAGA driver has, or if it supports the private vectors that Alex and Tobias added for our Mac emulation.
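
As a rough check on the memory figure above, the frame buffer size for 1920x1080 can be worked out directly. This is a minimal sketch that assumes tightly packed pixels with no row padding and 1MB = 1,048,576 bytes; the function name `framebuffer_bytes` is made up for illustration and is not part of any driver.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes needed for one frame, assuming packed pixels and no row padding.
 * bytes_per_pixel: 1 for 8 bit, 2 for 15 bit, 3 for 24 bit, 4 for 32 bit. */
uint64_t framebuffer_bytes(uint32_t width, uint32_t height,
                           uint32_t bytes_per_pixel)
{
    assert(bytes_per_pixel >= 1 && bytes_per_pixel <= 4);
    return (uint64_t)width * height * bytes_per_pixel;
}
```

Under those assumptions, 1920x1080 needs 2,073,600 bytes at 8 bit, 4,147,200 at 15 bit, 6,220,800 at 24 bit and 8,294,400 at 32 bit. The 24 and 32 bit modes exceed 4MB (4,194,304 bytes) but fit within 8MB (8,388,608 bytes).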
 


Vojin Vidanovic
(Needs Verification)
Posts 1916/ 1
11 Nov 2019 07:35


There is an Apollo MMU, but it is not 68k-compatible, and some kind of wrapper would be too much "investment".

Later on, one day, Gunnar might consider adding an m68k-compatible MMU to the V4.

He already said a Voodoo-like 3D core will come to the V4 later, so the next Vampire-optimised version could come then.


Jim Drew
Learn who I am!
Posts 67/ 1
11 Nov 2019 07:39


I read that the MMU can only watch one memory location.  That is not going to help at all.  I need the MMU to watch 4MB of memory in 4K blocks and set a 'dirty' bit when something writes to that memory.  Once the changed block is identified, I then compare that region against the last contents and then convert the data (to whatever bitmap order the video card uses) and store it.

If the Apollo MMU could be made to at least watch a "range" of memory (like the 4MB block) by using a SOURCE/DEST register or something, then that would speed up the refresh driver a lot when there was nothing going on (which is actually a lot of the time).  Without a way to identify when the video memory has been changed, the refresh driver has to compare the entire contents of the current screen to the last screen, at the selected refresh rate (30fps is the default).  That's a LOT of data to check, especially because it's only 15bit/24bit/32bit modes that have this requirement.  8 bit (256 color) draws directly into the frame buffer memory so no refreshing is required, and that of course is extremely fast.
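
The refresh scheme described above can be sketched in C roughly as follows. This is only an illustration under stated assumptions, not the FUSION or PCx driver code: the dirty flags are a plain array that something else (an MMU, in the real case) would have to set, `convert_copy` is a placeholder for the real bitmap-order conversion, the converted output is assumed to have the same size and offsets as the source, and all names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum {
    BLOCK_SIZE  = 4096,                   /* 4K blocks, as in the post */
    WATCH_SIZE  = 4 * 1024 * 1024,        /* 4MB watched region */
    BLOCK_COUNT = WATCH_SIZE / BLOCK_SIZE /* 1024 blocks */
};

/* Hypothetical stand-in for MMU dirty bits: one flag per 4K block. */
typedef struct {
    uint8_t dirty[BLOCK_COUNT];
} dirty_map;

/* Converts one block into the video card's bitmap order. */
typedef void (*convert_fn)(uint8_t *dst, const uint8_t *src, size_t len);

/* Placeholder converter: a real one would reorder the pixel data. */
void convert_copy(uint8_t *dst, const uint8_t *src, size_t len)
{
    memcpy(dst, src, len);
}

/* Compare one block with its last known contents; convert it if changed. */
static size_t refresh_block(const uint8_t *frame, uint8_t *shadow,
                            uint8_t *video, size_t off, convert_fn convert)
{
    if (memcmp(frame + off, shadow + off, BLOCK_SIZE) == 0)
        return 0;                                   /* nothing changed */
    convert(video + off, frame + off, BLOCK_SIZE);
    memcpy(shadow + off, frame + off, BLOCK_SIZE);  /* remember contents */
    return 1;
}

/* With dirty flags: only flagged blocks are compared and converted. */
size_t refresh_dirty(dirty_map *map, const uint8_t *frame, uint8_t *shadow,
                     uint8_t *video, size_t size, convert_fn convert)
{
    assert(size % BLOCK_SIZE == 0 && size <= (size_t)WATCH_SIZE);
    size_t converted = 0;
    for (size_t b = 0; b < size / BLOCK_SIZE; b++) {
        if (!map->dirty[b])
            continue;                               /* untouched: skip */
        map->dirty[b] = 0;
        converted += refresh_block(frame, shadow, video,
                                   b * BLOCK_SIZE, convert);
    }
    return converted;
}

/* Without dirty flags: every block is compared on every refresh. */
size_t refresh_full(const uint8_t *frame, uint8_t *shadow, uint8_t *video,
                    size_t size, convert_fn convert)
{
    assert(size % BLOCK_SIZE == 0);
    size_t converted = 0;
    for (size_t off = 0; off < size; off += BLOCK_SIZE)
        converted += refresh_block(frame, shadow, video, off, convert);
    return converted;
}
```

When the screen is idle, `refresh_dirty` does almost no work, while `refresh_full` still compares the whole buffer at every refresh, which is the cost described above.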



Vojin Vidanovic
(Needs Verification)
Posts 1916/ 1
11 Nov 2019 07:44


OK, if you release an 080 version of the updated emulators, I am in. I hope this also means some bugfixing in the future (if needed) and, if it goes well, some even newer versions.

I hope the section will be reopened on the forum.

And that this experience and info will help you optimise further and release other Amiga software in Vampire versions.

It seems the kind of limits you face will remain for V2 cores in the future, while for the V4 you can negotiate what is needed in the FPGA to help; if it is possible and cost-effective (in FPGA LE space as well), the team might consider it. That's the beauty of it :) Ah, if only the A1 series had a Cyclone on board :)


Jim Drew
Learn who I am!
Posts 67/ 1
11 Nov 2019 07:53


BTW, I use the MMU (when available) with PCx as well to handle the video refresh.  It would be a lot faster if it didn't spend a huge part of the CPU time refreshing the screen constantly.

So Gunnar... look into making your MMU so it can watch a range of addresses, allowing you to set a START/END.  There would also need to be a register you can check to determine if something WROTE into that range.  Reading I don't care about, as that is handled automatically.  Having this capability would really speed up PCx and FUSION (and anything else that took advantage of this option) for video refreshes.
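
To make the request concrete, here is a small software model of the kind of interface being asked for. It is purely hypothetical: it does not describe any existing Apollo MMU register, and the names `range_watch`, `watch_on_write` and `watch_test_and_clear` are invented for this sketch. In hardware, the step modelled by `watch_on_write` would happen automatically on every write.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a START/END write watch. */
typedef struct {
    uint32_t start;  /* first watched address (inclusive) */
    uint32_t end;    /* end of watched range (exclusive) */
    bool     wrote;  /* set when a write lands inside the range */
} range_watch;

void watch_init(range_watch *w, uint32_t start, uint32_t end)
{
    assert(start < end);
    w->start = start;
    w->end   = end;
    w->wrote = false;
}

/* What the hardware would do on each write; reads are ignored entirely. */
void watch_on_write(range_watch *w, uint32_t addr)
{
    if (addr >= w->start && addr < w->end)
        w->wrote = true;
}

/* What the refresh driver would poll: returns the flag and clears it. */
bool watch_test_and_clear(range_watch *w)
{
    bool hit = w->wrote;
    w->wrote = false;
    return hit;
}
```

A refresh driver could call `watch_test_and_clear` once per frame and skip the full screen comparison whenever it returns false.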



Vojin Vidanovic
(Needs Verification)
Posts 1916/ 1
11 Nov 2019 07:58


Jim Drew wrote:

  BTW, I use the MMU (when available) with PCx as well to handle the video refresh.  It would be a lot faster if it didn't spend a huge part of the CPU time refreshing the screen constantly.
 

 
  Use P96/RTG or directly hit the SAGA chipset (v4 only/limited test with gold3 test on v2).
 
  Any chance of SB16/GUS/AWE32 support in future versions? I like HQ sound.
 


Jim Drew
Learn who I am!
Posts 67/ 1
11 Nov 2019 08:06


I already support P96 with PCx, and it draws directly for MODEX/MODE13 when the video card supports it. Again, it's all about the video card driver for P96 or Cybergraphics.

You can't do true 16 bit DAC stereo audio with PAULA, and AHI takes so much processing time that it kills performance of the emulation.  The issue is that the PC's memory is backwards (little endian) and so all of the data for DMA'ing has to be converted for the DMA buffer to work.  If the AHI device (Toccata or whatever) could handle the data in reverse endian then that would work, but none of the devices can.  It was a stretch to make SB support at all.
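
The conversion cost mentioned above comes from swapping the two bytes of every 16-bit sample before it goes into the DMA buffer. A minimal sketch of that step, with hypothetical function names and no claim to match the actual PCx code:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Swap the two bytes of one 16-bit value (little endian <-> big endian). */
static inline uint16_t swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Convert a buffer of 16-bit samples from one byte order to the other. */
void swap_samples16(uint16_t *dst, const uint16_t *src, size_t count)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < count; i++)
        dst[i] = swap16(src[i]);
}
```

Every sample has to pass through a loop like this on the CPU, which is why a device that accepted the data in reverse byte order would remove the overhead.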



Vojin Vidanovic
(Needs Verification)
Posts 1916/ 1
11 Nov 2019 08:09


Take a look at the Vampire SAGA video driver and see if you can optimise for it.

Likewise, Gold 3 and the V4 have the 16-bit Pamela. Try driving it directly to emulate the SB, or make an MHI-type hardware-assisted AHI driver that uses Pamela instead of the CPU.
Gunnar said the 080 has the same endianness, so drop the translation in the 080 version, if he confirms it.

Finally, a launcher mode would be welcome: create a RAD: with minimal boot files, restart the Vampire, and let the emulator take over and boot its previously chosen OS.

Not now, in the future.


Jim Drew
Learn who I am!
Posts 67/ 1
11 Nov 2019 08:49


Vojin Vidanovic wrote:

Take a look at the Vampire SAGA video driver and see if you can optimise for it.

The source code is available?  If so, where?  I could at least look at it to see what might be missing.




Vojin Vidanovic
(Needs Verification)
Posts 1916/ 1
11 Nov 2019 11:03


Jim Drew wrote:

    The source code is available?  If so, where?  I could at least look at it to see what might be missing.
 

 
  Get on IRC with the team and become a pro Vampire developer.
 
  For the V2, do just 080 integer (or V2 FPU precision) with no MMU, and do your best for the V4 full-FPU/PMMU versions.
 
  Good luck and fine results! Whatever you can develop and support, we are here to buy.
 


Nixus Minimax

Posts 416
11 Nov 2019 13:19


Jim Drew wrote:

Gunnar von Boehn wrote:
Also FUSION is NOT an emulator.
 

 
  That is not true.  The *only* thing not being emulated with FUSION is the CPU core.  Everything else is emulated.  The Mac has a custom chipset, just like the Amiga, and those chips have to be emulated at a hardware level to be fully compatible.

Um, you mean that Fusion and Shapeshifter are not just patching MacOS to run on alien hardware but somehow magically intercept writes to hardware registers that aren't there? How would that even work in detail?



Samuel Devulder

Posts 248
11 Nov 2019 15:52


Just a note about what is possible with SAGA:
 
1) You can allocate some memory and set up the SAGA registers so that the display shows exactly the piece of memory you've allocated. This way, there is no need for MMU tricks, memory comparison, or copying. Everything written to that chunk of memory will be visible on screen. This is very fast and very easy to program (much simpler and faster than P96/Cybergfx).
 
2) Furthermore, it is possible to program some other registers so that the chunk of memory is displayed as a rectangle on the WB, for example in front of a simple window you created (this is called Picture in Picture or PiP). This is a very cool feature too.



Stefano Briccolani

Posts 586
11 Nov 2019 17:49


Gunnar von Boehn wrote:

  You wait for the Vampire1200.
  The people working on the Apollo CPU core have ZERO to do with the production  of the Vampire1200 - these are different people.
 

Hi Gunnar,
Many of us are waiting for the v1200 (and v4 too, of course). And this forum covers all Vampire products, as we can see from the "products" tag on top of the forum. Saying that the Apollo team has zero involvement with Igor's work appears a bit strange to me. A team usually works together; obstacles can be overcome by teamwork.



Nixus Minimax

Posts 416
11 Nov 2019 17:58


Stefano Briccolani wrote:
Saying that the Apollo team has zero involvement with Igor's work appears a bit strange to me.

That's not what he said. Reread his statement.



Jim Drew
Learn who I am!
Posts 67/ 1
11 Nov 2019 19:04


Nixus Minimax wrote:
  Um, you mean that Fusion and Shapeshifter are not just patching MacOS to run on alien hardware but somehow magically intercept writes to hardware registers that aren't there? How would that even work in detail?
 

Yes, that is correct.  The dual 6522 VIAs, 85C30, 53C80, ADB, ASC, SWIM, NuBus card slots, etc. are all emulated.  You have to in order to be compatible.  Originally, I created an EMPLANT board for this.  If you use an EMPLANT board with FUSION (you can select it in the setup menu), the Mac emulation is even faster because the hardware is real at that point and does not have to be emulated.




Jim Drew
Learn who I am!
Posts 67/ 1
11 Nov 2019 19:09


Samuel Devulder wrote:

Just a note about what is possible with SAGA:
 
1) You can allocate some memory and set up the SAGA registers so that the display shows exactly the piece of memory you've allocated. This way, there is no need for MMU tricks, memory comparison, or copying. Everything written to that chunk of memory will be visible on screen. This is very fast and very easy to program (much simpler and faster than P96/Cybergfx).

That would be great if the Mac's video buffer were in some nice, normal RGB-type configuration.  It is, but only in 8 bit (256 color) mode.  The 15/24/32 bit modes all use Apple's bitmap color ordering.

Samuel Devulder wrote:
 
  2) Furthermore, it is possible to program some other registers so that the chunk of memory is displayed as a rectangle on the WB, for example in front of a simple window you created (this is called Picture in Picture or PiP). This is a very cool feature too.

Yeah, FUSION already has a driver called P96PIPVideo that does this using the P96 system.

