

GCC Improvement for 68080

Mr Niding

Posts 459
06 Aug 2019 11:57


Gildo Addox wrote:

@Gunnar
 
  Well then, how important is this task? What is your opinion about how important an optimized GCC for 68k (and the Apollo) is?
 
  Is it a game changer?

It's a game changer in the sense that there are PC developers who don't have the time or the inclination to learn ASM in order to port their programs (or other programs) to Apollo/Vampire.


Stefano Briccolani

Posts 586
06 Aug 2019 12:43


Having an optimized C compiler is really important for the entire Amiga community. A LOT of software is written in C and could be ported with better results. Having an AMMX-compliant C compiler would boost Vampire-tuned software and put an end to the "there's no Vampire-enhanced software" discussions started on forums by Vampire bashers.

In a perfect world Bebbo would be an Apollo team member.


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
06 Aug 2019 13:00


Stefano Briccolani wrote:

Having an AMMX-compliant C compiler

I think that a compiler generating AMMX SIMD code from normal code is a dream scenario that is unlikely to be reached anytime soon.

A lot more realistic is to improve the generated 68K code in general.
I think GCC could produce code which is a little faster and a little smaller. I think we are talking here about a goal of a 20% speed gain.
This is a lot and will make games like Quake run better.
But it's not a game changer which will make modern PC games run on AMIGA.




Gildo Addox

Posts 31
06 Aug 2019 13:46


If it's not a game changer, I doubt that someone will make the effort to build an optimized GCC, unless she/he gets paid.


Roy Gillotti

Posts 517
06 Aug 2019 15:30


Gildo Addox wrote:

    If it's not a game changer, I doubt that someone will make the effort to build an optimized GCC, unless she/he gets paid.
   

    Well, Bebbo has been optimizing a 68080 version of GCC in his toolchain. It's essentially what the majority of this 17-page discussion has been about. It's just not AMMX-optimized.


Kamelito Loveless

Posts 260
06 Aug 2019 16:42


An intermediate solution could be highly optimized, hand-written 68080 libraries that could be called from any language where speed matters.


Mike Kopack

Posts 268
06 Aug 2019 17:53


Kamelito Loveless wrote:

  An intermediate solution could be highly optimized, hand-written 68080 libraries that could be called from any language where speed matters.
 

 
  Yeah agreed.
 
  Honestly, I would have been happy with something that generates halfway decent 68080 code, even if it's not optimal, so that we would at least have a starting point to work from and to enhance and evolve over time. Yes, it's a very big undertaking for one person, and the community desiring it is very small. But if we consider the Vampire as a new Amiga platform for the future, eventually we need to have this sort of capability. Otherwise we will never do anything more than run 040/060 code, and never see any of the real architectural benefits that let the 080 really stretch its legs.
 
  It's like running 8086 code on a 486: it works and yes, it's faster, but not as fast as it could be. When we talk about wanting more modern stuff like web browsers and more advanced games to be ported, we need all the cycles we can get to bring us as close to modern PC performance as possible, even if that's still way behind a modern Core i9...


Stefano Briccolani

Posts 586
06 Aug 2019 17:58


I agree with you 100% Mike


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
06 Aug 2019 18:16


Let me clarify something about AMMX.
Today all CPUs have SIMD instruction sets.
For example Motorola AltiVec, Intel SSE, ARM NEON, and now also the Amiga with AMMX.
 
A CPU is designed to do many different types of jobs.
Some jobs are a lot more "work" for a CPU.
The idea of these SIMD instructions is to accelerate some of these heavy-duty operations.
 
Typical examples of "work-jobs" are image processing, Photoshop filters, JPEG decoding, or video decoding.
AMMX is designed to speed these jobs up and also has some instructions to accelerate game rendering.
 
In general all SIMD instruction sets are designed to speed up those "heavy-jobs".
For other-jobs the normal instructions will be used.
In our case the normal 68K instructions will be used.
The normal 68K instruction set is good.
The job of a C-Compiler is to make the best use of these instructions.

The fact is that the 68K code that GCC generates has room for improvement.
This means it makes sense to tune it, and tuning the normal GCC code generation will benefit all programs and all 68K CPUs.
 
The work Bebbo did over the last few weeks has already shown a nice improvement in the 68k code generation.


Gildo Addox

Posts 31
07 Aug 2019 09:05


@Gunnar

Thanks for the clarification. To me, these things are crystal clear.

But my questions have other reasons. I would like to know if action is needed to support a person or a team working on GCC for 68k.

I still don't know whether I should start doing anything to make work on GCC possible, or whether that's not necessary.

What is your POV?


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
07 Aug 2019 09:47


Gildo Addox wrote:

  @Gunnar
 
  I would like to know if action is needed to support a person or a team working on GCC for 68k.
 

 
GCC for AMIGA has existed for decades.
AWEB, NETSURF, QUAKE, DIABLO and many other projects use GCC for AMIGA.
GCC for 68K is also used a lot on ATARI.

 
The current status is that GCC works and can compile programs.
The generated 68K code runs but could be better optimized.
 
68K has a very rich instruction set, with instructions tuned for certain ranges. On 68k you can often reach the same goal with several different instructions and are free to choose among them.
This allows an experienced programmer or a good compiler to create smaller and faster code.
 
One example:
Let's say D0=8 and A5 points to a memory location holding the value 8.
 


  addq.w #8,A0
  addq.l #8,A0
  adda.w #8,A0
  adda.l #8,A0
  adda.l (A5),A0
  adda.l D0,A0
  lea    8(A0),A0
  lea    (A0,D0),A0
 

 
All these instructions produce the same result in A0.
A programmer or compiler needs to make a good decision, based on their cost and length, to choose the best possible option.
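
A minimal sketch of such a decision in C, for the narrower case of adding a compile-time constant to an address register. It ranks candidates by encoded length only: the byte counts are the standard 68k encodings (addq/subq 2 bytes, lea with a 16-bit displacement 4 bytes, adda.l with a 32-bit immediate 6 bytes). The names areg_add_choice and pick_areg_add are hypothetical, this is not how GCC structures its choice, and cycle costs (which differ between 68K models) are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: an illustrative mnemonic template plus the
   encoded instruction length in bytes. */
typedef struct {
    const char *form;
    int bytes;
} areg_add_choice;

/* Choose the shortest encoding for "An += k" where k is a known constant.
   Only instruction length is considered; per-CPU timing is not modelled. */
static areg_add_choice pick_areg_add(int32_t k)
{
    areg_add_choice c;

    if (k == 0) {
        c.form = "(no instruction needed)";
        c.bytes = 0;
    } else if (k >= 1 && k <= 8) {
        c.form = "addq.l #k,An";      /* quick form: 1..8 fits in the opcode */
        c.bytes = 2;
    } else if (k >= -8 && k <= -1) {
        c.form = "subq.l #-k,An";     /* same quick form for small negatives */
        c.bytes = 2;
    } else if (k >= -32768 && k <= 32767) {
        c.form = "lea k(An),An";      /* opcode + 16-bit displacement */
        c.bytes = 4;
    } else {
        c.form = "adda.l #k,An";      /* opcode + 32-bit immediate */
        c.bytes = 6;
    }

    /* 68k instructions always occupy a whole number of 16-bit words. */
    assert(c.bytes % 2 == 0);
    return c;
}
```

For the value 8 used in the listing above, this picks the 2-byte addq form. The register and memory forms (adda.l D0,A0 and adda.l (A5),A0) are also 2 bytes, but they only apply when the value already sits in a register or in memory.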
 
 
To improve GCC so that it is able to create the best possible code,
you need someone who
  - is a great C coder,
  - understands the GCC internals,
  - is a 68k ASM expert,
  - is an expert in the CPU architecture and internal design of all 68K models,
  - and has the will to really make GCC better.
 
I think this is a task for a small team.
In a team you can brainstorm and discuss different options.
A team can also more easily cover all the skills needed.
 
I'm very happy to help with 68k ASM and 68K architecture know-how but I do not know the GCC internals well.


Gildo Addox

Posts 31
07 Aug 2019 11:37


@all

Should we set up a team, as Gunnar proposed - or is there already a team working on GCC?


Grom 68k

Posts 61
07 Aug 2019 23:03


Hi,
 
  I would like to know whether the following two instructions are equivalent?
   
  fmove.d #0x401c000000000000,fp0
  fmove.b #7,fp0
 
  Is the second one smaller (7 vs 12)?
 
 
  I think I found the function to modify.
 
 

  const char *
  output_move_const_double (rtx *operands)
  {
    int code = standard_68881_constant_p (operands[1]);
 
    if (code != 0)
      {
        static char buf[40];
 
        sprintf (buf, "fmovecr #0x%x,%%0", code & 0xff);
        return buf;
      }
    return "fmove%.d %1,%0";
  }
 

 
  I am trying to write code that tests whether the value is an integer and checks its range.
 

  /* Draft only: treats the constant as a host double, which is not
     cross-compilation safe, and assumes a 32-bit host int. */
  double const_d = CONST_DOUBLE_REAL_VALUE(operands[1]);
  if (const_d == (int)const_d)
  {
    int const_i = (int)const_d;
    static char buf[40];

    /* "%%." and "%%0" make sprintf emit the literal "%." and "%0"
       of the GCC output template. */
    if (const_i >= -0x80 && const_i < 0x80)
        sprintf (buf, "fmove%%.b #%d,%%0", const_i & 0xff);
    else if (const_i >= -0x8000 && const_i < 0x8000)
        sprintf (buf, "fmove%%.w #%d,%%0", const_i & 0xffff);
    else  /* every remaining value already fits in a 32-bit long */
        sprintf (buf, "fmove%%.l #%d,%%0", const_i);
    return buf;
  }
 

 
  Otherwise, how can I easily test small modifications directly from asm?
 
 
  Regards


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
08 Aug 2019 06:16


Grom 68k wrote:

  Hi,
     
      I would like to know whether the following two instructions are equivalent?
     
      fmove.d #0x401c000000000000,fp0
      fmove.b #7,fp0
     
      Is the second one smaller (7 vs 12)?
     
 

 
  Good morning Grom,
 
  Yes, the FPU supports the following types as #immediate input:
  #Byte  -- length 2 bytes
  #Word  -- length 2 bytes
  #Long  -- length 4 bytes
  #Single -- length 4 bytes
  #Double -- length 8 bytes
  #Extended -- length 12 bytes (only on 68080! - not supported on 68060)
 
  #Single and #Double inputs have the same speed.
  This means it always makes sense to store an input as SINGLE if SINGLE can express the number exactly as DOUBLE does.
 
  Using BYTE as input has no advantage over using WORD.
  Using WORD for integer inputs will save even more.
  Using LONG makes sense for integer inputs which are bigger than WORD and cannot be expressed by SINGLE.
 
  Integer inputs as BYTE/WORD/LONG add a cycle of delay but can reach the same throughput: 1 FLOPS per MHz.
 
  The 68080 also supports #extended immediate input, but PLEASE never use #extended. #Extended really is overkill (and very big at 12 bytes), and as the Motorola 68060 no longer supports #extended, I highly recommend NOT using it in any code.
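
  The size rules above can be sketched as a small selection routine in C. This is only an illustration on host doubles, not GCC code: a real patch would have to work on GCC's REAL_VALUE_TYPE to stay cross-compilation safe. The names imm_kind, imm_bytes and pick_fpu_immediate are hypothetical, and the preference order (WORD, then SINGLE, then LONG, then DOUBLE) optimizes for size only and ignores the extra cycle of delay of integer inputs.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdint.h>

/* Immediate formats in order of preference (hypothetical names). */
typedef enum { IMM_WORD, IMM_SINGLE, IMM_LONG, IMM_DOUBLE } imm_kind;

/* Operand sizes in bytes, taken from the list in the post above. */
static int imm_bytes(imm_kind k)
{
    static const int bytes[] = { 2, 4, 4, 8 };
    assert(k >= IMM_WORD && k <= IMM_DOUBLE);
    return bytes[k];
}

/* True if d is a whole number within [lo, hi]. Negative zero is rejected
   because an integer immediate cannot carry the sign of zero. The range
   test runs first so that the integer cast is well defined. */
static int is_int_in_range(double d, double lo, double hi)
{
    if (d == 0.0 && signbit(d))
        return 0;
    return d >= lo && d <= hi && d == (double)(int32_t)d;
}

/* Pick the smallest immediate format that represents d exactly. */
static imm_kind pick_fpu_immediate(double d)
{
    if (is_int_in_range(d, -32768.0, 32767.0))
        return IMM_WORD;
    /* Exact as single: check the range first, then round-trip the value. */
    if (d >= -FLT_MAX && d <= FLT_MAX && (double)(float)d == d)
        return IMM_SINGLE;
    if (is_int_in_range(d, -2147483648.0, 2147483647.0))
        return IMM_LONG;
    return IMM_DOUBLE;   /* also reached for NaN and infinities */
}
```

  With this ordering the constant 7.0 from the question becomes a 2-byte WORD immediate, 0.5 a 4-byte SINGLE, and 0.1 stays an 8-byte DOUBLE because it has no exact single representation.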
   


Grom 68k

Posts 61
09 Aug 2019 06:32


Hi,
 
  I found the function exact_real_truncate in gcc/real.c. It will be cross-compilation safe.
 
  const char *
  output_move_const_double (rtx *operands)
  {
    int code = standard_68881_constant_p (operands[1]);
 
    if (code != 0)
      {
        static char buf[40];
 
        sprintf (buf, "fmovecr #0x%x,%%0", code & 0xff);
        return buf;
      }
    REAL_VALUE_TYPE r;
    r = *CONST_DOUBLE_REAL_VALUE (operands[1]);
    if (exact_real_truncate (SFmode, &r))
      return "fmove%.s %f1,%0";

    return "fmove%.d %1,%0";
  }

  I can't find where the fmove string is used. I would like to understand why %1 is sometimes replaced by %f1. I can also check whether the conversion of operands[1] to single is implicit.
  Likewise, I can't find where fadd is generated.
 
  Regards
 
  EDIT: It should be %1 in output_move_const_single instead of %f1. That's why I have a doubt.
 
  ;;- Operand classes for the register allocator:
  ;;- 'a' one of the address registers can be used.
  ;;- 'd' one of the data registers can be used.
  ;;- 'f' one of the m68881/fpu registers can be used
  ;;- 'r' either a data or an address register can be used.


Gunnar von Boehn
(Apollo Team Member)
Posts 6207
14 Aug 2019 07:48


I was looking at ATARI EMUTOS ASM code created by GCC the other day.
The code is full of interesting bits:

Look at this:
LEA (0,A3,D3.L),A3  ;1bbc8: 47f33800

or
LEA (0,A4,D3.L),A4

or

  MOVEA.W D0,A1  ;1bc32: 3240
  MOVE.L A1,D0  ;1bc34: 2009
(A1 is not used otherwise!)

and much more like this



Kamelito Loveless

Posts 260
14 Aug 2019 21:05


Maybe GCC is not the right compiler. Maybe DICE is simpler, and its source code is available. I also read that the Aztec C compiler is owned by one guy; maybe we should ask him if he's willing to sell it and for how much...


Mike Kopack

Posts 268
14 Aug 2019 22:54


Kamelito Loveless wrote:

Maybe GCC is not the right compiler. Maybe DICE is simpler, and its source code is available. I also read that the Aztec C compiler is owned by one guy; maybe we should ask him if he's willing to sell it and for how much...

I think the advantage of doing it with GCC is that you can run that in a cross-compiler configuration on modern computers/OS's, and use your modern IDE/tools to build stuff quicker and then run the generated executable on the Amiga...


Steve Ferrell

Posts 424
15 Aug 2019 02:00


VBCC is probably the next best option.



Samuel Crow

Posts 424
15 Aug 2019 06:22


Mike Kopack wrote:

Kamelito Loveless wrote:

  Maybe GCC is not the right compiler. Maybe DICE is simpler, and its source code is available. I also read that the Aztec C compiler is owned by one guy; maybe we should ask him if he's willing to sell it and for how much...
 

 
  I think the advantage of doing it with GCC is that you can run that in a cross-compiler configuration on modern computers/OS's, and use your modern IDE/tools to build stuff quicker and then run the generated executable on the Amiga...

LLVM and GCC each have an automatic vectorizer. Once the AMMX instructions are added to the compiler, it will generate vector instructions instead of normal loops.
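
As an illustration, here is the kind of per-pixel loop that auto-vectorizers typically target, written in plain C. In GCC the loop vectorizer is enabled with -ftree-vectorize (implied by -O3). Whether a 68080 backend could map this loop to AMMX is an assumption; nothing in this thread confirms such support. The function name add_sat_u8 is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Saturating add of two rows of 8-bit pixels:
   dst[i] = min(a[i] + b[i], 255).
   The restrict qualifiers promise the compiler that the arrays do not
   overlap, which removes one common obstacle to vectorization. */
static void add_sat_u8(uint8_t *restrict dst, const uint8_t *restrict a,
                       const uint8_t *restrict b, size_t n)
{
    assert(dst != NULL && a != NULL && b != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned sum = (unsigned)a[i] + b[i];
        dst[i] = (uint8_t)(sum > 255u ? 255u : sum);
    }
}
```

Each iteration is independent of the others and touches consecutive bytes, which is the pattern a SIMD unit processes several elements at a time.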
