Kore
(Kha's C++ sister project)

krafix
(Shader cross-compiler)

kraffiti
(Image compression tool)

g2/g4
(2D/3D Graphics API)

Kha

Returns

and it is faster

OpenFL

Unity

Unreal

Speed

  1. Runtime Performance
    1. Rendering
    2. Load Times
    3. ...
  2. Development Workflow
    1. Debugging
    2. Compiling
    3. ...

Rendering Performance

Bunnymark

Big caveats:
Only draws the same image again and again
No browser tests yet

Why is NME/OpenFL fast?


// Write every bunny's position into one flat array
for (i in 0...numBunnies) {
	// ...
	drawList[index] = bunny.position.x;
	drawList[index + 1] = bunny.position.y;
}
tilesheet.drawTiles(graphics, drawList, false);
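The pattern above, reduced to a minimal JavaScript sketch: positions go into one flat array and the whole frame is submitted with a single call. `FakeTilesheet`, its counters, and the `[x0, y0, x1, y1, ...]` layout are assumptions made for illustration; this is not the NME/OpenFL API.

```javascript
// Hypothetical stand-in for a tilesheet: it only counts how often it is called.
class FakeTilesheet {
  constructor() {
    this.drawCalls = 0;
    this.tilesDrawn = 0;
  }

  // drawList is a flat array [x0, y0, x1, y1, ...] (layout assumed for this sketch).
  drawTiles(drawList) {
    this.drawCalls += 1;
    this.tilesDrawn += drawList.length / 2;
  }
}

function drawBunnies(tilesheet, bunnies) {
  const drawList = new Array(bunnies.length * 2);
  let index = 0;
  for (const bunny of bunnies) {
    drawList[index] = bunny.x;
    drawList[index + 1] = bunny.y;
    index += 2;
  }
  // One submission for the whole frame, however many bunnies there are.
  tilesheet.drawTiles(drawList);
  return drawList;
}
```

The cost of the approach is that the caller has to maintain the draw list by hand.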
					

Why is Kha fast?


for (i in 0...bunnies.length) {
	g.drawImage(bunnies[i].texture, bunnies[i].x, bunnies[i].y);
}	
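A per-image API can still be cheap if the graphics context queues the calls and submits them in batches. The following JavaScript sketch shows that general idea under stated assumptions: `BatchingGraphics`, `maxQuads`, and the `flushes` counter are hypothetical names, and this is not Kha's actual g2 implementation.

```javascript
// Hypothetical 2D context that batches per-image calls internally.
class BatchingGraphics {
  constructor(maxQuads = 1000) {
    this.maxQuads = maxQuads;
    this.pending = [];          // queued [x, y] pairs for currentTexture
    this.currentTexture = null;
    this.flushes = 0;           // stands in for real GPU draw calls
  }

  drawImage(texture, x, y) {
    // A texture switch or a full buffer forces the queued quads out first.
    const mustFlush =
      texture !== this.currentTexture || this.pending.length >= this.maxQuads;
    if (this.pending.length > 0 && mustFlush) {
      this.flush();
    }
    this.currentTexture = texture;
    this.pending.push([x, y]);
  }

  flush() {
    if (this.pending.length === 0) return;
    this.flushes += 1;
    this.pending.length = 0;
  }

  // Called once at the end of the frame to submit whatever is left.
  end() {
    this.flush();
  }
}
```

With this sketch, 5,000 `drawImage` calls that share one texture and a `maxQuads` of 1000 end up as 5 flushes, while alternating between two textures flushes on every call.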
					

Why is Kha fast?

I use profilers

Profilers

  • hxScout
    • Shows data on Haxe level
  • Visual Studio, Xcode, ...
    • Shows data on C++ level
  • Intel VTune, ...
    • Shows data on CPU level

Why is Kha fast?

  • Inline Constructors
  • 32 bit floats
    • kha.FastFloat
  • SIMD
    • kha.simd
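To illustrate the 32-bit float point in JavaScript terms (this is not `kha.FastFloat` itself): a 32-bit float takes half the memory of a double and rounds differently. The variable names below are made up for the sketch.

```javascript
// JavaScript numbers are 64-bit doubles; Math.fround rounds to the nearest 32-bit float.
const asDouble = 0.1;
const asFloat = Math.fround(0.1);

// Typed arrays make the storage difference explicit.
const positions32 = new Float32Array(1000); // 4 bytes per element
const positions64 = new Float64Array(1000); // 8 bytes per element

const roundingDiffers = asFloat !== asDouble;
const bytesSaved = positions64.byteLength - positions32.byteLength;
```

Half the bytes per value means twice as many values per cache line and per SIMD register, which is where the speed benefit comes from.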

SIMD

  • Single Instruction, Multiple Data
  • SSE was introduced in 1999 (MMX in 1997)
  • Support in C# added in 2014
  • Support in JavaScript in development
  • Support in C/C++ not in sight (use compiler intrinsics)

kha.simd

  • SIMD API similar to SIMD.js
  • Maps to SSE and NEON where available
  • Maps to scalar code everywhere else
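A scalar fallback of the kind described above could look roughly like this in JavaScript. The function names `float32x4`, `add4`, and `mul4` are invented for the sketch and are not the `kha.simd` API; on an SSE or NEON backend each of these operations would correspond to a single packed instruction instead of a loop.

```javascript
// Hypothetical four-lane float vector backed by a Float32Array.
function float32x4(a, b, c, d) {
  return new Float32Array([a, b, c, d]);
}

// Scalar fallback: one logical operation applied lane by lane.
function add4(u, v) {
  const out = new Float32Array(4);
  for (let lane = 0; lane < 4; lane++) {
    out[lane] = u[lane] + v[lane];
  }
  return out;
}

function mul4(u, v) {
  const out = new Float32Array(4);
  for (let lane = 0; lane < 4; lane++) {
    out[lane] = u[lane] * v[lane];
  }
  return out;
}
```

Calling code is written once against the four-lane operations and does not need to know which backend it runs on.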

Next steps in runtime performance?

A comprehensive performance test suite

Loading Data

Typically the biggest data set and also the biggest optimization potential:

Images

Loading Images

Worst possible strategy for a local application:

Load PNG files

  • Has to be decompressed before use
  • Decompression is slower than disk read speeds

Loading Images

Better strategy for a local application:
  • Use light compression that decompresses faster than the disk can read (Snappy)
  • Use compressed texture formats when possible
  • Preprocess as much as possible
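The trade-off can be written down as a simple model: time to read the bytes from disk plus time to decompress them. The sketch below assumes the two steps run one after the other and that decompression speed is measured in output bytes per second; `loadSeconds` and all its parameters are hypothetical, and any numbers fed into it are placeholders, not measurements.

```javascript
// Rough load-time model: read from disk, then decompress (no overlap assumed).
function loadSeconds({ rawBytes, compressionRatio, diskBytesPerSec, decompressBytesPerSec }) {
  const bytesOnDisk = rawBytes / compressionRatio;
  const readSeconds = bytesOnDisk / diskBytesPerSec;
  // Pass Infinity as decompressBytesPerSec to model data stored uncompressed.
  const decompressSeconds = rawBytes / decompressBytesPerSec;
  return readSeconds + decompressSeconds;
}

// Compression pays off only if the read time it saves exceeds the decompression time it adds.
function compressionHelps(compressed, uncompressed) {
  return loadSeconds(compressed) < loadSeconds(uncompressed);
}
```

A heavy format wins on `compressionRatio` but loses on `decompressBytesPerSec`; a light one such as Snappy makes the opposite trade.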

khafiles


// uses snappy for non-web targets
project.addAssets('Assets/**');
					

khafiles


// tries to use compressed textures
project.addAssets('Assets/**', { quality: 0.7 });
					

khafiles


project.addAssets('Assets/**', { quality: 0.7, size: 0.5 });
if (target == Target.Windows)
	project.addAssets('Assets/**', { name: 'hq-{name}' });
					

khafiles


project.addTarget('windows-steam', ['Windows', 'Direct3D11']);
if (target == 'windows-steam') {
	// ...
}
					

khafiles


project.addShaders('Shaders/**', {
	name: 'bla-{name}', defines: ['bla=1']
});
					

Shader News

  • Tessellation Shaders
  • Compute Shaders
  • Automatic compilation and use of shader variants
    • Specialized shaders for number of supported textures
    • Specialized shaders for instanced rendering backwards compatibility

G5

  • Vulkan/Direct3D12 style API
  • G4 on G5 (first version in Kore)
  • G5 on G4 (coming soon)
  • Lower level control, no magic performance boost compared to G4
  • Considerable, magic performance boost in G2 coming up
    • Texture atlas usage outdated when using G5

http://luboslenco.com/notes/

Audio

  • Object-based surround audio
  • Dolby Atmos/DTS:X support under research
  • Surround simulation on headphones available
    (html5 only for now, based on hrtf-panner-js)

Native Performance

It's a lie

Your Flash and OpenFL experience does not apply

Haxe/JavaScript regularly outperforms Haxe/C++

                     Haxe/JavaScript  Haxe/C++
Code optimization    +++
Garbage Collection   ++-
Number types         --
Introspection        -+

Chrome

  • Fast code execution for Haxe thanks to V8
  • WebGL's feature set is severely outdated
  • WebGL is not as fast as it could be because of security
  • Still no compute shader support

Krom

  • Fast code execution for Haxe thanks to V8
  • No WebGL restrictions
  • Speed is priority 1
  • Compute shaders coming

Krom

  • Alternative for deployment
  • Typically faster for unoptimized code
  • Exception: iOS

Krom

  • Special development features
  • Soon to be the new default debugger in Kode Studio

Kode Studio

  • Fork of Visual Studio Code
  • Uses the vshaxe extension
    • Developed in tandem with Haxe 3.3
  • Uses JavaScript for Haxe debugging
    • Much faster compilation than C++
    • Much better feature set and runtime performance than Flash (especially with Krom)
    • Improvements in Haxe 3.3

Kode Studio

Demo