Kore (Kha's C++ sister project)
krafix (Shader cross-compiler)
kraffiti (Image compression tool)
g2/g4 (2D/3D Graphics API)
Kha
Returns
and it is faster
OpenFL
Unity
Unreal
Speed
- Runtime Performance
  - Rendering
  - Load Times
  - ...
- Development Workflow
  - Debugging
  - Compiling
  - ...
Rendering Performance
Bunnymark
Big caveats:
Only draws the same image again and again
No browser tests yet
Why is NME/OpenFL fast?
for (i in 0...numBunnies) {
    // ...
    drawList[index] = bunny.position.x;
    drawList[index + 1] = bunny.position.y;
}
tilesheet.drawTiles(graphics, drawList, false);
Why is Kha fast?
for (i in 0...bunnies.length) {
    g.drawImage(bunnies[i].texture, bunnies[i].x, bunnies[i].y);
}
Why is Kha fast?
I use profilers
Profilers
- hxScout
- Visual Studio, Xcode, ...
- Intel VTune, ...
Why is Kha fast?
- Inline Constructors
- 32 bit floats
- SIMD
SIMD
- Single Instruction, Multiple Data
- SSE was introduced in 1999 (MMX in 1997)
- Support in C# added in 2014
- Support in JavaScript in development
- Support in C/C++ not in sight (use compiler intrinsics)
kha.simd
- SIMD API similar to simd.js
- Maps to SSE and Neon where available
- Maps to scalar code everywhere else
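To illustrate the "maps to scalar code everywhere else" point, here is a minimal sketch of what a scalar fallback for a 4-lane float vector could look like. The function names (`f32x4`, `add`, `mul`) are hypothetical and are not taken from kha.simd. A `Float32Array` is used so that every lane is rounded to 32-bit float precision, in line with the "32 bit floats" point above.

```javascript
// Hypothetical scalar fallback for a 4-lane, 32-bit float vector.
// These names are illustrative only; they are not the kha.simd API.
function f32x4(a, b, c, d) {
  // Storing into a Float32Array rounds each lane to 32-bit precision.
  return new Float32Array([a, b, c, d]);
}

// Lane-wise addition: one logical "instruction", four data elements.
function add(x, y) {
  const out = new Float32Array(4);
  for (let i = 0; i < 4; i++) out[i] = x[i] + y[i];
  return out;
}

// Lane-wise multiplication.
function mul(x, y) {
  const out = new Float32Array(4);
  for (let i = 0; i < 4; i++) out[i] = x[i] * y[i];
  return out;
}

// Move four positions by velocity * dt with two vector operations.
const positions = f32x4(0, 1, 2, 3);
const velocities = f32x4(10, 20, 30, 40);
const dt = f32x4(0.5, 0.5, 0.5, 0.5);
const moved = add(positions, mul(velocities, dt));
```

On a target with SSE or Neon, each `add` or `mul` call would ideally become a single hardware instruction instead of a four-iteration loop; the calling code stays the same.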
Next steps in runtime performance?
A comprehensive performance test suite
Loading Data
Typically the biggest data set and also the biggest optimization potential:
Images
Loading Images
Worst possible strategy for a local application:
Load PNG files
- Has to be decompressed before use
- Decompression is slower than disk read speeds
Loading Images
Better strategy for a local application:
- Use light compression which decompresses faster than disk speeds (Snappy)
- Use compressed texture formats when possible
- Preprocess as much as possible
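The trade-off between the two strategies can be sketched with a small back-of-the-envelope model. All numbers below (image size, compression ratios, disk and decoder throughput) are made-up assumptions for illustration, not measurements, and the model naively assumes that reading and decoding happen one after the other.

```javascript
// Rough load-time model: read the compressed bytes, then decode them.
// Every figure used below is an illustrative assumption, not a benchmark.
function loadSeconds({ rawMB, ratio, diskMBps, decodeMBps }) {
  const readTime = (rawMB * ratio) / diskMBps; // time to read the stored bytes
  const decodeTime = rawMB / decodeMBps;       // time to produce the raw pixels
  return readTime + decodeTime;
}

const rawMB = 64;     // assumed size of the decoded image data
const diskMBps = 200; // assumed disk read speed

// Heavy compression: small file, but decoding is slower than the disk.
const heavy = loadSeconds({ rawMB, ratio: 0.25, diskMBps, decodeMBps: 100 });

// Light compression: larger file, but decoding is faster than the disk.
const light = loadSeconds({ rawMB, ratio: 0.5, diskMBps, decodeMBps: 1000 });

// No compression at all: nothing to decode.
const none = loadSeconds({ rawMB, ratio: 1, diskMBps, decodeMBps: Infinity });

console.log({ heavy, light, none });
```

Under these assumptions the lightly compressed file loads fastest and the heavily compressed one loads slowest, even slower than the uncompressed file. Different disks and decoders can change the ordering, which is why measuring on the target hardware matters.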
khafiles
// uses snappy for non-web targets
project.addAssets('Assets/**');
khafiles
// tries to use compressed textures
project.addAssets('Assets/**', { quality: 0.7 });
khafiles
project.addAssets('Assets/**', { quality: 0.7, size: 0.5 });
if (target == Target.Windows)
    project.addAssets('Assets/**', { name: 'hq-{name}' });
khafiles
project.addTarget('windows-steam', ['Windows', 'Direct3D11']);
if (target == 'windows-steam') {
    // ...
}
khafiles
project.addShaders('Shaders/**', {
    name: 'bla-{name}', defines: ['bla=1']
});
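The `name` option in the snippets above uses a `{name}` placeholder. The slides do not spell out how it is resolved, so the following is only a sketch under the assumption that `{name}` is replaced by the file's base name without its extension. `expandName` is a hypothetical helper, not a function of the build tool.

```javascript
// Hypothetical helper: expands a '{name}' placeholder with a file's base name.
// Assumption: the placeholder stands for the file name without its extension.
function expandName(pattern, filePath) {
  const file = filePath.split('/').pop();          // 'Assets/bunny.png' -> 'bunny.png'
  const dot = file.lastIndexOf('.');
  const base = dot > 0 ? file.slice(0, dot) : file; // 'bunny.png' -> 'bunny'
  return pattern.replace('{name}', base);
}

const hqName = expandName('hq-{name}', 'Assets/bunny.png');
const shaderName = expandName('bla-{name}', 'Shaders/painter.frag.glsl');
```

With this reading, a high-quality variant of an asset gets its own name and can be shipped next to the default one.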
Shader News
- Tessellation Shaders
- Compute Shaders
- Automatic compilation and use of shader variants
  - Specialized shaders for number of supported textures
  - Specialized shaders for instanced rendering backwards compatibility
G5
- Vulkan/Direct3D12 style API
- G4 on G5 (first version in Kore)
- G5 on G4 (coming soon)
- Lower level control, no magic performance boost compared to G4
- Considerable, magic performance boost in G2 coming up
- Texture atlas usage outdated when using G5
Audio
- Object based surround audio
- Dolby Atmos/DTS:X support in research
- Surround simulation on headphones available
  (only html5 currently, based on hrtf-panner-js)
Native Performance
It's a lie
Your Flash and OpenFL experience does not apply
Haxe/JavaScript regularly outperforms Haxe/C++
                   | Haxe/JavaScript | Haxe/C++ |
Code optimization  | +               | ++       |
Garbage Collection | ++              | -        |
Number types       | -               | -        |
Introspection      | -               | +        |
Chrome
- Fast code execution for Haxe thanks to V8
- WebGL's feature set is severely outdated
- WebGL is not as fast as it could be because of security
- Still no compute shader support
Krom
- Fast code execution for Haxe thanks to V8
- No WebGL restrictions
- Speed is priority 1
- Compute shaders coming
Krom
- Alternative for deployment
- Typically faster for unoptimized code
- Exception: iOS
Krom
- Special development features
- Soon to be the new default debugger in Kode Studio
Kode Studio
- Fork of Visual Studio Code
- Uses the vshaxe extension
  - Developed in tandem with Haxe 3.3
- Uses JavaScript for Haxe debugging
  - Much faster compilation than C++
  - Much better feature set and runtime performance than Flash (especially with Krom)
  - Improvements in Haxe 3.3