A fellow named Nicholas Gramlich has put some effort into creating a pretty nice 2D Android game engine called Andengine. It is similar to other libraries like Cocos2d or Rokon, but from what i’ve seen so far Nicholas is way ahead in terms of implemented features as well as overall API design. What’s lacking currently is documentation, but he makes up for it in sheer example count. He’s using libgdx for the box2D wrapper (which i still haven’t extracted from libgdx itself). The feature i’ll probably steal is the Mod player he put out as an extension. He’s also pretty active on his forums (i wonder how long he can sustain that, no summer internship? :)). I only briefly looked over his code and some things could use a bit of reworking, especially when it comes to performance. He’s done a great job so far and anyone who’s looking for an integrated solution should check out his work. Here’s a video of a live wallpaper demonstrating his particle engine:

Anyone who’s overwhelmed by the “down to the metal” approach of libgdx should seriously consider this library for his/her next game. Badlogic Games seal of approval 🙂

Maybe i can get him to refactor his thingy a little so it can run on top of libgdx. That would be awesomesauce.

Update: i tried the benchmarks included in the examples. Some of them are promising, others need some more work. All testing was performed on an N1, Android 2.2. The Sprite Benchmark renders 1000 sprites at 15.9 fps. Each sprite goes through a full glPush/glXXX/glPop cycle every time it’s rendered, where glXXX ranges from 1 to a shitload of transforms. That’s going to kill anything, batching would help a lot. I’m actually surprised that it works that well in that case, but i attribute much of it to the JIT. I need to test it on the Hero but i expect around 2-3fps.

Next i tried the particle benchmark, 200 particles which scale and change color. 25fps on my N1. Same problem as before, the missing batching is killing that poor thing. And that’s on an N1, for first gen devices that’s going to be completely useless. Christoph showed me a benchmark today for his amarena2d particle system using libgdx’s SpriteBatch class: 1000 particles, blended and dynamic, at 45fps on my N1 and around 12fps on an HTC Hero running Android 1.5.
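To make the batching idea a bit more concrete, here’s a minimal sketch in plain Java. It is not Andengine’s or libgdx’s actual code: the class QuadBatcher and all of its methods are made up for illustration, and flush() only counts submissions where a real renderer would upload the vertices and issue a single GL draw call. The point is that the per-sprite transform moves from the GL matrix stack to the CPU, so 1000 sprites end up in one buffer and one draw call.

```java
/**
 * Hypothetical sketch of sprite batching. Not Andengine or libgdx code.
 * Quads are transformed on the CPU and collected in one vertex array,
 * so a frame needs one submission instead of one per sprite.
 */
class QuadBatcher {
    private static final int FLOATS_PER_QUAD = 8; // 4 corners, x and y each

    private final float[] vertices;
    private int used = 0;      // floats currently stored in the buffer
    private int drawCalls = 0; // how many times the buffer was submitted

    public QuadBatcher(int maxQuads) {
        if (maxQuads < 1) throw new IllegalArgumentException("maxQuads must be >= 1");
        vertices = new float[maxQuads * FLOATS_PER_QUAD];
    }

    /** Rotates and translates the quad on the CPU, then appends its corners. */
    public void draw(float x, float y, float width, float height, float degrees) {
        if (used == vertices.length) flush(); // buffer full, submit it first
        double radians = Math.toRadians(degrees);
        float cos = (float) Math.cos(radians);
        float sin = (float) Math.sin(radians);
        float hw = width / 2, hh = height / 2;
        addCorner(x, y, -hw, -hh, cos, sin);
        addCorner(x, y, hw, -hh, cos, sin);
        addCorner(x, y, hw, hh, cos, sin);
        addCorner(x, y, -hw, hh, cos, sin);
    }

    /** Applies rotation and translation to one local corner and stores it. */
    private void addCorner(float x, float y, float lx, float ly, float cos, float sin) {
        vertices[used++] = x + lx * cos - ly * sin;
        vertices[used++] = y + lx * sin + ly * cos;
    }

    /** Stand-in for one vertex upload plus one GL draw call for all queued quads. */
    public void flush() {
        if (used == 0) return;
        drawCalls++; // a real renderer would draw vertices[0..used) here
        used = 0;
    }

    public int getDrawCalls() { return drawCalls; }

    public int getQueuedQuads() { return used / FLOATS_PER_QUAD; }
}
```

Calling draw() 1000 times on a QuadBatcher created with room for 1000 quads, followed by one flush(), results in a single submission. Whether that pays off on a given device still needs measuring, since the transform math now runs on the CPU.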

This is of course not a final verdict, but it seems that Andengine is mainly aimed at second gen devices. It could improve performance quite a bit on those too, but that would need a major redesign of the rendering architecture. Still worth checking out!

Android FloatMath revisited

Today i implemented a class called FastMath which offers the same static methods as the Android FloatMath class. Behind the scenes there’s a singleton instance which can be user defined that actually implements the math functions such as cosine, sine or sqrt. All the math code in libgdx now uses that static class to delegate those calculations to platform dependent optimal versions, like FloatMath on Android.

I have two implementations for the FastMath class, a standard implementation using the Math class from the standard library as well as an Android specific implementation that uses the FloatMath class internally. When calling a method of the FastMath class it gets delegated to the singleton instance, which means we have one call to the static method, another call to the singleton method, e.g. AndroidFastMath.cos(), and finally the call to the actual implementation, e.g. Math.cos() or FloatMath.cos(). That’s a total of 3 calls to get the cosine of an angle or a square root.
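Roughly, the delegation looks like the following sketch. This is not the actual libgdx code: the names MathProvider, StandardMathProvider and setInstance are made up for illustration, and only the standard library backend is shown so the snippet compiles without the Android SDK. An Android backend would implement the same interface and forward to FloatMath instead.

```java
/** Hypothetical interface for the platform-specific math backend. */
interface MathProvider {
    float cos(float radians);
    float sin(float radians);
    float sqrt(float value);
}

/** Default backend that forwards to java.lang.Math. */
class StandardMathProvider implements MathProvider {
    public float cos(float radians) { return (float) Math.cos(radians); }
    public float sin(float radians) { return (float) Math.sin(radians); }
    public float sqrt(float value) { return (float) Math.sqrt(value); }
}

/** Static facade. Every call goes facade -> backend -> actual math function. */
final class FastMath {
    // The user-definable singleton; an Android backend could be set here.
    private static MathProvider instance = new StandardMathProvider();

    private FastMath() {}

    public static void setInstance(MathProvider provider) {
        if (provider == null) throw new IllegalArgumentException("provider must not be null");
        instance = provider;
    }

    public static float cos(float radians) { return instance.cos(radians); }
    public static float sin(float radians) { return instance.sin(radians); }
    public static float sqrt(float value) { return instance.sqrt(value); }
}
```

A call like FastMath.sqrt(16f) therefore costs three method invocations: the static facade, the interface method on the backend, and finally Math.sqrt().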

Someone asked for this in a request for enhancement over on the issue tracker, so that we get maximum performance on all platforms. From the beginning i suspected that this wouldn’t work out due to the function call overhead. The design is as slick as possible, so i believe that the three method calls are the minimum.

Now here’s some hard data. I tested the methods Math.cos, Math.sqrt, FloatMath.cos, FloatMath.sqrt and FastMath.cos/FastMath.sqrt (which internally use FloatMath) on my HTC Hero (no FPU, Android 1.5) and my Nexus One (FPU, Android 2.2). Each function was executed a million times with a variable argument. Here are the results for the Hero:

FloatMath.cos(): 10.42688 secs
FloatMath.sqrt(): 2.8764648 secs
Math.cos(): 9.445556641 secs
Math.sqrt(): 3.755310058 secs
FastMath.cos(): 13.469482422 secs
FastMath.sqrt(): 5.175018311 secs

Wow, now that is silly. Granted this is only a single run and FloatMath.cos() might still outperform Math.cos() when averaged over multiple runs. However, the big loser is my FastMath class which uses FloatMath internally. The function call overhead is immense so nothing is gained here really. How’s the situation on the Nexus One?

FloatMath.cos(): 0.70428467 secs
FloatMath.sqrt(): 0.23391724 secs
Math.cos(): 0.45666504 secs
Math.sqrt(): 0.133239742 secs
FastMath.cos(): 0.910583498 secs
FastMath.sqrt(): 0.48376465 secs

First we see the FPU and the JIT doing their magic. I don’t think that the JIT actually plays a big role here, it’s probably mainly the FPU which is speeding things up significantly compared to the Hero. Math beats the crap out of FloatMath here. This is similar to the old x86/x87 wisdom that using doubles actually increases performance as most FPU registers are wider than 32 bits anyway. You can therefore eliminate a costly conversion to and from float. On the Hero that effect is non-existent as it has no FPU and everything is done in software, so doubles are slower than floats. FastMath still stinks. Its timings relative to FloatMath are nearly the same as on the Hero, which suggests that the function call overhead is still there. The JIT in Dalvik does no inlining yet, so you still have to pay the price for excessive function calls.

So what does this mean? FloatMath works a tiny bit better on older devices with no FPU, as expected. On newer devices with an FPU it’s actually slower, it seems, due to the conversions needed to get 32-bit floats into and out of the FPU registers. Function calls are still evil and nasty even with the JIT. Finally, i’ll revert the changes to libgdx and use the standard library Math functions instead. The performance impact on old devices is negligible.

Note: Yes, those are only micro benchmarks and there’s a lot that’s ignored. However, for my purposes they are more than precise enough.
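For reference, a timing loop of the kind used above can be sketched like this. It is an assumption about the setup, not the exact code i ran: the class name MathBenchmark, the step size of the argument and the use of System.nanoTime() are all illustrative. Only the Math variants are shown so it runs on any JVM; FloatMath and FastMath variants would follow the same pattern with the call swapped out.

```java
/** Hypothetical micro benchmark sketch: one million calls with a changing argument. */
class MathBenchmark {
    // Results are accumulated here so the loops have a visible side effect.
    static float sink;

    /** Times Math.cos() over the given number of calls and returns seconds. */
    public static double timeMathCos(int iterations) {
        float arg = 0, sum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sum += (float) Math.cos(arg);
            arg += 0.001f; // variable argument, so the call can't be hoisted
        }
        long elapsed = System.nanoTime() - start;
        sink += sum;
        return elapsed / 1e9;
    }

    /** Times Math.sqrt() over the given number of calls and returns seconds. */
    public static double timeMathSqrt(int iterations) {
        float arg = 0, sum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sum += (float) Math.sqrt(arg);
            arg += 0.001f;
        }
        long elapsed = System.nanoTime() - start;
        sink += sum;
        return elapsed / 1e9;
    }

    public static void main(String[] args) {
        int iterations = 1000000;
        System.out.println("Math.cos(): " + timeMathCos(iterations) + " secs");
        System.out.println("Math.sqrt(): " + timeMathSqrt(iterations) + " secs");
    }
}
```

Each function gets its own loop on purpose. Funnelling all candidates through a shared interface would add exactly the kind of call overhead that is being measured here.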

WTF: Google you scare me. 20 minutes after i posted this i get the following. I never felt so relevant…

Ohloh – An open source directory & libgdx statistics

Today i stumbled across Ohloh, an open source project directory. I have no idea how many people actively use it, but it has some nice features that i miss on Google Code, namely project statistics. These range from simple LOC counts to more elaborate metrics like percentage of different languages used.

The most fun part of Ohloh is the widgets. There are a couple of different widgets that allow you to display the number of users, project statistics etc. And then there is this widget.

What can i say, apparently libgdx is worth around $800k. Of course this is an estimate based on LOCs, average salary (55K, man i wish i got that at my day job :() and so on. It’s really silly. Additionally it also includes the LOCs of the 3rd party sources that are in the SVN repository. Libgdx itself now has around 40K lines of code (Java + C), so it’s actually only worth $600k. Anyone want to invest?

Here’s some more stats:

Pretty sweet. You can find libgdx’s Ohloh page here.

Btw, check out the brand new summary page over at Google Code!