Welcome to SDL !!Science!!, wherein we test things we’ve covered in ‘Intro to SDL’ to make sure that the documentation matches the actual real-world performance (Spoiler: not as often as you would hope). Trying to use the internet to figure some of this stuff out can be a little irritating, as SDL is both multi-platform (meaning the answer to the question is only true in Linux with the X11 driver on a full moon) and old (meaning most of the stuff is from 2003). As someone who has used SDL for years, I’m using this as an opportunity both to tighten up my toolbox and make sure I’m not assuming something stupid out of habit or ignorance, and to provide a comprehensive internet source for this stuff that isn’t 9 years old. This is with the latest stable version of SDL (1.2.15) and may not be true for older versions.
First off I need to check that my method of displaying fps isn’t actually affecting the fps of our program, otherwise the windowed/fullscreen comparisons will be wrong.
Replacing title FPS counter: ~9500 to ~9600 fps (requisite DBZ joke here)
Text to file FPS counter: ~9500 to ~9600 fps
Obviously I’m taking a rough sample and averaging, since background processes eat very small amounts of CPU and throw the results off slightly (also, I swear core 4 is slacking). I’m actually surprised that changing the window title every second had no appreciable effect on fps.
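For reference, here is a minimal sketch of the kind of once-per-second counter described above. It is not the exact code used for these numbers; FpsCounter, fps_counter_init and fps_counter_tick are hypothetical names, and the clock value is passed in so the helper does not depend on SDL itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: counts frames and reports once per interval. */
typedef struct {
    uint32_t window_start_ms; /* timestamp at which the current sample began */
    uint32_t frames;          /* frames counted since window_start_ms */
    uint32_t interval_ms;     /* how often to report, e.g. 1000 for once a second */
} FpsCounter;

void fps_counter_init(FpsCounter *c, uint32_t now_ms, uint32_t interval_ms)
{
    assert(interval_ms > 0);
    c->window_start_ms = now_ms;
    c->frames = 0;
    c->interval_ms = interval_ms;
}

/* Call once per frame. Returns the measured fps when a full sample window
   has elapsed, or a negative value when there is nothing to report yet. */
double fps_counter_tick(FpsCounter *c, uint32_t now_ms)
{
    uint32_t elapsed;
    double fps;

    c->frames++;
    elapsed = now_ms - c->window_start_ms; /* unsigned math survives timer wraparound */
    if (elapsed < c->interval_ms)
        return -1.0;

    fps = c->frames * 1000.0 / elapsed;
    c->window_start_ms = now_ms; /* start the next sample window */
    c->frames = 0;
    return fps;
}
```

In an SDL 1.2 loop you would pass SDL_GetTicks() as now_ms each frame. Whenever the result is non-negative, format it into a string and either hand it to SDL_WM_SetCaption() or write it to a log file, which are the two display methods compared above.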
Time to mess around with the various flags and arguments and see what happens.
640 x 480 x 64 window hwsurface: Crash. Not unexpected, never tried it before though.
640 x 480 x 32 window hwsurface: ~9500 to ~9600fps
640 x 480 x 16 window hwsurface: ~1500 fps
640 x 480 x 8 window hwsurface: ~5000 fps
Huh. Did not call that one. I don’t even. What? How is 16-bit that much worse?
640 x 480 x 32 window swsurface: ~9500 to ~9600 fps
640 x 480 x 16 window swsurface: ~1500 fps
640 x 480 x 8 window swsurface: ~5000 fps
Ok, the hwsurface and swsurface numbers are identical, so the conclusion we can draw from this is that we are not receiving a hardware surface when we are windowed. This is not unexpected, but good to confirm. Now let’s see what happens when we throw fullscreen into the mix. And since we’ve already shown that the fps display methods don’t put a damper on the actual fps, we can test without worry.
640 x 480 x 32 fullscreen hwsurface: ~9000 to ~9100 fps. My eyes, they bleed. Surprising, I was expecting fullscreen to give a performance boost.
640 x 480 x 32 fullscreen swsurface: ~9000 to ~9100 fps. Nope, no HWSURFACES at all. Checked the flags and everything.
SDL_ListModes() claims that everything you’d expect is HWSURFACE capable. This function is a dirty, dirty liar and we aren’t going to get a HWSURFACE, ever (at least on Windows). The loss of hardware surfaces and double buffering is something of a !!Science!! loss, but I’m not overly sad that we can’t use them. As to why we wouldn’t want hardware surfaces: they’re (apparently, as I have no way of testing this) not good for direct pixel work, alpha blending, and a host of other things we really want to do later.
Using SDL_SetVideoMode() to change resolutions works surprisingly well. The last one declared wins, so if you call it again you get a new window resolution. I was expecting a massive memory leak for doing so, but it seems to be fine. You can also switch between windowed and fullscreen without problems. The last-one-wins behavior also means you can’t use successive SDL_SetVideoMode() calls to make more than one window (though they are adding multiple windows in SDL 2.0).
800 x 600 window: ~4700 fps
1024 x 768 window: ~2000 fps
1280 x 960 window: ~950 fps
1600 x 1200 window: ~600 fps
This is somewhat surprising. I was expecting a drop-off, but not one this severe. 1600 x 1200 has 4 times the pixels of 800 x 600, but runs ~7.8 times slower (~4700 fps down to ~600 fps). 1280 x 960 runs a massive ~10 times slower than 640 x 480 for 4 times the pixels. There is obviously some significant overhead in handling the sheer volume of pixels via the CPU. I’m on a quad-core i5 at 3.30 GHz, and a blank screen already maxes out a single core. What’s it going to run like with things on screen on a low-end PC?
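The arithmetic behind those ratios, as a small sketch. The function names are made up for illustration, and the 640 x 480 figure uses the low end of the measured range (~9500 fps).

```c
#include <assert.h>

/* How many times more pixels the bigger resolution has. */
double pixel_ratio(int w_big, int h_big, int w_small, int h_small)
{
    assert(w_small > 0 && h_small > 0);
    return ((double)w_big * h_big) / ((double)w_small * h_small);
}

/* How many times slower the low frame rate is than the high one. */
double slowdown(double fps_fast, double fps_slow)
{
    assert(fps_slow > 0.0);
    return fps_fast / fps_slow;
}

/* Raw pixel throughput: pixels per frame times frames per second. */
double pixels_per_second(int w, int h, double fps)
{
    return (double)w * h * fps;
}
```

pixel_ratio(1600, 1200, 800, 600) is 4, while slowdown(4700, 600) comes out around 7.8. Likewise pixel_ratio(1280, 960, 640, 480) is 4 against a slowdown(9500, 950) of 10. Put another way, pixels_per_second() drops from roughly 2.9 billion at 640 x 480 to roughly 1.15 billion at 1600 x 1200 with these numbers, so the cost per pixel is going up as the surface grows instead of staying flat.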
Most of the other flags are fairly unremarkable. There is some promise in the SDL_ASYNCBLIT flag, but we’ll have to wait until we are actually blitting things to see if there’s performance to be gained.
Wanted to see if there was a difference between using our sdl.screen pointer or simply using SDL_GetVideoSurface() all the time. With SDL_GetVideoSurface():
640 x 480 window: ~9400 to ~9500 fps
800 x 600 window: ~4500 fps
Other resolutions: No discernible difference. I’m not sure if the difference is just falling in between the fps spikes or if the lower call volume reduces the negative performance enough that it isn’t noticeable, so let’s take it to profiling.
sdl.screen – 1000 calls: 210 ms
SDL_GetVideoSurface – 1000 calls: 230 ms
It’s slower, and since we’d be using it any time we draw to the screen, that’s problematic. 20 ms over 1000 calls is 20 microseconds slower per call, so we lose 1 ms every 50 calls. That’s pretty significant, since 50 draw calls is well within the realm of possibility. Just out of curiosity, I wondered what the difference between having sdl.screen and just a global screen pointer was.
screen – 1000 calls: 197 ms
Remember earlier when I joked that core 4 was slacking? Now I’m not sure what’s going on. The first time I ran this it came in at 197 ms, but subsequent runs have had it all over the map between 197 and 210. This of course means my data is now suspect and I’m going to have to build a profiling system that can handle multiple runs with means and medians and graphs (oh my). But a quick Google search shows that you can force a program to stay on a single core, which is a markedly simpler solution until we hit multi-threading.
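Until that profiling system exists, the statistics half of it is easy enough to sketch. This is only a rough outline under my own assumptions: mean_ms, median_ms and extra_us_per_call are hypothetical names, and the timings are expected to be collected elsewhere as an array of per-run totals in milliseconds.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator for doubles, ascending. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Arithmetic mean of n run totals (in ms). */
double mean_ms(const double *samples, size_t n)
{
    double sum = 0.0;
    size_t i;

    assert(n > 0);
    for (i = 0; i < n; i++)
        sum += samples[i];
    return sum / n;
}

/* Median of n run totals (in ms). Sorts a private copy so the caller's
   array is left alone. Returns a negative value if allocation fails. */
double median_ms(const double *samples, size_t n)
{
    double *sorted;
    double result;

    assert(n > 0);
    sorted = malloc(n * sizeof *sorted);
    if (sorted == NULL)
        return -1.0;
    memcpy(sorted, samples, n * sizeof *sorted);
    qsort(sorted, n, sizeof *sorted, compare_doubles);

    if (n % 2 == 1)
        result = sorted[n / 2];
    else
        result = (sorted[n / 2 - 1] + sorted[n / 2]) / 2.0;

    free(sorted);
    return result;
}

/* Extra cost per call in microseconds, given two totals (in ms) that
   were each measured over the same number of calls. */
double extra_us_per_call(double slow_total_ms, double fast_total_ms, int calls)
{
    assert(calls > 0);
    return (slow_total_ms - fast_total_ms) * 1000.0 / calls;
}
```

The idea would be to time each variant several times, compare medians rather than single runs (the median shrugs off the odd run where a background process butts in), and only then work out a per-call difference. For example, extra_us_per_call(230, 210, 1000) gives the 20 microseconds quoted earlier.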
SDL_INIT_EVENTTHREAD sounds like a potentially cool flag. Except that it crashes on Win32 (haven’t tried it with x64 yet). You’d think this would be mentioned on the main SDL_Init page, but it’s hidden away on some unrelated FAQ page. Same page also suggests that if you don’t use this flag under X11 you may not register events.
Thus ends our first SDL !!Science!! Can you taste the learning?