Performance Optimizations

Short List of Things To Do To Improve The Performance of DPSF


This is the TL;DR version of the more detailed explanations in the following section.


1. If a particle system is not needed for a while, set its Enabled property to false.

2. Experiment with the UpdatesPerSecond property. You can often lower it to around 40 updates per second without any perceptible difference.

3. Disable the AutoMemoryManager if you don't need to use it.

4. If you have enabled performance profiling, turn it off before making a release build of your app.


Hint: Most of these settings can be globally configured using the static DPSFDefaultSettings class.



Longer and More Detailed Explanations of Things To Do To Improve The Performance of DPSF


Here is a list of some optimizations that may be performed to increase speed and/or reduce memory when using DPSF particle systems:


Disable particle systems when they are not in use.  Even if a particle system does not currently have any particles, some processing is still performed, such as calculating whether enough time has passed to perform the next update, calling the particle system's overridden BeforeUpdate() and AfterUpdate() functions, processing any ParticleSystemEvents, updating the Emitter's position and orientation, and calculating how many particles the Emitter should add to the particle system.
To disable a particle system, set its Enabled property to false.  Do this instead of setting its SimulationSpeed to zero: when the SimulationSpeed is zero, all particles are still updated using an ElapsedTime of zero, whereas no processing is done on the particles at all when the Enabled property is false.


Improve speed by updating the particle systems less often.  Set the particle system's UpdatesPerSecond property to a value like 40.  The Particle System Manager class also provides a function to set this property for all particle systems in the Manager.  NOTE: If you set the UpdatesPerSecond value to be too low, the particle system animation may appear choppy since it is not being updated often enough.  Also, if collision detection is used for the particles, potential collisions may be missed if the particle system is not updated often enough.
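To make the effect of UpdatesPerSecond concrete, here is a minimal sketch of one common way to throttle updates to a fixed rate.  The UpdateThrottle class and its Advance() method are hypothetical names used for illustration only; this is not DPSF's actual implementation, just an example of the "has enough time passed to perform the next update" check under those assumptions.

```csharp
using System;

// Hypothetical stand-in for illustration only; NOT part of DPSF.
// Accumulates frame time and reports how many fixed-rate updates are due.
public sealed class UpdateThrottle
{
    private readonly double _secondsPerUpdate;
    private double _accumulatedSeconds;

    public UpdateThrottle(int updatesPerSecond)
    {
        if (updatesPerSecond <= 0)
            throw new ArgumentOutOfRangeException("updatesPerSecond");

        _secondsPerUpdate = 1.0 / updatesPerSecond;
    }

    // Call once per frame with the time elapsed since the previous frame.
    // Returns the number of particle updates that are due this frame.
    public int Advance(double frameSeconds)
    {
        _accumulatedSeconds += frameSeconds;

        // Whole update intervals contained in the accumulated time.
        int updatesDue = (int)(_accumulatedSeconds / _secondsPerUpdate);

        // Keep the leftover time so no elapsed time is lost between frames.
        _accumulatedSeconds -= updatesDue * _secondsPerUpdate;
        return updatesDue;
    }
}
```

With a throttle like this, any frame for which Advance() returns 0 skips the per-particle work entirely, which is where the savings come from.  It also shows why too low a rate looks choppy: particles visibly stand still on the skipped frames.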


Create your own Particle class and Vertex struct to save memory (the default Particle classes carry many properties in order to offer more functionality out of the box).  This may not be much of a problem when creating PC apps, since most PCs these days have at least 1GB of RAM.  However, if you are creating particle systems for the Xbox 360 or Windows Phone, which have limited memory, you may want to create your own Particle class that contains only the properties your particle system needs, especially if the particle system uses thousands of particles, as all of those unused properties may take up a lot of memory.


If you like, you may Initialize() a particle system when it is needed and Destroy() it when you are finished with it.  If initializing and destroying particle systems at run-time causes stuttering (i.e. the frame rate drops for a moment), you may instead Initialize() the particle system while loading other resources (such as when loading a level) and simply set its Enabled property to false when it is not in use.  This will allocate the required space in memory for the particle system, but no processing will be done until the particle system's Enabled property is set to true.  You can then Destroy() the particle system when unloading all of the other resources (such as once the level is complete).


Try to set the NumberOfParticlesAllocatedInMemory and MaxNumberOfParticlesAllowed particle system properties to reasonable values.  These are initially set using the parameters provided to the particle system's InitializeXParticleSystem() function, where X is the type of particle system.  For example, do not set the NumberOfParticlesAllocatedInMemory to 10,000 when you know that the particle system will never contain more than 100 particles.  By default, most of the particle systems provided by DPSF for learning, as well as the templates, initially set the NumberOfParticlesAllocatedInMemory to 1,000 and the MaxNumberOfParticlesAllowed to 50,000, so be sure to adjust these values appropriately if using the particle system in your application.  If the Auto Memory Manager is enabled to increase the number of particles allocated in memory (see next point), the NumberOfParticlesAllocatedInMemory can grow as needed up to the MaxNumberOfParticlesAllowed, so you may want to lower that limit as well.


By default, the Auto Memory Manager is enabled to increase and decrease the number of particles allocated in memory as needed.  It will automatically increase the NumberOfParticlesAllocatedInMemory at run-time when needed (i.e. when AddParticle() is called while the NumberOfActiveParticles is already equal to the NumberOfParticlesAllocatedInMemory, and the MaxNumberOfParticlesAllowed is greater than the NumberOfParticlesAllocatedInMemory), and decrease the NumberOfParticlesAllocatedInMemory when many of the allocated particles have not been used for a certain length of time.  Whenever the NumberOfParticlesAllocatedInMemory is adjusted, memory is reallocated, which may cause a performance stutter while the new memory is being allocated.  To avoid this, you can disable the Auto Memory Manager by setting AutoMemoryManagerSettings.MemoryManagementMode = AutoMemoryManagerModes.Disabled.  When doing this, however, be sure that the NumberOfParticlesAllocatedInMemory is high enough to support the number of particles that you want the particle system to display, but not so high that a lot of memory is allocated and never used by the particle system.
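The grow-on-demand condition described above can be sketched as follows.  ParticlePoolSketch and all of its members are hypothetical stand-ins, not DPSF types.  The doubling growth policy and the behaviour of refusing new particles when the pool is full are assumptions made for illustration; DPSF's real policy may differ.

```csharp
using System;

// Hypothetical stand-in for illustration only; NOT part of DPSF.
public sealed class ParticlePoolSketch
{
    public int Allocated { get; private set; }          // like NumberOfParticlesAllocatedInMemory
    public int MaxAllowed { get; private set; }         // like MaxNumberOfParticlesAllowed
    public int Active { get; private set; }             // like NumberOfActiveParticles
    public bool AutoGrow { get; private set; }          // whether automatic growth is enabled
    public int ReallocationCount { get; private set; }  // each one is a potential stutter

    public ParticlePoolSketch(int allocated, int maxAllowed, bool autoGrow)
    {
        if (allocated < 1 || maxAllowed < allocated)
            throw new ArgumentException("Require 1 <= allocated <= maxAllowed.");

        Allocated = allocated;
        MaxAllowed = maxAllowed;
        AutoGrow = autoGrow;
    }

    // Returns false when the pool is full and is not allowed to grow.
    public bool TryAddParticle()
    {
        if (Active == Allocated)
        {
            if (!AutoGrow || Allocated >= MaxAllowed)
                return false;

            // Assumed growth policy: double the pool, capped at the maximum.
            Allocated = Math.Min(MaxAllowed, Allocated * 2);
            ReallocationCount++;
        }

        Active++;
        return true;
    }
}
```

The sketch shows the trade-off in miniature: with growth enabled the pool never refuses a particle until it reaches the maximum, but each growth step is a reallocation; with growth disabled there are no reallocations, so the initial allocation has to be large enough on its own.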


The Particle System Manager class and DrawableGameComponents draw the particle systems in the order given by the particle systems' DrawOrder property.  If you are using multiple particle systems and you know that some of them use the same Texture image file, then you should give the particle systems that share a Texture the same DrawOrder property value.  This will prevent unnecessary texture swapping.  For example, if particle systems A and C use texture 1, and particle system B uses texture 2, then you would want the particle systems to be drawn in the order ACB or BAC so that the texture would only need to be swapped once.  If the particle systems were drawn in the order ABC, then the texture would need to be swapped twice.  Texture swapping at run-time is expensive and should be avoided whenever possible.
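The A, B, C example can be checked with a small helper that counts texture swaps for a given draw order.  TextureSwapCounter is a hypothetical utility written for this illustration and is not part of DPSF; it simply counts how often consecutive draws use different textures.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; NOT part of DPSF.
public static class TextureSwapCounter
{
    // Takes the texture name used by each particle system, listed in draw order,
    // and returns how many times the texture changes after the first draw.
    public static int CountSwaps(IList<string> texturesInDrawOrder)
    {
        int swaps = 0;
        for (int i = 1; i < texturesInDrawOrder.Count; i++)
        {
            if (texturesInDrawOrder[i] != texturesInDrawOrder[i - 1])
                swaps++;
        }
        return swaps;
    }
}
```

For the order ACB the input is { "texture1", "texture1", "texture2" }, which gives one swap; for ABC the input is { "texture1", "texture2", "texture1" }, which gives two.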


Sometimes you may be able to get better speed performance by drawing fewer large particles rather than many small particles (you will always get better memory performance with this technique, as fewer particles means less memory consumption).  This is not always the case, however.  The size of a particle drawn on the screen has a direct impact on speed performance, so sometimes drawing many small particles is faster than drawing fewer large ones.  The on-screen size matters because the larger a particle appears, the more pixels the pixel shader needs to color.  For example, if we have 100 particles, and they each have a size of 10 width x 10 height = 100 pixels, then the pixel shader will need to color 100 particles x 100 pixels per particle = 10,000 pixels to draw all of the particles.  Now if we double the size of the particles on the screen (either by doubling the Size property of the particle to 20, or by moving the camera closer to the particles), then each particle will have a size of 20 x 20 = 400 pixels on screen, so the pixel shader will need to color 100 particles x 400 pixels per particle = 40,000 pixels.  In this example we doubled the size of the particles (i.e. from 10 to 20), but the number of pixels to color was multiplied by 4 (i.e. from 10,000 pixels to 40,000 pixels).
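The arithmetic in this example can be written as a simple formula.  As a rough model, assume N square particles that each cover s x s pixels on screen, and ignore particles that are clipped or hidden; the symbols P, N and s are introduced here only for illustration.

```latex
\[
P(N, s) = N \cdot s^{2}, \qquad
P(100, 10) = 100 \cdot 10^{2} = 10{,}000, \qquad
P(100, 20) = 100 \cdot 20^{2} = 40{,}000
\]
\[
\frac{P(N, 2s)}{P(N, s)} = \frac{N \cdot (2s)^{2}}{N \cdot s^{2}} = 4
\]
```

Under this model the pixel shader's workload is linear in the number of particles but quadratic in their on-screen size.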


So this explains why drawing larger sprites affects performance, but there are a few caveats worth mentioning:


1) As an example, let's say we are creating a fire effect that draws 100 particles. Because each particle is semi-transparent (i.e. we are using alpha blending), all 100 particles will need to be drawn by the pixel shader, since particles will be slightly visible through other particles. If the particles were not transparent (i.e. were solid), then the pixel shader would only need to draw the particles closest to the camera (if depth buffer writes are enabled), since they would hide the particles behind them; so when using non-transparent particles the pixel shader might end up drawing only 10 or 20 of the 100 particles, increasing performance. However, using non-transparent particles for a fire effect would likely not look very good at all, so this is only an option for certain effects.  Depth buffer writes can be enabled by overriding the InitializeRenderProperties() function in your particle system class, like so:


               protected override void InitializeRenderProperties()
               {
                       RenderProperties.DepthStencilState.DepthBufferWriteEnable = true; // Turn on depth (z-buffer) writes
               }



2) You may notice that when you turn the camera so that the large fire particles are not visible at all, your frame-rate is much higher, and it is only when you are actually looking at the fire that your frame-rate drops. This is because the pixel shader does not actually draw the particles unless they are in view.


So should I use lots of small particles, or fewer large ones? It depends:


Drawing lots of small particles

If you draw lots of small particles, there is less chance of the GPU's pixel shader becoming a bottleneck.  However, while the pixel shader might have less work to do, the CPU will have more work, since it will have more particles to process (i.e. updating the position, orientation, etc. of each particle every frame). This can be partially offset by specifying a lower value for the particle system's UpdatesPerSecond property, as mentioned above.


Drawing fewer large particles

As mentioned, drawing larger particles puts more stress on the pixel shader, but frees up the CPU to do other operations, since there are fewer particles to update. The operations to update the particles are performed whether or not the particles are in view, but the pixel shader only draws particles that are actually in view. So with this method, the frame rate may be poorer when many particles are in view, but it will be better when only a few (or none) of them are in view. This is why it can sometimes offer better overall performance than drawing many small particles.


So as for which method is better, it's tough to say. You will generally need to try both and see which gives the best performance. Also, try finding a middle ground (i.e. draw a medium number of particles of a medium size) and see what type of performance that gives you.  The speed performance hit grows with the square of the particle size (doubling the size quadruples the number of pixels to color), so instead of cranking your particle size from 10 to 70, try changing it to 20 or 30 and adjusting the number of particles, then see how it looks and what type of performance you get.