In high-performance software, the built-in memory allocators such as new and malloc can cause performance issues because of the kernel calls that may be needed to fulfill allocation requests. Games are very much high-performance software, since the time spent processing each frame affects the overall experience of the user. The desired time between frames is usually somewhere between 1/30 and 1/60 of a second, so any time that can be saved during frame processing is valuable. Because allocations that reach the kernel are slow due to context switching, a custom memory allocator can greatly improve the performance of a game that makes many allocations per frame.
One of the simplest forms of memory allocation is stack-based allocation. This kind of allocator requests one large chunk of memory from the OS up front and then services memory requests from that block. Every request simply moves the top of the stack up by the size of the requested block. The downside of a stack is that memory must be freed in the opposite order from which it was allocated, which can be tricky. One way to deal with this is to allow the stack to be "marked" and later rolled back to the marker, freeing all memory allocated since the marker was taken. This is the approach I used for this allocator, and it works out pretty well.
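To make the mark-and-rollback idea concrete, here is a minimal sketch of how such an allocator could look. It is not the class from my repository: the names StackAllocator, Marker, mark, rollback, and clear are hypothetical, and the sketch assumes that alignments are powers of two no larger than alignof(std::max_align_t).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Illustrative sketch only; all names here are assumptions.
class StackAllocator {
public:
    // A marker is just the offset of the stack top at the time it was taken.
    using Marker = std::size_t;

    // Grab one large block up front; later requests are served from it.
    explicit StackAllocator(std::size_t capacity)
        : base_(static_cast<std::uint8_t*>(std::malloc(capacity))),
          capacity_(base_ ? capacity : 0),
          top_(0) {}

    ~StackAllocator() { std::free(base_); }

    StackAllocator(const StackAllocator&) = delete;
    StackAllocator& operator=(const StackAllocator&) = delete;

    // Bump the top of the stack by `size` bytes. Returns nullptr when the
    // request does not fit. Offsets are aligned relative to the malloc'd
    // base, so `alignment` must be a power of two that does not exceed
    // alignof(std::max_align_t).
    void* allocate(std::size_t size,
                   std::size_t alignment = alignof(std::max_align_t)) {
        std::size_t aligned = (top_ + alignment - 1) & ~(alignment - 1);
        if (aligned > capacity_ || size > capacity_ - aligned) {
            return nullptr;
        }
        top_ = aligned + size;
        return base_ + aligned;
    }

    // Remember the current top so everything allocated later can be freed at once.
    Marker mark() const { return top_; }

    // Free every block allocated since `m` was taken. Markers above the
    // current top are ignored.
    void rollback(Marker m) {
        if (m <= top_) {
            top_ = m;
        }
    }

    // Free everything in one step.
    void clear() { top_ = 0; }

    std::size_t used() const { return top_; }

private:
    std::uint8_t* base_;
    std::size_t capacity_;
    std::size_t top_;
};
```

A typical per-frame pattern with this sketch would be to call mark() at the start of the frame, allocate() as often as needed while processing it, and rollback() to the marker at the end. Nothing is freed individually, so the cost of releasing a whole frame's worth of memory is a single assignment.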
This allocator works very well with very few lines of code. The whole allocator class fits into 113 lines, including white space and comments. When tested against the stock allocator new, it runs about 3 times as fast when doing lots of small allocations, which is an excellent improvement. In my implementation I've also added the option to zero out memory when allocating, rolling back, and wiping the stack, so there is a little flexibility in how the allocator works.
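As a rough illustration of the optional zeroing, the sketch below clears memory at the three points mentioned above. It is an assumption about how this could be done, not a copy of my class: the name ZeroingStack and the single constructor flag zero_memory are hypothetical, and alignment handling is left out to keep the example short.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Illustrative sketch only; names and the single on/off flag are assumptions.
class ZeroingStack {
public:
    ZeroingStack(std::size_t capacity, bool zero_memory)
        : base_(static_cast<unsigned char*>(std::malloc(capacity))),
          capacity_(base_ ? capacity : 0),
          top_(0),
          zero_(zero_memory) {}

    ~ZeroingStack() { std::free(base_); }

    ZeroingStack(const ZeroingStack&) = delete;
    ZeroingStack& operator=(const ZeroingStack&) = delete;

    // Hand out the next `size` bytes, zeroed first if the flag is set.
    void* allocate(std::size_t size) {
        if (size > capacity_ - top_) {
            return nullptr;
        }
        unsigned char* block = base_ + top_;
        top_ += size;
        if (zero_ && size > 0) {
            std::memset(block, 0, size);
        }
        return block;
    }

    std::size_t mark() const { return top_; }

    // Roll back to `marker`, optionally clearing the bytes being released.
    void rollback(std::size_t marker) {
        if (marker > top_) {
            return;  // not a valid marker for the current stack state
        }
        if (zero_ && top_ > marker) {
            std::memset(base_ + marker, 0, top_ - marker);
        }
        top_ = marker;
    }

    // Release everything, optionally clearing the bytes that were in use.
    void wipe() {
        if (zero_ && top_ > 0) {
            std::memset(base_, 0, top_);
        }
        top_ = 0;
    }

    std::size_t used() const { return top_; }

private:
    unsigned char* base_;
    std::size_t capacity_;
    std::size_t top_;
    bool zero_;
};
```

Zeroing costs time proportional to the number of bytes touched, so leaving the flag off keeps allocation and rollback as cheap as possible, while turning it on can make stale-data bugs easier to spot.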
The allocator class and the profiling code can be found on my GitHub here: https://github.com/zachmakesgames/StackAllocator.