Memory management behavior is one of the first topics I wanted to understand in Node.js. This is part one of two articles in which I intend to explore:
- Memory management / GC options in the V8 VM that runs Node.js applications.
- Debugging / memory leak analysis for running Node.js servers.
At a high level, V8 uses a generational memory model with a copy collector and incremental mark-and-sweep. You’ve got control over the size of three different memory spaces: new, old, and code (although the old generation is further split into the map space (V8’s hidden class construct), the large object space, the cell space, the old data space, and the old pointer space). The young generation is collected using a scavenge copy collector, which is cheap and typically takes only a couple of milliseconds. The old generation is collected using an incremental mark-and-sweep with or without compaction, and is more expensive. Below I take a look at the various controls. Some of these are V8 options, and some are Node.js options.
Configuring V8 heap sizes
Out of memory errors? Don’t solve memory leaks – drown them in sweet tasty memory nectar.
--max_new_space_size (in kBytes)
Control the size of the new generation. Every app is different, but if you have large volumes of transient objects, you’ll want to make sure that this is large enough that they aren’t being needlessly promoted to the old generation. Note that unlike the other size flags, this is in KB, so go crazy.
--max_old_space_size (in Mbytes)
Control the size of the old generation. This is the nursing home of the memory model, so size accordingly.
--max_executable_size (in Mbytes)
Control the size of the code space, which holds compiled (executable) code.
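To check what limits a process actually ended up with, here is a minimal sketch that reads them back from inside Node. It assumes a Node version that ships the built-in v8 module (much newer than the releases these flags were first written for); the helper names heapSummary and spaceSummary are mine, not part of any API.

```javascript
const v8 = require('v8');

const MB = 1024 * 1024;

// Overall limits and usage for the current process, in megabytes.
function heapSummary() {
  const stats = v8.getHeapStatistics();
  return {
    heapSizeLimitMB: Math.round(stats.heap_size_limit / MB),
    totalHeapMB: Math.round(stats.total_heap_size / MB),
    usedHeapMB: Math.round(stats.used_heap_size / MB),
  };
}

// Per-space breakdown (the space names vary between V8 versions).
function spaceSummary() {
  return v8.getHeapSpaceStatistics().map((s) => ({
    name: s.space_name,
    sizeKB: Math.round(s.space_size / 1024),
    usedKB: Math.round(s.space_used_size / 1024),
  }));
}

console.log(heapSummary());
console.log(spaceSummary());
```

Starting the script with a different --max_old_space_size value should change the reported heapSizeLimitMB, which is a quick way to confirm that a flag was picked up.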
Controlling when GCs occur in V8
By default, V8 will perform garbage collections on failed allocations, after every N allocations, and on a notification of idle condition by its embedder. You can also optionally configure Node.js to allow manual invocation of the garbage collector. Here are the relevant knobs:
This controls whether V8 will do automatic garbage collection after every gc_interval allocations.
The frequency at which V8 will perform its automatic garbage collection, in number of allocations.
V8 allows the embedder (Node.js in my case) to send idle notifications, and those in turn trigger garbage collections. When these occur depends on Node.js and the algorithm it has in place for sending notifications. Many Node users have turned this off, as the GCs can become frequent and disruptive when the heap is over 128MB. Future iterations of Node.js plan to update and improve this (https://github.com/joyent/
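As a rough illustration of manual invocation: when node is started with the --expose-gc option, a gc() function appears on the global object. The sketch below is defensive about that, since the function simply does not exist without the flag. The helper name collectNow is hypothetical.

```javascript
// Returns heap usage before and after a forced collection, or null when
// the process was not started with --expose-gc.
function collectNow() {
  if (typeof global.gc !== 'function') {
    return null;
  }
  const heapUsedBefore = process.memoryUsage().heapUsed;
  global.gc(); // forces a full collection
  const heapUsedAfter = process.memoryUsage().heapUsed;
  return { heapUsedBefore, heapUsedAfter };
}

const result = collectNow();
console.log(
  result === null
    ? 'gc() is not exposed; start node with --expose-gc'
    : result
);
```

Forcing collections like this is mostly useful for experiments and leak hunting; in a production server it pauses the process just like any other full GC.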
Tracing and monitoring
The VM provides multiple options to allow for logging garbage collection events and statistics. The most basic of these is:
--trace_gc
This will cause a message to be written to standard output on every garbage collection event. Here is an example of the output:
1826068 ms: Mark-sweep 13.8 (52.1) -> 13.6 (52.1) MB, 24 ms [idle notification ...
1909752 ms: Scavenge 14.9 (52.1) -> 14.1 (52.1) MB, 1 ms [Runtime::PerformGC].
1982853 ms: Scavenge 15.0 (52.1) -> 14.4 (52.1) MB, 1 ms [allocation failure].
The first field is when the GC occurred, in milliseconds since the process started. The second is the type of collection: Scavenge or Mark-sweep. This is followed by the before and after memory usage status, the time the GC took to execute, and the trigger of the GC. In the example above, the Mark-sweep was triggered by an idle notification, the first Scavenge by an automatic GC, and the second by an allocation failure.
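If you want to aggregate these lines rather than eyeball them, here is a small sketch of a parser for the format described above. It only targets lines shaped like the sample (other V8 versions print a different layout), and the field names are my own. I am assuming the number in parentheses is the total heap size.

```javascript
// Matches lines shaped like:
// "1909752 ms: Scavenge 14.9 (52.1) -> 14.1 (52.1) MB, 1 ms [Runtime::PerformGC]."
const TRACE_GC_LINE =
  /^\s*(\d+) ms: (Scavenge|Mark-sweep) ([\d.]+) \(([\d.]+)\) -> ([\d.]+) \(([\d.]+)\) MB, (\d+) ms \[([^\]]*)/;

function parseTraceGcLine(line) {
  const m = TRACE_GC_LINE.exec(line);
  if (m === null) return null; // not a GC trace line in this format
  return {
    timestampMs: Number(m[1]),   // ms since process start
    type: m[2],                  // "Scavenge" or "Mark-sweep"
    beforeMB: Number(m[3]),
    beforeTotalMB: Number(m[4]), // value in parentheses (assumed: total heap)
    afterMB: Number(m[5]),
    afterTotalMB: Number(m[6]),
    pauseMs: Number(m[7]),
    trigger: m[8].trim(),
  };
}

const sampleLines = [
  '1826068 ms: Mark-sweep 13.8 (52.1) -> 13.6 (52.1) MB, 24 ms [idle notification ...',
  '1909752 ms: Scavenge 14.9 (52.1) -> 14.1 (52.1) MB, 1 ms [Runtime::PerformGC].',
  '1982853 ms: Scavenge 15.0 (52.1) -> 14.4 (52.1) MB, 1 ms [allocation failure].',
];

const events = sampleLines.map(parseTraceGcLine);
const totalPauseMs = events.reduce((sum, e) => sum + e.pauseMs, 0);
console.log(events.map((e) => `${e.type}: ${e.pauseMs} ms (${e.trigger})`));
console.log('total pause ms:', totalPauseMs);
```

Summing pauseMs per collection type over a long run gives a quick picture of how much time the old generation collections are costing compared to scavenges.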
This option can be combined with trace_gc to provide a breakdown of memory allocation by each of the defined memory spaces.
Memory allocator, used: 37732352, available: 1497382912
New space, used: 0, available: 1048576
Old pointers, used: 1546168, available: 10056, waste: 0
Old data space, used: 1228544, available: 0, waste: 0
Code space, used: 625184, available: 394720, waste: 0
Map space, used: 75712, available: 55360, waste: 0
Cell space, used: 27440, available: 70864, waste: 0
Large object space, used: 0, available: 1496317696
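The per-space breakdown is easy to turn into numbers as well. This sketch assumes each space is reported on its own line in the "name, used: N, available: N, waste: N" shape shown above; the 5% headroom threshold is arbitrary and only there to show one way the data might be used.

```javascript
const SPACE_LINE = /^(.+?), used: (\d+), available: (\d+)(?:, waste: (\d+))?$/;

function parseSpaceLine(line) {
  const m = SPACE_LINE.exec(line.trim());
  if (m === null) return null;
  return {
    space: m[1],
    used: Number(m[2]),       // bytes
    available: Number(m[3]),  // bytes
    waste: m[4] === undefined ? null : Number(m[4]), // not printed for every space
  };
}

const sampleSpaces = [
  'New space, used: 0, available: 1048576',
  'Old pointers, used: 1546168, available: 10056, waste: 0',
  'Code space, used: 625184, available: 394720, waste: 0',
].map(parseSpaceLine);

// Spaces with less than 5% of their current size still available.
const nearlyFull = sampleSpaces
  .filter((s) => s.available / (s.used + s.available) < 0.05)
  .map((s) => s.space);

console.log(nearlyFull);
```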
There are a variety of other flavors of information you can print. Use these options:
--trace_gc_nvp (print one detailed trace line in name=value format)
--print_cumulative_gc_stat (print cumulative GC statistics on exit)
--trace_fragmentation (report fragmentation for old pointer / data)
--trace_incremental_marking (trace progress of incremental marking)
--log_gc (log heap samples on garbage collection for the hp2ps tool)
Only If You Dare Tuning
V8 provides several other options, but they are for those with an advanced, intimate understanding of V8 and garbage collection. Thanks to Vyacheslav Egorov, V8 garbage collection guru, for helping to provide most of this information.
--incremental_marking (default: true)
If you’d like to turn off incremental mark-and-sweep in favor of a stop-the-world collection, you can do that here.
This was added during the development of incremental marking, and served as a tool for validating that feature of the garbage collector. It has no practical purpose for V8 users today.
--always_compact (default: false)
Always compact after an old generation collection. This might be useful in unusual cases where constant memory fragmentation is an issue.
Never perform compaction after a full GC. This is another option that exists for testing purposes only.
--compact_code_space (default: true)
Compact code space on full non-incremental collections.
--flush_code (default: true)
When this option is on, garbage collection will try to discard compiled code to reduce memory consumption. If the discarded code is needed in the future, it is recompiled. Currently, only non-optimized code is flushed during this process, but V8 may flush generated optimized code in the future as well.
--collect_maps (default: true)
Here is a quote from Vyacheslav Egorov that explains this better than I ever could:
The collect_maps option implies that maps are not collected when it is turned off; however this is not true. Maps just start to die in a different way. When collect_maps is on then edges in map transition trees are reversed during garbage collection and trees start dying from their leaves: paths that do not lead to any live objects are cleared from the tree, but roots are retained. When collect_maps is off transition trees are just dying from their roots like a normal trees… It’s likely that less maps will die this way due to connection between initial map and the constructor, but fully dead trees will be surely reclaimed. In some cases application can exhibit pathological patterns of hidden classes construction that cause performance degradation due to frequent GCs when --collect_maps is enabled because parts in transition tree reappear again and again after being pruned as dead. But I would not recommend to touch this flag unless you deeply understand how V8’s hidden classes work.
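For readers who have not run into hidden classes before, here is a small illustration of how transition trees grow. It is only a sketch of the idea: plain JavaScript cannot observe maps directly, so the property order printed at the end is just a visible stand-in for the path each object took. The function names are hypothetical.

```javascript
// Every point adds x and then y, so all points can follow the same
// path in the transition tree: {} -> {x} -> {x, y}.
function makePoint(x, y) {
  const p = {};
  p.x = x;
  p.y = y;
  return p;
}

// Adds properties in whatever order the caller supplies, so each distinct
// order starts a new branch in the transition tree.
function makeRecord(fields) {
  const r = {};
  for (const [key, value] of fields) {
    r[key] = value;
  }
  return r;
}

const a = makePoint(1, 2);
const b = makePoint(3, 4);
const c = makeRecord([['y', 2], ['x', 1]]);

console.log(Object.keys(a)); // same order as b
console.log(Object.keys(b));
console.log(Object.keys(c)); // y was added first
```

Code that builds many short-lived objects with ever-changing property orders is the kind of pattern that keeps creating and abandoning branches of the tree.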
--lazy_sweeping (default: true)
Use lazy sweeping for old pointer and data spaces.
--cleanup_code_caches_at_gc (default: true)
This option clears inline caches and other supplemental caches used by V8 to collect type feedback and adapt to the application. Flushing them reduces memory pressure (caches use small compiled code stubs that can reference other objects, e.g. hidden classes that are no longer “relevant”) and ensures that feedback is not stale.