Original article: https://auth0.com/blog/2016/01/26/four-types-of-leaks-in-your-javascript-code-and-how-to-get-rid-of-them/

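In this article we will explore several common types of memory leaks in client-side JavaScript code, and learn how to use the Chrome Developer Tools to find them. Let's get started!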

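Introduction

Memory leaks are a problem almost every developer runs into eventually. Even in languages with built-in memory management, memory can still leak. Leaks cause a whole class of problems: slowdowns, crashes, and high latency.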

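What are memory leaks?

In essence, a memory leak is memory that the application no longer needs but that has not been returned to the operating system. Different languages manage memory in different ways. However, whether a given piece of memory is still in use is actually an undecidable problem. In other words, only the developer can really tell whether a piece of memory is still needed. Some languages provide automatic memory management to make development easier, while others expect the developer to state explicitly when a piece of memory is no longer used. Wikipedia has good articles on manual and automatic memory management.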

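Memory management in JavaScript

JavaScript is one of the many garbage-collected (GC) languages. These languages help developers manage memory by periodically checking whether previously allocated memory can still be reached. In other words, garbage-collected languages reduce the memory management problem from "which memory is still needed?" to "which memory can still be reached by the program?". The difference looks subtle, but it matters: only the developer knows which memory will be needed in the future, whereas reachability can be determined by an algorithm, which then marks unreachable memory for return to the operating system.

Non-GC languages usually manage memory in other ways: explicit management, where the developer tells the compiler when a piece of memory is no longer needed, or reference counting, where every block of memory has an associated use counter and the block is reclaimed when the counter reaches 0. Each technique comes with its own trade-offs, and each can easily leak memory when used incorrectly.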
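
As a rough illustration of the reference-counting trade-off, the sketch below models a toy counter in JavaScript. The `Counted` class and its `retain`, `release`, and `pointTo` methods are hypothetical names invented for this example and do not correspond to any real runtime API. It shows that a simple chain is freed as expected, while two blocks that point at each other never reach a count of zero.

```javascript
// Toy model of reference counting. `Counted`, `retain`, `release`, and
// `pointTo` are hypothetical names for this sketch; no real runtime exposes them.
class Counted {
  constructor(name) {
    this.name = name;
    this.count = 0;     // number of references pointing at this block
    this.freed = false;
    this.refs = [];     // blocks this block points at
  }
  retain() {
    this.count += 1;
    return this;
  }
  release() {
    this.count -= 1;
    if (this.count === 0 && !this.freed) {
      this.freed = true;
      // Freeing a block drops the references it held.
      this.refs.forEach((child) => child.release());
      this.refs = [];
    }
  }
  pointTo(other) {
    this.refs.push(other.retain());
  }
}

// A simple chain is freed completely once the last outside reference goes away.
const parent = new Counted('parent').retain();
const child = new Counted('child');
parent.pointTo(child);
parent.release();
console.log(parent.freed, child.freed); // true true

// A cycle keeps both counts above zero even with no outside references left.
const a = new Counted('a').retain();
const b = new Counted('b').retain();
a.pointTo(b);
b.pointTo(a);
a.release();
b.release();
console.log(a.freed, b.freed); // false false
```

A tracing collector such as mark-and-sweep does not have this particular weakness, because it asks whether a block is reachable from a root rather than whether its counter is zero.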

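Leaks in JavaScript

The most common cause of memory leaks in GC languages is unwanted references. To understand what an unwanted reference is, we first need to understand how a garbage collector decides whether a piece of memory is reachable.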

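Mark-and-sweep

Most garbage collectors are implemented with the mark-and-sweep algorithm, whose main steps are:

  1. The garbage collector maintains a list of roots (global variables referenced in code). In JS, the window or global object can be treated as a root. Because the window object always exists, it and all of its children are always considered reachable, so the memory behind them is never collected.
  2. Every root is inspected and marked as active, and then, starting from the roots, all children are inspected recursively. Everything that can be reached from a root is not considered garbage.
  3. Every piece of memory not marked is considered garbage, so the collector can reclaim it and return it to the OS for reuse.

Modern garbage collectors improve on this algorithm in various ways, but the essence is the same: memory that is not marked is treated as garbage and returned to the operating system.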
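
The mark and sweep phases can be sketched over a hand-built object graph. This is a toy model, not how any real engine is implemented: the `heap` layout (ids mapped to lists of referenced ids) and the `markAndSweep` helper are assumptions made for illustration, and the marking uses an explicit stack instead of recursion.

```javascript
// Toy mark-and-sweep over a hand-built object graph. The `heap` layout and the
// `markAndSweep` helper are assumptions for illustration only.
function markAndSweep(heap, rootIds) {
  const marked = new Set();
  const stack = [...rootIds];
  // Mark phase: walk every reference reachable from the roots.
  while (stack.length > 0) {
    const id = stack.pop();
    if (marked.has(id)) continue;
    marked.add(id);
    stack.push(...heap[id].refs);
  }
  // Sweep phase: everything left unmarked is garbage.
  const collected = Object.keys(heap).filter((id) => !marked.has(id));
  collected.forEach((id) => delete heap[id]);
  return collected.sort();
}

const heap = {
  window: { refs: ['cache', 'app'] },
  cache: { refs: ['entry'] },
  entry: { refs: [] },
  app: { refs: ['cache'] },
  orphanA: { refs: ['orphanB'] }, // cycle with no path from a root
  orphanB: { refs: ['orphanA'] },
};

console.log(markAndSweep(heap, ['window'])); // [ 'orphanA', 'orphanB' ]
```

Note that the two orphans reference each other and are still collected, because neither can be reached from the root.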

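Unwanted references are references to pieces of memory that the developer knows are no longer needed but that, for some reason, are still marked. In the context of JavaScript, unwanted references are variables kept somewhere in the code that will not be used again yet still point to memory that could otherwise be freed. Many would argue that these are mostly developer mistakes. So, to understand the most common JavaScript leaks, we need to know the common ways in which references get forgotten.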

Four types of common JavaScript leaks

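1: Accidental global variables

One of the original goals behind JavaScript was a language that looked like Java but was easy for beginners to pick up (translator's note: it does not look much like Java to me; if anything it looks like a product of C and Lisp). As a result, the language permits references to undeclared variables. An assignment to a variable without var, as below, is not a syntax error, but it means the variable can silently become a global.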

    function foo(arg) {
        bar = "this is a hidden global variable";
    }

is equivalent to:

    function foo(arg) {
        window.bar = "this is an explicit global variable";
    }

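If bar was only meant to be a temporary variable inside foo, leaving out var creates a global variable instead, and that memory leaks. Leaking a single string does no real harm in this example, but in real code things can clearly get much worse.

Another way to create an accidental global is through this: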

    function foo() {
        this.variable = "potential accidental global";
    }

    // Foo called on its own, this points to the global object (window)
    // rather than being undefined.
    foo();

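To avoid these mistakes, add 'use strict'; at the top of your code so it runs in strict mode. In strict mode, assigning to an undeclared variable throws a ReferenceError at run time, and this is undefined inside a plain function call, so neither pattern can silently create a global.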
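
The difference between the two modes can be shown side by side. The sketch below relies on the fact that functions built with the `Function` constructor run in sloppy mode unless their body opts into strict mode; the names `sloppyFoo`, `strictFoo`, `strictThis`, `leakedBar`, and `strictBar` are made up for the example, and `globalThis` stands in for `window` so that it also runs outside a browser.

```javascript
// Functions built with the Function constructor run in sloppy mode unless their
// body opts into strict mode, which makes the contrast easy to show.
const sloppyFoo = new Function('leakedBar = "this is a hidden global variable";');
const strictFoo = new Function('"use strict"; strictBar = "never created";');

sloppyFoo();
console.log(globalThis.leakedBar); // this is a hidden global variable

let strictError = null;
try {
  strictFoo();
} catch (err) {
  strictError = err;
}
console.log(strictError instanceof ReferenceError); // true
console.log('strictBar' in globalThis);             // false

// In strict mode `this` is undefined in a plain call, so `this.variable = ...`
// throws instead of silently writing to the global object.
const strictThis = new Function('"use strict"; this.variable = "oops";');
let thisError = null;
try {
  strictThis();
} catch (err) {
  thisError = err;
}
console.log(thisError instanceof TypeError); // true

delete globalThis.leakedBar; // clean up the accidental global
```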

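A note on global variables

Although we have just talked about accidental globals, that does not mean globals should never be used; code is still full of explicit global variables. By definition these cannot be collected unless they are set to null or reassigned. So when a global is used to hold data temporarily, it is best to set it to null once you are done so the memory can be reclaimed. A common source of ever-growing memory held by globals is caches. A cache needs an upper bound on its size; without one it keeps growing, because its contents are never collected automatically.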
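
One possible way to put an upper bound on a cache is to evict the oldest entry once a limit is reached. The `BoundedCache` class below is a hypothetical helper written for this note, not a built-in, and the eviction policy (oldest inserted or refreshed entry goes first) is just one reasonable choice.

```javascript
// Minimal bounded cache: once `maxEntries` is exceeded, the oldest entry is
// evicted. `BoundedCache` is a hypothetical helper, not a built-in.
class BoundedCache {
  constructor(maxEntries) {
    this.maxEntries = maxEntries;
    this.entries = new Map(); // Map iterates in insertion order
  }
  get(key) {
    return this.entries.get(key);
  }
  set(key, value) {
    if (this.entries.has(key)) this.entries.delete(key); // refresh position
    this.entries.set(key, value);
    if (this.entries.size > this.maxEntries) {
      const oldestKey = this.entries.keys().next().value;
      this.entries.delete(oldestKey);
    }
  }
  get size() {
    return this.entries.size;
  }
}

const cache = new BoundedCache(2);
cache.set('a', 1);
cache.set('b', 2);
cache.set('c', 3); // evicts 'a'
console.log(cache.size, cache.get('a'), cache.get('c')); // 2 undefined 3
```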

2: Forgotten timers or callbacks

The use of setInterval is quite common in JavaScript. Other libraries provide observers and other facilities that take callbacks. Most of these libraries take care of making any references to the callback unreachable after their own instances become unreachable as well. In the case of setInterval, however, code like this is quite common:

    var someResource = getData();
    setInterval(function() {
        var node = document.getElementById('Node');
        if(node) {
            // Do stuff with node and someResource.
            node.innerHTML = JSON.stringify(someResource);
        }
    }, 1000);

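The example above shows what can happen with dangling timers: timers that reference nodes or data that are no longer needed. Even if the node is removed later and we expect the data to be collected, the handler cannot be collected while the interval is still active, and neither can anything it depends on, such as someResource. The interval has to be stopped first.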
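
A minimal sketch of one way to avoid a dangling timer: keep the interval id and clear it as soon as the node is gone. `createRenderer` and `startPolling` are hypothetical helpers, and a plain object stands in for the DOM node so the sketch also runs outside a browser.

```javascript
// `createRenderer` and `startPolling` are hypothetical helpers for this sketch.
function createRenderer(getNode, resource, cancel) {
  return function tick() {
    const node = getNode();
    if (!node) {
      cancel(); // the node is gone: stop the timer so `resource` can be freed
      return false;
    }
    node.innerHTML = JSON.stringify(resource);
    return true;
  };
}

function startPolling(getNode, resource, intervalMs) {
  let id = null;
  const tick = createRenderer(getNode, resource, () => clearInterval(id));
  id = setInterval(tick, intervalMs);
  return function stop() {
    clearInterval(id);
  };
}

// Usage with a stand-in for document.getElementById('Node').
let fakeNode = { innerHTML: '' };
const stop = startPolling(() => fakeNode, { answer: 42 }, 1000);
// ...later, when the node is removed from the page:
fakeNode = null;
stop();
```

Once the interval is cleared, nothing references the handler any more, so the handler and the data it captured become collectable.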

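In the past, some browsers (IE6) could not handle cyclic references well, which could leak memory. For code that uses subscriptions, this meant the subscription had to be removed explicitly before the observed object was disposed of. Modern browsers handle this class of problem correctly, but it remains good practice to explicitly unsubscribe once a subscription is no longer needed. For example: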

    var element = document.getElementById('button');

    function onClick(event) {
        element.innerHTML = 'text';
    }

    element.addEventListener('click', onClick);
    // Do stuff
    element.removeEventListener('click', onClick);
    element.parentNode.removeChild(element);
    // Now when element goes out of scope,
    // both element and onClick will be collected even in old browsers that don't
    // handle cycles well.
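
To make the unsubscribe step harder to forget, registration can hand back a function that undoes it. The `listen` helper below is hypothetical; it uses the standard `EventTarget` and `Event` classes, which are available in browsers and, as far as I know, as globals in recent Node.js versions, so a plain `EventTarget` stands in for the button.

```javascript
// `listen` is a hypothetical helper: it registers a handler and hands back a
// function that removes it again.
function listen(target, type, handler) {
  target.addEventListener(type, handler);
  return function unsubscribe() {
    target.removeEventListener(type, handler);
  };
}

const target = new EventTarget(); // stand-in for a DOM element
let clicks = 0;
const unsubscribe = listen(target, 'click', () => { clicks += 1; });

target.dispatchEvent(new Event('click'));
unsubscribe();
target.dispatchEvent(new Event('click')); // handler is gone, not counted
console.log(clicks); // 1
```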

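A note about object observers and cyclic references

For JavaScript developers, observers and cyclic references used to be a major source of memory leaks. Old versions of IE could not detect cyclic references between DOM nodes and JavaScript code because of a bug in their garbage collection algorithm. The problem showed up especially with event listeners, which usually keep a reference to the thing they observe. In other words, every time a callback was added to a DOM node in old IE, a leak was possible. This is why many developers explicitly remove event handlers before disposing of a DOM node. Nowadays, mainstream browsers (including recent versions of IE and Edge) use newer garbage collection algorithms that detect cycles and handle them correctly. In other words, it is no longer strictly necessary to call removeEventListener before making a node unreachable.

Libraries and frameworks such as jQuery hide this problem even in old versions of IE: they remove listeners internally before disposing of a node (when their own APIs are used for that), so no memory is leaked.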

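3: Out of DOM references

Sometimes we store DOM nodes inside data structures, and this is quite common. Suppose we want to update the contents of several rows in a table quickly: the obvious approach is to keep a reference to each DOM row in a dictionary or array. Once we do, every such node is referenced from two places: the DOM tree and the dictionary or array. If the rows need to be removed later, both references must be made unreachable. Consider this example: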

    var elements = {
        button: document.getElementById('button'),
        image: document.getElementById('image'),
        text: document.getElementById('text')
    };

    function doStuff() {
        elements.image.src = 'http://some.url/image';
        elements.button.click();
        console.log(elements.text.innerHTML);
        // Much more logic
    }

    function removeButton() {
        // The button is a direct child of body.
        document.body.removeChild(document.getElementById('button'));
        // At this point, we still have a reference to #button in the global
        // elements dictionary. In other words, the button element is still in
        // memory and cannot be collected by the GC.
    }
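
A minimal sketch of the fix: when a node is removed from the tree, clear the dictionary entry as well. Plain objects stand in for `document.body` and the button so the sketch runs without a browser; the `body` object and its methods are assumptions of this example, not real DOM code.

```javascript
// Stand-ins for DOM nodes so the sketch runs without a browser; in a page,
// `body` would be document.body and the dictionary entries real elements.
const body = {
  children: [],
  appendChild(node) {
    this.children.push(node);
    return node;
  },
  removeChild(node) {
    this.children = this.children.filter((child) => child !== node);
    return node;
  },
};
const elements = { button: body.appendChild({ id: 'button' }) };

function removeButton() {
  body.removeChild(elements.button);
  // Also drop the dictionary entry, otherwise the detached node stays reachable.
  elements.button = null;
}

removeButton();
console.log(body.children.length, elements.button); // 0 null
```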

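There is one more thing to watch for when referencing inner or leaf nodes of a DOM tree. Suppose we keep a reference to one specific cell of a table, and at some point the table is removed from the DOM while the reference to the cell remains. You might expect the GC to collect everything except that cell. In practice this does not happen: the cell is a child of the table and keeps a reference to its parent, so the memory for the entire table cannot be collected. Keep this in mind whenever you hold references to DOM elements.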

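4: Closures

Closures are one of the most important features of JavaScript: functions that can access variables from their enclosing scopes. Closures do not normally cause memory leaks, but there is a special case, described in an article by the Meteor developers: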

    var theThing = null;
    var replaceThing = function () {
        var originalThing = theThing;
        var unused = function () {
            if (originalThing)
                console.log("hi");
        };
        theThing = {
            longStr: new Array(1000000).join('*'),
            someMethod: function () {
                console.log(someMessage);
            }
        };
    };
    setInterval(replaceThing, 1000);

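The snippet does one simple thing: every time replaceThing is called, theThing is assigned a new object containing a newly allocated string and a new closure (someMethod). At the same time, the variable unused holds a new closure that references originalThing, which in turn points to the object theThing previously referred to. Confused yet? The important point is that once a scope is created for closures that share the same parent scope, that scope is shared by all of them. The cause of the leak now becomes clear: someMethod and unused share the same scope, and that scope contains originalThing. Even though unused is never used, theThing is a global variable, so someMethod cannot be collected, and the scope it shares stays alive too. Each new object therefore retains the previous one, forming a chain that can never be collected.

This behavior can be seen as an artifact of how JavaScript engines implement closures. To avoid this kind of leak, it is enough to set originalThing to null at the end of replaceThing.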
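
The fix can be sketched as follows. Two details differ from the snippet above and are choices made for this sketch only: replaceThing is called in a short loop instead of through setInterval so that the example terminates, and someMethod returns a fixed string instead of logging.

```javascript
var theThing = null;
var replaceThing = function () {
  var originalThing = theThing;
  var unused = function () {
    if (originalThing) console.log('hi');
  };
  theThing = {
    longStr: new Array(1000000).join('*'),
    someMethod: function () {
      return 'still callable';
    },
  };
  // Clear the captured reference so the shared scope no longer pins the
  // previous theThing.
  originalThing = null;
};

// Called in a loop here instead of setInterval so the sketch terminates.
for (var i = 0; i < 5; i++) replaceThing();
console.log(theThing.longStr.length); // 999999
```

With the last assignment in place, the scope shared by unused and someMethod no longer holds the previous object, so each old theThing becomes unreachable as soon as it is replaced.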

Unintuitive behavior of garbage collectors

Although garbage collectors are convenient, they come with their own set of trade-offs. One of those trade-offs is nondeterminism. In other words, GCs are unpredictable. It is not usually possible to be certain when a collection will be performed. This means that in some cases more memory than is actually required by the program is being used. In other cases, short pauses may be noticeable in particularly sensitive applications. Although nondeterminism means one cannot be certain when a collection will be performed, most GC implementations share the common pattern of doing collection passes during allocation. If no allocations are performed, most GCs stay at rest. Consider the following scenario:

  1. A sizable set of allocations is performed.
  2. Most of these elements (or all of them) are marked as unreachable (suppose we null a reference pointing to a cache we no longer need).
  3. No further allocations are performed.

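In this scenario most GCs will not run any further collection passes. In other words, even though there is unreachable memory available for collection, the collector does not reclaim it. These are not strictly leaks, but they can still result in unexpectedly high memory usage.

Google's memory profiling documentation provides a good example of this behavior.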


Chrome memory profiling tools

Chrome provides a nice set of tools to profile memory usage of JavaScript code. There are two essential views related to memory: the timeline view and the profiles view.

Timeline view

Google Dev Tools Timeline in Action

The timeline view is essential in discovering unusual memory patterns in our code. In case we are looking for big leaks, periodic jumps that do not shrink as much as they grew after a collection are a red flag. In this screenshot we can see what a steady growth of leaked objects can look like. Even after the big collection at the end, the total amount of memory used is higher than at the beginning. Node counts are also higher. These are all signs of leaked DOM nodes somewhere in the code.

Profiles view

Google Dev Tools Profiles in Action

This is the view you will spend most of the time looking at. The profiles view allows you to get a snapshot and compare snapshots of the memory use of your JavaScript code. It also allows you to record allocations along time. In every result view different types of lists are available, but the most relevant ones for our task are the summary list and the comparison list.

The summary view gives us an overview of the different types of objects allocated and their aggregated size: shallow size (the sum of all objects of a specific type) and retained size (the shallow size plus the size of other objects retained due to this object). It also gives us a notion of how far an object is in relation to its GC root (the distance).

The comparison list gives us the same information but allows us to compare different snapshots. This is especially useful to find leaks.

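Example: Finding leaks using Chrome

Leaks come in two kinds: those where memory grows periodically and those that happen only once. Periodic growth is easy to spot, and it is also what makes an application sluggish or stops JS from running properly. For one-off leaks, large ones are still easy to notice, while small ones are hard to see. In general, a small one-off leak can be treated as an optimization issue, whereas a periodic leak is clearly a bug in the code.

We will analyze leaks using an example from the Chrome documentation. The code is: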

    var x = [];

    function createSomeNodes() {
        var div,
            i = 100,
            frag = document.createDocumentFragment();
        for (;i > 0; i--) {
            div = document.createElement("div");
            div.appendChild(document.createTextNode(i + " - "+ new Date().toTimeString()));
            frag.appendChild(div);
        }
        document.getElementById("nodes").appendChild(frag);
    }

    function grow() {
        x.push(new Array(1000000).join('x'));
        createSomeNodes();
        setTimeout(grow,1000);
    }

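When grow is invoked, div nodes are created and appended to the DOM over and over, and a large string is allocated and pushed onto an array held by a global variable. Chrome's profiling tools will show memory growing steadily.

The memory usage graph of a garbage-collected language normally looks like a sawtooth. This is expected: while the program runs, the GC periodically traverses memory and reclaims it.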

Find out if memory is periodically increasing

The timeline view is great for this. Open the example in Chrome, open the Dev Tools, go to timeline, select memory and click the record button. Then go to the page and click The Button to start leaking memory. After a while stop the recording and take a look at the results:

Memory leaks in the timeline view

There are two big signs in this image that show we are leaking memory: the graphs for nodes (green line) and JS heap (blue line). Nodes are steadily increasing and never decrease. This is a big warning sign.

The JS heap also shows a steady increase in memory use. This is harder to see due to the effect of the garbage collector. You can see a pattern of initial memory growth, followed by a big decrease, followed by an increase and then a spike, continued by another drop in memory. The key in this case lies in the fact that after each drop in memory use, the size of the heap remains bigger than in the previous drop. In other words, although the garbage collector is succeeding in collecting a lot of memory, some of it is periodically being leaked.
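
The same reasoning can be applied to a series of heap-size samples: find the low points after each collection and check whether each one is higher than the last. `findTroughs` and `troughsKeepGrowing` are hypothetical helpers, not DevTools features, and the sample numbers are made up purely to mimic the two shapes.

```javascript
// `findTroughs` and `troughsKeepGrowing` are hypothetical helpers; the sample
// numbers below are made up and only mimic the shape of a sawtooth graph.
function findTroughs(samples) {
  const troughs = [];
  for (let i = 1; i < samples.length - 1; i++) {
    if (samples[i] < samples[i - 1] && samples[i] <= samples[i + 1]) {
      troughs.push(samples[i]);
    }
  }
  return troughs;
}

function troughsKeepGrowing(samples) {
  const troughs = findTroughs(samples);
  if (troughs.length < 2) return false; // not enough collections to judge
  return troughs.every((value, i) => i === 0 || value > troughs[i - 1]);
}

const healthy = [10, 30, 12, 31, 11, 29, 12, 30];
const leaking = [10, 30, 14, 34, 19, 38, 25, 44];
console.log(troughsKeepGrowing(healthy)); // false
console.log(troughsKeepGrowing(leaking)); // true
```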

We are now certain we have a leak. Let’s find it.

Get two snapshots

To find a leak we will now go to the profiles section of Chrome’s Dev Tools. To keep memory use at manageable levels, reload the page before doing this step. We will use the Take Heap Snapshot function.

Reload the page and take a heap snapshot right after it finishes loading. We will use this snapshot as our baseline. After that, hit The Button again, wait a few seconds, and take a second snapshot. After the snapshot is taken, it is advisable to set a breakpoint in the script to stop the leak from using more memory.

Heap Snapshots

There are two ways in which we can take a look at allocations between the two snapshots. Either select Summary and then to the right pick Objects allocated between Snapshot 1 and Snapshot 2, or select Comparison rather than Summary. In both cases we will see a list of objects that were allocated between the two snapshots.
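
Conceptually, the comparison boils down to a per-constructor difference in object counts. The sketch below is only a mental model: the snapshot shape (constructor name mapped to object count), the `compareSnapshots` helper, and all the numbers are invented for illustration and are not the format DevTools uses.

```javascript
// The snapshot shape ({ constructorName: objectCount }) is a simplification
// invented for this sketch; real heap snapshots are far richer.
function compareSnapshots(before, after) {
  const names = new Set([...Object.keys(before), ...Object.keys(after)]);
  return [...names]
    .map((name) => ({ name, delta: (after[name] || 0) - (before[name] || 0) }))
    .filter((row) => row.delta !== 0)
    .sort((x, y) => y.delta - x.delta);
}

// Made-up counts for two snapshots of the same page.
const snapshot1 = { '(string)': 1200, HTMLDivElement: 3, Text: 3, Array: 40 };
const snapshot2 = { '(string)': 1258, HTMLDivElement: 503, Text: 503, Array: 40 };

// Constructors whose counts only ever grow between snapshots are the suspects.
console.log(compareSnapshots(snapshot1, snapshot2));
```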

In this case it is quite easy to find the leaks: they are big. Take a look at the Size Delta of the (string) constructor: 8 MB with 58 new objects. This looks suspicious: new objects are allocated but not freed, and 8 MB get consumed.

If we open the list of allocations for the (string) constructor we will notice there are a few big allocations among many small ones. The big ones immediately call our attention. If we select any single one of them we get something interesting in the retainers section below.

Retainers for selected object

We see our selected allocation is part of an array. In turn, the array is referenced by variable x inside the global window object. This gives us a full path from our big object to its noncollectable root (window). We found our potential leak and where it is referenced.

So far so good. But our example was easy: big allocations such as the one in this example are not the norm. Fortunately our example is also leaking DOM nodes, which are smaller. It is easy to find these nodes using the snapshots above, but in bigger sites, things get messier. Recent versions of Chrome provide an additional tool that is best suited for our job: the Record Heap Allocations function.

Recording heap allocations to find leaks

Disable the breakpoint you set before, let the script continue running, and go back to the Profiles section of Chrome’s Dev Tools. Now hit Record Heap Allocations. While the tool is running you will notice blue spikes in the graph at the top. These represent allocations. Every second a big allocation is performed by our code. Let it run for a few seconds and then stop it (don’t forget to set the breakpoint again to prevent Chrome from eating more memory).

Recorded heap allocations

In this image you can see the killer feature of this tool: selecting a piece of the timeline to see what allocations were performed during that time span. We set the selection to be as close to one of the big spikes as possible. Only three constructors are shown in the list: one of them is the one related to our big leaks ((string)), the next one is related to DOM allocations, and the last one is the Text constructor (the constructor for leaf DOM nodes containing text).

Select one of the HTMLDivElement constructors from the list and then pick Allocation stack.

Selected element in heap allocation results

BAM! We now know where that element was allocated (grow -> createSomeNodes). If we pay close attention to each spike in the graph we will notice that the HTMLDivElement constructor is being called a lot. If we go back to our snapshot comparison view we will notice that this constructor shows many allocations but no deletions. In other words, it is steadily allocating memory without allowing the GC to reclaim some of it. This has all the signs of a leak, plus we know exactly where these objects are being allocated (the createSomeNodes function). Now it's time to go back to the code, study it, and fix the leaks.

Further reading

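Conclusion

Memory leaks can and do happen in garbage-collected languages such as JavaScript, and a single leak that goes unnoticed can end up causing serious problems for the whole application. Memory profiling should therefore be a regular part of the development cycle, especially for mid-sized and large applications. So what are you waiting for? Pick up the tools above and start profiling to give your users a better experience!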