Solving the Node Buffer Constructor Deprecation problem

With the EOL (end of life) of Node 4.0 and the introduction of Node 10 coming in April, it’s time to look at that perennial Node problem: what to do about the Buffer constructors.

Background

The Buffer object in Node.js provided a way of working with binary streams before the TypedArray was introduced with ECMAScript 2015. A new buffer object could be created using several variations of the constructor, such as the following:

var buf = new Buffer(200); // allocates 200 byte buffer

The new buffer object’s memory is allocated but, originally, it wasn’t initialized, so it held whatever was already in that space. This could be useful from a performance standpoint but could also have serious consequences. For instance, if I use Node 9.10 to print out the buffer just after creating the object, I get a result like the following:

<Buffer 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 00 00 00 00 00 00 00 00 00 00 ... >

But if I use nvm to install Node 4.0 and run the same application, I get the following:

<Buffer 05 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
 00 00 00 00 00 00 e0 9f 29 1b fc 7f 00 00 00 00 00 00 01 00
 00 00 03 00 00 00 07 00 00 00 09 00 ... >

What we’re seeing is the data that’s currently in the space. Data from Somewhere Else. This represents a serious security concern, because the data could be potentially sensitive—such as passwords, credit card numbers, and missile coordinates.

Of course, Buffer constructors can also be used safely. For instance, the following is perfectly acceptable:

var buf = new Buffer('a new string value');

The buffer object is created and instantiated to the string. Perfectly safe. As is the following:

var buf = new Buffer(200);

buf.fill(0);

Which zero-fills the entire buffer, overwriting the data that existed in the memory space.

However, the Buffer constructor is particularly vulnerable when combined with ECMAScript’s loose typing, because the same call behaves differently depending on the type of the argument it receives. In the following very traditional and grossly simplified JavaScript snippet we can easily see where the Buffer constructor problem can sneak up on us:

function passValue(val) {
  var buf = new Buffer(val);

  console.log(buf);
}

passValue('this is a string'); // cool
passValue('18'); // cool
passValue(200); // wait...what?

Now, spread the same functionality over complex code contained in a large application where the value that’s passed to the function is given to the application from Somewhere Else, and you can see the problem.
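
One way to keep a numeric argument from quietly turning into an allocation is to check the type before creating the buffer. The following is a minimal sketch, not a definitive fix: the function name toBuffer is hypothetical, and it assumes a Node version that provides Buffer.from(), one of the newer methods.

```javascript
// Hypothetical helper: only strings are accepted as buffer content.
function toBuffer(val) {
  if (typeof val !== 'string') {
    // With the old constructor, a number here would have allocated val bytes.
    throw new TypeError('expected a string, got ' + typeof val);
  }
  return Buffer.from(val, 'utf8');
}

console.log(toBuffer('this is a string')); // buffer holding the string's bytes
console.log(toBuffer('18'));               // buffer holding the characters "1" and "8"

try {
  toBuffer(200);
} catch (err) {
  console.log(err.message); // expected a string, got number
}
```

The check is deliberately strict. A real application might also accept arrays or existing buffers, but the point is that the decision is made explicitly rather than left to whatever type happens to arrive.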

This isn’t an esoteric concern, as has been documented in the past. Actually, several times in the past.

The security issues were significant enough at the time that the Node folks decided to deprecate the use of Buffer constructors in Node 6, replacing them with safer alternatives, such as Buffer.alloc() and Buffer.from(). However, since Buffer constructors were used liberally throughout untold numbers of modules by that time, they were deprecated in documentation only (a soft deprecation), with the strict injunction to developers not to use them.
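
As a rough guide, and assuming a Node version that ships these methods, the constructor forms shown earlier map onto the newer methods something like this. The variable names are only for illustration.

```javascript
// new Buffer(200) -> Buffer.alloc(200): always zero-filled
var zeroed = Buffer.alloc(200);

// new Buffer('a new string value') -> Buffer.from(...): copies the string's bytes
var fromString = Buffer.from('a new string value');

// Uninitialized memory is still available, but only by asking for it by name
var raw = Buffer.allocUnsafe(200);
raw.fill(0); // overwrite whatever was in the memory before using it

console.log(zeroed.length, fromString.toString(), raw.length);
```

Passing a number to Buffer.from() throws a TypeError rather than allocating memory, which is what closes the gap shown in the passValue example.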

Still, you can’t immediately stamp out something so widely used as the Buffer constructors. The Node powers-that-be decided to make the continued use of Buffer constructors safer by automatically zero-filling the buffers they create, starting with Node 8.

The issue that still exists, though, is that there are Node applications still running on older versions of Node, such as Node 4 and Node 6. And there are who knows how many older but stable modules making use of the Buffer constructor.
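
For code that has to run on both older and newer releases, one option is to detect the newer method and fall back to a zero-filled constructor call. This is only a sketch: allocZeroed is a hypothetical name, and the feature check assumes Buffer.alloc is simply missing on releases that predate it.

```javascript
// Hypothetical helper: returns a zero-filled buffer on old and new Node releases.
function allocZeroed(size) {
  if (typeof size !== 'number' || size < 0) {
    throw new TypeError('size must be a non-negative number');
  }
  if (typeof Buffer.alloc === 'function') {
    return Buffer.alloc(size); // newer releases: zero-filled by definition
  }
  var buf = new Buffer(size);  // older releases: memory may hold stale data
  buf.fill(0);                 // so overwrite it before handing it out
  return buf;
}

console.log(allocZeroed(8)); // <Buffer 00 00 00 00 00 00 00 00>
```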

Which leads us to this year’s discussion about what to do with Buffer constructors, as noted on GitHub. Originally the intent was to issue a runtime deprecation warning for the use of the Buffer constructor. The problem with this is that test applications will crash and burn once testing starts with Node 10. In addition, module developers with potentially hundreds of impacted older but stable modules are daunted by the possibility of being buried under the requests for fixes certain to arrive from this move.

Ultimately, the Node TSC decided to issue a runtime deprecation warning for the use of Buffer constructors, but only in the application, itself. The use of Buffer constructors won’t generate the warning for module code (in node_modules).
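
To see whether application code trips the warning, one approach is to listen for the process 'warning' event. The sketch below assumes the deprecation is reported with the code DEP0005, the identifier the Node documentation assigns to the Buffer() constructor; isBufferDeprecation is a hypothetical helper name.

```javascript
// Hypothetical helper: true when a warning object describes the Buffer() deprecation.
function isBufferDeprecation(warning) {
  return warning.name === 'DeprecationWarning' && warning.code === 'DEP0005';
}

// Node emits 'warning' events on the process object for runtime deprecations.
process.on('warning', function (warning) {
  if (isBufferDeprecation(warning)) {
    // The stack shows where in the application the constructor was called.
    console.error('Buffer constructor used here:\n' + warning.stack);
  }
});
```

None of this changes where the warning is raised: calls made from inside node_modules stay silent.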

Of course, this isn’t a permanent solution. This just kicks the can down the road. In six months, I can practically guarantee this issue will pop up again, when Node 10 goes to LTS.

Bluntly, there are only two real solutions to solving the Buffer constructor problem: fix it, or forget about it.

Fix It

Node achieved explosive growth about the time when the now-deprecated Buffer constructor was the only way of dealing with binary data. This means that the use of the Buffer constructor is widespread, both in applications and in the modules those applications depend on.

When Node 10 releases, running a Node application that uses the Buffer constructor will generate a runtime warning if, and only if, the use is in the application and not a module. Because of concerns, and pushback, from module developers, no warning is currently issued for the use of the Buffer constructor in node_modules.

The problem with this approach is that rather than fixing the potential security risks associated with the unsafe use of the Buffer constructor, the issue is just kicked down the road, only to resurface in October, when Node 10 goes to LTS. And then every six months thereafter, matching the semver-major release schedule.

So, one solution is to fix the problem by eliminating the use of Buffer constructor. We’ve had alternative methods for years now. Time to eliminate the use of Buffer constructors everywhere.

Yeah, say that to a module developer who has hundreds of modules. “Suck it up and do it, buttercup!” doesn’t play well in the open source community.

People often ask me how I maintain +600 modules (for free usually). I usually reply "because most are small and done". Node 10 deprecates an API used by 141 of them, 176 if you include tests/examples. This is *after* spending time getting a bunch other of my modules updated.

— Mathias Buus 🕳🥊 (@mafintosh) March 21, 2018

The issue associated with Buffer constructors indirectly highlights another major issue with Node: that there are a significant number of older but stable and even critical modules that are maintained by a much smaller group of developers. When faced with a pervasive change such as eliminating the Buffer constructors these same developers can be overwhelmed by the amount of work involved. And what about developers who created modules and then moved on to other tools, languages, or a mountain top in Colorado completely cut off from the Internet?

Is it a good time to remind people about left-pad?

Fix It Solution: Involving the Community

Node and the Node module management organization, npm, have no processes in place for managing modules that have either been abandoned or are no longer being actively maintained for a variety of reasons, including the saddest.

And then there’s the case of the Node developer, such as the developer linked earlier, who is responsible for hundreds of modules.

When discussing the issue of runtime deprecation on GitHub, I suggested to folks that we get the community involved in fixing older modules. One possible solution is to issue a general call for help to the Node community. Node developers who might be hesitant about contributing directly to Node core might be more comfortable in the more finite world of making specific changes to a specific number of modules.

We could match willing volunteers with module owners who have significant numbers of modules and could use the help. The module owner could either vet the changes directly or assign that authority to other, more senior developers.

Not only would this permanently fix the Buffer constructor issue, it could be a good way of spreading module ownership to more people so that no one person is responsible for an overwhelming amount of code. After all, Buffer constructors aren’t the only problem we’ll be facing in the future that may require extensive and critical fixes.

Of course, the issues associated with overwhelmed or missing module maintainers in Node are the same issues faced by module and/or library maintainers across the open source community. You want to ensure that actively used modules are actively maintained by groups of module owners, each of whom can be counted on to do what is necessary within a short period of time. And that when a module does reach an end of life, it is universally and gently moved aside in favor of whatever is the newer module or approach.

Reality intrudes. Herding developers === herding cats.

Barring some magical, community effort that wipes Buffer constructors from the face of the earth, we should then consider the second solution:

Bring Buffer constructors back.

UnDeprecating Buffer Constructors

We can see the problems associated with Buffer constructors. We can see the potential security vulnerabilities. But if we look at the use of Buffer constructors throughout the cosmos that is the broader Node world, I suspect we’ll find that in most modules and applications, the use of Buffer constructors is both proper and safe. As npm CEO Isaac Schlueter notes on Twitter:

I don't agree that it's as unsafe as you claim.

None of the examples of `new Buffer()` in any of my modules are problematic. I'm only bothering to update them because I know I'll get emails about it, and I'm feeling p resentful tbh.

— isaacs (@izs) March 21, 2018

Yes, Buffer constructors can be used incorrectly and cause security issues. But then so can the following:

Using eval(), and yet eval() is still in ECMAScript
Using form data to build a SQL query via concatenation rather than using SQL parameters or escaping the input, and yet you can still build a SQL query using concatenation without escaping it first
Using a PRNG from Go’s math/rand rather than crypto/rand
Storing passwords in plain text. Don’t laugh: I found out this week that a fairly major research site I had visited in the past was doing this. I discovered this interesting fact when they sent me my password in an email in plain text, because I notified them their site lacks a mechanism to change one’s password.
Running our web apps as root
Trusting that all modules in npm are safe

I could go on. You could, too.

I’m not pointing fault at others because we all screw up. No one should expect us to not make mistakes. But when we do, we are expected to acknowledge the mistake, fix it, and not make the same mistake in the future. Knock on wood.

What we shouldn’t expect is that a fundamental component of the language we’re using be tossed away because we can’t be depended on to not make mistakes. Tools can’t wrap us in wool. There is no such thing as the perfect sandbox in which we can play.

Why were Buffer constructors deprecated? Because when used incorrectly, they could expose sensitive process data. But that wasn’t the fault of the Buffer constructors; this was a mistake on the part of some developers. Even now, we can still create an uninitialized buffer, using Buffer.allocUnsafe(). True, you can’t use a string with the new method, so it eliminates the accidental introduction of a security issue because of an unforeseen argument type, but that doesn’t mean you can’t make a mistake and still introduce an accidental vulnerability. It is (much) safer than Buffer constructors…but it isn’t perfectly safe.
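
As a small illustration of how Buffer.allocUnsafe() can still go wrong, consider a buffer that is only partly written before it’s handed back. The function names below are hypothetical, and this is a sketch of the pitfall rather than code from any particular module.

```javascript
// Risky: any bytes beyond what was written still hold old memory contents.
function encodeLeaky(text, size) {
  var buf = Buffer.allocUnsafe(size);
  buf.write(text, 'utf8');
  return buf; // the tail of this buffer was never initialized
}

// Safer: return only the bytes that were actually written.
function encodeTrimmed(text, size) {
  var buf = Buffer.allocUnsafe(size);
  var written = buf.write(text, 'utf8');
  return buf.subarray(0, written);
}

console.log(encodeTrimmed('abc', 64).length); // 3
```

Both functions use the new method exactly as documented. Only the second avoids handing uninitialized bytes to the caller.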

Code constructs are either perfectly safe or inherently unsafe: they can’t be both. It’s up to the tool/language/environment to create that which is perfectly safe (and it’s company-busting when they screw up). But it’s up to developers to carefully use that which is inherently unsafe. And that means that we’re occasionally going to bump our noses. Hard.

DOS users learned not to use format without specifying the drive just once. They never needed a second lesson. Same for Linux users and rm -rf /. There have been coding mistakes I’ve made in the distant past that I still cringe about every time I remember them. None were mistakes I ever made again.

I prefer to discover new mistakes to make.

By deprecating Buffer constructors, what did we gain? The applications and modules that used Buffer constructors incorrectly are still using them incorrectly until the developers correct the code. Node has made the environment safer by zero-filling the buffers as of Node 8…but it isn’t perfectly safe.

When the developers do fix their code, it could be just as simple for them to fix their mistakes by using Buffer constructors safely, rather than use the new methods. Or, if they choose, to use the new methods. It doesn’t matter, because they’re going to have to change the code, regardless.

When developers make mistakes, they have to fix them. We can’t make this essential fact of development go away by playing with deprecation behavior. So, let’s drop the deprecation and accept the consequences of fixing the code.

More importantly, this approach won’t penalize those who did use Buffer constructors properly, and safely.

And we won’t face the same question every six months: What are we going to do about Buffer constructors?

Dithering is the worst possible choice to make

There are *two solutions to Buffer constructors: either we commit to runtime deprecation of their use in both node modules and applications, once and for all; or we undeprecate the constructors, albeit with all sorts of caveats attached in the documentation.

The first approach will, ultimately, create a significant amount of pain for a potentially longish period of time, as well as possibly burning out more than one module developer. We could ask for help from the community to offset some of this pain, but ultimately, we’re in for a world of hurt for a time.

The second approach will leave a potentially unsafe code construct in Node. But then, eval().

Regardless of path chosen, the Node TSC has to make a choice. Dithering is also a choice, but I guarantee, it’s the worst one.

*A third option is to keep the Buffer constructors deprecated and just ignore their use, à la Oracle and Java. However, the recent discussion on the issue demonstrates this approach isn’t palatable to the Node community.