On June 6th, something like 6 million LinkedIn passwords were leaked. At the very least they weren’t storing them in plaintext, but they also didn’t use a particularly great hashing algorithm (SHA-1) and they didn’t use any salts, much less a per-user salt. Anyway, you can read more about that elsewhere; this post is about having fun guessing some of the leaked passwords.
I used the LinkedIn compromised password checker from LastPass for this testing rather than leakedin.org because being all ajaxy and web 2.0 it was much faster to test a lot of passwords in a row.
First off, if you haven’t already you should read “A String is not an Error,” by Guillermo Rauch. The tl;dr of it is that you shouldn’t be returning or throwing Strings where errors are expected.
A few days ago I jumped into a conversation on a pull request for grunt about the best way to detect whether an object is an error or not. I suggested using node’s util.isError, which uses Object.prototype.toString.call to get the internal class of the object and ensures that it equals [object Error]. This strict check turns out not to be ideal for the use case, since the code in question is expected to handle custom errors. Here’s an example of how it fails:
var assert = require('assert');
var util = require('util');
var err;

// straight from “A String is not an Error”
function MongooseError(msg) {
  Error.call(this);
  Error.captureStackTrace(this, arguments.callee);
  this.message = msg;
  this.name = 'MongooseError';
}
MongooseError.prototype.__proto__ = Error.prototype;

// these are good…
err = new MongooseError('nope');
assert.ok(err instanceof Error);
assert.ok(err instanceof MongooseError);

// …but using util.isError doesn't work.
util.isError(err); // === false
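To see why the check fails, it helps to look at the check in isolation. The following is a minimal sketch, not node's actual source: `isErrorStrict` and `PlainSubclassError` are hypothetical names, and the function only mirrors the Object.prototype.toString.call comparison described above.

```javascript
// Hypothetical stand-in for the strict internal-class check.
function isErrorStrict(value) {
  return Object.prototype.toString.call(value) === '[object Error]';
}

// Constructor-style subclass: `this` is a plain object that merely
// inherits from Error.prototype, so it is not a "real" Error internally.
function PlainSubclassError(msg) {
  Error.call(this);
  this.message = msg;
  this.name = 'PlainSubclassError';
}
PlainSubclassError.prototype = Object.create(Error.prototype);

var builtIn = new Error('real');
var custom = new PlainSubclassError('custom');

console.log(isErrorStrict(builtIn));  // true
console.log(custom instanceof Error); // true
console.log(isErrorStrict(custom));   // false
```

The prototype chain is enough for instanceof, but the internal class is fixed when the object is created, which is why only objects produced by Error itself pass.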
This seemed like a legitimate shortcoming of util.isError so @cowboy raised an issue on node, but node’s intention for util.isError is extreme accuracy, so the issue was closed. While closing it @TooTallNate suggested an alternative way to subclass Error such that the internal class gets set to [object Error] (which I will adapt to fit our example):
var err;

function MongooseError(msg) {
  var self = new Error(msg); // get the stack trace for free.
  self.name = 'MongooseError';
  self.__proto__ = MongooseError.prototype;
  return self;
}
MongooseError.prototype.__proto__ = Error.prototype;

// still good…
err = new MongooseError('yep');
assert.ok(err instanceof Error);
assert.ok(err instanceof MongooseError);

// …and this will work since it's a true [object Error]
util.isError(err); // === true
But there was something else in his response that got me thinking.
Does subclassing Error really add much value?
After thinking about it I realized that I agree with @TooTallNate that it really doesn’t add much value, especially considering how unintuitive it is to get it right.
I could see some value if JavaScript had syntactic support for multiple catch blocks:
try {
  module.errorProneFunction();
} catch (TypeError error) {
  log('incorrect type')
} catch (ParseError error) {
  log('problem parsing')
} catch (RangeError error) {
  log('input out of range')
} catch (module.CustomError) {
  log('custom module error')
} catch (Error error) {
  log('some other type of error')
}
But we don’t, and in node it’s likely that we’re getting our errors in the form of callback(error), so we’re not going to be in a try block anyway. This is a more likely scenario:
module.errorProneFunction(function (error) {
  if (util.isError(error)) {
    if (error instanceof TypeError) log('incorrect type');
    else if (error instanceof ParseError) log('problem parsing');
    else if (error instanceof RangeError) log('input out of range');
    else if (error instanceof module.CustomError) log('custom module error');
    else throw error; // can't handle it, crash.
  }
});
instanceof depends on a reference to the error constructor, so you’re going to have to export that from your module to use this pattern.
Since all of the built-in errors set a name property, your custom errors should do so as well. Given that, we can use a pattern which doesn’t rely on instanceof or on having a reference to your custom error constructor:
module.errorProneFunction(function (error) {
  if (util.isError(error)) {
    if (error.name == 'TypeError') log('incorrect type');
    else if (error.name == 'ParseError') log('problem parsing');
    else if (error.name == 'RangeError') log('input out of range');
    else if (error.name == 'ModuleCustomError') log('custom module error');
    else throw error;
  }
  // you could also use `switch...case` if that’s your cup of jam.
});
But half the reason to subclass Error is so we can compare instances and here we’re finding out that it might be simpler to not use instanceof at all.
What about more complicated errors?
The other case for subclassing Error is to make attaching a bunch of properties to the object easier. You might have something like this:
var VERSION = 1.0;
// ...
function APIError(msg, route) {
  var self = new Error(msg); // get the stack trace for free.
  self.name = 'APIError';
  self.version = VERSION;
  self.route = route;
  self.__proto__ = APIError.prototype;
  return self;
}
APIError.prototype.__proto__ = Error.prototype;
// ...
throw new APIError('forbidden or whatever', '/admin/destroy/all/humans/');
In the end that’s a lot of boilerplate when the following accomplishes the same effect:
var error = new Error('forbidden or whatever');
error.name = 'APIError';
error.version = VERSION;
error.route = '/admin/destroy/all/humans/';
throw error;
and if you find yourself typing that all over the place, you can make a straightforward helper that doesn’t have to deal with the complexity of inheriting from Error:
function apiError(msg, route) {
  var error = new Error(msg);
  error.name = 'APIError';
  error.version = VERSION;
  error.route = route;
  return error;
}
// ...
throw apiError('forbidden or whatever', '/admin/destroy/all/humans/');
One downside to using a helper is that error.stack is going to include an entry for the helper. You can either ignore it, clean it up in your helper by doing some string manipulation or use something like https://github.com/flatiron/errs which abstracts all of that away for you and provides a lot of other nice stuff.
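A third option, if you only care about node (or another V8-based runtime), is to let the engine drop the helper's frame rather than editing the string yourself. This is a sketch under that assumption: Error.captureStackTrace is V8-specific, and passing the helper as the second argument omits that function and everything above it from the captured trace.

```javascript
var VERSION = 1.0;

function apiError(msg, route) {
  var error = new Error(msg);
  error.name = 'APIError';
  error.version = VERSION;
  error.route = route;
  // V8-only: re-capture the stack starting at the caller of apiError,
  // so the helper itself does not show up in error.stack.
  if (Error.captureStackTrace) {
    Error.captureStackTrace(error, apiError);
  }
  return error;
}

var err = apiError('forbidden or whatever', '/admin/destroy/all/humans/');
console.log(/at apiError/.test(err.stack)); // false
```

The feature check keeps the helper usable on engines without captureStackTrace; there the stack simply keeps the extra entry.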
Conclusion
Subclassing Error is not intuitive and makes dealing with errors hard if done incorrectly. If you really must subclass Error, do it correctly:
The assert methods throw Error objects if the test they present is not passed. For example, assert.size looks at the response object and asserts that its content-length doesn’t exceed a certain size. If any of the assertions fail, I want to immediately stop executing and hit the callback with the error.
Try…catches all the way down
When testing this, I was running into a serious issue: Errors weren’t getting trapped by the try...catch block and instead were bubbling up to the very top, making the process explode. Why is that?
After some staring, thinking and experimenting I figured out the problem. Before I go into it, here’s the working code
And here’s a simplified case that demonstrates the issue:
// throw-test.js
function thrower(cb) { return cb() };

try {
  thrower(function () {
    throw new Error("This will be captured");
  });
} catch (e) {
  console.log("We live for now...");
}

try {
  process.nextTick(function () {
    throw new Error("...but we'll never make it out alive");
  });
} catch (e) {
  console.log(":("); // will never reach here
}
These were my results when running it:
It all comes down to scheduling
thrower and process.nextTick both take one parameter, a callback. To understand why an error thrown in the callback will bubble in the latter, it’s important to understand how process.nextTick interacts with the event loop. From the docs,
On the next loop around the event loop call this callback. This is not a simple alias to setTimeout(fn, 0), it’s much more efficient.
User-written functions do not have the ability to directly schedule things on the event loop. You could imagine the schedule looking like this immediately after each function is invoked:
A try block will only capture errors that are thrown in the current tick. thrower doesn’t reach a completed state until the callback is executed so if the callback throws an error, it can be trapped because it’s still within the original execution context.
process.nextTick reaches a completed state as soon as it schedules the function for execution. Barring syntax or type errors, it cannot fail. When the callback gets executed on the next tick it is no longer executing in the context of a try block and the error will bubble through.
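The difference in timing can be made visible by recording when each piece runs. This is only an illustrative sketch: the `events` array and its labels are made up for the demonstration, and `thrower` is the same one-line function from the simplified case.

```javascript
var events = [];

function thrower(cb) { return cb(); }

// The callback runs before thrower returns, inside the current tick.
thrower(function () { events.push('thrower callback ran'); });
events.push('thrower returned');

// process.nextTick returns right away; its callback has only been scheduled.
process.nextTick(function () {
  events.push('nextTick callback ran');
  console.log('later: ' + events[events.length - 1]);
});
events.push('process.nextTick returned');

console.log(events.join(' -> '));
// thrower callback ran -> thrower returned -> process.nextTick returned
```

By the time the deferred callback runs, the code that called process.nextTick, including any try block around it, has already finished.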
Takeaways
You can’t always tell if a callback-style function is asynchronous. Only those that defer execution, for example with setTimeout, process.nextTick or an underlying I/O operation, are truly async.
When catching errors from asynchronous functions you have to wrap the body of the callback, not the calling function.
This is something you have to worry about if you are using synchronous functions in your callbacks. All of the synchronous methods in the standard library throw errors.
As an aside, when writing your own methods, you should not be throwing errors in callback-style functions whether they are truly async or not. By convention, the first argument to a callback is always either null or an error object. Functions which return a value directly are free to throw errors, as they do in the standard library.
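Putting these takeaways together, here is a minimal sketch of a callback-style function that wraps the body of its deferred callback and reports failures through the first argument. `parseJsonLater` is a hypothetical helper invented for this example; JSON.parse stands in for any synchronous call that throws.

```javascript
// Hypothetical helper: parses JSON on the next tick and reports failures
// through the callback instead of throwing.
function parseJsonLater(text, callback) {
  process.nextTick(function () {
    var value;
    // The try block lives inside the deferred function, because this is
    // the code that is running when JSON.parse throws.
    try {
      value = JSON.parse(text);
    } catch (err) {
      return callback(err);
    }
    // Called outside the try so an error thrown by the callback itself
    // is not mistaken for a parse failure.
    callback(null, value);
  });
}

parseJsonLater('{"ok": true}', function (err, value) {
  console.log(err, value); // null { ok: true }
});

parseJsonLater('not json', function (err) {
  console.log(err.name); // SyntaxError
});
```

Wrapping the call to parseJsonLater in a try block would accomplish nothing here, for the same reason the process.nextTick example above never reaches its catch.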
I’ve been working on making it easier for developers to contribute to my project at Mozilla. It should not take a significant amount of time and/or effort to stand up a project before a developer can start writing code. Even 10 minutes is too much for a person who just wants to try something out.
Nobody wants to spend two hours setting up their environment to work on your stupid project
If they didn’t think it was stupid before, they’ll definitely think so after battling a frustrating setup process.
Developing in a VM is a great practice for a number of reasons, chief among them version/dependency hell and not polluting your main OS. You can also distribute them to developers and they have a known working environment all packaged and ready to go.
Vagrant combined with Puppet is just about the greatest thing ever for VM management. You can easily spin up a new VM to test stuff out and blow it away with just a few commands.
Another immediate benefit of developing using Vagrant and Puppet is that you start thinking about deployment while you’re developing instead of as an afterthought – you could even use the same puppet manifest to provision the production server.
Baller, let’s do this
I’ve spent the past couple of days working on a branch of the project that lets a developer clone the repo, do vagrant up and immediately have a fully working environment ready to go. Everything went relatively smoothly, I built, provisioned and destroyed dozens of VMs and things seemed to be working.
Not every developer is going to have vagrant – I’m assuming most won’t – so I wanted to do a true front-to-back test, as if I had just installed VirtualBox and Vagrant. So I uninstalled and then reinstalled the latest versions of both, added the latest lucid32 box and tried again.
The system spun up and got provisioned like a champ but when npm installed my app dependencies, things got fucked:
And here began six hours of debugging
The first thing I tried was changing the provisioning strategy for installing node.js and npm. Instead of building from source, I tried installing from the package manager. This did not work.
I tried an experiment: In the VM I made a fresh clone and did npm install. This worked without issue.
So maybe the VirtualBox Guest Additions for my VM were fucking everything up. That’s what handles the sharing of a folder between the host machine and the guest VM. The lucid32 box that vagrant provides has version 4.1.0 and the latest VirtualBox is 4.1.8.
By the way, it’s fucking unbelievable how hard it is to track down VBoxGuestAdditions.iso. I ended up getting a copy here by digging into the package itself.
Update:
Anyway after installing and restarting the VM, I tried to do npm install and faced the same results.
What in the great god damn fuck.
At this point I was baffled. It was working totally fine before using the latest version of VirtualBox and the new lucid32 box. It works totally fine if I do npm install in any folder other than the one shared from the host.
I’ll save you a painful recount of all the crazy shit I went through to find this out but I eventually learned that everything comes down to symlinks and shared directories. I realized that npm was failing at the points when it was trying to symlink binaries from the packages into the node_modules/.bin folder.
Reading through this three year old ticket, it seems that symlinking in shared folders has been fixed and regressed a number of times. The version that I used to have apparently supported it fine – the latest version must have regressed.
All is not lost
After admitting that I was never going to be able to install the modules in the app directory, the fix was remarkably simple: install packages one directory up on the guest system. When fulfilling a require statement, node’s module loader ascends the directory tree searching for node_modules folders containing the requested module.
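To make the lookup concrete, here is a simplified sketch of the list of folders the loader would try for a file in a given directory. `candidateModuleDirs` is a hypothetical function and /home/vagrant/app is a made-up path; the real loader has extra rules (it skips directories that are themselves named node_modules, for instance) that are left out here.

```javascript
// Simplified sketch: POSIX-style absolute paths only.
function candidateModuleDirs(startDir) {
  var parts = startDir.split('/').filter(function (part) {
    return part !== '';
  });
  var dirs = [];
  // Start in the directory itself, then walk up one level at a time
  // until the filesystem root has been tried.
  for (var depth = parts.length; depth >= 0; depth--) {
    dirs.push('/' + parts.slice(0, depth).concat('node_modules').join('/'));
  }
  return dirs;
}

console.log(candidateModuleDirs('/home/vagrant/app').join('\n'));
// /home/vagrant/app/node_modules
// /home/vagrant/node_modules
// /home/node_modules
// /node_modules
```

Because the parent's node_modules is the second place searched, modules installed one directory up are found without touching the shared folder.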
In one sense I’m glad it didn’t work right the first time. In trying to figure out what was wrong I ended up rewriting my puppet manifests a number of times, making things faster and better. I whittled down a nearly ten minute spin up to a three minute spin up.
But in another sense I wish figuring out that this was all a VirtualBox regression hadn’t taken all fucking day.
I can’t think of a single good reason not to get my next tattoo at this Pawn Shop/Sneaker Outlet/Cell Phone Dealer/”ewelry” & Optical Store. Maybe I’ll pawn something and get a tattoo of it in remembrance.
I’ve been learning Haskell from the fantastic online book Learn You a Haskell for Great Good! and there’s a concept in the language that I’ve just started to wrap my head around.
In Haskell, every function formally takes only one parameter. The syntactic sweetness of Haskell does its best to hide this fact from the user.
Let’s define a function:
addThree x y z = x + y + z
Better understanding through JavaScript
Here’s how I would have naïvely implemented the addThree method in JS.
var addThree = function (x, y, z) { return x + y + z } // wrong
This doesn’t actually capture what’s going on underneath. Here’s the actual equivalent function:
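A minimal sketch of that curried form, assuming nothing beyond the single-parameter rule described above: each function takes exactly one argument and returns the next function in the chain, and only the innermost one does the addition.

```javascript
var addThree = function (x) {
  return function (y) {
    return function (z) {
      return x + y + z;
    };
  };
};
```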
To end up with an integer value, you would call the curried version like so:
addThree(5)(8)(13) == 26 // true
But you could have a partial application of addThree which returns a function:
var addSixteenTo = addThree(8)(8) // one function left in chain
addSixteenTo(112) == 128 // true
Bringing it back around
There is an example in the book that really drove the point home for me. The author describes another way to define the addThree function (\x -> ... is the Haskell syntax for a lambda):
addThree = \x -> \y -> \z -> x + y + z
Also, here are two different ways to call addThree:
addThree 3 8 16
((addThree 3) 8) 16
Both of these make it clear that each parameter is really adding another single-parameter function to the chain. Haskell has so much syntactic sugar, it’s amazing. While it does occasionally make learning the underlying concepts more difficult, it’s really a joy to play with this language.