JavaScript Inheritance and Method Overriding

I’m always finding myself wanting to use JavaScript like a classical OOP language with class-based inheritance, even though I know it is prototypal. Anyway, I’m jotting down how to do this and also allow for overriding (not overloading) methods. Note that if you’re using Node.js, you don’t need to define the inherits function; it is already available as require('util').inherits.

Check the console:
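A minimal sketch of the pattern, assuming a hand-rolled inherits helper with the same contract as Node's util.inherits (it sets ctor.super_ and chains the prototypes). Animal, Dog and speak are hypothetical names used only for illustration:

```javascript
// Hypothetical stand-in for require('util').inherits, for environments without it.
function inherits(ctor, superCtor) {
  ctor.super_ = superCtor;
  ctor.prototype = Object.create(superCtor.prototype, {
    constructor: { value: ctor, enumerable: false, writable: true, configurable: true }
  });
}

// "Parent class"
function Animal(name) {
  this.name = name;
}
Animal.prototype.speak = function () {
  return this.name + ' makes a sound';
};

// "Child class"
function Dog(name) {
  Animal.call(this, name); // run the parent constructor on this instance
}
inherits(Dog, Animal);

// Override: same method name, shadows the parent's version on the prototype chain.
// The parent implementation is still reachable through super_.
Dog.prototype.speak = function () {
  return Dog.super_.prototype.speak.call(this) + ' (woof)';
};

var dog = new Dog('Rex');
console.log(dog.speak());           // Rex makes a sound (woof)
console.log(dog instanceof Animal); // true
```

Calling inherits before adding methods to Dog.prototype matters here, because the helper replaces the prototype object wholesale.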


Node.js: Split/Count Number of Lines (Unicode Compatible)

The task was to read n lines from a file in Node.js. The assumption is that the read buffer is UTF-8. The caveat is that there is more to line breaks in Unicode than just “New Line (ASCII 10 or \n)” and “Carriage Return (ASCII 13 or \r)”. This SO comment put me in the right direction.

Unicode Technical Standard #18 § 1.6 describes the following rule regarding line boundaries (PS and LS are defined in Unicode 6.1 § 16.2 Layout Controls):

RL1.6 Line Boundaries
To meet this requirement, if an implementation provides for line-boundary testing, it shall recognize not only CRLF, LF, CR, but also NEL (U+0085), PS (U+2029) and LS (U+2028).

To put it into practice, the number of lines can be tallied while the ReadStream has incoming data:
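A minimal sketch of such a tally, assuming the stream is decoded as UTF-8 text. createBreakCounter and countLineBreaks are hypothetical names, not part of Node.js; the counter holds back a trailing \r so that a CRLF pair split across two chunks is still counted once:

```javascript
var fs = require('fs');

// Line-break pattern from RL1.6; the g flag is needed to collect every match.
var LINE_BREAK = /\r\n|[\n\r\u0085\u2028\u2029]/g;

// Hypothetical helper: counts line breaks across successive text chunks.
function createBreakCounter() {
  var count = 0;
  var pendingCR = false; // true when the previous chunk ended in a bare \r

  return {
    push: function (text) {
      if (pendingCR) {
        text = '\r' + text;
        pendingCR = false;
      }
      // A trailing \r may be the first half of \r\n; defer it to the next chunk.
      if (text.charAt(text.length - 1) === '\r') {
        pendingCR = true;
        text = text.slice(0, -1);
      }
      var matches = text.match(LINE_BREAK);
      if (matches) {
        count += matches.length;
      }
    },
    end: function () {
      // A deferred \r at end of input is a line break on its own.
      if (pendingCR) {
        count += 1;
        pendingCR = false;
      }
      return count;
    }
  };
}

// Hypothetical wrapper: calls callback(err, numberOfLineBreaks) for a file path.
function countLineBreaks(path, callback) {
  var counter = createBreakCounter();
  var stream = fs.createReadStream(path);
  stream.setEncoding('utf8'); // decode in the stream so characters are not cut mid-chunk

  function onData(chunk) {
    counter.push(chunk);
  }

  stream.on('data', onData);
  stream.once('error', function (err) {
    stream.removeListener('data', onData);
    callback(err);
  });
  stream.once('end', function () {
    stream.removeListener('data', onData);
    callback(null, counter.end());
  });
}
```

This counts line breaks rather than lines, so a file whose last line has no terminator contains one more line than the number reported.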

The key is the regular expression, derived straight from RL1.6, /\r\n|[\n\r\u0085\u2028\u2029]/, which matches any of the possible line-break sequences. CRLF is listed first in the alternation so that a \r\n pair is matched as one break rather than two; add the g flag when counting matches. Don’t forget to listen for the “end” event and also remove the “data” listener on “end” or “error”!

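In the context of splitting incoming data by line, calling chunk.toString('utf8').split(/\r\n|[\n\r\u0085\u2028\u2029]/) does most of the work. The catch is that a chunk can end in the middle of a line, in the middle of a multi-byte character, or between the \r and \n of a CRLF, so the last piece of each chunk has to be carried over and prepended to the next one; calling setEncoding('utf8') on the stream takes care of the split-character case. Note that if working with CSV files, parsing will require a bit more effort if escaped new-lines need to be taken into consideration, and PS, LS and NEL are probably not characters that should act as line delimiters in that context anyway.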
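A minimal sketch of chunk-safe splitting, assuming the chunks arrive as already-decoded strings (for example from a stream with setEncoding('utf8')). createLineSplitter is a hypothetical helper, not a Node.js API; it keeps the unfinished tail of each chunk and prepends it to the next:

```javascript
// Same RL1.6 pattern; split() does not need the g flag.
var LINE_BREAK = /\r\n|[\n\r\u0085\u2028\u2029]/;

// Hypothetical helper: calls onLine(line) once per complete line.
function createLineSplitter(onLine) {
  var tail = ''; // text seen so far that is not yet terminated by a line break

  return {
    push: function (text) {
      var data = tail + text;
      var held = '';
      // A trailing \r may be the first half of \r\n; keep it in the tail for now.
      if (data.charAt(data.length - 1) === '\r') {
        held = '\r';
        data = data.slice(0, -1);
      }
      var parts = data.split(LINE_BREAK);
      tail = parts.pop() + held; // last piece is incomplete until more data arrives
      parts.forEach(function (line) {
        onLine(line);
      });
    },
    end: function () {
      if (tail.charAt(tail.length - 1) === '\r') {
        onLine(tail.slice(0, -1)); // a deferred \r at end of input terminates the line
      } else if (tail !== '') {
        onLine(tail); // final line without a terminator
      }
      tail = '';
    }
  };
}

// Usage with simulated chunks; with a real stream, call push() from the
// "data" handler and end() from the "end" handler.
var lines = [];
var splitter = createLineSplitter(function (line) {
  lines.push(line);
});
splitter.push('a\r');         // CRLF split across two chunks
splitter.push('\nb\u2028c');  // LS inside a chunk, "c" is incomplete
splitter.push('d\n');
splitter.end();
console.log(lines); // [ 'a', 'b', 'cd' ]
```

To stop after n lines, the onLine callback could count its calls and close the stream once the limit is reached.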