RegExp: Matching Balanced Parentheses and Quotes (greedy, non-recursive)

Solution

I need to match all the text within balanced parens, single quotes, or double quotes without snagging on the content within, i.e. be greedy! “Balanced” means an opening single or double quote must be closed by the same kind of quote (no mixing), and an opening paren must have a corresponding closing paren.

PCRE:

The match will be in capturing group #3.

In short, this regular expression uses a positive lookahead (?=...) within a lookaround conditional (?(...)...) and the backreference \2 to match the initiating character. If it is a single quote or double quote, it was captured in \2; if it was an open paren (, it is not contained in a capture group, and therefore the pattern should match a close paren instead. In retrospect, this could be enhanced, and it also needs some tweaks for JavaScript compatibility since lookaround conditionals are not supported there.

Do note that this pattern is anchored to the beginning and end of the string (^ and $), so it may require tweaking for other applications.
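Since JavaScript has no lookaround conditionals, one possible workaround is plain alternation. The following is a minimal sketch under that assumption, not a port of the PCRE pattern: the names BALANCED and unwrap are hypothetical, and the match lands in group 2 (quotes) or group 3 (parens) rather than in a single group. It is anchored with ^ and $ like the original, stays greedy, and only checks the outermost pair of delimiters.

```javascript
// Hypothetical pattern: a quote closed by the same quote, or ( closed by ).
const BALANCED = /^(?:(["'])(.*)\1|\((.*)\))$/;

// Hypothetical helper: returns the wrapped text, or null when unbalanced.
function unwrap(input) {
  const m = BALANCED.exec(input);
  if (m === null) return null;
  // Group 2 is set for quoted input, group 3 for parenthesised input.
  return m[2] !== undefined ? m[2] : m[3];
}

console.log(unwrap('"hello world"')); // hello world
console.log(unwrap("(hello)"));       // hello
console.log(unwrap('"hello\''));      // null (mixed quotes)
```

Because the inner .* is greedy, a quote character inside the wrapped text (as in "'hello world") is simply returned as part of the content.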

Test Cases

regex101 link: https://regex101.com/r/fG4rZ8/3

Matches:

These will match hello, hello world, or in some cases 'hello world for the last test case.

Doesn’t Match:

Node.js: Break on Uncaught Exceptions

In the event the implemented error handling isn’t what it’s all cracked up to be, having the Node.js debugger break on uncaught exceptions is quite simple:

Assume the following code, where bogus  is undefined:

Line 3 will terminate the application (normally), and log an exception to stderr (or the console, a log, etc.), even when Node.js is in debug mode.

To debug this uncaught exception JIT, the implementation may be as follows:

Run the above application (whose filename is “index.js”) as node --debug index.js, and after 2 seconds (notice the timeout) the exception will be uncaught, but the process will break on line 5 for debugging.
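A minimal sketch of this approach, assuming the handler is registered on the process-level uncaughtException event and contains a debugger statement. The lastUncaught variable is a hypothetical addition for inspection, and the line numbers here will not necessarily match the ones mentioned above. On current Node.js releases the legacy --debug flag has been replaced by --inspect (or the node inspect command), so the launch command may need adjusting.

```javascript
// Hypothetical record of the most recent uncaught error, for inspection.
let lastUncaught = null;

process.on('uncaughtException', function (err) {
  lastUncaught = err;
  debugger; // execution pauses here when a debugger is attached
  console.error('Uncaught exception:', err.message);
});

// After 2 seconds, call a function that was never defined.
setTimeout(function () {
  bogus(); // throws a ReferenceError, which reaches the handler above
}, 2000);
```

Without a debugger attached, the debugger statement is a no-op and the handler just logs the error.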

See this gist for an auxiliary and shortened reference.

Node.js: Dynamic Event Listener Binding and Removal

Read this SO question first: Removing event listener which was added with bind

Now that we’re on the same page, I’m going to make this short’n’sweet by providing a premise and letting the code do the talking: the example provided in the SO post was all fine and dandy, but it only covered the single-use case. I needed a way to clean up event listeners on a Node.js net.Server and its associated socket connections.
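One way to approach this, sketched under the assumption that the core problem is keeping a reference to each bound function so that removeListener can find it later. The bindListeners helper and the state object are hypothetical; a plain EventEmitter stands in for net.Server and net.Socket, which are event emitters as well.

```javascript
const { EventEmitter } = require('events');

// Hypothetical helper: bind each handler to `context`, attach it, and return
// a function that detaches exactly those bound listeners again.
function bindListeners(emitter, context, handlers) {
  const bound = Object.keys(handlers).map((event) => {
    const listener = handlers[event].bind(context);
    emitter.on(event, listener);
    return { event, listener };
  });
  return function unbind() {
    bound.forEach(({ event, listener }) => {
      emitter.removeListener(event, listener);
    });
  };
}

// Usage with a plain emitter standing in for a server.
const emitter = new EventEmitter();
const state = { hits: 0 };
const unbind = bindListeners(emitter, state, {
  connection() { this.hits += 1; },
});

emitter.emit('connection'); // state.hits is now 1
unbind();
emitter.emit('connection'); // no listener left, state.hits stays 1
```

Calling bind creates a new function each time, which is why the bound reference has to be stored: passing the original handler to removeListener would not match anything.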

 

 

Basic PHP Class Serialization

Just a quick brain dump regarding serializing classes in PHP. There is nothing advanced about this, and TBH I haven’t even read up on the caveats of serializing classes in PHP – perhaps a task for another day.

Judging from the string output after serializing the class, it appears that it is simply a named object. When unserializing, the class must be defined to unserialize completely, otherwise you’re stuck with an instance of __PHP_Incomplete_Class. Hint: when in the same file, class definitions work just like function definitions, i.e. you can put them anywhere, before or after target code.

Notice the Fatal Error above. This example didn’t follow my instructions of having the class “X” defined. To remedy this, simply:

And now, with both PHP snippets merged into a single file, this is what the output looks like:

 

Playing Around with PHP Namespaces, Classes, and Variable Instantiation

Just needed to do some sanity checks, figure out some syntactical sugar (i.e. tricks), and test certain scenarios. I figured I’d share my findings with the world. Explanations are very light (brain dump) and embedded in the code; output follows directly. Oh, and this is a single stand-alone file.

php://stdin

Well, this really isn’t php://stdin, but the headline looks cool.

php://stdout

Yes, this time it really is stdout:

 

Scripting Parallel Bash Commands (Jobs)

Sometimes commands take a long time to process (in my case importing huge SQL dumps into a series of sandboxed MySQL databases), in which case it may be favorable to take advantage of multiple CPUs/cores. This can be handled in shell/bash with the background control operator & in combination with wait. Got a little insight from this StackOverflow answer.

The Technique

The break-down is commented inline.

Notes

  • This does not function like a pool (task queue) where once a “thread” is freed up it will be immediately eligible for the next task, although I don’t see why something like this couldn’t be implemented with a little work.
  • wait pauses script execution until every PID it is given has completed, and only then moves on.
  • Once the if...fi control block is entered, wait will cause the for...in loop to suspend.
  • $! (or ${!}) contains the PID of the most recently started background command; make sure it is read directly after the command launched with &. Throw it into a variable (like lastPid) for future use.
  • This is not multi-threading, although this simplified concept is similar. Each command spawns a separate process.
  • read is just creating a multiline list; in my specific case this was favorable over a bash array, but either will work.
Screenshot of htop showing parallel process tree.

JavaScript Inheritance and Method Overriding

I’m always finding myself wanting to use JavaScript like a classical OOP language with class-based inheritance, even though I know it is prototypal. Anyways, I’m jotting down how to do this and also allow for overriding (not overloading) methods. Note that if you’re using Node.js, you don’t need to define the inherits function; it is already available as require('util').inherits.
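A minimal sketch of the pattern, assuming an inherits helper that behaves roughly like util.inherits. The Animal and Dog constructors are hypothetical examples; the override calls the parent implementation explicitly through the parent's prototype.

```javascript
// Hypothetical stand-in for util.inherits: link the prototypes and keep a
// reference to the parent constructor on the child.
function inherits(Child, Parent) {
  Child.super_ = Parent;
  Child.prototype = Object.create(Parent.prototype, {
    constructor: {
      value: Child, enumerable: false, writable: true, configurable: true,
    },
  });
}

function Animal(name) {
  this.name = name;
}
Animal.prototype.speak = function () {
  return this.name + ' makes a sound';
};

function Dog(name) {
  Animal.call(this, name); // run the parent constructor
}
inherits(Dog, Animal);

// Override speak, reusing the parent's implementation.
Dog.prototype.speak = function () {
  return Animal.prototype.speak.call(this) + ' (woof)';
};

console.log(new Dog('Rex').speak()); // Rex makes a sound (woof)
```

The inherits call has to come before any methods are added to Dog.prototype, because it replaces that prototype object.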

Check the console:

 

Node.js: Split/Count Number of Lines (Unicode Compatible)

The task was to read n lines from a file in Node.js. The assumption is that the read buffer is UTF-8. The caveat is that there is more to line breaks in Unicode than just “New Line (ASCII 10 or \n)” and “Carriage Return (ASCII 13 or \r)”. This SO comment put me in the right direction.

Unicode Technical Standard #18 § 1.6 describes the following rule regarding line boundaries (PS and LS are defined in Unicode 6.1 § 16.2 Layout Controls):

RL1.6 Line Boundaries
To meet this requirement, if an implementation provides for line-boundary testing, it shall recognize not only CRLF, LF, CR, but also NEL (U+0085), PS (U+2029) and LS (U+2028).

To put it into practice, the number of lines can be tallied while the ReadStream has incoming data:

The key is the regular expression, derived straight from RL1.6, /\r\n|[\n\r\u0085\u2028\u2029]/, which looks for any of the possible new line characters with precedence to CRLF (greedy). Don’t forget to listen for the “end” event and also remove the “data” listener on “end” or “error”!
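The tallying can be sketched as follows. This is an assumption-laden sketch rather than the original code: countLineBreaks is a hypothetical helper, it returns a Promise, and it adds one detail not discussed above, namely correcting for a CRLF pair that happens to be split across two chunks.

```javascript
const LINE_BREAK = /\r\n|[\n\r\u0085\u2028\u2029]/g;

// Hypothetical helper: resolves with the number of line breaks read from
// `stream` (any readable stream, e.g. one from fs.createReadStream).
function countLineBreaks(stream) {
  return new Promise((resolve, reject) => {
    let count = 0;
    let endedWithCR = false;
    stream.setEncoding('utf8'); // decode multi-byte characters across chunks

    function onData(chunk) {
      if (chunk.length === 0) return;
      // A CRLF split across two chunks was already counted at the CR.
      if (endedWithCR && chunk.startsWith('\n')) count -= 1;
      const matches = chunk.match(LINE_BREAK);
      if (matches) count += matches.length;
      endedWithCR = chunk.endsWith('\r');
    }
    function cleanup() {
      stream.removeListener('data', onData);
      stream.removeListener('end', onEnd);
      stream.removeListener('error', onError);
    }
    function onEnd() { cleanup(); resolve(count); }
    function onError(err) { cleanup(); reject(err); }

    stream.on('data', onData);
    stream.once('end', onEnd);
    stream.once('error', onError);
  });
}

// Possible usage (the file name is a placeholder):
// countLineBreaks(require('fs').createReadStream('input.txt')).then(console.log);
```

The result is the number of line breaks, so a file whose last line has no trailing break contains one more line than the count.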

In the context of splitting incoming data by line, simply calling  chunk.toString('utf8').split(/\r\n|[\n\r\u0085\u2028\u2029]/g) will do the trick. Note that if working with CSV files, parsing will require a bit more effort if escaped new-lines will need to be taken into consideration, and PS, LS and NEL are probably not characters that should act as line delimiters in that context anyway.
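One thing the one-liner does not cover is a line (or a CRLF pair) that straddles two chunks. A minimal sketch of a splitter that carries the unfinished tail over to the next chunk, assuming the chunks have already been decoded to strings; createLineSplitter and LINE_SPLIT are hypothetical names.

```javascript
const LINE_SPLIT = /\r\n|[\n\r\u0085\u2028\u2029]/;

// Hypothetical helper: push() returns the complete lines seen so far and
// holds back the unfinished tail; flush() returns whatever is left at the end.
function createLineSplitter() {
  let remainder = '';
  return {
    push(text) {
      let data = remainder + text;
      let held = '';
      // A trailing CR might be the first half of a CRLF, so hold it back.
      if (data.endsWith('\r')) {
        held = '\r';
        data = data.slice(0, -1);
      }
      const parts = data.split(LINE_SPLIT);
      remainder = parts.pop() + held; // last piece may be incomplete
      return parts;
    },
    flush() {
      const parts = remainder.split(LINE_SPLIT);
      remainder = '';
      if (parts[parts.length - 1] === '') parts.pop();
      return parts;
    },
  };
}

const splitter = createLineSplitter();
console.log(splitter.push('a\r'));         // []
console.log(splitter.push('\nb\u2028c'));  // [ 'a', 'b' ]
console.log(splitter.flush());             // [ 'c' ]
```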

Node.js: Handling File Uploads without Express

I was working on a little project which does not utilize express. Part of the program needed to handle file uploads. In PHP this is easy with the $_FILES superglobal, and with express this is just as easy with req.files. In pure node.js the developer doesn’t have this luxury because they’re just dealing with raw HTTP requests and responses (I’m surprised node.js even parses the headers, it is a miracle!). Before I begin: this whole process should optimally be streamed to a file so large file uploads do not consume more RAM than the size of a single request chunk, but I didn’t do this, shame on me.

First, capture the request body. Here I am assuming you have an HTTP server that is listening and responding to requests:
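A minimal sketch of this step, assuming the whole body is buffered in memory (which, as noted above, is not ideal for large uploads). The collectBody helper is hypothetical, and the server is created but deliberately not started here.

```javascript
const http = require('http');

// Hypothetical helper: gather all chunks of a request into one Buffer.
function collectBody(req, callback) {
  const chunks = [];
  req.on('data', (chunk) => chunks.push(chunk));
  req.on('end', () => callback(null, Buffer.concat(chunks)));
  req.on('error', (err) => callback(err));
}

// Not listening yet; call server.listen(port) to accept requests.
const server = http.createServer((req, res) => {
  collectBody(req, (err, body) => {
    if (err) {
      res.statusCode = 400;
      res.end();
      return;
    }
    // `body` holds the raw multipart payload, ready to be parsed.
    res.end('Received ' + body.length + ' bytes\n');
  });
});
```

Keeping the chunks as Buffers and concatenating once at the end avoids corrupting binary file data, which string concatenation could do.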

Here, I’ll give you the entire source to my MultipartParser Node.js module. It requires nodeproxy.

And the implementation is as follows:

LittleDiddy does not have any path routing (e.g. it won’t actually show a form that uses enctype="multipart/form-data"), it simply accepts and parses a POST with “file” inputs. This “little diddy” (implementation example) is untested by the way, so good luck.

This article’s source is released under the MIT license; I don’t care what you do with it.

PHP + OpenSSL = RSA Shared Key Generation

OK, so I’m super excited and wanted to jot down a note about RSA shared key generation.

Drama Mama

First, I wanted to generate some PEM formatted RSA public and private key pairs. I was going to do it the risky/hard way and exec openssl commands. Using exec is not a good idea if avoidable, but I was desperate. Just to put my mind at ease, I did some quick googling and found out it is possible to do this natively in PHP if you have the openssl extension installed and configured. FANTASTIC!

Then I got the curve ball. Using openssl_pkey_get_public doesn’t work as expected: it wouldn’t take a pkey resource handle and return the public key. How deceiving! I thought the thrill was over and I’d have to resort back to exec until the almighty Brad shed his light upon us… four years ago (doh).

TL;DR

Here’s the code that generates the RSA shared key and fetches the public and private keys.

 

Alternatively, one could call the openssl commands directly: