Thursday, June 20, 2013

Working notes - sorter.herokuapp.com

So where am I:

I was implementing a travis-ci-like badge that reports lines of code.

Purpose
  - get some exposure to building travis-ci-like services
  - see how viral something like this can be
  - leverage a rather amazing program like cloc (written in Perl)

In doing so I decided to structure the service on top of a web service version of cloc. (The question I like to ask: which modules, if they existed, would make implementing whatever you want to develop trivial?)
In doing so I realized/decided that implementing the cloc web service (i.e. exposing the plain cloc functionality as a web service) would need some more basic glue, glue that could be reused by anyone who wants to expose a typical stdin/stdout-processing unix utility as a functional web service. In doing so I realized that to get the full power of the latter I would need some way to describe nodejs streams so that they do their streaming server-side... (all that is in http://www.otdump.com/2013/06/unix-utilities-as-web-services.html)


Anyway, this was yesterday. I asked my friend gl for an opinion, and he saw me falling into my typical infinitely recursive loop... He suggested that I cut my losses and return to the original cloc web service
(which was itself already far deeper than where I was supposed to be).
He probably didn't hear that I am on vacation, which in this context means that I am Alice and I have no problem chasing rabbits wherever they might take me...

So, 1 day later, I have my first unix utility as a web service:
 http://sorter.herokuapp.com

You can use it rather simply:
> cat /etc/passwd|curl --data-binary @- 'http://sorter.herokuapp.com/?args=-t&args=%3A&args=-k&args=3&args=-n'
....
_krb_changepw:*:932:-2:Open Directory Kerberos Change Password Service:/var/empty:/usr/bin/false
_krb_kerberos:*:933:-2:Open Directory Kerberos:/var/empty:/usr/bin/false
_krb_anonymous:*:934:-2:Open Directory Kerberos Anonymous:/var/empty:/usr/bin/false

%3A is the URL-encoded ':', so the request above asks for the equivalent of sort -t : -k 3 -n.
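To make the encoding concrete, here is a minimal sketch of how a list of command-line arguments could be turned into the repeated args query parameters used above, and read back on the other side. The helper names buildUrl and parseArgs are hypothetical (they are not taken from the sorter code); the sketch relies only on the URL class built into Node.js.

```javascript
// Hypothetical helpers, not part of the sorter repository.

// Append each argument as a repeated "args" query parameter.
function buildUrl(base, args) {
  const url = new URL(base);
  for (const arg of args) {
    url.searchParams.append('args', arg); // ':' is percent-encoded as %3A
  }
  return url.toString();
}

// Recover the argument list from a request URL (path + query string).
// The second argument is only a dummy base so relative URLs can be parsed.
function parseArgs(requestUrl) {
  return new URL(requestUrl, 'http://localhost').searchParams.getAll('args');
}

const url = buildUrl('http://sorter.herokuapp.com/', ['-t', ':', '-k', '3', '-n']);
console.log(url);
// http://sorter.herokuapp.com/?args=-t&args=%3A&args=-k&args=3&args=-n
console.log(parseArgs('/?args=-t&args=%3A&args=-k&args=3&args=-n'));
// [ '-t', ':', '-k', '3', '-n' ]
```

Repeating the same parameter name keeps the arguments in order, which matters for flags like -t and -k that take a value.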
The code is surprisingly simple:
https://github.com/ogt/sorter/blob/master/index.js


The interesting things in the code are:
Apparently req and res are streams, readable and writable respectively.
Any process you get a handle on (with spawn, exec, etc., or just via the global process.xx) gives you access to stdin, stdout and friends, which are themselves streams.
A handy little function, "child", creates a duplex stream out of anything that has a stdin and a stdout.
You need duplex streams to do things like
a.pipe(b).pipe(c).pipe(d)
Here b and c need to both send and receive, a just sends down the pipe, and d only receives from its left...
So req.pipe(proc).pipe(res) requires very little documentation to explain what it does!
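For illustration, here is a rough sketch of what such a server could look like using only Node.js built-ins. It is not the code from the repository linked above: the child function below is a stand-in for the helper mentioned in the post, built on Duplex.from (which assumes Node.js 16.8 or later), and makeServer is a hypothetical name.

```javascript
const http = require('http');
const { spawn } = require('child_process');
const { Duplex } = require('stream');

// Wrap anything exposing stdin/stdout as a single duplex stream:
// writes go to stdin, reads come from stdout.
// Stand-in for the post's "child" helper; assumes Node.js 16.8+.
function child(proc) {
  return Duplex.from({ writable: proc.stdin, readable: proc.stdout });
}

// Build (but do not start) a server that runs `command` once per request.
// Hypothetical helper. Handing caller-supplied arguments straight to a
// program is risky on a public service; a real deployment would probably
// want to restrict them.
function makeServer(command) {
  return http.createServer((req, res) => {
    // Repeated ?args=... parameters become the argument list.
    const args = new URL(req.url, 'http://localhost').searchParams.getAll('args');
    const proc = spawn(command, args);
    const duplex = child(proc);

    // Minimal error handling: answer with a 500 if nothing was sent yet.
    const fail = () => {
      if (!res.headersSent) res.statusCode = 500;
      res.end();
    };
    proc.on('error', fail);
    duplex.on('error', fail);

    // Request body -> utility's stdin, utility's stdout -> response body.
    req.pipe(duplex).pipe(res);
  });
}

// Usage: makeServer('sort').listen(process.env.PORT || 3000);
```

The whole service is still the one pipe chain; everything around it is argument parsing and error handling.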

The testing client would do something like:

var req = request(options, function(res) {
  res.pipe(process.stdout);
});
process.stdin.pipe(req);


We could create a duplex stream out of req/res (there is a utility function called duplexer or something like it in event-stream) so that the lines above look more like the server... but that would come at a cost: we only have a handle on both req and res once the callback is called, which means that we would need to do the "piping" inside the callback. It's not a big deal, but the code above starts pushing data to the server earlier... I think.
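As a sketch of that ordering, the client can be wrapped in a small function using Node's built-in http module. The name pipeThrough is hypothetical, the URL in the usage line is just the service from this post, and error handling is left out to keep it short.

```javascript
const http = require('http');

// Stream `input` to the service as the POST body and stream the reply
// into `output`. Hypothetical helper; no error handling in this sketch.
function pipeThrough(url, input, output) {
  const req = http.request(url, { method: 'POST' }, (res) => {
    // Only inside this callback do we hold both req and res.
    res.pipe(output);
  });
  // Piping is set up right away, without waiting for the callback above.
  input.pipe(req);
  return req;
}

// Usage:
// pipeThrough('http://sorter.herokuapp.com/?args=-n', process.stdin, process.stdout);
```

Moving input.pipe(req) inside the callback would not work as written, since the request has to be sent before a response can arrive; that is one reason the duplex version needs extra plumbing.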

Anyway, the fun thing is that this was sooo easy to do. On top of that, heroku seems to include the core unix utilities in its minimal slugs, so this works on heroku.
This means that I can create, with practically no effort, one app (seder, gziper, gunziper, awker, sorter, ...) for any utility I like: no effort, and I get functionality (and service/cpu time) for free.
I like the xx-er terminology; it's funky enough that probably nobody has taken these names. I will probably push the single 10-line function of the server into a module, and then each project would be practically just the scaffolding of every nodejs heroku app. Nothing more. (By creating a separate app per utility, I get one server per utility, which means more cpu. Besides, most unix utilities (e.g. ls) don't make sense this way.) Maybe I will create an automated way to generate these projects/repositories/deploys. How about docs? All I need is a man-page-to-markdown converter, and that together with a common header would suffice...


