Monday, June 24, 2013

Language trends

Continuing from my last post on this topic,
I read a Dr. Dobb's article (kind of happy that this publication still exists... back in college (83-88), as my friends and I were getting into the PC world, it was the publication we respected the most).
Dr. Dobb's post :

Language surveys mentioned :
 ohloh
 tiobe

I would add to these the
 hacker news poll
 github stats  (can't find historical version of this)


Finally, regarding google trends as a metric for judging the reach of a language:
Google trends is good for assessing whether a language is picking up or in decline.
It is harder to use as a comparative metric... many languages have less unique names as search terms than others, and adding a keyword (e.g. "programming") next to them makes the comparison less apples-to-apples (nodejs programming vs java programming vs ...)

I was discussing this again with sk, and I wanted to make the argument that there is a trend towards a single language across all layers (js on the front end, nodejs on the back end, mongodb for the db).
Anyway, using the github.com/language page from archive.org,

here are the data  (the data show that they can't be relied upon too heavily: VimL made it into the top 10 a couple of years ago, and the swings are too massive => they must be counting file commits over short terms instead of something more reliable...)

2009
Ruby  30%
JavaScript  18%
Python  9%
Shell 7%
Perl  6%
C 6%
PHP 5%
Java  4%
C++ 4%
Objective-C 2%

2010
Ruby  19%
JavaScript  16%
Perl  12%
Python  9%
Shell 7%
PHP 7%
C 6%
Java  5%
C++ 4%
Objective-C 2%

2011
JavaScript  19%
Ruby  17%
Python  9%
C 8%
PHP 7%
Shell 7%
Perl  7%
Java  7%
C++ 4%
VimL  2%

2013
JavaScript  21%
Ruby  12%
Java  8%
Shell 8%
Python  8%
PHP  7%
C 6%
C++ 5%
Perl  4%
CoffeeScript  3%

Thursday, June 20, 2013

Working notes - sorter.herokuapp.com

So where am I :

I was implementing a travis-ci-like badge that provides a lines-of-code report.

Purpose
  - get some exposure building travis-ci-like services
  - see how viral sth like this can be
  - leverage a rather amazing program like perl's cloc

In doing so I decided to structure the service on top of a web-service version of cloc (asking: what modules, if they existed, would make the implementation of whatever you want to develop trivial?).
In doing so I realized/decided that implementing the cloc web service (i.e. exposing the plain cloc functionality as a web service) would need some more basic glue that could be reused by anyone who wants to expose a typical stdin/stdout unix utility as a functional web service. And in doing that I realized that to get the full power of the latter, I would need some way to describe nodejs streams so that they do their streaming server-side...  (all that in http://www.otdump.com/2013/06/unix-utilities-as-web-services.html)


Anyway, this was yesterday. I asked my friend gl for an opinion, and he saw me falling into my typical infinite recursive loop... so he suggested I cut it short and return to the original cloc web service
(which was itself already far deeper than where I was supposed to be).
He probably didn't hear that I am on vacation, which in this context means that I am Alice and I have no problem chasing rabbits wherever they might take me...

So, 1 day later, I have my first unix utility as a web service:
 http://sorter.herokuapp.com

You can use it rather simply:
> cat /etc/passwd|curl --data-binary @- 'http://sorter.herokuapp.com/?args=-t&args=%3A&args=-k&args=3&args=-n'
....
_krb_changepw:*:932:-2:Open Directory Kerberos Change Password Service:/var/empty:/usr/bin/false
_krb_kerberos:*:933:-2:Open Directory Kerberos:/var/empty:/usr/bin/false
_krb_anonymous:*:934:-2:Open Directory Kerberos Anonymous:/var/empty:/usr/bin/false

%3A is the URL-encoded ":" ; the query above is equivalent to running sort -t : -k 3 -n locally.
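To make the flag mapping concrete, here is a sketch of the equivalent local run (the sample input is made up):

```shell
# sort numerically on the 3rd ':'-separated field, same as the query args above
printf 'a:x:10\nb:x:2\n' | sort -t : -k 3 -n
# prints:
# b:x:2
# a:x:10
```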
The code is surprisingly simple
https://github.com/ogt/sorter/blob/master/index.js


The interesting things in the code are:
- Apparently req and res are streams, readable and writable respectively.
- Any process you get a handle on (with spawn, exec etc., or just via the global process object) gives you access to file descriptors like stdin and stdout that are themselves streams.
- A handy little function, "child", creates a duplex stream out of anything that has a stdin and a stdout.
- You need duplex streams to do things like a.pipe(b).pipe(c).pipe(d): b and c need to both send and receive, while a just sends down the pipe and d only receives from its left...
So, req.pipe(proc).pipe(res) requires very little documentation to explain what it does!

The testing client would do sth like:

var http = require('http');

// e.g. POST with the sort args in the query string
var options = { method: 'POST', host: 'sorter.herokuapp.com', path: '/?args=-n' };
var req = http.request(options, function (res) {
  res.pipe(process.stdout);
});
process.stdin.pipe(req);


We could create a duplex stream out of req/res (there is a utility function called duplexer or something like it in event-stream) so as to make the lines above look more like the server's... but that would come at a cost: we only get a handle on both req and res after the callback fires, which means we would have to do the piping inside the callback. And the callback only fires once the response starts arriving; for a utility like sort, which reads all its input before writing anything, the response won't start until the request body is finished, so piping stdin inside the callback would never get going. The code above starts pushing data to the server right away instead.

Anyway, the fun thing is that this was sooo easy to do. On top of that, heroku seems to include the core unix utilities in its minimal slugs, so this works on heroku.
This means that I can create, with practically no effort, one app (seder, gziper, gunziper, awker, sorter, ...) for anything I like... no effort, and I get functionality (and service/cpu time) for free.
I like the xx-er terminology; it's funky enough that probably nobody has taken these names. I will probably push the single 10-line function of the server into a module... and then the project would be practically just the scaffolding of every nodejs heroku app, nothing more. (By creating a separate app per utility, I get one server per utility, so more cpu. Besides, most unix utilities (e.g. ls) don't make sense this way.) Maybe I will create an automated way to generate these projects/repositories/deploys... how about docs? All I would need is a man-page-to-markdown converter; that, together with a common header, would suffice...



Tuesday, June 18, 2013

Unix utilities as web services

I wanted to convert a very useful unix utility (cloc) into a web service.
I thought that someone might have already created a little glue that can do that.
Especially given nodejs and the recent love that streams are getting there (https://github.com/substack/stream-handbook), together with the renewed focus on small unix utility programs fit together via stdins/stdouts etc., someone must have done it.

Anyway, so far I can't find anyone.
The only thing I found is someone's 5-year-old blog post with broken references (github and gists weren't that popular then, so no trace of whatever this guy did...)
http://swik.net/Unix/BASH+Cures+Cancer+Blog/Exposing+command+line+programs+as+web+services/b3x8u

This guy seems to have the right idea:

$ ./to_web.py -p8008 sort &
Thu Mar 27 13:45:54 2008 sort server started - 8008
$ ./to_web.py -p8009 gzip &
Thu Mar 27 13:46:29 2008 gzip server started - 8009
Use the services:
$ for i in {1..10}; do echo ${RANDOM:0:2}; done | \
> curl --data-binary @- "http://swat:8008/sort+-nr" | \
> curl --data-binary @- "http://swat:8009/gzip" | \
> gunzip
97
37
23
23
21
18
11
11
10
10
-
Note that to_web.py seems to be doing just a popen of the command, passing the stdin/stdout of the server to the command. Even the URL query string is passed intact as  ... no need to think much about mapping cli parameters to url params.
The whole thing looks like substack's way of doing things.
(Absent anything extra that substack may have already added, I would use
http://nodejs.org/docs/v0.4.6/api/child_processes.html  as a starting point to write the ~10-line node equivalent of to_web.py
... possibly by using .pipe() instead of tying outs and ins together manually: https://github.com/substack/stream-handbook)

Any command that is side-effect free and uses just cpu/mem resources can be made available to third parties... possibly with a somewhat lower-than-normal ulimit to prevent abuse... just like travis-ci...
I wonder how many of these commands are part of what heroku exposes... if it exposes them, it should be easy to create a free web-service version of all unix utilities. All that would be needed is some consistent naming for the domains... hm, and some client-side pipe operation (would we need any syntactic sugar?) across the web services. If I recall correctly, streams and the related libs can be browserified just fine...
On the other hand, ideally I should be able to hook web services to each other directly, as opposed to using the browser as the intermediate point.
If, for example, I want to take a gzipped csv file, unzip it, sort it and re-zip it, assuming that I have
gzip.herokuapp.com, gunzip.herokuapp.com and sort.herokuapp.com, how would I do it while keeping the stream of bytes going from one web service to the next???

One ugly way would be to call gunzip with a special option:
"send the response to sort.herokuapp.com instead of me".
But that alone wouldn't work, because then sort would send its output back to me instead of on to gzip.herokuapp.com.
I guess we would have to call gunzip with a special option saying
"send the response to sort.herokuapp.com instead of me, and when you do, include a special option telling sort to
send its response to gzip.herokuapp.com instead of you"...

So the trick is to create a library that provides the "syntactic sugar" on top of the ugly thing above, so that all I have to say is something very similar to:
gunzip.herokuapp.com PIPE sort.herokuapp.com PIPE gzip.herokuapp.com

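If each service understood a hypothetical "next" query parameter ("POST your output there instead of replying to me"), the sugar could be little more than a function folding the chain into nested urls. Everything here is made up: the next convention, the webpipe name, all of it.

```javascript
// Fold a chain of service urls into one nested-`next` url (hypothetical convention).
function webpipe(urls) {
  return urls.reduceRight(function (next, url) {
    return next ? url + '?next=' + encodeURIComponent(next) : url;
  }, null);
}

var chain = webpipe([
  'http://gunzip.herokuapp.com/',
  'http://sort.herokuapp.com/',
  'http://gzip.herokuapp.com/'
]);
// chain starts with: http://gunzip.herokuapp.com/?next=http%3A%2F%2Fsort.herokuapp.com...
```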


Sunday, June 16, 2013

Exporting from evernote

All my evernote notes that are a "startup idea" have a distinctive title, "NIDEA - xxx" (tags are nice, but for some types of notes I add the unique keyword as part of the title... e.g.
HOWTO - xxx
NIDEA - xxx
TODOS - xxxx
)
anyway, I wanted to capture the titles of all my nideas into a file so I could work with them... how do you do that with evernote?
Ok, simple:
1. Do the evernote search that finds them, in my case 'nidea'
2. Select All, or multi-select individual notes, whichever you want
3. File -> Export notes -> exports all the notes to some folder, by default as one html file per note
4. cd to that folder, run ls -1, capture the output (the filenames are the titles) and paste it into your file.
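Step 4 in shell form, demoed against a throwaway folder (the real one is wherever evernote exported your notes):

```shell
# fake the exported notes (one html file per note, filename = title)
mkdir -p /tmp/nideas && cd /tmp/nideas
touch 'NIDEA - foo.html' 'NIDEA - bar.html'

# filenames minus the .html extension are exactly the note titles
ls -1 *.html | sed 's/\.html$//' > titles.txt
cat titles.txt
# prints:
# NIDEA - bar
# NIDEA - foo
```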

Travelling...

So, one more indication that blogger sucks: http://realgl.blogspot.gr/2013/05/omg.html ...
I am travelling, and when I reopen blogger to edit an older post, it now treats me as if I am a different person...
It uses the origin of my current IP as a more important indication of my language than the last several years of IP/country history, my own preferences and whatever else...
I can see how this happens in apps that have some visitor-facing components and some signed-in components... but this one is clearly a signed-in user display...
I think by now it is clear that blogger has been left practically the way it was when Google bought it 10 years ago (http://en.wikipedia.org/wiki/Blogger_(service)), with just a bug-maintenance crew dedicated to it.
I guess I should be the one to blame... I am typically more appreciative of services left (alone) running forever (compared to services that keep changing their interfaces unnecessarily, making it increasingly harder to find whatever functionality made you come to the product in the first place)...
So I am to blame, because I keep doing my blog posts on blogger instead of using something better...


Technology adoption delays

The first time I used github's "paste an image in a text area by just doing Cmd-V" feature I was hooked.
I mean, what was long obvious in every word processor took a remarkably long time to arrive in a web browser context. Even gmail took its time: for a while, pasting a picture into the message box didn't work, or worked differently from attachments... it was a bit confusing.
But with github's editor everything made sense. After that I started expecting to find it in all other places,
be it in evernote, in my blogger edit box, in oD job posts, everywhere.
And every time I was annoyed that whoever authored the sw I was using didn't do what github did:
![uploading image ...]
.....
![image](https://f.cloud.github.com/assets/153419/659286/bbb6622a-d67d-11e2-9022-84f81400a0a3.png)
So simple: use normal editor text to provide a progress indicator, upload the picture somewhere, create an unguessable url for it, and then do whatever would have been the natural thing to display a picture that is available online via a url...

That approach would definitely work for blogger and oD job post/messages..
That approach may not have worked for evernote and gmail (the images in this case should have become message attachments).

Anyway, my key question is why a simple innovation like this
(there is both a product-functionality innovation and an implementation innovation in this case) takes so long to spread:

- The authors of one system aren't exposed to the other system.
- The features of one system (e.g. blogger) are decided by a pm, and pms aren't using github. The people that use github (e.g. programmers) aren't in charge of making product decisions in their normal (non-github) life as programmers working for some company.
- People like web designers are trained in a mode of: "I see all the services I use; anytime I see an interesting effect, font, or ui thingy, I check it out, figure out how to do it, often link it up in my blog, to make sure that I do the same thing soon... somewhere." In that sense, contrary to "programmers", "designers" are both users of and decision makers for whatever they do, and that allows "design" innovation to spread more rapidly than developer innovation.
- While most things are reverse-engineerable in web design, that's not the case with anything that requires server-side functionality. This is probably the case with the "paste an image" feature: is the code for doing that available publicly? Probably not (checking it out... looking for something similar to http://techcrunch.com/2013/01/02/github-replaces-copy-and-paste-with-zeroclipboard/)


(After writing all this I went to catch up on the blog of my friend gl, and read this post (from a month ago):
http://realgl.blogspot.gr/2013/05/omg.html. The funny thing is that I think I am the "friend" mentioned who said "oh yeahh... it's been around for years" (I actually think that github's paste is fresh... gmail's is older...).)

Vacations



I understand why vacations feel good on the first day:
because I can make plans about all the things I will achieve during my vacation, and I don't yet have a single indication of the fallibility of those plans... I am on vacation, I am a new person, in a new place, I can do/be anything I want. Of course, with every day that passes, the indications that your plans will fail, like every time before, start mounting...