Saturday, December 14, 2013

Think of this every time you search

Many people have often wondered how much money it costs Google every time we perform a search: the effort to get everything perfect, to respond instantly, to produce 100 results, to re-run the search multiple times while you type, etc. etc... Even though we understand that the resources involved have to be significant, we also understand that the cost of a web page request (even at an expensive site) is small enough that you would need to batch a thousand requests to really talk about meaningful numbers.

Starting from that point I asked myself the question:
 - How much money does Google earn every time I search?
   My first instinctive answer would have been:
       Nothing really, given that I never really click on ads... but if you have millions searching... that quickly adds up

Which is completely wrong:
Google effectively charges per impression, not per click: given that the click-through rate of the ads is carefully regulated, and advertisers are penalized if the CTR falls below 0.5-1%... we can make the approximation that the average impression nets Google about 1% of the average click.
What is the avg CPC? I would definitely put it at $1 for US geos... which means that Google makes about 1 cent per ad impression.

Quick reality check
 - googling average ctr on google => 3%
 - googling average cpc on google => $0.35
i.e. CPC was 1/3rd of what I thought but CTR is 3x what I thought; their product is exactly what I thought (smiling)
Ok. So Google makes 1 cent per search of mine? No. Google makes 1 cent for each ad that is impressed on my search results. How many ads are there on my results? 3 at the top, 7-8 on the side, and recently more at the bottom. Plus I get more ads displayed as I am writing my query, since the page refreshes results+ads on demand... So lots of ads.
If the avg # of ads is 10... then Google makes 10 cents every time I do a search, whether I click or not!!!!
Unreal!
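For the record, the back-of-envelope arithmetic works out as follows (every figure here is this post's rough guess, not an actual Google number):

```javascript
// Back-of-envelope estimate; every figure here is a guess from this post.
var avgCpc = 0.35;     // average cost per click, USD
var avgCtr = 0.03;     // average click-through rate
var adsPerSearch = 10; // rough number of ads on one results page

// If advertisers pay per click, each impression is worth cpc * ctr on average.
var perImpression = avgCpc * avgCtr;          // ~1 cent per ad shown
var perSearch = perImpression * adsPerSearch; // ~10 cents per search

console.log('revenue per search: ~$' + perSearch.toFixed(3));
```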

Cool node projects

Three related interesting projects: they open the potential for crowd-expanded education... in the form of multiple people adding problems/exercises that others can do/verify/learn from
  1. https://github.com/maxogden/art-of-node
  2. https://github.com/rvagg/learnyounode
  3. https://github.com/rvagg/workshopper

They also make use of some new widgets I have not seen before.
Talking about widgets, I also liked the license widget used by substack in stream-handbook.



My first HN post - Sunday Assembly

I did it rather impulsively, with no preparation. It has been falling like a rock since I posted it :-(


I really liked the idea of the "Sunday Assembly" when I heard about it on NPR on the way home yesterday. I liked their motto too: "You can talk about Good without talking about God".
They seem to be running a crowdfunding campaign, which has not been very successful: 33K british pounds raised out of 500K (it ends tomorrow..)
My hope was that a bit of HN publicity could go a long way in helping them get some funding...




Friday, November 8, 2013

Alter egos

I was supposed to be doing actual work... instead I went on HN... so there I am after a while, daydreaming (since I don't have anyone to talk my ideas through with (I am talking about you, gl)).
So the post that piqued my interest was the post about working from a cruise-ship http://tynan.com/cruisework .

Now. 
The idea of having some way to have all your life-logistics handled efficiently by the "system" so that you don't have to worry about them... is well discussed in prior chats.

The idea of internet isolation (in the form of actual wifi isolation) may be interesting to some, but for me it's not ideal; nodejs-style programming relies too much on continuously browsing, working on top of the work of others, etc.

The idea of communication isolation is obvious. However, getting on a cruise ship seems a rather inefficient way to achieve it.
So here is an alternative way:
I always felt that it is a worthwhile thing to maintain an alter ego: one or more identities that are completely disconnected from your normal identity (pushing this to the limit also provides you with a means of defending against enemy-of-the-state situations). What is interesting, though, is to think of one of these identities as your vacation identity. Imagine that you go to your email, skype etc. and change status signatures and vacation messages... saying that you are on vacation and that any emails/pings will not be responded to for a while.
And then you close the chrome that is associated with your normal identity. If you are like me, you already have a chrome profile for the "employer", with all the employer-related logins, google apps etc., and a different one for your personal gmail etc. So what I am suggesting is that you have a third one: your vacation self.
So you close the chrome profiles and you don't open them for a weekend, or a week. You do the same with your skype, your phone etc. You open your vacation profile and switch all your devices and browsers to it.

You are switching to a new connected self who has no ties with your normal self. Even your good friends don't know about it; you probably have different followers/contacts, if any at all... vacationing gypsies rarely have long-term connections. As soon as you bring all this up there is significant friction (given that your android chrome supports only one profile...) to go back to your normal self for a quick peek. You can still do it... but it is inconvenient. (Note that your vacationing self has a different google voice mail... and if you are like me, your android nexus phone now gets associated with another phone number...)

So that is it. If you feel like going on a vacation, switch to your alter ego and the digital world around you changes. You are someone else, with different playlists, a different Stack Overflow profile, following different blogs and news... If you want the "watching through the glass" feeling of visiting another country without really visiting it (that's my view of cruise-ship tourism)... hmmm, that's a different post... I need to be going.

Friday, November 1, 2013

Airbnb and regulatory environment

Just read this article at HN http://needwant.com/p/buying-apartment-airbnb/ 
I've had the discussion about this "business model" (with similar ideas for finding high-demand / low-supply areas) with at least 3 separate sets of friends, debating the profitability/scalability of the model.

One of the key questions that arises is the long-term regulatory environment: the whole hotel and entertainment industry pays a great deal in use taxes and submits to strict regulation, compared to an apartment rented this way. The odds are that the hotel industry will focus first on homes/apartments whose exclusive purpose is the daily rent (as opposed to a resident who occasionally rents their place when they are away...). This long-term outlook is relevant when you are looking into a 4yr plan (more than that if you are to take into account the cost of capital/interest rate, taxes etc.).
... doing a bit more research: Las Vegas city tax on hotel rooms is 12%...
I am curious how much the regulatory overhead is: making sure that your processes pass regular health inspections, that the people you employ are working legally, pay taxes etc.
Quite probably it may add up to another 10%, which still doesn't kill the model.

However, if I were in the hotel business I would probably push primarily for the enforcement of rules that would make it impossible to retrofit a typical residence into a "hotel". These are typically the regulations around fire and emergency safety (e.g. fire doors across the building, alternative exits, exit lights throughout the building that work when the electricity is out, building plans and exit maps throughout the building so that the inexperienced resident can find their way out, etc. etc.).

I wonder when this will happen: either when the first bad airbnb accident happens or when the hotel industry reads the HN post :-) (I am sure that the post will produce more copy-cats.)

Sunday, October 13, 2013

Inbreeding and evolution

Following a thread with my kids about the fragility of the human body,
- I was telling them: "imagine having a pet so fragile that, as it walks, it may slip, fall, break its head and die. That's how humans are. Walking accidents: top heavy, too much guts and brains, too little muscle mass and bone mass to keep things together. Imagine a biped, like a chicken, falling as it walks and injuring itself."
To accentuate the difference I pointed to felines like the cheetah which, while running at 70mph, lunge at their prey, bringing it down to the ground (something equivalent to jumping off a car at highway speeds), and do that multiple times daily.

My smart-alecky kids, instead of being impressed by my points, jumped to tell me I am wrong and that cheetahs are not that fast (today's kids have grown up with Animal Planet and know pretty well the largest, fastest, deadliest, smartest of any sort of thing). Defending my speed claims I arrived at the cheetah wikipedia page (I was off by a bit: cheetahs go up to 100-110 km per hour, and they are an extreme case) where I was attracted to the following comment (and abandoned the not-as-satisfying discussion with my kids):
The cheetah has unusually low genetic variability. This is accompanied by a very low sperm count, motility, and deformed flagella.
That seemed interesting. Reading further the article went on to explain that this was caused due to a recent genetic bottleneck.
It is thought that the species went through a prolonged period of inbreeding following a genetic bottleneck during the last ice age.
What changed my view from interest to annoyance, though, was this comment:
It has been suggested that the low genetic diversity of cheetahs is a cause of poor sperm, birth defects, cramped teeth, curled tails, and bent limbs. Some biologists even believe that they are too inbred to flourish as a species.[45] Note, however, that they lost most of their genetic diversity thousands of years ago (see the beginning of this article), and yet seem to have only been in decline in the last century or so, suggesting factors other than genetics are mainly responsible.
See, my simple way to explain most evolution effects to my kids (why I have black eyes, why my hair is thick and black, why skin is this or that, why swedish people are blond but Eskimos are dark haired, why our nose is less pointy than the chimps' and the chimps' nose is less pointy than other animals' etc. etc.) involves a continuing combination of using
 - fatal disadvantages
 - strong advantages
to explain how some particular member (or group) of a species was able to reach a certain level of environmental dominance/monopoly, which often implies that this member becomes a new Abraham-style patriarch/matriarch and explodes its genetic tree, with most of the descendants carrying its unique genetic characteristics of nose, hair etc.

It's a relatively simplistic way to explain things, given the more incremental nature of genetic change... still I think this model of explaining evolution is a very good one:
Evolution happens fastest when a relatively small sample of a species finds itself owning a slice of the environment/geography, either because
 - they were able to move to a new place due to their small incremental advantage, or
 - they were the only ones to survive in the new place due to their incremental advantage
 (the new place may be the old place plus some new environmental change: drought, flood, new disease, newfound prey, newfound enemies, newfound resources)
and then genetically explodes to fill the "new place" until it reaches some sort of equilibrium.
The smaller the group that survives, the bigger the genetic bottleneck (and the inbreeding that follows), and the bigger the change that happens as a result.

Of course human morals see inbreeding as synonymous with incest and thus wrong/bad.
Somehow that same morality made it all the way into scientific analysis and became a meaningful explanation:
 - this species is too inbred, that's why it won't survive
 - that species' parents don't love/care for their offspring, that's why it won't survive either
 - that species is cannibalistic, it kills and eats its own... why should we even put it on the endangered list if it does that to its own.

I get annoyed just writing it.


Wednesday, September 4, 2013

Maps and visualizations

I found this at HN a couple weeks ago



Apparently most of these come from a reddit http://www.reddit.com/r/MapPorn/ . Some of the maps can be good topics for dinner discussions... it's definitely an intriguing list.

On the visualization topic, I really want to find a reason to play with D3.js ....

Sunday, August 25, 2013

Marketing your node module - darker patterns

I am capturing here a few ideas that I've had recently on the same topic as the last post.

I realize that the ideas become more and more extreme/desperate/dark; still, I felt there is nothing wrong in capturing them.

At a high level, the idea here is that you want to increase the "page rank" of your module.
Soon that pagerank will be a combination of:
- the recent/all-time download counts as published by npm
- the page rank of the author (some aggregate of the pagerank of all the modules authored by the person)
- the github stars/forks accumulated by that module's repo
- the aggregate follows/stars etc. accumulated by that repo's owner

As a result, improving the page rank of the module implies that you will somehow get these numbers up, which is not obvious when nobody yet knows your module.
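To make "these numbers" concrete, here is a toy scoring function; the weights and the log-dampening are entirely my invention, just to illustrate how such a combined rank might behave:

```javascript
// Hypothetical module rank; weights and log-dampening invented for illustration.
function moduleRank(m) {
  return 0.4 * Math.log1p(m.monthlyDownloads) // npm download counts
       + 0.3 * Math.log1p(m.stars + m.forks)  // github stars/forks of the repo
       + 0.3 * m.authorRank;                  // aggregate rank of the author
}

var unknown = moduleRank({ monthlyDownloads: 10, stars: 1, forks: 0, authorRank: 0 });
var popular = moduleRank({ monthlyDownloads: 50000, stars: 900, forks: 120, authorRank: 5 });
console.log(unknown < popular); // prints "true": the popular module outranks the unknown one
```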

So here are a few things that you can do to speed up the process
Increasing Download counts
- Make sure that *you* use the new modules. You or your friends may have some modules that already have daily downloads. You should see if it is possible to refactor any of your existing modules (particularly ones with high download counts) to make use of your new module. If you achieve that, any download of the former will result in downloads of the latter! This also means that as soon as you have one success you can easily build on it...
- Look at all the competing/alternative modules and find which modules/apps are using them. If you can make a case that your module can do a better job, then go ahead: file an issue / pull request doing the change (replacing one module with the other) and explaining that by doing so you are improving the functionality etc. etc. The odds are that, assuming your module is actually better, the user of the alternative doesn't care whose module to use; he just cares about improving his own functionality with minimal personal effort. Again, here you should focus on cases where the target module has significant traffic.
- Your employer may be a great resource. Most employers don't publish their code. However, they have teams of developers continuously committing => firing continuous-integration systems which continuously pull dependencies and rebuild stuff. I.e. even private repositories contribute to "download counts", and you definitely know at least one big such example: the repos of your employer. Finding how/where your open-source module can be used in your employer's repo is a very easy way to build up download count.

Increasing your module stars.
- Get people with lots of followers to star your module. The idea here is that if someone with lots of followers stars your repo, it goes into the newsfeed of all their followers. I typically find out about interesting new modules this way, by looking at what other people star... Of course the next question is: how do you get these guys to notice you? The answer: the hard way. You need to keep a close eye on their interests, where they put their time/commits, and knowing that may give you an idea about how you can help them. It might take more than one indirection: you see something that they do, you write some code / a PR for them, so that they know you and you build up some credit; then, when the time comes that they use something you have a better version of... you are ready to offer your alternative. And if you persuade them to use yours in their current project, they might be ok with a PR replacing the module in all the prior modules they might have written that depend on it.
- If you manage, with all the above efforts, to get someone to visit your module, your primary goal is not the star but extending his page-count visit. Essentially, think that you just bought an expensively acquired visitor and you want to get the most out of him. They came following a link to a specific module, but they may actually look around. You need to be prepared for that:
  + Your public repo should have only your jewels; it should not have every single repo you have ever written. This is very important, both because if you get followers you want them to check out whatever you are creating, and because a user with lots of crappy repos is just like a blogger with lots of crappy blog posts: nobody has the patience to go through the crap to find the jewel. So what you need to do is create a separate github account and move there anything that is not yet publish-grade. This will also help you with the problem that the very first github push of a repo will invite your followers to visit it; they had better see it in its full glory, with the polished readme etc...
  + Your readme should have something provocative/novel in how it presents itself: "badges" (like travis) that others don't use, interesting clip art, an animated gif that shows a terminal running the module, etc.
  + Your readme should, if possible, point to other modules of yours.
  + You should make sure that you have links to your blog/homepage etc. in your github account... again making sure that you appear as a guy who talks rarely but is worth listening to when he does. Exactly the opposite of me in this blog.....
  + Add a gitcomment thingy (what a gitcomment is will be captured in a future post) at the bottom of your project's readme, seeded with some comments from your friends.



Friday, August 9, 2013

Marketing your node module

Continuing the thoughts in the last post, I am finding out a bit more about the steps that you need to take to market your module.

I just tried to find a module that provides google spreadsheet editing capability.
When I type google spreadsheet in the npm search,
npm finds one module, google-spreadsheet (right in the first rows), but the probably better module (comparable downloads, comparable github stars, includes the exact string "google spreadsheet" in the description, includes google and spreadsheet in the keywords) named google-spreadsheets is on the 3rd page, after several matches completely unrelated to "spreadsheet" (somehow it seems that npm shows in the search even modules that don't match all my keywords).
Now this says something about the poor state of affairs in node module search-land.
Looking at stackoverflow, in case I am the last person not using something cooler and better to find node modules... I found this post, which led me to two more useful places (and a couple more broken links). A github wiki page that includes lots of modules - https://github.com/joyent/node/wiki/modules
(unfortunately I didn't find any of the "google spreadsheet" modules I was looking for there, which shows that, while long, this list can have significant gaps...), plus something that came pretty close to what I was looking for: nipster, a search tool that combines the github stars with a slightly better (still lousy, if you read the open issues) search.

Anyway, after logging "NIDEA - better npm module search" in my todo, I concluded that until then one should definitely include the following in the "marketing steps":
 - add the module to the joyent module list page
 - make searches in npm to see if your module shows up; if not, tinker with the description, keywords etc. until it does
 - look for your competition, i.e. other modules that show up in the search, and include in the description something that would make your module more attractive than them; it is the only line from your module that shows in the search listing
 - look at what keywords your "competitor" modules are using and add them too; npm doesn't use any advanced matching that would decrease the weight of the keywords if there are many, so the more the merrier
 - confirm that your module shows up correctly in nipster search
 - plead with your gh friends to star it
 - make sure that you push to github, with travis-ci enabled, any module/app that uses your module; for a while you will be the only user of your module, and travis npm installs do count toward the download counts that npm shows, which are the only signs of life when someone searches for your module.



Tuesday, August 6, 2013

Writing reusable modules vs deployable services

We were discussing this with gl and we realized that the differences (as in the past...) between writing code that becomes an open-source node module and writing code that is part of a deployable (web service/web app or even executable) are even fewer.

In reality it seems to me that you always need to think that you "deploy" your code:
 0. npm version major to 1.0 from 0.xx
 1. fix up the readme
 2. add keywords to package.json to allow searches to find you
 3. possibly add command line executables and a man page to your npm package
 4. possibly add a drawing the way substack does
 5. post the link/readme of the module to your blog/twitter
 6. go through the places where you searched for a module (and, when you didn't find one, ended up building it) and let them know you did: github issue comments, stack overflow, the node mailing list
 7. (there are probably sites that have some "new/cool modules" section… where it may be worth submitting your module…)
--
Of course when you are substack… all you need to do is add the module as a public repo and your followers all find it… but even then he still does 1..5.
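Steps 2 and 3 mostly boil down to a few package.json fields; a minimal sketch (the name and values here are made up for illustration):

```json
{
  "name": "my-slugifier",
  "version": "1.0.0",
  "description": "turn urls into mnemonic filenames - fast, zero dependencies",
  "keywords": ["slug", "slugify", "url", "filename", "mnemonic"],
  "main": "index.js",
  "bin": { "my-slugifier": "./bin/cli.js" },
  "man": "./man/my-slugifier.1"
}
```

The "bin" and "man" entries are what make a module usable as an npm -g installed command.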

It is also interesting that you rarely have a module that doesn't have a natural deploy:
 - gl created a random-password service. It's a deployable, but ideally he should have captured the logic in a module; excluding the mouse-generated randomness, the module's function would probably have an optional "randomness" parameter.
 - I created a slugify-url module... given that its use is to create mnemonic filenames out of urls... it makes sense as an npm -g installed executable with a man page etc.


Wednesday, July 17, 2013

Toilets and innovation

I have always been surprised by how hard it is for knowledge and innovation to flow across cultures, fields, and countries.

My wife just called me, panicky, to tell me that the washer water is coming out and is all over the wooden floors. The most silly little thing and you have permanent damage to the floors. If I were in an apartment or on a second floor, the damage would extend to the ceiling of the guy below me. This disaster would never happen in Europe. There, every toilet has in the middle what the showers have here: a small covered drain, and the toilet floor is always built with a tilt towards that point. On top of avoiding flooding the apartment, it makes washing toilets a much less ugly chore.

(I will avoid discussing the details of the opposite problem: European toilet bowls haven't discovered the simple idea of being filled up with water (it's more than that; they use a different flushing technique). Water is only at the very bottom, which makes flushing/self-cleaning a disaster compared to American toilet bowls...)


Nodejs development trick

I am developing an app. I have pushed out most of the functionality of the app into modules. The modules are relatively application-specific, but they would allow someone to create a similar/slightly different app without having to deal with/change much of the module-encapsulated code.
Anyway, this has resulted in a 2-level-deep dependence:
 app depends on module1
 module1 depends on module2.

To keep things developed in parallel, the modules are already in repos... and published as they advance.
When I need to do some concurrent development across modules/apps, how do I do it?

Here is my trick
I have cloned the repos of all three: app, module1, module2.
I have run npm install in the app where I prefer to work.
I run
> nodemon index.js
in the app.
I remove the (sole code file) index.js in node_modules and replace it with a soft link to the repo:
> cd node_modules/module1
> rm index.js
> ln -s ../../../module1/index.js .
> cd ../..
I do the same for the deeper module as well:
> cd node_modules/module1/node_modules/module2
> rm index.js
> ln -s ../../../../../module2/index.js .

At this point I can edit the modules' index.js inside their own repos, as I would have liked to, and the changes are automatically reflected in my app repo. Nodemon is smart; it does a recursive descent anyway and follows the symlinks... saving my index.js in the module repo results in nodemon restarting my app.

The only problem is that I need to remember not to do any destructive operations on the node_modules tree.


Living on github

1. I am happy/proud that my github contributions are getting denser.
Still, I find it hard to keep a 7-day streak... I think it has to do with the length of my code iterations... (and yes, one day I want my contributions chart to look like substack's :-).

2. Looking at my repositories, I realize that I dislike the fact that the home page becomes this endless list of dot files. At a minimum, every repo of mine has 4 dot files:


The reason that I need to commit .gitignore, .jshint and .cover is that I collaborate with others. They fork the repo, and .jshint is the coding rules that I want to enforce, .coverignore is the coverage I need them to have, etc. So this is a feature request for github: please collapse the dot files into a "hidden files" line that one can expand at whim. This would make the list more similar to what you would get with an ls in unix.




Monday, June 24, 2013

Language trends

Continuing from the last related post about this,
I read a Dr. Dobb's article (kind of happy that this publication still exists... it was the publication my friends and I respected the most as we were getting into the PC world back in college (83-88)).
Dr Dobbs post :

Language surveys mentioned :
 ohloh
 tiobe

I would add to these the
 hacker news poll
 github stats  (can't find historical version of this)


Finally, regarding google trends as a metric for judging the reach of a language:
google trends is good for assessing whether a language is picking up or in decline.
It is harder to use as a comparative metric... many language names are less unique as search terms than others, and adding keywords (e.g. programming) next to them makes the comparison less apples-to-apples (nodejs programming vs java programming vs ...).

I was discussing this again with sk and I wanted to make the argument that there is a trend towards a single language across all layers (js front end, nodejs back end, mongodb db)...
anyway, using the github.com/languages page from archive.org,

here are the data. (The data show that they can't be relied upon too much: viml made it into the top 10 a couple years ago... and the swings are too massive => they must be looking at file commits / short windows instead of something more reliable...)

2009
Language Name Percentage
  Ruby 30%
  JavaScript 18%
  Python 9%
  Shell 7%
  Perl 6%
  C 6%
  PHP 5%
  Java 4%
  C++ 4%
  Objective-C 2%

2010
Ruby  19%
JavaScript  16%
Perl  12%
Python  9%
Shell 7%
PHP 7%
C 6%
Java  5%
C++ 4%
Objective-C 2%

2011
JavaScript  19%
Ruby  17%
Python  9%
C 8%
PHP 7%
Shell 7%
Perl  7%
Java  7%
C++ 4%
VimL  2%

2013

JavaScript  21%
Ruby  12%
Java  8%
Shell 8%
Python  8%
PHP  7%
C 6%
C++ 5%
Perl  4%
CoffeeScript  3%

Thursday, June 20, 2013

Working notes - sorter.herokuapp.com

So where am I?

I was implementing a travis-ci-like badge providing a lines-of-code report.

Purpose
  - get some exposure on building travis-ci like services
  - see how viral sth like this can be
  - leverage a rather amazing program like perl cloc

In doing so I decided to structure the service on top of a web-service version of cloc (what modules, if they existed, would make the implementation of whatever you want to develop trivial?).
In doing so I realized/decided that implementing the cloc web service (i.e. exposing the plain cloc functionality as a web service) would need some more basic glue that could be reused by anyone who wants to expose a typical stdin/stdout-processing unix utility as a functional web service. In doing so I realized that, to get the full power of the latter, I would need some way to describe nodejs streams so that they do their streaming server-side…. (all that in http://www.otdump.com/2013/06/unix-utilities-as-web-services.html)


Anyway, this was yesterday. I asked my friend gl for an opinion and he saw me falling down my typical infinite recursive loop... and he suggested I cut it and return to the original cloc web service
(which was itself already far deeper than where I was supposed to be).
He probably didn't hear that I am on vacation, which in this context means that I am Alice and I have no problem chasing rabbits wherever they might take me...

So, 1 day later, I have my first unix utility as a web service:
 http://sorter.herokuapp.com

You can use it rather simply:
> cat /etc/passwd|curl --data-binary @- 'http://sorter.herokuapp.com/?args=-t&args=%3A&args=-k&args=3&args=-n'
....
_krb_changepw:*:932:-2:Open Directory Kerberos Change Password Service:/var/empty:/usr/bin/false
_krb_kerberos:*:933:-2:Open Directory Kerberos:/var/empty:/usr/bin/false
_krb_anonymous:*:934:-2:Open Directory Kerberos Anonymous:/var/empty:/usr/bin/false

(%3A is the url-encoded :)
The code is surprisingly simple:
https://github.com/ogt/sorter/blob/master/index.js


The interesting things in the code are:
Apparently req and res are streams, readable and writable respectively.
Any process you get a handle on (with spawn, exec etc., or just via the global process object) gives you access to file descriptors like stdin and stdout that are themselves streams.
A handy little function "child" creates a duplex stream out of anything that has a stdin and a stdout.
You need duplex streams to do things like
a.pipe(b).pipe(c).pipe(d):
b and c need to both send and receive; a just sends down the pipe, and d only receives from its left...
So, req.pipe(proc).pipe(res) requires very little documentation to explain what it does!

The testing client would do sth like:

var req = request(options, function(res) {
  res.pipe(process.stdout);
});
process.stdin.pipe(req);


We could create a duplex stream out of req/res (there is a utility function called duplexer or sth in event-stream) so that we could make the 3 lines above look more like the server... but that would come at a cost: we only have a handle to both req and res after the callback is called... which means that we would need to do the "piping" inside the callback. It's not a big deal, but the code above starts pushing data to the server earlier... I think...

Anyway, the fun thing is that this was sooo easy to do. On top of that, heroku seems to have the core unix utilities in its minimal slugs, so this works on heroku.
This means that I can create, with practically no effort, one app (seder, gziper, gunziper, awker, sorter, ...) for anything I like... no effort, and I get functionality (and service/cpu time) for free.
I like the xx-er terminology; it's funky enough that probably nobody has taken these names. I will probably push the single 10-line function of the server into a module... and then the project would be practically just the scaffolding of every nodejs heroku app. Nothing more. (By creating a separate app per utility, I get one server per utility: more cpu... Besides, most unix utilities (e.g. ls) don't make sense this way. Maybe I will create an automated way to generate these projects/repositories/deploys... how about docs... all I need is a man-page-to-markdown converter, and that together with a common header would suffice...)



Tuesday, June 18, 2013

Unix utilities as web services

I wanted to convert a very useful unix utility (cloc) into a web service.
I thought that someone might have already created a little glue that can do that.
Especially given nodejs and the recent love that streams are getting there (https://github.com/substack/stream-handbook), together with the re-focus back onto the unix model of small utility programs fit together with stdins/stdouts etc., someone must have done it.

Anyway, so far I can't find anyone.
The only thing I found is someone's 5-year-old blog post and broken references (github and gists weren't that popular then... so no trace of whatever this guy did...)
http://swik.net/Unix/BASH+Cures+Cancer+Blog/Exposing+command+line+programs+as+web+services/b3x8u

This guy seems to have the right idea:

$ ./to_web.py -p8008 sort &
Thu Mar 27 13:45:54 2008 sort server started - 8008
$ ./to_web.py -p8009 gzip &
Thu Mar 27 13:46:29 2008 gzip server started - 8009
Use the services:
$ for i in {1..10}; do echo ${RANDOM:0:2}; done | \
> curl --data-binary @- "http://swat:8008/sort+-nr" | \
> curl --data-binary @- "http://swat:8009/gzip" | \
> gunzip
97
37
23
23
21
18
11
11
10
10
-
Note that to_web.py seems to be doing just a popen of the command, passing the stdin/stdout of the server to the command. Even the URL query string is passed intact as  ... no need to think much about mapping cli parameters to url params.
The whole thing looks like substack's way of doing things.
(Without any extras that substack may have already added, I would use this
http://nodejs.org/docs/v0.4.6/api/child_processes.html as a starting point to write the ~10 line to_web.py
... possibly by using .pipe() instead of tying outs and ins manually https://github.com/substack/stream-handbook)
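My reading of the URL mapping in that old post ("/sort+-nr" runs sort -nr) - a guess at the scheme, not the actual to_web.py code:

```javascript
// '+' separates the command from its arguments; nothing fancier
// than that is needed to map a URL path to an argv.
function parseCmd(urlPath) {
  var parts = decodeURIComponent(urlPath.replace(/^\//, '')).split('+');
  return { cmd: parts[0], args: parts.slice(1) };
}

// parseCmd('/sort+-nr') -> { cmd: 'sort', args: ['-nr'] }
```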

Any command that is side-effect free and uses just cpu/mem resources can be made available to third parties... possibly with a bit lower than normal ulimit to prevent abuse... just like travis-ci....
I wonder how many of these commands are part of what heroku exposes... if it does, it should be easy to create a free web-service version of all unix utilities. All that would be needed would be some consistent naming for the domains... hm, and some client-side pipe operation (would we need any syntactic sugar??) across the web services.. If I recall correctly streams and related libs can be browserified just fine....
On the other hand, ideally I should be able to hook web services together directly, as opposed to using the browser as the intermediate point.
If I want, for example, to take a gzipped csv file, unzip it, sort it and re-zip it... assuming that I have
gzip.herokuapp.com, gunzip.herokuapp.com and sort.herokuapp.com, how would I do that, keeping the stream of bytes going from one web service to the next....???

One ugly way would be to call gunzip with a special option:
"send the response to sort.herokuapp.com instead of me".
But that wouldn't work, because then sort would send its output to me instead of to gzip.herokuapp.com.
I guess we would have to call gunzip with a special option:
"send the response to sort.herokuapp.com instead of me, but when doing so, pass it the same special option telling it to
send its response to gzip.herokuapp.com instead of me"...

So then the trick is to create a library that does the "syntactic sugar" on top of the ugly thing above, so that all I have to say is something very similar to
"gunzip.herokuapp.com PIPE sort.herokuapp.com PIPE gzip.herokuapp.com"



Sunday, June 16, 2013

Exporting from evernote

All my evernote notes that are a "startup idea" have a unique title "NIDEA - xxx" (tags are nice - but for some types of notes I add the unique keyword as part of the title... e.g.
HOWTO - xxx
NIDEA - xxx
TODOS - xxxx
)
Anyway, I wanted to capture the titles of all my nideas into a file to work with them... how to do that with evernote?
Ok, simple:
1. Do the evernote search that finds them, in my case 'nidea'
2. Select All, or multi-select individual notes, whichever you want..
3. File -> Export notes -> exports all the notes to some folder, by default as an html file per note
4. cd to that folder, run ls -1, capture that output (the filenames are the titles) and paste it into your file..

Travelling...

So, one more indication that blogger sucks http://realgl.blogspot.gr/2013/05/omg.html ..
I am travelling, and when I reopen my blogger to edit an older post... blogger now treats me as if I am a different person...
It uses the origin of my current IP as a more important indication of my language than the last several years of IP/country history, my own preferences and whatever else..
I can see how this happens in apps that have some visitor-side components and some signed-in-user components... but this one is clearly a signed-in user display...
I think by now it is clear that blogger has been left practically the way it was when Google bought it 10 years ago (http://en.wikipedia.org/wiki/Blogger_(service) ) with just a bug-maintenance crew dedicated to it.
I guess I should be the one to blame.. I am typically more appreciative of services left (alone) running for ever.. (compared to services that keep on changing their interfaces unnecessarily, making it increasingly harder to find whatever functionality made you come to the product in the first place...)..
So I am to blame, because I keep on doing my blog posts at blogger - instead of using something better...


Technology adoption delays

The first time I used github's "paste an image in a text area by just doing Cmd-V" feature I was hooked.
I mean, what was obvious for ages in every word processor took a long time to arrive in a web browser context. Even gmail took its time - for a while, pasting a picture into the message box either didn't work or worked differently than an attachment... it was a bit confusing etc..
But with github's editor everything made sense. After that I started expecting to find it in all other places,
be it in evernote, in my blogger edit box, in an oD job post, everywhere.
And every time I was annoyed that whoever was the author of the sw I was using didn't do what github did:
[! uploading image ]]
.....
[![image](https://f.cloud.github.com/assets/153419/659286/bbb6622a-d67d-11e2-9022-84f81400a0a3.png)
So simple: use normal editor text to provide a progress indicator, upload the picture somewhere, create an unguessable url to it, and then do whatever would have been the natural thing to display in the text a picture that is available online via a url...
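The text-transformation half of that trick can be sketched as two pure functions (the names and marker format here are my own invention, not github's actual code; the upload itself is elided):

```javascript
// Step 1: on paste, drop a plain-text progress marker at the cursor.
function insertPlaceholder(text, cursor, id) {
  var marker = '[! uploading image ' + id + ' ]';
  return text.slice(0, cursor) + marker + text.slice(cursor);
}

// Step 2: when the upload finishes, swap the marker for real markdown
// pointing at the unguessable url.
function resolvePlaceholder(text, id, url) {
  var marker = '[! uploading image ' + id + ' ]';
  return text.replace(marker, '![image](' + url + ')');
}
```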

That approach would definitely work for blogger and oD job posts/messages..
That approach may not have worked for evernote and gmail (the images in those cases should have become message attachments).

Anyway, my key question is why a simple innovation like this
(there is both a product-functionality innovation and an implementation innovation in this case) takes so long to spread:

- The authors of one system aren't exposed to the other system
- The features of one system (e.g. blogger) are decided by a pm - and pms aren't using github. The people that use github (e.g. programmers) aren't in charge of making product decisions in their normal (non-github) life as programmers working for some company.
- People like web designers are trained in a mode of "I see all the services I use; anytime I see an interesting effect, font, ui thingy, I check it out, figure out how to do it, often link it up in my blog - to make sure that I do the same thing soon... somewhere." In that sense, contrary to "programmers", "designers" are both users and decision makers for whatever they do, and that allows "design" innovation to spread more rapidly than developer innovation.
- While most things are reverse-engineerable in web design... that's not the case with anything that requires server-side functionality. This is probably the case with the "paste an image" feature - is the code for doing that available publicly? Probably not (checking it out.. looking for something similar to http://techcrunch.com/2013/01/02/github-replaces-copy-and-paste-with-zeroclipboard/)


(After writing all this I went to catch up on the blog of my friend gl and read this post (from a month ago..)
http://realgl.blogspot.gr/2013/05/omg.html . The funny thing is that I think I am the "friend" mentioned there that said "oh yeahh... its been around for years" (I actually think that github's paste is fresh.. gmail's is older...))

Vacations



I understand why vacations feel good on the first day:
because I can make plans about all the things that I can achieve during my vacation - I don't yet have a single indication of the fallibility of my plans.. I am on vacation, I am a new person, in a new place, I can do/be anything I want. Of course with every day that passes, the indications that your plans will fail like every time before start mounting...

Monday, May 27, 2013

More random findings - how to module




Reading http://howtonode.org/how-to-module I remembered reading substack's own guide https://gist.github.com/substack/5075355

Went back to http://nodejs.org/community/ trying to find the irc channel (it became the first step of substack's advice when he copied his gist https://gist.github.com/substack/5075355 to a blog post http://substack.net/how_I_write_modules ), but then got side-tracked visiting the various other nodejs community pages http://nodejs.org/community/ like http://planetnodejs.com/ , where I found a new interesting futurealoof article, "no builds" http://www.futurealoof.com/posts/no-builds.html , which I thought of sharing with oD's own IT. Found http://nodeup.com/ , listened to the podcast http://nodeup.com/forty ,
and one of the guys there said "growing up is the process of becoming a hypocrite you can live with".
I thought that was a pretty good statement, meaningful both for the particular context and as a more general statement.

.... next morning
a few tweets from substack https://twitter.com/substack/status/339098167935119360:
when you start from scratch every time, you continuously re-evaluate your assumptions and you feel the full burden of boilerplate
and...
if you use a scaffold generator to write your boilerplate instead of typing it out by hand, you're less likely to get rid of the boilerplate

Looking for something like sonar for nodejs... http://nodejsrocks.blogspot.com/2012/06/never-mind-i-will-find-something-like.html I found yo.. http://yeoman.io/ . It seems that the yeoman/grunt/bower world is not consistent with the no-scaffolding world that substack, TJ and others are dreaming of.

Random findings : Npm, dirty, grunt, bower, components

Discovered grunt … a very contributor-happy system to address the too-large, too-much-replicated-across-projects [MRJC]akefiles that people use.
I think it's awesome … and the reason (that it's awesome) is that the author managed to find a way to make it easy for people to contribute… where before they were not... via reusable grunt tasks/plugins.
- http://benalman.com/news/2012/08/why-grunt/
- http://gruntjs.com/


My new favorite guy https://github.com/sindresorhus/todo
(take a look at some of his repos… really interesting …. http://sindresorhus.com/bower-components/ (top right))
One of them is https://github.com/sindresorhus/GitHub-Notifier, a browser plugin that tells me if I have any unread notifs in github. Just added it to my chrome... It even makes a sad face when it cannot connect to github.

I just found dirty  https://github.com/felixge/node-dirty
http://www.slideshare.net/the_undefined/dir-5299121
it has some aspects of gl's prototypeDB.

Nice... open source bootstrap themes :
http://bootswatch.com/ (I like the united/ubuntu one)

Interesting.. https://github.com/bower/bower/issues/39
TJ asking the bower community (which has become a very powerful one) to change their component.json into bower.json so that it doesn't conflict with his github-based (why have another registry if you rely on github…) component manager that will win them all: github.com/component/component .
The nice people at bower agree to change their name, and TJ suggests that they use his bot for automated PRs to all the registered bower modules.
https://github.com/component/bot

(watching TJ http://vimeo.com/48054442 explaining component, which apparently is an alternative to browserify, which is also an alternative to bower…)

All this space seems still rather in flux, with no clear winner… and even though they are packaging js, the corresponding css still remains not as well packaged, due to its more global-by-nature status.. http://www.forbeslindesay.co.uk/post/44144487088/browserify-vs-component

I found interesting the comment by substack :
https://twitter.com/substack/status/337310410657128448
ie don't bring all kinds of tools when a single tool will do it.

Looking at my boxchareditor I kind of feel that way.. I had to use way too many tools to get a toy app up (look in https://github.com/ogt/boxchareditor/blob/gh-pages/Makefile).
Anyway, in the thread that ensued, TJ says he doesn't really like grunt (as a more modern make replacement) - and he argues against using npm. He is happy with make.
Then substack points out how he was able to use npm for everything in ploy… https://github.com/substack/ploy#scripts (I re-read about ploy and it is quite an awesome tool, by the way)

Found this article http://www.devthought.com/2012/02/17/npm-tricks/ which discusses a bunch of interesting tricks about npm

Took a further look at the sparse blogging at http://www.devthought.com/ .
The guy founded LearnBoost … Wow, I have been looking for that... http://shelr.tv/about  He seems to have written something around that - an interactive codestre.am - but it is not up now.. https://twitter.com/rauchg/status/332211183539060736
https://github.com/antono/shelr.tv
(it's unbelievable the projects that have come out of learnboost… https://github.com/learnboost
stylus, socket.io, knox, mongoose...)

Started looking at http://howtonode.org/how-to-module (remembering substack's own guide https://gist.github.com/substack/5075355 ) and saw that the author was writing it via wheat https://github.com/creationix/wheat as a community-contributed blog https://github.com/creationix/howtonode.org where anyone that wants to contribute forks, adds an article, submits a PR, and assuming that it passes the quality bar of the owner/maintainer it gets posted…

Followed through to the author
https://github.com/creationix
http://creationix.com/
saw that he is really into online education, creating a programming learning environment/experience for kids => js-git
http://www.kickstarter.com/projects/creationix/js-git ,   https://github.com/creationix/js-git

and realized again some of the hot debate of packaging vs componentization:
"The Problem" https://gist.github.com/creationix/5657945

--
and after looking at js-git I had to look back again at the chromebook (I got one from google 2 yrs ago… and never managed to get it up and running due to some wifi security problems).
(It seems that the msft PR machine has flooded the media with "chromebook is failing/struggling" in April.. which means that at last the Chromebook is gaining real traction http://www.zdnet.com/amazons-top-selling-laptop-doesnt-run-windows-or-mac-os-it-runs-linux-7000009433/ )

Wednesday, May 15, 2013

The great alexander game


We played one of my favorite games with gl yesterday. Think of smartmoney.com/marketmap as the world; we are Alexander the Great and we randomly pick a country (company) to invade.



Greece and its city states is silicon valley and its tech-savvy, innovative, <10-yr-old companies. We exclude these. It's the rest of the world, the old-fashioned world, that is the target. Greg picks a continent, "Consumer Goods", and in it he picks the company that does all the energy drinks.
That's Monster Beverages, the recently renamed 70-80 year old company that had a rebirth after it focused on the energy/sport drink space.
It's a 10B market cap.
So what is the game then? The game is to try to figure out how to "invade that country": using innovations/understanding of disadvantages that are obvious to anyone in silicon valley, try to come up with a hypothetical war plan that can pass the bar of two relatively smart guys talking about it, without having any real clue about the industry at hand. It makes for a nice coffee break discussion..

So here is the war plan for Monster Beverages. (I've never really had any Monster beverage - so I will pretend that "energy drinks" are the same as "sport drinks", and I will be talking here about sport drinks as if Monster is the company that owns the sport drink space.)

So, sport drinks like gatorade, or vitamin water, or things like that, are extremely overpriced watery colored drinks that have lots of electrolytes, depending on the case lots of carbs or not so many, and taste that varies from fruity to weird. It doesn't seem, though, that the taste of these drinks is a really proprietary recipe. It seems that it is fairly simple to create a replica with all the properties that costs not $2 per bottle but 10-25 cents per bottle (like the normal sodas).

So why do people spend $2?
My simple answer is that this space captured the price inelasticity of the relatively affluent sportsy group, a group that grew substantially in market share over the last 10-20 years. Together with well-targeted marketing (bottle shapes, colors), lots of TV advertising, and presence/promotions in all health clubs etc., they became a well known brand in a very small amount of time (compared to century-old soda brands).
Its high price finances
 - an expensive advertising model
 - incentives to supermarkets for premium shelf space (safeway makes much more per sq inch of gatorade than coke)
 - a highly profitable corporation.

 That's all good..
 Now seeing it from the tech side:
 - Someone is managing to sell colored/artificially flavored 20c/bottle water for $2 => that's wrong/inefficient
 - They do that by doing old-fashioned types of advertising that we know are losing mind share (TV, magazines, stadium banners)
 - They rely on old-fashioned distribution models (retailers) that are seeing their own market erode for anything that is "commoditized"

 The above more or less gives us the obvious directions of attack:
 - Provide a lower cost alternative
 - that doesn't pay for TV advertising
 - that doesn't rely on / pay high-margin retailers for distribution

 and possibly leverage more novel forms of viral marketing.

 Here is an example:
 (Assume that we have created a replica of the major gatorade-like products and came up with equally attractive bottles/labels/names etc.)
 We have a bottler that fills them up, and we have no distribution - zero awareness.

 We will follow the soda drink model for advertising: free samples. (Soft drinks disperse 1/3 of their volume as "free samples", ie through the fast food industry, where they practically make no money - allowing the fast food industry to make all the money; the fast food industry produces the burger practically at a loss and makes all its margin through the soft drink, by providing an exclusive channel to a single company. Can't find the research paper where I read all that...)

 In our case we will address the rather un-targeted children's sport tournament space as the space where we would sponsor with free samples.
 Typically every family with kids is involved in one or more sports, having weekly practices, matches and monthly or bimonthly tournaments.
 Open tournaments typically have half a dozen small vendors that sell anything from foods to drinks.
 We will need a methodology to identify all tournaments in all sports and contact the organizers with the proposal of sponsoring "our sport drink" free for the tournament participants, in exchange for a free spot plus a few ad banners. Nobody has done sth like that, so the odds are that we should be able to get something.
 More than local tournament awareness: kids love to play games to make an extra $.. they are already socially connected in instagram, kik and all kinds of kid-chat thingys. They have an iphone/android or know someone that does. To get the free drink all they need to do is (sth like: download the app, like the drink, broadcast their choice to their friends/followers... add the fb app... or some combination of the above). The outcome is that they market to friends, show proof, get a free expensive icy cold drink. Primary costs are the people/truck that run the tent.
 The ad banners and the bottle itself show "can only be purchased online". (32 bottles at $1/bottle leaves enough room both for free shipping as well as for this kind of free sampling.)

 Such a program would build awareness within a year.
 It would require investment - but it can grow locally, area by area (bay area first etc), which can make many things easier (including the actual shipping).


That's all

Sunday, May 12, 2013

Retained Earnings Tax

I was at a friend's birthday the other day, and there was someone at the table, a self-made entrepreneur, apparently rather successful, the stereotype of someone who is sure about himself, thinks he knows the answers to any problem that he has thought of, etc. etc. As it turns out he is rather fascinated by the problem of the Retained Earnings tax. Without any further wikipedia search, here is his story:


"I started my company as a C-Corp because I thought that's what real companies are/do. I was always extremely conservative in terms of my tax returns. Every year I get from Amazon all internet sales and report them. The one time that I have been tax-audited they ended up returning me taxes. Still, the company was successful and made good profits, year after year. I paid the equivalent of 48% (Federal + California) tax and left most of those taxed profits in the company. My accountant kept telling me that I should have made the company an S-Corp as opposed to a C-Corp and that I should take the money (as bigger salary) instead of letting it become C-Corp profits. But I was always afraid to do that. Never having taken debt or any investment, I was seeing it as my company's weakness that I have to deal with. Having a big cash hoard is what makes a well funded company able to fund growth spurts and acquisitions, or just ride the hard times when they come, and I wanted my company to be like that.
Until one day my accountant told me that I owe a huge amount of tax, equivalent to 15% of all my accumulated-over-the-years earnings... as "retained earnings tax"."
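The rough arithmetic of that story, with made-up round numbers (and simplified to a single year of earnings rather than the accumulated total):

```javascript
// One year's pre-tax profit, taxed at ~48% combined federal + CA,
// with the leftover retained earnings then hit again at ~15%.
var profit = 100000;               // hypothetical figure
var corpTax = profit * 0.48;       // paid when earned
var retained = profit - corpTax;   // left sitting in the company
var retainedTax = retained * 0.15; // the surprise second bill
// Effective total take: corpTax + retainedTax out of the original profit.
```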

He then went on to explain that this is really just one more example of the fight between the old forces, the aristocracy, and the bourgeois - the people that have assets vs the people that produce income, the old money vs the new money.
In his view, the following changes should happen to the tax law:
S-Corporations are extensions of oneself and are taxed fine the way they are today.
People should be taxed, as today, based on a combination of income tax, use tax (like sales tax) AND asset tax (like the property tax). The asset taxes should be applied to all liquid and illiquid assets one might have, be it land, buildings, shares of C-Corps domestic or foreign, bank accounts, gold, diamonds, or commodities... The tax can be 1.25% annually, like the property tax (in his view we should be taxed more..).
C-Corporations should become pass-through entities - C-Corporations are not people and should not be taxed.
So when a company like Apple becomes huge, its profits aren't taxed, and neither are its cash reserves. However, the shareholders of Apple are taxed, and as their equity becomes bigger their taxes become bigger. Ideally, unlike property taxes, these taxes are progressive like income tax: a retiree owning a few shares will have to deal with a much lower tax rate than a billionaire.
Ah, and one last thing: Warren Buffet is old money. He is arguing for a bigger income tax rate while he is sitting on a 200B asset on which he doesn't pay taxes. Any large corporation today creates a (rather expensive for a small/medium business) legal scheme that allows it to report consolidated multinational profits even though within the US they report no profits, or losses. Profits are reported within sub-entities whose country of incorporation is a tax haven, ie one that causes no significant corporate income tax and has no retained earnings tax. The company cannot bring its earnings back.... but it is immune from the retained earnings taxation (like Apple, who got a loan (using its foreign cash hoard as collateral) to pay dividends without having to bring the cash home, hoping for a tax law change in the future...).

One last thing I didn't discuss with him:

In his model, I can have a large company which I control through a small class of high-voting-power shares. That makes my net worth small (all shares "own" the future cash flows/profits equally), even though it retains my power. Of course, in a correct world, the price for the special shares that the google or facebook founders hold should be much bigger and not the same as the common share... Hm...

Friday, May 3, 2013

Bottom up vs top down

I have been having this argument with my friend gl. I have been telling him that he is not a true believer in the kool-aid we have been drinking lately around small modules, reusable components in nodejs etc.
Yesterday he told me, rather proud, that he actually followed the model and published his first npm module.
I think that the difference of beliefs exists and understanding it is important.
GL is the hugest proponent of incremental design. Being an hci person, he always wants to have a version of the end product, no matter how primitive. I would call this iterative top-down development.
His first version is almost always a simple hello world site, but it is displayed in the deployment context of the app, be it mobile web or desktop. He slowly adds mini-features and so on. At any point he has visibility of how close or how far he is from the end. At any point his co-workers/clients/mgr can take a look at the early prototype, give feedback, or even use the product.

I, on the other side, apply the rhetorical question: what services/modules, if I had them, would make the building of my app trivial (substack's slogan)? I ask this question recursively, trying to force myself to avoid a depth-first-search descent and do more of a BFS; as I go down trying to implement the next module I need, I re-ask the same question..

The result of the two paths (he does better than I do) may reflect more the difference in programming/focusing skills than a true advantage of one or the other methodology. The truth, however, is that gl will be writing little custom plugs that slowly get some more meat and bones.. and then often he has written something that may already exist in some similar form. With his methodology, publishing modules is an after-the-fact exercise: this thing I used, I could make more generic and publish.

The negatives of the methodology I follow are obvious. Starting to build my app bottom-up, from its components, I cannot have a good idea of how much effort/time is still needed. I also find myself too often falling into recursive black holes... as one particular component that I start building becomes so important that it ends up becoming my priority 1 for many, many days..

I am not sure what exactly is the right balance here. How can I reap gl's advantages but still be true to the substack slogan...

Thursday, May 2, 2013

Work notes - new lessons from last HS/GH experiment

The last and greatest HS/GH experiment is still ongoing, but there are already lots of learnings:

The experiment involved OD-sourcing issues 3, 4 and 5 of the boxchareditor repo.
The experiment is the biggest I have run so far in terms of size/budget/expected amount of work.

New elements in the experiment:
1. Test Validation (node-tap based)
2. Coverage Validation (node-cover based)
3. Code practices Validation (jshint based)
4. Task Dependencies

The test is still ongoing - but the first issue has been successful, with 20 hrs turnaround time from post to pull request acceptance.

None of 1, 2, 3 were automated. Acceptance testing was done by me cloning the developer's fork locally and running

> cd /tmp;rm -rf boxchareditor; hub clone GulinSS/boxchareditor; cd boxchareditor;npm install;make

Make would run the tests, the coverage report, and lint, and essentially produce an error if any of the 1, 2, 3 validations were to fail.

The contractor applied to the job within 1hr of the job post and immediately got hired.
It is interesting to point out that even though I invited past successful project hires, the new applicant responded faster. By the time I hired him he had gone to bed. He started working ~8hrs after the job post.
He came back for syncing 15 hrs later - and from that point on we had an occasional back and forth,
every hour or two, initially in the issue comments and then on skype.

Interesting conclusions:
- Even though it is absolutely meaningful and productive to capture in written documentation/instructions every element that may be a recurring question from a contributor, ongoing interaction (eg communication, guidance, iterative review) between the two parties is normal and should be happening as part of a task. This may appear to go against the purer idea of HS - which involved complete operational instructions and no communication besides the acceptance phase - but I think it is a necessary compromise.

The contractor in this case
a) suggested refactoring the code following a significantly better architecture
b) suggested unifying some of the follow-up tasks, given the new refactoring
c) the refactoring later exposed a difference in the actual end-user behavior of the system
All of these made sense; all of these required interaction. The system ended up better as a result of these interactions. Overall I spent approximately 1hr on these interactions - while my estimate is that the developer spent >10hrs developing.
I had to spend another 1hr preparing the issue, of which approx 20 minutes were the mechanical aspects that are being automated by gl.
Still, the leverage I obtained was very impressive.

Issues 3 and 4 were dependent on 2. This implied a few changes in the flow:
 - Jobs for issues 3 and 4 were created as private (awaiting the completion of issue 2)
 - My plan was/is to make the dependent job available (ie invite) to the depended-upon job's contractor, and only if they were to reject it make it public.
 - The language in the template was updated to explain that a successful hire would get first dibs on the dependent jobs as well.

It is interesting to note that the successful developer didn't have time to do features 3 and 4, showing again the importance of global liquidity and the difficulty of relying on a preset pool of folks.