Sunday, January 5, 2014

How often do I google?

I wanted to find out how a typical person's Google searches are distributed over the course of a day.
A friend in the office said that I should look at real data, and the only data I have is my own. I looked at my Chrome web history, but Chrome doesn't show timestamps there...
OK, after a bit of digging I found it.
For a Mac user it's:

ithaca> cp ~/Library/Application\ Support/Google/Chrome/Profile\ 3/History /tmp
ithaca> sqlite3 /tmp/History
sqlite> select avg(cnt) from (
  select ddate,count(*) cnt from (
    select 
      date(last_visit_time/1000000-11644473600,'unixepoch','localtime') as ddate,
      time(last_visit_time/1000000-11644473600,'unixepoch','localtime') as dtime 
    from urls 
    where 
      url like '%google.com/sea%' 
    order by 1 desc 
  ) a 
  group by ddate 
) b;
13.0923076923077

Interesting things there:
  There is a history file for each of your profiles, as you would expect.
  The history file is an SQLite DB file! A very nice surprise.
  The timestamps aren't Unix-epoch based: they count microseconds from January 1, 1601, for some reason, so you need to do the adjustment (SO gave the right offset).
  I still can't believe that browsers treat web history as something disposable. No clue why mine is truncated at 2-3 months ago.
  A tool that syncs the history file daily/weekly into a more persistent DB (from all my profiles) may be worthwhile.
  On average I do 13 searches per day. I was expecting more, given how frequently I use the Chrome omnibar.
  Reference to the previous post: Google is making at least $30/mo from me from my search activity.
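
For anyone who wants the timestamp adjustment outside of sqlite, here is a minimal Python sketch of the same arithmetic. It assumes, as the query above does, that last_visit_time is in microseconds and that 11644473600 is the number of seconds between the history file's epoch (January 1, 1601) and the Unix epoch. The function names are my own, not anything Chrome provides.

```python
from datetime import datetime, timedelta, timezone

# Seconds between 1601-01-01 and 1970-01-01 (the offset used in the query above).
EPOCH_OFFSET_SECONDS = 11644473600


def chrome_time_to_unix(chrome_us: int) -> int:
    """Convert a history timestamp (microseconds since 1601) to Unix seconds."""
    return chrome_us // 1_000_000 - EPOCH_OFFSET_SECONDS


def chrome_time_to_datetime(chrome_us: int) -> datetime:
    """Convert a history timestamp to a timezone-aware UTC datetime."""
    start = datetime(1601, 1, 1, tzinfo=timezone.utc)
    return start + timedelta(microseconds=chrome_us)
```

Calling .astimezone() on the result gives local time, which is roughly what the 'localtime' modifier does in the sqlite query.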

If I knew R better at this point I would just show the histogram of these timestamps, but given that I mostly use it as a command-line calculator, this will take me more than a few minutes...
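
In the meantime, a rough text histogram is possible without R. The sketch below is only an approximation of what I was after: it reuses the same urls table, last_visit_time column and URL filter as the query above, so each distinct search URL is counted once, by the hour of its last visit. The function names and the bar width are made up for illustration.

```python
import sqlite3

# Same filter and epoch adjustment as the query in the post, grouped by local hour.
HOURLY_SQL = """
SELECT CAST(strftime('%H', last_visit_time/1000000 - 11644473600,
                     'unixepoch', 'localtime') AS INTEGER) AS hour,
       COUNT(*) AS cnt
FROM urls
WHERE url LIKE '%google.com/sea%'
GROUP BY hour
"""


def searches_per_hour(history_path):
    """Return a 24-element list of search-URL counts, indexed by local hour."""
    counts = [0] * 24
    conn = sqlite3.connect(history_path)
    try:
        for hour, cnt in conn.execute(HOURLY_SQL):
            counts[hour] = cnt
    finally:
        conn.close()
    return counts


def print_histogram(counts, width=40):
    """Print one line per hour, with a bar scaled to the busiest hour."""
    peak = max(counts) or 1  # avoid dividing by zero on an empty history
    for hour, cnt in enumerate(counts):
        bar = "#" * round(width * cnt / peak)
        print(f"{hour:02d} {bar} {cnt}")
```

With the copy made earlier, print_histogram(searches_per_hour("/tmp/History")) should print 24 lines, one per hour of the day.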
