I wanted to find out how a typical person's Google searches are distributed over the course of a day.
A friend in the office said that I should look at real data, and I guess the only data I have is my own. I looked at my Chrome web history, but Chrome doesn't have timestamps there...
OK, after a bit of digging I found it:
For a Mac user it's
ithaca> cp ~/Library/Application\ Support/Google/Chrome/Profile\ 3/History /tmp
ithaca> sqlite3 /tmp/History
sqlite> select avg(cnt) from (
    select ddate, count(*) cnt from (
        select
            date(last_visit_time/1000000-11644473600,'unixepoch','localtime') as ddate,
            time(last_visit_time/1000000-11644473600,'unixepoch','localtime') as dtime
        from urls
        where
            url like '%google.com/sea%'
        order by 1 desc
    ) a
    group by ddate
) b;
13.0923076923077
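To sanity-check the offset used in the query, here is a minimal Python sketch of the same arithmetic. It assumes the stored value is a count of microseconds since January 1, 1601 (UTC), which is what the divide-by-1000000 and the 11644473600 offset imply. The function and constant names are illustrative, not part of Chrome or sqlite.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Chrome history timestamps are microseconds since 1601-01-01 UTC.
CHROME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)

# Seconds between 1601-01-01 and 1970-01-01, the offset used in the query.
EPOCH_OFFSET_SECONDS = 11644473600


def chrome_time_to_unix(chrome_us: int) -> int:
    """Same arithmetic as the SQL: whole seconds since the Unix epoch."""
    return chrome_us // 1_000_000 - EPOCH_OFFSET_SECONDS


def chrome_time_to_datetime(chrome_us: int) -> datetime:
    """Convert a Chrome history timestamp to a timezone-aware UTC datetime."""
    return CHROME_EPOCH + timedelta(microseconds=chrome_us)
```

Feeding it a `last_visit_time` value copied from the `urls` table should give a date that matches what the history page shows for that URL.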
Interesting things there:
There is a history file for each of your profiles, as you would expect.
The history file is an sqlite db file! A very nice surprise.
The timestamps aren't Unix-epoch based. For some reason they count microseconds from January 1, 1601, right at the start of the 17th century, so you need to apply an offset (Stack Overflow gave the right one).
I still can't believe that browsers treat web history as something disposable. I have no clue why mine is truncated 2-3 months back.
A tool that syncs the history files from all my profiles into a more persistent DB, daily or weekly, may be worthwhile.
On average I do 13 searches per day (I was expecting more, given how frequently I use the Chrome omnibox).
Referring back to the previous post: Google is making at least $30/mo from my search activity.
If I knew R better at this point I would just show the histogram of these timestamps, but given that I mostly use it as a command-line calculator, this will take me more than a few minutes...
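In lieu of R, a rough text histogram is possible with nothing but Python's standard library. The sketch below reuses the filter and timestamp conversion from the query above and buckets the results by local hour of day. Like that query, it reads `last_visit_time` from the `urls` table, so each distinct search URL is counted once. The function names and the bar width are illustrative choices.

```python
import sqlite3
from collections import Counter

# Same filter and timestamp conversion as the query in the post;
# strftime('%H', ...) keeps only the local hour of day.
HOURLY_SQL = """
SELECT strftime('%H', last_visit_time / 1000000 - 11644473600,
                'unixepoch', 'localtime') AS hour,
       count(*) AS cnt
FROM urls
WHERE url LIKE '%google.com/sea%'
GROUP BY hour
"""


def searches_by_hour(conn):
    """Return a Counter mapping hour of day (0-23) to number of search URLs."""
    counts = Counter()
    for hour, cnt in conn.execute(HOURLY_SQL):
        counts[int(hour)] += cnt
    return counts


def print_histogram(counts, width=50):
    """Print one row per hour; the busiest hour gets a bar of `width` '#'s."""
    peak = max(counts.values(), default=0)
    for hour in range(24):
        n = counts.get(hour, 0)
        bar = "#" * (n * width // peak) if peak else ""
        print(f"{hour:02d} {bar} {n}")
```

To try it on the copy made earlier, open it with `conn = sqlite3.connect("/tmp/History")` and call `print_histogram(searches_by_hour(conn))`.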