It is all about me!
you have nothing to prove, except theorems


 

September 2, 2010

New rules for twitter

Filed under: Projects, Business — ivan @ 2:36 am

So basically they are saying…


> ---------- Forwarded message ----------
> From: Twitter
>
> Over the coming weeks, we will be making two important updates …
>
> Update 1: New authorization rules for applications
>

we want all the account data
(not allowed to keep the password?)


> Update 2: t.co URL wrapping
>

and we want all the click information

why not… the masses want to tweet so they control
the “platform”

i mean //obviously// the solution to the whole link shortener
thing is to have YET ANOTHER REDIRECT in the whole process

t.co/45213 -->
<-- 301 tinyurl.com/alksja
tinyurl.com/alksja -->
<-- 301 somesite.com/article
somesite.com/article -->
<-- 200
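
To make the extra hop concrete, here is a minimal sketch that walks the chain above without touching the network. The `REDIRECTS` table and the `follow` helper are made up for illustration; they only mirror the three URLs in the example.

```python
# Hypothetical redirect table mirroring the chain in the post (no network access).
REDIRECTS = {
    "t.co/45213": "tinyurl.com/alksja",
    "tinyurl.com/alksja": "somesite.com/article",
}

def follow(url, redirects, max_hops=10):
    """Return the (status, url) hops a browser makes before reaching content."""
    hops = []
    for _ in range(max_hops):
        if url in redirects:
            hops.append((301, url))   # shortener answers with a redirect
            url = redirects[url]
        else:
            hops.append((200, url))   # final page answers with content
            return hops
    raise RuntimeError("too many redirects")

chain = follow("t.co/45213", REDIRECTS)
for status, url in chain:
    print(status, url)
```

Every entry with a 301 is one more round trip for the reader, and one more party that gets to log the click.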

This is a ridiculous grab for data!

They are basically saying, we are sitting on this very valuable data
which is click statistics and we want to control it, have access to it,
maybe even remove your redirect and just leave ours …

data is the new money on the web
and they just made a massive grab with two hands

twitter hasn’t come up with a business model but it
wants to harvest the real-time link structure.
They are imitating the google model … control lots of data
and do ML on it to produce some “useful” application:
the application is real time search
not crazy i know… but it is pretty important …
entertainment, suggestions, news search, messaging
everything google does kind of…

pfff… all these “web giants” are like kindergarten software developers
google is kind of useful with the whole search thing
but twitter and facebook are really for kids for god’s sake!
what are grownups doing on these platforms?

the collective age of the north american population is like…
16 years of mental maturity!
hormones, pictures, and vanity … oh and status messages !
WTF? “I am thinking about having ice cream….”
Good for you. Now fuck off!

link sharing is the only actual useful service of this whole
web 2.0 bubble….


August 21, 2010

Math competition

Filed under: Math, Minireference — ivan @ 9:09 pm

It appears that yet another person has come up with the “cool math” book idea. $17k raised. Dood…. I am not sure what that means. Does he get $17k in his bank account? That is VERY impressive if that is the case.

More info:
idea map
interview


Filed under: liblda — ivan @ 4:38 am

This is one of the results that comes up when I type my name into google.

Very cool!
I mean this is what I want to be associated with ;)

Let’s have an update
We are waaaay past June, and no LDA code has been written. Some papers have been read. Some new results out on the list. I spoke with my supervisor and she said my project was a good idea. Let’s see if I can make it…

I kind of stopped thinking about a whole py-lda, because I learned about the MAHOUT project, which has a perfectly good LDA implementation that can even take advantage of clusters. I want to investigate that further and maybe write MAHOUT plugins instead of making my own complete ML library…

On the other hand playing with NumPy arrays from C and python will probably be a good exercise in efficiency (and pointer counting).

On the theory side, I have managed to word the problem scientifically, but I am not sure how it can be solved…. wait. I just realized I was complicating the problem unnecessarily. Here is the simpler version:

Let p(x) be a discrete probability distribution over x \in {list of words}:

  \sum_x p(x) = 1

Let q_1(x), q_2(x), q_3(x), \ldots, q_n(x) be a set of probability distributions and let

  q(x) = \lambda_1 q_1(x) + \lambda_2 q_2(x) + \cdots + \lambda_n q_n(x)

Can you find the optimal \vec{\lambda} which minimizes the Kullback-Leibler
divergence between p(x) and q(x), i.e.

  \operatorname*{arg\,min}_{\vec{\lambda}} KL( p(x), \vec{\lambda}\cdot\vec{q}(x) )
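
One way to attack this numerically is sketched below. This is not a solution from the post: it swaps in the standard EM-style multiplicative update for mixture weights, and it assumes that the \lambda_i are non-negative and sum to 1, and that q(x) > 0 wherever p(x) > 0. The names `kl`, `mix` and `fit_weights` and the toy distributions are made up for illustration.

```python
import math

def kl(p, q):
    """Kullback-Leibler divergence KL(p, q) for two discrete distributions."""
    return sum(px * math.log(px / qx) for px, qx in zip(p, q) if px > 0)

def mix(lam, qs):
    """Mixture q(x) = sum_i lam_i * q_i(x)."""
    return [sum(l * qi[x] for l, qi in zip(lam, qs)) for x in range(len(qs[0]))]

def fit_weights(p, qs, iters=2000):
    """EM-style update: lam_i <- lam_i * sum_x p(x) * q_i(x) / q(x)."""
    lam = [1.0 / len(qs)] * len(qs)          # start from uniform weights
    for _ in range(iters):
        q = mix(lam, qs)
        lam = [
            l * sum(px * qi[x] / q[x] for x, px in enumerate(p) if px > 0)
            for l, qi in zip(lam, qs)
        ]
    return lam

# Toy check: p is built as 0.3*q1 + 0.7*q2, so the weights should be recovered.
q1 = [0.7, 0.2, 0.1]
q2 = [0.1, 0.3, 0.6]
p = [0.28, 0.27, 0.45]
lam = fit_weights(p, [q1, q2])
print(lam, kl(p, mix(lam, [q1, q2])))
```

The objective is convex in \vec{\lambda} (minimizing the KL divergence here is the same as maximizing \sum_x p(x) \log q(x)), so on the simplex a simple iteration like this one should not get stuck in a bad local optimum.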


August 11, 2010

The future of the web

Filed under: Political, Business — ivan @ 2:09 am

These days heavy things are hitting the news. Google and the telcos want to define how you access the internet, facebook is hitting its peak and twitter still hasn’t figured out how to make money.

Facebook, the dominant system, continues to hold the number one spot as a web app. People make money from it through marketing and flash games, so now many people have an interest in supporting the facebook platform.

And it is not like there is competition.
Where are these guys at?

diaspora

Someone just asked about that on HN.

How hard is it to set up a crypto system and some very scalable simple web app?
I mean why not?
Weekend project ;) Anyone care to give me 50k and watch me try?

Someone should do it! Because the longer facebook goes unchallenged, the more difficult it will be to eradicate it as a social platform… Just look at how entrenched windows is despite being a shit OS (relative to Ubuntu and OSX). It is not really windows that matters so much as the apps for that platform. Even if you build a better platform, you might still fail because of the number of apps for the old system…

The big selling point of an OSS and self-administered facebook is privacy. But people don’t seem to care about privacy all that much.
Every time the privacy debate comes up in relation to google or facebook the conversation is steered towards the following assumptions:

  1. The attacker is another member of the website or an anonymous web user.
  2. The platform (gmail or facebook) is a trusted third party.

This is not the debate I want to be having. We don’t agree on the assumptions. Any “privacy policy” I can fill out for my data will not prevent the company from accessing my data. They own the database and the file servers and even have authority over my login credentials for that website.

What about the privacy in the following paradigm:

  1. The attacker is a person on the internet.
  2. The software runs on a distributed platform of hosts run by untrusted VM contributors.
  3. You run your own credentials server which you can trust.

The last item is still beyond the technical abilities of mom and pop, who barely figured
out how to log in to facebook, so maybe this paradigm isn’t that relevant.
It is relevant to me though. And I think to the generation of people who use the internet
even more than I do.

How do you secure the web?
How do you make the web immune to eavesdroppers?
(Oh noo…. oh… waiiiiit… boay. whatyou tawking bout? if we can’t listen in on your social conversations we won’t be happy — say the letter agencies. )

This guy has lots to say about this.


August 8, 2010

Another trace

Filed under: Minireference — ivan @ 2:56 pm

I am just starting to realize how brutally difficult it is to engage a web surfer.
I get them for a minute or two at most. Perhaps the lessons are too long for web consumption? I shouldn’t

This is a recent trace from a McGill IP. She googled for minireference and showed up at
7:37PM.

[06/Aug/2010:19:37:22 -0400] "GET index HTTP/1.1" 200 2292 "http://www.google.ca/search?sourceid=chrome&ie=UTF-8&q=minireference&qscrl=1"
[06/Aug/2010:19:37:26 -0400] ...

she gave a total of 4 seconds to the home page. Dood!


[06/Aug/2010:19:37:26 -0400] "GET calculus/introduction HTTP/1.1" 200 4486 "index"
[06/Aug/2010:19:37:39 -0400] ...

Nice. That is almost engagement. She took 13 seconds to skim the contents of the Calculus introduction.


[06/Aug/2010:19:37:39 -0400] "GET index HTTP/1.1" 200 2292 "calculus/introduction"
[06/Aug/2010:19:37:45 -0400] "GET lessons/fundamentals HTTP/1.1" 200 655 "index"
[06/Aug/2010:19:37:48 -0400] "GET index HTTP/1.1" 200 2292 "lessons/fundamentals"
[06/Aug/2010:19:37:49 -0400] "GET lessons/fundamentals HTTP/1.1" 200 655 "index"
[06/Aug/2010:19:37:52 -0400] "GET index HTTP/1.1" 200 2292 "lessons/fundamentals"
[06/Aug/2010:19:37:54 -0400] "GET lessons/cal2/index HTTP/1.1" 200 78 "index"
[06/Aug/2010:19:38:02 -0400] "GET lessons/cal2/riemann_sum HTTP/1.1" 200 100 "lessons/cal2/index"
[06/Aug/2010:19:38:14 -0400] "GET lessons/cal2/techniques_of_integration HTTP/1.1" 200 193 "lessons/cal2/riemann_sum"
[06/Aug/2010:19:38:30 -0400] "GET calculus/old_introduction HTTP/1.1" 200 5651 "lessons/cal2/techniques_of_integration"
[06/Aug/2010:19:38:34 -0400] "GET lessons/cal2/techniques_of_integration HTTP/1.1" 200 193 "calculus/old_introduction"
[06/Aug/2010:19:38:35 -0400] "GET lessons/cal2/riemann_sum HTTP/1.1" 200 100 "lessons/cal2/techniques_of_integration"
[06/Aug/2010:19:38:36 -0400] "GET lessons/cal2/index HTTP/1.1" 200 78 "lessons/cal2/riemann_sum"
[06/Aug/2010:19:38:37 -0400] "GET index HTTP/1.1" 200 2292 "lessons/cal2/index"
[06/Aug/2010:19:38:48 -0400] ...

After clicking about aimlessly for some time and wondering “where the hell is the content?”
she settles back to the index page and reads the first paragraph. Then goes to the math index page.

[06/Aug/2010:19:38:48 -0400] "GET math/index HTTP/1.1" 200 543 "index"
[06/Aug/2010:19:38:53 -0400] "GET math/trig_identites HTTP/1.1" 200 858 "math/index"
[06/Aug/2010:19:39:00 -0400] "GET book/buy_printed HTTP/1.1" 200 673 "math/trig_identites"
[06/Aug/2010:19:39:11 -0400] "GET book/buy_pdf HTTP/1.1" 200 297 "book/buy_printed"

I love this…. she clicked on “buy printed” and “buy pdf” in the end. There might be a point.

Lessons learned

  1. Homepage needs a cleanup
  2. index of all topics (contents) back to the front

August 6, 2010

Launch

Filed under: Minireference, Business — ivan @ 2:02 pm

I spent the entire Wednesday night preparing for the grandiose launch of minireference.com and the book version. I prepared a “pitch” letter that invites students to come to the website and test the product.
I also printed out 8 copies of the book — all 60 pages of current content printed on half-of-letter paper and stapled together on the side to give the book-like feeling. I was quite impressed with the print quality actually. \documentclass[10pt]{book} works magic things out of the box…
Everything was ready by 8:10AM and I went to the classroom and placed the pitch letters on the exact spots where people
tend to sit (I had previously attended a class to size up how many copies I would need).

One problem. There was no class that day!

To be honest, the launch is not a total fail. I saw two students going into the class to use the space for studying and one of them must have looked at the pitch. He/she bought it and gave me this nice trace in access.log.


[05/Aug/2010:09:53:34 -0400] "GET index HTTP/1.1" 200 2292 "" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.3) Gecko/2008092414 Firefox/3.0.3"
[05/Aug/2010:09:53:49 -0400] "GET lessons/fundamentals HTTP/1.1" 200 655 "index"
[05/Aug/2010:09:54:12 -0400] "GET index HTTP/1.1" 200 2292 "lessons/fundamentals"
[05/Aug/2010:09:54:14 -0400] "GET lessons/cal2/index HTTP/1.1" 200 78 "index"
[05/Aug/2010:09:54:27 -0400] "GET lessons/cal2/riemann_sum HTTP/1.1" 200 100 "lessons/cal2/index"
[05/Aug/2010:09:54:34 -0400] "GET lessons/cal2/techniques_of_integration HTTP/1.1" 200 193 "lessons/cal2/riemann_sum"
[05/Aug/2010:09:57:03 -0400] "GET calculus/series HTTP/1.1" 200 3318 "lessons/cal2/techniques_of_integration"
[05/Aug/2010:09:57:40 -0400] "GET calculus/formulas_to_memorize HTTP/1.1" 200 915 "calculus/series"

I captured someone’s attention for 4 minutes!
That is pretty good by internet standards no?

Visit analysis: She came in, checked the fundamentals page and didn’t find the link to “solving equations” too interesting, so I guess she is a calculus student who knows her basics. Then from the main page she started the cal2 lesson sequence, where she spent:

  1. 13 secs on the cal2 index page
  2. 7 secs on the Riemann sum page
  3. 2.5 minutes on techniques of integration
  4. 37 secs on series

Ok. So more visitors needed. More content quality needed. And definitely some software help for log analysis!
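
As a first bit of software help, here is a minimal sketch that turns a trace like the one above into time-per-page numbers. It assumes the simplified log layout shown in these posts (timestamp in brackets, then the quoted GET request); a real Apache access.log has more fields in front, so the regular expression would likely need adjusting. The names `LINE_RE`, `dwell_times` and `SAMPLE` are made up for illustration.

```python
import re
from datetime import datetime

# Matches the simplified access.log lines shown in these posts:
# [timestamp] "GET page HTTP/1.1" status size "referrer"
LINE_RE = re.compile(r'\[(?P<ts>[^\]]+)\] "GET (?P<page>\S+) HTTP/[\d.]+"')

def dwell_times(lines):
    """Return (page, seconds) pairs: time between each hit and the next one."""
    hits = []
    for line in lines:
        match = LINE_RE.search(line)
        if match is None:
            continue  # skip lines that are not GET requests
        stamp = datetime.strptime(match.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        hits.append((stamp, match.group("page")))
    # The last hit has no following request, so its dwell time is unknown.
    return [
        (page, int((following - stamp).total_seconds()))
        for (stamp, page), (following, _) in zip(hits, hits[1:])
    ]

SAMPLE = """\
[05/Aug/2010:09:53:34 -0400] "GET index HTTP/1.1" 200 2292 ""
[05/Aug/2010:09:53:49 -0400] "GET lessons/fundamentals HTTP/1.1" 200 655 "index"
[05/Aug/2010:09:54:12 -0400] "GET index HTTP/1.1" 200 2292 "lessons/fundamentals"
[05/Aug/2010:09:54:14 -0400] "GET lessons/cal2/index HTTP/1.1" 200 78 "index"
[05/Aug/2010:09:54:27 -0400] "GET lessons/cal2/riemann_sum HTTP/1.1" 200 100 "lessons/cal2/index"
[05/Aug/2010:09:54:34 -0400] "GET lessons/cal2/techniques_of_integration HTTP/1.1" 200 193 "lessons/cal2/riemann_sum"
[05/Aug/2010:09:57:03 -0400] "GET calculus/series HTTP/1.1" 200 3318 "lessons/cal2/techniques_of_integration"
[05/Aug/2010:09:57:40 -0400] "GET calculus/formulas_to_memorize HTTP/1.1" 200 915 "calculus/series"
""".splitlines()

visit = dwell_times(SAMPLE)
for page, seconds in visit:
    print(f"{seconds:4d}s  {page}")
print("total:", sum(s for _, s in visit), "seconds")
```

The time on the last page can't be measured this way, since nothing in the log says when the visitor left.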


July 14, 2010

Filed under: Minireference, Django — ivan @ 1:59 am

This is a good example of a “report a problem” form.

Every website should have one like that.


July 9, 2010

Content check up

Filed under: Minireference — ivan @ 9:55 am

I am thinking (again) that the output of ls -lR * could be a good layout for
the minireference homepage. Everyone is familiar with apache index pages, so why not borrow
that representation?

This little awk code will clean this up and make it presentable.

ls -lhR * | awk '{ if (NF==9) print $1"\t"$3"\t"$4"\t"$5"\t"$7" "$6"\tat "$8"\t"$9; else print }'

Now the tabs don’t work out too well and I need to wrap some of the
files with [[ ]] links…. but maybe it would be cleaner than the current setup.

book:
total 56
-rw-r--r-- ivan _www 738B Jul 8 at 23:20 about.txt
-rw-r--r-- ivan _www 297B Jun 25 at 01:29 buy_pdf.txt
-rw-r--r-- ivan _www 171B Jun 25 at 01:31 buy_printed.txt
-rw-r--r-- ivan _www 699B Jun 25 at 01:05 donate.txt

calculus:
total 64
-rwxrwxrwx@ ivan staff 4.1K Jun 24 at 23:51 basics.txt
-rw-r--r-- ivan _www 146B Jun 15 at 11:02 concept_template.txt
-rw-r--r-- ivan _www 3.3K Jun 19 at 16:47 index.txt
-rw-r--r-- ivan _www 5.5K Jun 20 at 22:29 introduction.txt
-rw-r--r-- _www _www 6.1K Jul 8 at 22:42 riemann_sum.txt

electricity:
total 48
-rw-r--r-- ivan _www 117B Mar 30 at 01:29 capacitors.txt
-rw-r--r-- ivan _www 1.7K May 16 at 15:24 circuits.txt
-rw-r--r-- ivan _www 2.2K Apr 25 at 14:43 magnetic_field.txt
-rw-r--r-- ivan _www 4.7K May 23 at 15:19 start.txt

linear_algebra:
total 80
-rw-r--r--@ ivan staff 2.5K Jun 24 at 22:35 basis.txt
-rw-r--r--@ ivan staff 4.6K Jun 25 at 00:46 concepts.txt
-rw-r--r-- ivan _www 353B Jun 19 at 19:28 index.txt
-rw-r--r-- ivan _www 1.2K Jun 24 at 22:38 len_direction.txt
-rw-r--r-- ivan _www 5.0K Jun 25 at 02:16 operations_on_vectors.txt
-rw-r--r-- ivan _www 1.1K Jun 24 at 23:43 special_types_of_matrices.txt
-rw-r--r-- ivan _www 4.8K Jun 24 at 23:51 vectors.txt

math:
total 48
-rw-rw-r-- ivan _www 101B Jun 18 at 12:30 all.txt
-rw-r--r--@ ivan staff 3.3K Jun 16 at 15:13 functions.txt
-rw-r--r-- ivan _www 353B Jun 20 at 18:34 functions_and_inverses.txt
-rw-r--r-- ivan _www 662B Jul 8 at 23:20 index.txt
-rw-r--r-- ivan _www 2.6K Jul 8 at 23:20 numbers.txt
-rw-r--r-- ivan _www 3.2K Jul 8 at 23:20 solving_equations.txt

physics:
total 104
-rw-r--r--@ ivan _www 3.1K Jun 15 at 23:22 basics.txt
-rw-r--r-- ivan _www 1.5K Mar 21 at 01:07 energy.txt
-rw-r--r-- ivan _www 1.4K Mar 30 at 14:52 force_diagrams.txt
-rw-r--r-- ivan _www 2.1K Jun 25 at 00:20 index.txt
-rw-r--r-- ivan _www 4.7K May 23 at 07:15 kinematics.txt
-rw-r--r-- ivan _www 3.4K Jun 2 at 14:02 momentum.txt
-rw-r--r-- ivan _www 303B Feb 21 at 15:13 optics.txt
-rw-r--r-- ivan _www 11K Jun 25 at 00:20 other_resources.txt
-rw-r--r-- ivan _www 2.3K May 12 at 00:48 simple_harmonic_motion.txt
-rw-r--r-- ivan _www 733B Jun 1 at 12:40 template.txt

I seem to have about 16k words on all the topics

wpa114015:miniref ivan$ for x in `find . -name "*.txt" -print`; do cat $x | wc ; done | awk '{ SUMW += $2} END {print "Total # words: " SUMW }'
Total # words: 15959
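
The same count can be sketched in Python, which might be easier to extend later (per-topic totals, for example). This is only a rough equivalent of the shell loop above: it splits on whitespace, which should match wc's word count closely but is not guaranteed to be identical. The `count_words` helper and the temporary demo files are made up for illustration.

```python
import tempfile
from pathlib import Path

def count_words(root, pattern="*.txt"):
    """Sum whitespace-separated words over all matching files under root."""
    total = 0
    for path in Path(root).rglob(pattern):
        text = path.read_text(encoding="utf-8", errors="replace")
        total += len(text.split())
    return total

# Demo on a throwaway directory tree (hypothetical file names and contents).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "math").mkdir()
    (Path(tmp) / "math" / "numbers.txt").write_text("one two three")
    (Path(tmp) / "index.txt").write_text("four five")
    (Path(tmp) / "notes.tex").write_text("ignored words")  # not a .txt file
    demo_total = count_words(tmp)

print("Total # words:", demo_total)
```

Calling `count_words(".")` from the miniref directory would be the counterpart of the one-liner.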

This is probably not enough to fill a 100 page book. My thesis was about 35k words and that filled 120 pages or so, but in the McGill stylesheet which is very airy.

wpa114015:Thesis ivan$ for x in `ls *.tex`; do cat $x | wc ; done | awk '{ SUMW += $2} END {print "Total # words: " SUMW }'
Total # words: 34437

I really need to get cracking on the content side before the 12th!


July 8, 2010

Google cookies

Filed under: Uncategorized — ivan @ 9:44 am

A very cool article about the cookies set by google’s web-surfer tracking system.

The Very Basics – The Google Analytics Cookies
When someone visits a website that is properly coded with Google Analytics Tracking Code, that website sets four first-party cookies on the visitor’s computer automatically.

So, What Are These Four Cookies?
Well, there can be up to five different cookies that a website with Google Analytics tracking code sets on your computer. However, four of them are automatically set, while the fifth one is an optional cookie. Let’s take a look at each one.

The __utma Cookie
This cookie is what’s called a “persistent” cookie, as in, it never expires (technically, it does expire…in the year 2038…but for the sake of explanation, let’s pretend that it never expires, ever). This cookie keeps track of the number of times a visitor has been to the site pertaining to the cookie, when their first visit was, and when their last visit occurred. Google Analytics uses the information from this cookie to calculate things like Days and Visits to purchase.

The __utmb and __utmc Cookies
The B and C cookies are brothers, working together to calculate how long a visit takes. __utmb takes a timestamp of the exact moment in time when a visitor enters a site, while __utmc takes a timestamp of the exact moment in time when a visitor leaves a site. __utmb expires at the end of the session. __utmc waits 30 minutes, and then it expires. You see, __utmc has no way of knowing when a user closes their browser or leaves a website, so it waits 30 minutes for another pageview to happen, and if it doesn’t, it expires.

The __utmz Cookie
Mr. __utmz keeps track of where the visitor came from, what search engine you used, what link you clicked on, what keyword you used, and where they were in the world when you accessed a website. It expires in 15,768,000 seconds – or, in 6 months. This cookie is how Google Analytics knows to whom and to what source / medium / keyword to assign the credit for a Goal Conversion or an Ecommerce Transaction. __utmz also lets you edit its length with a simple customization to the Google Analytics Tracking code.

The __utmv Cookie
If you are making use of the user-defined report in Google Analytics, and have coded something on your site for some custom segmentation, the __utmv cookie gets set on the person’s computer, so that Google Analytics knows how to classify that visitor. The __utmv cookie is also a persistent, lifetime cookie.

That’s All Great since Users Can Delete These Cookies!
Fortunately, google cannot do anything about someone deleting their cookies from their computers. The __utmb and __utmc cookies are gone before you know it, but the __utma, __utmz, and __utmv cookie (when applicable) will remain for a long period of time. Whenever someone deletes the __utma cookie, they are in essence deleting their history with your website. When they visit your website again, they are considered a brand new visitor, just as they were the first time they came around.

This means that from your own tracking code, you can piggyback on the unique ids assigned by google!
very cool…
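
A minimal sketch of what piggybacking could look like on the server side. The quoted article does not describe the cookie's layout, so the six dot-separated fields below (domain hash, visitor id, first/previous/current visit timestamps, visit count) are my assumption about the legacy format. The `parse_utma` helper and the example value are made up.

```python
from datetime import datetime, timezone

def parse_utma(value):
    """Split a __utma value into its fields (ASSUMED six-field layout)."""
    domain_hash, visitor_id, first, previous, current, visits = value.split(".")

    def to_datetime(seconds):
        # Timestamps are assumed to be Unix seconds.
        return datetime.fromtimestamp(int(seconds), tz=timezone.utc)

    return {
        "domain_hash": domain_hash,
        "visitor_id": visitor_id,          # the id to piggyback on
        "first_visit": to_datetime(first),
        "previous_visit": to_datetime(previous),
        "current_visit": to_datetime(current),
        "visits": int(visits),
    }

# Hypothetical cookie value, not taken from a real visitor.
example = "12345678.987654321.1278500000.1278586400.1278590000.3"
parsed = parse_utma(example)
print(parsed["visitor_id"], parsed["visits"], parsed["first_visit"].date())
```

Since these are first-party cookies, the site's own code can read them from the request's Cookie header like any other cookie.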


July 7, 2010

Mac OS version of /etc/init.d/

Filed under: Mac OS X — ivan @ 9:08 pm

Where are the boot scripts in Mac OS?

Here: /System/Library/LaunchDaemons.

Instead of “calling” them directly you have to go through launchctl, e.g.


sudo launchctl unload /System/Library/LaunchDaemons/org.cups.cupsd.plist
sudo launchctl load /System/Library/LaunchDaemons/org.cups.cupsd.plist

This might come in handy, if you ever go around killing essential processes on your computer…

