Machine Learning, etc: March 2005

Thursday, March 10, 2005

The joy of scanning

I found a book scanner in our library's basement and decided to put it to good use by scanning some references on the foundations of Bayesianism that are hard to find online.

"Algebra of Probable Inference" by Cox, 1961 (aka, Why everyone should be a Bayesian). Demonstrates a functional derivation of probability theory as the unique extension of Boolean Algebra.
"Why I'm not a Bayesian" by Clark Glymour, Theory and Evidence, 1981. Criticizes the Bayesian approach from a philosophy-of-science point of view.
"Why Glymour is a Bayesian" by R. Rosenkrantz, Testing Scientific Theories, 1983.
"Why isn't everyone a Bayesian?" by B. Efron, American Statistician, 1986. Examines reasons why not everybody was a Bayesian, as of 1986, with a scorching reply from Lindley.
"Axioms of Maximum Entropy" by Skilling, MaxEnt 1988 proceedings. Sets up four practically motivated axioms, and uses them to derive maximum entropy as the unique method for picking a single probability distribution from the set of valid probability distributions.

I included the last one because there are some parallels with the philosophy of Bayesianism. Following Cox, one should be a Bayesian if they

Assume Boolean logic.
Can encode their true belief as a pdf.

On the other hand, following Skilling, one should use the ME (maximum entropy) principle if they

Believe Skilling's axioms.
Have statements about the true distribution in the form of constraints.

In both models, assumption number 1 sounds more or less reasonable, whereas assumption number 2 causes discontent. With the Bayesian approach, we don't know how to represent our prior knowledge as a pdf, whereas with the ME approach, we don't know where to get the constraints from. In practice, however, we can often find a pdf that is close to representing the true belief of the expert, and similarly we can often find constraints that approximately rule out infeasible distributions.
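To make the ME recipe concrete, here is a minimal Python sketch. It is not taken from Skilling's paper: it assumes the only constraint is a known mean for a six-sided die (the classic dice example usually attributed to Jaynes), and the names `maxent_distribution` and `entropy`, as well as the target mean of 4.5, are illustrative choices. Under a single mean constraint the maximum entropy solution has the form p_i proportional to exp(lam * x_i), so the code only has to search for lam, which it does by bisection.

```python
import math


def maxent_distribution(values, target_mean, lo=-50.0, hi=50.0, tol=1e-12):
    """Maximum entropy distribution over `values` with a fixed mean.

    Assumption: the only constraint is E[x] = target_mean. The solution
    then has the form p_i proportional to exp(lam * x_i), and the mean is
    increasing in lam, so lam can be found by bisection.
    """
    if not min(values) < target_mean < max(values):
        raise ValueError("target_mean must lie strictly inside the value range")

    def solve(lam):
        # Subtract the largest exponent before exponentiating to avoid overflow.
        shift = max(lam * x for x in values)
        weights = [math.exp(lam * x - shift) for x in values]
        z = sum(weights)
        probs = [w / z for w in weights]
        mean = sum(p * x for p, x in zip(probs, values))
        return mean, probs

    while hi - lo > tol:
        mid = (lo + hi) / 2
        mean, _ = solve(mid)
        if mean < target_mean:
            lo = mid  # mean too small: need a larger lam
        else:
            hi = mid  # mean too large: need a smaller lam
    return solve((lo + hi) / 2)[1]


def entropy(probs):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


faces = [1, 2, 3, 4, 5, 6]
p = maxent_distribution(faces, 4.5)  # constraint: average roll is 4.5
print([round(pi, 4) for pi in p])
print("entropy:", entropy(p), "vs uniform:", math.log(6))
```

With a target mean of 3.5 the constraint carries no information beyond the face values and the result is the uniform distribution; any other mean tilts the distribution exponentially. The answer is only as good as the constraint fed in, which is exactly the worry about assumption number 2 above.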

Questions

These references are roughly 20 years old or older. A newer one, from 2000 by Berger, looks at the rising popularity of "objective" Bayesian and robust Bayesian approaches, and predicts that the practical Bayesianism of the future will contain both frequentist and traditional Bayesian elements. Does anyone know of more up-to-date overviews of different inductive reasoning methods?

BTW, if you are one of the authors and don't like the links to your work, let me know and I'll remove them.

Saturday, March 05, 2005

More on CiteULike

I've noticed some people I know starting to use CiteULike recently. Services like CiteULike are generally a good development because they increase the efficiency of research: people usually share bibliographies through the references section of their published articles, but a publication can take years to become accessible.

The immediate advantage of CiteULike is that it can fill in the bibliographic details for you. Here are some CiteULike usage tips:

Make sure you have the "Post to Citeulike" button on your browser toolbar
When adding article X, first search for X on scholar.google.com
If search results have a link to ACM, IEEE, JSTOR, Ingenta or some other supported collection you are in luck -- go there, click "Post to Citeulike" and it'll automatically extract the bibliographic information for you
If you are adding a book, search for it on Amazon and follow the same procedure
For off-campus access, save electronic copies of books/articles to a web-accessible directory, and add a link to the copy (it's only seen by you) to the book/article entry
Add the RSS feeds of new CiteULike submissions with relevant tags to your RSS aggregator
Choose a username that lets people find you through Google (e.g., I use yaroslavvb, which Google knows about)

Questions:

Any other efficiency tips?

Friday, March 04, 2005

Machine Learning journals

Here are some journals to keep an eye on, along with their RSS feeds. I picked out the list by seeing where my favourite Machine Learning/stats papers came from:

Journal of Machine Learning Research (web, rss) -- the journal for machine learning publications. A better and freer replacement for the "Machine Learning" journal -- here's some history
IEEE Transactions on Pattern Analysis and Machine Intelligence (web, rss) -- pattern means "visual pattern"
Machine Learning journal (web, rss) -- seems more applied than JMLR
Studies in History and Philosophy of Science Part B: Modern Physics (web, rss) -- sometimes has discussions of Bayesianism and MaxEnt
Neural Computation (web, rss) -- neural nets, neurobiology, and general machine learning
ARXIV submissions on Learning (rss) and on Information Theory (rss)
Journal of the Royal Statistical Society (web, rss) -- some of the best papers on estimation were published here
Physical Review Letters (web) -- sometimes has Machine Learning related papers
The Annals of Statistics (web) -- Csiszár published here
SIAM applied math (web)
Journal of Econometrics (web, rss) -- often the same problems but with a different name
Proceedings of the National Academy of Sciences (web) -- sometimes has interesting papers in the applied math/computer science/statistics sections

The CWI journal club has a bigger list of journals with links to their websites, mostly AI (not ML) oriented -- http://homepages.cwi.nl/~tomas/journalclub/20050301.html

Watch out for information overload :O

Questions:

What are some other journals that Machine Learning researchers should know about?