Sunday, October 16, 2005

Commercial article databases and indexes

Full texts of recent machine learning papers are usually freely available on the web (i.e., findable through Google). That's often not the case for related fields like statistics, math and econometrics, so one has to rely on an indexing service that links into commercial article databases. There are hundreds of such indexing services. I tend to use the following indexes, which I think have the highest recall for ML and statistics papers: MathSciNet, JSTOR and Google Scholar.

Here are the numbers of results for 3 queries, searching for the exact phrase in the title:

"exponential families":
Google Scholar -- 635
MathSciNet -- 660
JSTOR -- 147

"support vector":
Google Scholar -- 3800
MathSciNet -- 131
JSTOR -- 1

"cross validation":
Google Scholar -- 1530
MathSciNet -- 181
JSTOR -- 53
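
To make the counts easier to compare across queries, here is a minimal Python sketch that expresses each index's count as a fraction of the largest count for that query. The function name relative_coverage is made up for illustration, and the fraction is only a rough proxy for recall, since a raw hit count says nothing about duplicates or about which articles overlap between indexes.

```python
# Title-phrase hit counts copied from the lists above (October 2005).
counts = {
    "exponential families": {"Google Scholar": 635, "MathSciNet": 660, "JSTOR": 147},
    "support vector": {"Google Scholar": 3800, "MathSciNet": 131, "JSTOR": 1},
    "cross validation": {"Google Scholar": 1530, "MathSciNet": 181, "JSTOR": 53},
}


def relative_coverage(hits):
    """Return each index's count as a fraction of the largest count for the query.

    Hypothetical helper for illustration; this is not a true recall estimate.
    """
    top = max(hits.values())
    return {index: n / top for index, n in hits.items()}


for query, hits in counts.items():
    shares = relative_coverage(hits)
    row = ", ".join(f"{index}: {share:.0%}" for index, share in shares.items())
    print(f'"{query}" -> {row}')
```

The index with the most hits always comes out at 100%, so the interesting part is how far the others fall behind on each query.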

JSTOR has full texts for all articles it indexes. In my experience, MathSciNet has full-text links for about half of its articles. Google Scholar probably has the lowest precision of the three, because its results include duplicates and documents not published anywhere besides the Web.

Questions:
What are other good indexing services?

Saturday, October 15, 2005

Learning Mathematica

Here are some useful Mathematica training notebooks I came across.

Mathematica Training -- notebooks from a 2-day Mathematica course
NKS summer school -- some notebooks from the "New Kind of Science" summer school intro to Mathematica
Programming Paradigms via Mathematica -- notebooks from a course developed by Neidinger and Swallow