Sunday, October 16, 2005

Commercial article databases and indexes

Full texts of recent machine learning papers are usually freely available on the web (e.g., through a Google search). That's often not the case for related fields like statistics, math, and econometrics, so one has to rely on an indexing service that links into commercial article databases. There are hundreds of such services. I tend to use the following three, which I think have the highest recall for ML/statistics material: MathSciNet, JSTOR, and Google Scholar.

Here are the result counts for three queries, each searching for the exact phrase in the title:

"exponential families":
Google Scholar -- 635
MathSciNet -- 660
JSTOR -- 147

"support vector":
Google Scholar -- 3800
MathSciNet -- 131
JSTOR -- 1

"cross validation":
Google Scholar -- 1530
MathSciNet -- 181
JSTOR -- 53
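
To make the counts easier to compare across queries, here is a minimal Python sketch that scales each index's count by the largest count for the same query. The helper name `relative_counts` and the dictionary layout are my own choices for illustration; the numbers are the ones listed above. These ratios are only a rough proxy and not recall, since the true number of relevant articles is unknown.

```python
# Title-phrase hit counts copied from the lists above.
counts = {
    "exponential families": {"Google Scholar": 635, "MathSciNet": 660, "JSTOR": 147},
    "support vector": {"Google Scholar": 3800, "MathSciNet": 131, "JSTOR": 1},
    "cross validation": {"Google Scholar": 1530, "MathSciNet": 181, "JSTOR": 53},
}


def relative_counts(hits):
    """Scale each index's hit count by the largest count for the same query."""
    top = max(hits.values())
    return {index: n / top for index, n in hits.items()}


# Print one line per query, with each index's share of the top count.
for query, hits in counts.items():
    shares = relative_counts(hits)
    row = ", ".join(f"{index} {share:.2f}" for index, share in shares.items())
    print(f'"{query}": {row}')
```

A ratio of 1.00 marks the index that returned the most hits for that query; it says nothing about how many of those hits are duplicates or irrelevant.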

JSTOR has full texts for all the articles it indexes. In my experience, MathSciNet had full-text links for about half of the articles. Google Scholar probably has the lowest precision, because it includes duplicates and documents not published anywhere besides the Web.

Questions:
What are other good indexing services?

3 comments:

Yaroslav said...

From Aleks Jakulin:

Hi -

your blog doesn't allow "anonymous" comments; some additional links:

http://www.scirus.com/srsapp/
http://www.isinet.com/products/citation/wos/
http://www.sciencedirect.com/
http://ejournals.ebsco.com/

JoSeK said...

I usually use

http://citeseer.ist.psu.edu/

E said...

Here are some for computational linguistics. Some are useful for their content; others are interesting for how their algorithms work.

http://findory.com/search?do=1&type=Blogs&q=search&submit=go
http://semanticsarchive.net/

http://uk.arxiv.org/form/cs.CL
http://nsdl.org/
http://bibsonomy.org/
http://rexa.info/
http://naboo.ilit.umbc.edu/aks1/v2/index.pl
http://lse.umiacs.umd.edu:8080/