Saturday, February 12, 2005

Tensor Methods for Optimization

Second-order methods use an exact Hessian, or a compact approximation to it, to direct the search for the minimum. They are called second-order because the Hessian, i.e., the matrix of second derivatives, supplies the second-order term in the Taylor expansion of the function. But what if our function is not well approximated by a quadratic, and we want to guide the minimization using third- or fourth-order information? That's where tensor methods come in.
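As a toy illustration (a hypothetical one-dimensional sketch, not Schnabel's algorithm), here is plain Newton iteration, which at each step minimizes the local quadratic (second-order Taylor) model; tensor methods extend exactly this local model with higher-order terms:

```python
# Toy 1-D Newton's method: at each step, minimize the local quadratic
# (second-order Taylor) model m(x+d) = f(x) + f'(x) d + 0.5 f''(x) d^2,
# whose minimizer is d = -f'(x) / f''(x).

def newton_minimize(grad, hess, x0, tol=1e-10, max_iter=50):
    x = x0
    for _ in range(max_iter):
        g = grad(x)
        if abs(g) < tol:
            break
        x -= g / hess(x)  # Newton step: exact for a quadratic f
    return x

# Example: f(x) = x^4 - 3x^2 + x is not quadratic, so the quadratic
# model is only locally accurate; a higher-order (tensor) model would
# capture more of the function's shape per step.
f_grad = lambda x: 4 * x**3 - 6 * x + 1
f_hess = lambda x: 12 * x**2 - 6

x_min = newton_minimize(f_grad, f_hess, x0=1.0)
```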


Description

The main idea of Schnabel's tensor method is to select a set of p points from previous evaluations that are strongly linearly independent (each new direction makes an angle of at least 45 degrees with the old directions). The method then interpolates the function values at those points by fitting a fourth-order model with a specially restricted form for the third- and fourth-order terms. Because of this restricted form, the overhead of this step is not significant compared to the standard Gauss-Newton algorithm. Because it approximates the function with a fourth-order expansion, the method converges on a larger set of problems. Schnabel also proves superlinear convergence of this method for some cases where Newton's method only converges linearly.
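For concreteness, the model being fit has roughly this form (my paraphrase of Schnabel and Chow's notation):

    m(x_c + d) = f(x_c) + g_c^T d + 1/2 d^T H_c d + 1/6 T_c d^3 + 1/24 V_c d^4

where g_c and H_c are the gradient and Hessian at the current iterate x_c, and T_c and V_c are the third- and fourth-order tensor terms. These are restricted to sums of low-rank tensors chosen so that the interpolation conditions at the p selected points hold, and that low rank is what keeps the extra linear algebra cheap.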

Performance

Seth Greenblatt tried the tensor method on some econometric models and saw a 1.5-3x reduction in the number of function evaluations. He compared it against the Newton-Raphson method using exact matrix inversion (O(n^3)), and the extra overhead of the tensor method was not significant. The problems he tested on were econometric models, including the Lucas-Rapping model and the Berndt-Wood KLEM model. The Lucas-Rapping model in his thesis was essentially logistic regression with econometric semantics.

Schnabel's paper in SIAM J. Optimization also used an explicit Hessian and O(n^3) steps to invert it. The later TENSOLVE paper (with Bouaricha, 1997) introduces an implementation of the tensor method which uses the same approach as Gauss-Newton to approximate the second-order term from gradients, giving O(n^2) space and time requirements. The asymptotic overhead of using tensors is O(n^1.5).

Thoughts

TenSolve stores the approximation to the Hessian explicitly, which takes O(n^2) space. It seems conceptually possible to reduce storage to O(n) by using a method analogous to limited-memory BFGS to represent the Hessian. If that worked, it would yield a tensor optimization method with O(n) storage complexity and O(n^1.5) computation complexity.
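To illustrate the limited-memory idea I have in mind (this is the standard L-BFGS two-loop recursion, not anything from TenSolve): store only the last m pairs (s_i, y_i) of step and gradient differences, O(mn) space, and apply the inverse-Hessian approximation implicitly:

```python
# Sketch of the standard L-BFGS two-loop recursion: applies an
# approximate inverse Hessian to a gradient using only the stored
# (s, y) curvature pairs, so storage is O(m n) rather than the
# O(n^2) of an explicit Hessian. Hypothetical illustration only.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def lbfgs_direction(grad, s_list, y_list):
    """Return -H * grad, where H is the L-BFGS inverse-Hessian
    approximation built from pairs s_i = x_{i+1} - x_i and
    y_i = g_{i+1} - g_i (most recent pair last in the lists)."""
    q = list(grad)
    rhos = [1.0 / dot(y, s) for s, y in zip(s_list, y_list)]
    alphas = []
    # First loop: newest pair to oldest
    for s, y, rho in reversed(list(zip(s_list, y_list, rhos))):
        alpha = rho * dot(s, q)
        alphas.append(alpha)
        q = [qi - alpha * yi for qi, yi in zip(q, y)]
    # Scale by the usual initial inverse-Hessian estimate gamma * I
    if s_list:
        gamma = dot(s_list[-1], y_list[-1]) / dot(y_list[-1], y_list[-1])
    else:
        gamma = 1.0
    r = [gamma * qi for qi in q]
    # Second loop: oldest pair to newest
    for (s, y, rho), alpha in zip(zip(s_list, y_list, rhos),
                                  reversed(alphas)):
        beta = rho * dot(y, r)
        r = [ri + (alpha - beta) * si for ri, si in zip(r, s)]
    return [-ri for ri in r]
```

With one exact curvature pair from a quadratic with Hessian 2I, this recovers the exact Newton direction, which is the sanity check that makes the O(n)-storage idea plausible; whether the rank-one tensor terms of TenSolve admit a similar compact update is the open question.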

  • What are some applications of Machine Learning where an O(n^1.5) method would be acceptable?
  • What would be a good dataset on which to test CRF training using this method?


Resources:

  • Schnabel, Robert B. and Ta-Tung Chow (1991), "Tensor Methods for Unconstrained Optimization Using Second Derivatives," SIAM J. Optimization 1(3), 293-315
  • Bouaricha, Ali and Robert B. Schnabel (1997), "Algorithm 768: TENSOLVE: A Software Package for Solving Systems of Nonlinear Equations and Nonlinear Least-Squares Problems Using Tensor Methods," ACM Trans. Math. Softw. 23(2), 174-195
  • Greenblatt, "Tensor Methods for Full-Information Maximum Likelihood Estimation", PhD thesis, George Washington University
  • TenSolve (Implementation of Schnabel's tensor optimization method): http://www.netlib.org/toms/768

3 comments:

Amir massoud Farahmand said...

Thanks! This post was interesting.

Yaroslav said...

Thanks! It's good to know someone is actually reading my posts

Amir massoud Farahmand said...

Haha! Yes! Be sure that someone reads your weblog. At least, I do!
I think it would be worthwhile to analyze the interaction between a weblog and its readers. It is not that easy, though.
However, from my experience with non-technical weblog writing in Persian, I have found that, depending on the number of visitors, the ratio of readers who comment to the total number of readers is somewhere between 2% and 10%. This ratio, in my opinion, is monotonically increasing (!), as a more visited weblog has a higher percentage of commenters. I guess this percentage would be lower for academic weblogs.