This post is a sequel to the previous article on using old Fortran code to solve optimization problems in C++ applications. This time we consider the L-BFGS-B algorithm for solving smooth, box-constrained optimization problems of the form
$$
\begin{align*}
\min_{x}\quad & f(x)\\
\text{subject to}\quad & l\le x\le u,
\end{align*}
$$
where $l$ and $u$ are element-wise bounds on $x\in\mathbb{R}^n$ whose components may take the values $-\infty$ and $+\infty$.
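For a quick, concrete picture of this problem class (not the C++ interface discussed in this post), note that the same Fortran L-BFGS-B routine is also exposed through SciPy, so a small box-constrained problem can be sketched as follows:

```python
# A minimal sketch: minimize the 2-D Rosenbrock function under simple bounds.
# SciPy's "L-BFGS-B" method wraps the same Fortran routine; this Python example
# only illustrates the problem form, not the C++ wrapper described in the post.
import numpy as np
from scipy.optimize import minimize

def f(x):
    return (1.0 - x[0])**2 + 100.0 * (x[1] - x[0]**2)**2

def grad(x):
    return np.array([
        -2.0 * (1.0 - x[0]) - 400.0 * x[0] * (x[1] - x[0]**2),
        200.0 * (x[1] - x[0]**2),
    ])

# l <= x <= u, with np.inf encoding an unbounded side
bounds = [(0.0, 0.8), (-np.inf, 1.5)]
res = minimize(f, x0=np.zeros(2), jac=grad, method="L-BFGS-B", bounds=bounds)
print(res.x, res.fun)
```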
L-BFGS is a well-known and widely used algorithm for smooth, unconstrained optimization. It was originally implemented in Fortran, and more recent implementations exist as well, including libLBFGS and my own LBFGS++.
The Fortran code was written more than 30 years ago, and looks a bit exotic from today’s perspective. However, it is still one of the most stable and mature implementations of the L-BFGS algorithm, and is typically used as a baseline in testing and benchmarking.
Motivation

Deep learning frameworks such as PyTorch and TensorFlow provide excellent auto-differentiation support for matrices and vectors. They include many built-in functions and operators that can be combined to create complicated yet auto-differentiable functions. However, in some cases we prefer to define the gradient of a function manually instead of relying on automatic differentiation, while still allowing this function to be embedded in a larger program that has end-to-end auto-differentiation support.
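As a minimal sketch of this pattern (a generic toy function, not the one considered in this post), PyTorch lets us wrap a hand-written backward pass in a torch.autograd.Function, which then composes with the rest of an auto-differentiated program:

```python
import torch

class ManualExp(torch.autograd.Function):
    """Computes exp(x) with a manually defined gradient instead of autodiff."""

    @staticmethod
    def forward(ctx, x):
        y = torch.exp(x)
        ctx.save_for_backward(y)   # cache the forward output for the backward pass
        return y

    @staticmethod
    def backward(ctx, grad_output):
        (y,) = ctx.saved_tensors
        return grad_output * y     # d exp(x) / dx = exp(x)

x = torch.randn(3, requires_grad=True)
# The custom op participates in a larger expression with end-to-end autodiff
loss = (ManualExp.apply(x) * x).sum()
loss.backward()
print(x.grad)
```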
Recently I was reading articles and books about MCMC, and realized that much of the material was not taught in my graduate study. To this end, I decided to write up a summary of such content, to help readers and myself gain a deeper understanding of MCMC in the future. I hope to make this topic a series, although I cannot guarantee its completion. This article is the first one in this hypothetical series, and it introduces an important concept: geometric ergodicity.
Per the suggestion by @robmaz, RSpectra::svds() now has two new parameters, center and scale, which support implicit centering and scaling of matrices in partial SVD. This feature requires RSpectra >= 0.16-0.
These two parameters are very useful for principal component analysis (PCA) based on the covariance or correlation matrix, without explicitly forming those matrices. Below we simulate a random data matrix, and use both R’s built-in prcomp() and the svds() function in RSpectra to compute PCA.
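The identity behind this (standard linear algebra, spelled out here for clarity) is that PCA on the correlation matrix is just a truncated SVD of the standardized data, so centering and scaling only need to be applied implicitly inside the matrix-vector products:
$$
X_s = (X - \mathbf{1}\bar{x}^\top)D^{-1}, \qquad
X_s = U\Sigma V^\top
\;\Longrightarrow\;
R = \frac{1}{n-1}X_s^\top X_s = V\,\frac{\Sigma^2}{n-1}\,V^\top,
$$
where $\bar{x}$ is the vector of column means and $D$ is the diagonal matrix of column standard deviations. The component variances are therefore $\sigma_i^2/(n-1)$ and the loadings are the columns of $V$; taking $D=I$ gives PCA on the covariance matrix instead.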
A Quick View of Recommender Systems

The main task of a recommender system is to predict the unknown entries in the rating matrix based on the observed values, as shown in the table below:
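(A toy example with illustrative values; the structure, not the particular numbers, is what matters.)

|        | Item 1 | Item 2 | Item 3 | Item 4 |
|--------|--------|--------|--------|--------|
| User 1 | 5      | ?      | 3      | ?      |
| User 2 | ?      | 4      | ?      | 1      |
| User 3 | 2      | ?      | ?      | 5      |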
Each cell with a number in it is the rating given by a user to a specific item, while the cells marked with question marks are unknown ratings that need to be predicted. In the literature, this problem is also referred to as collaborative filtering, matrix completion, matrix recovery, etc.
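One common way to formalize the task (a standard low-rank factorization objective, stated here for illustration rather than as the exact model used in this post) is
$$
\min_{P\in\mathbb{R}^{m\times k},\;Q\in\mathbb{R}^{n\times k}}\;
\sum_{(u,i)\in\Omega}\left(r_{ui} - p_u^\top q_i\right)^2
+ \lambda\left(\|P\|_F^2 + \|Q\|_F^2\right),
$$
where $\Omega$ is the set of observed ratings, $p_u$ and $q_i$ are the latent factors of user $u$ and item $i$, and the predicted rating for an unobserved pair is $\hat{r}_{ui} = p_u^\top q_i$.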
In January 2016, I was honored to receive an “Honorable Mention” in the 2016 John Chambers Award. This article was written for R-bloggers, whose founder, Tal Galili, kindly invited me to write an introduction to the rARPACK package.
A Short Story of rARPACK

Eigenvalue decomposition is a commonly used technique in numerous statistical problems. For example, principal component analysis (PCA) essentially conducts an eigenvalue decomposition of the sample covariance matrix of the data: the eigenvalues are the component variances, and the eigenvectors are the variable loadings.
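A rough Python analogue of this idea (an illustration with NumPy/SciPy, not the R code based on rARPACK) computes only the top few eigenpairs of the sample covariance matrix with an iterative ARPACK-style solver, instead of a full decomposition:

```python
import numpy as np
from scipy.sparse.linalg import eigsh

rng = np.random.default_rng(0)
x = rng.standard_normal((1000, 50))      # 1000 observations, 50 variables
xc = x - x.mean(axis=0)                  # center the columns
cov = xc.T @ xc / (x.shape[0] - 1)       # sample covariance matrix

vals, vecs = eigsh(cov, k=5, which="LM") # top 5 eigenpairs via ARPACK
order = np.argsort(vals)[::-1]           # sort from largest to smallest
variances = vals[order]                  # component variances
loadings = vecs[:, order]                # variable loadings
scores = xc @ loadings                   # principal component scores
print(variances)
```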