<p>Big data spread over geographically distributed devices necessitates the development of decentralized optimization. Although first-order methods enjoy low per-iteration computational complexity, second-order methods are attractive for their faster convergence rates. Motivated by this, we aim to propose decentralized second-order algorithms that inherit the fast convergence of the single-machine setting while avoiding high communication costs.
In the first work, we propose a Newton tracking algorithm, in which no Hessian matrices are exchanged over the network. In the single-machine setting, the Newton method enjoys a theoretically faster convergence rate than first-order methods; however, developing a communication-efficient decentralized variant of the Newton method with a condition-number-independent or super-linear rate is non-trivial. In the second work, we fill this gap by proposing a decentralized Newton method and establishing a theoretically faster rate than that of first-order methods. In the third work, we move on to the stochastic setting, where each node holds so many samples that computing full local gradients is unaffordable. We develop a general algorithmic framework that incorporates stochastic quasi-Newton approximations with variance reduction, and then specify two fully decentralized variants.</p>
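<p>To make the first contribution concrete, below is a minimal sketch of a Newton-tracking-style recursion; the mixing weights, step size, and exact correction term are our assumptions for illustration, not necessarily the thesis's actual update. Each node i inverts only its own local Hessian and exchanges d-dimensional vectors with its neighbors.</p>
<pre>
% A minimal Newton-tracking-style sketch (illustrative notation, not
% necessarily the thesis's exact recursion): W = [w_{ij}] is a mixing
% matrix over the network, \alpha a step size, f_i the local objective
% at node i, and \mathcal{N}_i the neighbors of node i.
\begin{align*}
  x_i^{k+1} &= \sum_{j \in \mathcal{N}_i \cup \{i\}} w_{ij}\, x_j^{k} - \alpha\, d_i^{k},\\
  d_i^{k+1} &= \sum_{j \in \mathcal{N}_i \cup \{i\}} w_{ij}\, d_j^{k}
             + \big(\nabla^2 f_i(x_i^{k+1})\big)^{-1} \nabla f_i(x_i^{k+1})
             - \big(\nabla^2 f_i(x_i^{k})\big)^{-1} \nabla f_i(x_i^{k}).
\end{align*}
</pre>
<p>By a dynamic-average-consensus argument, the average of the directions d_i^k tracks the network-average Newton-type direction, so only d-dimensional vectors ever cross the network: per-iteration communication is O(d) rather than the O(d^2) cost of shipping Hessian matrices.</p>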
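<p>For the third contribution, the sketch below shows, under our own assumptions, one natural single-node way to combine an SVRG-style variance-reduced gradient with a limited-memory quasi-Newton (L-BFGS two-loop) direction; the function names and hyper-parameters here are hypothetical, and the thesis's actual framework additionally handles the fully decentralized setting.</p>
<pre>
import numpy as np

def two_loop(g, s_list, y_list):
    # L-BFGS two-loop recursion: apply an implicit inverse-Hessian
    # approximation, built from curvature pairs (s, y), to the gradient g.
    q = g.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):
        a = (s @ q) / (y @ s)
        alphas.append(a)
        q -= a * y
    if s_list:
        s, y = s_list[-1], y_list[-1]
        q *= (s @ y) / (y @ y)             # initial scaling H0 = gamma * I
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
        b = (y @ q) / (y @ s)
        q += (a - b) * s
    return q

def vr_sqn(grad, x0, n, epochs=10, inner=50, batch=8, lr=0.5, mem=10, seed=0):
    # grad(x, idx): average gradient over the local samples indexed by idx.
    rng = np.random.default_rng(seed)
    x = x0.copy()
    s_list, y_list = [], []
    for _ in range(epochs):
        snap = x.copy()
        full = grad(snap, np.arange(n))    # full local gradient at snapshot
        for _ in range(inner):
            idx = rng.integers(0, n, size=batch)
            # SVRG-style variance-reduced gradient estimate
            g = grad(x, idx) - grad(snap, idx) + full
            x_new = x - lr * two_loop(g, s_list, y_list)
            # curvature pair built from the SAME mini-batch at both points,
            # a standard trick to keep stochastic quasi-Newton updates stable
            s = x_new - x
            y = grad(x_new, idx) - grad(x, idx)
            if s @ y > 1e-10:              # skip unsafe curvature pairs
                s_list.append(s); y_list.append(y)
                if len(s_list) > mem:
                    s_list.pop(0); y_list.pop(0)
            x = x_new
    return x

if __name__ == "__main__":
    # toy least-squares usage example: true solution is the all-ones vector
    A = np.random.default_rng(1).standard_normal((200, 5))
    b = A @ np.ones(5)
    grad = lambda x, idx: A[idx].T @ (A[idx] @ x - b[idx]) / len(idx)
    print(vr_sqn(grad, np.zeros(5), n=200))
</pre>
<p>In a decentralized deployment, each node would run such local updates and interleave them with neighbor averaging of the iterates; that consensus layer is omitted from this single-node sketch.</p>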