Low-Rank NMF vs Classical NMF

We will compare the performance of Low-Rank NMF and Classical NMF across several data settings. For each setting, we will compare rank-1 and rank-10 updates ($k = 1, 10$) and 1, 5, and 10 subiterations.
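The full sweep over these choices forms a small Cartesian grid. A minimal sketch of that grid is below; the names `ranks` and `subiterations` are illustrative, and the actual configurations come from `timing_experiment.full_settings`.

```python
from itertools import product

# Illustrative sweep over the settings described above; the real
# configurations are produced by timing_experiment.full_settings.
ranks = [1, 10]            # rank-k update size
subiterations = [1, 5, 10]  # subiterations per outer iteration

settings = list(product(ranks, subiterations))
# → [(1, 1), (1, 5), (1, 10), (10, 1), (10, 5), (10, 10)]
```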

Some smaller changes since last time:

  • We replaced the Hessian step-size trace calculations with equivalent but faster Frobenius norm calculations.
  • We moved to relative error ($\|X - WH\|_F^2 / \|X\|_F^2$) rather than absolute error.
  • We improved the plotting code significantly and added the interpolated plots suggested by Hanbaek (shown in Section 2).
  • Initialization scale continues to be important. On closer inspection, the sklearn implementations also deal with this (rescaling the data changes their performance). For now, we continue to use uniform $[0,1]$ initialization, which works well without oscillation or divergence.
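Two of the changes above are easy to illustrate numerically. The sketch below, using NumPy with hypothetical matrix shapes, shows (1) that a trace-form squared norm equals its Frobenius-norm form, which is why the swap is an equivalence rather than an approximation, and (2) the relative-error formula used in the plots.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((50, 40))   # data matrix (shapes are illustrative)
W = rng.random((50, 10))   # factor matrices for a rank-10 factorization
H = rng.random((10, 40))

# trace(A^T A) == ||A||_F^2, so trace-based step-size terms can be
# replaced by cheaper Frobenius norm computations.
A = X - W @ H
trace_form = np.trace(A.T @ A)
frobenius_form = np.linalg.norm(A, "fro") ** 2
assert np.isclose(trace_form, frobenius_form)

# Relative error: ||X - WH||_F^2 / ||X||_F^2.
rel_err = frobenius_form / np.linalg.norm(X, "fro") ** 2
```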
In [2]:
# Render matplotlib figures inline and set up Plotly's notebook renderer.
import matplotlib
%matplotlib inline
import matplotlib.pyplot as plt
from timing_experiment import full_settings, make_loss_plot, make_scatterplot, make_interpolated_timing_plot
import plotly
import plotly.io as pio
pio.renderers.default = "notebook"
plotly.offline.init_notebook_mode()