GCP-OPT Examples with Amino Acids Dataset
Contents

Setup
CP-ALS
GCP with Gaussian
GCP with Gaussian and Missing Data
GCP with ADAM
GCP with Gamma (terrible!)
GCP with Huber + Lower Bound
GCP with Beta
Setup
We use the well-known amino acids dataset for these tests. The data contains some negative values, but the factorization itself should be nonnegative.
% Load the data
load(fullfile(getfield(what('tensor_toolbox'),'path'),'doc','aminoacids.mat'))

clear M fit

vizopts = {'PlotCommands',{@bar,@(x,y) plot(x,y,'r'),@(x,y) plot(x,y,'g')},...
    'BottomSpace',0.1, 'HorzSpace', 0.04, 'Normalize', @(x) normalize(x,'sort',2)};
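It is easy to confirm the claim about negative values. A quick sanity check (not part of the original script), using the tensor X loaded above:

% Count the negative entries in X and find the most negative value
fprintf('Negative entries: %d of %d\n', sum(X(:) < 0), numel(X(:)));
fprintf('Most negative value: %g\n', min(X(:)));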
CP-ALS
As a baseline, here is a reminder of what standard CP-ALS produces on this data.
cnt = 1;

tic, M{cnt} = cp_als(X,3,'printitn',10); toc

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
CP_ALS:
 Iter 10: f = 8.049060e-01 f-delta = 2.6e-03
 Iter 20: f = 8.394980e-01 f-delta = 4.2e-03
 Iter 30: f = 9.296736e-01 f-delta = 1.4e-02
 Iter 40: f = 9.610986e-01 f-delta = 1.5e-03
 Iter 50: f = 9.715672e-01 f-delta = 6.0e-04
 Final f = 9.715672e-01
Elapsed time is 0.665246 seconds.
Fit: 0.971567
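The result M{1} is a ktensor (a weighted sum of rank-one components). A small sketch inspecting it, assuming the standard ktensor fields lambda and u:

% Inspect the fitted ktensor: component weights and a factor matrix size
disp(M{1}.lambda)      % weights of the three components
disp(size(M{1}.u{1}))  % first factor matrix is 5 x 3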

GCP with Gaussian
We can instead call GCP with the Gaussian loss function, which corresponds to the same sum-of-squared-errors objective that CP-ALS minimizes.
cnt = 2;

M{cnt} = gcp_opt(X,3,'type','Gaussian','printitn',10);

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
GCP-OPT-LBFGSB (Generalized CP Tensor Decomposition)

Tensor size: 5 x 201 x 61 (61305 total entries)
Generalized function Type: Gaussian
Objective function: @(x,m)(m-x).^2
Gradient function: @(x,m)2.*(m-x)
Lower bound of factor matrices: -Inf
Optimization method: lbfgsb
Max iterations: 1000
Projected gradient tolerance: 6.131

Begin Main loop
Iter 10, f(x) = 3.695406e+08, ||grad||_infty = 6.45e+06, 4.64e-01
Iter 20, f(x) = 2.984033e+07, ||grad||_infty = 1.36e+06, 6.67e-01
Iter 30, f(x) = 5.556832e+06, ||grad||_infty = 7.77e+05, 8.06e-01
Iter 40, f(x) = 1.471521e+06, ||grad||_infty = 1.71e+04, 9.64e-01
Iter 50, f(x) = 1.445878e+06, ||grad||_infty = 7.40e+03, 1.08e+00
Iter 60, f(x) = 1.445124e+06, ||grad||_infty = 2.24e+03, 1.19e+00
Iter 70, f(x) = 1.445110e+06, ||grad||_infty = 3.69e+01, 1.32e+00
Iter 77, f(x) = 1.445110e+06, ||grad||_infty = 3.18e+01, 1.44e+00
End Main Loop

Final objective: 1.4451e+06
Setup time: 0.07 seconds
Main loop time: 1.44 seconds
Outer iterations: 77
Total iterations: 165
L-BFGS-B Exit message: CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH.
Fit: 0.974951
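Since the Gaussian loss matches the CP-ALS objective, the two solutions should agree up to permutation and scaling of the components. A sketch comparing them, assuming the ktensor score method (factor match score, 1.0 = perfect match):

% Compare the GCP-Gaussian model to the CP-ALS model
fms = score(M{2}, M{1});  % first output is the best factor match score
fprintf('Factor match score vs. CP-ALS: %g\n', fms);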

GCP with Gaussian and Missing Data
What if some data is missing?
cnt = 3;

% Proportion of missing data
p = 0.35;

% Create a mask with the missing entries set to 0 and everything else 1
W = tensor(double(rand(size(X))>p));

% Fit the model, using the 'mask' option
M{cnt} = gcp_opt(X.*W,3,'type','Gaussian','mask',W,'printitn',10);

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
GCP-OPT-LBFGSB (Generalized CP Tensor Decomposition)

Tensor size: 5 x 201 x 61 (61305 total entries)
Missing entries: 21547 (35%)
Generalized function Type: Gaussian
Objective function: @(x,m)(m-x).^2
Gradient function: @(x,m)2.*(m-x)
Lower bound of factor matrices: -Inf
Optimization method: lbfgsb
Max iterations: 1000
Projected gradient tolerance: 6.131

Begin Main loop
Iter 10, f(x) = 6.661433e+07, ||grad||_infty = 3.32e+06, 7.84e-02
Iter 20, f(x) = 1.041029e+06, ||grad||_infty = 4.21e+04, 1.52e-01
Iter 30, f(x) = 9.230744e+05, ||grad||_infty = 4.57e+04, 2.06e-01
Iter 40, f(x) = 9.198762e+05, ||grad||_infty = 3.99e+02, 2.52e-01
Iter 50, f(x) = 9.198701e+05, ||grad||_infty = 7.03e+01, 2.99e-01
Iter 60, f(x) = 9.198699e+05, ||grad||_infty = 5.57e+01, 3.51e-01
Iter 61, f(x) = 9.198699e+05, ||grad||_infty = 7.81e+00, 3.57e-01
End Main Loop

Final objective: 9.1987e+05
Setup time: 0.01 seconds
Main loop time: 0.36 seconds
Outer iterations: 61
Total iterations: 131
L-BFGS-B Exit message: CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH.
Fit: 0.974846
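Note that the fit above is computed against the full tensor X, so it implicitly measures how well the model recovers the hidden entries. To look at the held-out entries alone, a sketch (not in the original) using the mask W from above (1 = observed, 0 = missing):

% Relative error restricted to the held-out (missing) entries
Wc = tensor(1 - double(W));                    % complement of the mask
err_missing = norm(Wc.*(full(M{3}) - X)) / norm(Wc.*X);
fprintf('Relative error on missing entries: %g\n', err_missing);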

GCP with ADAM
We can also use stochastic gradient optimization via ADAM, though it is quite slow for such a small tensor.
cnt = 4;

% Specify 'opt' = 'adam'
M{cnt} = gcp_opt(X,3,'type','Gaussian','opt','adam','printitn',1,'fsamp',5000,'gsamp',500);

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
GCP-OPT-ADAM (Generalized CP Tensor Decomposition)

Tensor size: 5 x 201 x 61 (61305 total entries)
Generalized function Type: Gaussian
Objective function: @(x,m)(m-x).^2
Gradient function: @(x,m)2.*(m-x)
Lower bound of factor matrices: -Inf
Optimization method: adam
Max iterations (epochs): 1000
Iterations per epoch: 1000
Learning rate / decay / maxfails: 0.001 0.1 1
Function Sampler: uniform with 5000 samples
Gradient Sampler: uniform with 500 samples

Begin Main loop
Initial f-est: 2.339158e+09
Epoch  1: f-est = 1.535372e+09, step = 0.001
Epoch  2: f-est = 1.208430e+09, step = 0.001
Epoch  3: f-est = 1.015857e+09, step = 0.001
Epoch  4: f-est = 9.052066e+08, step = 0.001
Epoch  5: f-est = 8.390940e+08, step = 0.001
Epoch  6: f-est = 7.919432e+08, step = 0.001
Epoch  7: f-est = 7.464879e+08, step = 0.001
Epoch  8: f-est = 6.950998e+08, step = 0.001
Epoch  9: f-est = 6.292686e+08, step = 0.001
Epoch 10: f-est = 5.482254e+08, step = 0.001
Epoch 11: f-est = 4.613978e+08, step = 0.001
Epoch 12: f-est = 3.865036e+08, step = 0.001
Epoch 13: f-est = 3.324597e+08, step = 0.001
Epoch 14: f-est = 2.963748e+08, step = 0.001
Epoch 15: f-est = 2.686114e+08, step = 0.001
Epoch 16: f-est = 2.480659e+08, step = 0.001
Epoch 17: f-est = 2.339591e+08, step = 0.001
Epoch 18: f-est = 2.231691e+08, step = 0.001
Epoch 19: f-est = 2.153023e+08, step = 0.001
Epoch 20: f-est = 2.072950e+08, step = 0.001
Epoch 21: f-est = 1.999384e+08, step = 0.001
Epoch 22: f-est = 1.915436e+08, step = 0.001
Epoch 23: f-est = 1.823977e+08, step = 0.001
Epoch 24: f-est = 1.711630e+08, step = 0.001
Epoch 25: f-est = 1.592570e+08, step = 0.001
Epoch 26: f-est = 1.461925e+08, step = 0.001
Epoch 27: f-est = 1.314062e+08, step = 0.001
Epoch 28: f-est = 1.157353e+08, step = 0.001
Epoch 29: f-est = 9.992046e+07, step = 0.001
Epoch 30: f-est = 8.352893e+07, step = 0.001
Epoch 31: f-est = 6.778976e+07, step = 0.001
Epoch 32: f-est = 5.358143e+07, step = 0.001
Epoch 33: f-est = 4.072781e+07, step = 0.001
Epoch 34: f-est = 3.010716e+07, step = 0.001
Epoch 35: f-est = 2.158019e+07, step = 0.001
Epoch 36: f-est = 1.491409e+07, step = 0.001
Epoch 37: f-est = 9.884627e+06, step = 0.001
Epoch 38: f-est = 6.348294e+06, step = 0.001
Epoch 39: f-est = 3.958453e+06, step = 0.001
Epoch 40: f-est = 2.530503e+06, step = 0.001
Epoch 41: f-est = 1.816156e+06, step = 0.001
Epoch 42: f-est = 1.498712e+06, step = 0.001
Epoch 43: f-est = 1.412878e+06, step = 0.001
Epoch 44: f-est = 1.398484e+06, step = 0.001
Epoch 45: f-est = 1.398154e+06, step = 0.001
Epoch 46: f-est = 1.400793e+06, step = 0.001, nfails = 1 (resetting to solution from last epoch)
Epoch 47: f-est = 1.388885e+06, step = 0.0001
Epoch 48: f-est = 1.383505e+06, step = 0.0001
Epoch 49: f-est = 1.386106e+06, step = 0.0001, nfails = 2 (resetting to solution from last epoch)
End Main Loop

Final f-est: 1.3835e+06
Setup time: 0.11 seconds
Main loop time: 63.64 seconds
Total iterations: 49000
Fit: 0.974933
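The log shows ADAM's defaults: learning rate 0.001, decay 0.1, and one allowed failed epoch before the step is decayed. These can be tuned. A hypothetical variant, assuming gcp_opt accepts the option names 'rate', 'decay', 'maxfails', and 'epciters' (check the gcp_opt help for the exact names in your version):

% Hypothetical ADAM tuning sketch: larger initial step, gentler decay
M_alt = gcp_opt(X,3,'type','Gaussian','opt','adam','printitn',10,...
    'rate',1e-2,'decay',0.5,'maxfails',2,'epciters',500,...
    'fsamp',5000,'gsamp',500);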

GCP with Gamma (terrible!)
We can try the Gamma loss, but it is not the right distribution for this data and produces a terrible result. Since Gamma requires nonnegative data, we first zero out the negative entries of X.
cnt = 5;

% Zero out the negative entries to make a nonnegative copy of the data
Y = tensor(X(:) .* (X(:) > 0), size(X));

M{cnt} = gcp_opt(Y,3,'type','Gamma','printitn',25);

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
Warning: Using 'Gamma' type but tensor X is not nonnegative

GCP-OPT-LBFGSB (Generalized CP Tensor Decomposition)

Tensor size: 5 x 201 x 61 (61305 total entries)
Generalized function Type: Gamma
Objective function: @(x,m)x./(m+1e-10)+log(m+1e-10)
Gradient function: @(x,m)-x./((m+1e-10).^2)+1./(m+1e-10)
Lower bound of factor matrices: 0
Optimization method: lbfgsb
Max iterations: 1000
Projected gradient tolerance: 6.131

Begin Main loop
Iter 25, f(x) = 3.103239e+05, ||grad||_infty = 3.91e+03, 4.11e-01
Iter 50, f(x) = 3.099469e+05, ||grad||_infty = 7.42e+04, 5.81e-01
Iter 75, f(x) = 3.092352e+05, ||grad||_infty = 2.00e+05, 7.67e-01
Iter 100, f(x) = 3.089673e+05, ||grad||_infty = 3.88e+05, 9.09e-01
Iter 125, f(x) = 3.087874e+05, ||grad||_infty = 2.83e+05, 1.02e+00
Iter 150, f(x) = 3.085926e+05, ||grad||_infty = 3.79e+05, 1.16e+00
Iter 175, f(x) = 3.084644e+05, ||grad||_infty = 3.52e+05, 1.27e+00
Iter 200, f(x) = 3.080462e+05, ||grad||_infty = 2.41e+05, 1.43e+00
Iter 225, f(x) = 3.073277e+05, ||grad||_infty = 3.00e+05, 1.60e+00
Iter 250, f(x) = 3.070011e+05, ||grad||_infty = 3.96e+05, 1.77e+00
Iter 275, f(x) = 3.064863e+05, ||grad||_infty = 4.87e+05, 1.98e+00
Iter 300, f(x) = 3.060977e+05, ||grad||_infty = 5.91e+06, 2.17e+00
Iter 314, f(x) = 3.060824e+05, ||grad||_infty = 1.00e+07, 2.25e+00
End Main Loop

Final objective: 3.0608e+05
Setup time: 0.01 seconds
Main loop time: 2.25 seconds
Outer iterations: 314
Total iterations: 758
L-BFGS-B Exit message: CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH.
Fit: 0.294307

GCP with Huber + Lower Bound
Huber works well. By default, Huber has no lower bound. To add one, we have to pass in the function, gradient, and lower bound explicitly. We can use tt_gcp_fg_setup to get the function and gradient handles.
cnt = 6;

% Call helper function tt_gcp_fg_setup to get the function and gradient handles
[fh,gh] = tt_gcp_fg_setup('Huber (0.25)');

% Pass the func/grad/lower explicitly
M{cnt} = gcp_opt(X,3,'func',fh,'grad',gh,'lower',0,'printitn',25);

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
GCP-OPT-LBFGSB (Generalized CP Tensor Decomposition)

Tensor size: 5 x 201 x 61 (61305 total entries)
Generalized function Type: user-specified
Objective function: @(x,m)(x-m).^2.*(abs(x-m)<0.25)+(0.5.*abs(x-m)-0.0625).*(abs(x-m)>=0.25)
Gradient function: @(x,m)-2.*(x-m).*(abs(x-m)<0.25)-(0.5.*sign(x-m)).*(abs(x-m)>=0.25)
Lower bound of factor matrices: 0
Optimization method: lbfgsb
Max iterations: 1000
Projected gradient tolerance: 6.131

Begin Main loop
Iter 25, f(x) = 9.281439e+05, ||grad||_infty = 7.09e+03, 2.65e-01
Iter 50, f(x) = 6.810279e+05, ||grad||_infty = 3.15e+03, 4.14e-01
Iter 75, f(x) = 6.720943e+05, ||grad||_infty = 2.10e+03, 5.40e-01
Iter 100, f(x) = 6.628711e+05, ||grad||_infty = 3.35e+03, 6.75e-01
Iter 125, f(x) = 6.612254e+05, ||grad||_infty = 2.98e+03, 7.61e-01
Iter 150, f(x) = 6.611298e+05, ||grad||_infty = 3.09e+03, 8.40e-01
Iter 175, f(x) = 6.610727e+05, ||grad||_infty = 3.18e+03, 9.61e-01
Iter 200, f(x) = 6.609277e+05, ||grad||_infty = 3.21e+03, 1.05e+00
Iter 225, f(x) = 6.606686e+05, ||grad||_infty = 3.38e+03, 1.13e+00
Iter 250, f(x) = 6.605062e+05, ||grad||_infty = 3.51e+03, 1.21e+00
Iter 275, f(x) = 6.603939e+05, ||grad||_infty = 3.46e+03, 1.30e+00
Iter 300, f(x) = 6.602154e+05, ||grad||_infty = 2.53e+03, 1.38e+00
Iter 325, f(x) = 6.600801e+05, ||grad||_infty = 1.77e+03, 1.46e+00
Iter 350, f(x) = 6.600285e+05, ||grad||_infty = 1.82e+03, 1.57e+00
Iter 375, f(x) = 6.600083e+05, ||grad||_infty = 1.84e+03, 1.68e+00
Iter 400, f(x) = 6.599970e+05, ||grad||_infty = 1.86e+03, 1.84e+00
Iter 425, f(x) = 6.599888e+05, ||grad||_infty = 1.88e+03, 2.02e+00
Iter 450, f(x) = 6.599836e+05, ||grad||_infty = 1.88e+03, 2.16e+00
Iter 475, f(x) = 6.599804e+05, ||grad||_infty = 1.89e+03, 2.30e+00
Iter 500, f(x) = 6.599761e+05, ||grad||_infty = 1.90e+03, 2.44e+00
Iter 525, f(x) = 6.599682e+05, ||grad||_infty = 1.92e+03, 2.59e+00
Iter 550, f(x) = 6.599631e+05, ||grad||_infty = 1.92e+03, 2.76e+00
Iter 575, f(x) = 6.599465e+05, ||grad||_infty = 1.99e+03, 2.86e+00
Iter 600, f(x) = 6.599187e+05, ||grad||_infty = 1.60e+03, 2.98e+00
Iter 625, f(x) = 6.599049e+05, ||grad||_infty = 1.22e+03, 3.09e+00
Iter 650, f(x) = 6.598981e+05, ||grad||_infty = 1.19e+03, 3.17e+00
Iter 675, f(x) = 6.598898e+05, ||grad||_infty = 1.17e+03, 3.26e+00
Iter 700, f(x) = 6.598859e+05, ||grad||_infty = 1.19e+03, 3.34e+00
Iter 725, f(x) = 6.598849e+05, ||grad||_infty = 1.20e+03, 3.44e+00
Iter 750, f(x) = 6.598831e+05, ||grad||_infty = 1.21e+03, 3.54e+00
Iter 775, f(x) = 6.598821e+05, ||grad||_infty = 1.23e+03, 3.66e+00
Iter 800, f(x) = 6.598819e+05, ||grad||_infty = 1.24e+03, 3.77e+00
Iter 825, f(x) = 6.598812e+05, ||grad||_infty = 1.24e+03, 3.88e+00
Iter 844, f(x) = 6.598810e+05, ||grad||_infty = 1.27e+03, 3.99e+00
End Main Loop

Final objective: 6.5988e+05
Setup time: 0.01 seconds
Main loop time: 3.99 seconds
Outer iterations: 844
Total iterations: 1708
L-BFGS-B Exit message: CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL.
Fit: 0.573896
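The handles from tt_gcp_fg_setup are elementwise functions of the data value x and the model value m, as the log above shows. A quick sanity-check sketch (not in the original) evaluating them at a few points:

% Evaluate the Huber (0.25) objective and gradient elementwise at m = 0
xs = [-1; -0.1; 0; 0.1; 1];      % sample data values
ms = zeros(size(xs));            % model values
disp([xs fh(xs,ms) gh(xs,ms)])   % columns: x, f(x,0), g(x,0)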

GCP with Beta
This is also pretty bad, which illustrates the cost of choosing the wrong distribution. The model captures a little of the structure, but it is clearly the wrong objective.
cnt = 7;

M{cnt} = gcp_opt(X,3,'type','beta (0.75)','printitn',25);

fit(cnt) = 1 - norm(full(M{cnt})-X)/norm(X);
fprintf('Fit: %g\n', fit(cnt));

viz(M{cnt},'Figure',cnt,vizopts{:});
Warning: Using 'beta' type but tensor X is not nonnegative

GCP-OPT-LBFGSB (Generalized CP Tensor Decomposition)

Tensor size: 5 x 201 x 61 (61305 total entries)
Generalized function Type: beta (0.75)
Objective function: @(x,m)(1.33333).*(m+1e-10).^(0.75)-(-4).*x.*(m+1e-10).^(-0.25)
Gradient function: @(x,m)(m+1e-10).^(-0.25)-x.*(m+1e-10).^(-1.25)
Lower bound of factor matrices: 0
Optimization method: lbfgsb
Max iterations: 1000
Projected gradient tolerance: 6.131

Begin Main loop
Iter 25, f(x) = 9.675506e+06, ||grad||_infty = 1.81e+06, 1.36e+00
Iter 50, f(x) = 9.672284e+06, ||grad||_infty = 4.27e+06, 2.17e+00
Iter 75, f(x) = 9.667505e+06, ||grad||_infty = 3.33e+07, 2.84e+00
End Main Loop

Final objective: 9.6675e+06
Setup time: 0.02 seconds
Main loop time: 2.84 seconds
Outer iterations: 75
Total iterations: 187
L-BFGS-B Exit message: CONVERGENCE: REL_REDUCTION_OF_F_<=_FACTR*EPSMCH.
Fit: 0.719253
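To wrap up, a short summary sketch (not part of the original script) that prints the fits collected above so the loss functions can be compared at a glance:

% Summarize the fit of all seven models
names = {'CP-ALS','Gaussian','Gaussian+missing','ADAM (Gaussian)','Gamma','Huber','Beta'};
for k = 1:numel(fit)
    fprintf('%-18s fit = %.4f\n', names{k}, fit(k));
end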
