KPLS benchmark on Griewank function #337

Open

AlexandrePlot44 opened this issue Nov 30, 2021 · 4 comments

@AlexandrePlot44

Hello,

I am currently working on KPLS techniques as part of my thesis. I am trying to reproduce the results established in the following article: https://hal.archives-ouvertes.fr/hal-01232938/document. The article focuses in part on applying the KPLS model to the Griewank function while varying the input range, the number of inputs, and the number of learning points.

I wrote a script to test KPLS under the same conditions as those defined in the article. Here is my code:

import time

import numpy as np
from smt.sampling_methods import LHS, Random
from smt.surrogate_models import KRG, KPLS

def griewank_function(x):
    """Griewank function: multimodal, symmetric, inseparable."""
    y = np.zeros(x.shape[0])
    for j in range(x.shape[0]):
        x_j = x[j, :]
        partA = 0.0
        partB = 1.0
        for i in range(x.shape[1]):
            partA += x_j[i] ** 2
            partB *= np.cos(x_j[i] / np.sqrt(i + 1))
        y[j] = 1 + (partA / 4000.0) - partB
    return y
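
As a side note, the same function can be written in vectorized NumPy (a minimal sketch equivalent to the loop version above; griewank_function_vec is just an illustrative name):

def griewank_function_vec(x):
    """Vectorized equivalent of griewank_function; x has shape (n_points, dim)."""
    i = np.arange(1, x.shape[1] + 1)                  # 1-based dimension indices
    part_a = np.sum(x ** 2, axis=1) / 4000.0          # quadratic term
    part_b = np.prod(np.cos(x / np.sqrt(i)), axis=1)  # product of cosines
    return 1.0 + part_a - part_b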

def calculate_error(sm, X_test, Y_test):
    """Relative RMS error (%) of the surrogate on the test set."""
    Y_predicted = sm.predict_values(X_test)
    err_rel = 100 * np.sqrt(np.sum(np.square(Y_predicted[:, 0] - Y_test)) / np.sum(np.square(Y_test)))
    return err_rel

# Parameters
X_inf = -5.0
X_sup = 5.0
dim = 20
num = 300          # number of learning points
num_test = 5000    # number of test points
cri = 'ese'        # LHS criterion: 'ese' or 'c' (center)
X_lim = np.tile(np.array([X_inf, X_sup]), (dim, 1))

# Test points (to be commented out after the first run so the same test set is reused)
sampling = Random(xlimits=X_lim)
Xt = sampling(num_test)
Yt = griewank_function(Xt)

# Initializing error arrays
err_krg = np.array([])
err_kpls1 = np.array([])
err_kpls2 = np.array([])
err_kpls3 = np.array([])

# Loop over 10 independent DOEs for computing mean and sigma
for j in range(10):
    sampling = LHS(xlimits=X_lim, criterion=cri)
    X = sampling(num)
    Y = griewank_function(X)
    print(Y.shape)
    
    # Initializing surrogate models
    sm_krg = KRG(print_prediction=False, corr='squar_exp', n_start=10,
                 theta_bounds=[1e-06, 20.0], theta0=[1e-2])
    sm_kpls1 = KPLS(n_comp=1, print_prediction=False, corr='squar_exp', n_start=10,
                    theta_bounds=[1e-06, 20.0], theta0=[1e-2])
    sm_kpls2 = KPLS(n_comp=2, print_prediction=False, corr='squar_exp', n_start=10,
                    theta_bounds=[1e-06, 20.0], theta0=[1e-2])
    sm_kpls3 = KPLS(n_comp=3, print_prediction=False, corr='squar_exp', n_start=10,
                    theta_bounds=[1e-06, 20.0], theta0=[1e-2])

    # Training the surrogate models, timing each fit
    sm_krg.set_training_values(X, Y)
    start = time.time()
    sm_krg.train()
    time_krg = time.time()-start

    sm_kpls1.set_training_values(X, Y)
    start = time.time()
    sm_kpls1.train()
    time_kpls1 = time.time() - start

    sm_kpls2.set_training_values(X, Y)
    start = time.time()
    sm_kpls2.train()
    time_kpls2 = time.time() - start

    sm_kpls3.set_training_values(X, Y)
    start = time.time()
    sm_kpls3.train()
    time_kpls3 = time.time() - start

    # Calculating errors on the test set
    err_rel_krg = calculate_error(sm_krg, Xt, Yt)
    err_rel_kpls1 = calculate_error(sm_kpls1, Xt, Yt)
    err_rel_kpls2 = calculate_error(sm_kpls2, Xt, Yt)
    err_rel_kpls3 = calculate_error(sm_kpls3, Xt, Yt)

    err_krg = np.append(err_krg, err_rel_krg)
    err_kpls1 = np.append(err_kpls1, err_rel_kpls1)
    err_kpls2 = np.append(err_kpls2, err_rel_kpls2)
    err_kpls3 = np.append(err_kpls3, err_rel_kpls3)

print("error krg", np.mean(err_krg), "; sigma", np.std(err_krg), "; time krg", time_krg)
print("error kpls1", np.mean(err_kpls1), "; sigma", np.std(err_kpls1), "; time kpls1", time_kpls1)
print("error kpls2", np.mean(err_kpls2), "; sigma", np.std(err_kpls2), "; time kpls2", time_kpls2)
print("error kpls3", np.mean(err_kpls3), "; sigma", np.std(err_kpls3), "; time kpls3", time_kpls3)

I am using the exact same error definition as the one in the article, computed over 5000 random test points. The correlation function is Gaussian (squared exponential).
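
Concretely, the error computed by calculate_error above is, in LaTeX notation,

\mathrm{err}_{\%} = 100 \sqrt{\frac{\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2}{\sum_{i=1}^{N} y_i^2}}, \qquad N = 5000,

where \hat{y}_i is the surrogate prediction and y_i the true Griewank value at the i-th test point.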

Here are the results I get (the array on the left shows my results, the array on the right the results from the article):

[Image: comparison table of my results vs. the article's results]

My first observation is that the error depends strongly on the DOE used for learning; that is, if I generate a new DOE for the same case, I do not get the same results.
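
If it helps with reproducing this, one way to make successive runs comparable is to fix the DOE seed (a minimal sketch; the random_state argument is an assumption here, so check that your SMT version supports it):

# Sketch: fix the LHS seed so a run can be repeated on an identical DOE.
# random_state support depends on the SMT version installed.
sampling = LHS(xlimits=X_lim, criterion=cri, random_state=42)
X = sampling(num)
Y = griewank_function(X)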

In case 3, you can see that I used two different samples, one with the ESE optimization criterion and one with the center criterion, which leads to very different results. Furthermore, using KPLS with 2 or 3 components is supposed to lead to a very small error (according to the array on the right).

What could be the cause of such a difference?

I also did some tests with the [-5, 5] input range. Do not hesitate to ask me for more details.
Thank you in advance,

Alexandre

@relf
Member

relf commented Mar 7, 2022

Hi, sorry for the late answer. I can't reproduce the results of the paper with the latest version either. As you may have guessed, it is not a high priority, but it should definitely be investigated by pulling older versions to pin down the change responsible for this.
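
A minimal sketch of that investigation, assuming the benchmark above is saved as benchmark_kpls.py (a hypothetical file name) and with the version strings as illustrative examples of older SMT releases:

import subprocess
import sys

# Install successive SMT releases from PyPI and re-run the benchmark;
# the first version whose errors degrade brackets the responsible change.
for version in ["0.3.4", "0.4.3", "0.5.4", "0.9.0"]:
    subprocess.run([sys.executable, "-m", "pip", "install", f"smt=={version}"], check=True)
    subprocess.run([sys.executable, "benchmark_kpls.py"], check=True)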

@plantom007

I have encountered the same problem. Have you solved it?

@AlexandrePlot44
Author

Hi, no, I didn't go any further.

@plantom007

Thanks. I changed the sampling method, which also leads to large changes in precision. Could you post the article link again? It won't open for me.
