KPLS benchmark on Griewank function #337

Open

AlexandrePlot44 opened this issue Nov 30, 2021 · 4 comments

@AlexandrePlot44

Hello,

I am currently working on KPLS techniques as part of my thesis. I am trying to reproduce the results established in the following article: https://hal.archives-ouvertes.fr/hal-01232938/document. The article focuses in part on applying the KPLS model to the Griewank function while varying the input range, the number of inputs, and the number of learning points.

I wrote a script to test KPLS under the same conditions as those defined in the article. Here is my code:

import time

import numpy as np
from smt.sampling_methods import LHS, Random
from smt.surrogate_models import KRG, KPLS

def griewank_function(x):
    """Griewank function: multimodal, symmetric, inseparable."""
    y = np.zeros(x.shape[0])
    for j in range(x.shape[0]):
        x_j = x[j, :]
        partA = 0.0
        partB = 1.0
        for i in range(x.shape[1]):
            partA += x_j[i] ** 2
            partB *= np.cos(x_j[i] / np.sqrt(i + 1))
        y[j] = 1 + (partA / 4000.0) - partB
    return y
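
As a side note, the same function can be written in vectorized NumPy (a minimal sketch equivalent to the loop version above; griewank_function_vec is just an illustrative name):

def griewank_function_vec(x):
    """Vectorized equivalent of griewank_function; x has shape (n_points, dim)."""
    i = np.arange(1, x.shape[1] + 1)                  # 1-based dimension indices
    part_a = np.sum(x ** 2, axis=1) / 4000.0          # quadratic term
    part_b = np.prod(np.cos(x / np.sqrt(i)), axis=1)  # product of cosines
    return 1.0 + part_a - part_b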

def calculate_error(sm, X_test, Y_test):
    """Relative RMS error (%) of the surrogate on the test set."""
    Y_predicted = sm.predict_values(X_test)
    err_rel = 100 * np.sqrt(np.sum(np.square(Y_predicted[:, 0] - Y_test)) / np.sum(np.square(Y_test)))
    return err_rel

# Parameters
X_inf = -5.0
X_sup = 5.0
dim = 20
num = 300          # number of learning points
num_test = 5000    # number of test points
cri = 'ese'        # LHS criterion: 'ese' or 'c' (center)
X_lim = np.tile(np.array([X_inf, X_sup]), (dim, 1))

# Test points (to be commented out after the first run so the same test set is reused)
sampling = Random(xlimits=X_lim)
Xt = sampling(num_test)
Yt = griewank_function(Xt)

# Initializing error arrays
err_krg = np.array([])
err_kpls1 = np.array([])
err_kpls2 = np.array([])
err_kpls3 = np.array([])

# Loop over 10 independent DOEs for computing mean and sigma
for j in range(10):
    sampling = LHS(xlimits=X_lim, criterion=cri)
    X = sampling(num)
    Y = griewank_function(X)
    print(Y.shape)
    
    # Initializing surrogate models
    sm_krg = KRG(print_prediction=False, corr='squar_exp', n_start=10,
                 theta_bounds=[1e-06, 20.0], theta0=[1e-2])
    sm_kpls1 = KPLS(n_comp=1, print_prediction=False, corr='squar_exp', n_start=10,
                    theta_bounds=[1e-06, 20.0], theta0=[1e-2])
    sm_kpls2 = KPLS(n_comp=2, print_prediction=False, corr='squar_exp', n_start=10,
                    theta_bounds=[1e-06, 20.0], theta0=[1e-2])
    sm_kpls3 = KPLS(n_comp=3, print_prediction=False, corr='squar_exp', n_start=10,
                    theta_bounds=[1e-06, 20.0], theta0=[1e-2])

    # Training the surrogate models, timing each fit
    sm_krg.set_training_values(X, Y)
    start = time.time()
    sm_krg.train()
    time_krg = time.time()-start

    sm_kpls1.set_training_values(X, Y)
    start = time.time()
    sm_kpls1.train()
    time_kpls1 = time.time() - start

    sm_kpls2.set_training_values(X, Y)
    start = time.time()
    sm_kpls2.train()
    time_kpls2 = time.time() - start

    sm_kpls3.set_training_values(X, Y)
    start = time.time()
    sm_kpls3.train()
    time_kpls3 = time.time() - start

    # Calculating errors on the test set
    err_rel_krg = calculate_error(sm_krg, Xt, Yt)
    err_rel_kpls1 = calculate_error(sm_kpls1, Xt, Yt)
    err_rel_kpls2 = calculate_error(sm_kpls2, Xt, Yt)
    err_rel_kpls3 = calculate_error(sm_kpls3, Xt, Yt)

    err_krg = np.append(err_krg, err_rel_krg)
    err_kpls1 = np.append(err_kpls1, err_rel_kpls1)
    err_kpls2 = np.append(err_kpls2, err_rel_kpls2)
    err_kpls3 = np.append(err_kpls3, err_rel_kpls3)

print("error krg", np.mean(err_krg), "; sigma", np.std(err_krg), "; time krg", time_krg)
print("error kpls1", np.mean(err_kpls1), "; sigma", np.std(err_kpls1), "; time kpls1", time_kpls1)
print("error kpls2", np.mean(err_kpls2), "; sigma", np.std(err_kpls2), "; time kpls2", time_kpls2)
print("error kpls3", np.mean(err_kpls3), "; sigma", np.std(err_kpls3), "; time kpls3", time_kpls3)

I am using the exact same error definition as the one in the article, computed over 5000 random test points. The correlation function is Gaussian (squared exponential).
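
Concretely, the error computed by calculate_error above is, in LaTeX notation,

\mathrm{err}_{\%} = 100 \sqrt{\frac{\sum_{i=1}^{N}\left(\hat{y}_i - y_i\right)^2}{\sum_{i=1}^{N} y_i^2}}, \qquad N = 5000,

where \hat{y}_i is the surrogate prediction and y_i the true Griewank value at the i-th test point.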

Here are the results I get (the array on the left shows my results, the array on the right the results from the article):

[Image: comparison table of my results vs. the article's results]

My first observation is that the error depends strongly on the DOE used for learning; that is, if I generate a new DOE for the same case, I do not get the same results.
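
If it helps with reproducing this, one way to make successive runs comparable is to fix the DOE seed (a minimal sketch; the random_state argument is an assumption here, so check that your SMT version supports it):

# Sketch: fix the LHS seed so a run can be repeated on an identical DOE.
# random_state support depends on the SMT version installed.
sampling = LHS(xlimits=X_lim, criterion=cri, random_state=42)
X = sampling(num)
Y = griewank_function(X)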

In case 3, you can see that I used two different samples, one with the ESE optimization criterion and one with the center criterion, which leads to very different results. Furthermore, using KPLS with 2 or 3 components is supposed to lead to a very small error (according to the array on the right).

What could be the cause of such a difference?

I also did some tests with the [-5, 5] input range. Do not hesitate to ask me for more details.
Thank you in advance,

Alexandre

@relf
Member

relf commented Mar 7, 2022

Hi, sorry for the late answer. I can't reproduce the results of the paper with the latest version either. As you may have guessed, it is not a high priority, but it should definitely be investigated by pulling older versions to pin down the change responsible for this.
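
A minimal sketch of that investigation, assuming the benchmark above is saved as benchmark_kpls.py (a hypothetical file name) and with the version strings as illustrative examples of older SMT releases:

import subprocess
import sys

# Install successive SMT releases from PyPI and re-run the benchmark;
# the first version whose errors degrade brackets the responsible change.
for version in ["0.3.4", "0.4.3", "0.5.4", "0.9.0"]:
    subprocess.run([sys.executable, "-m", "pip", "install", f"smt=={version}"], check=True)
    subprocess.run([sys.executable, "benchmark_kpls.py"], check=True)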

@plantom007

I have encountered the same problem. Have you solved it?

@AlexandrePlot44
Author

Hi, no, I didn't go any further.

@plantom007

Thanks. I changed the sampling method, which also leads to large changes in precision. Could you post the article link again? It won't open for me.
