DBNInference - Key Error #1744

Open
Rajaram1604 opened this issue Mar 15, 2024 · 4 comments

@Rajaram1604

Subject of the issue

Getting a KeyError while making an inference using DBNInference.

Your environment

  • pgmpy version - 0.1.24 (installed from the pgmpy dev branch)
  • Python version - 3.11
  • Operating System - Windows

Steps to reproduce

Fitting the DataFrame data below with a DynamicBayesianNetwork model.

import numpy as np
import pandas as pd

# Slice 0 data
data_t0 = pd.DataFrame({
    'CreditScore': np.random.randint(500, 800, size=(100, 1)).flatten(),
    'Income': np.random.randint(20000, 20100, size=(100,)),
    'LoanAmount': np.random.randint(15000, 15100, size=(100,)),
})

data_t0['LoanApproval'] = np.where((data_t0['CreditScore'] > 650) & (data_t0['Income'] > 20080) & (data_t0['LoanAmount'] < data_t0['Income']), 'Approved', 'Denied')

df_t0 = pd.DataFrame(data_t0, columns=data_t0.keys())

# Slice 1 data
data_t1 = pd.DataFrame({
    'CreditScore': np.random.randint(500, 900, size=(100, 1)).flatten(),
    'Income': np.random.randint(20000, 20200, size=(100,)),
    'LoanAmount': np.random.randint(15000, 15200, size=(100,)),
})

data_t1['LoanApproval'] = np.where((data_t1['CreditScore'] > 700) & (data_t1['Income'] > 20100) & (data_t1['LoanAmount'] < data_t1['Income']), 'Approved', 'Denied')

df_t1 = pd.DataFrame(data_t1, columns=data_t1.keys())

# Concatenate the two slice DataFrames column-wise
concat_df = pd.concat([df_t0, df_t1], axis=1, join='inner')

print(concat_df)

# Convert the DataFrame to a list of rows (renamed to avoid shadowing the built-in `list`)
rows = concat_df.values.tolist()
print(rows)

# Define the (variable, time-slice) column names expected by the DBN
col_names = [('CreditScore', 0), ('Income', 0), ('LoanAmount', 0), ('LoanApproval', 0), ('CreditScore', 1), ('Income', 1), ('LoanAmount', 1), ('LoanApproval', 1)]

final_df = pd.DataFrame(rows, columns=col_names)

print(final_df)

# Fit the data to the model (loan_dbn_model is the DBN defined in a comment below).
# Currently only the Maximum Likelihood Estimator is supported.
loan_dbn_model.fit(data=final_df, estimator="MLE")
loan_dbn_model.initialize_initial_state()

Expected behaviour

Making the inference with DBNInference on the model fitted with the data above, for example with evidence {("CreditScore", 0): 672}, I expected some results but got the error below. Code snippet:

from pgmpy.inference import DBNInference

dbn_inference = DBNInference(loan_dbn_model)
results = dbn_inference.query(variables=[("LoanApproval", 0)], evidence={("CreditScore", 0): 672})

Actual behaviour

Getting the error below while making the inference. However, when I make the inference with evidence={("CreditScore", 0): 0} it works. We know that the data is internally scaled down when fitting the model, but when making the inference we need to use the real data values that the model was fitted with.
results = dbn_inference.query(variables=[("LoanApproval", 0)], evidence={("CreditScore", 0): 672})
  File "C:\Python311\Lib\site-packages\pgmpy\inference\dbn_inference.py", line 475, in query
    return self.backward_inference(variables, evidence)
  File "C:\Python311\Lib\site-packages\pgmpy\inference\dbn_inference.py", line 385, in backward_inference
    potential_dict = self.forward_inference(variables, evidence, "potential")
  File "C:\Python311\Lib\site-packages\pgmpy\inference\dbn_inference.py", line 281, in forward_inference
    initial_factor = self._get_factor(start_bp, evidence_0)
  File "C:\Python311\Lib\site-packages\pgmpy\inference\dbn_inference.py", line 204, in _get_factor
    final_factor.reduce([(var, evidence[var])])
  File "C:\Python311\Lib\site-packages\pgmpy\factors\discrete\DiscreteFactor.py", line 570, in reduce
    phi.values = phi.values[tuple(slice_)]
IndexError: index 672 is out of bounds for axis 0 with size 81

@ankurankan
Member

@Rajaram1604 Could you please also add how the loan_dbn_model model is defined in the code, so that I can reproduce the issue?

@Rajaram1604
Author

Rajaram1604 commented Mar 18, 2024

Defining the DBN Model.

from pgmpy.models import DynamicBayesianNetwork as DBN

loan_dbn_model = DBN()

Add Edges

loan_dbn_model.add_edges_from([(('Income', 0), ('LoanAmount', 0)),
                               (('CreditScore', 0), ('LoanAmount', 0)),
                               (('LoanAmount', 0), ('LoanApproval', 0)),

                               (('Income', 1), ('LoanAmount', 1)),
                               (('CreditScore', 1), ('LoanAmount', 1)),
                               (('LoanAmount', 1), ('LoanApproval', 1)),

                               (('Income', 0), ('Income', 1)),
                               (('CreditScore', 0), ('CreditScore', 1)),
                               (('LoanAmount', 0), ('LoanAmount', 1)),
                               (('LoanApproval', 0), ('LoanApproval', 1))
                               ])

Making the inference

from pgmpy.inference import DBNInference

dbn_inference = DBNInference(loan_dbn_model)

evidence = {("CreditScore", 0): 672}
results = dbn_inference.query(variables=[("LoanApproval", 0)], evidence=evidence)

@ankurankan
Member

ankurankan commented Mar 19, 2024

@Rajaram1604 In the model, the variable CreditScore has only 78 states (named 0-77).

In [22]: print(loan_dbn_model.get_cpds(('CreditScore', 0)).state_names)
{<DynamicNode(CreditScore, 0) at 0x7a572e539450>: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77]}

And because the state specified in the evidence ({("CreditScore", 0): 672}) does not exist, inference throws an error.
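
For reference, a minimal way to see which evidence values the fitted model will accept is to read them off the CPD's state names and query with one of those. This is only a sketch, and it assumes the loan_dbn_model from this thread has already been fitted and initialized:

from pgmpy.inference import DBNInference

# Inspect the states the model actually learned for ('CreditScore', 0).
cpd = loan_dbn_model.get_cpds(('CreditScore', 0))
credit_states = cpd.state_names[cpd.variable]  # e.g. [0, 1, ..., 77]

# Querying with one of these learned states works; 672 is not among them,
# which is why reduce() fails with an out-of-bounds index.
dbn_inference = DBNInference(loan_dbn_model)
results = dbn_inference.query(
    variables=[("LoanApproval", 0)],
    evidence={("CreditScore", 0): credit_states[0]},
)
print(results)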

@Rajaram1604
Author

Rajaram1604 commented Mar 19, 2024

@ankurankan, yes, got it. We are generating random numbers between 500 and 800 for the CreditScore variable using numpy, and the model is trained on scaled-down values (the algorithm appears to scale down the state values). I mean that the states are internally converted into small values between 0 and 77 by the algorithm, presumably to reduce memory consumption.

But when making the inference for a particular piece of evidence, we cannot scale the evidence down ourselves, right? DBNInference should scale down the evidence used for the inference, but instead we end up with the key error.

In this case, could you please suggest how we can make the inference for this particular evidence?

Note: In the BayesianNetwork model the states are also scaled down during the training phase, but making the inference with real evidence such as {'CreditScore': 672} works fine there; I think the evidence value may also be scaled down internally, so the inference works correctly.
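
One possible workaround, sketched below purely as an illustration (the bin edges, labels, and the bin_credit helper are assumptions, not part of pgmpy; final_df and loan_dbn_model are the ones from this thread), is to discretize the raw values into a small set of named bins before fitting, and to map a raw evidence value such as 672 to its bin at query time, so the evidence always matches a state the model knows:

import pandas as pd
from pgmpy.inference import DBNInference

# Illustrative binning; the edges and labels are assumptions chosen only to show the idea.
credit_bins = [0, 600, 700, 900]
credit_labels = ['low', 'medium', 'high']

def bin_credit(values):
    # Map raw credit scores onto coarse, named bins.
    return pd.cut(values, bins=credit_bins, labels=credit_labels)

# Apply the same binning to both time slices before fitting, so the learned
# CreditScore states are the three bin labels instead of 78 raw values.
final_df[('CreditScore', 0)] = bin_credit(final_df[('CreditScore', 0)])
final_df[('CreditScore', 1)] = bin_credit(final_df[('CreditScore', 1)])
loan_dbn_model.fit(data=final_df, estimator="MLE")
loan_dbn_model.initialize_initial_state()

# Map the raw evidence value to its bin at query time; 672 falls into the
# 'medium' bin, which is a state the fitted model knows about.
evidence_state = bin_credit(pd.Series([672])).iloc[0]

dbn_inference = DBNInference(loan_dbn_model)
results = dbn_inference.query(
    variables=[('LoanApproval', 0)],
    evidence={('CreditScore', 0): evidence_state},
)
print(results)

With this kind of binning the learned state names are human-readable labels, so raw values never have to match pgmpy's internal integer encoding.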
