Disabling enable_external_access does not disable replacement scans #12114

Closed
2 tasks done
jwimberl opened this issue May 17, 2024 · 2 comments · Fixed by #12224

Comments

@jwimberl

jwimberl commented May 17, 2024

What happens?

When the enable_external_access configuration setting is set to false with SET, pandas and numpy replacement scans remain enabled. This contradicts the documentation, which states that it must be set to true to

Allow the database to access external state (through e.g., loading/installing modules, COPY TO/FROM, CSV readers, pandas replacement scans, etc)

To Reproduce

Code:

import duckdb
import numpy as np
import pandas as pd

print(f"duckdb version = {duckdb.__version__}")

# Python objects that a replacement scan could pick up by variable name
secrets = np.array([1,2,3])
secrets_df = pd.DataFrame(secrets)

# Disable external access at runtime and confirm the setting took effect
duckdb.execute("SET enable_external_access = false")
duckdb.sql("SELECT name, value FROM duckdb_settings() WHERE name='enable_external_access'").show()

# Both queries should fail if replacement scans were disabled, but they succeed
print("looking for numpy array")
df1 = duckdb.sql("SELECT * FROM secrets").to_df()
print(df1)
print("looking for pandas df")
df2 = duckdb.sql("SELECT * FROM secrets_df").to_df()
print(df2)

Output:

$ python3 --version
Python 3.11.9
$ python3 duckdb_repro.py 
duckdb version = 0.10.2
┌────────────────────────┬─────────┐
│          name          │  value  │
│        varchar         │ varchar │
├────────────────────────┼─────────┤
│ enable_external_access │ false   │
└────────────────────────┴─────────┘

looking for numpy array
   column0
0        1
1        2
2        3
looking for pandas df
   0
0  1
1  2
2  3

OS:

Rocky Linux 8.0, x64

DuckDB Version:

0.10.2

DuckDB Client:

Python

Full Name:

Jack Wimberley

Affiliation:

Paradigm4

What is the latest build you tested with? If possible, we recommend testing with the latest nightly build.

I have tested with a stable release

Did you include all relevant data sets for reproducing the issue?

Yes

Did you include all code required to reproduce the issue?

  • Yes, I have

Did you include all relevant configuration (e.g., CPU architecture, Python version, Linux distribution) to reproduce the issue?

  • Yes, I have
@Tishj
Contributor

Tishj commented May 17, 2024

That does sound like a bug.

If others agree, it should likely be fixed in Binder::BindWithReplacementScan, as that is where all replacement scans take place.

@Mytherin
Collaborator

Currently enable_external_access must be set on connect for replacement scans to be disabled, e.g.:

import duckdb
import numpy as np
import pandas as pd
con = duckdb.connect(':memory:', config={'enable_external_access': False})
secrets = np.array([1,2,3])
secrets_df = pd.DataFrame(secrets)
con.sql('SELECT * FROM secrets_df')
# duckdb.duckdb.CatalogException: Catalog Error: Table with name secrets_df does not exist!

It should work using SET as well. We'll look into it.
