
Recognize py file as a notebook and use azure cluster as a kernel #578

Open
virtualdvid opened this issue Mar 17, 2023 · 6 comments
Labels
enhancement New feature or request

Comments

@virtualdvid

Describe the enhancement

It would be a great feature if VS Code could recognize Databricks notebook .py files as notebooks and allow us to select a Databricks cluster as the kernel.

Even though we can run the notebook as a workflow on Databricks, that runs the entire notebook. What if I only want to run one cell at a time?

How to reproduce

VS Code already recognizes # COMMAND ---------- as a cell:

[screenshot]
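
For reference, a Databricks notebook exported as a .py file looks roughly like this (just a minimal sketch; the cell contents are placeholders):

# Databricks notebook source
# The header comment above is what marks this .py file as a Databricks notebook.
print("first cell")

# COMMAND ----------

# Each "# COMMAND ----------" line starts a new cell.
print("second cell")

# COMMAND ----------

# MAGIC %md
# MAGIC Markdown cells are stored as "# MAGIC" comment lines.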

but if we try to run (or debug) it, the Databricks cluster is not offered as a kernel:

[screenshots]

Ideal behavior

Based on the first comment line # Databricks notebook source, open the file as a notebook and allow me to select the Databricks cluster as the kernel:

[screenshot]

NOTE: Please add a label for suggesting features, in addition to the one for bugs :)

@virtualdvid virtualdvid added the bug Something isn't working label Mar 17, 2023
@kartikgupta-db kartikgupta-db added enhancement New feature or request and removed bug Something isn't working labels Mar 20, 2023
@virtualdvid
Author

Hi guys,

I found this documentation and have been trying to implement databricks-connect. So far all good, until I hit this error while trying to get dbutils:

Exception has occurred: AttributeError
'NoneType' object has no attribute 'user_ns'
File "/Users/xxxx/xxx/my_repo/poc.py", line 6, in
dbutils = DBUtils(spark)
AttributeError: 'NoneType' object has no attribute 'user_ns'

Here is the code:

from databricks.connect import DatabricksSession
from pyspark.dbutils import DBUtils

# create a remote Spark session through Databricks Connect
spark = DatabricksSession.builder.getOrCreate()
dbutils = DBUtils(spark)  # error on this line
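
One workaround that might be worth trying (untested on my side, and assuming the databricks-sdk package is installed and configured with the same credentials) is getting a dbutils-style object from the SDK's WorkspaceClient instead of pyspark.dbutils:

from databricks.connect import DatabricksSession
from databricks.sdk import WorkspaceClient

# remote Spark session through Databricks Connect, as before
spark = DatabricksSession.builder.getOrCreate()

# WorkspaceClient picks up the same profile/credentials and exposes
# w.dbutils (e.g. w.dbutils.fs, w.dbutils.secrets) without needing IPython
w = WorkspaceClient()
dbutils = w.dbutils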

If I comment out the DBUtils(spark) line and try to load a list of dictionaries into a PySpark DataFrame, it fails with no meaningful logs:

Exception has occurred: AssertionError
exception: no description
File "/Users/xxxx/xxx/my_repo/poc.py", line 21, in
df = spark.createDataFrame(my_list_of_dicts)
AssertionError:

This code works fine in a Databricks UI notebook:

df = spark.createDataFrame(my_list_of_dicts)
df.limit(10)
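
If it helps narrow things down, a rough thing I might try locally (purely a sketch; the "name" field is a made-up placeholder, and I have not confirmed it avoids the AssertionError) is converting the dicts to Rows, or passing an explicit schema so no type inference is needed:

from pyspark.sql import Row
from pyspark.sql.types import StructType, StructField, StringType

# my_list_of_dicts is the same list of dictionaries as above
df = spark.createDataFrame([Row(**d) for d in my_list_of_dicts])

# or with an explicit (hypothetical) schema to skip inference:
schema = StructType([StructField("name", StringType(), True)])
df = spark.createDataFrame(my_list_of_dicts, schema=schema)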

Environment:

vscode:
Version: 1.77.3 (Universal)
Commit: 704ed70d4fd1c6bd6342c436f1ede30d1cff4710
Date: 2023-04-12T09:19:37.325Z (2 wks ago)
Electron: 19.1.11
Chromium: 102.0.5005.196
Node.js: 16.14.2
V8: 10.2.154.26-electron.0
OS: Darwin arm64 22.3.0
Sandboxed: Yes

Databricks Cluster:
[screenshot: cluster configuration]

Databricks extension version: v0.3.11

@nansravn

nansravn commented May 5, 2023

Thumbs up for this feature request! 👍

I use .py scripts or .ipynb notebooks depending on the project, and for both approaches GitHub Copilot is amazing for accelerating code development.

It would be great to be able to open Databricks .py scripts in VS Code as notebooks, selecting a Databricks cluster as the compute environment. This would give us a GitHub Copilot-enabled Databricks environment, where we could develop notebooks together with Copilot.

@fludo

fludo commented Jul 25, 2023

+1 for the .py files, which are the format used when notebooks are synchronized to a repository (Git).

@virtualdvid
Author

Is there any progress on this request? I have been using Databricks a lot, and this would be a killer feature.

@kartikgupta-db
Contributor

We have some limited experimental notebook support. You can enable it here: https://docs.databricks.com/en/dev-tools/vscode-ext/dev-tasks/databricks-connect.html#additional-notebook-features-with-databricks-connect. You will still have to use the code lenses (the little gray buttons on top of cells), but they should pop up a really nice window showing the outputs for the current cell.
This also includes some more features to bring the experience of using a notebook in the IDE closer to the UI. You can find more details about the supported features in the documentation above.

@virtualdvid
Author

I have been trying that, but it runs the notebook on my local kernel. The real deal would be being able to select the cluster:

[screenshot]

Hope you guys can have it soon!
