{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":17165658,"defaultBranch":"master","name":"spark","ownerLogin":"apache","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2014-02-25T08:00:08.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/47359?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1715902415.0","currentOid":""},"activityList":{"items":[{"before":"14d3f447360b66663c8979a8cdb4c40c480a1e04","after":"4a471cceebedd938f781eb385162d33058124092","ref":"refs/heads/master","pushedAt":"2024-05-23T11:46:52.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"zhengruifeng","name":"Ruifeng Zheng","path":"/zhengruifeng","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/7322292?s=80&v=4"},"commit":{"message":"[MINOR][TESTS] Add a helper function for `spark.table` in dsl\n\n### What changes were proposed in this pull request?\nAdd a helper function for `spark.table` in dsl\n\n### Why are the changes needed?\nto be used in tests\n\n### Does this PR introduce _any_ user-facing change?\nno, test only\n\n### How was this patch tested?\nci\n\n### Was this patch authored or co-authored using generative AI tooling?\nno\n\nCloses #46717 from zhengruifeng/dsl_read.\n\nAuthored-by: Ruifeng Zheng <ruifengz@apache.org>\nSigned-off-by: Ruifeng Zheng <ruifengz@apache.org>","shortMessageHtmlLink":"[MINOR][TESTS] Add a helper function for <code>spark.table</code> in dsl"}},{"before":"e8f58a9c4a641b830c5304b34b876e0cd5d3ed8e","after":"14d3f447360b66663c8979a8cdb4c40c480a1e04","ref":"refs/heads/master","pushedAt":"2024-05-23T08:12:48.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"zhengruifeng","name":"Ruifeng Zheng","path":"/zhengruifeng","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/7322292?s=80&v=4"},"commit":{"message":"[SPARK-48395][PYTHON] Fix `StructType.treeString` for parameterized types\n\n### What changes were proposed in this pull request?\nthis PR is a follow up of https://github.com/apache/spark/pull/46685.\n\n### Why are the changes needed?\n`StructType.treeString` uses `DataType.typeName` to generate the tree string, however, the `typeName` in python is a class method and can not return the same string for parameterized types.\n\n```\nIn [2]: schema = StructType().add(\"c\", CharType(10), True).add(\"v\", VarcharType(10), True).add(\"d\", DecimalType(10, 2), True).add(\"ym00\", YearM\n   ...: onthIntervalType(0, 0)).add(\"ym01\", YearMonthIntervalType(0, 1)).add(\"ym11\", YearMonthIntervalType(1, 1))\n\nIn [3]: print(schema.treeString())\nroot\n |-- c: char (nullable = true)\n |-- v: varchar (nullable = true)\n |-- d: decimal (nullable = true)\n |-- ym00: yearmonthinterval (nullable = true)\n |-- ym01: yearmonthinterval (nullable = true)\n |-- ym11: yearmonthinterval (nullable = true)\n```\n\nit should be\n```\nIn [4]: print(schema.treeString())\nroot\n |-- c: char(10) (nullable = true)\n |-- v: varchar(10) (nullable = true)\n |-- d: decimal(10,2) (nullable = true)\n |-- ym00: interval year (nullable = true)\n |-- ym01: interval year to month (nullable = true)\n |-- ym11: interval month (nullable = true)\n```\n\n### Does this PR introduce _any_ user-facing change?\nno, this feature was just added and not release out yet.\n\n### How was this patch tested?\nadded tests\n\n### Was this patch authored or co-authored using generative AI tooling?\nno\n\nCloses #46711 from zhengruifeng/tree_string_fix.\n\nAuthored-by: Ruifeng Zheng <ruifengz@apache.org>\nSigned-off-by: Ruifeng Zheng <ruifengz@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48395\">SPARK-48395</a>][PYTHON] Fix <code>StructType.treeString</code> for parameterized t…"}},{"before":"a393d6cdf00ce95b2a3fb4bd15bfc4d82883d1d2","after":"e8f58a9c4a641b830c5304b34b876e0cd5d3ed8e","ref":"refs/heads/master","pushedAt":"2024-05-23T04:50:37.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48370][SPARK-48258][CONNECT][PYTHON][FOLLOW-UP] Refactor local and eager required fields in CheckpointCommand\n\n### What changes were proposed in this pull request?\n\nThis PR is a followup of https://github.com/apache/spark/pull/46683 and https://github.com/apache/spark/pull/46570 that refactors `local` and `eager` required fields in `CheckpointCommand`\n\n### Why are the changes needed?\n\nTo make the code easier to maintain.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo, the main change has not been released yet.\n\n### How was this patch tested?\n\nManually tested.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46712 from HyukjinKwon/SPARK-48370-SPARK-48258-followup.\n\nAuthored-by: Hyukjin Kwon <gurwls223@apache.org>\nSigned-off-by: Hyukjin Kwon <gurwls223@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48370\">SPARK-48370</a>][<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48258\">SPARK-48258</a>][CONNECT][PYTHON][FOLLOW-UP] Refactor local…"}},{"before":"5df9a0866ae60a42d78136a21a82a0b6e58daefa","after":"a393d6cdf00ce95b2a3fb4bd15bfc4d82883d1d2","ref":"refs/heads/master","pushedAt":"2024-05-23T03:19:10.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48370][CONNECT] Checkpoint and localCheckpoint in Scala Spark Connect client\n\n### What changes were proposed in this pull request?\n\nThis PR adds `Dataset.checkpoint` and `Dataset.localCheckpoint` into Scala Spark Connect client. Python API was implemented at https://github.com/apache/spark/pull/46570\n\n### Why are the changes needed?\n\nFor API parity.\n\n### Does this PR introduce _any_ user-facing change?\n\nYes, it adds `Dataset.checkpoint` and `Dataset.localCheckpoint` into Scala Spark Connect client.\n\n### How was this patch tested?\n\nUnittests added.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46683 from HyukjinKwon/SPARK-48370.\n\nAuthored-by: Hyukjin Kwon <gurwls223@apache.org>\nSigned-off-by: Hyukjin Kwon <gurwls223@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48370\">SPARK-48370</a>][CONNECT] Checkpoint and localCheckpoint in Scala Spark …"}},{"before":"a48365dd98c9e52b5648d1cc0af203a7290cb1dc","after":"5df9a0866ae60a42d78136a21a82a0b6e58daefa","ref":"refs/heads/master","pushedAt":"2024-05-23T02:46:13.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"LuciferYang","name":"YangJie","path":"/LuciferYang","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1475305?s=80&v=4"},"commit":{"message":"[SPARK-48386][TESTS] Replace JVM assert with JUnit Assert in tests\n\n### What changes were proposed in this pull request?\nThe pr aims to replace `JVM assert` with `JUnit Assert` in tests.\n\n### Why are the changes needed?\nassert() statements do not produce as useful errors when they fail, and, if they were somehow disabled, would fail to test anything.\n\n### Does this PR introduce _any_ user-facing change?\nNo.\n\n### How was this patch tested?\n- Manually test.\n- Pass GA.\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo.\n\nCloses #46698 from panbingkun/minor_assert.\n\nAuthored-by: panbingkun <panbingkun@baidu.com>\nSigned-off-by: yangjie01 <yangjie01@baidu.com>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48386\">SPARK-48386</a>][TESTS] Replace JVM assert with JUnit Assert in tests"}},{"before":"d96ab44c82519eec88b28df6974ddb5b7f429dbf","after":"a48365dd98c9e52b5648d1cc0af203a7290cb1dc","ref":"refs/heads/master","pushedAt":"2024-05-23T02:27:23.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-48387][SQL] Postgres: Map TimestampType to TIMESTAMP WITH TIME ZONE\n\n### What changes were proposed in this pull request?\n\nCurrently, Both TimestampType/TimestampNTZType are mapped to TIMESTAMP WITHOUT TIME ZONE for writing while being differentiated for reading.\n\nIn this PR, we map TimestampType to TIMESTAMP WITH TIME ZONE to differentiate TimestampType/TimestampNTZType for writing against Postgres.\n\n### Why are the changes needed?\n\nTimestampType <-> TIMESTAMP WITHOUT TIME ZONE is incorrect and ambiguous with TimestampNTZType\n\n### Does this PR introduce _any_ user-facing change?\n\nYes\nmigration guide and legacy configuration provided\n\n### How was this patch tested?\n\nnew tests\n\n### Was this patch authored or co-authored using generative AI tooling?\nno\n\nCloses #46701 from yaooqinn/SPARK-48387.\n\nAuthored-by: Kent Yao <yao@apache.org>\nSigned-off-by: Kent Yao <yao@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48387\">SPARK-48387</a>][SQL] Postgres: Map TimestampType to TIMESTAMP WITH TIME…"}},{"before":"06aafb1eeaf32bdc7abce5bb4a9ffb474a9e61ae","after":"d96ab44c82519eec88b28df6974ddb5b7f429dbf","ref":"refs/heads/master","pushedAt":"2024-05-23T00:34:45.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48393][PYTHON] Move a group of constants to `pyspark.util`\n\n### What changes were proposed in this pull request?\nMove a group of constants to `pyspark.util`, move them from connect to pyspark.util, so reusable in both\n\n### Why are the changes needed?\ncode clean up\n\n### Does this PR introduce _any_ user-facing change?\nno, they are internal constants\n\n### How was this patch tested?\nci\n\n### Was this patch authored or co-authored using generative AI tooling?\nno\n\nCloses #46710 from zhengruifeng/unity_constant.\n\nAuthored-by: Ruifeng Zheng <ruifengz@apache.org>\nSigned-off-by: Hyukjin Kwon <gurwls223@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48393\">SPARK-48393</a>][PYTHON] Move a group of constants to <code>pyspark.util</code>"}},{"before":"6be3560f3c89e212e850a0788d24a7c0755ea35b","after":"06aafb1eeaf32bdc7abce5bb4a9ffb474a9e61ae","ref":"refs/heads/master","pushedAt":"2024-05-22T23:52:39.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48258][PYTHON][CONNECT][FOLLOW-UP] Bind relation ID to the plan instead of DataFrame\n\n### What changes were proposed in this pull request?\n\nThis PR addresses https://github.com/apache/spark/pull/46683#discussion_r1608527529 comment within Python, by using ID at the plan instead of DataFrame itself.\n\n### Why are the changes needed?\n\nBecause the DataFrame holds the relation ID, if DataFrame B are derived from DataFrame A, and DataFrame A is garbage-collected, then the cache might not exist anymore. See the example below:\n\n```python\ndf = spark.range(1).localCheckpoint()\ndf2 = df.repartition(10)\ndel df\ndf2.collect()\n```\n\n```\npyspark.errors.exceptions.connect.SparkConnectGrpcException: (org.apache.spark.sql.connect.common.InvalidPlanInput) No DataFrame with id a4efa660-897c-4500-bd4e-bd57cd0263d2 is found in the session cd4764b4-90a9-4249-9140-12a6e4a98cd3\n```\n\n### Does this PR introduce _any_ user-facing change?\n\nNo, the main change has not been released out yet.\n\n### How was this patch tested?\n\nManually tested, and added a unittest.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46694 from HyukjinKwon/SPARK-48258-followup.\n\nAuthored-by: Hyukjin Kwon <gurwls223@apache.org>\nSigned-off-by: Hyukjin Kwon <gurwls223@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48258\">SPARK-48258</a>][PYTHON][CONNECT][FOLLOW-UP] Bind relation ID to the pla…"}},{"before":"e04d3d7c430a1fa446f0379680f619b8b14b5eb5","after":"6be3560f3c89e212e850a0788d24a7c0755ea35b","ref":"refs/heads/master","pushedAt":"2024-05-22T12:21:26.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"cloud-fan","name":"Wenchen Fan","path":"/cloud-fan","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3182036?s=80&v=4"},"commit":{"message":"[SPARK-48364][SQL] Add AbstractMapType type casting and fix RaiseError parameter map to work with collated strings\n\n### What changes were proposed in this pull request?\nFollowing up on the introduction of AbstractMapType (https://github.com/apache/spark/pull/46458) and changes that introduce collation awareness for RaiseError expression (https://github.com/apache/spark/pull/46461), this PR should add the appropriate type casting rules for AbstractMapType.\n\n### Why are the changes needed?\nFix the CI failure for the `Support RaiseError misc expression with collation` test when ANSI is off.\n\n### Does this PR introduce _any_ user-facing change?\nYes, type casting is now allowed for map types with collated strings.\n\n### How was this patch tested?\nExtended suite `CollationSQLExpressionsANSIOffSuite` with ANSI disabled.\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo.\n\nCloses #46661 from uros-db/fix-abstract-map.\n\nAuthored-by: Uros Bojanic <157381213+uros-db@users.noreply.github.com>\nSigned-off-by: Wenchen Fan <wenchen@databricks.com>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48364\">SPARK-48364</a>][SQL] Add AbstractMapType type casting and fix RaiseErro…"}},{"before":"9fd85d9acc5acf455d0ad910ef2848695576242b","after":"e04d3d7c430a1fa446f0379680f619b8b14b5eb5","ref":"refs/heads/master","pushedAt":"2024-05-22T11:28:10.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"cloud-fan","name":"Wenchen Fan","path":"/cloud-fan","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/3182036?s=80&v=4"},"commit":{"message":"[SPARK-48215][SQL] Extending support for collated strings on date_format expression\n\n### What changes were proposed in this pull request?\nWe are extending support for collated strings on date_format function, since currently it throws DATATYPE_MISSMATCH exception when collated strings are passed as \"format\" parameter. https://docs.databricks.com/en/sql/language-manual/functions/date_format.html\n\n### Why are the changes needed?\nException is thrown on invocation when collated strings are passed as arguments to date_format.\n\n### Does this PR introduce _any_ user-facing change?\nNo user facing changes, extending support.\n\n### How was this patch tested?\nTests are added with this PR.\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo.\n\nCloses #46561 from nebojsa-db/SPARK-48215.\n\nAuthored-by: Nebojsa Savic <nebojsa.savic@databricks.com>\nSigned-off-by: Wenchen Fan <wenchen@databricks.com>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48215\">SPARK-48215</a>][SQL] Extending support for collated strings on date_for…"}},{"before":"1188887c31c17ca9b96a230d22c784fd87874d64","after":"9fd85d9acc5acf455d0ad910ef2848695576242b","ref":"refs/heads/master","pushedAt":"2024-05-22T11:26:23.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"Revert \"[SPARK-48379][INFRA] Cancel previous builds on a PR when new commit is pushed\"\n\nThis reverts commit 93d83c347042d434064dd229164603ead6556a6d.","shortMessageHtmlLink":"Revert \"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48379\">SPARK-48379</a>][INFRA] Cancel previous builds on a PR when new …"}},{"before":"93d83c347042d434064dd229164603ead6556a6d","after":"1188887c31c17ca9b96a230d22c784fd87874d64","ref":"refs/heads/master","pushedAt":"2024-05-22T11:15:27.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48389][INFRA] Remove obsolete workflow cancel_duplicate_workflow_runs\n\n### What changes were proposed in this pull request?\n\nThis PR proposes to remove obsolete workflow cancel_duplicate_workflow_runs\n\n### Why are the changes needed?\n\nAfter https://github.com/apache/spark/commit/93d83c347042d434064dd229164603ead6556a6d, we don't need this anymore. In fact, this has not been working for very long time because of the workflow name mismatch.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo, dev-only.\n\n### How was this patch tested?\n\nManually.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46703 from HyukjinKwon/SPARK-48389.\n\nAuthored-by: Hyukjin Kwon <gurwls223@apache.org>\nSigned-off-by: Hyukjin Kwon <gurwls223@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48389\">SPARK-48389</a>][INFRA] Remove obsolete workflow cancel_duplicate_workfl…"}},{"before":"366a052fe10662379ab7f636ccf14ae53a46d8ed","after":"93d83c347042d434064dd229164603ead6556a6d","ref":"refs/heads/master","pushedAt":"2024-05-22T10:02:23.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48379][INFRA] Cancel previous builds on a PR when new commit is pushed\n\n### What changes were proposed in this pull request?\n\nWhen there is an open PR with a build in progress and a new commit is pushed I propose that we should cancel the previous build.\n\nThe exceptions are branches `master` and `branch-*` where we still want to be able to have multiple builds executing concurrently (if two PRs are merged to main at the same time we don't want to just run the build on the second one).\n\n[Concurrecy docs](https://docs.github.com/en/actions/using-jobs/using-concurrency)\n\n### Why are the changes needed?\n\nTo reduce wait times for newer commit's and save compute resources.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nManually.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46689 from stefankandic/cancelOlderBuilds.\n\nLead-authored-by: Stefan Kandic <stefan.kandic@databricks.com>\nCo-authored-by: Hyukjin Kwon <gurwls223@apache.org>\nSigned-off-by: Hyukjin Kwon <gurwls223@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48379\">SPARK-48379</a>][INFRA] Cancel previous builds on a PR when new commit i…"}},{"before":"e55875b0bbe08c435ffcb0ea034ceb95938d8729","after":"366a052fe10662379ab7f636ccf14ae53a46d8ed","ref":"refs/heads/master","pushedAt":"2024-05-22T07:35:53.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HeartSaVioR","name":"Jungtaek Lim","path":"/HeartSaVioR","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1317309?s=80&v=4"},"commit":{"message":"[SPARK-47920][DOCS][SS][PYTHON] Add doc for python streaming data source API\n\n### What changes were proposed in this pull request?\nadd doc for python streaming data source API\n\n### Why are the changes needed?\n\nAdd user guide to help user develop python streaming data source.\n\nCloses #46139 from chaoqin-li1123/python_ds_doc.\n\nAuthored-by: Chaoqin Li <chaoqin.li@databricks.com>\nSigned-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-47920\">SPARK-47920</a>][DOCS][SS][PYTHON] Add doc for python streaming data sou…"}},{"before":"bf7f664296c541231edf53de9b5032915de6393f","after":"e55875b0bbe08c435ffcb0ea034ceb95938d8729","ref":"refs/heads/master","pushedAt":"2024-05-22T07:31:38.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"zhengruifeng","name":"Ruifeng Zheng","path":"/zhengruifeng","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/7322292?s=80&v=4"},"commit":{"message":"[SPARK-48372][SPARK-45716][PYTHON] Implement `StructType.treeString`\n\n### What changes were proposed in this pull request?\nImplement `StructType.treeString`\n\n### Why are the changes needed?\nfeature parity, this method is Scala-only before\n\n### Does this PR introduce _any_ user-facing change?\nyes\n\n```\nIn [2]: schema1 = DataType.fromDDL(\"c1 INT, c2 STRUCT<c3: INT, c4: STRUCT<c5: INT, c6: INT>>\")\n\nIn [3]: print(schema1.treeString())\nroot\n |-- c1: integer (nullable = true)\n |-- c2: struct (nullable = true)\n |    |-- c3: integer (nullable = true)\n |    |-- c4: struct (nullable = true)\n |    |    |-- c5: integer (nullable = true)\n |    |    |-- c6: integer (nullable = true)\n```\n\n### How was this patch tested?\nadded tests\n\n### Was this patch authored or co-authored using generative AI tooling?\nno\n\nCloses #46685 from zhengruifeng/py_tree_string.\n\nAuthored-by: Ruifeng Zheng <ruifengz@apache.org>\nSigned-off-by: Ruifeng Zheng <ruifengz@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48372\">SPARK-48372</a>][<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-45716\">SPARK-45716</a>][PYTHON] Implement <code>StructType.treeString</code>"}},{"before":"f4958ba9587c7c7d956eef4b664f7c974a09aa42","after":"bf7f664296c541231edf53de9b5032915de6393f","ref":"refs/heads/master","pushedAt":"2024-05-22T05:41:05.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-47515][SPARK-47406][SQL][FOLLOWUP] Add legacy config spark.sql.legacy.mysql.timestampNTZMapping.enabled\n\n### What changes were proposed in this pull request?\n\nThis PR adds a legacy config `spark.sql.legacy.mysql.timestampNTZMapping.enabled` to make user available to restore the old type mapping between timestamp_ntz and MySQL TIMESTAMP/DATETIME\n\n### Why are the changes needed?\n\nAvoid unrecoverable breaking changes, following the principles of https://lists.apache.org/thread/6bb1crw9dqgntp2p4dk67b7313mo5frz\n\n### Does this PR introduce _any_ user-facing change?\nno\n\n### How was this patch tested?\nnew tests\n\n### Was this patch authored or co-authored using generative AI tooling?\nno\n\nCloses #46686 from yaooqinn/SPARK-47515-F.\n\nAuthored-by: Kent Yao <yao@apache.org>\nSigned-off-by: Kent Yao <yao@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-47515\">SPARK-47515</a>][<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-47406\">SPARK-47406</a>][SQL][FOLLOWUP] Add legacy config spark.sql…"}},{"before":"a886121aee45220c1ad6f06770e148ba13c3b1f0","after":"f4958ba9587c7c7d956eef4b664f7c974a09aa42","ref":"refs/heads/master","pushedAt":"2024-05-22T03:28:16.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48367][CONNECT][FOLLOWUP] Replace keywords that identify `lint-scala` detection results\n\n### What changes were proposed in this pull request?\nThe pr is following up https://github.com/apache/spark/pull/46679.\nThe pr aims to replace `keywords` (from `Requires formatting` to `Unformatted files found`) that identify `lint-scala` detection results.\n\n### Why are the changes needed?\nAccording to the discussion and analysis of https://github.com/apache/spark/pull/46679#issuecomment-2122142979, with the upgrade of `org.antipathy:mvn-scalafmt_*` version, it seems more general to use `Unformatted files found` as the keyword, because it directly comes from `the prompt of error level output`.\n\n### Does this PR introduce _any_ user-facing change?\nNo.\n\n### How was this patch tested?\n- Manually test\n```\n(base) ➜  spark-community git:(SPARK-48367_FOLLOWUP) ✗ dev/lint-scala\nScalastyle checks passed.\nScalafmt checks passed.\n```\n- Pass GA.\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo.\n\nCloses #46684 from panbingkun/SPARK-48367_FOLLOWUP.\n\nAuthored-by: panbingkun <panbingkun@baidu.com>\nSigned-off-by: Hyukjin Kwon <gurwls223@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48367\">SPARK-48367</a>][CONNECT][FOLLOWUP] Replace keywords that identify `lint…"}},{"before":"e702b32656bcbe194be19876990954a4be457734","after":"a886121aee45220c1ad6f06770e148ba13c3b1f0","ref":"refs/heads/master","pushedAt":"2024-05-22T02:15:28.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"zhengruifeng","name":"Ruifeng Zheng","path":"/zhengruifeng","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/7322292?s=80&v=4"},"commit":{"message":"[MINOR][TESTS] Fix `DslLogicalPlan.as`\n\n### What changes were proposed in this pull request?\n1, Fix `DslLogicalPlan.as`, initialize without any plan; No similar cases in the `dsl`\n2, btw, `newBuilder` -> `newBuilder()` to avoid warnings\n\n### Why are the changes needed?\nit should not initialize with the input `logicalPlan`\n\n### Does this PR introduce _any_ user-facing change?\nno, test only\n\n### How was this patch tested?\nci\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo\n\nCloses #46693 from zhengruifeng/fix_dsl_as.\n\nAuthored-by: Ruifeng Zheng <ruifengz@apache.org>\nSigned-off-by: Ruifeng Zheng <ruifengz@apache.org>","shortMessageHtmlLink":"[MINOR][TESTS] Fix <code>DslLogicalPlan.as</code>"}},{"before":"a1e27a38868b80bef93b7d7bb43e6e15b79b14aa","after":"e702b32656bcbe194be19876990954a4be457734","ref":"refs/heads/master","pushedAt":"2024-05-22T01:58:08.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HeartSaVioR","name":"Jungtaek Lim","path":"/HeartSaVioR","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1317309?s=80&v=4"},"commit":{"message":"[SPARK-48314][SS] Don't double cache files for FileStreamSource using Trigger.AvailableNow\n\n### What changes were proposed in this pull request?\n\nFiles don't need to be cached for reuse in `FileStreamSource` when using `Trigger.AvailableNow` because all files are already cached for the lifetime of the query in `allFilesForTriggerAvailableNow`.\n\n### Why are the changes needed?\n\nAs reported in https://issues.apache.org/jira/browse/SPARK-44924 (with a PR to address https://github.com/apache/spark/pull/45362), the hard coded cap of 10k files being cached can cause problems when using a maxFilesPerTrigger > 10k. It causes every other batch to be 10k files, which can greatly limit the throughput of a new streaming trying to catch up.\n\n### Does this PR introduce _any_ user-facing change?\n\nEvery other streaming batch won't be 10k files if using Trigger.AvailableNow and maxFilesPerTrigger greater than 10k.\n\n### How was this patch tested?\n\nNew UT\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo\n\nCloses #46627 from Kimahriman/available-now-no-cache.\n\nAuthored-by: Adam Binford <adamq43@gmail.com>\nSigned-off-by: Jungtaek Lim <kabhwan.opensource@gmail.com>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48314\">SPARK-48314</a>][SS] Don't double cache files for FileStreamSource using…"}},{"before":"80c0f116541712431e69ed4bce04f944062e88d2","after":"a1e27a38868b80bef93b7d7bb43e6e15b79b14aa","ref":"refs/heads/master","pushedAt":"2024-05-22T00:09:25.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48341][CONNECT] Allow plugins to use QueryTest in their tests\n\n### What changes were proposed in this pull request?\n\nThis PR changes `QueryTest` to no longer depend on `RemoteSparkSession`.\n\n### Why are the changes needed?\n\nThis allows the tests for Spark Connect plugin to provide their version of `RemoteSparkSession` (which depends on some idiosyncrasies of how Spark is built).\n\n### Does this PR introduce _any_ user-facing change?\n\nNo\n\n### How was this patch tested?\n\nExisting tests to ensure that nothing breaks. Manually tested that this allows a plugin to use `QueryTest`.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo\n\nCloses #46667 from tomvanbussel/SPARK-48341.\n\nLead-authored-by: Tom van Bussel <tom.vanbussel@databricks.com>\nCo-authored-by: Tom van Bussel <Tom.VanBussel@databricks.com>\nSigned-off-by: Hyukjin Kwon <gurwls223@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48341\">SPARK-48341</a>][CONNECT] Allow plugins to use QueryTest in their tests"}},{"before":"febdbf56fb223e9e9317d87ae8de5dc867f5d60f","after":"80c0f116541712431e69ed4bce04f944062e88d2","ref":"refs/heads/master","pushedAt":"2024-05-21T18:06:48.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"dongjoon-hyun","name":"Dongjoon Hyun","path":"/dongjoon-hyun","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9700541?s=80&v=4"},"commit":{"message":"[SPARK-48381][K8S][DOCS] Update `YuniKorn` docs with v1.5.1\n\n### What changes were proposed in this pull request?\n\nThis PR aims to update `YuniKorn` docs with v1.5.1 for Apache Spark 4.0.0.\n\n### Why are the changes needed?\n\nApache YuniKorn v1.5.1 was released on 2024-05-16 with 18 bug fixes.\n\n- https://yunikorn.apache.org/release-announce/1.5.1\n\n    - Locking fixes to avoid existing and potential deadlocks (YUNIKORN-2521, YUNIKORN-2544)\n    - Deadlock detection (YUNIKORN-2539)\n\nI installed YuniKorn v1.5.1 on K8s 1.29 and tested manually.\n\n**K8s v1.29**\n```\n$ kubectl version\nClient Version: v1.30.1\nKustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3\nServer Version: v1.29.2\n```\n\n**YuniKorn v1.5.1**\n```\n$ helm list -n yunikorn\nNAME    \tNAMESPACE\tREVISION\tUPDATED                             \tSTATUS  \tCHART         \tAPP VERSION\nyunikorn\tyunikorn \t1       \t2024-05-21 10:15:40.207847 -0700 PDT\tdeployed\tyunikorn-1.5.1\n```\n\n```\n$ build/sbt -Pkubernetes -Pkubernetes-integration-tests -Dspark.kubernetes.test.deployMode=docker-desktop \"kubernetes-integration-tests/testOnly *.YuniKornSuite\" -Dtest.exclude.tags=minikube,local,decom,r -Dtest.default.exclude.tags=\n...\n[info] YuniKornSuite:\n[info] - SPARK-42190: Run SparkPi with local[*] (6 seconds, 787 milliseconds)\n[info] - Run SparkPi with no resources (9 seconds, 869 milliseconds)\n[info] - Run SparkPi with no resources & statefulset allocation (9 seconds, 815 milliseconds)\n[info] - Run SparkPi with a very long application name. (8 seconds, 765 milliseconds)\n[info] - Use SparkLauncher.NO_RESOURCE (9 seconds, 783 milliseconds)\n[info] - Run SparkPi with a master URL without a scheme. (9 seconds, 836 milliseconds)\n[info] - Run SparkPi with an argument. (9 seconds, 790 milliseconds)\n[info] - Run SparkPi with custom labels, annotations, and environment variables. (9 seconds, 880 milliseconds)\n[info] - All pods have the same service account by default (9 seconds, 802 milliseconds)\n[info] - Run extraJVMOptions check on driver (6 seconds, 138 milliseconds)\n[info] - SPARK-42474: Run extraJVMOptions JVM GC option check - G1GC (5 seconds, 906 milliseconds)\n[info] - SPARK-42474: Run extraJVMOptions JVM GC option check - Other GC (5 seconds, 898 milliseconds)\n[info] - SPARK-42769: All executor pods have SPARK_DRIVER_POD_IP env variable (8 seconds, 783 milliseconds)\n[info] - Verify logging configuration is picked from the provided SPARK_CONF_DIR/log4j2.properties (10 seconds, 423 milliseconds)\n[info] - Run SparkPi with env and mount secrets. (12 seconds, 95 milliseconds)\n[info] - Run PySpark on simple pi.py example (9 seconds, 881 milliseconds)\n[info] - Run PySpark to test a pyfiles example (10 seconds, 951 milliseconds)\n[info] - Run PySpark with memory customization (9 seconds, 815 milliseconds)\n[info] - Run in client mode. (4 seconds, 279 milliseconds)\n[info] - Start pod creation from template (8 seconds, 856 milliseconds)\n[info] - SPARK-38398: Schedule pod creation from template (8 seconds, 912 milliseconds)\n[info] - A driver-only Spark job with a tmpfs-backed localDir volume (6 seconds, 269 milliseconds)\n[info] - A driver-only Spark job with a tmpfs-backed emptyDir data volume (6 seconds, 401 milliseconds)\n[info] - A driver-only Spark job with a disk-backed emptyDir volume (5 seconds, 787 milliseconds)\n[info] - A driver-only Spark job with an OnDemand PVC volume (5 seconds, 926 milliseconds)\n[info] - A Spark job with tmpfs-backed localDir volumes (8 seconds, 525 milliseconds)\n[info] - A Spark job with two executors with OnDemand PVC volumes (10 seconds, 206 milliseconds)\n[info] - PVs with local hostpath storage on statefulsets !!! CANCELED !!! (2 milliseconds)\n[info] - PVs with local hostpath and storageClass on statefulsets !!! IGNORED !!!\n[info] - PVs with local storage !!! CANCELED !!! (0 milliseconds)\n[info] Run completed in 6 minutes, 37 seconds.\n[info] Total number of tests run: 27\n[info] Suites: completed 1, aborted 0\n[info] Tests: succeeded 27, failed 0, canceled 2, ignored 1, pending 0\n[info] All tests passed.\n[success] Total time: 414 s (06:54), completed May 21, 2024, 10:28:59 AM\n```\n\n```\n$ kubectl describe pod -l spark-role=driver -n spark-8a4e575d62cf4634b025601b3d0d4409\n...\nEvents:\n  Type    Reason             Age   From      Message\n  ----    ------             ----  ----      -------\n  Normal  Scheduling         2s    yunikorn  spark-8a4e575d62cf4634b025601b3d0d4409/spark-test-app-f04cff44b1dd48b4b2aac7b0e1a74b12-driver is queued and waiting for allocation\n  Normal  Scheduled          2s    yunikorn  Successfully assigned spark-8a4e575d62cf4634b025601b3d0d4409/spark-test-app-f04cff44b1dd48b4b2aac7b0e1a74b12-driver to node docker-desktop\n  Normal  PodBindSuccessful  2s    yunikorn  Pod spark-8a4e575d62cf4634b025601b3d0d4409/spark-test-app-f04cff44b1dd48b4b2aac7b0e1a74b12-driver is successfully bound to node docker-desktop\n  Normal  Pulled             1s    kubelet   Container image \"docker.io/kubespark/spark:dev\" already present on machine\n  Normal  Created            1s    kubelet   Created container spark-kubernetes-driver\n  Normal  Started            1s    kubelet   Started container spark-kubernetes-driver\n```\n\n### Does this PR introduce _any_ user-facing change?\n\nNo.\n\n### How was this patch tested?\n\nManual review.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46690 from dongjoon-hyun/SPARK-48381.\n\nAuthored-by: Dongjoon Hyun <dhyun@apple.com>\nSigned-off-by: Dongjoon Hyun <dhyun@apple.com>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48381\">SPARK-48381</a>][K8S][DOCS] Update <code>YuniKorn</code> docs with v1.5.1"}},{"before":"6b3a88195e30027b74166d7729c232cd7ddba83b","after":"febdbf56fb223e9e9317d87ae8de5dc867f5d60f","ref":"refs/heads/master","pushedAt":"2024-05-21T17:45:21.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"gengliangwang","name":"Gengliang Wang","path":"/gengliangwang","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/1097932?s=80&v=4"},"commit":{"message":"[SPARK-48031] Grandfather legacy views to SCHEMA BINDING\n\n### What changes were proposed in this pull request?\n\nWhen enabling schema evolution for views, legacy views created prior should always use SCHEMA BINDING for compatibility. This is independent of the default mode.\n\n### Why are the changes needed?\n\nGet predictable behavior.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo, because it's a fix for unreleased feature.\n\n### How was this patch tested?\n\nAdd new tests\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo\n\nCloses #46681 from srielau/SPARK-48031-View-evolution-fix.\n\nAuthored-by: Serge Rielau <serge@rielau.com>\nSigned-off-by: Gengliang Wang <gengliang@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48031\">SPARK-48031</a>] Grandfather legacy views to SCHEMA BINDING"}},{"before":"df50d4b309b55ca97a2409dcdda30011e3b43f87","after":"6b3a88195e30027b74166d7729c232cd7ddba83b","ref":"refs/heads/master","pushedAt":"2024-05-21T17:00:22.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"dongjoon-hyun","name":"Dongjoon Hyun","path":"/dongjoon-hyun","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/9700541?s=80&v=4"},"commit":{"message":"[SPARK-48329][SQL] Enable `spark.sql.sources.v2.bucketing.pushPartValues.enabled` by default\n\n### What changes were proposed in this pull request?\n\nThis PR aims to enable `spark.sql.sources.v2.bucketing.pushPartValues.enabled` by default for Apache Spark 4.0.0 while keeping `spark.sql.sources.v2.bucketing.enabled` is `false`.\n\n### Why are the changes needed?\n\n`spark.sql.sources.v2.bucketing.pushPartValues.enabled` was added at Apache Spark 3.4.0 and has been used as one of the datasource v2 bucketing feature. This PR will help the datasource v2 bucketing users use this feature more easily.\n\nNote that this change is technically no-op for the default users because `spark.sql.sources.v2.bucketing.enabled` is `false` still.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo\n\n### How was this patch tested?\n\nPass the CIs.\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo\n\nCloses #46673 from szehon-ho/default_pushpart.\n\nLead-authored-by: Szehon Ho <szehon.apache@gmail.com>\nCo-authored-by: chesterxu <chesterxu@tencent.com>\nSigned-off-by: Dongjoon Hyun <dhyun@apple.com>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48329\">SPARK-48329</a>][SQL] Enable `spark.sql.sources.v2.bucketing.pushPartVal…"}},{"before":"6e6e7a00f662ae1dc7e081c9e8ec40c30ad8d3d4","after":"df50d4b309b55ca97a2409dcdda30011e3b43f87","ref":"refs/heads/master","pushedAt":"2024-05-21T11:39:00.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"zhengruifeng","name":"Ruifeng Zheng","path":"/zhengruifeng","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/7322292?s=80&v=4"},"commit":{"message":"[SPARK-48336][PS][CONNECT] Implement `ps.sql` in Spark Connect\n\n### What changes were proposed in this pull request?\nImplement `ps.sql` in Spark Connect\n\n### Why are the changes needed?\nfeature parity in Spark Connect\n\n### Does this PR introduce _any_ user-facing change?\nyes:\n\n```\nIn [4]: spark\nOut[4]: <pyspark.sql.connect.session.SparkSession at 0x105136390>\n\nIn [5]:     >>> ps.sql('''\n   ...:     ...   SELECT m1.a, m2.b\n   ...:     ...   FROM {table1} m1 INNER JOIN {table2} m2\n   ...:     ...   ON m1.key = m2.key\n   ...:     ...   ORDER BY m1.a, m2.b''',\n   ...:     ...   table1=ps.DataFrame({\"a\": [1,2], \"key\": [\"a\", \"b\"]}),\n   ...:     ...   table2=pd.DataFrame({\"b\": [3,4,5], \"key\": [\"a\", \"b\", \"b\"]}))\n/Users/ruifeng.zheng/Dev/spark/python/pyspark/pandas/utils.py:1018: PandasAPIOnSparkAdviceWarning: The config 'spark.sql.ansi.enabled' is set to True. This can cause unexpected behavior from pandas API on Spark since pandas API on Spark follows the behavior of pandas, not SQL.\n  warnings.warn(message, PandasAPIOnSparkAdviceWarning)\n/Users/ruifeng.zheng/Dev/spark/python/pyspark/pandas/utils.py:1018: PandasAPIOnSparkAdviceWarning: The config 'spark.sql.ansi.enabled' is set to True. This can cause unexpected behavior from pandas API on Spark since pandas API on Spark follows the behavior of pandas, not SQL.\n  warnings.warn(message, PandasAPIOnSparkAdviceWarning)\n\n   a  b\n0  1  3\n1  2  4\n2  2  5\n```\n\n### How was this patch tested?\n\n1. enabled UTs\n2. also manually tested all the examples\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo\n\nCloses #46658 from zhengruifeng/ps_sql.\n\nAuthored-by: Ruifeng Zheng <ruifengz@apache.org>\nSigned-off-by: Ruifeng Zheng <ruifengz@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48336\">SPARK-48336</a>][PS][CONNECT] Implement <code>ps.sql</code> in Spark Connect"}},{"before":"664c8c19dae7ca23dc9142133471d96501093bed","after":"6e6e7a00f662ae1dc7e081c9e8ec40c30ad8d3d4","ref":"refs/heads/master","pushedAt":"2024-05-21T11:35:35.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"zhengruifeng","name":"Ruifeng Zheng","path":"/zhengruifeng","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/7322292?s=80&v=4"},"commit":{"message":"[SPARK-48369][SQL][PYTHON][CONNECT] Add function `timestamp_add`\n\n### What changes were proposed in this pull request?\nAdd function `timestamp_add`\n\n### Why are the changes needed?\nthis method is missing in dataframe API due to it is not in `FunctionRegistry`\n\n### Does this PR introduce _any_ user-facing change?\nyes, new method\n\n```\n    >>> import datetime\n    >>> from pyspark.sql import functions as sf\n    >>> df = spark.createDataFrame(\n    ...     [(datetime.datetime(2016, 3, 11, 9, 0, 7), 2),\n    ...      (datetime.datetime(2024, 4, 2, 9, 0, 7), 3)], [\"ts\", \"quantity\"])\n    >>> df.select(sf.timestamp_add(\"year\", \"quantity\", \"ts\")).show()\n    +--------------------------------+\n    |timestampadd(year, quantity, ts)|\n    +--------------------------------+\n    |             2018-03-11 09:00:07|\n    |             2027-04-02 09:00:07|\n    +--------------------------------+\n```\n\n### How was this patch tested?\nadded tests\n\n### Was this patch authored or co-authored using generative AI tooling?\nno\n\nCloses #46680 from zhengruifeng/func_ts_add.\n\nAuthored-by: Ruifeng Zheng <ruifengz@apache.org>\nSigned-off-by: Ruifeng Zheng <ruifengz@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48369\">SPARK-48369</a>][SQL][PYTHON][CONNECT] Add function <code>timestamp_add</code>"}},{"before":"255e7408d95be65682b3fe5af0d5c13b77697282","after":"664c8c19dae7ca23dc9142133471d96501093bed","ref":"refs/heads/master","pushedAt":"2024-05-21T09:16:27.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-48365][DOCS] DB2: Document Mapping Spark SQL Data Types to DB2\n\n### What changes were proposed in this pull request?\n\nIn this PR, we document the mapping rules for Spark SQL Data Types to DB2 ones\n\n### Why are the changes needed?\n\ndoc improvement\n### Does this PR introduce _any_ user-facing change?\nno\n\n### How was this patch tested?\n\ndoc build\n![image](https://github.com/apache/spark/assets/8326978/40092f80-1392-48a0-96e9-8ef9cf9516e2)\n\n### Was this patch authored or co-authored using generative AI tooling?\nno\n\nCloses #46677 from yaooqinn/SPARK-48365.\n\nAuthored-by: Kent Yao <yao@apache.org>\nSigned-off-by: Kent Yao <yao@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48365\">SPARK-48365</a>][DOCS] DB2: Document Mapping Spark SQL Data Types to DB2"}},{"before":"6213fa661ffeff073f3f1a6253f7039a45f284c7","after":"255e7408d95be65682b3fe5af0d5c13b77697282","ref":"refs/heads/master","pushedAt":"2024-05-21T07:34:24.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[SPARK-48367][CONNECT] Fix lint-scala for scalafmt to detect files to format properly\n\n### What changes were proposed in this pull request?\n\nSeems like scalafmt upgrade (https://github.com/apache/spark/pull/44845) changed its output format. This PR proposes to fix `./dev/lint-scala` script to detect files format properly.\n\n### Why are the changes needed?\n\nTo fix the regression in CI.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo, dev-only.\n\n### How was this patch tested?\n\nManually ran the script.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46679 from HyukjinKwon/SPARK-48367.\n\nAuthored-by: Hyukjin Kwon <gurwls223@apache.org>\nSigned-off-by: Hyukjin Kwon <gurwls223@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48367\">SPARK-48367</a>][CONNECT] Fix lint-scala for scalafmt to detect files to…"}},{"before":"0e0134a9a48d3f58e81d26d01637dca6f2b05a92","after":"6213fa661ffeff073f3f1a6253f7039a45f284c7","ref":"refs/heads/master","pushedAt":"2024-05-21T05:47:48.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[SPARK-48300][SQL] Codegen Support for `from_xml`\n\n### What changes were proposed in this pull request?\nThe PR aims to add `Codegen Support` for `from_xml`\n\n### Why are the changes needed?\n- Improve codegen coverage.\n\n### Does this PR introduce _any_ user-facing change?\nNo.\n\n### How was this patch tested?\n- Add new UT & existed UT.\n- Pass GA.\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo.\n\nCloses #46609 from panbingkun/from_xml_codegen.\n\nLead-authored-by: panbingkun <panbingkun@baidu.com>\nCo-authored-by: Kent Yao <yao@apache.org>\nSigned-off-by: Kent Yao <yao@apache.org>","shortMessageHtmlLink":"[<a class=\"issue-link js-issue-link notranslate\" rel=\"noopener noreferrer nofollow\" href=\"https://issues.apache.org/jira/browse/SPARK-48300\">SPARK-48300</a>][SQL] Codegen Support for <code>from_xml</code>"}},{"before":"87c345d8d4ea4f11b07b8aa47e0ec6473d267d10","after":"0e0134a9a48d3f58e81d26d01637dca6f2b05a92","ref":"refs/heads/master","pushedAt":"2024-05-21T05:46:36.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"yaooqinn","name":"Kent Yao","path":"/yaooqinn","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/8326978?s=80&v=4"},"commit":{"message":"[MINOR][DOCS] correct the doc error in configuration page (fix rest to reset)\n\n### What changes were proposed in this pull request?\n1. correct the doc error in spark-docs's  `configuration` page, it should be ```reset to their initial values by RESET command```, not ```rest to their initial values by RESET command```\n\n### Why are the changes needed?\n1. correct the doc error to make the doc clearer\n\n### Does this PR introduce _any_ user-facing change?\nNo\n\n### How was this patch tested?\nno need to test, just spell a word incorrectly\n\n### Was this patch authored or co-authored using generative AI tooling?\nNo\n\nCloses #46663 from Justontheway/patch-1.\n\nAuthored-by: NOTHING <wcxontheway@126.com>\nSigned-off-by: Kent Yao <yao@apache.org>","shortMessageHtmlLink":"[MINOR][DOCS] correct the doc error in configuration page (fix rest t…"}},{"before":"2a1bdc3eda8aa20d2fc5648c879caa648375e470","after":"87c345d8d4ea4f11b07b8aa47e0ec6473d267d10","ref":"refs/heads/master","pushedAt":"2024-05-21T05:30:16.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"HyukjinKwon","name":"Hyukjin Kwon","path":"/HyukjinKwon","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6477701?s=80&v=4"},"commit":{"message":"[MINOR][TESTS] Rename test_union to test_eqnullsafe at ColumnTestsMixin\n\n### What changes were proposed in this pull request?\n\nThis PR proposes to rename `test_union` to `test_eqnullsafe` at `ColumnTestsMixin`.\n\n### Why are the changes needed?\n\nTo avoid confusion from the test name.\n\n### Does this PR introduce _any_ user-facing change?\n\nNo, test-only\n\n### How was this patch tested?\n\nCI in this PR.\n\n### Was this patch authored or co-authored using generative AI tooling?\n\nNo.\n\nCloses #46675 from HyukjinKwon/minor-test-rename1.\n\nAuthored-by: Hyukjin Kwon <gurwls223@apache.org>\nSigned-off-by: Hyukjin Kwon <gurwls223@apache.org>","shortMessageHtmlLink":"[MINOR][TESTS] Rename test_union to test_eqnullsafe at ColumnTestsMixin"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEUfJFOwA","startCursor":null,"endCursor":null}},"title":"Activity · apache/spark"}