  Spark / SPARK-41281

Feature parity: SparkSession API in Spark Connect

Details

    • Type: Umbrella
    • Status: Resolved
    • Priority: Critical
    • Resolution: Resolved
    • Affects Version/s: 3.4.0
    • Fix Version/s: None
    • Component/s: Connect
    • Labels: None

    Description

      Implement SparkSession API in Spark Connect.

        Sub-Tasks

          1.
          Support session.sql in Connect DSL Sub-task Resolved Rui Wang
          2.
          Support session.range in Python client Sub-task Resolved Rui Wang
          3.
          Improve `session.sql` testing coverage in Python client Sub-task Resolved Rui Wang
          4.
          RemoteSparkSession should only accept one `user_id` Sub-task Resolved Unassigned
          5.
          Connect DataFrame should require RemoteSparkSession Sub-task Resolved Rui Wang
          6.
          Support local data for LocalRelation Sub-task Resolved Deng Ziming
          7.
          RemoteSparkSession should be called SparkSession Sub-task Resolved Martin Grund
          8.
          Implement DataFrame.tail Sub-task Resolved Rui Wang
          9.
          Support Range in Connect proto Sub-task Resolved Rui Wang
          10.
          Arrow based collect Sub-task Resolved Ruifeng Zheng
          11.
          Control the max size of arrow batch Sub-task Resolved Ruifeng Zheng
          12.
          SparkSession.range should treat end as optional Sub-task Resolved Martin Grund
          13.
          Support target field for UnresolvedStar Sub-task Resolved Deng Ziming
          14.
          Make `createDataFrame` support schema and more input dataset types Sub-task Resolved Ruifeng Zheng
          15.
          Implement SparkSession.stop Sub-task Resolved Hyukjin Kwon
          16.
          SparkSession.createDataFrame does not respect the column names in the row Sub-task Resolved Ruifeng Zheng
          17.
          Make `createDataFrame` support list of Rows Sub-task Resolved Ruifeng Zheng
          18.
          `createDataFrame` doesn't handle None/NaN properly Sub-task Resolved Ruifeng Zheng
          19.
          SparkSession.createDataFrame does not respect the column names in the dictionary Sub-task Resolved Hyukjin Kwon
          20.
          Implement SparkSession.sparkContext Sub-task Resolved Unassigned
          21.
          Implement DataFrame.readStream Sub-task Resolved Unassigned
          22.
          Implement creating empty DataFrame Sub-task Resolved Ruifeng Zheng
          23.
          SparkSession.createDataFrame error parity Sub-task Resolved Hyukjin Kwon
          24.
          Fix DataFrame createDataFrame handling of None Sub-task Resolved Ruifeng Zheng
          25.
          Add the unsupported function list for `session` Sub-task Resolved Ruifeng Zheng
          26.
          SparkSession.range to take float as arguments Sub-task Resolved Hyukjin Kwon
          27.
          SparkSession.createDataFrame does not support nested datatypes Sub-task Resolved Ruifeng Zheng
          28.
          Make `createDataFrame` support array Sub-task Resolved Hyukjin Kwon
          29.
          Support data type int8 Sub-task Resolved Ruifeng Zheng
          30.
          Support data type ndarray Sub-task Resolved Ruifeng Zheng
          31.
          Support aware datetimes Sub-task Resolved Ruifeng Zheng
          32.
          createDataFrame with array.array Sub-task Resolved Ruifeng Zheng
          33.
          createDataFrame should coerce types of string false to bool false Sub-task Resolved Hyukjin Kwon
          34.
          Support Pandas DF to Spark DF with Nanosecond Timestamps Sub-task Resolved Martin Grund
          35.
          Remove session.register_udf Sub-task Resolved Hyukjin Kwon
          36.
          createDataFrame should coerce types of string float to float Sub-task Resolved Ruifeng Zheng
          37.
          Implement SparkSession.conf Sub-task Resolved Takuya Ueshin
          38.
          `CreateDataFrame` should accept objects Sub-task Resolved Ruifeng Zheng
          39.
          Support parameterized SQL by sql() Sub-task Resolved Takuya Ueshin
          40.
          SparkConnectStreamHandler should manage configs properly while creating plans. Sub-task Resolved Takuya Ueshin
          41.
          createDataFrame should support DDL string as schema Sub-task Resolved Takuya Ueshin
          42.
          Remove the workaround of sql(...).collect back in PySpark tests Sub-task Resolved Hyukjin Kwon
          43.
          Support data type Duration(NANOSECOND) Sub-task Resolved Takuya Ueshin
          44.
          Implement SparkSession.udf Sub-task Resolved Unassigned
          45.
          Handle duplicate columns in `createDataFrame` Sub-task Resolved Unassigned
          46.
          createDataFrame should autogenerate missing column names Sub-task Resolved Hyukjin Kwon
          47.
          createDataFrame with UDT Sub-task Resolved Takuya Ueshin
          48.
          Support data type Timestamp(NANOSECOND, null) Sub-task Resolved Ruifeng Zheng
          49.
          Streaming createDataFrame implementation Sub-task Resolved Max Gekk
          50.
          createDataFrame doesn't work with non-nullable schema. Sub-task Resolved Unassigned
          51.
          SparkSession.sql doesn't return values from commands. Sub-task Resolved Takuya Ueshin
          52.
          Fix createDataFrame to respect both type inference and column names. Sub-task Resolved Takuya Ueshin
          53.
          createDataFrame should support duplicated nested field names Sub-task Resolved Takuya Ueshin
          54.
          Positional parameters in sql() Sub-task Resolved Max Gekk
          55.
          Support positional parameters in sql() by python connect client Sub-task Resolved Max Gekk
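
Sub-task 12 above asks that `SparkSession.range` treat `end` as optional. In PySpark, a single-argument call such as `spark.range(5)` is read as `range(0, 5)`. The sketch below illustrates only that argument convention in plain Python. It is not the Spark Connect implementation, and `normalize_range_args` and `local_range` are hypothetical names introduced for illustration.

```python
from typing import List, Optional, Tuple


def normalize_range_args(
    start: int, end: Optional[int] = None, step: int = 1
) -> Tuple[int, int, int]:
    """Hypothetical helper: resolve (start, end, step) the way range(n) is usually read."""
    if end is None:
        # Only one bound was given: treat it as the exclusive end and start from 0.
        start, end = 0, start
    return start, end, step


def local_range(start: int, end: Optional[int] = None, step: int = 1) -> List[int]:
    """Materialize the ids locally; a real client would send a Range plan to the server."""
    s, e, st = normalize_range_args(start, end, step)
    return list(range(s, e, st))


if __name__ == "__main__":
    print(local_range(3))
    print(local_range(2, 8, 3))
```

Resolving the defaults on the client side keeps the message sent to the server unambiguous: it always carries an explicit start, end, and step.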

            People

              podongfeng Ruifeng Zheng
              gurwls223 Hyukjin Kwon
              Hyukjin Kwon Hyukjin Kwon