Remove the outer struct column in a Spark DataFrame - PullRequest
0 votes
/ March 28, 2020

The current schema of my Spark DataFrame is shown below. Is there a way to remove the outer struct column (DTC_CAN_SIGNALS) so that SGNL becomes a top-level column?

**Current Schema**:

root
|-- DTC: string (nullable = true)
|-- DTCTS: long (nullable = true)
|-- VIN: string (nullable = true)
|-- DTC_CAN_SIGNALS: struct (nullable = true)
|    |-- SGNL: array (nullable = true)
|    |    |-- element: struct (containsNull = true)
|    |    |    |-- SN: string (nullable = true)
|    |    |    |-- ST: long (nullable = true)
|    |    |    |-- SV: double (nullable = true)


**Expected Schema**:

root
|-- DTC: string (nullable = true)
|-- DTCTS: long (nullable = true)
|-- VIN: string (nullable = true)
|-- SGNL: array (nullable = true)
|    |-- element: struct (containsNull = true)
|    |    |-- SN: string (nullable = true)
|    |    |-- ST: long (nullable = true)
|    |    |-- SV: double (nullable = true)

1 Answer

1 vote
/ March 28, 2020

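Just select the nested column out of the struct. Note that withColumn only adds SGNL next to the struct (drop DTC_CAN_SIGNALS afterwards), and a bare select returns SGNL alone, so list the other columns as well, as in the full code further down. For example: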

df.withColumn("SGNL", col("DTC_CAN_SIGNALS.SGNL"))
or
df.select("DTC_CAN_SIGNALS.SGNL")

Code:

// sparkSession is assumed to be an existing SparkSession instance
import sparkSession.implicits._
import org.apache.spark.sql.functions._

val data = Seq(
  ("DTC", 42L, "VIN")
).toDF("DTC", "DTCTS", "VIN")


// Build a nested struct column matching the schema from the question
val df = data.withColumn(
  "DTC_CAN_SIGNALS",
  struct(
    array(
      struct(lit("sn1").as("SN"), lit(42L).as("ST"), lit(42.0D).as("SV"))
    ).as("SGNL")
  )
)

df.show()
df.printSchema()

// alternatively
// val resDf = df
//   .withColumn("SGNL", col("DTC_CAN_SIGNALS.SGNL"))
//   .drop("DTC_CAN_SIGNALS")

val resDf = df.select("DTC", "DTCTS", "VIN", "DTC_CAN_SIGNALS.SGNL")

resDf.show()
resDf.printSchema()

Output:

+---+-----+---+-------------------+
|DTC|DTCTS|VIN|    DTC_CAN_SIGNALS|
+---+-----+---+-------------------+
|DTC|   42|VIN|[[[sn1, 42, 42.0]]]|
+---+-----+---+-------------------+

root
 |-- DTC: string (nullable = true)
 |-- DTCTS: long (nullable = false)
 |-- VIN: string (nullable = true)
 |-- DTC_CAN_SIGNALS: struct (nullable = false)
 |    |-- SGNL: array (nullable = false)
 |    |    |-- element: struct (containsNull = false)
 |    |    |    |-- SN: string (nullable = false)
 |    |    |    |-- ST: long (nullable = false)
 |    |    |    |-- SV: double (nullable = false)

+---+-----+---+-----------------+
|DTC|DTCTS|VIN|             SGNL|
+---+-----+---+-----------------+
|DTC|   42|VIN|[[sn1, 42, 42.0]]|
+---+-----+---+-----------------+

root
 |-- DTC: string (nullable = true)
 |-- DTCTS: long (nullable = false)
 |-- VIN: string (nullable = true)
 |-- SGNL: array (nullable = false)
 |    |-- element: struct (containsNull = false)
 |    |    |-- SN: string (nullable = false)
 |    |    |-- ST: long (nullable = false)
 |    |    |-- SV: double (nullable = false)
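
If the struct had more than one field, listing each one by hand would get tedious. A minimal sketch of a more general variant follows, assuming a local SparkSession and a hypothetical helper named `promote` (neither is part of the original answer). It relies on the `"structColumn.*"` star expansion in `select`, which replaces the struct with all of its direct fields.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{array, col, lit, struct}

object PromoteStructFields {

  // Hypothetical helper: keeps every other column and replaces `structCol`
  // with its direct fields. Assumes column names contain no dots.
  def promote(df: DataFrame, structCol: String): DataFrame = {
    val kept = df.columns.filter(_ != structCol).map(col)
    df.select(kept :+ col(s"$structCol.*"): _*)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration only
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("promote-struct-fields")
      .getOrCreate()
    import spark.implicits._

    // Same sample data as in the answer above
    val df = Seq(("DTC", 42L, "VIN")).toDF("DTC", "DTCTS", "VIN")
      .withColumn(
        "DTC_CAN_SIGNALS",
        struct(
          array(
            struct(lit("sn1").as("SN"), lit(42L).as("ST"), lit(42.0).as("SV"))
          ).as("SGNL")
        )
      )

    val flat = promote(df, "DTC_CAN_SIGNALS")
    flat.printSchema()

    spark.stop()
  }
}
```

For the single-field struct in the question this should give the same columns as the explicit `select` above (DTC, DTCTS, VIN, SGNL). Promoted fields are appended after the kept columns, so the column order can differ from the original when the struct is not the last column.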
...