Use from_json:
df.withColumn("data_struct", from_json($"data", StructType(Array(StructField("month", StringType), StructField("day", StringType)))))
On Spark 2.4.0 I get the following:
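// Run in spark-shell, where spark.implicits._ (needed for toDF and $"...") is already in scope
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types.{StructType, StructField, StringType}
val df = Seq("""[{"month":"Jan","day":"monday"}]""").toDF("data")
val schema = StructType(Array(StructField("month", StringType), StructField("day", StringType)))
val df2 = df.withColumn("data_struct", from_json($"data", schema))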
df2.show
+--------------------+-------------+
|                data|  data_struct|
+--------------------+-------------+
|[{"month":"Jan","...|[Jan, monday]|
+--------------------+-------------+
df2.printSchema
root
 |-- data: string (nullable = true)
 |-- data_struct: struct (nullable = true)
 |    |-- month: string (nullable = true)
 |    |-- day: string (nullable = true)
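Note that the sample string is a JSON array, while the schema above describes a single struct. If a row can hold more than one object, one possible approach is to pass an ArrayType schema to from_json and then explode the result into one row per element. Below is a minimal sketch of that idea, not something verified on every Spark version. The object name FromJsonArrayExample, the helper explodeJsonArray, the local SparkSession settings, and the second sample record are all assumptions made for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, explode, from_json}
import org.apache.spark.sql.types.{ArrayType, StringType, StructField, StructType}

// Hypothetical example object; names are illustrative only.
object FromJsonArrayExample {
  // Schema of a single element of the JSON array
  val element: StructType = StructType(Seq(
    StructField("month", StringType),
    StructField("day", StringType)))

  // Parses the string column "data" as an array of structs
  // and returns one row per array element with columns month and day.
  def explodeJsonArray(df: DataFrame): DataFrame =
    df.withColumn("items", from_json(col("data"), ArrayType(element)))
      .select(explode(col("items")).as("item"))
      .select("item.month", "item.day")

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, for running outside spark-shell
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("from-json-array")
      .getOrCreate()
    import spark.implicits._

    // Made-up input with two objects in the array
    val df = Seq(
      """[{"month":"Jan","day":"monday"},{"month":"Feb","day":"tuesday"}]"""
    ).toDF("data")

    explodeJsonArray(df).show()
    spark.stop()
  }
}
```

In spark-shell, only the schema and the body of explodeJsonArray are needed, since the session and implicits already exist there. If only the first element matters, `col("items").getItem(0)` can be used instead of explode.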