My JSON:
{"apps": {"app": [{"id": "id1","user": "hdfs"}, {"id": "id2","user": "yarn"}]}}
Schema:
root
 |-- apps: struct (nullable = true)
 |    |-- app: array (nullable = true)
 |    |    |-- element: struct (containsNull = true)
 |    |    |    |-- id: String (nullable = true)
 |    |    |    |-- user: String (nullable = true)
My code:
StructType schema = new StructType()
    .add("apps", (new StructType()
        .add("app", (new StructType()))
        .add("element", new StructType().add("id", new StringType()).add("user", new StringType())
    )));
Dataset<Row> df = sparkSession.read().schema(schema).json(<path_to_json>);
This gives me the following error:
Exception in thread "main" scala.MatchError: org.apache.spark.sql.types.StringType@1fca53a7 (of class org.apache.spark.sql.types.StringType)
df.show() should show me:
id user
id1 hdfs
id2 yarn
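
For comparison, here is a minimal sketch of how the schema could be declared. It assumes the MatchError comes from `new StringType()` creating a fresh instance instead of the `DataTypes.StringType` singleton that Spark matches on, and that `app` needs to be an array type rather than a struct with a sibling `element` field. The class name `AppsSchemaSketch` and the use of `args[0]` as the JSON path are hypothetical, chosen only for illustration.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

import static org.apache.spark.sql.functions.col;
import static org.apache.spark.sql.functions.explode;

public class AppsSchemaSketch {

    // Builds apps: struct<app: array<struct<id: string, user: string>>>.
    static StructType buildSchema() {
        // Use the DataTypes.StringType singleton, not new StringType().
        StructType appElement = new StructType()
                .add("id", DataTypes.StringType)
                .add("user", DataTypes.StringType);

        // "element" is only the label printSchema() gives to array items;
        // it is not a field, so the array is declared with createArrayType.
        StructType apps = new StructType()
                .add("app", DataTypes.createArrayType(appElement));

        return new StructType().add("apps", apps);
    }

    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("AppsSchemaSketch")
                .master("local[*]")
                .getOrCreate();

        // args[0] is a hypothetical stand-in for the JSON path.
        Dataset<Row> df = spark.read().schema(buildSchema()).json(args[0]);

        // Turn each array item into its own row, then pull out the two fields.
        df.select(explode(col("apps.app")).as("app"))
          .select("app.id", "app.user")
          .show();

        spark.stop();
    }
}
```

The `explode` step is one way to get one row per `app` entry; without it, `df.show()` would print a single row holding the whole nested `apps` struct.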