How to select columns with quotes in their names from a Spark DataFrame
October 4, 2018

I tried to access the columns "accession", "database", "disease", "ec.code", "omics_type", and "species" using

fileDf.select("\"accession\"","\"database\"","\"connections\"")

but I still got an error. The printed schema and the exception are:

root
 |-- "accession"    "database"  "disease"   "ec.code"   "omics_type"    "species"   "tissue"    "citations.x"   "coding.x"  "ensembl.x" "go.x"  "intact.x"  "kegg.compound.x"   "kegg.glycan.x" "kegg.pathway.x"    "kegg.reaction.x"   "metabolights.x"    "ncbi.x"    "pubchem.compound.x"    "pubchem.substance.x"   "reactome.x"    "reanalysis.x"  "rnacentral.x"  "sgd.x" "sra.x" "uniprot.x" "ajs.connectivity.score"    "citations.y"   "coding.y"  "ensembl.y" "go.y"  "intact.y"  "kegg.compound.y"   "kegg.glycan.y" "kegg.pathway.y"    "kegg.reaction.y"   "metabolights.y"    "ncbi.y"    "pubchem.compound.y"    "pubchem.substance.y"   "reactome.y"    "reanalysis.y"  "rnacentral.y"  "sgd.y" "sra.y" "uniprot.y" "connections": string (nullable = true)

Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot resolve '`"accession"`' given input columns: ["accession" "database"  "disease"   "ec.code"   "omics_type"    "species"   "tissue"    "citations.x"   "coding.x"  "ensembl.x" "go.x"  "intact.x"  "kegg.compound.x"   "kegg.glycan.x" "kegg.pathway.x"    "kegg.reaction.x"   "metabolights.x"    "ncbi.x"    "pubchem.compound.x"    "pubchem.substance.x"   "reactome.x"    "reanalysis.x"  "rnacentral.x"  "sgd.x" "sra.x" "uniprot.x" "ajs.connectivity.score"    "citations.y"   "coding.y"  "ensembl.y" "go.y"  "intact.y"  "kegg.compound.y"   "kegg.glycan.y" "kegg.pathway.y"    "kegg.reaction.y"   "metabolights.y"    "ncbi.y"    "pubchem.compound.y"    "pubchem.substance.y"   "reactome.y"    "reanalysis.y"  "rnacentral.y"  "sgd.y" "sra.y" "uniprot.y" "connections"];;
'Project ['"accession", '"database", '"connections"]
+- AnalysisBarrier
      +- Relation["accession"   "database"  "disease"   "ec.code"   "omics_type"    "species"   "tissue"    "citations.x"   "coding.x"  "ensembl.x" "go.x"  "intact.x"  "kegg.compound.x"   "kegg.glycan.x" "kegg.pathway.x"    "kegg.reaction.x"   "metabolights.x"    "ncbi.x"    "pubchem.compound.x"    "pubchem.substance.x"   "reactome.x"    "reanalysis.x"  "rnacentral.x"  "sgd.x" "sra.x" "uniprot.x" "ajs.connectivity.score"    "citations.y"   "coding.y"  "ensembl.y" "go.y"  "intact.y"  "kegg.compound.y"   "kegg.glycan.y" "kegg.pathway.y"    "kegg.reaction.y"   "metabolights.y"    "ncbi.y"    "pubchem.compound.y"    "pubchem.substance.y"   "reactome.y"    "reanalysis.y"  "rnacentral.y"  "sgd.y" "sra.y" "uniprot.y" "connections"#10] csv

How can I select columns with quotes in their names from a DataFrame in Spark?
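
One observation on the output above: the schema lists a single string column whose name is the entire header line, so the file appears to be tab-separated but was read with the default comma separator. If that is the case, the quotes are not really part of the column names, and setting the separator should let Spark split the header and strip the quotes. Below is a minimal Scala sketch under that assumption. The object name `QuotedHeaderDemo`, the helper `readTsv`, and the sample rows are hypothetical and only for illustration. The `sep`, `quote`, and `header` options are standard CSV reader options.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, SparkSession}

object QuotedHeaderDemo {

  // Hypothetical helper: read a tab-separated file whose header fields
  // are wrapped in double quotes.
  def readTsv(spark: SparkSession, path: String): DataFrame =
    spark.read
      .option("header", "true")
      .option("sep", "\t")   // the default separator is ","
      .option("quote", "\"") // the default quote character, spelled out for clarity
      .csv(path)

  // Writes a tiny sample file (made-up data), reads it back, and returns
  // the names of the selected columns.
  def selectedColumns(): Seq[String] = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("quoted-header-demo")
      .getOrCreate()
    try {
      val tsv =
        "\"accession\"\t\"database\"\t\"ec.code\"\n" +
        "\"A1\"\t\"db1\"\t\"1.1.1.1\"\n"
      val file = Files.createTempFile("sample", ".tsv")
      Files.write(file, tsv.getBytes(StandardCharsets.UTF_8))

      val fileDf = readTsv(spark, file.toString)

      // Names that contain a dot must be wrapped in backticks; otherwise
      // Spark interprets the dot as access to a nested field.
      fileDf.select("accession", "database", "`ec.code`").columns.toSeq
    } finally {
      spark.stop()
    }
  }

  def main(args: Array[String]): Unit =
    println(selectedColumns().mkString(", "))
}
```

With the header split correctly, the original call would become `fileDf.select("accession", "database", "connections")`, without escaped quotes. Backticks are needed only for the dotted names such as `ec.code` or `citations.x`.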

...