Converting a dict to columns in a pandas DataFrame - PullRequest
July 8, 2020

I know similar questions have been posted, but none of the posted solutions I tried worked the way I expected.

I have a column in a pandas DataFrame that holds a dict, and I want to create a new DataFrame with each key as a column. Here is what one entry in talentpool_subset contains:

'{"Paradigms":["Agile Software Development","Scrum","DevOps","Serverless Architecture"],"Platforms":["Kubernetes","Linux","Windows","Eclipse","PagerDuty","Apache2","Docker","AWS EC2","Amazon Web Services (AWS)","Sysdig","Apache Kafka","AWS Lambda","Azure","OpenStack"],"Storage":["AWS S3","MongoDB","Cassandra","MySQL","PostgreSQL","AWS DynamoDB","Spring Data MongoDB","AWS RDS","MySQL/MariaDB","Datadog","Memcached"],"Languages":["Java","PHP","SQL","Bash","Perl","JavaScript","Python","C#","Go"],"Frameworks":["Ruby on Rails (RoR)","AWS HA",".NET","Serverless Framework","Selenium","CodeIgniter","Express.js"],"Other":["Cisco","Content Delivery Networks (CDN)","Kubernetes Operations (Kops)","Prometheus","VMware ESXi","Bash Scripting","Scrum Master","Infrastructure as Code","Performance Tuning","Serverless","System Administration","Linux System Administration","Code Review"],"Libraries/APIs":["Node.js","Jenkins Pipeline","jQuery","React","Selenium Grid"],"Tools":["Jenkins","Bitbucket","GitHub","AWS ECS","AWS IAM","Amazon CloudFront CDN","Terraform","AWS CloudFormation","Git Flow","Artifactory","Nginx","Grafana","Zabbix","Docker Compose","AWS CLI","AWS ECR","Chef","Jira","Git","Postfix","MongoDB Shell","Wowza","Amazon SQS","AWS SES","Subversion (SVN)","TeamCity","Microsoft Visual Studio","Google Kubernetes Engine (GKE)","VMware ESX","Fluentd","Sumo Logic","Slack","Apache ZooKeeper","AWS Fargate","Ansible","ELK (Elastic Stack)","Microsoft Team Foundation Server","Azure Kubernetes Service (AKS)"]}'

So I would like Paradigms, Platforms, Storage, Languages, etc. to each be a separate column.

I tried:
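
To make the target shape concrete, here is a minimal sketch for a single entry. It assumes the entry is a JSON string rather than a Python dict, which the surrounding quotes in the sample suggest. The `record` value is a shortened, hypothetical stand-in for the real entry.

```python
import json

import pandas as pd

# Shortened, hypothetical stand-in for one `skills` entry.
# Note that it is a str containing JSON, not a dict.
record = '{"Paradigms":["Agile Software Development","Scrum"],"Languages":["Java","Python"]}'

# Parse the JSON text into a real dict.
parsed = json.loads(record)

# A list with one dict gives a one-row frame with one column per key;
# each cell holds the list of skills for that key.
row = pd.DataFrame([parsed])

print(type(record).__name__, "->", type(parsed).__name__)
print(list(row.columns))
```

Each key becomes a column and each cell keeps the full list of values for that key.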

df = talentpool_subset.drop('skills', axis=1).join(pd.DataFrame(talentpool_subset.skills.values.tolist()))

And I still got the same result:

name    profile     location    0
0   Hugo L. Samayoa     DevOps Developer    Long Beach, CA, United States   {"Paradigms":["Agile Software Development","Sc...
1   Stepan Yakovenko    Software Developer  Novosibirsk, Novosibirsk Oblast, Russia     {"Platforms":["Debian Linux","Windows","Linux"...
2   Slobodan Gajic  Software Developer  Sremska Mitrovica, Vojvodina, Serbia    {"Platforms":["Firebase","XAMPP"],"Storage":["...
3   Bruno Furtado Montes Oliveira   Visual Studio Team Services (VSTS) Developer    Niterói - State of Rio de Janeiro, Brazil   {"Paradigms":["Agile","CQRS","Azure DevOps"],"...
4   Jennifer Aquino     Query Optimization Developer    West Ryde, New South Wales, Australia   {"Paradigms":["Automation","ETL Implementation...

Any ideas on how to solve this?
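
One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the `skills` values are JSON strings, `pd.DataFrame(...tolist())` receives a list of strings and has no keys to expand, so the text lands unchanged in a single column. The sketch below parses each string with `json.loads` before building the new frame. The two-row `talentpool_subset` here is made-up illustrative data, not the real table.

```python
import json

import pandas as pd

# Made-up stand-in for the real frame; `skills` holds JSON strings.
talentpool_subset = pd.DataFrame({
    "name": ["A", "B"],
    "skills": [
        '{"Paradigms":["Agile"],"Languages":["Java"]}',
        '{"Platforms":["Linux"],"Languages":["Go"]}',
    ],
})

# Turn every JSON string into a dict.
parsed = talentpool_subset["skills"].apply(json.loads)

# Build one column per key. Reusing the original index keeps the
# rows aligned for the join below, even if the index is not 0..n-1.
skills_df = pd.DataFrame(parsed.tolist(), index=talentpool_subset.index)

# Replace the raw `skills` column with the expanded columns.
df = talentpool_subset.drop(columns="skills").join(skills_df)

print(df)
```

Keys missing from a given row come out as NaN in that row. If some entries are empty or are not valid JSON, `json.loads` will raise an error, so those rows would need to be handled separately.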
