Scikit-Learn for clustering mixed data (numeric and categorical)
July 3, 2018

Can anyone help modify the working example below so that it creates clusters from mixed data?

The example uses mean shift clustering from Scikit-Learn to identify patches of similar, co-located plant species at an agronomic site.

Similar questions about using categorical values alongside numeric values in this type of problem have been asked before, but I think this example differs for the following reason: the non-numeric values in this problem cannot simply be encoded as one-and-zero dummy values. For example, we cannot one-hot encode values such as 'Aristolochia macrophylla' and 'Aristolochia durior', because species with such similar names should be grouped by their family, in addition to their geographic proximity as determined by the X and Y values. Name similarity is just as important as location when forming clusters.
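
To make the one-hot limitation concrete, here is a minimal sketch. The three-name list is an illustrative subset, not the full dataset. It shows that after one-hot encoding, every pair of distinct names ends up equally far apart, so the shared 'Aristolochia' prefix carries no weight.

```python
import numpy as np
import pandas as pd

# Illustrative subset of species names (not the full dataset)
names = ["Aristolochia macrophylla", "Aristolochia durior", "Cornus alba"]

# One column per distinct name, a single 1 per row
one_hot = pd.get_dummies(pd.Series(names)).to_numpy(dtype=float)

# Euclidean distance between every pair of one-hot rows
diff = one_hot[:, None, :] - one_hot[None, :, :]
dist = np.sqrt((diff ** 2).sum(axis=-1))

# Every pair of distinct names is sqrt(2) apart, regardless of spelling
print(dist.round(3))
```

The two Aristolochia species are exactly as far from each other as either is from Cornus alba, which is why a similarity-aware encoding or distance is needed.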

I have tried two things. The first was assigning arbitrary numeric values to the letters in the species name, so that names with similar spelling would end up closer together on a number line. I planned to apply auto-scaling to those values and feed them into the script along with the X and Y coordinates. This did not work, because different names ended up with very similar numeric values.

My other attempt to incorporate the categorical values was through Levenshtein distance. But the distance output is based on comparing only two values at a time. And if you produce an output showing the distance from each string to all the others, how can you use that result as input to the MeanShift algorithm?
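
For reference, the "distance from each string to all the others" can be collected into a square matrix. The sketch below is one way to do it, under assumptions: `levenshtein` and `name_distance_matrix` are hypothetical helper names, the edit distance is a plain dynamic-programming implementation rather than a library call, and dividing by the longer name's length is just one possible normalization.

```python
import numpy as np

def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def name_distance_matrix(names):
    """Symmetric matrix of edit distances, scaled to [0, 1] by the longer name."""
    n = len(names)
    D = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            longest = max(len(names[i]), len(names[j])) or 1  # avoid dividing by zero
            D[i, j] = D[j, i] = levenshtein(names[i], names[j]) / longest
    return D

# Illustrative subset of species names (not the full dataset)
species = ["Aristolochia macrophylla", "Aristolochia durior", "Cornus alba"]
D = name_distance_matrix(species)
print(D.round(2))
```

In this matrix the two Aristolochia names are closer to each other than either is to Cornus alba. Note that scikit-learn's MeanShift works on feature vectors and has no option for a precomputed distance matrix, so a matrix like this fits more naturally with estimators that accept `metric="precomputed"`, such as DBSCAN.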

In any case, here are the data and the working script, which so far uses only the numeric values. I would greatly appreciate any examples of how to cluster this data using the similarity of the categorical values.

Thanks

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from itertools import cycle
from sklearn.cluster import MeanShift, estimate_bandwidth

df=pd.DataFrame()

df["POINT_X"]=[-75.933169765,-75.932900302,-75.933060039,-75.932456135,-75.932334122,-75.933383845,-75.933378563,-75.933290334,-75.933302506,-75.932024669,-75.931803297,-75.931777655,-75.9317845,-75.931807731,
               -75.931794839,-75.932045113,-75.932165473,-75.932763574,-75.93216276,-75.932066326,-75.931934871,-75.932294115,-75.931852284,-75.93187799,-75.932063549,-75.932377939,-75.932466697,-75.9324484,-75.932523695,
               -75.932484492,-75.931882652,-75.932006344,-75.932228988,-75.932702486,-75.933245229,-75.933165385,-75.932990797,-75.932741398,-75.932519195,-75.932336262,-75.932264764,-75.932953569,-75.932938167,-75.933098289,
               -75.932503985,-75.932597591,-75.932551382,-75.932541384,-75.932575066,-75.932751274,-75.932869969,-75.932086405,-75.932125915,-75.932089623,-75.932229816,-75.932356252,-75.93221234,-75.932505964,-75.932455199,
               -75.932672148,-75.932823439,-75.93266258,-75.932722695,-75.93262497,-75.932613958,-75.932726832,-75.933179618,-75.933413275,-75.932911947,-75.93293013,-75.933129681,-75.933348106,-75.933328068,-75.9333501,
               -75.933133529,-75.93306104,-75.933020824,-75.933056158,-75.933261164,-75.933157803,-75.933320158,-75.93306193,-75.932935915,-75.933125758,-75.933088069,-75.933158642,-75.9331282,-75.933096121,-75.933250109,
               -75.933325084,-75.933336448,-75.934785616,-75.934843128,-75.93387422,-75.933996517,-75.934114484,-75.934560855,-75.935138185,-75.935228902,-75.935550248,-75.935326059,-75.935167468,-75.935038326,-75.934937151,
               -75.934476218,-75.934576771,-75.934556169,-75.934324709,-75.934215059,-75.934185509,-75.933996183,-75.938853557,-75.937435702,-75.93755249,-75.93709863,-75.937584727,-75.937080786,-75.93717527,-75.937158245,
               -75.937153622,-75.937255458,-75.937291351,-75.937463492,-75.937508635,-75.937568922,-75.937604,-75.937643152,-75.937538299,-75.936224493,-75.936538213,-75.936653234,-75.936672687,-75.936781092,-75.936765158,
               -75.936775048,-75.93680606,-75.936808197,-75.936753824,-75.936637658,-75.936923553,-75.936872045,-75.936871187,-75.936735385,-75.936800934,-75.936504657,-75.936528774,-75.936462867,-75.936301988,-75.936248282,
               -75.936192436,-75.935933385,-75.93679036,-75.936984567,-75.937178376,-75.937072594,-75.936212479,-75.937100912,-75.937075027,-75.93703418,-75.936553923,-75.936563813,-75.936750108,-75.935328068,-75.93329076,
               -75.933274837,-75.932816577,-75.932958943,-75.932872736,-75.933039998,-75.932930987,-75.932975423,-75.932987859,-75.932944342,-75.932984985,-75.933102016,-75.933042959,-75.935432474,-75.93539475,-75.935456177,
               -75.935413297,-75.935564812,-75.936518316,-75.935680005,-75.936558194,-75.935736741,-75.935754977,-75.935809,-75.935866569,-75.936134435,-75.936272398,-75.936252114,-75.936497277,-75.936178069,-75.933545359,
               -75.933462287,-75.933528848,-75.933456247,-75.933508043,-75.933443108,-75.933436682,-75.933293086,-75.933458306,-75.932948828,-75.933541322,-75.933719067,-75.933560447,-75.934586709,-75.934531055,-75.93416494,
               -75.933882234,-75.934830229,-75.934978045,-75.934357619,-75.934605828,-75.934754661,-75.934743056,-75.934130125,-75.935928887,-75.936286533,-75.936425628,-75.936477105,-75.935622798,-75.935607342,-75.936576534,
               -75.936823941,-75.936664385,-75.936985859,-75.936927641,-75.937655315,-75.93754798,-75.937409554,-75.937780814,-75.936920843,-75.93724831,-75.937473965,-75.937712006,-75.935331673,-75.936250622,-75.934986449,
               -75.938144151,-75.938287148,-75.938572438,-75.938677207,-75.938737192,-75.936696505,-75.9379094,-75.937601482,-75.931082221,-75.931152233,-75.931929379,-75.931886037,-75.931539305,-75.93145414,-75.931517537,
               -75.93206476,-75.931104594,-75.930886831,-75.930796839,-75.930770692,-75.934395391,-75.933485857,-75.935094793,-75.935243938,-75.934978751,-75.935325475,-75.935361712,-75.933975927,-75.933883586,-75.936299827,
               -75.934936738,-75.935015301,-75.934930658,-75.935287011,-75.935294894,-75.937784172,-75.937770775,-75.938253481,-75.93826076,-75.937784726,-75.93717805,-75.938872368,-75.938875092,-75.939336652,-75.940266037,
               -75.940331239,-75.940421181,-75.940331999,-75.940177713,-75.939332917,-75.938994759,-75.939607395,-75.939598636,-75.939560673,-75.939534037,-75.939555948,-75.939015855,-75.939243491,-75.938789939,-75.933198497,
               -75.93296926,-75.933132717,-75.932772368,-75.932419051,-75.93293841,-75.932798596,-75.932208745,-75.93206523,-75.931983351,-75.932410373,-75.931891975,-75.931568921,-75.931771254,-75.932397243,-75.931396196,
               -75.931519619,-75.932093909,-75.931942073,-75.934429867,-75.934438719,-75.93453334,-75.934266886,-75.934183909,-75.93452075,-75.933856314,-75.933881074,-75.933901224,-75.933751983,-75.933594864,-75.93358154,
               -75.93347677,-75.933895768,-75.933917682,-75.933687372,-75.933927415,-75.933739282,-75.933891053,-75.933712267,-75.93361711,-75.933901067,-75.934161321,-75.934305249,-75.934239461,-75.934211658,-75.933980238,
               -75.934018133,-75.93397582,-75.933918536,-75.933971179,-75.933877169]

df["POINT_Y"]=[38.95259201,38.952468493,38.952585964,38.952220643,38.952172451,38.952978948,38.952611101,38.952620123,38.952527583,38.952013642,38.951971095,38.951950598,38.951878617,38.951867573,38.952051039,38.952319899,
               38.952751776,38.952261808,38.951645828,38.951591344,38.951583443,38.951660428,38.951750197,38.951752666,38.951776696,38.951792968,38.951787078,38.951862848,38.951800999,38.951744805,38.951870508,38.951889649,
               38.951936158,38.95170948,38.951751749,38.951735386,38.951742727,38.951588575,38.951528477,38.951520106,38.951519453,38.951936698,38.952010261,38.952013956,38.952102079,38.952165877,38.952146088,38.952089106,
               38.952117254,38.952151545,38.949969545,38.951201998,38.951159228,38.951123753,38.950778391,38.950531943,38.950989092,38.950097211,38.950208568,38.950065183,38.950071356,38.949923603,38.9498474,38.949809668,
               38.949757376,38.949571133,38.951447294,38.95147755,38.950581745,38.950733667,38.951069352,38.951237478,38.95107276,38.95096753,38.9508122,38.950734862,38.950688169,38.950514372,38.950075351,38.950010511,38.949960875,
               38.949992064,38.95007398,38.950101272,38.950295815,38.950227769,38.950211517,38.950441255,38.950335632,38.95024686,38.950307666,38.950528546,38.950513096,38.950187972,38.950217841,38.950263645,38.950510523,
               38.950755399,38.950708302,38.950286311,38.950229957,38.950164615,38.950045229,38.949970825,38.949877169,38.949993101,38.949660647,38.949543522,38.949625589,38.949412861,38.949487811,38.949880172,38.951839048,
               38.952063455,38.949880835,38.951913953,38.949897842,38.949754481,38.949913573,38.951052934,38.951134326,38.951215119,38.951281057,38.951294341,38.951397886,38.951533389,38.951672146,38.949658462,38.950068808,
               38.949883166,38.949852263,38.949919533,38.950057898,38.950028999,38.950188832,38.950304129,38.950435138,38.950514515,38.950622084,38.950381874,38.949994828,38.950052327,38.949830647,38.949824853,38.949732702,
               38.949761675,38.949791427,38.949879419,38.949914074,38.949955099,38.951691376,38.951766177,38.951785811,38.951832242,38.951733008,38.950873805,38.951440038,38.951405074,38.951254936,38.951212584,38.951201821,
               38.951198089,38.951901959,38.94884403,38.948941748,38.949353979,38.949035993,38.949016785,38.94887402,38.948802413,38.948722997,38.94868013,38.948698153,38.948609493,38.948407937,38.948413538,38.94884251,
               38.948821237,38.948818421,38.948795076,38.949678178,38.949281509,38.949751466,38.949261269,38.949715525,38.949652229,38.949566304,38.949532396,38.949542936,38.949567821,38.94953658,38.949563742,38.948735942,
               38.952147575,38.952155751,38.951912912,38.951985954,38.952728799,38.952622921,38.952451597,38.952436249,38.95231594,38.952313127,38.951745893,38.952390373,38.952286187,38.952708734,38.951839413,38.952030386,
               38.951616852,38.951420298,38.951608998,38.952554863,38.9520134,38.951292914,38.951667791,38.952112184,38.954031241,38.953799626,38.953837241,38.953853864,38.953692287,38.953686947,38.953751245,38.953616457,
               38.95369262,38.953694331,38.953744736,38.953742862,38.953858308,38.953767308,38.953659111,38.953499777,38.953494864,38.953676808,38.953570088,38.953574927,38.953146008,38.953138966,38.953219752,38.953218684,
               38.953196026,38.953217491,38.953260642,38.953365184,38.953343071,38.953392347,38.95584336,38.955799692,38.956182326,38.95621302,38.956049617,38.957470088,38.957171152,38.956453402,38.956649954,38.956791692,
               38.957180989,38.957521592,38.955754158,38.95553646,38.955953035,38.956405511,38.956660878,38.957086511,38.957423389,38.957793854,38.957835976,38.955448024,38.955021013,38.954934154,38.954927544,38.954598007,
               38.954570833,38.954367294,38.954343,38.954497793,38.954471,38.954821256,38.954369125,38.955348715,38.955333171,38.955343991,38.955489753,38.955493927,38.955516735,38.955049181,38.955110383,38.954724398,38.954521524,
               38.954517463,38.954512208,38.954493542,38.954434212,38.954117479,38.95435162,38.954310712,38.954277052,38.954161078,38.954580606,38.954197375,38.955451505,38.955596079,38.955045523,38.955097295,38.955970146,
               38.954232335,38.95411988,38.953505553,38.955288869,38.955759644,38.955647996,38.955040953,38.954949777,38.95485026,38.954643337,38.954546745,38.953547289,38.953542137,38.953995634,38.954146947,38.954862356,
               38.953287566,38.954523419,38.954915863,38.955002144,38.954945777,38.955006524,38.95507815,38.955120243,38.953067979,38.953073084,38.953453648,38.953640022,38.953641026,38.954062633,38.954027667,38.954110137,
               38.954249401,38.953874232,38.953529725,38.953628972,38.953476826,38.95351151,38.953498365,38.953491846,38.953767787,38.953843351,38.953849161]

#Must incorporate these identifiers and cluster by similarity of species in addition to their proximity.
df["Category"]=['Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla',
                'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla',
                'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia macrophylla', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior',
                'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior',
                'Aristolochia durior', 'Aristolochia durior', 'Aristolochia durior', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa',
                'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa',
                'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Aristolochia tomentosa', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii',
                'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii', 'Buddleia davidii',
                'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana',
                'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Buddleia x weyeriana', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa',
                'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyparis obtusa', 'Chamaecyfoccia gracilis',
                'Chamaecyfoccia gracilis', 'Chamaecyfoccia gracilis', 'Chamaecyfoccia gracilis', 'Chamaecyfoccia gracilis', 'Chamaecyfoccia gracilis', 'Chamaecyfoccia gracilis', 'Chamaecyfoccia gracilis', 'Chamaecyparis pisifera',
                'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera',
                'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Chamaecyparis pisifera', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba',
                'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus alba', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia',
                'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia',
                'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia', 'Cornus albernifolia',
                'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis',
                'Cornus canadensis', 'Cornus canadensis', 'Cornus canadensis', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata',
                'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euonymus alata', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima',
                'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima',
                'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Euphorbia pulcherrima', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis',
                'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalis', 'Galanthus nivalisodoratum',
                'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum',
                'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Galanthus nivalisodoratum', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra',
                'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra',
                'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra',
                'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Hakonechloa aureola-macra', 'Ilex crenata Hetzii',
                'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Ilex crenata Hetzii',
                'Ilex crenata Hetzii', 'Ilex crenata Hetzii', 'Iberis sempervirens', 'Iberis sempervirens', 'Iberis sempervirens', 'Iberis sempervirens', 'Iberis sempervirens', 'Iberis sempervirens', 'Iberis sempervirens',
                'Iberis sempervirens', 'Iberis sempervirens', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum',
                'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Lamium maculatum', 'Mertensia virginica', 'Mertensia virginica', 'Mertensia virginica', 'Mertensia virginica', 'Mertensia virginica', 'Mertensia virginica',
                'Mertensia virginica', 'Mertensia virginica', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus',
                'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Aristolochata pseudophilus', 'Chamaecyparis duplicatus', 'Chamaecyparis duplicatus',
                'Chamaecyparis duplicatus', 'Chamaecyparis duplicatus', 'Chamaecyparis duplicatus', 'Chamaecyparis duplicatus', 'Chamaecyparis duplicatus', 'Chamaecyparis duplicatus', 'Chamaecyparis crenata Hetzii',
                'Chamaecyparis crenata Hetzii', 'Chamaecyparis crenata Hetzii', 'Chamaecyparis crenata Hetzii', 'Chamaecyparis crenata Hetzii', 'Chamaecyparis crenata Hetzii', 'Chamaecyparis crenata Hetzii', 'Chamaecyparis',
                'Chamaecyparis', 'Chamaecyparis', 'Chamaecyparis', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum',
                'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum', 'Veronicastrum virginicum',
                'Veronicastrum vulgaris', 'Veronicastrum vulgaris', 'Veronicastrum vulgaris', 'Veronicastrum vulgaris', 'Veronicastrum vulgaris', 'Veronicastrum vulgaris', 'Veronicastrum vulgaris', 'Veronicastrum vulgaris',
                'Veronicastrum vulgaris', 'Veronicastrum pulchra', 'Veronicastrum pulchra', 'Veronicastrum pulchra', 'Veronicastrum pulchra', 'Veronicastrum pulchra', 'Veronicastrum pulchra', 'Veronicastrum pulchra',
                'Veronicastrum pulchra', 'Veronicastrum pulchra', 'Veronicastrum pulchra']



#Get clusters with MeanShift
X = df[["POINT_X", "POINT_Y"]].to_numpy()  # Only using numeric values for now
bandwidth = estimate_bandwidth(X, quantile=0.0595, n_samples=15000)
ms = MeanShift(bandwidth=bandwidth, bin_seeding=True)
ms.fit(X)
labels = ms.labels_
cluster_centers = ms.cluster_centers_
labels_unique = np.unique(labels)
n_clusters_ = len(labels_unique)
print("Estimated number of clusters: %d" % n_clusters_)

#Make plot
plt.figure(1)
plt.clf()
colors = cycle('bgrcmykbgrcmykbgrcmykbgrcmyk')

for k, col in zip(range(n_clusters_), colors):
    my_members = labels == k
    cluster_center = cluster_centers[k]
    plt.plot(X[my_members, 0], X[my_members, 1], col + '.')
    plt.plot(cluster_center[0], cluster_center[1], 'o', markerfacecolor=col,
             markeredgecolor='k', markersize=14)
plt.title('Clusters found by X/Y proximity (before using categorical values): %d' % n_clusters_)
plt.show()
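
One possible direction for bringing the names into the clustering, sketched under assumptions rather than offered as a definitive solution: blend a spatial distance matrix with a name-distance matrix and pass the result to an estimator that accepts precomputed distances. This swaps MeanShift for DBSCAN, because MeanShift has no precomputed-distance option. The helper `combined_distance`, the `weight` parameter, the toy coordinates, the 0/1 genus-based name distance, and the `eps` value are all hypothetical choices made for illustration.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import pairwise_distances

def combined_distance(xy, name_dist, weight=0.5):
    """Blend spatial and name distances; spatial part is rescaled to [0, 1]."""
    spatial = pairwise_distances(xy)  # Euclidean by default
    if spatial.max() > 0:
        spatial = spatial / spatial.max()
    return weight * spatial + (1.0 - weight) * name_dist

# Toy data (hypothetical): four plants in one patch, two in a distant patch
xy = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],
               [5.0, 5.0], [5.1, 5.0]])
genus = ["A", "A", "B", "B", "A", "A"]

# Stand-in name distance: 0 for the same genus, 1 otherwise.
# A normalized edit-distance matrix could be used here instead.
name_dist = np.array([[0.0 if g == h else 1.0 for h in genus] for g in genus])

D = combined_distance(xy, name_dist, weight=0.5)
labels = DBSCAN(eps=0.2, min_samples=2, metric="precomputed").fit_predict(D)
print(labels)
```

With these toy values, the nearby plants of different genera land in separate clusters, and the distant pair of genus "A" forms its own cluster. The balance between location and name similarity is controlled entirely by `weight` and `eps`, which would need tuning on the real data.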