Question: .NET collection for fast insert/delete

I need to maintain a list of connected clients that are very short-lived and frequently go up and down. Because of the potential number of clients, I need a collection that supports fast insert/delete. Suggestions?

Henrik · Answer 1 · February 23, 2009

C5 Generic Collection Library

The best implementations I have found in C# and C++ are the following. For C#/CLI, the C5 Generic Collection Library:

It is well researched, has extensible unit tests, and since February it also implements the common .NET interfaces, which makes the collections much easier to work with. The authors were featured on Channel9, and they have done extensive performance testing on the collections.

If you are using data structures anyway, these researchers have a red-black tree implementation in their library, similar to what you find if you fire up Lutz's Reflector and look at the internal structures of System.Data :p. Insert complexity: O(log(n)).
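To make the O(log(n)) insert/delete behaviour of a balanced tree concrete, here is a minimal C++ sketch built on std::set, which mainstream standard libraries implement as a red-black tree (the standard itself only requires logarithmic complexity). The ClientRegistry name and the integer client ids are assumptions made for illustration; this is not C5's API.

```cpp
#include <cassert>
#include <cstddef>
#include <set>

// Hypothetical registry of connected clients, keyed by an integer id.
class ClientRegistry {
public:
    // Returns true if the client was newly added; O(log n).
    bool connect(int clientId) {
        assert(clientId >= 0);  // assumption of this sketch: ids are non-negative
        return clients_.insert(clientId).second;
    }
    // Returns true if the client was present and has been removed; O(log n).
    bool disconnect(int clientId) {
        return clients_.erase(clientId) == 1;
    }
    // Membership test; O(log n).
    bool isConnected(int clientId) const {
        return clients_.count(clientId) != 0;
    }
    std::size_t size() const { return clients_.size(); }

private:
    std::set<int> clients_;  // ordered, balanced tree underneath
};
```

A tree like this also keeps the clients sorted, which a hash table does not; if ordering is irrelevant, a hash-based container is usually the simpler choice.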

Lock-free C++ collections

Then, if you can allow some C++ interop, absolutely need the speed and want as little overhead as possible, these lock-free ADTs from Dmitriy V'jukov are probably the best you can get, outperforming Intel's concurrent library of ADTs.

http://groups.google.com/group/lock-free

I have read some of the code, and it really is the work of someone who knows how these things fit together. VC++ can do native C++ interop without annoying boundaries. http://www.swig.org/ can otherwise help you wrap C++ interfaces for consumption in .NET, or you can do it yourself through P/Invoke.

Microsoft's Take

They have written tutorials, one of which implements a rather unpolished skip list in C# and discusses other types of data structures. (There is a better SkipList at CodeProject, which is very polished and implements the standard interfaces properly.) They also ship a few data structures with .NET, namely Hashtable / Dictionary<,> and HashSet. Of course there is also the ResizeArray / List type, together with a stack and a queue, but these are all "linear" on search.

Google perf-tools

If you want to cut the time spent on memory allocation, you can use Google's perf-tools. They are available at Google Code and contain a very interesting multi-threaded malloc implementation (TCMalloc), which shows much more consistent timing than the normal malloc. You could use it together with the lock-free structures above to really go crazy with performance.

Improving response times with memoization

You can also memoize functions to improve performance through caching, which is interesting if you are using, for example, F#. F# also allows C++ interop, so you are fine there.
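As a rough illustration of memoization (shown in C++ to match the code later in this answer, rather than F#), the sketch below caches results of a function in a hash map so that each argument is computed only once. MemoFib and the choice of Fibonacci are assumptions picked purely for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical example: memoized Fibonacci. The cache maps an argument to
// its previously computed result.
class MemoFib {
public:
    std::uint64_t operator()(unsigned n) {
        assert(n <= 93);  // fib(94) no longer fits in 64 bits
        if (n < 2) return n;
        auto hit = cache_.find(n);
        if (hit != cache_.end()) return hit->second;  // cached result
        std::uint64_t value = (*this)(n - 1) + (*this)(n - 2);
        cache_.emplace(n, value);
        return value;
    }
    // Number of distinct arguments whose result is stored.
    std::size_t cachedEntries() const { return cache_.size(); }

private:
    std::unordered_map<unsigned, std::uint64_t> cache_;
};
```

The trade-off is memory for time: the cache grows with the number of distinct arguments, so for long-running servers some eviction policy is usually needed.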

O(k)

There is also the possibility of building something yourself based on the research on Bloom filters, which allow O(k) lookup complexity, where k is a constant that depends on the number of hash functions you have implemented. Google's BigTable uses them. The filter tells you whether an element is in the set: it never misses an element that was added, but with a low probability it reports an element that was never added (see the graph on Wikipedia: it approaches P(wrong_key) -> 0.01 at a size of around 10000 elements, but you can work around this by implementing more hash functions or reducing the set).
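The O(k) lookup can be sketched as follows: a query inspects exactly k bit positions, regardless of how many keys were added. This is a minimal illustration, not the implementation listed later in this answer. The SimpleBloom name, the use of std::hash and the double-hashing scheme (h1 + i*h2) are all assumptions of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Minimal Bloom filter sketch: a bit vector plus k probes per key.
class SimpleBloom {
public:
    SimpleBloom(std::size_t bits, std::size_t probes)
        : bits_(bits, false), probes_(probes) {
        assert(bits > 0 && probes > 0);
    }
    // Sets the k bits that belong to the key.
    void add(const std::string& key) {
        for (std::size_t i = 0; i < probes_; ++i) bits_[position(key, i)] = true;
    }
    // false: the key was definitely never added.
    // true: the key was probably added (false positives are possible).
    bool mightContain(const std::string& key) const {
        for (std::size_t i = 0; i < probes_; ++i)
            if (!bits_[position(key, i)]) return false;
        return true;
    }

private:
    // i-th probe position, derived from two hashes of the key.
    std::size_t position(const std::string& key, std::size_t i) const {
        std::size_t h1 = std::hash<std::string>()(key);
        std::size_t h2 = std::hash<std::string>()(key + "#") | 1;  // salted, forced odd
        return (h1 + i * h2) % bits_.size();
    }
    std::vector<bool> bits_;
    std::size_t probes_;
};
```

Note that a plain Bloom filter like this one cannot delete keys, which matters for short-lived clients; the listing further down works around that with per-bit use counters.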

I haven't searched for a .NET implementation of this, but since the hash computations are independent, you can use the Tasks implementation from Microsoft's team to speed them up.

"My" take: randomize to reach average O(log n)

As it happens, I just did a piece of coursework involving data structures. In this case we used C++, but it is very easy to translate to C#. We built three different data structures: a Bloom filter, a skip list and a random binary search tree.

See the code and analysis after the last paragraph.

Hardware-based "collections"

Finally, to make my answer "complete": if you truly need speed, you can use something like routing tables or content-addressable memory. This gives you a very fast, in principle O(1), "hash"-to-value lookup of your data.

Random Binary Search Tree / Bloom Filter C++ code

I would really appreciate feedback if you find mistakes in the code, or just pointers on how I can do it better (or with better use of templates). Note that the Bloom filter is not what it would be in real life; normally you would not need to delete from it, and then it would be much more space-efficient than the hack I did to allow delete to be tested.

DataStructure.h

#ifndef DATASTRUCTURE_H_
#define DATASTRUCTURE_H_

class DataStructure
{
public:
    DataStructure() {countAdd=0; countDelete=0;countFind=0;}
    virtual ~DataStructure() {}

    void resetCountAdd() {countAdd=0;}
    void resetCountFind() {countFind=0;}
    void resetCountDelete() {countDelete=0;}

    unsigned int getCountAdd(){return countAdd;}
    unsigned int getCountDelete(){return countDelete;}
    unsigned int getCountFind(){return countFind;}

protected:
    unsigned int countAdd;
    unsigned int countDelete;
    unsigned int countFind;
};

#endif /*DATASTRUCTURE_H_*/

Key.h

#ifndef KEY_H_
#define KEY_H_

#include <string>
using namespace std;

const int keyLength = 128;

class Key : public string
{
public:
    Key():string(keyLength, ' ') {}
    Key(const char in[]): string(in){}
    Key(const string& in): string(in){}

    bool operator<(const string& other);
    bool operator>(const string& other);
    bool operator==(const string& other);

    virtual ~Key() {}
};

#endif /*KEY_H_*/

Key.cpp

#include "Key.h"

bool Key::operator<(const string& other)
{
    return compare(other) < 0;
}

bool Key::operator>(const string& other)
{
    return compare(other) > 0;
}

bool Key::operator==(const string& other)
{
    return compare(other) == 0;
}

BloomFilter.h

#ifndef BLOOMFILTER_H_
#define BLOOMFILTER_H_

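#include <iostream>
#include <stdio.h>   // printf, used by dump()
#include <assert.h>
#include <vector>
#include <math.h>
#include "Key.h"
#include "DataStructure.h"

// Only the low 32 bits of each unsigned long are used, whatever its real width.
#define LONG_BIT 32
// 1UL avoids shifting a signed int into its sign bit.
#define bitmask(val) (1UL << (LONG_BIT - ((val) % LONG_BIT) - 1))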

// TODO: Implement RW-locking on the reads/writes to the bitmap.

class BloomFilter : public DataStructure
{
public:
    BloomFilter(){}
    BloomFilter(unsigned long length){init(length);}
    virtual ~BloomFilter(){}

    void init(unsigned long length);
    void dump();

    void add(const Key& key);
    void del(const Key& key);

    /**
     * Returns true if the key IS BELIEVED to exist, false if it absolutely doesn't.
     */
    bool testExist(const Key& key, bool v = false);

private:
    unsigned long hash1(const Key& key);
    unsigned long hash2(const Key& key);
    bool exist(const Key& key);
    void getHashAndIndicies(unsigned long& h1, unsigned long& h2, int& i1, int& i2, const Key& key);
    void getCountIndicies(const int i1, const unsigned long h1,
        const int i2, const unsigned long h2, int& i1_c, int& i2_c);

    vector<unsigned long> m_tickBook;
    vector<unsigned int> m_useCounts;
    unsigned long m_length; // number of bits in the bloom filter
    unsigned long m_pockets; //the number of pockets

    static const unsigned long m_pocketSize; //bits in each pocket
};

#endif /*BLOOMFILTER_H_*/

BloomFilter.cpp

#include "BloomFilter.h"

const unsigned long BloomFilter::m_pocketSize = LONG_BIT;

void BloomFilter::init(unsigned long length)
{
    // With k = 2 hash functions, about k*n/ln(2) bits suit n expected keys.
    m_length = (unsigned long)((2.0*length)/log(2.0))+1;
    m_pockets = (unsigned long)(ceil(double(m_length)/m_pocketSize));

    // assign() also clears anything left over from an earlier init() call.
    m_tickBook.assign(m_pockets, 0UL);

    // my own (one use counter per bit, so that keys can be deleted again)
    m_useCounts.assign(m_pockets * m_pocketSize, 0U);
}

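unsigned long BloomFilter::hash1(const Key& key)
{
    // djb2 string hash
    unsigned long hash = 5381;
    unsigned int i=0; for (i=0; i< key.length(); i++){
        hash = ((hash << 5) + hash) + key.c_str()[i]; /* hash * 33 + c */
    }

    // Keep 32 bits: where unsigned long is 64 bits wide, a larger value would
    // lose its fractional part in the double arithmetic below and map to bit 0.
    hash &= 0xFFFFFFFFUL;
    double d_hash = (double) hash;

    // Multiplicative hashing: take the fractional part of hash * 0.618...
    // and scale it to the length of the filter.
    d_hash *= (0.5*(sqrt(5.0)-1));
    d_hash -= floor(d_hash);
    d_hash *= (double)m_length;

    return (unsigned long)floor(d_hash);
}

unsigned long BloomFilter::hash2(const Key& key)
{
    // sdbm string hash
    unsigned long hash = 0;
    unsigned int i=0; for (i=0; i< key.length(); i++){
        hash = key.c_str()[i] + (hash << 6) + (hash << 16) - hash;
    }

    hash &= 0xFFFFFFFFUL; // see hash1
    double d_hash = (double) hash;

    d_hash *= (0.5*(sqrt(5.0)-1));
    d_hash -= floor(d_hash);
    d_hash *= (double)m_length;

    return (unsigned long)floor(d_hash);
}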

bool BloomFilter::testExist(const Key& key, bool v){
    if(exist(key)) {
        if(v) cout<<"Key "<< key<<" is in the set"<<endl;
        return true;
    }else {
        if(v) cout<<"Key "<< key<<" is not in the set"<<endl;
        return false;
    }
}

void BloomFilter::dump()
{
    cout<<m_pockets<<" Pockets: ";

    // Print each pocket in hex.
    unsigned long i; for(i=0; i< m_pockets; i++) printf("%lx ", m_tickBook[i]);
    cout<<endl;
}

void BloomFilter::add(const Key& key)
{
    unsigned long h1, h2;
    int i1, i2;
    int i1_c, i2_c;

    // tested!

    getHashAndIndicies(h1, h2, i1, i2, key);
    getCountIndicies(i1, h1, i2, h2, i1_c, i2_c);

    m_tickBook[i1] = m_tickBook[i1] | bitmask(h1);
    m_tickBook[i2] = m_tickBook[i2] | bitmask(h2);

    m_useCounts[i1_c] = m_useCounts[i1_c] + 1;
    m_useCounts[i2_c] = m_useCounts[i2_c] + 1;

    countAdd++;
}

void BloomFilter::del(const Key& key)
{
    unsigned long h1, h2;
    int i1, i2;
    int i1_c, i2_c;

    if (!exist(key)) throw "You can't delete keys which are not in the bloom filter!";

    // First we need the indicies into m_tickBook and the
    // hashes.
    getHashAndIndicies(h1, h2, i1, i2, key);

    // The index of the counter is the index into the bitvector
    // times the number of bits per vector item plus the offset into
    // that same vector item.
    getCountIndicies(i1, h1, i2, h2, i1_c, i2_c);

    // We need to update the value in the bitvector in order to
    // delete the key.
    m_useCounts[i1_c] = (m_useCounts[i1_c] == 1 ? 0 : m_useCounts[i1_c] - 1);
    m_useCounts[i2_c] = (m_useCounts[i2_c] == 1 ? 0 : m_useCounts[i2_c] - 1);

    // Now, if we depleted the count for a specific bit, then set it to
    // zero, by anding the complete unsigned long with the notted bitmask
    // of the hash value
    if (m_useCounts[i1_c] == 0)
        m_tickBook[i1] = m_tickBook[i1] & ~(bitmask(h1));
    if (m_useCounts[i2_c] == 0)
        m_tickBook[i2] = m_tickBook[i2] & ~(bitmask(h2));

    countDelete++;
}

bool BloomFilter::exist(const Key& key)
{
    unsigned long h1, h2;
    int i1, i2;

    countFind++;

    getHashAndIndicies(h1, h2, i1, i2, key);

    return  ((m_tickBook[i1] & bitmask(h1)) > 0) &&
            ((m_tickBook[i2] & bitmask(h2)) > 0);
}

/*
 * Gets the values of the indicies for two hashes and places them in
 * the passed parameters. The index is into m_tickBook.
 */
void BloomFilter::getHashAndIndicies(unsigned long& h1, unsigned long& h2, int& i1,
    int& i2, const Key& key)
{
    h1 = hash1(key);
    h2 = hash2(key);
    i1 = (int)(h1/m_pocketSize); // divide first, then narrow to int
    i2 = (int)(h2/m_pocketSize);
}

/*
 * Gets the values of the indicies into the count vector, which keeps
 * track of how many times a specific bit-position has been used.
 */
void BloomFilter::getCountIndicies(const int i1, const unsigned long h1,
    const int i2, const unsigned long h2, int& i1_c, int& i2_c)
{
    i1_c = i1*m_pocketSize + h1%m_pocketSize;
    i2_c = i2*m_pocketSize + h2%m_pocketSize;
}

RBST.h

#ifndef RBST_H_
#define RBST_H_

#include <iostream>
#include <stdio.h>   // printf
#include <stdlib.h>  // rand, srand
#include <time.h>    // time
#include <assert.h>
#include <vector>
#include <math.h>
#include "Key.h"
#include "DataStructure.h"

#define BUG(str) printf("%s:%d FAILED SIZE INVARIANT: %s\n", __FILE__, __LINE__, str)

using namespace std;

class RBSTNode;
class RBSTNode: public Key
{
public:
    RBSTNode(const Key& key):Key(key)
    {
        m_left =NULL;
        m_right = NULL;
        m_size = 1U; // the size of one node is 1.
    }
    virtual ~RBSTNode(){}

    void setKey(const Key& key) { assign(key); } // replaces the key stored in this node

    RBSTNode* left(){return m_left; }
    RBSTNode* right(){return m_right;}

    RBSTNode* setLeft(RBSTNode* left) { m_left = left; return this; }
    RBSTNode* setRight(RBSTNode* right) { m_right =right; return this; }

#ifdef DEBUG
    ostream& print(ostream& out)
    {
        out << "Key(" << *this << ", m_size: " << m_size << ")";
        return out;
    }
#endif

    unsigned int size() { return m_size; }

    void setSize(unsigned int val)
    {
#ifdef DEBUG
        this->print(cout);
        cout << "::setSize(" << val << ") called." << endl;
#endif

        if (val == 0) throw "Cannot set the size below 1, then just delete this node.";
        m_size = val;
    }

    void incSize() {
#ifdef DEBUG
        this->print(cout);
        cout << "::incSize() called" << endl;
#endif

        m_size++;
    }

    void decrSize()
    {
#ifdef DEBUG
        this->print(cout);
        cout << "::decrSize() called" << endl;
#endif

        if (m_size == 1) throw "Cannot decrement size below 1, then just delete this node.";
        m_size--;
    }

#ifdef DEBUG
    unsigned int size(RBSTNode* x);
#endif

private:
    RBSTNode(){}
    RBSTNode* m_left;
    RBSTNode* m_right;
    unsigned int m_size;
};

class RBST : public DataStructure
{
public:
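    RBST() {
        m_size = 0;
        m_head = NULL;
        srand((unsigned int)time(0));
    }

    // Frees every node still in the tree. Note: RBST is not safe to copy.
    virtual ~RBST() {
        struct Local {
            static void destroy(RBSTNode* n) {
                if (n == NULL) return;
                destroy(n->left());
                destroy(n->right());
                delete n;
            }
        };
        Local::destroy(m_head);
    }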

    /**
     * Adds key to the tree and returns true when the tree grew.
     * Note: randomAdd does not look for an existing equal key, so
     * duplicates are inserted again and false is never returned in practice.
     *
     * Will also have the side-effect of printing to the console if v=true.
     */
    bool add(const Key& key, bool v=false);

    /**
     * Same semantics as other add function, but takes a string,
     * but diff name, because that'll cause an ambiguity because of inheritance.
     */
    bool addString(const string& key);

    /**
     * Deletes a key from the tree if that key is in the tree.
     * Will return
     *      true  for success and
     *      false for failure.
     *
     * Will also have the side-effect of printing to the console if v=true.
     */
    bool del(const Key& key, bool v=false);

    /**
     * Tries to find the key in the tree and will return
     *      true if the key is in the tree and
     *      false if the key is not.
     *
     * Will also have the side-effect of printing to the console if v=true.
     */
    bool find(const Key& key, bool v = false);

    unsigned int count() { return m_size; }

#ifdef DEBUG
    int dump(char sep = ' ');
    int dump(RBSTNode* target, char sep);
    unsigned int size(RBSTNode* x);
#endif

private:
    RBSTNode* randomAdd(RBSTNode* target, const Key& key);
    RBSTNode* addRoot(RBSTNode* target, const Key& key);
    RBSTNode* rightRotate(RBSTNode* target);
    RBSTNode* leftRotate(RBSTNode* target);

    RBSTNode* del(RBSTNode* target, const Key& key);
    RBSTNode* join(RBSTNode* left, RBSTNode* right);

    RBSTNode* find(RBSTNode* target, const Key& key);

    RBSTNode* m_head;
    unsigned int m_size;
};

#endif /*RBST_H_*/

RBST.cpp

#include "RBST.h"

bool RBST::add(const Key& key, bool v){
    unsigned int oldSize = m_size;
    m_head = randomAdd(m_head, key);
    if (m_size > oldSize){
        if(v) cout<<"Node "<<key<< " is added into the tree."<<endl;
        return true;
    }else {
        if(v) cout<<"Node "<<key<< " is already in the tree."<<endl;
        return false;
    }
}

bool RBST::addString(const string& key) {
    return add(Key(key), false);
}

bool RBST::del(const Key& key, bool v){
    unsigned oldSize= m_size;
    m_head = del(m_head, key);
    if (m_size < oldSize) {
        if(v) cout<<"Node "<<key<< " is deleted from the tree."<<endl;
        return true;
    }
    else {
        if(v) cout<< "Node "<<key<< " is not in the tree."<<endl;
        return false;
    }
};

bool RBST::find(const Key& key, bool v){
    RBSTNode* ret = find(m_head, key);
    if (ret == NULL){
        if(v) cout<< "Node "<<key<< " is not in the tree."<<endl;
        return false;
    }else {
        if(v) cout<<"Node "<<key<< " is in the tree."<<endl;
        return true;
    }
};

#ifdef DEBUG
int RBST::dump(char sep){
    int ret = dump(m_head, sep);
    cout<<"SIZE: " <<ret<<endl;
    return ret;
};

int RBST::dump(RBSTNode* target, char sep){
    if (target == NULL) return 0;
    int ret = dump(target->left(), sep);
    cout<< *target<<sep;
    ret ++;
    ret += dump(target->right(), sep);
    return ret;
};
#endif

/**
 * Rotates the tree around target, so that target's left
 * is the new root of the tree/subtree and updates the subtree sizes.
 *
 *(target)  b               (l) a
 *         / \      right      / \
 *        a   ?     ---->     ?   b
 *       / \                     / \
 *      ?   x                   x   ?
 *
 */
RBSTNode* RBST::rightRotate(RBSTNode* target) // private
{
    if (target == NULL) throw "Invariant failure, target is null"; // Note: may be removed once tested.
    if (target->left() == NULL) throw "You cannot rotate right around a target whose left node is NULL!";

#ifdef DEBUG
    cout    <<"Right-rotating b-node ";
    target->print(cout);
    cout    << " for a-node ";
    target->left()->print(cout);
    cout    << "." << endl;
#endif

    RBSTNode* l = target->left();
    int as0 = l->size();

    // re-order the sizes
    l->setSize( l->size() + (target->right() == NULL ? 0 : target->right()->size()) + 1); // a.size += b.right.size + 1; where b.right may be null.
    target->setSize( target->size() -as0 + (l->right() == NULL ? 0 : l->right()->size()) ); // b.size += -a_0_size + x.size where x may be null.

    // swap b's left (for a)
    target->setLeft(l->right());

    // and a's right (for b's left)
    l->setRight(target);

#ifdef DEBUG
    cout    << "A-node size: " << l->size() << ", b-node size: " << target->size() << "." << endl;
#endif

    // return the new root, a.
    return l;
};

/**
 * Like rightRotate, but the other way. See docs for rightRotate(RBSTNode*)
 */
RBSTNode* RBST::leftRotate(RBSTNode* target)
{
    if (target == NULL) throw "Invariant failure, target is null";
    if (target->right() == NULL) throw "You cannot rotate left around a target whose right node is NULL!";

#ifdef DEBUG
    cout    <<"Left-rotating a-node ";
    target->print(cout);
    cout    << " for b-node ";
    target->right()->print(cout);
    cout    << "." << endl;
#endif

    RBSTNode* r = target->right();
    int bs0 = r->size();

    // re-order the sizes
    r->setSize(r->size() + (target->left() == NULL ? 0 : target->left()->size()) + 1);
    target->setSize(target->size() -bs0 + (r->left() == NULL ? 0 : r->left()->size()));

    // swap a's right (for b's left)
    target->setRight(r->left());

    // swap b's left (for a)
    r->setLeft(target);

#ifdef DEBUG
    cout    << "Left-rotation done: a-node size: " << target->size() << ", b-node size: " << r->size() << "." << endl;
#endif

    return r;
};

/**
 * Inserts key at the root of the subtree rooted at target and returns the
 * new root of that subtree. The key is always inserted (duplicates are not
 * detected); m_size is maintained by the caller, randomAdd.
 *
 * This function is not called from public methods, it's a helper function.
 */
RBSTNode* RBST::addRoot(RBSTNode* target, const Key& key)
{
    countAdd++;

    if (target == NULL) return new RBSTNode(key);

#ifdef DEBUG
    cout << "addRoot(";
    cout.flush();
    target->print(cout) << "," << key << ") called." << endl;
#endif

    if (*target < key)
    {
        target->setRight( addRoot(target->right(), key) );
        target->incSize(); // the right subtree gained one node; the rotation below relies on this
        RBSTNode* res = leftRotate(target);
#ifdef DEBUG
        if (target->size() != size(target))
            BUG("in addRoot 1");
#endif
        return res;
    }

    target->setLeft( addRoot(target->left(), key) );
    target->incSize(); // the left subtree gained one node; the rotation below relies on this
    RBSTNode* res = rightRotate(target);
#ifdef DEBUG
    if (target->size() != size(target))
        BUG("in addRoot 2");
#endif
    return res;
};

/**
 * This function is called from the public add(key) function,
 * and returns the new root node.
 */
RBSTNode* RBST::randomAdd(RBSTNode* target, const Key& key)
{
    countAdd++;

    if (target == NULL)
    {
        m_size++;
        return new RBSTNode(key);
    }

#ifdef DEBUG
    cout << "randomAdd(";
    target->print(cout) << ", \"" << key << "\") called." << endl;
#endif

    int r = (rand() % target->size()) + 1;

    // here is where we add the target as root!
    if (r == 1)
    {
        m_size++;   // TODO: Need to lock.
        return addRoot(target, key);
    }

#ifdef DEBUG
    printf("randomAdd recursion part, ");
#endif

    // otherwise, continue recursing!
    if (*target <= key)
    {
#ifdef DEBUG
    printf("target <= key\n");
#endif
        target->setRight( randomAdd(target->right(), key) );
        target->incSize(); // TODO: Need to lock.
#ifdef DEBUG
        if (target->right()->size() != size(target->right()))
            BUG("in randomAdd 1");
#endif
    }
    else
    {
#ifdef DEBUG
    printf("target > key\n");
#endif
        target->setLeft( randomAdd(target->left(), key) );
        target->incSize(); // TODO: Need to lock.
#ifdef DEBUG
        if (target->left()->size() != size(target->left()))
            BUG("in randomAdd 2");
#endif
    }

#ifdef DEBUG
    printf("randomAdd return part\n");
#endif

    // No m_size++ here: it was already incremented once, where the node was inserted.
    return target;
};

/////////////////////////////////////////////////////////////
/////////////////////  DEL FUNCTIONS ////////////////////////
/////////////////////////////////////////////////////////////

/**
 * Deletes a node with the passed key.
 * Returns the root node.
 * Decrements m_size if something was deleted.
 */
RBSTNode* RBST::del(RBSTNode* target, const Key& key)
{
    countDelete++;

    if (target == NULL) return NULL;

#ifdef DEBUG
    cout << "del(";
    target->print(cout) << ", \"" << key << "\") called." << endl;
#endif

    RBSTNode* ret = NULL;

    // found the node to delete
    if (*target == key)
    {
        ret = join(target->left(), target->right());

        m_size--;
        delete target;

        return ret; // return the newly built joined subtree!
    }

    // store a temporary size before recursive deletion.
    unsigned int size = m_size;

    if (*target < key)  target->setRight( del(target->right(), key) );
    else                target->setLeft( del(target->left(), key) );

    // if the previous recursion changed the size, we need to decrement the size of this target too.
    if (m_size < size) target->decrSize();

#ifdef DEBUG
    if (RBST::size(target) != target->size())
        BUG("in del");
#endif

    return target;
};

/**
 * Joins the two subtrees represented by left and right
 * by randomly choosing which to make the root, weighted on the
 * size of the sub-tree.
 */
RBSTNode* RBST::join(RBSTNode* left, RBSTNode* right)
{
    if (left == NULL) return right;
    if (right == NULL) return left;

#ifdef DEBUG
    cout << "join(";
    left->print(cout);
    cout << ",";
    right->print(cout) << ") called." << endl;
#endif

    // Make left the root with probability |left| / (|left| + |right|).
    bool useLeft = (rand() % (left->size() + right->size())) < left->size();

    RBSTNode* subtree = NULL;

    if (useLeft)
    {
        subtree = join(left->right(), right);

        left->setRight(subtree)
            ->setSize((left->left() == NULL ? 0 : left->left()->size())
                        + subtree->size() + 1 );

#ifdef DEBUG
        if (size(left) != left->size())
            BUG("in join 1");
#endif

        return left;
    }

    subtree = join(left, right->left()); // keep key order: everything in left sorts before right->left()

    right->setLeft(subtree)
         ->setSize((right->right() == NULL ? 0 : right->right()->size())
                    + subtree->size() + 1);

#ifdef DEBUG
    if (size(right) != right->size())
        BUG("in join 2");
#endif

    return right;
};

/////////////////////////////////////////////////////////////
/////////////////////  FIND FUNCTIONS ///////////////////////
/////////////////////////////////////////////////////////////

/**
 * Tries to find the key in the tree starting
 * search from target.
 *
 * Returns NULL if it was not found.
 */
RBSTNode* RBST::find(RBSTNode* target, const Key& key)
{
    countFind++; // Could use private method only counting the first call.
    if (target == NULL) return NULL; // not found.
    if (*target == key) return target; // found (Key::operator== compares the strings)
    if (*target < key) return find(target->right(), key); // search for gt to the right.
    return find(target->left(), key); // search for lt to the left.
};

#ifdef DEBUG

unsigned int RBST::size(RBSTNode* x)
{
    if (x == NULL) return 0;
    return 1 + size(x->left()) + size(x->right());
}

#endif

I'll save the SkipList for another time, since good SkipList implementations can already be found via the links, and my version wasn't much different.
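For readers who want to see the idea without following the links, here is a minimal skip list sketch. It is not the version used for the measurements below; the SkipList class, the int keys, the cap of 16 levels and the coin-flip height are assumptions of this sketch. Search, insert and erase all take expected O(log n) time.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Minimal skip list of ints; duplicates are rejected.
class SkipList {
public:
    SkipList() : head_(new Node(0, kMaxLevel)), level_(1) {}
    ~SkipList() {
        for (Node* n = head_; n != nullptr;) { Node* next = n->next[0]; delete n; n = next; }
    }
    bool contains(int key) const {
        std::vector<Node*> update(kMaxLevel, head_);
        Node* n = findPredecessors(key, update);
        return n != nullptr && n->key == key;
    }
    bool insert(int key) {
        std::vector<Node*> update(kMaxLevel, head_);
        Node* n = findPredecessors(key, update);
        if (n != nullptr && n->key == key) return false;  // already present
        int height = randomHeight();
        if (height > level_) level_ = height;  // update[] already holds head_ for new levels
        Node* fresh = new Node(key, height);
        for (int i = 0; i < height; ++i) {     // splice in on every level it occupies
            fresh->next[i] = update[i]->next[i];
            update[i]->next[i] = fresh;
        }
        return true;
    }
    bool erase(int key) {
        std::vector<Node*> update(kMaxLevel, head_);
        Node* n = findPredecessors(key, update);
        if (n == nullptr || n->key != key) return false;
        for (int i = 0; i < level_; ++i)       // unlink from every level that points at n
            if (update[i]->next[i] == n) update[i]->next[i] = n->next[i];
        delete n;
        return true;
    }

private:
    enum { kMaxLevel = 16 };
    struct Node {
        int key;
        std::vector<Node*> next;  // next[i] is the successor on level i
        Node(int k, int height) : key(k), next(height, nullptr) { assert(height >= 1); }
    };
    // Fills update[i] with the last node before key on level i and
    // returns the first node on level 0 that is not smaller than key.
    Node* findPredecessors(int key, std::vector<Node*>& update) const {
        Node* n = head_;
        for (int i = level_ - 1; i >= 0; --i) {
            while (n->next[i] != nullptr && n->next[i]->key < key) n = n->next[i];
            update[i] = n;
        }
        return n->next[0];
    }
    static int randomHeight() {
        int h = 1;
        while (h < kMaxLevel && (std::rand() & 1)) ++h;  // one coin flip per extra level
        return h;
    }
    Node* head_;
    int level_;
    SkipList(const SkipList&);             // non-copyable
    SkipList& operator=(const SkipList&);
};
```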

The graphs generated from the test file look as follows:

Graph showing the time taken to add new items to the BloomFilter, RBST and SkipList: http://haf.se/content/dl/addtimer.png

Graph showing the time taken to find items in the BloomFilter, RBST and SkipList: http://haf.se/content/dl/findtimer.png

Graph showing the time taken to delete items from the BloomFilter, RBST and SkipList: http://haf.se/content/dl/deltimer.png

So, as you can see, the random binary search tree was a lot better than the SkipList. The Bloom filter lives up to its O(k).

Zach Scrivena · Answer 2 · February 23, 2009

For this, consider the hash-based collections, e.g. HashSet, Dictionary, Hashtable, which add and remove elements in constant time on average.

More information is available in the .NET Framework Developer's Guide.
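The same idea can be sketched in C++ (the language of the listings above) with std::unordered_map, the closest standard counterpart to Dictionary. The ClientInfo and ClientTable names and the address field are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Hypothetical client record; the single field is an assumption.
struct ClientInfo {
    std::string address;
};

class ClientTable {
public:
    // Average O(1). Returns false if the id is already registered.
    bool add(int id, const std::string& address) {
        assert(!address.empty());
        return clients_.emplace(id, ClientInfo{address}).second;
    }
    // Average O(1). Returns false if the id was not registered.
    bool remove(int id) { return clients_.erase(id) == 1; }
    // Returns nullptr when the id is unknown.
    const ClientInfo* find(int id) const {
        auto it = clients_.find(id);
        return it == clients_.end() ? nullptr : &it->second;
    }
    std::size_t size() const { return clients_.size(); }

private:
    std::unordered_map<int, ClientInfo> clients_;
};
```

Worst-case cost is linear when many keys collide, so the constant-time figure assumes a reasonable hash function.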

Marc Gravell · Answer 3 · February 23, 2009

Well, how much do you need to query it? A linked list has fast insert/delete (at any position), but is not as quick to search as (for example) a dictionary / sorted list. Alternatively, use a straight list with a bit/value pair in each cell, where the bit means "still has a value". Just reuse logically empty cells before appending. Delete simply clears the cell.

For reference types, you can use "null" as the empty marker here. For value types, use Nullable<T>.
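The flat list with a flag per cell can be sketched like this in C++. The SlotList name is hypothetical, and the extra vector of free indices is an assumption added so that finding an empty cell does not require a scan; the answer itself only says to reuse empty cells before appending.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Flat list whose cells carry an "in use" flag next to the value.
// Removing clears the flag; adding reuses a cleared cell before appending.
template <typename T>
class SlotList {
public:
    // Stores the value and returns the index of the cell that holds it.
    std::size_t add(const T& value) {
        if (!free_.empty()) {
            std::size_t index = free_.back();  // reuse a logically empty cell
            free_.pop_back();
            cells_[index].value = value;
            cells_[index].used = true;
            return index;
        }
        cells_.push_back(Cell{value, true});   // no empty cell: append
        return cells_.size() - 1;
    }
    // Clears the cell; the storage stays in place for later reuse.
    void remove(std::size_t index) {
        assert(index < cells_.size() && cells_[index].used);
        cells_[index].used = false;
        free_.push_back(index);
    }
    bool hasValue(std::size_t index) const {
        return index < cells_.size() && cells_[index].used;
    }
    const T& at(std::size_t index) const {
        assert(hasValue(index));
        return cells_[index].value;
    }

private:
    struct Cell { T value; bool used; };
    std::vector<Cell> cells_;
    std::vector<std::size_t> free_;  // indices of logically empty cells
};
```

Insert and delete are O(1) here, but looking a client up by anything other than its index still means scanning the cells.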

baretta · Answer 4 · February 23, 2009

You can use a Hashtable or a strongly typed Dictionary. The client class can override GetHashCode to provide faster hash code generation, or, if you use Hashtable, you can optionally use an IHashCodeProvider.
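A rough C++ analogue of overriding GetHashCode is to hand the container a custom hash functor. In the sketch below the Client type, its sessionId field (assumed unique per client) and the addClient helper are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical client type; sessionId is assumed to be unique per client.
struct Client {
    unsigned sessionId;
    std::string name;
    bool operator==(const Client& other) const { return sessionId == other.sessionId; }
};

// Cheap hash that looks only at the unique id instead of hashing every field.
struct ClientHash {
    std::size_t operator()(const Client& c) const {
        return std::hash<unsigned>()(c.sessionId);
    }
};

typedef std::unordered_set<Client, ClientHash> ClientSet;

// Returns true if the client was added, false if its id is already present.
inline bool addClient(ClientSet& set, const Client& c) {
    assert(!c.name.empty());
    return set.insert(c).second;
}
```

Equality and hashing must agree: two clients that compare equal have to produce the same hash, which holds here because both use only sessionId.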

Mike Bonnell · Answer 5 · February 23, 2009

I was very impressed by the Channel 9 interview with Peter Sestoft:

channel9.msdn.com/shows/Going+Deep/Peter-Sestoft-C5-Generic-Collection-Library-for-C-and-CLI/

He is a professor at the IT University of Copenhagen who helped create the C5 Generic Collection Library:

www.itu.dk/research/c5/

It may be overkill, or it may be just the speedy collection you were looking for...

HTH,

-Mike

Chris S · Answer 6 · February 23, 2009

How do you need to find the clients? Is a Tuple/Dictionary necessary? You will most likely find something that solves your problem in Jeffrey Richter's Power Collections library, which has lists, trees and most data structures you can think of.
