Tuesday, 16 August 2011

ConsistentHash implementation in C# 4.0

For those not familiar with ConsistentHash, start here: Consistent Hashing

The following is an attempt to implement the functionality using features of C# 3/4. I needed this functionality for the distributed cache project that I was working on (Available at HoC )

namespace HoC.Common
{
    public class ConsistentHash  : ICloneable
    {
        SortedList<string, string> itemCircle = new SortedList<string, string>();

        public string GetNearestItem(string key)
        {
            if (itemCircle.Count == 0)
                throw new ConsistentHashCircleEmpty();

            string keyHash = Hasher.GetHash(key);

            //find the last item that is just after the passed in key (clockwise)
            Func<KeyValuePair<string, string>, bool> stringCompare = x => (x.Key.CompareTo(keyHash) > 0);
            KeyValuePair<string, string> item = itemCircle.FirstOrDefault(stringCompare);

            if (string.IsNullOrEmpty(item.Key)) 
            {
                //if no item, fallback to the first item => traverse circle clockwise 
                return itemCircle.First().Value; 
            }

            return item.Value;
        }

        public void StoreItem(string item)
        {
            string hash = Hasher.GetHash(item);
            itemCircle[hash] = item;
        }

        public void RemoveItem(string item)
        {
            string hash = Hasher.GetHash(item);
            if (itemCircle.ContainsKey(hash))
                itemCircle.Remove(hash);
        }

        public string GetNextItemInCircle(string key)
        {
            return GetNearestItem(key);
        }

        //this function traverses anticlockwise, whereas GetNearestItem() traverses clockwise
        public string GetPreviousItemInCircle(string key)
        {
            if (itemCircle.Count == 0)
                throw new ConsistentHashCircleEmpty();

            string keyHash = Hasher.GetHash(key);

            //traverse, anticlock wise , find first item
            Func<KeyValuePair<string, string>, bool> stringCompare = x => (x.Key.CompareTo(keyHash) < 0);
            KeyValuePair<string, string> item = itemCircle.FirstOrDefault(stringCompare);

            if (string.IsNullOrEmpty(item.Key))
            {
                //if no item, fallback to the last item => traverse circle anti clockwise 
                return itemCircle.Last().Value;
            }

            return item.Value;
        }



        public object Clone()
        {
            return (ConsistentHash)MemberwiseClone();
        }
    }

    class ConsistentHashCircleEmpty : ApplicationException
    {

    }


}

Couple of drill downs:

1.) Basically each objects gets assigned a hash (Hasher class internally uses RIPEMD160Managed) so that it can be arranged in the circle and then gets stored to the internal circle. RIPEMD160Managed though a bit slower, supposedly has the lowest collision.

2.) The circle as seen is implemented using a SortedList list class - itemCircle

3.) The core functionality is implemented using the two methods:

3.1) GetNearestItem : This method traverses clockwise - basically find the object that comes up next in the sorted list after the given key.

3.2) GetPreviousItemInCircle : Opposite of GetNearestItem. Traverses anti-clockwise.

In the project that I use this internally, the objects are all serializable, hence the objects could be easily represented as a string.


No comments: