
Tuesday, July 14, 2015

How to get the most out of your BizSpark Azure credits

BizSpark is arguably one of the best deals on the internet for startups. For me the key benefit is the 5 x $150 per month of free Azure credits. That said, they are a little tricky to claim.

The first thing you need to do is claim all your BizSpark accounts and then, from each of those accounts, claim your Azure credits. This blog post describes that process, so start by doing that.

After doing this you have 5 separate Azure accounts, each with $150 per month of usage. However, what we want is one Azure account where we can see services from all of these subscriptions at once, and that requires a couple more hoops to jump through. In the end you will have one account where you can see and create services from all 5 subscriptions without having to log in and out of the Azure management portal to switch between them.

  1. The first step is to pick the one account you want to use to administrate all the other accounts.
  2. This is a bit counterintuitive, but you need to start by adding every other account as a co-administrator to the account from the first step. Yes, I am saying this correctly: all the other accounts need to be added as administrators to the main admin account (don't worry, this is temporary).
  3. The following steps need to be done for each of the accounts except for the main account from step 1.
    1. Log into the management console using one of the four auxiliary accounts and go to settings.
    2. Make sure you are on the subscription tab.
    3. Select the subscription that belongs to the account you are currently logged into. It will be the one that has the account administrator set to the account you are currently logged into. If you have done this correctly you should see two different subscriptions, one for the subscription you are logged in as and one from the account in step 1.
    4. Click the Edit Directory button at the bottom.
    5. In the dialog that appears, make sure you select the directory of the main account from step 1. It shouldn't be hard because it will be the only directory in the list and pre-selected. If you have already set up any co-administrators on the account you will be warned that they will all be removed.
    6. Add the account from step 1 as a co-administrator to this account, as described in the article linked at the top of the post.
    7. The last step is optional, but all the subscriptions will be called BizSpark and be hard to tell apart, so you might want to rename them.
      1. To do this go to the Azure account portal at https://account.windowsazure.com/Subscriptions. This page tends to be very slow, so be patient when following links.
      2. Click on the subscription name. Your screen might look different depending on how many subscriptions you have.
      3. Click on the Edit Subscription Details.
      4. Enter the new name in the dialog presented. You can also optionally change the administrator to the account from step 1; this will remove the owning account as an administrator altogether (although it is still responsible for billing).
  4. You can now remove all the other accounts as administrators of the main account, which you added in step 2, if you want.

If you follow all these steps, when you log into the account from step 1 you should be able to see all of your subscriptions at the same time in the Azure management console.

Keep in mind that this does not mean you have $750 to spend however you want. Each subscription still has a separate limit of $150, and you have to spread your services across the subscriptions as you create them to keep any of the 5 limits from running out, but at least this way you have a much better overview of the services you have provisioned in one place.

Thursday, July 9, 2015

Algorithm for distributed load balancing of batch processing

Just for reference, this algorithm doesn't work in practice. The problem is that nodes under heavy load tend to answer too slowly to hold on to their leases, causing partitions to jump between hosts. I have moved on to another algorithm that I might write up at some point if I get time. Just a fair warning to anybody who was thinking of implementing this.

I recently played around a little bit with the Azure EventHub managed service, which promises high throughput event processing at relatively low cost. At first it seems relatively easy to use in a distributed manner using the class EventProcessorHost, and that is what all the online examples provided by Microsoft are using too.

My experience is that the EventProcessorHost is basically useless. Not only does it not contain any provision that I have found for providing a retry policy to make its API calls fault tolerant, it is also designed to checkpoint its progress at relatively infrequent intervals, meaning that you have to design your application to work properly even if events are reprocessed (which is what will happen after a catastrophic failure). Worse than that, though, once you fire up more than one processing node it simply falls all over itself, constantly causing almost no processing to happen.

So if you want to use the EventHub managed service in any serious way you need to code directly against the EventHubClient interface, which means that you have to figure out your own way of distributing its partitions over the available nodes.
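
For reference, here is a rough sketch of what reading a single partition directly through EventHubClient looks like. This is my own illustration using the 2015-era WindowsAzure.ServiceBus package (the Microsoft.ServiceBus.Messaging namespace) as I remember it; the connection string, hub name and partition id are placeholders, and there is no retry logic, which as discussed above you would definitely want.

  using System;
  using System.Text;
  using Microsoft.ServiceBus.Messaging;

  internal class PartitionReader
  {
    internal static void Main()
    {
      // Placeholder connection string and event hub name.
      EventHubClient client = EventHubClient.CreateFromConnectionString(
        "Endpoint=sb://yournamespace.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=...",
        "yourhub");

      // One receiver reads one partition; a real system needs one receiver per
      // partition that this node currently owns.
      EventHubReceiver receiver = client.GetDefaultConsumerGroup().CreateReceiver("0");

      while (true)
      {
        EventData data = receiver.Receive(TimeSpan.FromSeconds(10));
        if (data == null) continue; // Timed out without receiving an event.
        Console.WriteLine(Encoding.UTF8.GetString(data.GetBytes()));
      }
    }
  }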

This leads me to an interesting problem: how do you evenly balance a workload over a number of nodes (in the nomenclature below the work is split into one or more partitions), any of which can at any time have a catastrophic failure and stop processing, without a central orchestrator?

Furthermore, I want the allocation to be sticky: if the load is already evenly distributed between the nodes, the partitions of work currently allocated to a node should stay allocated to that node.

The algorithm I have come up with uses a Redis cache to handle the orchestration, with only two hash keys and two subscriptions. Any key value store that provides publish and subscribe functionality should do, though.

The algorithm has 5 time spans that are important.

  • Normal lease time. I'm using 60 seconds for this. It is the normal time a partition will be leased without generally being challenged.
  • Maximum lease time. Must be significantly longer than the normal lease time.
  • Maximum shutdown time. The maximum time a processor has to shut down after it has lost a lease on a partition.
  • Minimum lease grab time. Must be less than the normal lease time.
  • Current leases held delay. Should be relatively short; a second should be plenty (I generally operate in the 100 to 500 millisecond range). This is multiplied by the number of partitions the node is currently processing. It can't be too low, though, or you will run into scheduler based jitter with partitions jumping between nodes.

Each node should also listen to two Redis subscriptions (basically notifications to all subscribers). Each notification carries the name of the partition being affected.

  • Grab lease subscription. Used to signal that the lease of a partition is being challenged.
  • Allocated lease subscription. Used to signal that the lease of a partition has ended when somebody is waiting to start processing it.

There are also two hash keys in use to keep track of things. Each one has a hash field per partition containing the name of the host currently owning it.

  • Lease allocation. Contains which node is currently actually processing which partition.
  • Lease grab. Used to race and indicate which node won a challenge to take over processing of a partition.

This is the general algorithm.

  1. Once per normal lease time, each node will send out a grab lease subscription notification for each partition that meets either of these conditions:
    • The node does not yet own it and there is currently no value set for the partition in the lease grab hash key.
    • It has been more than the maximum lease time since the last time a lease grab was signaled for the partition (this is required for the case where a node dies somewhere after step 3 but before step 6 has completed). In this case, also clear the lease allocation and lease grab hashes for the partition before raising the notification, since it is an indication that a node has gone offline without cleaning up.
  2. Upon receipt of this notification the timer for this publication is reset (so generally only one publication per partition will be sent during the normal lease time, though it can happen twice if two nodes send them out at the same time). Also, when this is received, each node will wait based on the following formula.
    • If the node is already processing the partition, it will wait the number of partitions it currently holds times the current leases held delay, minus half of that delay (so basically (locally active partitions - 0.5) * current leases held delay).
    • If the node is not currently processing the partition being grabbed, it should wait the number of partitions it currently holds plus a half, times the current leases held delay (in other words (locally active partitions + 0.5) * current leases held delay).
  3. Once the delay is done, try to set the lease grab hash field for the partition, conditional on it not already being set (see the sketch after this list).
    • Generally the node with the lowest delay from step 2 should win, which also means that the partitions should distribute evenly among the active nodes, since the more active partitions an individual node has, the longer it will wait in step 2 and the less likely it is to win the race to own the partition lease.
    • If a node is currently processing the partition but did not win the race in step 3, it should immediately signal its processing of that partition to gracefully shut down, and once it has shut down it should remove the lease allocation hash field for the partition. Once this is done it should also publish the allocated lease subscription notification. After that is completed this node should skip the rest of the steps.
  4. Check, by reading the lease allocation hash value, whether a node other than the winner in step 3 is currently busy processing the partition. If this is the case, wait either for the allocated lease subscription notification from step 3b, signaling that the other node has finished, or, if this does not happen, for at most the maximum shutdown time, and then start processing the partition anyway.
  5. Mark the lease allocation hash field for the partition with the node that is now processing it.
  6. After the minimum lease grab time has passed, remove the winning indication from the lease grab hash field for the partition so that it can be challenged again from step 1.
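
To make the lease grab race in steps 2 and 3 concrete, here is a rough sketch of those two steps using StackExchange.Redis. The key names, channel name, node name and delay are placeholders of my own, the locally active partition count is hard coded, and the shutdown path, step 4's wait and step 6's cleanup are left out.

  using System;
  using System.Threading;
  using StackExchange.Redis;

  internal class LeaseGrabber
  {
    private static readonly ConnectionMultiplexer redis = ConnectionMultiplexer.Connect("localhost");
    private static readonly TimeSpan leasesHeldDelay = TimeSpan.FromMilliseconds(250);
    private static readonly string nodeName = Environment.MachineName;

    // Called whenever a grab lease notification arrives for a partition.
    private static void OnGrabLease(string partition, int locallyActivePartitions, bool processingThisPartition)
    {
      // Step 2: wait longer the more partitions this node already holds; the
      // current owner of the partition gets a head start of one full delay.
      double factor = processingThisPartition
        ? locallyActivePartitions - 0.5
        : locallyActivePartitions + 0.5;
      Thread.Sleep(TimeSpan.FromMilliseconds(factor * leasesHeldDelay.TotalMilliseconds));

      // Step 3: race to set the lease grab hash field, conditional on it not
      // already being set (HSETNX semantics). The first node to get here wins.
      IDatabase db = redis.GetDatabase();
      bool won = db.HashSet("lease-grab", partition, nodeName, When.NotExists);
      if (won)
      {
        // Steps 4 and 5 would go here: wait for any previous owner to shut down,
        // then record this node in the lease allocation hash and start processing.
        db.HashSet("lease-allocation", partition, nodeName);
      }
    }

    internal static void Main()
    {
      ISubscriber sub = redis.GetSubscriber();
      // Step 1's notifications arrive on this channel; the message is the partition name.
      sub.Subscribe("grab-lease", (channel, message) =>
        OnGrabLease(message, locallyActivePartitions: 1, processingThisPartition: false));
      Thread.Sleep(Timeout.Infinite);
    }
  }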

When I run this algorithm in my tests it works exactly as I want. When a new node comes online, the workload is redistributed evenly among the new and old nodes within the normal lease time. Another important test is that if you only have one partition, the partition does not skip between the nodes but squarely lands on one node and stays there. And finally, if I kill a node without giving it any chance to do cleanup, the load is distributed to the remaining nodes after roughly the maximum lease time.

This algorithm does not in any way handle the case where the load on the different partitions is not uniform. In that case you could relatively easily tweak the formula in step 2 above and replace the locally active partition count with whatever measurement of load or performed work you wish. It will be tricky to keep the algorithm sticky with those changes, though.

Thursday, June 25, 2015

Designing for failure

One of the first things you hear when you learn how to design for the cloud is that you should always design for failure. This means that any given piece of your cloud infrastructure can stop working at any time, so you need to account for this in your architecture and gain reliability through redundancy, so that any given part of your application's infrastructure can fail without affecting the actual functionality of the site.

Here is where it gets tricky, though. Before I actually started running things in a cloud environment I assumed this meant that every once in a while a certain part of your infrastructure (for instance a VM) would go away and be replaced by another machine within a short time. That is not what designing for failure means. To be sure, this happens too, but if that were the only problem you encountered you could even design your application to deal with failures manually as they happen. In my experience, even in a relatively small cloud environment you should expect random intermittent failures at least once every few hours, and you really have to design every single piece of your code to handle failures automatically and work around them.

Every non-local service you use, even the ones designed for ultra high reliability like Amazon S3 and Azure Blob Storage, can be assumed to fail a couple of times a day if you make a lot of calls to them. The same goes for any database access or any other API.

So what are you supposed to do about it? The key thing is that whenever you do anything with a remote service you need to verify that the call succeeded, and if it didn't, keep retrying. Most failures that I have encountered are transient and tend to pass within a minute or so at the most. Design your application to be loosely coupled, and whenever a piece of the infrastructure experiences a hiccup just keep retrying for a while and usually the issue will go away.
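
As a minimal illustration of that idea (a hand-rolled sketch of my own, not the library discussed next), a retry wrapper only needs a loop, a catch and a growing delay; the try count and backoff here are arbitrary placeholder values.

  using System;
  using System.Threading.Tasks;

  internal static class Retry
  {
    // Runs an asynchronous action, retrying on failure with a linearly growing delay.
    internal static async Task<T> RunAsync<T>(Func<Task<T>> action, int maxTries = 10)
    {
      for (int attempt = 1; ; attempt++)
      {
        try
        {
          return await action();
        }
        catch (Exception)
        {
          if (attempt >= maxTries)
            throw; // Give up and let the caller see the last failure.
        }
        // Wait a little longer after each failed attempt before retrying.
        await Task.Delay(TimeSpan.FromSeconds(attempt));
      }
    }
  }

  // Usage (CallRemoteServiceAsync is a stand-in for whatever remote call you protect):
  //   var result = await Retry.RunAsync(() => CallRemoteServiceAsync());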

Microsoft has some code that will help you do this as well, called the Transient Fault Handling Block. If you are using Entity Framework everything is done for you; you just have to specify a retry execution strategy by creating a class like this.

    public class YourConfiguration : DbConfiguration 
    { 
      public YourConfiguration() 
      { 
        SetExecutionStrategy("System.Data.SqlClient",
                             () => new SqlAzureExecutionStrategy()); 
      } 
    }

Then all you have to do is add an attribute to your entity context class specifying that it should use the configuration, like so.

    [DbConfigurationType(typeof(YourConfiguration))] 
    public class YourContext : DbContext 
    { 
    }

It also comes with more generic code for retrying execution. However, I am not really happy with the interface of the retry policy functionality. Specifically, I could not figure out a way to create a generic log function that lets me log the failures so I can see what is actually requiring retries. I also don't want a gigantic log file just because, for a while, every SQL call takes 20 retries with each one being logged. I would rather get one log message per call that indicates how many retries were required before it succeeded (or not).

To that effect I created this little library. It is compatible with the transient fault handling block mentioned earlier, in that you can reuse retry strategies and transient exception detection with this library. It does improve on the logging, though, as mentioned before. Here is some sample usage.

      RetryExecutor executor = new RetryExecutor();
      executor.ExecuteAction(() =>
        { ... Do something ... });
      var val = executor.ExecuteAction(() =>
        { ... Do something ...; return val; });
      await executor.ExecuteAsync(async () =>
        { ... Do something async ... });
      var val = await executor.ExecuteAsync(async () =>
        { ... Do something async ...; return val; });

By default only ApplicationExceptions are passed through without retries. The default retry strategy will try 10 times, waiting a number of seconds equal to the number of previous tries before the next attempt (which means it will signal a failure after around 55 seconds). The logging will just write to standard output.

Saturday, June 20, 2015

Simple Soap envelope parser and generator

So I figured, as a followup to my previous post, here is a small sample project of the kind of thing I would love to find when searching online for a person who is looking for a job.

This library on GitHub is just one class to help you generate and parse a SOAP envelope, something I was surprised to find isn't actually available in the .Net framework as a stand-alone class (or at least I haven't found it).

Its use is very simple. To create a SOAP envelope you create an instance of the class SoapEnvelope, assign the Headers and `Body` properties (and possibly the Exception if you want to signal an error), and then call the ToDocument method to generate the XML document for the SOAP envelope.

To read data, simply call the method SoapEnvelope.FromStream or SoapEnvelope.FromRequest and it will return the envelope parsed from the stream or request. It does support GZip content encoding on the request.

Here is a simple round trip example of its use (For more examples check out the tests).

      SoapEnvelope envelope = new SoapEnvelope();
      envelope.Body = new[] { new XElement("BodyElement") };
      envelope.Headers = new[] { new XElement("HeaderElement") };
      XDocument doc = envelope.ToDocument();
      MemoryStream stream = new MemoryStream();
      doc.Save(stream);
      stream.Seek(0, SeekOrigin.Begin);
      SoapEnvelope otherEnvelope = SoapEnvelope.FromStream(stream);

To continue from the previous post from a few days ago: even though this example is very short, it does show a couple of things if I were evaluating the author of something similar for a job.

  • This is somebody who actually likes to code, because otherwise why would he (or she) even have taken the time to do this?
  • This is somebody that cares about the quality of their code because even though this is a super simple class it contains a small test suite to make sure that it works.
  • This person has at least a decent grasp of C# and the .Net framework and understands how to use inheritance and interfaces to create something new (if you didn't know, it is scary how few of the people who should know this stuff actually do).

Thursday, June 18, 2015

What I look for when evaluating future hires

Even though I am not a manager, and have made never becoming one a high priority among my personal development goals, I quite often chime in on evaluating future hires, currently for permanent positions and in the past for consultancy contracts. There is one thing that a lot of, if not most, software developers are not doing that I put a high premium on when evaluating new candidates.

The first thing I do when I get a resume for a prospective candidate is go to Google and search for their name. If I can't find a single programming related trace of anything they've ever done online, that is a pretty big blotch on their record from my perspective.

My thinking is that if you are good at software development and like solving problems, then even if you are straight out of school you will have done at least one of the following.

  • Asked a question you couldn't figure out, or even better provided an answer to a question for somebody else, on a site like Stack Overflow or CodeProject.
  • Created or participated in an open source project hosted on GitHub or SourceForge.
  • Created some weird obscure website somewhere (Doesn't really matter what it is or how much traffic it has).
  • Created a blog about something. It doesn't have to be old or very active, but at least you've tried.
  • Maintained some sort of presence I can find on social media, preferably with some comments related to software development. It doesn't matter if it is Twitter, Facebook, LinkedIn, Google+ or whatever.

The more of these you can check off the better, but if I can't find you at all, that is a huge red flag in my book, and you would hopefully be surprised at how common this is among would-be software developers.

The problem is that if you haven't gotten around to any of the above, to me that signifies that you aren't that into software development and it is just something you do, while generally good coders code because they like it. If I couldn't make a living from coding I would do it anyway, and most of my public presence online is based on work I've done when I wasn't collecting a paycheck for it (since most of the work you get paid for you can't just publish online).

So my advice to anybody who wants to get started in software development is to sign up for a free account on GitHub, find a small itch, and write an open source application to scratch it. Make sure the repository is associated with your real name, so that when I or anybody else involved in hiring searches for you, we will find it. I can almost guarantee that it will be worth your time.

Thursday, May 7, 2015

C# Task scheduling and concurrency

It is very hard to figure out how the new async Task API for handling threading and concurrency works in .Net 4.5. I have dug around a lot to try to find documentation on this topic and have mostly failed, so I decided to simply figure it out by writing some test applications that checked how it actually behaves. It is important to note that this is how threading works in a console .Net 4.5 application on Windows 8.1. I would not be surprised if the specific numbers of the thread model were different in a server setting, a different OS version or even a different .Net version. So without further ado, here are my findings.

First of all, if you simply start a lot of Tasks that all run for a long time, you quickly notice that by default the .Net runtime will allocate a minimum of 8 threads to run tasks. Then it gets interesting, because for every second that the task queue stays full another thread is added. This keeps going all the way up to a maximum of 1023 threads. After 1023 threads have been allocated, no more threads will be allocated for any reason, so any remaining tasks will wait to start until a previous task has completed. If a thread executes no tasks at all for 20 seconds it is removed from the thread pool.
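
A quick way to see this for yourself is a probe like the one below (my own test harness, not code from the .Net framework): queue a pile of blocking tasks and watch the process thread count climb roughly once per second. The exact numbers will differ between OS and .Net versions, as noted above.

  using System;
  using System.Diagnostics;
  using System.Threading;
  using System.Threading.Tasks;

  internal class ThreadPoolProbe
  {
    internal static void Main()
    {
      // Queue far more blocking tasks than the thread pool has threads for.
      for (int i = 0; i < 100; i++)
      {
        Task.Run(() => Thread.Sleep(Timeout.Infinite));
      }

      // Print the process thread count once a second; it should start around the
      // initial allocation and then grow by roughly one thread per second.
      for (int seconds = 0; seconds < 30; seconds++)
      {
        Console.WriteLine("{0,2}s: {1} threads", seconds, Process.GetCurrentProcess().Threads.Count);
        Thread.Sleep(1000);
      }
    }
  }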

There are also odd things happening with the order in which tasks are scheduled. For instance, if you were to run the code below it would run very slowly, because no tasks from the second for loop will be scheduled to run until the thread pool has expanded enough to run all the tasks from the first loop concurrently (so for almost 100 seconds no processing will happen).

      var firstTasks = new List<Task>();
      var secondTasks = new List<Task>();

      for (int i = 0; i < 100; i++)
      {
        int thread = i;
        firstTasks.Add( Task.Run(() =>
        {
          Thread.Sleep(100);
          // Do something else
          secondTasks[thread].Wait();
        }));
      }

      for (int i = 0; i < 100; i++)
      {
        secondTasks.Add( Task.Run(() =>
        {
          // Do something in the background.
        }));
      }

In fact, if you increase the upper bound of i from 100 to 1024, this example will never finish, since all of the 1023 available threads will be taken up by the initial tasks waiting for second tasks that will never be scheduled for execution because of thread exhaustion.

This might seem like a contrived example, but it is actually not that uncommon to end up in a similar scenario if you use non-async code inside a task in a complicated multithreaded application. If you instead write the code like the example below, it will complete almost immediately and not have any issues regardless of how many iterations of the loop you run, because when a task waits on a second task it has just created, the scheduler can inline that second task and execute it immediately on the waiting thread (as long as it hasn't already started running on another thread).

      for (int i = 0; i < 100; i++)
      {
        Task.Run(() =>
        {
          Thread.Sleep(100);
          Task secondTask = Task.Run(() =>
          {
            // Do something in the background.
          });
          // Do something else
          secondTask.Wait();
      });
      }

One last thing you have to be very careful about when it comes to tasks, especially when using the async syntax, is that once you await something there is absolutely no guarantee that execution continues on the same thread. So, for instance, this code is just waiting to create a deadlock that will be really hard to track down.

      object lockObj = new object();

      Monitor.Enter(lockObj);
      await MethodAsync();
      Monitor.Exit(lockObj);

There really is no foolproof way to handle locking here, but if you absolutely need to lock a resource while doing async coding you can use a semaphore, which does not require being released from the same thread that acquired it. This generally doesn't lead to good code though, and if you think about where your synchronization code is you can usually avoid holding locks across awaits, but it might take a little bit of extra work.
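
For completeness, here is a minimal sketch of the semaphore approach using SemaphoreSlim (from System.Threading), which has an async-friendly WaitAsync and does not care which thread releases it; MethodAsync stands in for whatever asynchronous work needs to be protected.

      private static readonly SemaphoreSlim mutex = new SemaphoreSlim(1, 1);

      private static async Task DoGuardedWorkAsync()
      {
        // WaitAsync does not block a thread pool thread while waiting.
        await mutex.WaitAsync();
        try
        {
          // Safe even if the await resumes on a different thread, since
          // SemaphoreSlim, unlike Monitor, is not tied to the acquiring thread.
          await MethodAsync();
        }
        finally
        {
          mutex.Release();
        }
      }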

Sunday, January 25, 2015

Choosing your cloud provider

When you start any coding project you generally need some sort of server capability, even if the application you're building is not a web site. When choosing your cloud provider there are several different things to think about.

First of all, if what you need is very basic and will not require a high SLA or the ability to grow with usage, you are probably better off choosing a virtual private server provider. If you are fine with a Linux box these can be had extremely cheaply. I used to be a customer of Letbox, and at the time they provided me with a virtual private server for $5/month, a price that is hard to beat. It is, however, important to realize that this is not a true VM; it is a specialized version of Linux, similar to doing a chroot but with quotas on memory and CPU usage. This means that these VMs can only run Linux. That said, the price is simply in a league of its own, usually cheaper than even spot instances of AWS EC2.

However, once you have something slightly more complicated to run you probably want to go with a "real" cloud provider. These come in two kinds. The first level is companies providing infrastructure as a service (IaaS), which basically means providing virtual machines, storage and networking for them. It is up to you to build everything you need on top of these primitives. Companies that offer only this kind of computing include Skytap, Rackspace (although Rackspace does have some platform services) and many more.

The next level up is the companies that provide platform as a service (PaaS). All of these companies also provide the infrastructure if you need it, but on top of this they provide useful packages that they run for you as managed services to make creating, deploying and running your services easier. These usually include at a minimum:

  • Load balancing of traffic to multiple servers.
  • Auto scaling of new servers to handle varying load.
  • A fast and scalable NoSQL key value store.
  • A managed transactional database.
  • Web hosting.

As I see it there are three main players in this space: Amazon Web Services (AWS), Microsoft Azure and Google App Engine.

Of these, Amazon is by far the largest. AWS started out as mainly an infrastructure as a service offering, but now has one of the most complete sets of managed services, and they have by far the largest set of data centers located all around the world, plus one region qualified for US government workloads (having an account on it requires you to be a US citizen, so I cannot use it). Their infrastructure is truly top notch, but their development tools are not great. Only a few languages have an official SDK (I myself have been missing an SDK for Perl).

Microsoft approached this space from the opposite direction from Amazon, starting out by offering specific platform solutions and tightly integrating the development and deployment of Azure applications into their development tool Visual Studio. It is the only cloud provider I am aware of that for a time did not provide IaaS at all (although they do now). The SDK and tooling for all of their products are truly excellent, especially if you are a .Net C# developer, but many other languages are supported as well. They do, unfortunately and understandably, run most of their infrastructure on Windows, which simply is not as solid as the other hypervisors out there. If you are building a solution that requires reliable, quick processing this can be a problem, especially if you have a cluster of synchronized machines. These synchronization issues usually only occur a few times a month, as the service is migrated to new machines when all the VMs running the service undergo the monthly Windows patching. However, as long as your application does not rely on tight synchronization between several systems you are unlikely to notice it.

Finally there is Google. Their solution, similar to Amazon's, is something that has grown out of their own business, and they have several offerings that are obviously just a surfacing of their internal operations, for instance BigQuery. Google's infrastructure is fantastic in regards to reliability and performance. They do, though, in my opinion offer the narrowest platform solution of the big three. What they do provide is truly top notch, and unfortunately it is also priced accordingly.

Price wise, the big three are relatively similar. If your application can take advantage of AWS spot pricing you can get away with really cheap solutions, though. Google is usually the most expensive (I say usually since prices change all the time for cloud services). One thing that could be worth investigating is whether you qualify for a Microsoft BizSpark membership, because if you do you will receive $150/month of free credits to use for almost anything in Microsoft Azure (and it also includes licenses to almost every product Microsoft has in their very extensive portfolio).

Saturday, January 24, 2015

How to get more free build or test minutes with your Visual Studio Online account

If you are one of the lucky ones who has an MSDN or a BizSpark subscription (one of the best deals around on the internet) and use the hosted build environment of Visual Studio Online, it is a problem that you only get 60 minutes of free build time a month if you want to do continuous integration (which you should!) using it. However, I just discovered, by accident, a trick to get around this limit.
  1. First of all log into your Azure management console and then go to the tab for Visual Studio Online subscriptions.
  2. Then, with the subscription you want new build or test minutes for selected, click the unlink button at the bottom. You will get a warning about losing any licenses for Visual Studio Online you have purchased through your Azure subscription, so if you have done that you can't use this trick.
  3. Then click on new at the bottom left of the management screen to link your Visual Studio Online account back to your Azure subscription.
  4. Select the Visual Studio Online account you unlinked earlier and make sure you have the correct subscription selected in the drop down (It defaults to the pay as you go subscription so you will need to change this).
  5. Press the link button in the lower right.

That's it. If you go back to your home page on Visual Studio Online you should see that you have a new allotment of build and test minutes.

DISCLAIMER: You might be violating your terms of service with Microsoft by doing this, and I also expect Microsoft to fix this at some point, so use it at your own risk.

Thursday, January 22, 2015

Problems with singletons

One of the most basic software design patterns is the singleton pattern, and you'd think this wouldn't be one that causes you problems, but in C# it can be surprisingly tricky. I just spent a couple of hours tracking down a bug because I hadn't implemented one properly: the code in question was using the first implementation below and was being accessed from multiple threads that all started at the same time.

This is the simple pattern, and it almost always works, except when the first access happens from multiple threads at the same time; when it fails it can be a really hard bug to find.

  internal class Simple
  {
    private static Simple instance;

    public static Simple Instance
    {
      get
      {
        if (instance == null)
          instance = new Simple();
        return instance;
      }
    }
  }

It should be pretty obvious that this class would have problems with concurrency so the simple solution is to just add a lock around the whole thing.

  internal class Lock
  {
    private static readonly object lockObj = new object();
    private static Lock instance;

    public static Lock Instance
    {
      get
      {
        lock (lockObj)
        {
          if (instance == null)
            instance = new Lock();
        }
        return instance;
      }
    }
  }

This class is simple and does work, but taking the lock has a performance penalty, which makes it worth looking further.

  internal class DoubleLock
  {
    private static readonly object lockObj = new object();
    private static DoubleLock instance;

    public static DoubleLock Instance
    {
      get
      {
        if (instance == null)
        {
          lock (lockObj)
          {
            if (instance == null)
              instance = new DoubleLock();
          }
        }
        return instance;
      }
    }
  }

This class is a little bit more complicated, but it has the advantage that except for the very first check there is no locking required. It does rely on the assignment of a reference to a variable being an atomic assignment, but this is fortunately a valid assumption.

However, you can also let the C# runtime create the singleton for you using a static field initializer.

  internal class Static
  {
    private static Static instance = new Static();

    public static Static Instance
    {
      get
      {
        return instance;
      }
    }
  }

This is pretty much as efficient as it gets; you even got rid of the check for null, and it is thread safe as well. It does have the disadvantage that the singleton instance is created on the first access of anything in the class, which might not be what you are looking for if there are other static members in the class. The following class is based on the previous concept but does not create the singleton until the first time the instance itself is accessed.

  internal class DoubleLazy
  {
    private static class LazyLoader
    {
      public static DoubleLazy instance = new DoubleLazy();
    }

    public static DoubleLazy Instance
    {
      get
      {
        return LazyLoader.instance;
      }
    }
  }

The nested class's static initializer will not run until you read the instance field. If you are running .Net 4.0 or later there is also a helper class, Lazy<T>, that makes this easy to do using a lambda expression.

  internal class NewLazy
  {
    private static Lazy<NewLazy> instance = new Lazy<NewLazy>(() => new NewLazy());

    public static NewLazy Instance
    {
      get
      {
        return instance.Value;
      }
    }
  }

This method also allows you to check whether you have instantiated the singleton or not (you can still do that with the first implementations, but it is not possible with any of the ones that use a static initializer). So which one should you choose? It might depend on different aspects, but if the only thing you care about is performance, I did some relatively unscientific measuring and came up with the following list.

  • The simple static initializer is the absolute fastest implementation.
  • The nested static initializer is only slightly slower.
  • The simple non thread safe solution is slightly slower.
  • The double lock solution is only slightly slower than the previous three.
  • The lazy lambda expression solution takes roughly 50% longer to run than any of the previous solutions.
  • The lock solution is roughly 150% slower than any of the first 4 solutions.

That said, even the slowest solution can still perform roughly 40 million accesses to the singleton per second from a single thread on my laptop, so unless you access it a lot it really doesn't matter.
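
If you want to reproduce this kind of unscientific measurement yourself, a minimal sketch is just a tight loop around the Instance property and a Stopwatch; the iteration count is an arbitrary placeholder.

  internal static class SingletonBenchmark
  {
    internal static void Main()
    {
      const int iterations = 100000000;
      var timer = System.Diagnostics.Stopwatch.StartNew();
      for (int i = 0; i < iterations; i++)
      {
        // Swap in whichever implementation is being measured.
        var instance = DoubleLock.Instance;
      }
      timer.Stop();
      System.Console.WriteLine("{0:N0} accesses per second",
        iterations / timer.Elapsed.TotalSeconds);
    }
  }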

Wednesday, January 21, 2015

Caching in distributed applications

In concept, caching of data is really simple. You have a small, fast storage medium of limited size, and in it you keep the subset of items from a larger, slower storage medium that you are likely to use often. The typical examples would be the on-die cache in any modern CPU, or disk caching in any modern operating system. However, even in this case it starts getting complicated when you add multiple cores and need to make sure data from one CPU isn't being read by another CPU while a newer value is only available in the first CPU's on-die cache.

When you start developing distributed applications this problem becomes incredibly complicated, and you really need to think about what kind of data you are dealing with at every moment to make sure you end up with a high performance application. Data tends to fall into one of several different categories.

  • Static data that never changes but is too large to keep in memory on every instance needing it. This is obviously the easiest kind of data, since you can just keep as many items as possible in local memory and let old ones slip out once you run low on memory or after they haven't been used for a certain time.
  • Seldom changing data where it isn't critical that it is always completely up to date. This can also be cached locally, but you have to make sure you throw away cached entries after a certain amount of time so that your data doesn't become too stale (see the sketch after this list). Changes can be written directly to a database since they happen relatively seldom.
  • Seldom changing data that always needs to be read correctly in proper transactions. Since this data changes seldom you could simply not cache it. Alternatively you could use a central in-memory cache like Redis or Memcached; you just have to make sure every place that accesses it uses the same method. It is also kind of tricky to deal with transactional changes in in-memory databases, but it can be done.
  • Rapidly changing data where it isn't critical that it is always completely up to date. This works pretty much the same as the seldom changing equivalent, except that you probably want some sort of in-memory cache for the changes so that you don't have to post every change to a normal database. You can use the in-memory database to help you build batches of changes that you post periodically instead of with every change.
  • Rapidly changing data that always needs to be read atomically. This one is tricky, and there isn't really any good way of dealing with it unless you can figure out a way for messages that need to access the same data to end up on the same node for processing every time. Usually this can be done by routing data based on a specific key and then caching locally based on that. Since you know that all messages that need the data will always end up on one node, this should be safe. You do need to make sure you properly handle the case where a processing node goes away, though.
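
For the second category above, a time-limited local cache is often all you need. Here is a minimal sketch using MemoryCache from System.Runtime.Caching; the five-minute expiration and the loadFromSource callback are placeholder assumptions of mine, not part of any particular framework.

  using System;
  using System.Runtime.Caching;

  internal static class LocalCache
  {
    private static readonly MemoryCache cache = MemoryCache.Default;

    // Returns the cached value if present, otherwise loads it and caches it with
    // an absolute expiration so that stale entries eventually get refreshed.
    internal static T GetOrLoad<T>(string key, Func<T> loadFromSource) where T : class
    {
      var cached = cache.Get(key) as T;
      if (cached != null)
        return cached;

      T value = loadFromSource();
      cache.Set(key, value, DateTimeOffset.UtcNow.AddMinutes(5));
      return value;
    }
  }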

Sunday, January 11, 2015

Check out this bag that my mom and wife made from old Dunkin Donuts coffee bags

Check out this awesome bag that my mom and wife made from used Dunkin Donuts coffee bags! I have some more pictures over here.