I'm Chris Hannon, a software engineer in Salt Lake City, UT.
Tuesday Quiz Day
By: Chris
Filed under: Quiz Of The Day

public delegate void StringAction(string value);

public class Base
{
    public void Print(string value) { Console.WriteLine("Base"); }
}
public class Derived : Base
{
    public void Print(object value) { Console.WriteLine("Derived"); }
}

And here is some code that calls that code.

var derived = new Derived();
var action = new StringAction(derived.Print);
action("Hello");

What will this print out and why?

This will print out "Derived". There are two things to consider here. The first is that the most derived applicable method will be called in preference over less derived methods that might be more specific. In other words, if you called derived.Print("Hi") directly, it would print out "Derived" because it’s both applicable and more derived than the Base version.

The second thing to consider is the issue of variance. The delegate specifically takes a string as a parameter, so it needs to find the most applicable method group that is also assignment compatible with the delegate signature.

In C# 1.0, the code above would have printed out "Base". Variance in method group conversions was introduced in C# 2.0, which was a breaking change.

In this code, the delegate signature takes a string parameter and returns void, so the Print(string) method is definitely applicable. With the addition of parameter contravariance, Print(object) is now applicable as well, and because it is declared on the more derived type it will be called instead. Presumably because this was a breaking change, the compiler will give you a warning under these circumstances:

Delegate 'StringAction' bound to 'Derived.Print(object)' instead of 'Base.Print(string)' because of new language rules
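
If you actually want the Base version, one way to get it is to go through a Base-typed reference. The following is a minimal sketch that reuses the quiz's types; the OverloadDemo class and its Run method are names made up for this illustration.

```csharp
using System;

public delegate void StringAction(string value);

public class Base
{
    public void Print(string value) { Console.WriteLine("Base"); }
}

public class Derived : Base
{
    public void Print(object value) { Console.WriteLine("Derived"); }
}

// Hypothetical demo class, not part of the original quiz.
public static class OverloadDemo
{
    public static void Run()
    {
        var derived = new Derived();

        // Direct call: Print(object) on Derived is applicable, so it wins.
        derived.Print("Hi");

        // Through a Base reference only Print(string) is visible.
        ((Base)derived).Print("Hi");

        // The same trick works for the delegate conversion.
        StringAction viaBase = ((Base)derived).Print;
        viaBase("Hi");
    }

    public static void Main()
    {
        Run(); // prints "Derived", "Base", "Base" on separate lines
    }
}
```

Neither Print method is virtual, so the static type of the reference decides which methods are candidates.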


This was taken from Jon Skeet's answer on Stack Overflow.

I do not claim that these originated from me (although the explanations are written by me). These have come from various people and places, including Eric Lippert, Jon Skeet, Joe Duffy, Stack Overflow and too many blogs and articles to count. Where I can remember, I will provide a link to where it came from. If I can't, forgive me (or provide me with a link to the original location and I will try to include it).
TFS - Unshelving To Another Branch
By: Chris
Filed under: Uncategorized
Every so often, when using Team Foundation Server, I need to migrate a shelveset to another branch. I may have started working on code in one branch, the task shifts priority and gets put in a different branch and then my code needs to be moved. I don't want to lose anything, but checking in, merging and trying to roll back is an obnoxious thing to do.

TFS Power Tools is filled with a lot of useful things, including a command for shelveset migration. After installing TFPT, I shelve my code and open up a command prompt.

"C:\Program Files (x86)\Microsoft Team Foundation Server 2010 Power Tools\TFPT.exe" unshelve MY_SHELVESET /migrate /source:"$/SOURCE_PATH" /target:"$/DESTINATION_PATH"

At this point, there are a few things to be aware of.

1) TFPT needs to be executed from within a folder that's mapped to the workspace in question. For example, if I've mapped the workspace root ($) to D:\TFS then I need to go into at least D:\TFS in the console and then run tfpt.exe to be successful.

2) The server paths must be correct. It's easy to mistype them, and TFPT won't tell you that a path is wrong. A mistyped source path will result in unshelving the shelveset to the original source location (not the destination or the mistyped source). A mistyped destination path will result in unshelving to that path, which doesn't actually exist and will cause issues when attempting to check in. It's a good idea to just open up the Source Control Explorer in Visual Studio (or however you examine your repository) and copy/paste the location into the console.

After that works, merge conflicts can be resolved as normal, with automerge or a manual merge with your tool of choice.
Thursday Quiz Day
By: Chris
Filed under: Quiz Of The Day

struct A { B b; }
struct B { C c; }
struct C { A a; }

Is this legal? Why or why not?

This is not legal and will cause a compile error. Structs are value types and must have a non-null default value, so they are limited in their direct dependencies. This particular layout would cause an infinite cycle of allocation: Creating A would require the creation of B (in its default state), which would require creation of C, which would start over at A again. If any of these were classes, this would be legal because the default value for a class would be null and would not require immediate instantiation.
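
To see the difference a class makes, here is a small sketch where C has been changed from a struct to a class. The public fields, the StructCycleDemo class and its method name are assumptions made for this illustration.

```csharp
using System;

struct A { public B b; }
struct B { public C c; }

// C is now a class, so the field 'c' in B is just a reference
// that defaults to null. The cycle no longer needs infinite storage.
class C { public A a; }

// Hypothetical demo class, not part of the original quiz.
public static class StructCycleDemo
{
    public static bool DefaultLinkIsNull()
    {
        A a = new A();        // default-initialised value
        return a.b.c == null; // the chain stops at the null reference
    }

    public static void Main()
    {
        Console.WriteLine(DefaultLinkIsNull()); // True
    }
}
```

The compiler will warn that the fields are never assigned, which is fine for a demonstration.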


This example was taken from the C# 4.0 language specification. The relevant section is Section 11.3.1 (Value semantics).

Thinking Concurrently, Part 3
By: Chris
Filed under: Random Thoughts, Software
Last time, I said we’d take a look at how to apply the techniques we talked about to the problematic code in the first post. Here’s that code again.

List<String> formattedItems = new List<String>();
foreach (Item someItem in sharedItemsCollection)
{
    object obj = new object();
    //Make sure that nobody can touch the object while we’re formatting
    lock (obj)
    {
        Formatter itemFormatter = someItem.GetFormatter();
        formattedItems.Add(itemFormatter.Format(someItem));
    }
}

And here are the things we could do.

1) Be selfish and don’t share
2) Don’t share the actual item, just a copy
3) Share the item but don’t allow changes to it
4) Share the item, allow changes but minimize the sharing time

Given that the original code was actually being run in a single-threaded manner, we could assume that we’re already at #1, remove all attempts at locking and call it good. How could we best handle the concurrency issues if this code was actually being run under multiple threads, though?

We really can’t make that decision based solely on this code sample.

As tempting as it is to say “Oh, just slap a lock around everything”, we have no idea what any of those methods do, where those variables come from and how much access to shared state any of those have.

As such, I’m going to introduce a hypothetical implementation that we can reason about.

public sealed class Formatter
{
    public string Format(Item item)
    {
        return String.Format("{0}{1}", item.Id, item.Name);
    }
}
public sealed class Item
{
    public int Id;
    public string Name;
    private Formatter formatter;
    public Item(Formatter formatter)
    {
        if (formatter == null)
            throw new ArgumentNullException("formatter",
                "Formatter must not be null");

        this.formatter = formatter;
    }
    public Formatter GetFormatter() { return this.formatter; }
}
public static class ListFormatter
{
    public static List<String> Format(List<Item> sharedItemsCollection)
    {
        List<String> formattedItems = new List<String>();
        foreach (Item someItem in sharedItemsCollection)
        {
            //Removed unnecessary locking
            Formatter itemFormatter = someItem.GetFormatter();
            formattedItems.Add(itemFormatter.Format(someItem));
        }

        return formattedItems;
    }
}


This will take a list of items in the Format() call and iterate over them, formatting each item with whatever formatter happens to be attached and return a list of the results. Not impressive by any means, but it should help to provide context.

At this point, it’s slightly clearer that there is a much deeper issue. We are being handed shared state in the call to Format() and it remains shared. Code farther up the chain may be giving this list out to everyone and we have no idea what they might be doing with it. We may get halfway through formatting an item and it changes values because of the actions of some other thread!

Locks only work if all access to the shared state is restricted in the same way. Serializing access in multiple ways, or only partially, is like trying to synchronize your children’s access to other people’s toys. Your kids may be behaving perfectly fine, but a child who’s not under your control may come up at any time and rip the toy to pieces (or put it away, if they’re nice).
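
As a contrast to the lock-on-a-new-object approach from the first post, here is a minimal sketch of what "restricted in the same way" could look like. SharedCounter, LockDemo and their members are hypothetical names invented for this illustration; the point is only that every reader and writer goes through the same private lock object.

```csharp
using System;
using System.Threading;

// Hypothetical type: all access to 'count' goes through one lock object.
public sealed class SharedCounter
{
    private readonly object gate = new object();
    private int count;

    public void Increment() { lock (gate) { count++; } }
    public int Read() { lock (gate) { return count; } }
}

public static class LockDemo
{
    public static int Run(int threads, int incrementsPerThread)
    {
        var counter = new SharedCounter();
        var workers = new Thread[threads];
        for (int i = 0; i < threads; i++)
        {
            workers[i] = new Thread(() =>
            {
                for (int j = 0; j < incrementsPerThread; j++)
                    counter.Increment();
            });
            workers[i].Start();
        }
        foreach (var worker in workers)
            worker.Join(); // wait for every thread before reading
        return counter.Read();
    }

    public static void Main()
    {
        Console.WriteLine(Run(4, 10000)); // 40000
    }
}
```

If Increment() created a fresh object to lock on each time, as the problematic code does, no two threads would ever contend for the same lock and the count could come up short.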

So what can we do?

We still don’t know how to solve this because we don’t have a complete picture of what’s happening. Let’s go upwards and see who’s using this and how they’re using it. Here’s another hypothetical implementation.

public static void WriteAllFormattedFiles(string primaryFilePath,
    string archiveFilePath)
{
    List<Item> sharedItems = GetItems();
    ThreadPool.QueueUserWorkItem((state) =>
    {
        var formattedItems = ListFormatter.Format(sharedItems);
        WriteFile(primaryFilePath, formattedItems);
    });
    ThreadPool.QueueUserWorkItem((state) =>
    {
        var formattedItems = ListFormatter.Format(sharedItems);
        WriteFile(archiveFilePath, formattedItems);
    });
}
public static void WriteFile(string filePath, IEnumerable<String> fileLines)
{
    //Some read-only implementation happens here
}


That clears things up a lot more. The caller loads a single list of items, formats them in separate threads and writes the results to a file. Frankly, it looks like a copy/paste job. Maybe the original developer was in a hurry.

Right now, we’re going to make a few large assumptions solely for the ease of not complicating the explanation too much.

1) GetItems() returns a new List with new contents each time.
2) WriteAllFormattedFiles() is the only place that is actually using the ListFormatter class.

Next, there are a few steps in analysis that should happen before we even think about changing code, and the first step is a big one.

Are multiple threads even necessary?

This is a very important thing to verify. Multiple threads are not a guaranteed cure for anything! You do not automatically get performance boosts or high fives from random strangers by using more than one thread. The exact opposite may be the case: You may actually slow your program down if you use threads in the wrong way. If you can avoid multithreading because you just don’t need it, that’s a great way to clean up all of your concurrency issues and improve your code in one shot.

How do you know if you need them and how many you need? You determine where your specific line is for performance, memory or whatever you’re optimizing for. This is your Good Enough line: If your code passes it then it’s good enough to use, and that is a custom choice for your particular situation. If it’s not good enough, you will want to test a variety of approaches to fix that problem. Threading may be one of those tools in the box, but it is by no means the only tool. If your solution is good enough, then you’re done.

Much like threading not being an automatic cure, there is no magic number that will tell you how slow is too slow and what to do about it. You need to figure both of these out before you preemptively try to solve something that’s not there.

For right now, though, we’ll add one more hand-wavey assumption to the list.

3) Two threads from the ThreadPool have been determined to be the optimal number of threads for our needs, specifically to speed up file writing operations, and they are necessary.

Now that we’ve gotten our assumptions out of the way and have determined that it’s necessary to use threads like this, let’s keep going. We can see clearly where the shared state is coming from and how it’s being used, so let’s step back and take a look at the rest of the moving parts before we get hurt.

1) The only access to the list of items is read only, regardless of which thread uses it
2) Identical formatting is happening on two separate threads.
3) File writing is different on each thread, but only in terms of where it writes to.

The second step in the analysis is also a big one.

Do we currently have any concurrency issues?

Because of our assumptions and the way the moving parts interact, the answer is probably “no”. If any of those change, that may not be the case.

If GetItems() returns a reference to a cached list that may change at any time, we most definitely can’t rely on nothing changing.

If WriteFile() modifies the items that it enumerates over, we can’t rely on the state of the items remaining consistent for the next WriteFile() call.

If other places are using ListFormatter, it becomes ambiguous as to how it affects this code until we know what they are doing. At the very least, it could introduce additional use cases for the class and limit the refactoring that can be done. In a worse case, it could mean that the other location has completely different semantics and may even modify state that we're a part of.

List<T> is thread-safe for multiple readers, hence the safety with reading through it in multiple threads, but if we start using a different collection that has an enumerator or a GetEnumerator() call that’s not safe in these circumstances then that’s a different story.

And so on. We could get really, really fine-grained in the analysis of this, but let’s stay at a higher level to try and get the general concepts across.

This is kind of scary. It’s not particularly clear from the code that this is a Jenga tower of potential harm that could come down at any time and create concurrency problems.

Let’s check the list and see if any of these are possible ways to improve our situation.

1) Be selfish and don’t share

Is there a way that we could avoid sharing the items list across multiple threads?

Most definitely. We can pull the creation and formatting out of the threads entirely and just send the same formatted items list into the WriteFile() call. This is an option because of our assumption that WriteFile() doesn’t modify the items that it enumerates through. If it did modify them, we wouldn’t be able to use the same instance for both calls because the state of the items may have changed from call to call.

2) Don’t share the actual item, just a copy

We could make two lists, with a deep copy of the entire contents, and send the second copy into the second thread. We’d spend a lot more memory, especially if the list was quite large to begin with, but each thread would be free to do whatever it wanted with the data. Nobody’s changing the data, though, so that seems like overkill.
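
For reference, a deep copy along those lines might look like the sketch below. CopyableItem is a simplified, hypothetical stand-in for the Item class (no formatter, and a Copy() method that the original does not have), and CopyDemo is likewise made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified stand-in for Item.
public sealed class CopyableItem
{
    public int Id;
    public string Name;

    // Strings are immutable, so copying the reference is enough here.
    public CopyableItem Copy()
    {
        return new CopyableItem { Id = Id, Name = Name };
    }
}

public static class CopyDemo
{
    // Builds a new list containing a copy of every element.
    public static List<CopyableItem> DeepCopy(IEnumerable<CopyableItem> source)
    {
        return source.Select(item => item.Copy()).ToList();
    }

    public static void Main()
    {
        var original = new List<CopyableItem>
        {
            new CopyableItem { Id = 1, Name = "A" }
        };
        var copy = DeepCopy(original);
        copy[0].Name = "changed";
        Console.WriteLine(original[0].Name); // A
    }
}
```

Every element is allocated a second time, which is where the extra memory goes.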

3) Share the item but don’t allow changes to it

This is currently what’s happening. As long as the code doesn’t change, we’ll be fine, but these semantics are not at all clear from the code. We know that the code, as is, does not rely on any behavior that’s not found in IEnumerable<T>, such as retrieving items by index, so using an IEnumerable<Item> instead of a List<Item> would be preferable (or, if that functionality is actually required, wrap the list in a readonly implementation, like System.Collections.ObjectModel.ReadOnlyCollection<T>). This is better because it will make an explicit statement about the maximum expected functionality, and if somebody wants something more then they must consciously disable this safety net.
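
Here is a small sketch of the read-only wrapper idea, assuming a plain list of strings rather than the Item class. ReadOnlyDemo and its method names are invented for this illustration; List<T>.AsReadOnly() and ReadOnlyCollection<T> are the real framework pieces.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

// Hypothetical demo class.
public static class ReadOnlyDemo
{
    public static bool WriteIsRejected()
    {
        var items = new List<string> { "1A", "2B" };
        ReadOnlyCollection<string> readOnly = items.AsReadOnly();
        try
        {
            // Somebody has to cast deliberately to even attempt a write...
            ((IList<string>)readOnly).Add("3C");
            return false;
        }
        catch (NotSupportedException)
        {
            // ...and the wrapper still refuses.
            return true;
        }
    }

    public static bool WrapperSeesUnderlyingChanges()
    {
        var items = new List<string> { "1A" };
        ReadOnlyCollection<string> readOnly = items.AsReadOnly();
        items.Add("2B"); // whoever holds the original list can still change it
        return readOnly.Count == 2;
    }

    public static void Main()
    {
        Console.WriteLine(WriteIsRejected());              // True
        Console.WriteLine(WrapperSeesUnderlyingChanges()); // True
    }
}
```

Note that the wrapper is only a view over the original list. It states the intent clearly, but it does not protect against code that still holds the underlying List<T>.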

4) Share the item, allow changes but minimize the sharing time

Nobody is modifying the list or its contents and so synchronizing access is unnecessary. If code did modify the list, though, a new locking scheme would need to be thought of, designed, implemented and tested to allow all read and write locations to see the lock and use it appropriately.

Given all of those, where do we stand?

The fourth option involves introducing locking that nothing in the current code needs.

The third option potentially involves a very minor refactor of ListFormatter and WriteAllFormattedFiles(), but we still have the issue of formatting the entire collection twice.

The second option involves a doubling in the memory we’re using, a slight refactor of WriteAllFormattedFiles() and we still have the issue of formatting the entire collection twice.

The first option involves a very minor refactor of WriteAllFormattedFiles() and also solves the double formatting issue.

So what do we do? We don’t currently have any issues but the first option sounds great for one reason: We would be reducing the scope of potential problems. We could even eliminate the need for the ListFormatter class entirely.

If we do change code in the future, would you rather have to deal with possible concurrency issues in a single method or in several methods and a class?


public static void WriteAllFormattedFiles(string primaryFilePath,
    string archiveFilePath)
{
    List<Item> sharedItems = GetItems();
    IEnumerable<String> formatItems = from i in sharedItems
                                      let formatter = i.GetFormatter()
                                      select formatter.Format(i);
    IEnumerable<String> formattedItems = formatItems.ToList();
    ThreadPool.QueueUserWorkItem((state) =>
    {
        WriteFile(primaryFilePath, formattedItems);
    });
    ThreadPool.QueueUserWorkItem((state) =>
    {
        WriteFile(archiveFilePath, formattedItems);
    });
}


So what did we gain?

1) We narrowed the scope of possible problems. That's not to say that this is free of problems! We're still relying heavily on enumerating in a way that's threadsafe, among other things, but the overall reach of the problem space is reduced. Flagging this for a potential refactor to tighten it up in the future would be a good idea.

2) We made it more obvious what the intended functionality was. We're not handing around List<T> any more than we have to, we're performing the formatting in a straightforward way that doesn't rely on jumping through static methods, and that should help going forward.

We also made some huge assumptions and glossed over additional analysis, but the point is that you can’t rely on a single solution for your concurrency issues. There is no One Size Fits All answer. Careful analysis of the code is necessary to come up with something that fits your (and your customer’s) specific needs, and the solution may not be exactly where you thought it would be.

Next time... How did we get into this mess and how can we avoid it in the future?
Wednesday Quiz Day
By: Chris
Filed under: Quiz Of The Day
Array covariance is always a fun topic for a few reasons. 1) It’s been around since C# 1.0, 2) It’s not usually at the forefront of people’s minds and 3) It’s a broken kind of covariance. For example, this is both legal and a bad idea.

object[] badArray = new string[] { "OH", "NO" };

It’s a bad idea because you can then try to do this, which will compile just fine but throw an ArrayTypeMismatchException at runtime because the underlying array is really a string[], not an array of object.

badArray[0] = new SomeClass();
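
A minimal sketch of that failure, using a plain object in place of SomeClass. ArrayCovarianceDemo and StoreFails are names made up for this illustration.

```csharp
using System;

// Hypothetical demo class.
public static class ArrayCovarianceDemo
{
    public static bool StoreFails()
    {
        object[] badArray = new string[] { "OH", "NO" };
        try
        {
            // Compiles, because the static type is object[]...
            badArray[0] = new object();
            return false;
        }
        catch (ArrayTypeMismatchException)
        {
            // ...but the runtime checks every store against the real element type.
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(StoreFails()); // True
    }
}
```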

Covariant array conversions only work when they involve reference types, though. Covariant value type array conversions are not supported, and so the following are both illegal.

object[] illegalArray = new uint[10];
int[] illegalArray2 = new uint[10];

But that brings up the question. Does this succeed at runtime? Why or why not?

int[] x = (int[])(object)new uint[10];

This will work for two reasons: casting through object defers the conversion check to the runtime, and int and uint are represented by the same int32 type at runtime. Operations on a given type may assume a signed or unsigned context, however, and have completely different results or blow up.

uint[] y = (uint[])(object)new int[] { -1 };
Console.WriteLine(y[0]); //Will print out 4294967295

Even though the value was coerced to fit the type, it is still pointing to the same array of int underneath with a single element containing a -1 value. If you debug this and inspect the array elements of ‘y’ in Visual Studio, the visualizer does not know how to handle the case where the type has a value out of the expected range and will give you a question mark.
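
One way to convince yourself that nothing was actually converted is to ask the array for its runtime type. This is a small sketch; ArrayReinterpretDemo and its method names are made up for this illustration.

```csharp
using System;

// Hypothetical demo class.
public static class ArrayReinterpretDemo
{
    public static string RuntimeTypeName()
    {
        int[] x = (int[])(object)new uint[10];
        // The variable says int[], the object itself is still a uint[].
        return x.GetType().FullName;
    }

    public static uint ReadMinusOneAsUInt()
    {
        uint[] y = (uint[])(object)new int[] { -1 };
        return y[0]; // same 32 bits, read as unsigned
    }

    public static void Main()
    {
        Console.WriteLine(RuntimeTypeName());    // System.UInt32[]
        Console.WriteLine(ReadMinusOneAsUInt()); // 4294967295
    }
}
```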

This came from Stack Overflow.

Why Not Refactor?
By: Chris
Filed under: Random Thoughts, Software
Nearly every codebase has pieces of it that everyone dislikes and wishes that they could refactor, and I’ve had more than a few conversations with people who want to refactor things and make it better. I’ve typically said “no” with a few caveats. So why wouldn’t you just rip into the code and change it? It’s bad, everyone knows it’s bad, why would you just leave it there?

It actually depends on your situation. If you’re programming for fun and have plenty of free time, you may not care. Let’s redo it a thousand times, it’ll be perfect and work beautifully and the solution will be the most elegant piece of engineering you’ve ever seen. This is probably not the case in a business setting, where you tend to have constraints on what you can reasonably do.

A smaller business may still fall under the category of Do Whatever We Want, but it becomes increasingly difficult to justify this as a business becomes larger and more and more customers depend on the software you produce. At that point, you have a different set of potential constraints, so we have to be careful and ask some questions.

The first one is the most important and dictates whether you even ask the other questions.

What’s the benefit of this refactor?

Presumably you don’t just get bored with how the code looks and want to change it; you think that there’s something worthwhile to gain from changing your code. It might make it easier to test, it might make it more secure, it might just make it easier to read and develop against.

Years ago, Eric Gunnerson wrote a good post on adding new features to C#. The idea is that every feature starts at negative 100 points and has to work its way up to be considered as potentially worthwhile. This same principle should apply to refactoring. Convince me that this is really worth doing. Are we going to plug a security hole? Are we going to gain untold hours in developer productivity by organizing a spaghetti-ridden codebase? Would we spend hours of developer time refactoring an entire service to use dependency injection, making unit testing with mocks easier, when the overall service architecture is not amenable to that style of testing?

Assuming your refactor is worth doing, though, we still need to know more about it.

What’s the complexity of the refactor?

Would this involve replacing an entire layer of code? Is this just a single method? Who depends on this code? More importantly, do we understand what it is actually supposed to do? In every legacy system that I’ve encountered, there are parts of the code that aren’t documented, aren’t understood and have a strangeness about them. It could be that the original developer misunderstood the requirements, it could be that the specifications changed along the way; we don't know.

I wrote something about this a ways back and it’s often a very difficult thing to determine. There may be pieces of code where nobody really knows what it’s supposed to do, but if it stops working the way it currently does then that will be a very, very bad thing.

Assuming you can determine the complexity and it’s acceptable, we need to know one more thing about the refactor itself.

Would this refactor involve a breaking change and who would it affect?

This depends much more on the software product and the target audience. If you’re creating libraries that will be used by thousands of customers in their own code, a breaking change that is visible to them will be an extremely serious thing to consider. If the only code you break is your own, that may be more acceptable.

At this point, if the refactor is still worth doing, we have a different series of questions to answer.

Do we have a schedule to meet?

Quite a lot of software has release cycles. We’ll add some new features, fix some bugs, do some cleanup and release it in N days. After that, we’ll do it again with a different set of features and bugs, and so on. Getting your refactor on the schedule is a product of several things. How many other things are vying to get into that timeframe? You’re most likely not in perpetual maintenance mode because you have a product that nobody wants anything more from, so you have to make choices. What has a higher priority? Do we take the time to add a new feature that will help customers and make more money or do we use that time to modify code that is currently working fine and possibly break it? Exactly how worthwhile is this refactor compared to everything else we could possibly do?

Do we have enough money?

Your company probably has a budget for development. If you have enough money to hire another developer, you might be able to offset the cost to the schedule by having another person available to do additional work. Developers are expensive, though, and if you could afford more employees then you would probably already have them. You work with what you can afford.

Do we have the resources to test additional changes?

You can make all of the changes you want, but it makes no difference if you can’t test them properly and so we have to start at the beginning again. Do we have a testing schedule to meet? Do we have enough money to handle the employee cost of testing these additional changes? If the answer to any of these is “no”, then you should scale your development back.

Are there additional considerations?

You may have a documentation team that needs all of the above for your change. You may have to let customers know, you may have to provide a smooth transition from one codebase to the next, you may have to synchronize your changes with database changes or changes by another team. All of these and more could potentially impact whether your refactor can happen.

Now I’m discouraged and it feels like nothing will ever change

That’s true, it may take a very long time to get your particular refactor onto the list. It may never actually happen, and sometimes that’s difficult to hear. A lot of worthwhile changes never happen, and the same is true with features and bugs. Maybe feature X is amazing, but it’s contending with fifty other things that are slightly more amazing and you can’t have it all.

You do what you can with what you have, and sometimes you can’t do things that you’d really like to do because of that. On the other side of the coin, though, this refactoring didn’t happen because you got a lot of other things done and those can more easily be done correctly and to the best of your ability. You can make it so that, five years later, people aren’t looking at your code and wishing desperately that they could refactor it.
Monday Quiz Day
By: Chris
Filed under: Quiz Of The Day

int x = 5;
byte b = x;

This will give you a compile-time error stating that you "cannot implicitly convert type 'int' to 'byte'" and asking if you're missing a cast.

That said, is the following legal? Why or why not?

const int x = 5;
byte b = x;

What about this?

const int x = 256;
byte b = x;


The first const example is legal, the second is not. Because the value is constant, it is definitely known at compile time and can be validated as such against types for assignment compatibility (including value compatibility). The C# specification says "A constant expression of type int can be converted to type sbyte, byte, short, ushort, uint or ulong, provided the value of the constant expression is within the range of the destination type". Unsurprisingly, that's what's happening here.
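
For completeness, here is a sketch of the legal case next to what you would have to write to get the out-of-range constant and the non-constant int through the compiler. ConstConversionDemo and its method names are invented for this illustration, and the explicit unchecked blocks are there so the result does not depend on the project's overflow-checking setting.

```csharp
using System;

// Hypothetical demo class.
public static class ConstConversionDemo
{
    public static byte ConstantInRange()
    {
        const int x = 5;
        byte b = x; // implicit: constant value fits in a byte
        return b;
    }

    public static byte ConstantOutOfRange()
    {
        const int x = 256;
        // Needs an explicit cast, and 'unchecked' because the compiler
        // would otherwise reject the overflowing constant.
        byte b = unchecked((byte)x);
        return b;
    }

    public static byte FromVariable(int x)
    {
        // Not a constant, so an explicit cast is always required.
        return unchecked((byte)x);
    }

    public static void Main()
    {
        Console.WriteLine(ConstantInRange());    // 5
        Console.WriteLine(ConstantOutOfRange()); // 0
        Console.WriteLine(FromVariable(261));    // 5
    }
}
```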


This came from Stack Overflow.

Thursday Quiz Day
By: Chris
Filed under: Quiz Of The Day

public class ____
{
    public int _ = 1;
    public int __ = 2;

    public static ____ ___()
    {
        return new ____();
    }

    public int _____()
    {
        ____ _ = ____.___();
        int ___ = _._ + _.__;

        return ___ * this._;
    }
}


Is this legal?

This is completely legal. Regardless of the illegibility of the code, an underscore is a valid character to use as an identifier.


Sunday Quiz Day
By: Chris
Filed under: Quiz Of The Day
Languages interest me: How they're structured, where the meaning is contained and how it's effectively communicated. Our brains have an extraordinary ability to infer meaning from context and ambiguity, and this is also an area where a computer falls incredibly short. Computer languages must be much more constrained and rigid to allow for a decrease in ambiguity. Even with the reduction in ambiguity, though, there are always bits of weirdness.

I like to keep track of those places that are either implementation-defined in a particular language, have special issues surrounding them or just don't behave in a way that's intuitive. It may just make you say "Huh, I didn't know that" and go back to eating soup (presumably you have some). That's what these Quiz Day posts are.

The language that I'm most familiar with is C#, so all quizzes will be in C# unless I've specifically noted otherwise.

Let's start off with a simple one.


try
{
    return true;
}
finally
{
    return false;
}


Is this legal? If so, what will it return?

No, this is not legal. Control can't leave the body of a finally clause, and this will give you a compile-time error. This is valid behavior in other languages, though. For example, JavaScript allows this and will return the value from the finally block.
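
What C# does allow is a finally block that runs on the way out without changing where control goes. A minimal sketch, with FinallyDemo and its Log list made up for this illustration:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class.
public static class FinallyDemo
{
    public static readonly List<string> Log = new List<string>();

    public static bool Run()
    {
        try
        {
            Log.Add("try");
            return true; // the return value is decided here
        }
        finally
        {
            // Runs after the return expression is evaluated,
            // but cannot return, break or continue out of the block.
            Log.Add("finally");
        }
    }

    public static void Main()
    {
        Console.WriteLine(Run());                 // True
        Console.WriteLine(string.Join(",", Log)); // try,finally
    }
}
```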


Thinking Concurrently, Part 2
By: Chris
Filed under: Random Thoughts, Software
When we talk about multi-threading, there are typically two words that are used. Parallelism is one, concurrency is the other, and they are not the same. Parallelism is running more than one thing at a time. Concurrency is what has to be managed when those things converge on something shared between them.

That means that we have two topics. The previous post showed an example where the programmer had assumed that parallelization was happening and had badly tried to handle the concurrency issues, and I find that there’s more of a focus on the concurrency aspect when dealing with multi-threading, so we’ll talk about that first. Afterwards, we’ll talk about parallelizing code and why your concurrency tools should not be the first thing you reach for.

Making Concurrency Easier

As in the previous post’s example, for our purposes here, concurrency issues boil down to one thing: Shared State. This means that there is some aspect of the program that multiple channels of execution could be interacting with.

As an example, you have a kid’s toy. It’s a desirable thing, all fluffy and squishy, and it looks like a giraffe of some sort. It’s also just sitting in the middle of the floor. Any child in range could come by and play with it. So what happens when a child is already playing with it and another child tries to grab it? It might be fine; the first child might abandon the toy. Or it might involve crying and pushing and, depending on the force of the push, memory corruption.

Clearly, if multiple children have access to this, the ideal thing to do would be to restrict their access to stop any potential situations from arising.

So what can be done?

1) Be selfish and don’t share
2) Don’t share the actual item, just a copy
3) Share the item but don’t allow changes to it
4) Share the item, allow changes but minimize the sharing time

The first option is the best, if attainable. Why bother to share the toy at all if you don’t have to? The toy can go in the owner’s room, where the other kids can’t reach it, and only the owner can go inside and play with it.

That may not be feasible, though. Maybe it doesn’t belong to only one child. You may be able to salvage this by getting an exact copy of the toy, one for each child. Then they can all have a version of the toy in their room and there are no conflicts.

That may be a little bit expensive, though. If you spend all your money buying an exact copy of every toy, you’ll have less left over for the other things you may want to do. It also doesn’t teach the children anything about sharing. Instead, you could allow every kid to interact with the one toy in a way that doesn’t interfere with the other children. In effect, you’re saying “You can all look at it and pretend that it’s moving, but keep your hands to yourselves”.

That’s no fun, though. The kids may just absolutely neeeed to touch it and pet it and make it growl at people and have all kinds of fun. They can’t all do this at once, so you need to segregate their access. “You can have it for five minutes and then it’s someone else’s turn”.

In short, this is the problem faced by multiple threads.
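
To make the toy fight concrete, here is a rough sketch of several threads grabbing the same piece of state at once. The class name, method name and counts are all made up for illustration. The counter plays the part of the toy, and every worker pokes at it without taking turns.

```csharp
using System.Threading.Tasks;

public static class SharedToyDemo
{
    // Hypothetical example: every worker increments one shared counter with no coordination.
    public static int UnsafeCount(int workers, int grabsPerWorker)
    {
        int counter = 0; // the shared toy
        Parallel.For(0, workers, _ =>
        {
            for (int i = 0; i < grabsPerWorker; i++)
            {
                counter++; // read, add, write back: another thread can slip in between the steps
            }
        });
        return counter; // can come out lower than workers * grabsPerWorker
    }
}
```

Because counter++ is not atomic, two threads can read the same old value and both write back the same new one, which loses an increment. Whether that actually happens on a given run depends on timing, which is part of what makes these bugs so unpleasant to track down.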

1) If you don’t have shared state, there’s no conflict. Each thread can do its own work and not even care what anyone else is doing. This is ideal for situations where threads are doing unrelated work or where the work can be effectively partitioned into disparate chunks. For example, a list of items of work (assuming each unit is unrelated) might very well be partitioned into smaller pieces and processed in parallel, with each isolated piece being handled by a separate thread.

2) If you can give everyone a copy, then there still isn’t a conflict because everyone has their own copy of the data. For example, if you need to perform several distinct operations on the entire list, you could provide each thread with a copy and let that thread modify its own copy as much as it wants according to its specific set of instructions. As in the toy example, though, that can become prohibitively expensive (although in terms of memory consumed instead of cash) and it won’t give you changes that may have happened after you took your copy (the first giraffe now has extra floppy ears and a laser cannon, but yours doesn’t).

3) Giving everyone the same object but only allowing read-only access works as long as every write operation can be kept local to the thread doing it. For example, different accumulation functions running across a list could each read through the list but write only to thread-local storage for their running totals. Each one is interacting with the list, but it’s keeping its modifications to itself.

4) If everyone needs to touch the same object, and everyone needs to modify it, then limiting the amount of time any one thread touches the object is a good idea. This extends to all interactions, so as many operations as possible should be pulled into thread-local storage to minimize the time that any one thread is waiting for another thread to let go of its hold on that state. For example, several threads might run calculations and accumulate the results into each individual list item. Each thread keeps access to the list item for itself, but only long enough to take the currently stored value and add its own input to it.
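
The four options above can be sketched in C# roughly like this. Treat it as an illustration under assumptions rather than a recommended implementation: the class and method names are invented, the data is a plain int array, and option 4 is simplified to a single shared total instead of per-item accumulation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class SharingStrategies
{
    // 1) Don't share: each task gets its own private slice and returns its own result.
    public static int SumByPartition(int[] items, int chunkSize)
    {
        var tasks = new List<Task<int>>();
        for (int start = 0; start < items.Length; start += chunkSize)
        {
            int[] chunk = items.Skip(start).Take(chunkSize).ToArray();
            tasks.Add(Task.Run(() => chunk.Sum()));
        }
        return Task.WhenAll(tasks).Result.Sum();
    }

    // 2) Share a copy: each task mutates its own clone, paid for in memory.
    public static (int[] Doubled, int[] Sorted) TransformCopies(int[] items)
    {
        int[] forDoubling = (int[])items.Clone();
        int[] forSorting = (int[])items.Clone();
        Task doubling = Task.Run(() =>
        {
            for (int i = 0; i < forDoubling.Length; i++) forDoubling[i] *= 2;
        });
        Task sorting = Task.Run(() => Array.Sort(forSorting));
        Task.WaitAll(doubling, sorting);
        return (forDoubling, forSorting);
    }

    // 3) Share read-only: both tasks read the same list but write only to their own locals.
    public static (int Sum, int Max) Accumulate(IReadOnlyList<int> items)
    {
        Task<int> sum = Task.Run(() =>
        {
            int total = 0;
            foreach (int x in items) total += x;
            return total;
        });
        Task<int> max = Task.Run(() =>
        {
            int best = int.MinValue;
            foreach (int x in items) best = Math.Max(best, x);
            return best;
        });
        return (sum.Result, max.Result);
    }

    // 4) Share and modify, but briefly: do the work locally, then lock only for the final add.
    public static int SumWithShortLock(int[] items, int chunkSize)
    {
        object gate = new object();
        int total = 0; // the shared, mutable state
        int chunkCount = (items.Length + chunkSize - 1) / chunkSize;
        Parallel.For(0, chunkCount, c =>
        {
            int local = 0;
            int end = Math.Min((c + 1) * chunkSize, items.Length);
            for (int i = c * chunkSize; i < end; i++) local += items[i];
            lock (gate) { total += local; } // held for one addition, not the whole loop
        });
        return total;
    }
}
```

Options 1 through 3 need no locks at all, because nothing written is ever visible to more than one thread. Only option 4 reaches for a concurrency tool, and even there the lock is held for as short a time as possible.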

So how can this be applied to the example from the previous post?

That’s coming next time.