public class ____
{
    public int _ = 1;
    public int __ = 2;

    public static ____ ___() { return new ____(); }

    public int _____()
    {
        ____ _ = ____.___();
        int ___ = _._ + _.__;
        return ___ * this._;
    }
}
Is this legal?
Answer: This is completely legal. However illegible the code is, an underscore is a valid character in an identifier, and an identifier can consist of nothing but underscores.
|
Languages interest me: how they're structured, where the meaning is contained and how it's effectively communicated. Our brains have an extraordinary ability to infer meaning from context in spite of ambiguity, and this is an area where a computer falls incredibly short. Computer languages must be much more constrained and rigid to reduce that ambiguity. Even with the reduction, though, there are always bits of weirdness.
I like to keep track of those places that are either implementation-defined in a particular language, have special issues surrounding them or just don't behave in a way that's intuitive. It may just make you say "Huh, I didn't know that" and go back to eating soup (presumably you have some). That's what these Quiz Day posts are.
The language that I'm most familiar with is C#, so all quizzes will be in C# unless I've specifically noted otherwise.
Let's start off with a simple one.
try { return true; } finally { return false; }
Is this legal? If so, what will it return?
Answer: No, this is not legal. Control can't leave the body of a finally clause, and this will give you a compile-time error. This is valid behavior in other languages, though. For example, JavaScript allows this and will return the value from the finally block.
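For contrast, here's a minimal sketch of what C# does allow: a finally block that does some work after the try block has already decided on the return value. The FinallyDemo class and its Log list are names I made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class FinallyDemo
{
    // Records the order in which the blocks run.
    public static readonly List<string> Log = new List<string>();

    public static bool Run()
    {
        try
        {
            Log.Add("try");
            return true; // the return value is decided here
        }
        finally
        {
            // Runs before control goes back to the caller, but it can't
            // contain a return statement of its own.
            Log.Add("finally");
        }
    }
}
```

Calling FinallyDemo.Run() hands back true and leaves "try" followed by "finally" in the log.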
|
When we talk about multi-threading, there are typically two words that are used. Parallelism is one, concurrency is the other, and they are not the same. Parallelism is running more than one thing at a time. Concurrency is what has to be managed when those things converge on something shared between them.
That means that we have two topics. The previous post showed an example where the programmer had assumed that parallelization was happening and had badly tried to handle the concurrency issues, and I find that there’s more of a focus on the concurrency aspect when dealing with multi-threading, so we’ll talk about that first. Afterwards, we’ll talk about parallelizing code and why your concurrency tools should not be the first thing you reach for.
Making Concurrency Easier
As in the previous post’s example, for our purposes here, concurrency issues boil down to one thing: Shared State. This means that there is some aspect of the program that multiple channels of execution could be interacting with.
As an example, you have a kid’s toy. It’s a desirable thing, all fluffy and squishy and looks like a giraffe of some sort. It’s also just sitting in the middle of the floor. Any child in range could come by and play with it. So what happens when a child is already playing with it and another child tries to grab it? It might be fine, the first child might abandon the toy. It might involve crying and pushing and, depending on the force of the push, memory corruption.
Clearly, if multiple children have access to this, the ideal thing to do would be to restrict their access to stop any potential situations from arising.
So what can be done?
1) Be selfish and don’t share
2) Don’t share the actual item, just a copy
3) Share the item but don’t allow changes to it
4) Share the item, allow changes but minimize the sharing time
The first option is the best, if attainable. Why bother to share the toy at all if you don’t have to? The toy can go in the owner’s room, where the other kids can’t reach it, and only the owner can go inside and play with it.
That may not be feasible, though. Maybe it doesn’t belong to only one child. You may be able to salvage this by getting an exact copy of the toy, one for each child. Then they can all have a version of the toy in their room and there’s no conflicts.
That may be a little bit expensive, though. If you spend all your money buying an exact copy of every toy, you’re going to have a heavier burden financially on other things you may want to do. It also doesn’t teach the children anything about sharing, so you could allow every kid to interact with it in a way that doesn’t interfere with the other children. In effect, saying “You can all look at it and pretend that it’s moving, but keep your hands to yourselves”.
That’s no fun, though. The kids may just absolutely neeeed to touch it and pet it and make it growl at people and have all kinds of fun. They can’t all do this at once, so you need to segregate their access. “You can have it for five minutes and then it’s someone else’s turn”.
In short, this is the problem faced by multiple threads.
1) If you don’t have shared state, there’s no conflict. Each thread can do its own work and not even care what anyone else is doing. This is ideal for situations where threads are doing unrelated work or work can be effectively partitioned into disparate chunks. For example, a list of ten items of work (assuming each unit is unrelated) might very well be partitioned into anywhere from one to ten lists and handed to one to ten threads for processing.
2) If you can give everyone a copy, then there still isn’t a conflict because everyone has their own copy of the data. For example, if you need to perform several distinct operations on the entire list, you could provide each thread with a copy and let that thread modify its own copy as much as it wants according to its specific set of instructions. As in the toy example, though, that can become prohibitively expensive (although in terms of memory consumed instead of cash) and it won’t give you changes that may have happened after you took your copy (the first giraffe now has extra floppy ears and a laser cannon, but yours doesn’t).
3) Giving everyone the same object but only allowing read-only access works as long as any write operations can be pushed to within the local thread. For example, running different accumulation functions across a list could read through the list but only write to thread-local storage for its running total. It’s interacting with the list, but it’s keeping modifications to itself.
4) If everyone needs to touch the same object, and everyone needs to modify it, then limiting the amount of time any one thread touches the object is a good idea. This extends to all interactions, so as many operations as possible should be pulled into thread-local storage to minimize the time that any one thread is waiting for another thread to let go of its hold on that state. An example is running calculations and accumulating the results within each individual list item. Each thread keeps access to the list item for itself, but only long enough to take the currently stored value and add its own input to it.
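As a rough sketch of options 3 and 4 working together, here’s one way to sum a shared list: every thread only reads the list, keeps its running total to itself and touches the shared total exactly once at the end. SharedStateSketch and SumWithLocalTotals are names invented for this example, and Parallel.ForEach with thread-local state is just one of several tools that would do the job.

```csharp
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class SharedStateSketch
{
    public static long SumWithLocalTotals(IReadOnlyList<int> items)
    {
        long total = 0; // the only shared, mutable state

        Parallel.ForEach<int, long>(
            items,                                   // shared, but only ever read
            () => 0L,                                // each worker starts its own running total
            (item, loopState, localTotal) => localTotal + item,
            localTotal => Interlocked.Add(ref total, localTotal)); // brief touch of shared state

        return total;
    }
}
```

SumWithLocalTotals(new[] { 1, 2, 3, 4 }) returns 10, and the shared total is only contended once per worker instead of once per item.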
So how can this be applied to the example from the previous post?
That’s coming next time.
|
The developers that I’ve been exposed to directly have all been business programmers, and have ranged from the exceptionally talented and intelligent, who are phenomenally interested in what they’re doing and how to do it better, to the paycheck programmers that know enough to get by and don’t really want to learn more unless they have to. Most of them fall somewhere in the middle. They would like to understand concepts and they like learning new things, but they have a limited amount of time, they have other things going on in their life and they don’t necessarily need more than the skills they have to do their job in a satisfactory fashion.
Over this sample group, multi-threading and concurrency seem to be one of those things that all of these developers agree are good things to know, but the knowledge is generally lacking. Granted, most of them are not doing terribly complicated things with threading on a daily basis, if at all. They know what the ‘lock’ statement is, roughly how to use it and that’s it. Unfortunately, that leads to code like this.
List<string> formattedItems = new List<string>();
foreach (Item someItem in sharedItemsCollection)
{
    object obj = new object();
    //Make sure that nobody can touch the object while we’re formatting
    lock (obj)
    {
        Formatter itemFormatter = someItem.GetFormatter();
        formattedItems.Add(itemFormatter.FormatItem(someItem));
    }
}
I’ve renamed types and removed logic for clarity, but this example is code that was 1) Written by a Senior Software Engineer and 2) Existed in a production environment. The code always ended up being run in a single-threaded manner so no actual harm was done (besides taking an unnecessary lock), but this demonstrates a fairly serious misunderstanding of synchronization and could have been disastrous under multi-threaded circumstances.
It’s tempting to just say “Well, that person wasn’t really a senior-level developer if they made that kind of mistake”, but that waves away the underlying issue. This was not written by a dumb person. This was written by a person who’d been a developer long enough and done well enough to be put in a senior position. More importantly, this was someone who was confident that they understood the semantics of what they were doing, and they were mistaken.
The world seems to be trending increasingly towards a multi-core/many-core environment, which means that understanding how to effectively parallelize code and deal with concurrency issues is going to become more and more of a necessity.
Let’s take a closer look at the example. The elephant in the room is that the lock is being taken on a local variable. Why is that a bad thing? In C#, locks are meant to allow only one thread at a time to execute the code contained within the associated block. Taking a lock on a given instance ensures that any other thread that tries to take a lock on that specific instance will have to wait their turn to execute that code. From that, we can infer a few things.
1. Each thread must be able to tell when another thread has taken the lock.
2. The instance that is locked on must be visible to all threads to allow #1.
A local variable is local to the method and the thread, meaning that every thread will create their own instance, lock on it and then modify shared state. This is equivalent to replacing a stoplight at an intersection with tiny stoplights mounted within individual cars. Everyone thinks that it’s safe for them to go because their light is green and then you run into problems (and other people).
private static readonly object sharedLock = new object();
This is the instance that should have been created, outside the method. It’s visible to all threads within the AppDomain (static), unlikely to be swapped out for a different instance (readonly) and immune to races around the lock instance creation (field initializers are guaranteed to run before the method does).
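Here’s a minimal sketch of that shared lock in use. The names (SharedLockSketch, FormatAll) are made up, a trivial string concatenation stands in for the original Formatter, and Parallel.ForEach is only there to get several threads involved. Note that it only protects the result list; it does nothing about the items themselves.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class SharedLockSketch
{
    // One lock instance, visible to every thread that calls FormatAll.
    private static readonly object sharedLock = new object();

    public static List<string> FormatAll(IEnumerable<int> items)
    {
        var formattedItems = new List<string>();

        Parallel.ForEach(items, item =>
        {
            // The formatting touches no shared state, so it stays outside the lock.
            string formatted = "Item " + item;

            // Every thread locks on the same instance, so only one thread
            // at a time can modify the list.
            lock (sharedLock)
            {
                formattedItems.Add(formatted);
            }
        });

        return formattedItems;
    }
}
```

Because the threads finish in whatever order they like, the list comes back complete but not necessarily in the original order.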
The slightly less obvious elephant in the room is regarding what we’re attempting to protect. The collection that we’re iterating through is shared, so it’s conceivable that other threads could be modifying the collection or the items in the collection at the same time. Assuming that the comment is accurate, it seems reasonable to conclude that the item instance is also shared (meaning a reference type) and that formatting dumps the mutable state within the item to a formatted string.
As nice as it would be to have a magic button that fills your code with fairies and unicorns that make your code play nice with multiple threads, you probably don’t have a keyboard with that button. Instead, you’ll have to make do with a not-so-magic set of good things to think about.
And that’s coming next time.
|
I've spent the past three or four months developing an Android application in Java for work, plus a few side projects (one of which was a soundboard for my wife of random things that I say), which has been an interesting/frustrating/fun journey into something new. I really like the majority of how the Android libraries and application flow are structured; it makes sense and mostly works exactly how you think it would. The pieces of an application are componentized and can be largely strung along together as disparate sections of a whole.
I also knew a bit of Java before starting, having done a few random projects here and there, but this was the first full immersion that I've had and it's been fun. I always like having a chance to learn something new, and this was a new twist on a familiar concept.
I'm experienced at C# and .Net in general, and the joke goes that you just search and replace the word 'using' with 'import' and you've got Java. True, to an extent, but I enjoyed seeing how the same problems were solved in completely different ways. Callbacks, for example.
C# has gone down much more of a functional programming road than Java has, and delegates (and their position as first-class functions) have pushed it even further. The general 'problem' that they solve is that some situations really need to inject callable code into other code. C# also created anonymous methods, to save the trouble of defining a full method just for a callback.
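To make that concrete, here's a small sketch of the C# side. CallbackSketch, Notify and Shout are names I invented for the example; the only real pieces are the Func delegate type and the three ways of supplying a callback.

```csharp
using System;
using System.Collections.Generic;

public static class CallbackSketch
{
    // Takes the callable code as a delegate and runs it once per name.
    public static List<string> Notify(IEnumerable<string> names, Func<string, string> callback)
    {
        var results = new List<string>();
        foreach (var name in names)
        {
            results.Add(callback(name));
        }
        return results;
    }

    // A full method, usable as a callback by name.
    public static string Shout(string name)
    {
        return name.ToUpperInvariant() + "!";
    }
}
```

The callback can be a named method (CallbackSketch.Notify(names, CallbackSketch.Shout)), an anonymous method (delegate(string n) { return "Hi " + n; }) or a lambda (n => "Hi " + n), and the receiving code doesn't care which one it gets.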
Java seemed to like interacting with these calls through interfaces and bare class instances, so to accomplish the same you must create an instance of something matching the parameter types and send it in. That's cumbersome, though, and unwieldy if you only have the class around for the purpose of a single function callback, so they created anonymous classes.
Like a lot of things, this is good and bad.
Good: It allows for further definition of fields and related data, scoped to the class instance across N sibling scoped methods.
Bad: It's still unwieldy if you just need a single method.
Good: At least you don't have to define a separate class file or nested class.
Bad: But you still have to define a class, which you don't need most of the time.
Good: Shut up.
Bad: You're dumb and you like a clumsy language.
Good: NO! JAVA RULES! I SPELL MICRO$OFT WITH A DOLLAR SIGN SO HAH!
Bad: TAKE YOUR FIVE MILLION SHODDY LIBRARY PIECES AND EAT THEM LIKE GLASS.
And it never ends well, so we're moving on. I did have a good experience with this whole thing, though, and I am sensing more Android/Java in my future. No matter the syntax, it's just a different way of saying something and hopefully, with more languages under my belt, I'll become slightly more coherent. Also, hopefully, I'll stop storing my languages in my belt. I am not Batman. My belt doesn't even have pockets on it.
|
I've been thinking a lot about tests in the past few months. Developers, and I use that term in as broad a generalization as it needs to be, hate to write tests. This is not news. Developers are notoriously finicky creatures that gravitate towards things that are new and interesting and that will provide a challenge, and writing tests for their own code is, equally notoriously, none of those things.
Writing the code is exciting, it's creating something new, reshaping existing things, actively doing something constructive. This is the equivalent of going to a party, albeit a nerdy one filled with text. Writing tests is the equivalent to cleaning up after said party and assessing the property damage, perhaps being forced to write a strongly worded letter to said partygoers.
There are people on both sides of the fence.
Side 1: You should write tests for everything. If you don't have a test for something, you don't know if you broke it later.
Side 2: You don't need tests for anything, your code is going to change fairly quickly and you'll just be maintaining tests all day. Besides, QA is going to catch things anyways. It's their job.
Side 3: I'm on the fence about this whole issue, but I think you're both nuts. We need to find a middle ground as a compromise.
Side 1 & 2: No.
What's a reasonable middle ground? You should not write tests for everything, because everything does not need to be tested. You should not not write tests, either, because some things do need to be tested. What falls into that middle ground, I think, are a few immediate requirements.
1) Security
2) Customer-facing Functionality
3) Core Business Functionality
It should be fairly obvious why security is high on the list. If your software has holes that allow your machines to be compromised, that is generally accepted to be a Bad Thing. Tests (and appropriate security-related software) will make sure you're covering your bases.
Customer-facing functionality is a second. Whatever ways your customer can interact with your application, whether that's an API, a framework, a web site, these need to be tested to maintain a certain level of quality. Your customers are how you make money, preserve that.
Core business functionality covers the things that your business needs to have working in order to do business. If you're a shipping company, your shipping code had damn well better have tests for it. Again, protecting yourself.
The issue, though, is getting developers to do this.
It's not my job, that's QA's job.
And you are making QA's job harder, and therefore making a worse product (and probably pissing them off), by not doing what you can before you give it to them.
I don't have time, it's always on to the next thing.
Could be a valid complaint, and one that your project manager (or equivalent) needs to address with development schedule changes. Either that, or maybe you're being lazy. Stop it.
I don't want to maintain tests.
Everyone should write them, so everyone should maintain them. It's not just up to you, but you're basically saying "I only want to do the FUN part of my job". I want to fly the rocket ship to the moon, but I don't want to spend the hours training for it.
I stepped through the code and it worked, that's good enough.
Unless you broke something on a path you didn't check.
I don't want to.
That's the problem, isn't it? I've worked with developers before who have an attitude of "If it's somehow remotely not my problem, I don't have to do anything about it", leading to situations where they spend more time and effort trying to pin the responsibility on a coworker than it would take to fix the problem.
"I'm having an issue building because of one of your checkins. It's a two-line fix, and I know what that fix is, but I'm going to try to make you do it instead of just adding the code myself".
It's a mentality shift and, I'll be completely honest, I have no idea how to change that. I'm not even sure outside influences can affect this too much. Managers and higher-ups can make testing mandatory, repercussions can come into play for failure to test, processes and standards can make things more streamlined but each developer has to get in the habit of doing it because it's beneficial for the product. Developers are smart, and sometimes so smart that they'll shoot themselves in the foot from disinterest, but it has to come from somewhere. Perhaps beatings are in order.
This is why I'm on the technical track instead of the managerial track.
|
A while ago I started learning Haskell, mainly because it seemed interesting, and became reasonably fluent. It was my first foray into functional programming, which I found to be an interesting shift in thinking.
Imperative programming is where I started, and it's where I've been for my entire career. I've gone through Delphi, various C++ versions, a handful of various languages that I've known and used ad hoc (PHP, Java, Perl, IL and VBScript to name a few) and C#, which is where I've been for quite a while.
When I say that there's a shift in thinking, moving to a functional language, I'm talking specifically about two things.
1) Thinking in immutable structures
All of the languages that I'd been exposed to have a strong emphasis on mutability, thinking in terms of variables instead of values. If you set Variable A to the value 5, you can always change it later. That's what the variable is there for, temporary storage of a temporary state. Along with that comes an entire classification of issues surrounding mutability, so I'm also intimately familiar with threading and synchronization issues.
Functional languages (the ones I've been exposed to, anyways) are geared much more highly towards immutable structures, values over variables. You still can mutate things, if you have to, but that's not the focus. Variables, effectively, are placeholders for values and not mutable storage containers. Trying to change a 5 to 6 doesn't make much sense: The number 5 still exists, you haven't suddenly replaced that number, you've just changed your function to use a 6 instead.
2) Thinking in first-class functions
C#, in particular, has made steps in the direction of first-class functions by making delegates very easy to hand around, in addition to including anonymous methods and lambdas, but most imperative languages are not geared towards functions that can take other functions as parameters.
Function composition is a large part of this other way of thinking, taking functions and combining them, composing new chains that take other chains, and that's a little difficult to think about at first.
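For anyone coming from the C# side, here's a minimal sketch of what composition looks like with delegates. Compose and ComposeSketch are names made up for this example; nothing like this helper ships with the framework as far as I know.

```csharp
using System;

public static class ComposeSketch
{
    // Builds a new function that runs 'first' and feeds its result to 'second'.
    public static Func<TIn, TOut> Compose<TIn, TMid, TOut>(
        Func<TIn, TMid> first,
        Func<TMid, TOut> second)
    {
        return input => second(first(input));
    }
}
```

Given Func<int, int> addOne = x => x + 1 and Func<int, int> doubleIt = x => x * 2, ComposeSketch.Compose(addOne, doubleIt)(3) gives 8, and the composed function can itself be handed to Compose to build a longer chain.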
So what's the point?
I'm teaching myself F#. So far, it's an interesting amalgamation of concepts that I learned through Haskell (I believe it's supposed to have strong OCaml roots) and integration into the .Net framework. I'm really enjoying the uses for discriminated unions and, like with Haskell, I find that I like the type inference. When I'm programming with C#, I make constant use of the 'var' keyword to help strip unnecessary type information from the logic and increase the readability, and being able to use a language that figures all of that out is an easy transition and one that I like a lot.
|
Here are the previous posts in this series. Part One Part Two Part Three Part Four
The steps outlined in previous posts were these.
- Lexical analysis (What words you use)
- Parsing (What those words mean to us)
- Semantic analysis (Solidify those meanings from context)
- Output Of Compiled Code
What the hell is Semantic Analysis?
Good question. Readers of previous posts will know that this is how we determine whether we have a problem with how something is said. At this stage in the compilation process, we've tokenized everything and we've built up a list of which types we have available. Because this is a fairly straightforward compiler, the only kind of analysis we really have to do falls into the following categories.
- Reference linking
- Assignment compatibility
- Duplicate fields
The first item on the list means that, while parsing, we will have identified a reference to a defined type (e.g. startingDungeon). We need to determine that this actually references an object. At this point, we're just dealing with tokens. Does this reference a top-level token by this name? Good, we're done here. Ensuring validity when each token is expanded is another story.
The second item on the list is a big one, and potentially (because of scripts) a mildly annoying thing to resolve (especially because we are doing type inference behind the scenes). In the case of Definitions, it's easy. We know the inferred type of the right-hand side, and we can inspect the type of the property being set on the base class through reflection. Do they match? Good.
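A rough sketch of that Definition check might look like the following. The Room class is a stand-in for whatever base class the engine really uses, and IsAssignable is a name invented for this example; the reflection calls themselves (GetProperty, CanWrite, IsAssignableFrom) are the standard ones.

```csharp
using System;
using System.Reflection;

// Hypothetical base class with a couple of settable properties.
public class Room
{
    public int TurnsUntilDeath { get; set; }
    public string Name { get; set; }
}

public static class AssignmentCheckSketch
{
    // Returns true when a value of 'inferredType' can be assigned to the
    // named property on 'targetType'.
    public static bool IsAssignable(Type targetType, string propertyName, Type inferredType)
    {
        PropertyInfo property = targetType.GetProperty(propertyName);
        if (property == null || !property.CanWrite)
        {
            return false; // no such property, or nothing to assign to
        }
        return property.PropertyType.IsAssignableFrom(inferredType);
    }
}
```

One caveat with this approach: IsAssignableFrom knows about inheritance and interfaces but not about implicit numeric conversions, so an int on the right-hand side would be rejected for a long property unless that case is handled separately.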
Scripts, however, can define implicitly typed local variables, so we may have to walk the chain backwards. For this reason, and because scripts depend completely on existing types, all Definitions are completed and verified prior to script analysis.
script:TeleportToBadDungeon
{
    var badDungeon = firstEvilDungeon_LevelOne_RoomTwo;
    badDungeon.TurnsUntilDeath = player.Health;
    player.CurrentRoom = badDungeon;
    Print("The dungeon walls change, warping slightly. It looks... different.");
}
This is a very simple case. The first line creates a local variable, and the type on the right side is a reference. We go and look up the reference, which we infer as the local variable's type. We also look up a reference to the player, then make sure that they each have the properties that they say they do and that they're assignment compatible.
Duplicate fields are fairly easy, as everything is on a Name Only basis. If another token at the same level has the same name, that's a problem. There is no such thing as overload resolution in this.
So, let's turn that into something else!
I chose to compile this custom code into IL because that's a common ground that this would have with other .Net code. The actual game engine is going to dynamically load a game assembly and try to run it, so as long as it's another .Net assembly then it's very easy.
IL is a slightly different language in structure. It's a stack-based language, meaning that each instruction either pushes values onto, pops values off of or does nothing to an evaluation stack. For example, to return a hardcoded string from a method, you would do this.
ldstr "Hello"
ret
The ldstr instruction loads the specified string onto the stack, and the ret instruction takes the top item off of the stack and returns it.
Because I was writing this in C#, I made use of the System.Reflection.Emit namespace, which has assembly compilation and IL generation tools built in. At that point, it's a simple task to send a token in and convert that to the appropriate IL structure. The end result is a compiled assembly, with optional debug information so stepping through the custom source is possible, that the game engine can load and execute.
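To give a feel for the API, here's a minimal sketch that emits the two-instruction method from above. It uses DynamicMethod rather than the full assembly-building route described in this post, purely because it's the shortest way to get runnable IL; EmitSketch and BuildHello are names made up for the example.

```csharp
using System;
using System.Reflection.Emit;

public static class EmitSketch
{
    public static Func<string> BuildHello()
    {
        // A method with no parameters that returns a string.
        var method = new DynamicMethod("Hello", typeof(string), Type.EmptyTypes);

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldstr, "Hello"); // push the string onto the evaluation stack
        il.Emit(OpCodes.Ret);            // pop it and hand it back to the caller

        return (Func<string>)method.CreateDelegate(typeof(Func<string>));
    }
}
```

Invoking the delegate that BuildHello() returns gives back "Hello". Emitting into a saved assembly follows the same ILGenerator pattern, just with more builder objects wrapped around it.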
That's a very, very, extremely quick overview of the whole process, anyways. I'm thinking of doing a different compiler next time and seeing if I can apply things that I've learned to it. I'm also playing around with the idea of implementing a garbage collector. Maybe those two can be related.
I can't believe I made it through all of that
If you did, I'm surprised too. I hope you found it interesting, though. For me, I find that I keep going through what amounts to academic exercises to keep learning, and because it sparks something in my brain. I definitely wouldn't mind doing DevDiv someday or even working on a compiler, it sounds like a lot of fun and interesting work. Even then, though, I don't think I'll be stopping any time soon.
|
I've noticed that I tend to adopt programming style idioms for two reasons. Either it's mandated by the current employer, and there are quite a few of those kinds (although they tend to focus more on naming conventions and structure semantics), or it's the best option that I'm aware of. This has obviously evolved over time; my programming style now looks nothing like it used to, and this is good because it's a positive shift.
My first official job was programming in Delphi, and functions in that language have a built-in variable called Result that you may assign to at any point during the function's execution. The Result variable holds the function's return value until the function finishes, at which point it hands back to the caller whatever is in there. This is opposed to the C-style 'return' statement that terminates the function execution when run, handing back what you just gave it.
Even though languages like C# retain the 'return' statement concept, I've noticed that some people still tend towards Delphi-like semantics to structure their code.
public ResultEnum DetermineResult(long amount)
{
    ResultEnum result = ResultEnum.InvalidAmount;

    if (amount == this.FullAmount)
        result = ResultEnum.FullAmount;
    else if (amount < this.FullAmount)
        result = ResultEnum.BelowFullAmount;
    else
        result = ResultEnum.AboveFullAmount;

    if (result == ResultEnum.BelowFullAmount)
    {
        if (!this.CanFallBelow)
            result = ResultEnum.InvalidAmount;
    }

    return result;
}
This way has been described to me by some people as easier to read, stating that the result value could very well go through multiple states in the method and only the final state (when the method is done) should be returned. It's also been said that it's easier to reason about because the method should only return once, at the end, and you don't have to track N potential short-circuits of method execution.
That's true, but that means that you have to track N potential state changes instead and, worse, you may have to deal with logic involving those partial states and your current 'result'. This also makes it harder to think about the method logic, because you're dealing with logic that's depending on a given state and, instead of encapsulating that within the appropriate branch of logic related to that choice, the state serves as a placemarker for later logic that may be run at some point.
It is far easier for me, personally, to read something that does not potentially drag along state that will not affect the method. If you already know that a given state should be its final one, don't bother going through the pretense of running the rest of the method. Discard those by returning early, and you can be guaranteed (both when writing code and reading it later) that the logic in the rest of the method does not care about that state anymore. The idea is to restrict the size of the state that you have to think about at a given point in time, which makes for a cleaner and less convoluted method body. This is both more likely to produce code that is verifiably correct and easier to interact with.
public ResultEnum DetermineResult(long amount)
{
    if (amount == this.FullAmount)
        return ResultEnum.FullAmount;

    if (amount > this.FullAmount)
        return ResultEnum.AboveFullAmount;

    if (this.CanFallBelow)
        return ResultEnum.BelowFullAmount;

    return ResultEnum.InvalidAmount;
}
The idea, with all styles, is to improve the readability and the overall accuracy of a given piece of code. If it's not going to help produce quality code, there's no point in doing it (unless your employer requires it, but presumably that is also to help by achieving consistency).
|
I developed an extension for Visual Studio recently, and as a part of that I needed to inspect the assemblies that were built as a part of the current solution. Pulling them directly into memory with any of the Assembly load methods (Load, LoadFrom, etc) was not a good idea because those assemblies will stay in memory until the AppDomain containing them is unloaded (which will be when Visual Studio closes). As an unfortunate side effect of that, once you run the extension that inspects the assemblies, you can't build anymore because they're locked by the fact that they're in use by a currently running process.
The answer to that is to create a separate AppDomain, load the assemblies into that and unload that AppDomain when you're done. I already knew that, but hadn't ever done it, so I looked around and found a few good resources. There weren't a lot of them, though, and the ones that I did find didn't explain a few things that it would have been useful to know.
Here are the basic steps.
1) Create the AppDomain
2) Load the assemblies into the new AppDomain
3) Get the requested data out of the assemblies
4) Unload the AppDomain when no longer used
Create the AppDomain
Before we can create the domain, we need to define the configuration for that domain.
var appDomainSetup = new AppDomainSetup()
{
    ApplicationBase = AppDomain.CurrentDomain.BaseDirectory
};
The ApplicationBase property defines the physical location that is the root for assembly probing, rights and other things. To ensure that this domain has the same abilities as our current one, we'll just grab the path from our current domain and use that for the new one. You could alternately use System.Environment.CurrentDirectory, Path.GetDirectoryName(this.GetType().Assembly.Location) or any number of different ways to get a directory.
Now that we have the configuration defined, creating the new domain can be done in one line. We're not anticipating running this multiple times at once, so a hardcoded string works well for the name.
var customDomain = AppDomain.CreateDomain("CustomDomain", null, appDomainSetup);
The last thing is a very important thing. You now need to resolve assemblies for the current domain yourself. This is a bit of weirdness that needs to be explained, but I'll get into that a little later.
AppDomain.CurrentDomain.AssemblyResolve += customDomain_AssemblyResolve;
Load the assemblies into the new AppDomain
Now that the separate AppDomain has been created, how do we actually load things into it? The first inclination may be to use AppDomain.Load() from your current location and that would be a mistake. Leaving aside the fact that several of those overloads are deprecated and the functionality is used mainly for COM interoperability, all of those methods return an Assembly. It may load through the separate AppDomain, but when it comes back it will be loaded into your current AppDomain as well. That defeats the whole purpose.
There is a method on an AppDomain called CreateInstanceAndUnwrap(), which combines the CreateInstance() and Unwrap() calls. This will create an instance of a type in a separate AppDomain and hand you back a __TransparentProxy to that object, in effect giving you an instance that can go into the context of that AppDomain and load all the assemblies it wants. We don't have one of those classes yet, but that sounds like a really good idea. Obviously, that class needs to be capable of being passed across AppDomain boundaries, so it needs to inherit from MarshalByRefObject.
private class CrossDomainQuery : MarshalByRefObject { public void LoadDataFromAssembly(string assemblyPath) { var assembly = Assembly.LoadFrom(assemblyPath); } }
And then, from our current AppDomain, we can do the following.
var crossDomainQuery = (CrossDomainQuery)customDomain.CreateInstanceAndUnwrap( typeof(CrossDomainQuery).Assembly.FullName, typeof(CrossDomainQuery).FullName); crossDomainQuery.LoadDataFromAssembly(assemblyPath);
And here's where the part about resolving assemblies in our current AppDomain becomes extremely relevant. The CreateInstanceAndUnwrap() call hands back a __TransparentProxy object which, frankly, is worthless to us because we can't use it to do what we want. Casting that to the appropriate type is a necessary thing, so we cast to a CrossDomainQuery object, but that cast triggers a resolution. It says "Hey, hold on, where did you get that type from?" and hits the AssemblyResolve event for the current domain (because that's where we're loading this type into).
private Assembly customDomain_AssemblyResolve(object sender, ResolveEventArgs args) { return Assembly.Load(args.Name); }
This is potentially a very, very dangerous thing to do. Using Assembly.Load() to resolve the assembly by name makes the implicit assumption that you have already loaded this assembly. If you have, it will return the assembly reference that you've got in memory, which is what we want because we're using types from our current, in-memory assembly. If you haven't, however, that call will fail and trigger another AssemblyResolve event to resolve that failed resolution, and so on, eventually cascading into a stack overflow and killing your process. The appropriate thing to do, if this had to be more robust, would be to develop a resolution mechanism that avoids this, but for our current needs and expectations this works just fine.
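As a rough idea of what a safer resolution mechanism might look like, here is a minimal sketch that only searches the assemblies already loaded into the current AppDomain. The class name LoadedAssemblyResolver is made up for illustration; the point is that a miss returns null instead of calling Assembly.Load() again, so it cannot re-enter the AssemblyResolve event.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical helper, not part of the original extension.
public static class LoadedAssemblyResolver
{
    // Matches the ResolveEventHandler signature, so it can be attached
    // directly to AppDomain.CurrentDomain.AssemblyResolve.
    public static Assembly Resolve(object sender, ResolveEventArgs args)
    {
        // Only look at what is already in memory. Returning null tells the
        // runtime the assembly is unresolved, without triggering another
        // load attempt (and therefore another AssemblyResolve event).
        return AppDomain.CurrentDomain
            .GetAssemblies()
            .FirstOrDefault(a => string.Equals(a.FullName, args.Name, StringComparison.Ordinal));
    }
}
```

It would be hooked up the same way as the handler above: AppDomain.CurrentDomain.AssemblyResolve += LoadedAssemblyResolver.Resolve;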
Get the requested data out of the assemblies
Now that we have an object that can go into the other domain and get what we want, how do we get that data back? We can't just send the assembly back and say "Get it yourself", because that will cause a stack overflow from failed resolutions and, even if we did resolve it correctly, it would load it into the calling AppDomain and we're back where we started.
So, what data are we looking for? Primitives like strings and ints are easy and pass boundaries very well, so if you can condense your request down to things like "I just want a string with the assembly's fully qualified type name" or something similar then you're basically done. For argument's sake, though, let's say we have a common attribute class defined in a shared assembly that we'll be watching for in the loaded assembly and we either want the attribute or the data inside it.
Plugin architecture works under similar scenarios, watching for interface implementations, attributes and so on. Our code needs to pull in a specific assembly, inspect it, get information from it (possibly even routing calls to it) but let it go when we're done. Inside of that separate AppDomain we can do whatever we want, but we probably don't want to implement all of the logic related to plugins in that CrossDomainQuery object. If we can't send the assembly back, though, what can we send?
Well, to pass a type back and forth across the boundary, we need a few things.
1) We need to know about the type within our current AppDomain, whether we already have that knowledge in memory or whether we have to resolve and load some external assembly.
2) The type needs to be capable of being passed across AppDomain boundaries.
Step 1, check. We added a reference to the shared library containing the attribute definition, we know about it. Because it's in memory, our existing AssemblyResolve implementation will work fine for this as well. Step 2, not a check yet. The type either needs to inherit from MarshalByRefObject, which can definitely be passed across, or it needs to be marked as serializable with the [Serializable] attribute. Either create a separate container that holds the attribute information or add a [Serializable] tag to the attribute and you're good to go. Now the method inside of the CrossDomainQuery object can look like this.
public CustomPluginAttribute[] LoadDataFromAssembly(string path) { var assembly = Assembly.LoadFrom(path); return assembly.GetCustomAttributes(typeof(CustomPluginAttribute), false).Cast<CustomPluginAttribute>().ToArray(); }
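If you go the separate-container route instead of marking the attribute itself as [Serializable], it might look something like the sketch below. Everything here is an assumption for illustration: the PluginInfo type, its properties, and the Name property on CustomPluginAttribute are invented, since the real attribute's contents aren't shown.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Stand-in for the shared attribute; the Name property is an assumption.
[AttributeUsage(AttributeTargets.Assembly, AllowMultiple = true)]
public sealed class CustomPluginAttribute : Attribute
{
    public CustomPluginAttribute(string name) { Name = name; }
    public string Name { get; private set; }
}

// Hypothetical container. It holds only strings and is marked
// [Serializable], so it can be copied across the AppDomain boundary.
[Serializable]
public sealed class PluginInfo
{
    public string Name { get; set; }
    public string AssemblyName { get; set; }
}

public class CrossDomainQuery : MarshalByRefObject
{
    public PluginInfo[] LoadDataFromAssembly(string path)
    {
        var assembly = Assembly.LoadFrom(path);

        // Copy what we need out of each attribute so that nothing tied to
        // the loaded assembly has to travel back to the caller.
        return assembly
            .GetCustomAttributes(typeof(CustomPluginAttribute), false)
            .Cast<CustomPluginAttribute>()
            .Select(a => new PluginInfo { Name = a.Name, AssemblyName = assembly.FullName })
            .ToArray();
    }
}
```

The trade-off is one extra type to maintain, in exchange for leaving the attribute definition untouched.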
Unload the AppDomain when no longer used
Once you're done, you need to clean up your mess. This means unloading the AppDomain and unhooking the AssemblyResolve event (unless you really like keeping that around for some reason).
AppDomain.Unload(customDomain); AppDomain.CurrentDomain.AssemblyResolve -= customDomain_AssemblyResolve;
This will unload all assemblies that you may have loaded into that AppDomain as well, freeing them up for use, replacement or whatever happens to come by. No harm done, everyone's happy, and now you have an easy way to manage extra assemblies you may need to pull in and release on a conditional basis.
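For reference, the four steps can be pulled together into something like the following sketch. It assumes the .NET Framework (the three-argument AppDomain.CreateDomain overload used here is not available on .NET Core and later), the AssemblyInspector wrapper is a made-up name, and the query just returns the assembly's full name to keep things short. The try/finally is my own addition so the domain is unloaded even if the query throws.

```csharp
using System;
using System.Reflection;

public class CrossDomainQuery : MarshalByRefObject
{
    // Runs inside the custom domain; only a string crosses the boundary.
    public string LoadDataFromAssembly(string assemblyPath)
    {
        return Assembly.LoadFrom(assemblyPath).FullName;
    }
}

// Hypothetical wrapper around the steps described in this post.
public static class AssemblyInspector
{
    public static string GetAssemblyName(string assemblyPath)
    {
        // 1) Create the AppDomain
        var setup = new AppDomainSetup { ApplicationBase = AppDomain.CurrentDomain.BaseDirectory };
        var customDomain = AppDomain.CreateDomain("CustomDomain", null, setup);

        // Same simple resolution strategy as in the post, with the same caveats.
        ResolveEventHandler handler = (sender, args) => Assembly.Load(args.Name);
        AppDomain.CurrentDomain.AssemblyResolve += handler;
        try
        {
            // 2) Load the assemblies into the new AppDomain via the proxy
            var query = (CrossDomainQuery)customDomain.CreateInstanceAndUnwrap(
                typeof(CrossDomainQuery).Assembly.FullName,
                typeof(CrossDomainQuery).FullName);

            // 3) Get the requested data out of the assemblies
            return query.LoadDataFromAssembly(assemblyPath);
        }
        finally
        {
            // 4) Unload the AppDomain and unhook the event
            AppDomain.CurrentDomain.AssemblyResolve -= handler;
            AppDomain.Unload(customDomain);
        }
    }
}
```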
|