Sunday, January 5, 2020

Using Type.GetType with .NET Core / Dynamically Loading .NET Standard Assemblies in .NET Core

The Type.GetType() method lets us load a type from the file system at runtime without needing compile-time references. Because of the way that assemblies are loaded in .NET Core, this behaves differently than it did in .NET Framework.

Update Jan 3, 2020: The Type.GetType() method has not changed, but runtime behavior regarding assembly loading has changed. That doesn't negate the workaround shown here, so feel free to keep reading. For more information see the next article: Type.GetType() Functionality Has Not Changed in .NET Core.

Short Version
In .NET Framework, calling Type.GetType() with an assembly-qualified name...
  • Loads the specified assembly from the file system
    Note: "from the file system" is observed behavior, but technically not correct -- see the followup article mentioned above for more details. 
  • Returns a Type object
In .NET Core, calling Type.GetType() with an assembly-qualified name...
  • Does *NOT* load the specified assembly from the file system
  • If the assembly is not already loaded, GetType() returns null
  • If the assembly is already loaded, GetType() returns the Type object
This means that we need to take extra steps in order to use Type.GetType() in .NET Core applications. And things get more interesting if we are loading .NET Standard assemblies.
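
One way to tell which of the two .NET Core cases applies is to check whether an assembly is already loaded before calling Type.GetType(). This is a minimal sketch, not code from the sample repositories; the class and method names are made up for illustration.

```csharp
using System;
using System.Linq;

public static class LoadedAssemblyCheck
{
    // Returns true when an assembly with this simple name (no ".dll",
    // no version) is already loaded in the current process.
    public static bool IsLoaded(string simpleName)
    {
        return AppDomain.CurrentDomain.GetAssemblies()
            .Any(a => string.Equals(a.GetName().Name, simpleName,
                                    StringComparison.OrdinalIgnoreCase));
    }
}
```

For example, LoadedAssemblyCheck.IsLoaded("PersonReader.CSV") would be expected to return false until something loads that assembly.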

Note: The code for this article comes from my talk on C# interfaces (IEnumerable, ISaveable, IDontGetIt: Understanding .NET Interfaces). The code is in 2 separate GitHub repositories. The .NET Framework code is at GitHub: jeremybytes/understanding-interfaces. The .NET Core code is at GitHub: jeremybytes/understanding-interfaces-core30.
In the spirit of "the fastest way to get a right answer is to post a wrong one", I'm putting down my thought processes in solving an issue. If there is an easier way (such as setting a flag or adding a config setting), please let me know in the comments.
Why Dynamically Load Types?
I have used dynamic loading of types for 2 primary scenarios: (1) swapping one set of functionality for another, and (2) plugging in business rules. In either of these scenarios, we do not need to have the specifics of the dynamically loaded types available at compile time. Things can be figured out at run time.

In the first scenario (and the code sample we'll look at today), I have changed out data-access code from one system to another. For example, to get data from a SQL database, a web service, or some other location. This is particularly helpful when there are multiple clients using the same application with different data storage systems.

In the second scenario, I deployed an application with the option of adding or changing certain business rules at a later time. The business rules follow a specific interface and are stored in a separate assembly (or multiple assemblies). At runtime, the application loads the business rules from the file system. This made it easy to update existing rules, add rules, or remove rules without needing to recompile or redeploy the application; only the business rule files were affected by updates.

Dynamic Loading with .NET Framework
I ran across this issue when I was moving a WPF application from .NET Framework 4.7 to .NET Core 3.0. We'll start with code from the .NET Framework repository mentioned above (GitHub: jeremybytes/understanding-interfaces), specifically the "completed" code in the "04-DynamicLoading" project (completed/04-DynamicLoading).

This application dynamically loads a data repository that can get data from a text file (comma-separated values (CSV)), a web service (HTTP/JSON), or a SQL database (SQLite local db). The application does not know anything about these data repositories at compile-time. Instead, it loads a repository from the file system based on configuration.

Here is the code that dynamically loads the repository (from the RepositoryFactory.cs in the PeopleViewer project):


The "GetRepository" method returns IPersonRepository -- this is the interface that represents the data repository.

The first line of the method gets the assembly-qualified name from configuration. Here's the configuration section for the CSV Repository (from the App.config file of the PeopleViewer project):


This lists the assembly-qualified name for the repository. This consists of the following parts:
  • Fully-qualified Type Name: PersonRepository.CSV.CSVRepository
  • Assembly Name: PersonRepository.CSV (this is "PersonRepository.CSV.dll" on the file system)
  • Assembly Version: 1.0.0.0
  • Assembly Culture -- this is used for localization (which we haven't implemented here)
  • Assembly Public Key Token -- this is used for strongly-named assemblies (which we haven't implemented here)
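
To make these parts concrete, here is a small sketch that splits an assembly-qualified name into its type name and assembly details. The type name, assembly name, and version are the ones listed above; the "Culture=neutral" and "PublicKeyToken=null" values are assumed defaults, and the class and method names are hypothetical.

```csharp
using System;
using System.Reflection;

public static class AssemblyQualifiedNameParts
{
    // Illustrative value only. Culture and PublicKeyToken are assumed defaults.
    public const string Example =
        "PersonRepository.CSV.CSVRepository, PersonRepository.CSV, " +
        "Version=1.0.0.0, Culture=neutral, PublicKeyToken=null";

    // Splits at the first comma: the text before it is the type name,
    // the text after it is the assembly's display name.
    // This simple split does not handle generic type names, which
    // contain commas of their own.
    public static (string TypeName, AssemblyName AssemblyInfo) Split(string assemblyQualifiedName)
    {
        int comma = assemblyQualifiedName.IndexOf(',');
        if (comma < 0)
        {
            throw new ArgumentException(
                "Expected 'TypeName, AssemblyName, ...'", nameof(assemblyQualifiedName));
        }

        string typeName = assemblyQualifiedName.Substring(0, comma).Trim();
        var assemblyInfo = new AssemblyName(assemblyQualifiedName.Substring(comma + 1).Trim());
        return (typeName, assemblyInfo);
    }
}
```

Calling Split(Example) gives the type name "PersonRepository.CSV.CSVRepository" and an AssemblyName whose Name is "PersonRepository.CSV" and whose Version is 1.0.0.0.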
The next line in our method makes a call to Type.GetType() with this assembly-qualified name. GetType will find the assembly on the file system (by default, it looks in the same folder as the executable). Then it loads the assembly and pulls out the information for the Type object.

Activator.CreateInstance will create an instance of the Type using a default constructor. In this case, it will be a CSVRepository.

The last 2 lines cast the new instance to the correct interface and return it.
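
The steps described above can be sketched as follows. This is an approximation based on the walkthrough, not a copy of RepositoryFactory.cs: the type name is passed in as a parameter instead of being read from App.config so that the sketch stays self-contained, and the generic Create method is a hypothetical stand-in for "GetRepository".

```csharp
using System;

public static class DynamicFactory
{
    // Creates an instance of the named type and casts it to T.
    public static T Create<T>(string assemblyQualifiedName) where T : class
    {
        // Look up the Type from its (assembly-qualified) name.
        Type type = Type.GetType(assemblyQualifiedName);

        // Create an instance using the public parameterless constructor.
        object instance = Activator.CreateInstance(type);

        // Cast to the requested interface; "as" yields null if the
        // created object does not implement T.
        return instance as T;
    }
}
```

In the application, T would be IPersonRepository and the name would be the configured value for the CSV repository. With a built-in type, DynamicFactory.Create<System.Runtime.Serialization.ISerializable>("System.Text.StringBuilder") returns a new StringBuilder.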

Getting the Repository Assemblies
One other piece is that we need to get the CSVRepository assembly to the executable folder somehow. The project does not have any compile-time references to the assembly, so we need to do this manually.

We have the repositories (and all of their dependencies) in a folder at the solution level called "RepositoriesForBin" (you can look at the contents of RepositoriesForBin on GitHub). Here's a bit of a screenshot from Windows File Explorer:


This snippet shows the "PersonRepository.CSV.dll" file that we're using here. In addition, we have files for the web service repository, the SQL database repository, and all of the dependencies for those repositories.

To get these into the executable folder, the PeopleViewer project has a post-build step. (It's kind of buried in the PeopleViewer.csproj file on GitHub -- it's easier to look at this in the Visual Studio project properties). Here is the post-build section of the project properties:


This copies files from the "RepositoriesForBin" folder to the output folder (including any sub-folders).

For more information on build events, take a look at "Using Build Events in Visual Studio to Make Life Easier".

Running the Application
When we run the application, we get data using the CSV repository (which gets data from a text file on the file system).


Changing the Data Source
If you're curious about how the dynamic loading works, shut down the application and locate the executable file in File Explorer (it is in the PeopleViewer/bin/Debug folder).

Run "PeopleViewer.exe" by double-clicking it from File Explorer. You will see the same results as above.

Shut down the application, and then edit the "PeopleViewer.exe.config" file on the file system using your favorite text editor. Comment out the section for the "CSV Repository" and uncomment the section for "SQL Repository". Save and close the file.

Now when you re-run the application, it will use the SQL database instead of the text file.

In real life, we would not be doing this on a single machine. However, think of the scenario where we have multiple clients. For each client, we give them just the assemblies that they need for their particular environment. If a new client has a different data store, that's fine. We create a repository assembly and give it to that client. We do not need to recompile the application or deal with multiple versions deployed at different client sites.

Anyway, on to .NET Core.

Converting to .NET Core
In the .NET Framework version of the solution, the WPF application (PeopleViewer) targets .NET Framework 4.7. The web service (People.Service) is an ASP.NET Core 2.2 API. The interface project (PersonRepository.Interface) is a .NET Standard 2.0 project. All of the repository projects are .NET Standard 2.0 as well.

Part of the reason for using .NET Standard projects for many things is that I knew I would be moving the WPF application to .NET Core once .NET Core 3.0 was released. And that's what I did.

I moved the WPF application and the web service to .NET Core 3.0. And I moved the libraries (including the repositories) to .NET Standard 2.1.

One other thing I did with this application was change all references from "Repository" to "Reader". Since the operations for the repositories are read-only, the term "reader" is more appropriate.

The completed code is on GitHub (jeremybytes/understanding-interfaces-core30), specifically in the "completed/04-DynamicLoading" folder. Note that this repository has the completed code, so you won't be able to follow along with the interim code (you can contact me for details if you'd really like to follow along).

Broken Code in .NET Core
Unfortunately, if we take the "GetRepository" method (now called "GetReader") straight across, the code does not work.

If we run the application, we get an exception:


The "Activator.CreateInstance" method is giving us an ArgumentNullException. This means that the "readerType" variable that we have here is null.

"GetType" is not returning what we want. Here are the values of "readerTypeName" and the "readerType" in the debugger:


This shows that the "readerTypeName" variable is populated with the assembly-qualified name that we expect. So that's fine.

But the "readerType" that is returned from the GetType method is null.
GetType does not automatically load an assembly in .NET Core like it does in .NET Framework.
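
Since a null Type only surfaces later as an ArgumentNullException from the Activator, a guard right after the lookup can make the failure easier to diagnose. This is a minimal sketch with hypothetical names; it is not part of the sample code.

```csharp
using System;

public static class SafeTypeLookup
{
    // Wraps Type.GetType and fails with a descriptive message
    // instead of handing a null Type to later code.
    public static Type GetRequiredType(string assemblyQualifiedName)
    {
        Type type = Type.GetType(assemblyQualifiedName);
        if (type == null)
        {
            throw new InvalidOperationException(
                $"Could not resolve '{assemblyQualifiedName}'. " +
                "Check that the assembly is loaded or can be resolved by the runtime.");
        }
        return type;
    }
}
```

Another option is the Type.GetType(string, bool) overload with throwOnError set to true, which throws instead of returning null.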
Frustration and Reasoning
This is where I went through a bit of frustration. When checking the documentation for "GetType", there is currently (as of Jan 5, 2020) no indication that it works differently. Here is a screenshot of the beginning of the "Remarks" section for Type.GetType (link (which will hopefully be updated by the time you read this): https://docs.microsoft.com/en-us/dotnet/api/system.type.gettype?view=netcore-3.1#System_Type_GetType_System_String_):


Note that I do have ".NET Core 3.1" selected for the Version. Here is the start of the "Remarks" text:
"You can use the GetType method to obtain a Type object for a type in another assembly if you know its assembly-qualified name, which can be obtained from AssemblyQualifiedName. GetType causes loading of the assembly specified in typeName." (emphasis mine)
So, according to the documentation, this should work.

Assembly Loading and Unloading
The reason that this does not work is that the assembly loading mechanism was changed for .NET Core. This was done for a couple of reasons. First, we can set up different assembly load contexts; this lets us load different versions of assemblies into different contexts in the same application. This was not really possible before. Second, we can unload assemblies after we're done with them. Again, this is something that was very difficult to do before.
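
As a rough illustration of those two capabilities, the sketch below creates a separate load context that can be unloaded later. It assumes .NET Core 3.0 or later; the helper name and the context name are arbitrary choices for this example, and the rest of this article keeps using the default context instead.

```csharp
using System;
using System.Runtime.Loader;

public static class LoadContextDemo
{
    // Creates a separate, collectible load context.
    // "Collectible" means the context supports Unload().
    public static AssemblyLoadContext CreateCollectible(string name)
    {
        return new AssemblyLoadContext(name, isCollectible: true);
    }
}
```

A caller could load an assembly into the new context with LoadFromAssemblyPath and later call Unload() on the context. Unloading is cooperative: the assemblies are only released once nothing references them anymore.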

Manually Loading an Assembly
In getting this to work, my first step was to manually load the assembly by hard-coding the value. Here is the code for that.


Before calling the "GetType" method, this code loads the CSV assembly into the context using "AssemblyLoadContext.Default.LoadFromAssemblyPath". This will load the assembly into the default context (which is the main one that the application uses). The parameter is the assembly file name with the full path.

For the path, there is an assemblyPath variable that is set to the current location of the executable (AppDomain.CurrentDomain.BaseDirectory) with the file name appended (PersonReader.CSV.dll).
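
A sketch of that hard-coded approach, under a few assumptions: Path.Combine is used here to append the file name (the text above only says the name is appended), the file name and type name are passed in as parameters, and the class and method names are made up for illustration.

```csharp
using System;
using System.IO;
using System.Runtime.Loader;

public static class HardCodedLoad
{
    // Full path = folder of the executable + assembly file name.
    public static string BuildAssemblyPath(string fileName)
    {
        return Path.Combine(AppDomain.CurrentDomain.BaseDirectory, fileName);
    }

    // Loads the file into the default context first, then asks
    // Type.GetType for the type by its assembly-qualified name.
    public static Type LoadThenGetType(string fileName, string assemblyQualifiedName)
    {
        string assemblyPath = BuildAssemblyPath(fileName);
        AssemblyLoadContext.Default.LoadFromAssemblyPath(assemblyPath);
        return Type.GetType(assemblyQualifiedName);
    }
}
```

For the CSV reader, the file name would be "PersonReader.CSV.dll". LoadFromAssemblyPath expects an absolute path, which is why the base directory is included.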

This gets us a working application:


But it is of limited usefulness since the CSV reader assembly is hard-coded.

A Different Approach
At this point, I figured that I could try to parse the file name out of the assembly-qualified name that we already have in the configuration file, or I could take a different approach.

When we manually load an assembly into the context like we did above, we also get a reference to that assembly. This means that instead of using "GetType" to locate a type, we can poke into the assembly directly using reflection.

For this approach, I made a few changes to configuration, output folders, and code.

Note: this is not the final version of the code, but you can find it by looking at a particular commit in GitHub: commit/49dc7a33d8071e9eef83d9e1a1d7bba5c3de50cb.

New Configuration
Rather than having the full assembly-qualified name of the type, I created settings for just the parts that I needed. Here is the new configuration (in the App.config file for the commit mentioned above):


Now we have a "ReaderAssembly" key with a value of "PersonReader.CSV.dll" -- the name of the file on the file system. We also have "ReaderType" which is "PersonReader.CSV.CSVReader" -- the fully-qualified name of the reader type.

New Output Folder
In addition, since we will no longer rely on "GetType" being able to find files in the executable folder, I decided to move the reader files to a separate sub-folder in the output. This makes it easier to keep track of the reader assemblies, particularly if we need to remove or change the files.

Along with the new output folder comes updated post-build steps. These are in the PeopleViewer.csproj file for the commit mentioned above. Here is the view from Visual Studio, which is a bit easier to read:


This has 2 copy steps. The first step copies files from the "AdditionalFiles" folder into the output folder. This folder contains the data files that are used by the readers, specifically People.txt (for the CSV reader) and People.db (for the SQL reader).

The next step copies files from the "ReaderAssemblies" folder to a "ReaderAssemblies" subfolder in the output. This contains the dlls for the readers along with the dependencies.

New Code
Along with the new configuration and output location, we have some new code to dynamically load the specified data reader. This is in the ReaderFactory.cs file for the commit mentioned above:


Let's walk through this code.

First we get the "ReaderAssembly" value from configuration. As a reminder, this is "PersonReader.CSV.dll".

Next, we create the full directory path to that file by taking the "BaseDirectory" (where the executable is), appending the new "ReaderAssemblies" subfolder, and then adding the name of the file.

As a side note, the "Path.DirectorySeparatorChar" will pick the correct character for the operating system. So in Windows, it will use the backslash; in Linux and macOS, it will use the forward slash.

Notice that after calling "LoadFromAssemblyPath", we store the return value as "readerAssembly". This is the assembly that we just loaded.

The next step is to get the "ReaderType" from configuration. As a reminder, this is "PersonReader.CSV.CSVReader".

Next we get the reader type out of the loaded assembly. This code uses a little bit of LINQ to reflect into the assembly. "ExportedTypes" is a collection of all of the publicly visible types that are in the assembly. In the query, we go through the types and try to find one that matches the value from configuration. If the type is not found, this method returns null.

The rest of the method is what we had before. Once we have the Type, we can use the Activator to create an instance, and then we cast it to the appropriate type.
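
Putting the walkthrough together, the method might look roughly like the sketch below. It is reconstructed from the description above and is not the code from the commit: configuration values are passed in as parameters, Path.Combine stands in for building the path with "Path.DirectorySeparatorChar", and the class and method names are hypothetical.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Reflection;
using System.Runtime.Loader;

public static class ReflectionFactory
{
    // Finds a publicly visible type by its full name.
    // Returns null when the assembly has no such type.
    public static Type FindExportedType(Assembly assembly, string fullTypeName)
    {
        return assembly.ExportedTypes
            .FirstOrDefault(t => t.FullName == fullTypeName);
    }

    public static T Create<T>(string assemblyFileName, string fullTypeName) where T : class
    {
        // <executable folder>/ReaderAssemblies/<file name>
        string path = Path.Combine(
            AppDomain.CurrentDomain.BaseDirectory, "ReaderAssemblies", assemblyFileName);

        // Load the assembly and keep the returned reference.
        Assembly readerAssembly = AssemblyLoadContext.Default.LoadFromAssemblyPath(path);

        // Reflect into the loaded assembly instead of calling Type.GetType.
        Type readerType = FindExportedType(readerAssembly, fullTypeName);

        object reader = Activator.CreateInstance(readerType);
        return reader as T;
    }
}
```

With the configuration above, the arguments would be "PersonReader.CSV.dll" and "PersonReader.CSV.CSVReader".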

Working Code (sort of)
This code seems like a good approach. We can use configuration to decide which assembly and type to load. And when we run the application, it works!


The CSV reader works just fine, but we run into a problem if we try to use one of the other reader types.

Let's update the configuration to use the web service reader. (In the App.config file for the commit mentioned above, comment out the CSV section and uncomment the Service section):


This sets the values for "ReaderAssembly" and "ReaderType" to "PersonReader.Service.dll" and "PersonReader.Service.ServiceReader" respectively.

Unfortunately, this breaks the application. If we run the application and click the button, we get an exception:


This is a "file not found" exception. And the details tell us that it is trying to load the assembly for Newtonsoft.Json version 12.0.0.0. The service reader has a dependency on Newtonsoft.Json.

That brings us to the next problem: loading dependencies.

Assembly Dependencies
In searching for a solution, I came across a tutorial about adding plugin support: Create a .NET Core application with plugins.

This tutorial addresses dependencies. Unfortunately, the described solution does not work for the current code. In the section "Plugin target framework recommendations" we see the following (screenshot and text in case it gets updated):


"Because plugin dependency loading uses the .deps.json file, there is a gotcha related to the plugin's target framework. Specifically, your plugins should target a runtime, such as .NET Core 3.0, instead of a version of .NET Standard. The .deps.json file is generated based on which framework the project targets, and since many .NET Standard-compatible packages ship reference assemblies for building against .NET Standard and implementation assemblies for specific runtimes, the .deps.json may not correctly see implementation assemblies, or it may grab the .NET Standard version of an assembly instead of the .NET Core version you expect." (emphasis mine)
This plugin solution relies on ".deps.json" files to resolve dependencies. And there's our first problem.

.deps.json
The .deps.json file has the dependencies for an assembly. For example, when we build the PeopleViewer application, we get the following output:


In addition to the PeopleViewer.exe (which calls PeopleViewer.dll), we also have PeopleViewer.deps.json. By looking inside this file, we can see the following:


This has a "dependencies" section that shows a dependency on "PersonReader.Interface" version 1.0.0. (This is the interface project that we saw above). Because this is included, that assembly can be loaded along with the PeopleViewer assembly.

But our data reader assemblies do not have .deps.json files:


These assemblies are .NET Standard assemblies. As noted in the plugin tutorial, the dependencies for .NET Standard assemblies cannot be generated without knowing what .NET environment they will be running under. For example, the service reader may need a different version of Newtonsoft.Json when run from .NET Framework compared to running in .NET Core.

Options
To go down the path of the sample plugin architecture, I would need to change the data reader projects to .NET Core from .NET Standard. That is not something that is always practical depending on how the projects are being used.

Additionally, the plugin architecture seemed to be quite a bit more than I needed for this application.

Since all of the reader assemblies and dependencies are in a separate folder, I can take a different path. Instead of trying to figure out how to get the dependencies to load automatically, I can just load them manually.

Manually Loading Assemblies
In the previous code, we manually loaded the one data reader assembly based on configuration. To load the dependencies, we will load all of the assemblies that are in the "ReaderAssemblies" folder.

Here is the code for a "LoadAllReaderAssemblies" method (from the "ReaderFactory.cs" file for the commit mentioned above):


In the first line, we build the path to the "ReaderAssemblies" folder.

Next, "Directory.EnumerateFiles()" will give us an enumeration of all the file names that match our search criteria. In this case, we ask for all files that end with ".dll". Also, we only search the top folder (not any subfolders).

Then we use "foreach" to loop through all the file names and load them into the default context. If there are any files that can't be loaded, then we just skip them.
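
A sketch of a method along those lines is shown below. It follows the description above but is not copied from the repository: the choice of which exceptions to skip, the returned count, and the split into two methods are assumptions made for this example.

```csharp
using System;
using System.IO;
using System.Runtime.Loader;

public static class ReaderAssemblyLoader
{
    // Loads every .dll in the top level of the folder into the default
    // context and returns how many files loaded successfully.
    public static int LoadAll(string folder)
    {
        int loaded = 0;
        // LoadFromAssemblyPath needs absolute paths, so normalize first.
        string fullFolder = Path.GetFullPath(folder);

        foreach (string file in Directory.EnumerateFiles(
                     fullFolder, "*.dll", SearchOption.TopDirectoryOnly))
        {
            try
            {
                AssemblyLoadContext.Default.LoadFromAssemblyPath(file);
                loaded++;
            }
            catch (BadImageFormatException)
            {
                // Not a valid .NET assembly: skip it.
            }
            catch (FileLoadException)
            {
                // Found but could not be loaded: skip it.
            }
        }
        return loaded;
    }

    // Loads everything in the "ReaderAssemblies" subfolder of the output folder.
    public static int LoadAllReaderAssemblies()
    {
        string folder = Path.Combine(
            AppDomain.CurrentDomain.BaseDirectory, "ReaderAssemblies");
        return LoadAll(folder);
    }
}
```

Note that this sketch throws a DirectoryNotFoundException if the folder does not exist; whether to treat that as an error is a design choice.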

Assumption
This has the assumption that all of the .dlls in this folder are ones that we want to load. This is a bit easier to do since we have a separate folder. If the reader assemblies were still in the root folder (like we had initially), I would be much more reluctant to try this approach.

Working Code
To get the code working, we call "LoadAllReaderAssemblies" at the top of our factory method (from the "ReaderFactory.cs" file for the commit mentioned above):


And now the application works with the service data reader as well:


Note: If you run this application yourself, you will also need to start the service. To start the service, open a command prompt to the "People.Service" folder and type "dotnet run". For more information on .NET Core services, check out this tutorial: Get Comfortable with .NET Core and the CLI.

Duplicated Code
With the updated solution in place, we have some unnecessary code. Let's take another look at the "GetReader" method (same as above):


In this case, the reader assembly gets loaded twice. When we call "LoadAllReaderAssemblies", everything in that folder is loaded, including the one for the data reader.

Then the next lines are concerned about getting a reference to the "Assembly" object that represents the reader assembly. To do this, we end up loading the data reader assembly a second time.

Rethinking the Solution
Let's go back to the initial problem: Type.GetType() does not automatically load an assembly.

But we saw that it still works when we manually loaded the assembly. Remember this code?


When we manually loaded the "PersonReader.CSV.dll" assembly, GetType worked just fine.

Now that we are loading the reader assembly and all of its dependencies, we can go back to that solution.

Back to Square One
With a better understanding of what's going on, we can go back to where we were initially. We can take our original code and add "LoadAllReaderAssemblies" to the top. Here is that code (in the ReaderFactory.cs file in the final code):


With this, we also need to go back to the original configuration (from the App.config file in the final code):


After all of the assemblies are loaded, GetType returns the Type object that we expect, and the rest of the code works as expected.

Running the application with this configuration gets data from the CSV text file:


And we can change the configuration to use the service:



Wrap Up
So we took a bit of a roundabout way to get back to where we started. But we learned some things along the way.
  • With .NET Core, we need to explicitly load assemblies.
  • With dynamically-loaded .NET Standard assemblies, we need to explicitly load any dependencies.
There are also some things to look into further.
  • Using .NET Core (or other specific framework) projects gives us a .deps.json file that specifies dependencies.
Moving to .NET Core is pretty smooth for the most part. But there are things that pop up that can be frustrating. Eventually we'll have all of those things catalogued, and conversions will be easier.

Happy Coding!

2 comments:

  1. Where the documentation says "GetType causes loading of the assembly specified in typeName" it is correct, but in a narrow way that's potentially confusing.

    In cases where a) Type.GetType succeeds and b) the assembly in question hasn't already been loaded yet, it is absolutely true to say that GetType does indeed cause the loading of the assembly. You can verify this by looking either at the Output window in VS (which shows when DLLs load) or using a tool such as SysInternals Process Monitor.

    The reason you've not observed the behaviour the docs describe is that initially, you were in a situation where a) did not apply, and then after your changes you were in a situation where b) did not apply.

    The second one is easy to understand: in your final code where you preload all the assemblies, obviously b) does not apply.

    But what about the initial case? The reason that in this particular case GetType appears not to do what the docs would lead you to expect is that assembly resolution fails, meaning it never even gets to the point where it's going to try to load the assembly.

    And the key point to understand with that last paragraph is: assembly resolution happens *before* the CLR makes any attempt to look for the file on disk. This is the surprising thing that makes the behaviour different from .NET FX, and that makes it appear as though it's not behaving as the docs say it will (even though the docs are actually correct). So various settings that look like they might help for this scenario, such as setting AdditionalProbingPath in your csproj turn out not to, because the CLR doesn't even attempt to probe for an assembly in load-by-name scenarios unless the name is successfully resolved.

    The fundamental issue here is that when it comes to "load-by-name" behaviour, the set of acceptable names is essentially fixed at CLR host load time (unless you provide a custom assembly load context, which is the normal way to implement a plug-in system in .NET Core; any particular reason you didn't use that?). The .NET Core Host uses information from the YourApp.runtimeconfig.json (specifically the target framework) and from YourApp.deps.json to build a list of assemblies. If you try to load-by-name an assembly that's not in that list it will fail. If you manually edit the .deps.json in your example to include the PersonReader.CSV.dll component you'll find that Type.GetType then works exactly as you originally expected.

    Of course, that's not a lot of use if your goal is to discover plug-ins at runtime - by the time your code gets to run it's already too late to extend the list of known assemblies.

    But none of that is an issue if you create your own assembly load context. You get to determine your own resolution logic if you do that. Moreover, it enables you to provide better isolation: the problem with the approach you've used is that because everything is in the default context, plug-ins are obliged to use the same versions of any DLLs that they have in common with your app. In cases where those DLLs define types that appear in the plug-in interface that's necessary, but for implementation details it's not, and it can cause problems, because the plug-in might end up with an incompatible version of a component. So for that reason you want to use a custom context anyway, and once you've gone down that path, you can implement custom resolution behaviour. (And you'll also improve your app load time, because you will get on-demand loading, just as the Type.GetType docs promise.)

    Replies
    1. Thank you for the info on assembly resolution and what is causing the difference in behavior (.NET Framework vs .NET Core). Do you have specific references that you can point me to for details?
