System.CommandLine is pretty cool

It seems like command-line tools and terminals are back in vogue in the Windows community these days. When I need to quickly automate something or try out some new API, one of the first things I do is create a quick console app. Things generally start out pretty simple, but soon enough I want to parameterize the execution of the code and I have to start parsing the command-line input.

Recent history of command-line toolkits for .NET Core

Just this last week I created a small utility to automate the creation of some PR validation pipelines for the Azure SDK. For that utility I made use of the excellent McMaster.Extensions.CommandLineUtils package (GitHub, NuGet), which I had used before for some internal tooling at Microsoft. It does a great job of taking the drudgery out of reliably parsing command-line arguments.

The back story for McMaster.Extensions.CommandLineUtils is that it is a fork of the Microsoft.Extensions.CommandLineUtils code-base, which is part of the ASP.NET Core tool-chain but wasn't taken further and generalized. Nate McMaster, who is the maintainer of the fork, has done a good job continuing to evolve the toolkit, which makes it easy to build command-line tools. It's done the trick for the couple of times that I've used it.

Today I started working on a little command-line tool and I pulled down McMaster.Extensions.CommandLineUtils again. I was looking through some recent pull requests when I stumbled across Nate's roadmap issue on GitHub. In the opening comments for the issue he makes reference to a new command-line API that is being developed for .NET Core, which is charging along in its own repository. I thought I would check it out.

System.CommandLine is seemingly magic

The README for System.CommandLine references three packages: Experimental, DragonFruit and Rendering. Presumably DragonFruit is some kind of temporary placeholder because Microsoft could never be that creative with naming :)

I suspect the plans for System.CommandLine (if it goes ahead) are a bit grander than simple command-line parsing, but for this post I wanted to point out something that I thought was pretty cool in the way that the API simplifies the development of command-line applications.

Let's say that I want to implement a command-line utility that looks like this:

$ detect-duplicates --path [path1] --path [path2] --output duplicates.json

Well, once I install the System.CommandLine.DragonFruit package I can write my main method like this:

public static void Main(string[] path, string output)
{
  // Do stuff.
}

Then if I build and invoke the program, those variables will get populated appropriately. When I first saw this I was surprised, because really the only thing that I had done was install the package. I decided to crack open the assembly in ILDasm to take a peek at the code that was actually emitted.

ILDasm showing AutoGeneratedProgram class.

The interesting thing in the screenshot above is the presence of the class AutoGeneratedProgram. This is not code that I wrote; my class was called Program and sits within the namespace MitchDenny.DetectDuplicates.Tool. Somehow, by using the System.CommandLine.DragonFruit package, this new type is generated.

Before we dig further it is worth understanding a little bit more about how entry points work in .NET. Since forever (in .NET terms) the C# compiler has accepted entry points with any of the following method signatures.

// Valid entry point signatures since forever.
public static void Main();
public static int Main();
public static void Main(string[] args);
public static int Main(string[] args);

If I have a class that contains a method matching any of these then it will automatically be selected as the entry point for the assembly. If I have two methods which match any of the signatures above the C# compiler will complain (CS0017) and hint that I can work around it by adding a /main switch to my compiler invocation. In MSBuild you can get the same effect by setting the StartupObject property in your MSBuild file.

<Project Sdk="Microsoft.NET.Sdk">
  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>netcoreapp2.2</TargetFramework>
    <StartupObject>MitchDenny.DetectDuplicates.Tool.Program</StartupObject>
  </PropertyGroup>
</Project>

Knowing this, I was curious about what would happen if I declared my entry point signature in the usual way whilst still using the System.CommandLine.DragonFruit package. It turns out that the AutoGeneratedProgram class is still generated and it has the expected Main entry point. So clearly the compiler is being told which of the two entry points to choose.

When the /main switch is provided to the C# compiler, it tells the compiler which method is the entry point. This information is made available to the runtime via a special .entrypoint directive emitted with the rest of the IL for the method (here is an example from ILDasm):

The .entrypoint directive emitted by the C# compiler.

You can see in the screenshot above that the System.CommandLine.DragonFruit package tells the compiler to use the AutoGeneratedProgram as the entry point for the assembly.
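If ILDasm isn't handy, the same entry point metadata can also be read at run time through reflection. The snippet below is a minimal sketch: the EntryPointInspector class and its Describe method are hypothetical names of my own, while Assembly.GetEntryAssembly() and Assembly.EntryPoint are standard reflection APIs. I haven't confirmed exactly what it reports in a DragonFruit project; with an async Main, the C# compiler synthesizes a wrapper method, so the name printed may not be plain Main.

```csharp
using System;
using System.Reflection;

// Hypothetical helper for illustration; not part of System.CommandLine.
public static class EntryPointInspector
{
    // Returns "Namespace.Type.Method" for the assembly's entry point,
    // or "(none)" when the assembly has no entry point (e.g. a library).
    public static string Describe(Assembly assembly)
    {
        MethodInfo entry = assembly?.EntryPoint;
        return entry == null
            ? "(none)"
            : $"{entry.DeclaringType?.FullName}.{entry.Name}";
    }

    public static void Main(string[] args)
    {
        // Prints whichever method was marked as the entry point at compile time.
        Console.WriteLine(Describe(Assembly.GetEntryAssembly()));
    }
}
```

Dropping a call to Describe into a project that references System.CommandLine.DragonFruit should be a quick way to check which type ended up owning the entry point.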

Curious about how this magic worked, I went back to the System.CommandLine repository and did a search for AutoGeneratedProgram. Sure enough, a targets file that forms part of the System.CommandLine.DragonFruit package turned up.

You can see in this file that it sets the StartupObject property and then creates a temporary file containing a hard-coded entry point (which is referenced by the property). The targets in this file are executed just before compilation and the temporary file is included in the set of files that are built by the compiler.

So using the System.CommandLine.DragonFruit package doesn't change the way the compiler works; it "just" uses some MSBuild hacks and compiler switches to insert itself into the execution flow of the program. From the targets file the code that is inserted looks like this (CDATA sections & MSBuild variables removed for clarity):

// <auto-generated>This file was created automatically</auto-generated>
using System.Runtime.CompilerServices;
using System.Threading.Tasks;
[CompilerGenerated]
internal class AutoGeneratedProgram
{
  public static async Task<int> Main(string[] args)
  {
    return await System.CommandLine.DragonFruit.CommandLine.ExecuteAssemblyAsync(
      entryAssembly: typeof(global::AutoGeneratedProgram).Assembly,
      args: args,
      entryPointFullTypeName: "AutoGeneratedProgram"
      );
  }
}

This is the real entry point for the program. As a side note, the signatures I listed above have been valid since the dawn of time in .NET terms; starting with C# 7.1 (which shipped around the same time as .NET Core 2.0), the following additional signatures are also valid.

public static async Task Main();
public static async Task<int> Main();
public static async Task Main(string[] args);
public static async Task<int> Main(string[] args);

The entry point in AutoGeneratedProgram matches the final option on that list, and you can see it just hands off to the CommandLine.ExecuteAssemblyAsync(...) method to get its job done. This is the method that coordinates parsing the inputs from the args parameter passed to the real entry point, generating help output, and finding and invoking the Main method that I wrote.
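To make that hand-off a little more concrete, here is a minimal sketch of the general idea: parse "--name value" pairs, match them to a method's parameters by name, and invoke the method via reflection. This is not how System.CommandLine actually implements it (I haven't traced through that code); NaiveBinder, UserProgram and Run are hypothetical names invented for the illustration, and the sketch only understands string and string[] parameters.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;

// Hypothetical illustration only; not the System.CommandLine implementation.
public static class NaiveBinder
{
    // Groups "--name value" pairs by option name, keeping repeated options.
    public static Dictionary<string, List<string>> Parse(string[] args)
    {
        if (args.Length % 2 != 0)
            throw new ArgumentException("Every option needs a value.");

        var options = new Dictionary<string, List<string>>(StringComparer.OrdinalIgnoreCase);
        for (int i = 0; i < args.Length; i += 2)
        {
            if (!args[i].StartsWith("--"))
                throw new ArgumentException($"Expected an option but found '{args[i]}'.");

            string name = args[i].Substring(2);
            if (!options.TryGetValue(name, out var values))
                options[name] = values = new List<string>();
            values.Add(args[i + 1]);
        }
        return options;
    }

    // Builds one argument per parameter by matching option names to parameter names.
    public static object[] Bind(MethodInfo method, string[] args)
    {
        var options = Parse(args);
        return method.GetParameters().Select(p =>
        {
            if (!options.TryGetValue(p.Name, out var values))
                values = new List<string>();

            if (p.ParameterType == typeof(string[])) return (object)values.ToArray();
            if (p.ParameterType == typeof(string)) return values.LastOrDefault();
            throw new NotSupportedException($"Unsupported parameter type {p.ParameterType}.");
        }).ToArray();
    }

    // Invokes a static method with arguments bound from the command line.
    public static object Invoke(MethodInfo method, string[] args)
        => method.Invoke(null, Bind(method, args));
}

public static class UserProgram
{
    // Stand-in for the user-written, DragonFruit-style Main method.
    public static string Run(string[] path, string output)
        => $"{string.Join(",", path)} -> {output}";
}

public static class Launcher
{
    // Plays the role of the generated entry point.
    public static void Main(string[] args)
    {
        MethodInfo target = typeof(UserProgram).GetMethod("Run");
        Console.WriteLine(NaiveBinder.Invoke(target, args));
    }
}
```

Running this with --path a --path b --output duplicates.json prints "a,b -> duplicates.json". The real library obviously does far more (help generation, type conversion, error reporting), but the shape of the trick is the same: the parameter names on the method become the option names on the command line.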

Method signatures for the Main entry point can take more than primitive types. Let's say I knew my inputs needed to be directory paths; I could do the following:

public static Task<int> Main(DirectoryInfo[] paths, FileInfo output);

The approach here is reminiscent of the way that ASP.NET translates web-requests to methods on a controller and I wasn't surprised to see some familiar class names in related libraries.

The System.CommandLine set of packages are all very early previews at the moment and they might just be experimentation that goes nowhere, but the way it takes care of translating command-line inputs makes it trivially easy to build command-line applications. I'd recommend taking a look through the code because it looks like there is lots of interesting stuff there, including a rendering API and also some kind of support for sub-commands and custom argument parsing.