Command line shells are interactive programs for running other programs.

To run a program, just type its name followed by a list of arguments. This rudimentary language knows no types other than strings, so the name of the program and all of its arguments are just strings, delimited by whitespace.

./build/my_program some-command --with-a-flag -m message input_file.txt

Although very basic, this setup is easy to understand and create programs with, because these are just strings that can be handled in any programming or scripting language.

But the problem is, let me say it again, that all arguments are just strings! Here is what the my_program executable receives after its own name:

["some-command", "--with-a-flag", "-m", "message", "input_file.txt"]

Note that -m and message are even two separate arguments.

Because of the lack of typing, there is nothing stopping us from passing arguments with typos, specifying -m but forgetting the message, or passing badly formatted arguments, like a file name containing a space that was not escaped correctly.
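To make that burden concrete, here is a minimal sketch in Rust of the kind of hand-written parsing my_program has to do today. The names Parsed and parse_args are made up for illustration, and the rules (one command, two options, one input file) are my assumption based on the example invocation above, not how any real tool works.

```rust
// Hypothetical hand-rolled parser: roughly what a program has to do with raw strings.
#[derive(Debug, PartialEq)]
struct Parsed {
    with_a_flag: bool,
    message: Option<String>,
    input_file: String,
}

fn parse_args(args: &[&str]) -> Result<Parsed, String> {
    let mut it = args.iter().copied();

    // The command is just a string, so a typo is only caught if we check for it.
    match it.next() {
        Some("some-command") => {}
        Some(other) => return Err(format!("unknown command: {}", other)),
        None => return Err("missing command".to_string()),
    }

    let mut with_a_flag = false;
    let mut message = None;
    let mut input_file = None;

    while let Some(arg) = it.next() {
        match arg {
            "--with-a-flag" => with_a_flag = true,
            // -m and its value are separate strings; the value may be missing.
            "-m" => match it.next() {
                Some(m) => message = Some(m.to_string()),
                None => return Err("-m needs a message".to_string()),
            },
            flag if flag.starts_with('-') => {
                return Err(format!("unknown flag: {}", flag));
            }
            // Anything else is assumed to be the input file.
            path => {
                if input_file.is_some() {
                    return Err(format!("unexpected argument: {}", path));
                }
                input_file = Some(path.to_string());
            }
        }
    }

    let input_file = input_file.ok_or("missing input file")?;
    Ok(Parsed { with_a_flag, message, input_file })
}
```

Given -m followed directly by input_file.txt, this parser happily takes the file name as the message and only then complains about a missing input file. A file name with an unescaped space arrives as two strings, and the second one is rejected as an unexpected argument.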

But there is a better way

Let’s modify my_program such that when executed without input, it spews out something describing the following data structure:

struct {
    command: enum {
        SomeCommand {
            has_a_flag: bool,
            has_other_flag: bool,
            message: Optional<String>
        },
        AnotherCommand(String)
    },
    input_file: FilePath,
}

Ideally, this description would use some established format for declaring data structures, but that is not central to my point.

So when you invoke ./build/my_program now, the shell can first ask the program for this description, then parse what you typed, check it against the expected arguments and map it onto the data structure.
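From the program's side, this could look something like the following Rust sketch. It is only an illustration under my own assumptions: I render Optional<String> as Option<String> and FilePath as PathBuf, and the names Invocation and run are hypothetical. How the shell would actually hand the value over is left open.

```rust
use std::path::PathBuf;

// Rust rendering of the structure described above.
#[derive(Debug, PartialEq)]
enum Command {
    SomeCommand {
        has_a_flag: bool,
        has_other_flag: bool,
        message: Option<String>,
    },
    AnotherCommand(String),
}

#[derive(Debug, PartialEq)]
struct Invocation {
    command: Command,
    input_file: PathBuf,
}

// Hypothetical entry point: the program receives a typed value, not strings.
// The match is exhaustive, so the compiler checks that every command is handled.
fn run(inv: &Invocation) -> String {
    match &inv.command {
        Command::SomeCommand { has_a_flag, has_other_flag, message } => format!(
            "some-command on {} (flag: {}, other flag: {}, message: {})",
            inv.input_file.display(),
            has_a_flag,
            has_other_flag,
            message.as_deref().unwrap_or("<none>")
        ),
        Command::AnotherCommand(arg) => {
            format!("another-command {} on {}", arg, inv.input_file.display())
        }
    }
}
```

There is no string handling left in the program: no flag spelling to check, no missing -m value to detect, and no way to receive a command it does not know.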

The command passed to the program would be pure information: 0 for AnotherCommand and 1 for SomeCommand. Because the command is represented by one bit, passing anything other than the two specified commands would not even be possible.
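A tiny Rust sketch of that one-bit idea, using a bool as the bit. CommandKind, to_bit and from_bit are names I made up, and the payload of each command (flags, message) would still have to follow in some encoding that I am not showing here.

```rust
// The two commands from the description above, without their payload.
#[derive(Debug, Clone, Copy, PartialEq)]
enum CommandKind {
    AnotherCommand,
    SomeCommand,
}

// Hypothetical encoding: a single bit selects the command.
fn to_bit(kind: CommandKind) -> bool {
    match kind {
        CommandKind::AnotherCommand => false, // 0
        CommandKind::SomeCommand => true,     // 1
    }
}

// Decoding is total: every possible bit value is a valid command,
// so there is no error branch to write.
fn from_bit(bit: bool) -> CommandKind {
    if bit {
        CommandKind::SomeCommand
    } else {
        CommandKind::AnotherCommand
    }
}
```

Compare this with decoding a string, where the set of possible inputs is infinite and almost all of them are errors.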

Note that I’m not saying that we would need to change the language of the shell in any way. You would still write some-command; it’s just the representation of the passed data that is different.

What have we gained by this?

Arguments are now parsed and converted to the correct data types by the shell instead of the program, which lifts a lot of burden from the program. Parsing user input is not easy, so many programs fail in edge cases or add their own syntax alongside the existing conventions.

And they are just that: conventions. Flags being prefixed by one or two dashes is a convention, which is not enforced anywhere but implemented again and again in many languages and many libraries. Typed arguments would ease that and enforce at least some convention.

With a defined data structure, shells could also greatly improve the discoverability of program features, which today relies on -h or -? or --help or man my_program or tldr my_program or Google. In all cases exploring is required, even though the shell could be offering you the options. Some shells do know a little about autocomplete, but their guesses are typically limited to file paths and top-level commands. Even when using git (which has great autocomplete in zsh) you are sometimes back to exploring man pages for its many, many options.
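As a rough illustration of what the shell could do with the description, here is a Rust sketch of prefix completion over the options of some-command. Everything in it is an assumption of mine: the OptionSpec type, the complete function, the spelling --with-other-flag for has_other_flag, and all the help texts.

```rust
// Hypothetical per-option entry a shell could derive from the program's description.
struct OptionSpec {
    name: &'static str,
    takes_value: bool,
    help: &'static str,
}

// Options of some-command; the second flag's spelling and the help texts are invented.
static SOME_COMMAND_OPTIONS: [OptionSpec; 3] = [
    OptionSpec { name: "--with-a-flag", takes_value: false, help: "enable a flag" },
    OptionSpec { name: "--with-other-flag", takes_value: false, help: "enable the other flag" },
    OptionSpec { name: "-m", takes_value: true, help: "attach a message" },
];

// Return a display line for every option that starts with what the user typed so far.
fn complete(prefix: &str) -> Vec<String> {
    SOME_COMMAND_OPTIONS
        .iter()
        .filter(|opt| opt.name.starts_with(prefix))
        .map(|opt| {
            let value = if opt.takes_value { " <value>" } else { "" };
            format!("{}{}  {}", opt.name, value, opt.help)
        })
        .collect()
}
```

Typing --with and pressing Tab would then offer both flags with their descriptions, and the shell would know that -m must be followed by a value without anyone opening a man page.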

Beyond CLI

I believe that all interfaces should have some kind of strongly typed declaration of what they expect. Not just in the form of documentation that needs to be explored and learned, but in a form that can be consumed by the calling program. I know that this brings in more moving parts and overhead, but it is worth the effort. Without such a declaration, you are just delaying problems from write time to run time.

I’m hoping to one day live in a world where CLIs can be embedded in GUIs as a command palette in a dropdown. Like the one in Sublime Text, it would do fuzzy autocomplete, but it would also have submenus making sure that you don’t skip an argument.