Traversing the C# syntax tree with F#
This article goes over the basics of using the .NET Compiler Platform (Roslyn) Syntax API to analyze a C# syntax tree from F#.
.NET provides a Syntax API that can read any C# or Visual Basic source file and provide the corresponding Syntax Tree for that code.
Why
Why would someone need to traverse the C# syntax tree?
There can be a number of reasons. Maybe you want to gather statistics about how many classes, namespaces and methods you have. Maybe you want to generate code based on what is already written. Maybe you want to create new tools, like a linter or something similar to Swagger. All of these can be done by analyzing the syntax tree.
Recently I found myself using the Syntax API for finding Attributes above certain methods and classes, and based on the name and arguments of the Attributes, I generated various other files that were used elsewhere.
using System.Collections;
using System.Linq;
using System.Text;

namespace FunWithSyntaxTrees
{
    class Program
    {
        static void Main(string[] args)
        {
            // ...
        }
    }
}
The snippet above shows a small program. We will use this snippet as our input for analyzing the syntax tree.
How
Assuming you have an F# environment set up, you can begin by installing the NuGet package Microsoft.CodeAnalysis.CSharp and importing it into your project.
open Microsoft.CodeAnalysis
open Microsoft.CodeAnalysis.CSharp
open Microsoft.CodeAnalysis.CSharp.Syntax

module Main =
    [<EntryPoint>]
    let main argv =
        0
After you install the package and add your open directives, we will hardcode the C# source code from above into the file, above the main entry point function.
// ... open directives

let code = """
using System.Collections;
using System.Linq;
using System.Text;

namespace FunWithSyntaxTrees
{
    class Program
    {
        static void Main(string[] args)
        {
            // ...
        }
    }
}
"""

module Main =
    [<EntryPoint>]
    let main argv =
        0
When you write a real program that uses the Syntax API, you will most likely read the C# source from files, for example with let code = File.ReadAllText "/path/to/file" (File lives in System.IO), instead of hardcoding the string as we did. For this tutorial, hardcoding is fine for demonstration.
We begin by passing the string of C# source code to the Syntax API to be parsed. In return, we get the syntax tree that we can start analyzing.
[<EntryPoint>]
let main argv =
    let syntaxTree: SyntaxTree = CSharpSyntaxTree.ParseText code
    0
Note: I will write out the types of all the variables, but this is unnecessary most of the time, since F#'s type inference can usually work out the type itself. Just as the var keyword in C# lets the compiler infer the underlying type, F# inference does the same and goes further: it also applies to function arguments, return types and everything in between.
Now that the Syntax API has returned our syntax tree, we can begin traversing it and exploring the data it offers.
First, let us get all the using directives in the file. We start by getting the root node of the file, then we iterate over all the child nodes of the root node and keep the ones of type UsingDirectiveSyntax.
[<EntryPoint>]
let main argv =
    let syntaxTree: SyntaxTree = CSharpSyntaxTree.ParseText code
    let rootNode: CompilationUnitSyntax = syntaxTree.GetCompilationUnitRoot()
    let rootNodeChildren: SyntaxNode seq = rootNode.ChildNodes()
    0
The rootNodeChildren variable holds all the direct child SyntaxNodes of the root node. The root node is the top node of the SyntaxTree and contains everything else, and SyntaxNode is the most general type of node.
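To see what those children actually are, here is a minimal sketch that prints the runtime type name of each direct child. It is written as an F# script and assumes it is run with dotnet fsi on an SDK that supports the #r "nuget: ..." directive. The sample string is a shortened stand-in for the C# source above, not the exact input.

```fsharp
// Save as a .fsx file and run with: dotnet fsi <file>.fsx
// Assumes NuGet access so the #r directive can fetch the package.
#r "nuget: Microsoft.CodeAnalysis.CSharp"

open Microsoft.CodeAnalysis.CSharp

// Shortened stand-in for the C# source used in this article.
let sample = """
using System.Linq;
using System.Text;
namespace FunWithSyntaxTrees
{
    class Program { }
}
"""

let root = CSharpSyntaxTree.ParseText(sample).GetCompilationUnitRoot()

// Runtime type name of every direct child of the root node.
let childTypeNames =
    root.ChildNodes()
    |> Seq.map (fun node -> node.GetType().Name)
    |> List.ofSeq

childTypeNames |> List.iter (printfn "%s")
// UsingDirectiveSyntax
// UsingDirectiveSyntax
// NamespaceDeclarationSyntax
```

Note that the class does not show up: it is a child of the namespace node, not of the root, because ChildNodes only returns direct children.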
We now need to iterate over these children to find the SyntaxNodes that are using directives, since that is what we are looking for. We will declare a small helper function to help find them.
let usingDirectiveNode (node: SyntaxNode): UsingDirectiveSyntax option =
    match node with
    | :? UsingDirectiveSyntax as usingDirective -> Some usingDirective
    | _ -> None

[<EntryPoint>]
let main argv =
    let syntaxTree: SyntaxTree = CSharpSyntaxTree.ParseText code
    let rootNode: CompilationUnitSyntax = syntaxTree.GetCompilationUnitRoot()
    let rootNodeChildren: SyntaxNode seq = rootNode.ChildNodes()
    let usingDirectives: UsingDirectiveSyntax seq =
        Seq.choose usingDirectiveNode rootNodeChildren
    0
The new helper function usingDirectiveNode takes a general SyntaxNode and checks whether it is a UsingDirectiveSyntax. If it is, the function returns Some containing the using directive node; otherwise it returns None.
Note: An F# Option type is a way to represent a value that may be missing. Idiomatic F# avoids null, so possibly absent values are represented with algebraic data types such as the Option type.
We apply the new helper function to every node with Seq.choose, which drops every None value and keeps every Some value. It also unwraps the Some values, so we can keep using the results without any Option mapping.
So Seq.choose is just a convenient way of doing Seq.map followed by Seq.filter (plus the unwrapping) specifically for Option values, which is reflected in its type signature: ('T -> 'U option) -> seq<'T> -> seq<'U>.
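As a small sketch of that equivalence, unrelated to syntax trees, the snippet below runs the same Option-returning function through Seq.choose and through an explicit map, filter and unwrap pipeline. The tryParseInt helper and the inputs are hypothetical and exist only for illustration.

```fsharp
open System

// Hypothetical helper: parse a string to an int, or return None.
let tryParseInt (s: string) : int option =
    match Int32.TryParse(s) with
    | true, n -> Some n
    | false, _ -> None

let inputs = [ "1"; "two"; "3" ]

// One step: keep the Some results and unwrap them.
let viaChoose =
    inputs
    |> Seq.choose tryParseInt
    |> List.ofSeq

// The same thing spelled out: map, drop the None values, then unwrap.
let viaMapFilter =
    inputs
    |> Seq.map tryParseInt
    |> Seq.filter Option.isSome
    |> Seq.map Option.get
    |> List.ofSeq

printfn "%A" viaChoose                  // [1; 3]
printfn "%b" (viaChoose = viaMapFilter) // true
```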
Moving along: now that we have a sequence of using directives in a variable, we can read the specific properties of each using directive. For now, we will just print them out as proof.
let usingDirectiveNode (node: SyntaxNode): UsingDirectiveSyntax option =
    match node with
    | :? UsingDirectiveSyntax as usingDirective -> Some usingDirective
    | _ -> None

[<EntryPoint>]
let main argv =
    let syntaxTree: SyntaxTree = CSharpSyntaxTree.ParseText code
    let rootNode: CompilationUnitSyntax = syntaxTree.GetCompilationUnitRoot()
    let rootNodeChildren: SyntaxNode seq = rootNode.ChildNodes()
    let usingDirectives: UsingDirectiveSyntax seq =
        Seq.choose usingDirectiveNode rootNodeChildren
    // Print each using directive; Seq.iter runs the side effect without building a list
    usingDirectives
    |> Seq.iter (fun u -> printfn "%s" (u.ToString()))
    0
The output of running our program would be:
using System.Collections;
using System.Linq;
using System.Text;
Pretty cool, right? We analyzed our C# code, found the using directives and printed them out.
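Printing the whole node is only a starting point. As a minimal sketch of reading one property, the script below collects the Name of each using directive, which is the namespace without the using keyword and the semicolon. It assumes dotnet fsi with #r "nuget: ..." support, and the sample string is a shortened stand-in containing only the using directives.

```fsharp
// Assumes NuGet access so the #r directive can fetch the package.
#r "nuget: Microsoft.CodeAnalysis.CSharp"

open Microsoft.CodeAnalysis
open Microsoft.CodeAnalysis.CSharp
open Microsoft.CodeAnalysis.CSharp.Syntax

// Stand-in input: only the using directives from the article's sample.
let sample = """
using System.Collections;
using System.Linq;
using System.Text;
"""

// Same helper as in the article.
let usingDirectiveNode (node: SyntaxNode) : UsingDirectiveSyntax option =
    match node with
    | :? UsingDirectiveSyntax as usingDirective -> Some usingDirective
    | _ -> None

let root = CSharpSyntaxTree.ParseText(sample).GetCompilationUnitRoot()

// Name holds just the namespace part of each directive.
let namespaceNames =
    root.ChildNodes()
    |> Seq.choose usingDirectiveNode
    |> Seq.map (fun u -> u.Name.ToString())
    |> List.ofSeq

printfn "%A" namespaceNames
// ["System.Collections"; "System.Linq"; "System.Text"]
```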
We can use the same strategy to find anything in our code: methods, method arguments, types, classes, interfaces, enums, comments, attributes and more.
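One caveat when going beyond using directives: ChildNodes only returns direct children, while classes and methods are nested deeper in the tree. The sketch below is one possible approach, not the only one. It uses DescendantNodes to walk the whole tree and OfType from System.Linq as a shorthand for the pattern-matching helper. It assumes dotnet fsi with #r "nuget: ..." support, and the sample string is again a stand-in for the article's input.

```fsharp
// Assumes NuGet access so the #r directive can fetch the package.
#r "nuget: Microsoft.CodeAnalysis.CSharp"

open System.Linq
open Microsoft.CodeAnalysis.CSharp
open Microsoft.CodeAnalysis.CSharp.Syntax

// Stand-in input mirroring the article's sample program.
let sample = """
namespace FunWithSyntaxTrees
{
    class Program
    {
        static void Main(string[] args)
        {
        }
    }
}
"""

let root = CSharpSyntaxTree.ParseText(sample).GetCompilationUnitRoot()

// DescendantNodes visits every node below the root, at any depth.
let classNames =
    root.DescendantNodes().OfType<ClassDeclarationSyntax>()
    |> Seq.map (fun c -> c.Identifier.Text)
    |> List.ofSeq

let methodNames =
    root.DescendantNodes().OfType<MethodDeclarationSyntax>()
    |> Seq.map (fun m -> m.Identifier.Text)
    |> List.ofSeq

printfn "%A" classNames   // ["Program"]
printfn "%A" methodNames  // ["Main"]
```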
If you found this useful, feel free to follow me on twitter at @rametta