C# | Code Analysis Using Roslyn Syntax Trivia

Tuesday, April 20, 2021
Where do we start if we want to analyze all the comments in the code? Which syntax tree nodes do we need to traverse? How can we tell that the analyzed method contains preprocessor directives for conditional compilation? Syntax trivia, a special kind of syntax tree element, answers all these questions.

Syntax trivia

As mentioned in the note about syntax trees, syntax trivia includes comments, preprocessor directives, and formatting elements such as spaces and newlines. Syntax trivia elements do not get into the IL code, but they are represented in the syntax tree. The SyntaxTree object has the full-fidelity property: from an existing tree we can get code that is completely identical to the source code. All of these extra elements are available as instances of the SyntaxTrivia structure.
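The full-fidelity property can be checked with a minimal sketch like the one below. It assumes the Microsoft.CodeAnalysis.CSharp NuGet package is referenced; the class name FullFidelityDemo, the method RoundTrips, and the sample source string are hypothetical and serve only as an illustration.

```csharp
using System;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper class, not part of the original analyzer.
public static class FullFidelityDemo
{
    // Parses the text and reports whether the tree reproduces it exactly.
    public static bool RoundTrips(string source)
    {
        SyntaxTree tree = CSharpSyntaxTree.ParseText(source);

        // ToFullString includes the leading and trailing trivia of every token,
        // so comments and whitespace are part of the result.
        return tree.GetRoot().ToFullString() == source;
    }

    public static void Main()
    {
        string source = "class C { /* comment */ }   // trailing comment\n";
        Console.WriteLine(RoundTrips(source));
    }
}
```

Comparing ToFullString with the input is a simple way to see that comments and whitespace are preserved. ToString on the root would drop the leading trivia of the first token and the trailing trivia of the last one.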

Every syntax trivia element is attached to some token. There is leading trivia and trailing trivia. Leading trivia is additional syntax information preceding the token; trailing trivia is additional syntax information following it. All of these elements have the type SyntaxTrivia. To determine what exactly an element is (a space, a single-line comment, a multiline comment, or something else), we use the SyntaxKind enumeration together with the Kind and IsKind methods.

Check out this code:
#if NETCOREAPP3_1
  b = 10;
#endif
//Comment1
a = b;
For the code above, the directed syntax graph looks as follows:

[Figure: directed syntax graph of the code above, with the syntax trivia attached to each token]

The following syntax trivia is all attached to the 'a' token: the preprocessor directives #if NETCOREAPP3_1 and #endif, the text enclosed by these directives, space and end-of-line characters, and the single-line comment. The '=' token has only one syntax trivia element attached to it, a space character. The end-of-line character is attached to the ';' token.
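To inspect this attachment yourself, a small sketch such as the following can list the leading and trailing trivia of every token in the snippet. It assumes the Microsoft.CodeAnalysis.CSharp package and default parse options (so NETCOREAPP3_1 is not defined); the names TriviaDump, Sample, and TriviaKinds are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical helper class for exploring trivia; not part of the analyzer below.
public static class TriviaDump
{
    // The snippet from the article, terminated by a newline.
    public static readonly string Sample = string.Join("\n", new[]
    {
        "#if NETCOREAPP3_1",
        "  b = 10;",
        "#endif",
        "//Comment1",
        "a = b;",
        ""
    });

    // Returns the kinds of the leading or trailing trivia of the first token
    // whose text equals tokenText.
    public static List<SyntaxKind> TriviaKinds(string source, string tokenText, bool leading)
    {
        SyntaxNode root = CSharpSyntaxTree.ParseText(source).GetRoot();
        SyntaxToken token = root.DescendantTokens().First(t => t.Text == tokenText);
        SyntaxTriviaList list = leading ? token.LeadingTrivia : token.TrailingTrivia;
        return list.Select(t => t.Kind()).ToList();
    }

    public static void Main()
    {
        SyntaxNode root = CSharpSyntaxTree.ParseText(Sample).GetRoot();

        // Print each token together with the kinds of trivia attached to it.
        foreach (SyntaxToken token in root.DescendantTokens())
        {
            string leading = string.Join(", ", token.LeadingTrivia.Select(t => t.Kind()));
            string trailing = string.Join(", ", token.TrailingTrivia.Select(t => t.Kind()));
            Console.WriteLine($"'{token.Text}' leading: [{leading}] trailing: [{trailing}]");
        }
    }
}
```

Because the conditional symbol is not defined here, the statement between the directives is not parsed into tokens; it shows up as trivia in front of the 'a' token instead.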

Example of comment analysis using syntax trivia


Imagine this situation. The company where we write code introduces a new coding-standard rule: do not write comments longer than 130 characters. We decided to check our project for such "forbidden" comments using a simple analyzer that relies on parsing syntax trivia elements. Here is the resulting code:
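using System;
using System.IO;
using System.Linq;
using System.Text;
using Microsoft.Build.Locator;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.MSBuild;

// The members below are assumed to live inside a single class (for example, Program).
public static StringBuilder Warnings = new StringBuilder();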

public const int MaxCommentLength = 130;
public static void ApplyRule(SyntaxTrivia commentTrivia)
{
    switch (commentTrivia.Kind())
    {
        case SyntaxKind.SingleLineCommentTrivia:
            {
                if (commentTrivia.ToString().Length > MaxCommentLength)
                {
                    int line = commentTrivia.GetLocation().GetLineSpan()
                                            .StartLinePosition.Line + 1;

                    string filePath = commentTrivia.SyntaxTree.FilePath;
                    Warnings.AppendLine($"Length of a comment at line " +
                                        $"{line} in file {filePath} " +
                                        $"exceeds {MaxCommentLength} " +
                                        $"characters. Please " +
                                        $"break it up into several lines.");
                }
                break;
            }
        case SyntaxKind.MultiLineCommentTrivia:
        case SyntaxKind.SingleLineDocumentationCommentTrivia:
            {
                // Split on both Windows and Unix line endings, since a source file
                // may not use Environment.NewLine.
                var listStr = commentTrivia.ToString()
                                           .Split(new string[] { "\r\n", "\n" },
                                                  StringSplitOptions.RemoveEmptyEntries);

                foreach (string str in listStr)
                {
                    if (str.Length > MaxCommentLength)
                    {
                        int line = commentTrivia.GetLocation().GetLineSpan()
                                                .StartLinePosition.Line + 1;
                        string filePath = commentTrivia.SyntaxTree.FilePath;
                        Warnings.AppendLine($"Multiline comment or XML comment at line " +
                                            $"{line} in file {filePath} " +
                                            $"contains individual lines that " +
                                            $"exceed {MaxCommentLength} " +
                                            $"characters. Please break " +
                                            $"them up into several lines.");
                        return;
                    }
                }
                break;
            }
    }
}
public static Project GetProjectFromSolution(string solutionPath)
{
    MSBuildLocator.RegisterDefaults();
    MSBuildWorkspace workspace = MSBuildWorkspace.Create();
    Solution currSolution = workspace.OpenSolutionAsync(solutionPath)
                                     .Result;

    return currSolution.Projects.Single();
}
static void Main(string[] args)
{
    string solutionPath = @"D:\Test\TestForTrivia.sln";
    string logPath = @"D:\Test\warnings.txt";
    Project project = GetProjectFromSolution(solutionPath);

    foreach (var document in project.Documents)
    {
        var tree = document.GetSyntaxTreeAsync().Result;
        var comTriv = tree.GetRoot()
                          .DescendantTrivia()
                          .Where(n =>
                                   n.IsKind(SyntaxKind.SingleLineCommentTrivia)
                                || n.IsKind(SyntaxKind
                                            .SingleLineDocumentationCommentTrivia)
                                || n.IsKind(SyntaxKind.MultiLineCommentTrivia));

        foreach (var commentTrivia in comTriv)
            ApplyRule(commentTrivia);
    }

    if (Warnings.Length != 0)
        File.AppendAllText(logPath, Warnings.ToString());
}
We will not analyze the code above in detail. If you have read our earlier note from the series about using Roslyn to create your own static analyzer, the main points should already be clear. Pay attention to the LINQ query. In all previous cases, we called the DescendantNodes method after getting the root of the syntax tree. Here we call the DescendantTrivia method instead, because we need all the comments, and comments are represented by the SyntaxTrivia structure. Next, we keep only the SyntaxTrivia elements whose kind is SingleLineCommentTrivia, MultiLineCommentTrivia, or SingleLineDocumentationCommentTrivia. In the ApplyRule method, we get the comment text by calling the ToString method of the SyntaxTrivia instance.

We can do more than check the length of comments. For example, we can search the source files for a copyleft license that requires developers to open the rest of the source code. Many commercial projects cannot open their source code, so it is useful for programmers to know when a library they use carries such a license; that way they will not violate the license agreement. Some analyzers already implement this rule.
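As a rough illustration of the license idea, the sketch below looks for license names inside comment trivia. It is only an assumption of how such a rule might start: the class LicenseScan, the method MentionsCopyleft, and the marker list are hypothetical, and a real rule would need a vetted set of license texts.

```csharp
using System;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;

// Hypothetical sketch of a license check; not the rule used by any real analyzer.
public static class LicenseScan
{
    // Illustrative markers only; this list is an assumption, not a complete catalog.
    private static readonly string[] CopyleftMarkers =
    {
        "GNU General Public License",
        "GNU Affero General Public License"
    };

    // Returns true if any single-line or multiline comment mentions one of the markers.
    public static bool MentionsCopyleft(string source)
    {
        SyntaxNode root = CSharpSyntaxTree.ParseText(source).GetRoot();

        return root.DescendantTrivia()
                   .Where(t => t.IsKind(SyntaxKind.SingleLineCommentTrivia)
                            || t.IsKind(SyntaxKind.MultiLineCommentTrivia))
                   .Any(t => CopyleftMarkers.Any(marker =>
                       t.ToString().IndexOf(marker, StringComparison.OrdinalIgnoreCase) >= 0));
    }

    public static void Main()
    {
        string source = "// Licensed under the GNU General Public License\nclass C { }\n";
        Console.WriteLine(MentionsCopyleft(source));
    }
}
```

Each comment is checked on its own, so a license name that is wrapped across several single-line comments would be missed; joining adjacent comments before matching is one possible refinement.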

Sometimes, to perform the analysis, we need to find out whether a particular node contains preprocessor directives. We can do this with the same IsKind method:
methodDeclaration.DescendantTrivia()
                 .Any(trivia => trivia.IsKind(SyntaxKind.IfDirectiveTrivia))
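The expression above can be placed into a complete program along these lines. This is a minimal sketch that assumes the Microsoft.CodeAnalysis.CSharp package; the class DirectiveCheck, the method MethodsWithIfDirective, and the sample source are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

// Hypothetical wrapper around the DescendantTrivia check shown above.
public static class DirectiveCheck
{
    // Returns the names of the methods that contain an #if directive.
    public static List<string> MethodsWithIfDirective(string source)
    {
        SyntaxNode root = CSharpSyntaxTree.ParseText(source).GetRoot();

        return root.DescendantNodes()
                   .OfType<MethodDeclarationSyntax>()
                   .Where(method => method.DescendantTrivia()
                                          .Any(trivia => trivia.IsKind(SyntaxKind.IfDirectiveTrivia)))
                   .Select(method => method.Identifier.Text)
                   .ToList();
    }

    public static void Main()
    {
        string source = string.Join("\n", new[]
        {
            "class C",
            "{",
            "    void A()",
            "    {",
            "#if DEBUG",
            "        int x = 1;",
            "#endif",
            "    }",
            "    void B() { }",
            "}",
            ""
        });

        foreach (string name in MethodsWithIfDirective(source))
            Console.WriteLine(name);
    }
}
```

Keep in mind that the leading trivia of a method's first token belongs to the method node as well, so a directive placed directly before the declaration may also be counted.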

Summary


In this article, we learned how to analyze C# code using Roslyn's syntax trivia.

Author Credit

Article Type : Guest Article
Author : Ilya Gainulin
Tags : CSharp, Knowledge
Article Date : 26-03-2021
Article Publish Date : 20-04-2021
Note : All content of this article is the copyright of its author.
