Performance Improvements

At some point I'll either run out of steam or get bored so I'm dumping my notes here so someone else can try these idea out. 

See #1735 for the current best performance

## SyntaxPrinter
- Replace Linq with imperative code 
    - See also `Doc.Join`
- Use `Func<T,PrintingContext, Doc>` more to prevent closure creation
- Flatten concat usage, nested concat uses memory and ruins locality
    - A lot of `Doc.Join` are inside a concat with a leading or trailing separator (0.2MB on complex benchmark)
- Try and remove `Doc.Null` usage, probably slows stuff down when printing/ fitting
    - Maybe create `ConcatNotNull(`
        - It might be worth always doing this inside `Concat` or maybe only for certain problematic methods.
    - I feel like `InvocationExpression` may benefit from de nulling and flattening nested concats
- I estimate that around 1/6 - 1/5 of all Docs are Null, this is bad for printing, serialising and performance
- `DetermineLayout` might be able to use syntaxkind instead of is Syntax
- Do a dumb search on the code as a string for #if and then run `PreprocessorSymbols` (12% of our library code total time)
- `HasLeadingCommentMatching` could do a quick check for minimum length, avoid loop, code load or scary regex.
    - Would its span be relevant here, would also have to pass the regex length
- `Token.PrintSyntaxToken` check how often it short circuits. It might be worth moving the logic to an external function to prevent code load.
- See above for `PrivatePrintLeadingTrivia`
- Should use `in` keyword everywhere, but am waiting for more changes to be merged to make performance confirmation easier
- `AppendComment` `StringReader` could be a static shared instance (maybe) or replaced with a span version, not sure what the benefit of this would be
    - Looking at it it seems like a pretty easy rewrite to use spans
    - Could just use `EnumerateLines`, it literally looks like a drop in superior version
    - `StringReader` could be replaced in `SyntaxNodeComparer`
- Scared to look at `BinaryExpression`
- GetLeading/Trailing trivia can be replaced with a fast check for empty
- Using directive better sort
- Could try method inline `RawSyntaxKind` or replace some calls with raw kind comparison NITPICK doubt it will do anything
- Could replace a lot of Doc.Null with additions by either filtering all concat/group creation into a vlb, 
    - Or could use some collection expression weirdness 
    - Call it ConcatNotNull
    - Good for AttributeList
- Doc.Concat(whenTrue) / Doc.Concat(whenFalse) ????	
- ConditionalGroup could contain two Docs because it never uses more than two items
- Conditional Expressoin has a redundant condition
- InvocationExpression doesn't need to call .AsSpan().ToArray() on the printed nodes, it can instead have a VLB of nodes and a VLB of break indexes
    - It might be worth creating a dedicated type to keep logic the same
    - This would probably be useful with Concat.ManyChildren as many concat groups in an invocation chain will likely have less that 5 values
    - Off by one errors may be a big issue
- Modifier ToArray is unneeded to sort, could use stackalloc and span sort or array pool, it will likely escape but we can use concat children, prolly saves 50KB on complez
- Pool Dictionary<string, GroupMode> saves 170KB
- PrintSyntaxToken VLB is oversized
- Using SyntaxToken.TrailingTrivia or LeadingTrivia internally calls the internal greennode which will use both GetTrailingTriviaCore and GetLeadingTriviaCore, 
    - Has comments ends up calling each method two times each.
- Scan for `csharpier-ignore` and write to `PrinterContext` we can then avoid regex checks
- Noticed 170KB of Comparison<SyntaxToken> from `Modifiers`, I have no idea why a delegate is created for each `Array.Sort` call
- AddLeadingTrivia allocates a `StringWriter`, .NET 10 might stack allocate this


## Doc Printer
- Goto statements in `RunOn`, `DocFitter` etc. Literally the one time its recc
- `DocFitter` could have a cache where it remembers the final length of concats and short circuit, probably won't work due to different states
- Discrete type discriminator would be nice, C# 15 may add discriminant unions which may help
    - Might work if size is constrained to 16 bytes with additional info being heap allocated and pointed to
    - lots of indirection, will be tricky and likely not useful will speed up RunOn and DocFitter
- When reverse iterating through `Doc.Concat` it may help to `Span.Copy` and then use `Span.Reverse` in place
    - This will only be useful for larger average concats
    - Won't work with Concat.Two/Three may need to be special cased/ a dedicated abstract method added ie bool `TryCopyToReversed(Span<Doc>)`
- Split `DocPrinter`, `Fitter` and `RunOn` into the main most common path and a secondary check function ie `StringDoc`, `Concat`, `Group` falling back to the secondary one, might help code load / happy path
- `ContainsBreak`, early exit with StringDoc maybe
- `ElementAt`, Either use `UnsafeAccessor.ElementAt` at or `ValueListBuilder`
- Seeing a lot of `StelItem` calls, I assume it's coming from `Stack` I wonder if using VLB will stop this
    - Can't see where else it's from, stacktrace is a bit weird I assume `Stack.Push` is being inlined causing the weird backtrace
- `IncrementIndent` could be a list for instant lookup time instead of hashing shenanigans (maybe sparse list???? if worried about some weird length)
- Groups could use an integer instead of a string, faster, smaller, arguably better than `Guid` for debugging
    - Ask why `Guid` is used sometimes but an `int` is used otherwise, debugging?? 
    - An incremental Int is prolly better for humans than a different `Guid` each time, in most cases it will be consistent or off by one
    - Id is used as dictionary key, ints are much faster as keys (RunOn and Fits are hotspots rn)


## Node Comparer
SyntaxNodeComparer is run when a file is formatted, I think it runs in about half the time to format, not sure if it can be improved
- Might be able to do a simple compare text before running, this would be relevant for updates to csharpier where the file cache is invalidated but nothing changes in the new formatted code
    - I think I'll save this for last
- might be able to reuse `Stack` or estimate capacity
- AllButLast is good, might be possible to create a CompareEnumerable, don't know if the pain of boxing/enumerable allocations offsets the benefit of not calling `ToArray`
- `CompareLists` boxes, this might be avoided with generics, It's always a little janky I've never found this consistent
- `SyntaxNodeComparer` calls `ToList` a lot, it might be better if `ToArray` where used both to remove the `List` allocation and to avoid allocating when there are no items in the collection
- `CompareResult` indenation is off in `SyntaxNodeComparer`
    - I'm pretty sure the indentation is straight wrong.
- `SyntaxNodeComparer` calls `CSharpSyntaxTree.ParseText` on original text for a second time, this is pretty slow
- OrderBy
    - Could use the non boxing `ToArray` and the sort, this is not a stable sort so this might not work
    - Check for length before any boxing nonsense
    - Only call `OrderBy` skip `ToArray` and create a `CompareEnumerable` -  still boxes might be faster imo
- Mystery `Func<SyntaxToken, SyntaxToken, CompareResult>` allocations
    - No idea why this is a thing, driving me nuts, its probably Compare doing something weird. Maybe method group conversion, I swear Linq does this all the time why would it allocate a Func
    - Maybe a stateful vs static thing breaking things?
- I imagine `is` usage in `Compare` is slow, we used SyntaxKind before, not sure if it was worth the effort
`originalToken.ValueText.Replace` could do a span thing where it iterates backwards removing `\r` or comparing the two, probably not worth the hassle
- Apply `SkipLocalsInit` to everything? No point zeroing `CompareResult` because it always assigned to
- Why do we have two stacks? Have one stack with a tuple of 4 `SyntaxNode`
- Could try passing `CompareResult` via `in` or `ref`. Not sure how large a benefit it would be
    - TBF it's 224 due to padding
    - Does `TextSpan?` have to be nullable, can't it just be default
    - Perhaps all methods could return a bool and any `TextSpan` are assigned to a property in `SyntaxNodeComparer`

## CSharpier.CLI
- Would love to see a benchmark on a real world CSharpier cache file, how large is it, how long does it take to deserialise.
    - If this is a major issue CSharpier could start reading files async while the file is being serialised / read.
    - This logic would be a pain it's pretty much a pipeline
- How fast is the hash function, if its a problem (unlikely) perhaps CSharpier should try a faster version I don't think Xxhash32 is vectorised.
    - xxh3 is faster
- XxHash32.Hash has the option to return an int this way we can save a whopping 32 bytes!!! lmao
- I did wonder if TPL dataflow could be used for reading a bunch of files.
    - Maybe `Parallel.ForEachAsync` might work
- Parsing and building the regex for gitignore is super slow
- Enumerating all directories is extrememly slow, prettier uses `fast-glob` and `picoglob` to efficiently go through directories. This would be a huge time save as its a blocking issue

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Performance Improvements #1734

SyntaxPrinter

Doc Printer

Node Comparer

CSharpier.CLI

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

Performance Improvements #1734

Description

SyntaxPrinter

Doc Printer

Node Comparer

CSharpier.CLI

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions