Imagine an object like this
public class ContentType
{
public string Alias { get; set;}
public string ParentAlias { get; set;}
}
And a flat collection of these objects
List<ContentType> contentTypes...;
How can I use a linq chained syntax query to determine if there is a circular reference in the collection.
//Example
ContentType #50
Alias: Truck
ParentAlias: Vehicle
ContentType #90
Alias: Vehicle
ParentAlias: Truck
That would be a circular dependency which would break the code that creates the content types (it would get stuck in an infinite loop walking the parent hierarchies..)
So before I process the parent/child content types I would like to first detect if there is a circular dependency in the collection and halt the operation if one is detected.
So we'll start with two helper methods. First we'll want a method that yields all of the ancestors for an item, when given that item and a delegate that gets the parent of an item:
We'll also write a method to determine if a sequence repeats by storing all of the previously yielded items and seeing if each new item is in that set:
From here we can determine if any sequence contains cycles by computing the ancestors of each item and determining if any of those collections repeat:
All that's left is to write a method that computes the parent of each item, since that's not an operation your class already supports, which can be done by just creating a lookup from the alias to the item and then using it:
As far as performance, and possible improvements, if you're dealing with particularly large trees/subtrees, you could cache the results of the intermediate calculations. For example, if we have a node A, and a parent B, and a grandparent C, which is a root, then when doing a calculation to determine if A is in a cycle we also need to determine if B is in a cycle. If we already determined if B was in a cycle earlier, and cached it, we could skip that step. if we didn't, then we could cache it when doing the cycle calculator for A, and then we won't need to do it again when checking B later.
This does complicate the code a fair bit, so if you don't have a particularly large/deep graph, you may choose not to bother caching these intermediate results and just choose to re-calculate them for each item: