Time Complexity Evaluation of Good Binary Tree Traversal | by Daniel García Solla

Concretely, the blue textual content denotes the index of the mum or dad node, whereas the opposite textual content in black illustrates the label of the node in query.

Implementation

Now with a well-defined enter, it’s doable to understand at a extra summary degree what the operations of the tree traversal really do. On one facet, the outer For[] loop traverses by way of all of the nodes in degree order from the bottom degree to the one the place the foundation is positioned. And, for every node the Whereas[] loop traverses the trail from the foundation to the visited node in reverse order, though the essential facet for the time complexity sure is its size.

TreeIteration[v_List] := Module[{n, pos}, n = Length[v];
For[i = n, i >= 0, i--,
pos = v[[i]];
Print["FOR: ", i];
Whereas[pos > 0,
Print["While: ", pos];
pos = v[[pos + 1]];
]]]

So, after implementing it in Wolfram, some Print[] are included to show the index of the mum or dad node of the nodes it traverses throughout its execution, enabling a better reconstruction of its hint.

Output

n = 7;
treeList = GenerateTree[n]
TreeIteration[treeList]
PlotBinaryTreeFromList[Drop[treeList, 1]]

Working the algorithm with a 7-node tree, represented as L={-1, 0, 1, 1, 1, 1, 2, 2, 2, 3, 3}, yields the next final result:

FOR: 8
Whereas: 3
Whereas: 1
FOR: 7
Whereas: 3
Whereas: 1
FOR: 6
Whereas: 2
Whereas: 1
FOR: 5
Whereas: 2
Whereas: 1
FOR: 4
Whereas: 1
FOR: 3
Whereas: 1
FOR: 2
FOR: 1
FOR: 0

At first look, the hint is just not too revealing, so it needs to be mixed with the tree depiction:

Within the first iteration of the for loop, the traversal begins on the final node of the bottom degree, whose mum or dad has index 3. Subsequently, this node can be visited by the whereas loop, till within the subsequent iteration it reaches the foundation and ends. Within the succeeding for iteration, the identical course of is carried out with the distinction that it begins with the node with index 6, whose mum or dad node is identical as earlier than. Thus, it may be famous that the for is definitely traversing all the present paths within the tree that join every of the nodes to the foundation.

With the algorithm in place, and after having absolutely characterised its enter and understood its operation, it’s appropriate to proceed with its evaluation, each by way of reminiscence and time. On the one hand, the evaluation of the reminiscence occupied is easy on this case, for the reason that algorithm doesn’t want further reminiscence to carry out its process, past the integer worth pos by which the node traversed in every iteration is saved. Accordingly, the asymptotic sure representing the extra reminiscence consumption is fixed O(1). And, if we contemplate the house occupied by the enter, an quantity of reminiscence of order O(n) can be required, the place n is the variety of nodes within the tree, or extra exactly O(2^d), the place d is the tree depth.

Alternatively, to find out the time complexity sure, we should outline an elementary operation to be accounted for. For this algorithm, it’s established because the asingnation executed contained in the whereas loop, which at an summary degree could be thought-about because the traversal of an edge between a node and its mum or dad. Subsequently, to establish what number of occasions this operation is invoked, the price of the algorithm is first decomposed into two features. There’s, on one facet, the associated fee Tf(n) of the for loop, which represents the full of 1 algorithm’s execution. This, in flip, is outlined because the sum of the prices incurred by the whereas loop, designated as Tw(i).

For all nodes i contained within the array, we have to decide what number of elementary operations are concerned within the traversal of the trail from the foundation to i, so we append the corresponding Tw(i) analysis. Particularly, that perform will return the precise variety of assignments attributable to a sure enter node. Thus, since we all know that the primary L[0] can not stroll any path to the foundation, it isn’t counted, protecting the sum limits between 1 and the variety of nodes n within the tree.

Earlier than persevering with, we proceed to show that the applying of the perform P(i) to a node i positioned at degree l of the tree ends in the label of a node positioned on the instantly higher degree, for the reason that elementary operation thought-about on this evaluation is equal to pos=P(pos), primarily because of the enter options:

As proven, we start with the inequality that any node should fulfill with respect to the extent at which it’s discovered, being its label bounded by the features m(l) and M(l), and assuming that l is its degree. Afterwards, when making use of P, a number of simplifications could be effected, resulting in the conclusion that P(i) lies between 2^(l-1) and 2^l-1, each coinciding with the evaluations m(l-1) and M(l-1), suggesting that after the transformation the ensuing node is positioned at degree l-1. With this, we’re demonstrating that after a number of iterations of the whereas loop, the node saved in pos may have a degree nearer to the tree root. Consequently, if sufficient of them are accomplished, the trail is assured to succeed in the foundation and terminate. Though, in case of contemplating an infinite tree this won’t maintain.

Strategy 1

In the intervening time, we all know that the complexity is pushed by Tf(n), regardless of the dearth of a precise expression for Tw(i), so we proceed to debate three alternative ways to characterize this perform, and thereby the general asymptotic progress of the execution time.

No matter how the remaining perform is discovered, a constraint on the tree nodes shall be met in all analyses. Specifically, since all of them have a single mum or dad, besides the foundation, we will be sure that the trail size between an arbitrary node positioned at a degree l and the foundation is the same as l. Primarily that is because of the property demonstrated above, though it will also be evidenced by the belief that every node current on such a path is at a special degree, which might range from 0 to l. Then, because the whereas loop traverses each node within the path, it’s concluded that the operations counted in Tw(i) are precisely l.

Now, the evaluation is targeted on computing the extent of a given node, so we begin from the higher inequality. That’s, the label of a node i is at a degree l bounded by m(l) and M(l), yielding two circumstances with which the extent could be precisely quantified:

On the one hand, fixing from the left facet of the inequality results in a situation reducible to a flooring operation on log_2(i), by its personal definition. From this, it may be inferred that the extent is the same as that amount, though the opposite situation of the unique inequality nonetheless must be verified.

Ranging from the right-hand facet, we arrive at a decrease sure for l, which a priori seems to be complementary to the previous. Nonetheless, after working and making use of the definition of the ceiling perform, we arrive on the following formulation for the extent, since its worth is the minimal integer that satisfies the final inequality proven above.

Picture by writer.

Recapitulating, thus far we have now derived a number of expressions for the extent of a node i, which at first might be regarded as bounds of that worth as a consequence of their nature. Nonetheless, the extent have to be an integer, so it’s conceivable to examine the gap between them, simply in case it had been sufficiently small to uniquely determine a single worth.

In abstract, it’s confirmed that each expressions are similar for all of the values that the node labels might have. Subsequently, the extent of a node could be inferred by both of the above formulae, the left one being the best, and so the one which shall be used within the evaluation.

As the extent of i coincides with the elementary operations of the whereas loop, the associated fee Tw(i) is outlined analogously to the node’s degree from which the trail to the highest should start.

Subsequent, with an expression for the price of every iteration of the for loop as a perform of the preliminary node, we will attempt to discover the sum of all the prices generated by the nodes of the tree. However, as there’s a flooring perform in every summand, we’ll first examine the influence of not making use of this perform on the final word sure, to be able to simplify the summation, in addition to the ensuing sure in case the ground turns into dispensable.

If we plot Tf(n) for an honest vary of n, a slight distinction is discernible between the perform with the ground of every summand eliminated and the unique one. Notably, the one which instantly sums the logarithm values with none further transformation seems to be a higher sure of the particular complexity, so if we proceed to unravel the sum the place every time period is instantly log_2(i), we will arrive at a sure that asymptotically could also be considerably bigger than the precise one, establishing itself because the higher one:

By expressing the sum in a closed kind, we might assume that the algorithm requires an execution time no better than the order O(log(n!)) with respect to the variety of nodes within the enter tree. Nonetheless, this sure could be additional simplified. As an example, if we contemplate that in every iteration of the for loop, which is executed as many occasions as n nodes, a piece is carried out proportional and never larger than the utmost degree of the tree, we might get an higher sure of order O(n log(n)). As a consequence, if we evaluate it with the earlier order O(log(n!)) by way of the restrict of its ratio when the enter tends to infinity, we conclude that each are equal, permitting the simplification of the higher sure of the algorithm’s runtime to O(n log(n)).

At this juncture, the higher sure ensures that the runtime overhead of the algorithm doesn’t exceed the order O(n log(n)) by way of progress with respect to the enter. Nonetheless, the curiosity of the evaluation resides in bounding the associated fee as a lot as doable, i.e., discovering the tight sure, not an higher one, which in some circumstances might differ considerably. For this, it’s essential to discover a closed kind for the sum Tf(n) above, particularly when the ground perform is utilized to the summands. Intuitively, the applying of the ground will cut back the worth of every time period to some extent, and the final word worth might range because of the dependence between the higher restrict and the dimensions of the tree.

Firstly, for log_2(i) to be an integer and to keep away from making use of the ground transformation, the node label have to be of the shape 2^l, the place l should essentially discuss with the extent at which it’s encountered.

Coincident with m(l), above it’s proven that every one nodes i=m(l) whose label is the minimal of their degree will end in log_2(i) being an integer, specifically l.

Subsequently, by feeding all labels between m(l) and M(l) as enter to the flooring(log_2(i)) perform, it ought to yield its degree, which has been discovered to coincide with that of the “consultant” m(l) node of that degree. Briefly, this permits to imagine that each node of a specific degree will incur in the identical value Tw(i), as the trail’s size from any considered one of them to the foundation is precisely equal to l.

Picture by writer.

Subsequently, the variety of nodes at every degree is deduced, which as one would possibly guess with out this step is 2^l, that’s, if at every degree the variety of nodes of the earlier one is doubled, for a sure degree this amount shall be given by the product of the branching issue by itself l occasions.

In conclusion, the runtime value of the algorithm in any respect nodes of the identical degree l is the product between the size of the trail to the foundation, coincident with the extent, and the variety of nodes in it. And, from this outcome a closed kind for Tf(n) depending on the depth d of the tree could be drawn:

By rewriting the sum as a perform of the degrees from 0 to the depth, we arrive on the above expression, which could be concretized by defining the connection between d and the full variety of nodes n:

Picture by writer.

Since n is the label of the final node, flooring(log_2(n)) ensures to return the worth of the final degree, which in flip coincides with the depth d. Thus, by the above formulation of the entire value Tf(n) we conclude with the next tight sure:

At this level, it’s price making an attempt to simplify it, in order that it’s featured by a less complicated expression. Because of this, we proceed to calculate its ratio with the earlier higher bounds, which can primarily present the distinction between each in case they’re asymptotically equal, or diverge within the reverse case (though it might additionally converge to 0).

However, the bounds of the ratio produce the identical outcome for each higher bounds, being asymptotically equal. And, as they lie on an actual interval, it may be inferred that the tight sure is equal to the higher one, not less than asymptotically, for the reason that ratio signifies a negligible distinction at infinity.

Lastly, the time complexity of the algorithm is decided by the highest order, which could be achieved in a number of methods as we’ll see under. Earlier than persevering with, although, it’s price noting the connection between the 2 expressions discovered for the tight sure. Whereas the latter relies upon instantly on the variety of nodes, the unique one could be shaped by rewriting the one proven above changing n by the variety of nodes on the final degree, which contributes to a greater understanding of the dependence between the runtime and the properties of the information construction concerned.

Strategy 2

One other technique to proceed with the evaluation is by defining every worth contained within the enter array:

Each is recognized by a concrete analysis P(i), from which the next constraint could be inferred:

By representing P(i) an ascent within the tree, any enter bounded by [0,n] that may be offered to the perform will all the time return a outcome current in the identical interval, which results in the formalization of the traversal carried out by the whereas loop and whereby we’ll obtain Tw(i):

Initially of the traversal, any node i is chosen, ascending to its mum or dad P(i), then to its ancestor P(P(i)) and so forth till reaching the foundation with label 1. Really, the loop stops when reaching the “node” L[0], nonetheless, right here it’s thought-about that it stops on the root, for the reason that distinction in value shall be fixed. So, above we formalize this course of by composing P(i) a variable variety of occasions, which as we all know coincides with the size of the trail to the foundation, could be set equal to the node’s degree l.

With this method, the associated fee Tw(i) is outlined as the extent of the enter node, which will also be acquired by discovering the integer that satisfies the higher equality.

At this level, when acquiring the integer l that causes the repeated composition to end in 1, we first apply the properties of the ground perform to explain the composition in a closed kind. Additionally, it’s demonstrable that the composition of the perform P ends in the above expression.

Thereafter, by definition of the ground perform, an inequality is established between the closed type of the composition and the result it ought to attain. That’s, the equality dictates that after l compositions precisely the worth of the foundation is reached, though, for the reason that argument of the ground could also be better than 1, we proceed from the inferred inequality. Lastly, we conclude with an expression for the extent of a sure node i, which we’ll use to search out Tf(n), and therefore the complexity.

When changing Tw(i) by the extent of node i, the summation produced is equal to the one solved within the earlier evaluation, so the ultimate expression can be equal.

In the end, the tight sure derived from this process is of order nlog(n), coinciding with the beforehand inferred one. In flip, it could even be rewritten as a perform of tree’s depth, which in sure conditions turns into useful.

Strategy 3

Lastly, we’ll discover an alternate technique to carry out this evaluation and purchase the prior asymptotic sure. On this case, we will begin from the label i of a mum or dad node saved within the array. This label at low degree is represented as a optimistic integer, particularly in base 2. Subsequently, its binary kind could be denoted as follows:

On one hand, it’s outlined because the sum of the merchandise of the bits by their worth within the corresponding base, which in a compact format is formalized as a gaggle of bits whose subscript denotes such worth.

Every of the bits is an integer 0 or 1, whose subscript belongs to the interval of the integers comprised between 0 and B(i)-1, the place B(i) is the perform that returns the size of the binary illustration of the integer i.

As for its formulation, it stays confirmed that the variety of bits wanted to explain an integer in base 2 is given by the above equality. A priori, the logarithmic time period is similar to the expression describing the extent at which node i is positioned, so we will start to elucidate the remainder of the process.

To calculate Tw(i), it’s essential to account for the impact of P(i) on the binary illustration of the node label. Merely put, the label ensuing from the repeated utility of P(i) have to be 1, or on this case for simplicity 0. Subsequently, by dividing the label by 2 and making use of the ground perform, it may be assured that in binary the equal of this perform is a shift operation to the correct. So, after B(i) shifts, the ensuing label shall be 0, concluding the trail of the whereas loop and incurring a price proportional to flooring(log_2(i))+1.

Likewise, when substituting B(i) within the sum of the general value, on this evaluation we find yourself with a further time period n, which, being smaller than the ultimate worth, is asymptotically negligible.

In conclusion, with this process the identical tight sure is deduced, protecting the runtime value of the algorithm categorized by the order nlog(n).

Time Measurements

Lastly, after the theoretical evaluation, experimental measurements shall be collected of the time it takes to complete the algorithm for inputs of various sizes, to be able to present how effectively or poorly the runtime progress matches the tight sure.

knowledge = Flatten[
ParallelTable[
Table[{n, 
AbsoluteTiming[TreeIteration[GenerateTree[n]]][[1]]}, {index, 0,
n}], {n, 1, 300}], 1];

To this finish, a number of Wolfram features are used within the measurement course of. Probably the most vital of those is AbsoluteTiming[], which information the time in seconds it took to run the algorithm with a tree consisting of n nodes. Right here, we don’t choose values of n which might be powers of two, we merely contemplate that the enter is a whole tree as a substitute of an ideal one to be able to observe how the execution time grows in relation to the variety of nodes. Then, measurements are taken for n from 1 to 300, performing n runs for every corresponding variety of nodes.

nonlinearModel = NonlinearModelFit[points, a*x*Log[x], {a}, x]
ListPlot[{points, Table[{x, nonlinearModel[x]}, {x, 0, 300}]}, 
PlotStyle -> {Pink, Inexperienced}, ImageSize -> Giant, 
AxesLabel -> {"n", "Time (s)"}]

Afterwards, a becoming mannequin of the shape c*nlog(n) is outlined by which c represents a continuing used as a parameter, adjusting its worth to the measurement dataset as dictated by the NonLinearModelFit[] perform.

As soon as the mannequin has been fitted, the highest final result is generated, the interpretation of which is considerably extra significant when plotted in opposition to the information factors:

As seen, the dataset reveals some variability as a consequence of sensible interferences within the measurement course of. Nonetheless, the expansion as n is elevated is clearly just like an order nlog(n), which can be exceptional as compared with the placement of the mannequin, being located in a zone considerably decrease than the typical between the 2 areas that visibly present a better density of measurements.

Lastly, with the earlier becoming outcomes and an adjusted R² of 0.934551, it may be concluded that the mannequin accurately captures the expansion pattern of the dataset. Although, its variability interprets right into a slight uncertainty within the worth of the c fixed.

The formal evaluation of the algorithm characterizes the asymptotic progress of its execution time by the order Θ(nlog(n)). Such sure has been calculated from three totally different approaches, though all of them are primarily based on the identical concept of figuring out the depth of every tree node. Within the first one, the extent was used because the depth measure, which is equal to the variety of occasions P(i) should compose with itself to succeed in the label of the foundation node, and, in flip, to the variety of bits wanted to characterize the label of the preliminary node i in binary.

Additionally, as a closing notice, it’s price mentioning that many of the Wolfram code concerned on this evaluation was generated by the GPT-4o mannequin from ChatGPT.