1. Use the trick of Tree SQRT-Decomposition

I didn't find any Codeforces post describing this method, so I'm here to share this method. Also, I hope people could help me learn more about tree sqrt-decomposition.

Main Idea

The main idea of Mo's algorithm is to decompose sequence with size $$$N$$$ into $$$O(K)$$$ contiguous blocks, and if we re-order our queries $$$[L_i,R_i]$$$ with pair $$$(\text{BlockID}(L_i), R_i)$$$ and maintain two pointer of $$$L$$$ and $$$R$$$, we could get a $$$O(Q \frac{N}{K} + KN)$$$ complexity to solve some sequence problem.

It is because that every queries with the same $$$\text{BlockID}(L_i)$$$ shares the cost of moving the pointer of $$$R$$$ from leftmost to rightmost, so the pointer of $$$R$$$ will move at most $$$O(NK)$$$ elements. Also, the pointer of $$$L$$$ won't move more than $$$O(\frac{N}{K})$$$ elements when changing from one query range to another query range.

So, if we have a way to preserve these two properties ( i.e. we could share the cost of moving a pointer $$$R$$$ when those queries are in the same block, and we could correctly decompose our tree into some blocks such that moving a pointer $$$L$$$ between nodes in a single blocks won't cost to large ), we could apply a mo's algorithm on trees.

Tree SQRT-Decomposition

Here comes a way to decompose tree into blocks such that the size of blocks will be in the range $$$[K, 3K]$$$, and also the distance between nodes inside a single block won't exceed $$$O(3K)$$$.

Analysis

So, why this algorithm work ( i.e. the size of blocks will be in the range $$$[K, 3K]$$$, and the distance between nodes inside a single block won't exceed $$$O(3K)$$$ )?

Obviously, we call vertices_without_group a block if and only if it's size is greater than $$$K$$$, so the size of a single block won't exceed $$$K$$$ and the size of the return value of DFS will be less than $$$K$$$. Also, when merging two sets with size both less than $$$K$$$, the resulting set will have a size less than $$$2K$$$. This proves the size of blocks constructed in DFS will be in the range $$$[K, 2K]$$$. However, the merging step in group_tree_vertices may make the size of the last block becomes greater than $$$2K$$$, but it won't make it over $$$3K$$$. Now, we proved that the size of every block will be in the range $$$[K, 3K]$$$.

Additionally, a single block is almost-connected ( i.e. if it is constructed when dfsing $$$u$$$, then after putting $$$u$$$ into that block, the block becomes a connected component ), so the distance between nodes inside a single block won't exceed its size. Thus, the distance won't exceed $$$O(3K)$$$.

Mo's Algorithm on Tree with Tree SQRT-Decomposition

So, how to apply mo's algorithm on trees to handle tree path problem?

We could find that if we maintain an edge set $$$T_u$$$ from root to $$$u$$$ and an edge set $$$T_v$$$ from root to $$$v$$$, then $$$T_u \operatorname{\triangle} T_v$$$ ( where $$$\operatorname{\triangle}$$$ means Symmetric difference ) will be the edge set between $$$u$$$ to $$$v$$$. Also, when we want to change our query from $$$(u, v)$$$ to $$$(p, q)$$$, we just need to transform $$$T_u$$$ to $$$T_u \operatorname{\triangle} ( T_u \operatorname{\triangle} T_p )$$$ and $$$T_v$$$ to $$$T_v \operatorname{\triangle} (T_v \operatorname{\triangle} T_q)$$$ ( because $$$(T_u \operatorname{\triangle} ( T_u \operatorname{\triangle} T_p )) \operatorname{\triangle} (T_v \operatorname{\triangle} (T_v \operatorname{\triangle} T_q)) = T_p \operatorname{\triangle} T_q $$$ ).

As implementing this algorithm, when we want to change our query from $$$(u,v)$$$ to $$$(p,q)$$$, we just need to traverse the path from $$$u$$$ to $$$p$$$ and $$$v$$$ to $$$q$$$, adding those edges into our set if they haven't been in our set, or remove them if they have been in our set.

We could sort our queries $$$(u_i, v_i)$$$ with $$$(\text{BlockID}(u_i), \text{dfn}(v_i))$$$ ( where $$$\text{dfn}(u)$$$ means the time we enter our tree in a dfs procedure ).

This preserves the properties described above because moving a pointer between a single block cost $$$O(K)$$$ and traverse the tree with $$$\text{dfn}$$$ equals to traversing the tree with dfs, and thus, it cost $$$O(N)$$$.

As a result of this algorithm, we could first sqrt-decompose our tree and sort our queries base on it, and apply a mo's algorithm on it.

However, the merging step in group_tree_vertices may make the size of the last block becomes greater than 2K, but it won't make it over 3K. Now, we proved that the size of every block will be in the range [K,3K].

Hmm. That's wierd. You stated that the result vector returned by DFS will not be more than K in size. And the merging step simply combines the nodes in the vector into one block. So how does the last block even become greater than K in size?