My question is, What is the point of the rows where ancestor_id=descendant_id? What useful information do they add? Makes no sense.

I will hazard a guess: Those elements may contain some information in addition to links to descendants. Therefore, when looking for information (keys), you must inspect the element itself in addition to its descendants.

Notice also that for all records of the Comments table, the pair (commentId,commentId) has been added to the CommentPaths table.
That makes it easy to build queries about a comment and all of its descendants.

The CommentPaths table is a terrible way to store relationships. While it might make it easy to determine the set of records that contain a record and all of its children, the table is a maintenance nightmare. Suppose you add a subtree to one of the child nodes? You have to find all of the parent nodes associated with that child node and add all of the subtree nodes to it. Then, if you want to move the subtree to another child node, you have to do the same thing for the new position plus remove the old relationships.

The CommentPaths table is a terrible way to store relationships. While it might make it easy to determine the set of records that contain a record and all of its children, the table is a maintenance nightmare. Suppose you add a subtree to one of the child nodes? You have to find all of the parent nodes associated with that child node and add all of the subtree nodes to it. Then, if you want to move the subtree to another child node, you have to do the same thing for the new position plus remove the old relationships.

If we want to do something like

Code (Text):

parents_id | child_id
================
NULL 1
1 2
2 4
2 5
1 3
3 6
6 7

then it's still a maintenance nightmare, isn't it? The advantage of the table in my original post (called a Closure Table) is that if you delete a comment it's easy to find all it's descendants to delete.

It's also easy to add a comment. Suppose you want to change the table to

Code (Text):

1
/ \
2 3
/ \ \
4 5 6
/ \
7 8

meaning add 8 as a child of 6. All you have to really do is enter as ancestors of 8 all ancestors of 6.

Why would you consider updating one record to be more of a maintenance headache than updating multiple records?
For example, lets say that you're the DBA for Ancestry.com. You just got a record that says that someone named John Doe is a relative of Thomas Jefferson. You have the entire set of relatives for John Doe all the way to the present day (presumably thousands of relatives). You now have to create new parent/child relationships for all of John Doe's relatives in order to relate them to Thomas Jefferson. Two days later, you find out that he should have been tied to someone named Tomas Jefferson. You now have to remove thousands of records tied to Thomas Jefferson and again create thousands for Tomas Jefferson. The table above only requires updating a single record.

Yes, but your example doesn't relate to my sistuation because the numbers are auto-incremented ints representing ids of comments. Those ids have no reason to ever change. In your example, what if you find out that Thomas Jefferson was not a real person and you've imposed a rule that your ancestory database only contains members whose entire lineage is known? You're going to have to write a Hell of a query to delete all of Thomas Jefferson's descendants from the tree.

....The advantage of the table in my original post (called a Closure Table) is that if you delete a comment it's easy to find all it's descendants to delete.

Is that table good for just explaining how the data are being stored in the B-tree ? But in implementation repeated rows as shown in that table are not required. A tree grows from the basic concept of a linked list with more pointers in within to point to the left and right of the tree itself.
So something like this I think is enough

Code (Text):

parents_id | child_id
================
NULL 1
1 2
2 4
2 5
1 3
3 6
6 7

I don't know how a comment 3 is false but its descendants 6 and 7 are true. So deleting 3, 6,7 all at once is logical, isn't it ?

I gave that as an example, not as a database rule.
It only requires modifying a single child record that ties John Doe to Thomas Jefferson. Once that record is changed, the decendants are no longer linked to Jefferson. How is that a "Hell of a query"?

It would be a Hell of a query if you need to delete all Jefferson's descendants. I guess that wouldn't make sense in the family tree example since you'd keep the descendants so that you can attach them to the correct ancestor later. In my example, where the tree represents comments in a thread, the descendant comments are junk data as soon as the ancestor comment is deleted, hence it's nice to have the tree be a structure in which it's easy to query descendants.

I guess that I wasn't following your example very well. I see that the table is named CommentPaths but there wasn't much to describe its purpose until now. If you want to delete descendant comments, there are other options for deleting all of the children automatically. Cascade delete is perfect for this situation.

I guess that I wasn't following your example very well. I see that the table is named CommentPaths but there wasn't much to describe its purpose until now. If you want to delete descendant comments, there are other options for deleting all of the children automatically. Cascade delete is perfect for this situation.

Can you show me how you would write a cascade delete that does all this?

There is only one entry for any response or comment. All Response entries have to be tied to a comment. If a comment is deleted, all of the responses tied to that comment will be deleted.
The Response Parent ID would signify its relationship to its parent response. If its parent is a comment, you could null it or have it reference itself to signify that your code should treat it differently. The two tables are similar enough that you could change a response into a comment with a few commands but order would be important to avoid cascading a delete. You could also probably put everything into a single table but that would take some additional effort.