-
-
Notifications
You must be signed in to change notification settings - Fork 46.9k
Improved del_node func #12011
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improved del_node func #12011
Conversation
There was some logical error in implementation of delete node function. Also we don't need to find balance factor 2 times so made separate variable.
for more information, see https://pre-commit.ci
If you run
|
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
@99991 Please review my recent Pull Request |
The tree becomes unbalanced when inserting and deleting the same value multiple times. Here are a few more tests you can use for debugging. Note that the tests are very comprehensive and probably not suited for doctests because that would take too long. def dump_tree_inorder(node, data):
if node:
dump_tree_inorder(node.left, data)
data.append(node.data)
dump_tree_inorder(node.right, data)
def check_balance(node):
if node:
if node.left and node.right:
balance = node.left.height - node.right.height
assert abs(balance) <= 1, f"Height difference of more than 1 between left node {node.left.data} with height {node.left.height} and right node {node.right.data} with height {node.right.height} of parent node with data {node.data}"
left_height = check_balance(node.left)
right_height = check_balance(node.right)
height = max(left_height, right_height) + 1
assert height == node.height, f"Node {node.data} should have height {height} instead of height {node.height}"
return height
return 0
replay = """
insert:6
insert:1
insert:1
right rotation node: 1
left rotation node: 6
insert:8
insert:7
left rotation node: 8
right rotation node: 6
insert:9
right rotation node: 1
insert:9
right rotation node: 8
insert:6
insert:2
insert:2
left rotation node: 6
right rotation node: 1
insert:1
insert:0
left rotation node: 7
insert:5
insert:8
insert:8
right rotation node: 8
insert:4
left rotation node: 5
right rotation node: 2
delete:1
left rotation node: 1
left rotation node: 7
right rotation node: 2
"""
t = AVLtree()
# Parse printed output to reproduce error
for line in replay.strip().split("\n"):
op, value = line.split(":")
if op == "insert":
t.insert(value)
elif op == "delete":
t.del_node(value)
print(t)
check_balance(t.root)
import random
# Brutefoce test many insertion and deletion sequences
for length in range(1000):
for seed in range(1000):
print("seed:", seed, "length", length)
random.seed(seed)
t = AVLtree()
s = []
for _ in range(length):
if random.random() < 0.4 and s:
value = random.choice(s)
i = s.index(value)
s.pop(i)
t.del_node(value)
else:
value = random.randrange(20)
if value in s: continue
s.append(value)
t.insert(value)
check_balance(t.root)
tree_data = []
dump_tree_inorder(t.root, tree_data)
assert sorted(s) == tree_data, f"Expected {sorted(s)}, got {sorted(tree_data)}"
print() |
@99991 AVL must not contain duplicate nodes by definition(we can maintain count variable though) So what am I supposed is either I maintain a new variable to count and store count of duplicates(also change other functions) or discard duplicate node while inserting them(only changes in insert function) ? |
Both solutions are good (much better than silently corrupting the tree anyway). I think ignoring duplicate keys is easier since it only requires an additional if-branch in |
@99991 Kindly review my pull request (like 3rd time) |
The following insertion/deletion sequence will result in an unbalanced tree. The height of the left subtree of node 12 is 1, while the height of the right subtree is 3, which results in a larger-than-allowed height difference of 2.
You can use this code to check: def dump_tree_inorder(node, data):
if node:
dump_tree_inorder(node.left, data)
data.append(node.data)
dump_tree_inorder(node.right, data)
def check_balance(node):
if node:
if node.left and node.right:
balance = node.left.height - node.right.height
assert abs(balance) <= 1, f"Height difference of more than 1 between left node {node.left.data} with height {node.left.height} and right node {node.right.data} with height {node.right.height} of parent node with data {node.data}"
left_height = check_balance(node.left)
right_height = check_balance(node.right)
height = max(left_height, right_height) + 1
assert height == node.height, f"Node {node.data} should have height {height} instead of height {node.height}"
return height
return 0
replay = """
insert:1
insert:5
insert:6
insert:12
insert:3
insert:9
insert:19
insert:0
insert:16
insert:4
insert:10
delete:16
insert:14
insert:17
insert:15
insert:7
delete:0
"""
t = AVLtree()
# Parse printed output to reproduce error
for line in replay.strip().split("\n"):
op, value = line.split(":")
try:
value = int(value.strip())
except ValueError:
continue
if op == "insert":
t.insert(value)
elif op == "delete":
t.del_node(value)
print(t)
check_balance(t.root)
import random
# Brutefoce test many insertion and deletion sequences
for length in range(1000):
for seed in range(1000):
print("seed:", seed, "length", length)
random.seed(seed * 1000 + length)
t = AVLtree()
s = set()
for _ in range(length):
if random.random() < 0.4 and s:
value = random.choice(list(s))
s.remove(value)
t.del_node(value)
else:
value = random.randrange(20)
if value in s: continue
s.add(value)
t.insert(value)
check_balance(t.root)
tree_data = []
dump_tree_inorder(t.root, tree_data)
assert sorted(s) == tree_data, f"Expected {sorted(s)}, got {sorted(tree_data)}"
print() |
Output I am getting after running that replay thing
It is correct I think so Run this code below
|
LGTM 👍 (There might be a small issue when inserting the value |
The pull request has been reviewed. Could you please merge it? |
I can not merge this PR because I do not have write access to this repository. You can go through recently merged PRs to check who approved/merged them and ask if they can also merge yours: https://github.com/TheAlgorithms/Python/pulls?q=is%3Apr+is%3Amerged |
@cclauss |
@tianyizheng02 |
There was some logical error in implementation of delete node function.
Also we don't need to find balance factor 2 times so made separate variable.
Describe your change:
Checklist: