From a database I need to remove categorical columns with a certain percentage of missing values and, at the same time, I would like to remove the numeric columns with a certain percentage of stagnant (constant) values (not necessarily 100%).

Hi msharp, thanks for your reply. Let me ask another question. Does the percent variable work on both categorical and numeric columns? If so, can I set two different values, one for the missing case and one for the stagnant case?

Yeah, it all depends on how you set it up. The first check finds the most common value in each column and determines whether its share of the rows is above or below the threshold percentage. Unfortunately, a missing value isn't a value, so it can't be the most common. That's why it doesn't work on the numeric columns. It does work on the character columns because "" is both missing and an empty string value.
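To make that concrete, here is a rough sketch of the "most common value" check in plain Python. It is not the script discussed in this thread, only an illustration: the function name `most_common_share` and the use of `None` to stand in for a missing numeric value are assumptions.

```python
from collections import Counter

def most_common_share(values):
    """Share of all rows taken by the most frequent non-missing value.

    Assumption: None stands in for a missing numeric value, and it is
    skipped when counting, just as a missing value "isn't a value".
    """
    present = [v for v in values if v is not None]
    if not present:
        return 0.0
    top_count = Counter(present).most_common(1)[0][1]
    return top_count / len(values)

numeric = [None, None, None, 1.0, 2.0]  # 60% missing, but missing is never counted
text = ["", "", "", "a", "b"]           # 60% empty strings, and "" is a real value

numeric_share = most_common_share(numeric)  # most common real value covers 1 of 5 rows
text_share = most_common_share(text)        # "" covers 3 of 5 rows
```

With a 50% threshold, the text column would be flagged and the numeric one would not, even though both are 60% empty.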

Hope that helps explain things.

If you want more granularity, just run multiple loops with different percentages using different strategies.
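As a sketch of what "different percentages with different strategies" could look like, the plain-Python example below applies one threshold for missing values to text columns and a separate threshold for stagnant values to numeric columns. All names here (`missing_share`, `stagnant_share`, `columns_to_drop`, `missing_pct`, `stagnant_pct`) are hypothetical, and the table is assumed to be a dict of non-empty, equal-length lists.

```python
from collections import Counter

def missing_share(values):
    """Fraction of rows that are missing (None or an empty string)."""
    return sum(v is None or v == "" for v in values) / len(values)

def stagnant_share(values):
    """Fraction of rows taken by the single most frequent non-missing value."""
    present = [v for v in values if v is not None and v != ""]
    if not present:
        return 0.0
    return Counter(present).most_common(1)[0][1] / len(values)

def columns_to_drop(table, missing_pct, stagnant_pct):
    """Pick columns to remove, using a different rule per column type."""
    drop = []
    for name, values in table.items():
        # Assumption: a column is numeric if every non-missing entry is a number.
        is_numeric = all(v is None or isinstance(v, (int, float)) for v in values)
        if is_numeric:
            if stagnant_share(values) >= stagnant_pct:
                drop.append(name)
        elif missing_share(values) >= missing_pct:
            drop.append(name)
    return drop

table = {
    "city": ["", "", "", "Rome", "Oslo"],  # text, 60% missing
    "flag": [7, 7, 7, 7, 3],               # numeric, 80% the same value
    "size": [1.5, 2.0, None, 3.1, 4.4],    # numeric, values vary
    "name": ["a", "b", "c", "", "d"],      # text, 20% missing
}

dropped = columns_to_drop(table, missing_pct=0.5, stagnant_pct=0.8)
```

Here `city` and `flag` end up in `dropped`, each caught by its own threshold, while `size` and `name` are kept.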