Is it also important to balance the variations within an intent's examples? Does the TensorFlow embedding classifier also use the count of each word for classification? In other words, if one variation has more examples than the others within an intent, does that have an impact? To be exact, using the example above: I might have intent examples of one type, like "how would I pin to powerpoint the taskbar", more often than the others.

Add more examples to the intent that has only 2 examples. An NLU model cannot be expected to predict that intent accurately from so few examples.
You don’t have to balance them exactly, but there should be a reasonable number of examples for each intent.
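
If it helps, here is a rough sketch of how you could check how many examples each intent has before training. It is plain Python and does not use the Rasa API. The intent names, most of the sample sentences, and the `min_examples` threshold are made up for illustration; pick whatever threshold seems reasonable for your data.

```python
from collections import Counter


def count_examples(examples):
    """Return a Counter mapping each intent name to its number of examples."""
    return Counter(intent for _text, intent in examples)


def underrepresented(counts, min_examples=5):
    """Return the intents with fewer than min_examples examples, sorted by name.

    min_examples is an arbitrary illustrative threshold, not a Rasa rule.
    """
    return sorted(intent for intent, n in counts.items() if n < min_examples)


# Hypothetical training data as (text, intent) pairs.
examples = [
    ("how would I pin to powerpoint the taskbar", "pin_app"),
    ("how do I pin powerpoint to the taskbar", "pin_app"),
    ("pin powerpoint to my taskbar", "pin_app"),
    ("can I pin word to the taskbar", "pin_app"),
    ("add excel to the taskbar", "pin_app"),
    ("put outlook on the taskbar", "pin_app"),
    ("remove powerpoint from the taskbar", "unpin_app"),
    ("unpin word from the taskbar", "unpin_app"),
]

counts = count_examples(examples)
for intent, n in counts.most_common():
    print(f"{intent}: {n} examples")

# Intents listed here probably need more examples.
print("needs more examples:", underrepresented(counts))
```

The counts don't need to come out equal. The point is only to spot intents that have far fewer examples than the rest.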