The motivation behind this proposal is the non-deterministic nature of speech: given a sentence, there are many valid ways to articulate the words, and many of these carry different meanings. One way to capture this variability is to sample from speech synthesis models. By addressing the shortcomings of existing models and designing more accurate models of speech, we hope to improve the quality of samples drawn from these generative models. We will attempt this by focussing on the assumptions that existing models make.