This is an example of sound change, and there are a few hypotheses we can consider as to why this sound change came about.

The first hypothesis is that this is the result of sound change through misperception. No, particularly when said abruptly and in isolation, ends in a glottal stop: you can feel your glottis (the opening between your vocal cords) closing, if you pay attention. A glottal stop is easily confused with the standard English sounds p, t and k. When you combine this with the rounding of the lips from the o sound, you get something more p-like than k-like. (Note the position of the lips with a p; in contrast, many languages have had glottal stop ~ k sound changes, e.g., Hawaiian.)

This results in acoustic cues at the end of no that make it sound as though there might be a p at the end. Some people misperceive that as a full-on p, then start purposefully adding the p, and the rest is history(cal linguistics).

More generally, it's likely incorrect to presume that language has an explicit drive toward being easier. Rather, there are tangible forces at play that result in it generally being easier, but not always (e.g., nope).

Therefore, an obvious alternative to consider is sound change through misproduction. I think this is far less likely, but one way it could work is to begin with the idea that two syllables are a more natural state for a word to be in. Evidence for this comes from babies, for example, who often start speaking using bisyllabic words, e.g., mama, dada, bahbah (for bottle) and so on; and from the fact that many languages organize their words into bisyllabic units, as with English stress patterns (trochees, iambs). From there, it may be that people are misproducing no to fit this two-syllable template, and that all occurrences of nope actually have a strongly released p, meaning the mouth pops open and some aspirated sound is made, making it almost like Nopahhhh. Why the p instead of some other sound? The answer would be similar to the above: the rounding of the o leads to a labial (lip) consonant.

A similar hypothesis would be something along the lines that human speech is ultimately based on the opening and closing motions of our jaw (see the work of MacNeilage and Davis). No, when it occurs in isolation, is the end of a phrase; phrase endings may be generally associated with a mouth-closing motion, hence a p. These don't strike me as very satisfactory explanations: Why not b? Why not nopa? Where's the evidence for phrase-final effects of this sort? Regardless, misproduction should be considered.

Another alternative is sound symbolism, which is what people are alluding to when they talk about the feeling of finality associated with the p. Consider the following English words: clasp, clap, slap, crop, blip, stop, stomp, lop. They all end in p and they all have a sort of abrupt finality associated with their meaning. A blip both begins and ends abruptly; or juxtapose clap with something like whoosh. It's not that p is explicitly associated with that meaning in English (e.g., cap), but rather that it's suggestive. It's difficult to know what to do with these phonaesthemes in terms of understanding human grammar, but it's pretty clear they affect people psychologically in terms of processing language.

In some cases, there is a definite sound-symbolic component to the sound-meaning pairing; for example, here, p is articulated abruptly, or consider slip, slither, slide and the acoustics of sl, which sounds sort of slithery. Other times, it can be a more arbitrary pairing, e.g., glitter, glimmer, glint, glow (light has nothing to do with the gl sound).

So, the final-p phonaestheme in English could have led to the nope and yep sound changes as a way of indicating finality. The challenging part here is that it's easy to identify phonaesthemes in a language by looking at a dictionary, but it's hard to pinpoint how they arise and how they impact sound change.

***

So, a few hypotheses here. How might we distinguish between these options?

Well, first I'd be remiss if I didn't reference the Language Log "Yep and nope" post, which notes that nope and yep pretty much always occur alone in a phrase. Phrases like "Well, nope" or "Yep, we can!" don't seem to occur very often. I believe this argues in favor of an explanation that relates to the phonetics of phrase-final position.

Another thing we can do is look at when nope and yup started appearing in the English language. If nope came far earlier, then it would suggest the acoustics of no are important. Recall that I suggested that yup came about by analogy with nope. There is no phonetic way to get p from yes. So, if they came about in tandem, it would suggest sound symbolism. If yup came first, it's back to the drawing board. What sayeth the OED? Nope is first attested in 1888. Yup is 1906, a decent gap. But wait, what have we here? Yep, attested 1891. Attestation dates this close together are tough to interpret, particularly for colloquialisms, but they're interesting nevertheless.
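
For readers who like to see the logic spelled out, here is a minimal Python sketch of that reasoning. The years are the OED dates quoted above; the function name `interpret` and the five-year `TANDEM_WINDOW` are assumptions made up for illustration, not any standard linguistic threshold.

```python
# First-attestation years as quoted from the OED in the text.
ATTESTED = {"nope": 1888, "yep": 1891, "yup": 1906}

# Hypothetical cutoff: how many years apart still counts as "in tandem".
# This number is an assumption for illustration only.
TANDEM_WINDOW = 5


def interpret(nope_year: int, yes_variant_year: int, window: int = TANDEM_WINDOW) -> str:
    """Map the ordering of two first attestations onto the reasoning in the text."""
    gap = yes_variant_year - nope_year
    if abs(gap) <= window:
        return "in tandem: suggests sound symbolism"
    if gap > 0:
        return "nope much earlier: suggests the acoustics of no matter"
    return "yes-variant earlier: back to the drawing board"


for variant in ("yep", "yup"):
    print(variant, "->", interpret(ATTESTED["nope"], ATTESTED[variant]))
```

With this particular window, yep lands in the "in tandem" bucket while yup lands in the "nope much earlier" bucket, and nudging the window changes the verdict. That sensitivity is another way of seeing why dates this close are hard to read.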

We can look at the production question by looking at whether there is, indeed, any sign of lip-closing for words at the end of a phrase. Seems unlikely to me...

Finally, we can look at the acoustics of no in isolation, particularly the existence of the glottal stop. Some object to the idea, though the comments on the Language Log post confirmed a glottalized p. Also, the fact that nope can only be used in isolation syntactically distinguishes it from no at the beginning of a sentence, which would not have a glottal stop. Ultimately, it's a question we can answer by measurement: collect as many utterances containing only no as possible, examine the acoustics, and see whether they suggest any sort of glottal stop or, better yet, try to get an image of the glottis in action.