The White House disputed that President Donald Trump told The Wall Street Journal in an interview Thursday that “I probably have a very good relationship with Kim Jong Un of North Korea,” saying that Mr. Trump had instead said “I’d probably have a very good relationship” with the North Korean leader.

The Journal stands by what it reported. The Journal and White House agreed before the interview that audiotape taken by White House officials and reporters would be used for transcription purposes only. After the White House challenged the Journal’s transcription and accuracy of the quote in a story, The Journal decided to release the relevant portion of the audio. The White House then released its audio version of the contested segment.

Here's the relevant audio:

[Audio clip]

Zeroing in on the crucial region:

[Audio clip]

And a spectrogram:

The "phoneme restoration effect" means that I (like many people) can hear this passage either way, if I start from one preconception or the other.

The main acoustic cues for the place of articulation of a stop consonant following a vowel are the formant transitions into the closure. In particular, at the end of a vowel, the classical analysis says that we should see F2 falling into the closure if the following consonant were the labial /p/. But in this case, we see it rising, perhaps toward the "locus" of about 1800 or 1900 Hz that we classically expect for a following coronal /d/:

Lin's sketch of the F2 transition in Trump's "I ? probably" is in green on the right:
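
For readers who want to try the same comparison on their own measurements, here is a minimal sketch of the transition-direction cue described above: fit a straight line to an F2 track over the final stretch of the vowel and look at the sign of the slope. The function names, the `flat_threshold` parameter, and the sample values are all hypothetical and chosen for illustration; they are not measurements from this recording, and the formant tracks themselves would have to come from a separate analysis tool.

```python
def f2_slope_hz_per_ms(times_ms, f2_hz):
    """Least-squares slope of an F2 track, in Hz per millisecond."""
    n = len(times_ms)
    if n < 2 or n != len(f2_hz):
        raise ValueError("need two or more paired (time, F2) samples")
    mean_t = sum(times_ms) / n
    mean_f = sum(f2_hz) / n
    sxx = sum((t - mean_t) ** 2 for t in times_ms)
    if sxx == 0:
        raise ValueError("time samples must not all be identical")
    sxy = sum((t - mean_t) * (f - mean_f) for t, f in zip(times_ms, f2_hz))
    return sxy / sxx


def transition_direction(times_ms, f2_hz, flat_threshold=1.0):
    """Label the F2 movement into the closure.

    flat_threshold (Hz/ms) is an arbitrary illustrative cutoff,
    not a value taken from the phonetics literature.
    """
    slope = f2_slope_hz_per_ms(times_ms, f2_hz)
    if slope > flat_threshold:
        return "rising"
    if slope < -flat_threshold:
        return "falling"
    return "flat"


# Invented sample track: F2 climbing over the last 30 ms of a vowel.
example_times = [0, 10, 20, 30]
example_f2 = [1500, 1600, 1700, 1800]
print(transition_direction(example_times, example_f2))
```

A "rising" label is what the classical analysis associates with a following coronal, and "falling" with a following labial, but the slope alone cannot separate a consonantal transition from movement that belongs to the vowel itself.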

However, there's a problem here.

Trump's pronunciation of the pronoun "I" in general is a rising diphthong something like [ɐɪ], and thus will tend to have a rising F2 anyhow. And everything in this example goes by pretty fast. The whole of the [ɐɪ], from the release of the final /n/ in "and" to the following closure, whether of /d/ or of /p/, is just about 82 milliseconds long. This is not much time for opening transitions, a diphthongal vowel, and closing transitions all to play out, as we might see in a careful citation-form pronunciation of (say) "tie Paul" vs. "tied Paul".

Everything here is heavily co-articulated — much more so than in Lin's example of "I've" on the left, where the open portion of "I" is about 137 milliseconds.

And the recording SNR is low enough that it's hard to tell how long voicing continues into the 85 milliseconds or so of stop gap between the closure of "I" and the release of the /p/ in "probably".

So I don't think that the acoustic (or perceptual) analysis of this one passage is determinative.

We might be able to do better by looking at a collection of examples of Mr. Trump saying comparable things, and examining spectral variation at the ends of his first person singular pronouns, the durations of stop gaps in relevant instances of /dp/ vs. /p/, etc. His extensive archive of interviews and speeches will offer plenty of examples. But I don't have time for that right now — I'm in the middle of a journey from Bangalore to Philadelphia, temporarily stranded in London after my BLR to LHR flight was six hours late.

Update — a correspondent observes:

To my mind the most important piece of disambiguating evidence isn't phonetic, but what he says after the quoted segment:

And I (’d) probably have a very good relationship
with Kim Jong Un of North Korea. I would- I have relationships with people, I think you people are surprised.

[Audio clip]

Indeed it's clear that he starts to repeat himself, but breaks off after clearly articulating "I would-". This certainly makes it seem more plausible that he intended a similar irrealis modal in the previous phrase.

Colin Zwanziger said,

tangent said,

On top of what your correspondent says about "I would", which is good evidence, consider the "probably". That's more consistent with irrealis.

If I'm talking about a present actual relationship, I'd probably just say it's good, period. My relationship is good, or it's not, unless I'm into self-doubt and hedging, which this speaker is not. For him to include "probably" suggests the relationship is not here to be evaluated.

FM said,

Mind you, he says 'you people are surprised' where one would clearly expect 'you people would be surprised' (by my relationships with people, if you knew them).
I don't have time to dig, but I remember seeing/hearing other instances of Trump not using irrealis mood where it was clearly meant. So whether or not a 'd is pronounced may not be so decisive…

Andreas Johansson said,

I agree with tangent: "I probably have a very good relationship" is an inherently unlikely thing to say, particularly for a man little given to self-doubt, so I feel pretty confident that he said (or at least meant to say) the version with 'd.

Saurs said,

Trump's idiosyncratic pseudo-humble "probably" (he is forever probably the greatest statesman his interlocutor has ever met, the least racist, the most fit, and so forth) is certainly IRR, but he has a notably shaky grasp on how Anglophones typically express themselves subjunctively. He frequently omits modals and auxiliaries and enjoys inhabiting a kind of beatific flowing present from which he may pluck ideas about the future and experiences from the past without the bother of a grammatical change to mark the departure of time and tense. It's normally as simple as a pause or stammer, usually rendered by the print press in an en dash or as an ellipsis (though he rarely trails off in the traditional sense, more's the pity).

Saurs said,

As an idea, of course, it's characteristic besides. He is always the best friend of whomever he last spoke with or about; cf. the existence of Norway popping into his head, shortly after meeting Solberg, when he wants to slag off other countries that are not the US. He thinks alliances are forged over a single wedge of chocolate cake or the feeding of fish or a phone call that, unbeknownst to him, irritates the other person on the line and sends the aides of both sides scrambling. By definition, his approach is better than his predecessors', so of course his friendship is superior, even in the here and now, for a person who knows no consequences because he's not mentally present enough to connect current conditions with prior behavior. Obama is bad, so things connected to him are bad. He, himself, is good, so all things that flow from him are good.

Bloix said,

First, I don’t hear anything remotely like a d. Second, Trump lies as easily as most people breathe, so any claim that he makes is worthless. Third, Trump has a motive to lie about this, while the WSJ has none. Fourth, “I’d” makes no sense. I’d if what?

"He frequently omits modals and auxiliaries and enjoys inhabiting a kind of beatific flowing present from which he may pluck ideas about the future and experiences from the past without the bother of a grammatical change to mark the departure of time and tense."

Sentence of the day!

stephen said,

Bloix said,

"WSJ reporters asked if that meant he talks directly to Kim, which would be a major departure from US policy over the past decade. Trump replied: “I don’t want to comment on it. I’m not saying I have or haven’t. I just don’t want to comment.”"

KB said,

I am reminded of one of Trump's victory speeches, in which he was reported as saying "I thought I lost". My reaction at the time was "Shouldn't that be 'had lost'?" Perhaps a barely-there "'d" can be called upon for this too?

Frank Southworth said,

Context! We need the preceding context to make sense of this bit of sound. As Bloix says: "I’d if what?". On the other hand, while I'm ideologically open to the belief that Trump lied in this case as in many others, I would vote for "I'd" on the basis of what I hear.

Pflaumbaum said,

John Swindle said,

He said something that we can interpret in a way that makes sense or a way that makes no sense. Which should we choose? He's not saying he has a good relationship with his North Korean counterpart. He's saying that under some unspecified circumstances he could have such a relationship. That's all. It's not a huge or even unlikely claim, and it may even display a glimmer of goodwill.

Feynmaniac said,

Frank Southworth,
"Context! We need the preceding context to make sense of this bit of sound. "

Indeed, look what came before,

"With that being said, President Xi has been extremely generous with what he's said, I like him a lot…
I have a great relationship with him, as you know I have a great relationship with Prime Minister Abe of Japan and [I/I'd] probably have a very good relationship with Kim Jong Un of North Korea…
I would- I have relationships with people."

Trump was talking in the present tense about his relationship with Abe and Xi. He said "I have" twice immediately before his statement about Kim Jong Un and once immediately after. Given this, the acoustic ambiguity, and Trump's history of outrageous statements, it's easy to see why someone would interpret it as "I" rather than "I'd".

Trump's (alleged) intended meaning is still ridiculous though.

Andrew Usher said,

I can only believe after listening (I of course have no access to the formant analysis or whatever) that he said "I'd", and I doubt you would hear his office protest so much about a minor transcriptional error if they were wrong.

What Trump meant, though, is another thing, and I can't really determine from this and probably couldn't even with full context, given Trump's style.

If I'd listened (which could as easily be 'I listened' with no important change of meaning) to the passage without this information, I don't think I would have noticed this possible ambiguity at all. Instead, I would have heard (as I did anyway) the strange articulation of the name 'Kim Jong Un' and the non-rhoticity of the following 'North' [noəθ].