In the word "Sydney" - the first Y is a vowel, and the second Y is a consonant.

No. The first Y is a vowel (it represents a vowel sound) and the second is part of the vowel sound represented by the letters EY. The final Y is not a consonant here because it is not pronounced as a consonant. Likewise with Kay, May and Kuykendahl.

It's probably easiest with examples. Y is clearly the vowel in words like fly, cry or crypt. Y is the first consonant in words like you and yes. The other vowels aren't able to function as consonants the way the y in yes does.

That's not a vowel sound followed by a consonant sound; it's a diphthong -- the vowel "ah" followed by the vowel "ee". There's actually no such single vowel as "long I"; it's two different vowels blended together.

Those are similar, but different sounds. "Vowel" and "consonant" are properties of sounds, not letters. It just so happens that we write those sounds down with letters, and sometimes forget that there's a difference when they don't match up exactly.

"Yet" has three distinct sounds in it. A "y" sound, and "e" sound, and a "t" sound. The Y (and the T) are obviously consonants because it's impossible to pronounce them in isolation. If you try, you're saying something like "yuh" and "tuh" with a very quick neutral vowel at the end. If you try and sustain the sound of "YYYYYYYYYYYYY", notice that you're actually pronouncing it as a vowel then, which is not what occurs at the beginning of "yet".

"Fly" is a little trickier to analyze, but if you listen carefully there are four sounds. A "f" sound, an "l" sound, and then an "a" sound that smoothly transitions into an "e" sound. When two vowels occur in sequence as a single syllable like this, they blend together to form a new kind of vowel sound called a diphthong. It's not really two vowels smashed together, it's a single vowel where you move your tongue to a different position in the middle of voicing it. That sound is being written down here with the letter Y, but it's not a vowel sound followed by a consonant sound, it's a single sound that consists of two vowels that glide together.

If you're unconvinced, say "fly" aloud, sustain the sound at the end, and then say "yet". You'll notice, as you already have, that the two words appear to "fit together" perfectly, which I assume is why you think they're the same sound. "Flyyyyyet" sounds perfectly normal. But now do that same thing, except try to not say the "fl" part aloud. If you pay attention, you'll find yourself saying something that sounds like "eeeeeeeyet", which is definitely not the same as "yyyyet". It's not even possible to hold the sound at the beginning of "yet", since that's a consonant which requires a sharp, sudden exhale to say. The point of doing this is that the "e" sound that appeared out of nowhere (and is not part of the beginning of the sound of the word "yet") is actually the second half of the diphthong at the end of "fly". But the Y at the end of "fly" is both halves of that diphthong...

It might help to know that the words "vowel" and "consonant" actually have two meanings.

The first meaning of "vowel" is any sound made with the mouth open and unobstructed by the tongue, teeth, etc. By this definition, the "ea", "aa" and "aw" sounds in "leap", "baa" and "law" are all vowels. All other sounds are consonants.

The second meaning of "vowel" is any letter that represents a vowel sound, and of "consonant" is any letter that represents a consonant sound. So when "y" represents the vowel sound in "bit", as it does in the first syllable of "rhythm", it is a vowel, and when it represents the sound at the start of "yes", it is a consonant. Hence "y" is sometimes a vowel and sometimes a consonant.

The letters A, E, I, O and U almost never represent consonant sounds (the U in "quick" is one of the rare exceptions), so these are treated as always vowels, and hence are considered the only vowels in the English alphabet.
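
If it helps to see the letter-based definition spelled out mechanically, here is a rough Python sketch. It is only a spelling heuristic I'm assuming for illustration, not a real phonetic analysis: it calls a Y a consonant when the next letter is A, E, I, O or U, and a vowel otherwise. The name classify_y is made up for this example, and the rule will get some words wrong (for instance "eye").

```python
# Hypothetical spelling heuristic, not a phonetic analysis:
# a Y directly followed by a, e, i, o or u is treated as a consonant
# (as in "yes"); any other Y is treated as a vowel (as in "fly").
VOWEL_LETTERS = set("aeiou")


def classify_y(word):
    """Return a list of (index, role) pairs, one for each Y in the word."""
    w = word.lower()
    roles = []
    for i, ch in enumerate(w):
        if ch != "y":
            continue
        next_is_vowel = i + 1 < len(w) and w[i + 1] in VOWEL_LETTERS
        roles.append((i, "consonant" if next_is_vowel else "vowel"))
    return roles


if __name__ == "__main__":
    for example in ["yes", "you", "fly", "crypt", "rhythm", "Sydney", "May"]:
        print(example, classify_y(example))
```

On the words used in this thread it agrees with the answers above: the Y in "yes" and "you" comes out as a consonant, the Y in "fly", "crypt" and "rhythm" as a vowel, and both Ys in "Sydney" as vowels.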

Looking at this from a vocal production standpoint, vowels do not stop the vibrations leaving the mouth. They also lack an articulation point, making them more musical than percussive.
Consonants, on the other hand, shape the vibrations of the voice by stopping and starting the air, or in the case of fricative consonants (F, V, S) by forcing the vibration through a very small space. With consonants there is always a point of articulation (the lips, the teeth, the tip of the tongue).
When pronouncing "yes", "yet" and "yell", the back of the tongue meets the soft palate, and when it releases we get the "y" sound. That point of initiating articulation is the mark of a consonant.