9tabs

Evil Coding Incantations

Ever since I watched the revered Wat video by Gary Bernhardt, I’ve been fascinated with the strange behavior of certain programming languages. Some programming languages have more unexpected behaviors than others. Java, for example, has a whole book dedicated to its edge cases and peculiarities. For C++’s equivalent, you can refer to the C++ specification itself for just 200 USD. 😑

What follows is a collection of my favorite surprising, humorous, and yet valid incantations. Generally speaking, taking advantage of these peculiar behaviors is considered evil since your code should be anything but surprising. Thankfully, there are many linters that are primed and ready to make fun of you if you try most of the following tomfoolery. All that being said, knowledge is power, so let’s begin.

In Python 2, True and False were ordinary built-in names, so a line like True = False was perfectly legal. Thankfully, this yields a SyntaxError in Python 3 since True, False, and None are now reserved words. This eccentricity is still far less evil than the C++ prank of sneaking #define true false into a standard header file on your coworker’s development machine.
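The Python 3 side of this is easy to confirm without crashing anything. Here is a minimal sketch; can_assign_to is a helper name I made up for illustration, and all it does is ask the compiler whether an assignment to a given name is legal.

```python
def can_assign_to(name):
    """Return True if `name = 0` compiles, meaning the name can be rebound."""
    try:
        # compile() only parses the statement; nothing is actually assigned.
        compile(f"{name} = 0", "<demo>", "exec")
    except SyntaxError:
        return False
    return True


if __name__ == "__main__":
    for name in ("True", "False", "None", "true"):
        print(name, can_assign_to(name))
```

On Python 3 the three reserved words should be rejected, while the lowercase true is accepted because it is just an ordinary identifier.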

The semantics of == are often perplexing for beginning Java programmers, and the operator’s inconsistency in even trivial scenarios complicates matters further, even if the performance benefits are worth it.

When an int is boxed into an Integer, the JVM will use the same reference for values in the range [-128, 127], so == on two boxed values only behaves like a value comparison inside that range. What’s even stranger is the equivalent behavior in Python.

>>> x = 256
>>> y = 256
>>> x is y
True
>>> x = 257
>>> y = 257
>>> x is y
False

Nothing too surprising so far.

>>> x = -5
>>> y = -5
>>> x is y
True
>>> x = -6
>>> y = -6
>>> x is y
False
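Rather than poking at the boundaries one value at a time, a short script can hunt for them. This is a rough sketch that assumes CPython, where the behavior is an implementation detail rather than a language guarantee; is_cached and cached_bounds are names I invented for the example.

```python
def is_cached(n):
    """Return True if two ints equal to n, each built at run time, are one object."""
    text = str(n)
    # Parsing the text twice forces two separate constructions, so the
    # compiler has no chance to merge them into a single constant.
    first = int(text)
    second = int(text)
    return first is second


def cached_bounds(candidates):
    """Return (lowest, highest) of the candidates that appear to be cached."""
    hits = [n for n in candidates if is_cached(n)]
    return (min(hits), max(hits)) if hits else None


if __name__ == "__main__":
    print(cached_bounds(range(-1000, 1000)))
```

The script scans a window of a couple thousand integers and reports the smallest and largest ones that come back as the same object. Other interpreters, or future CPython releases, are free to answer differently.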

It seems the lower limit for the Python interpreter to reuse the same instance is… -5. Integers in the range [-5, 256] get the same IDs. It somehow gets weirder still.

>>> x = -10
>>> y = -10
>>> x is y
False
>>> x, y = [-10, -10]
>>> x is y
True

It seems that using destructuring assignment changes the rules here. I’m not sure why this is, and I actually have a Stack Overflow question open to try to understand it. My guess is that repeated values in a list point to the same object to save memory.
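One way to dig further, assuming CPython: each line typed at the REPL is compiled on its own, and as far as I can tell equal constants inside a single compiled unit are stored once. The sketch below pokes at that idea by running a snippet as one unit and checking whether x and y end up as the same object. same_object_after is a made-up helper, and whatever it reports is an implementation detail that other interpreters or versions may not share.

```python
def same_object_after(source):
    """Compile and run `source` as one unit; report whether x and y are one object."""
    namespace = {}
    exec(compile(source, "<demo>", "exec"), namespace)
    return namespace["x"] is namespace["y"]


if __name__ == "__main__":
    # Two assignments compiled together, unlike two separate REPL lines.
    print(same_object_after("x = -10\ny = -10"))
    # The destructuring form from the session above.
    print(same_object_after("x, y = [-10, -10]"))
    # Two list displays always build two distinct objects.
    print(same_object_after("x = []\ny = []"))
```

If the first two snippets agree with each other, the list itself probably isn't what makes the difference; what matters is whether both values were compiled together.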

The reversed subscript notation gives any developer an instant headache.

unsigned int x[1] = {0xdeadbeef}; // unsigned so the value fits and matches %x
printf("%x\n", 0[x]); // prints deadbeef

The reason this works is that array[index] is really just syntactic sugar for *(array + index). Thanks to the commutative property of addition, we can swap the array and the index and get the same result.

The sizeof operator is a compile-time operator, which gives it interesting properties.

int x = 0;
sizeof(x += 1);
if (x == 0) {
    printf("wtf?"); // this will be printed
}

Since sizeof is evaluated at compile time, its operand is never executed, so the expression (x += 1) never runs. (Variable-length arrays in C99 are the one exception.) Also interesting: studies revealed that printf("wtf?") is the most popular line of code that never gets pushed.

/r/programminghumor has been having fun with “indexing starts at 1” memes. Shockingly, there are plenty of programming languages that actually sport 1-indexed arrays. A more comprehensive list can be found here.

For historical reasons, there are alternatives to the non-alphanumeric symbols in C.

Trigraph  Symbol    Digraph  Symbol    Token    Symbol
??=       #         <:       [         %:%:     ##
??/       \         :>       ]         compl    ~
??'       ^         <%       {         not      !
??(       [         %>       }         bitand   &
??)       ]         %:       #         bitor    |
??!       |                            and      &&
??<       {                            or       ||
??>       }                            xor      ^
??-       ~                            and_eq   &=
                                       or_eq    |=
                                       xor_eq   ^=
                                       not_eq   !=

// In C, `and` comes from <iso646.h> and `true` from <stdbool.h>; C++ has both built in.
if (true and true) { // same as if (true && true)
    printf("thanks, c");
}

Some foreign equipment, such as the IBM 3270, did not supply some of the symbols commonly used in C and C++, so digraphs, trigraphs, and alternative tokens were provided so as not to discriminate against particular character sets.

I hope this article was interesting. You can follow the discussion on reddit here.