Launching the Wolfram Challenges Site

April 12, 2018

The more one does computational thinking, the better one gets at it. And today we’re launching the Wolfram Challenges site to give everyone a source of bite-sized computational thinking challenges based on the Wolfram Language. Use them to learn. Use them to stay sharp. Use them to prove how great you are.

The Challenges typically have the form: “Write a function to do X”. But because we’re using the Wolfram Language—with all its built-in computational intelligence—it’s easy to make the X be remarkably sophisticated.

The site has a range of levels of Challenges. Some are good for beginners, while others will require serious effort even for experienced programmers and computational thinkers. Typically each Challenge has at least some known solution that’s at most a few lines of Wolfram Language code. But what are those lines of code?

There may be many different approaches to a particular Challenge, leading to very different kinds of code. Sometimes the code will be smaller, sometimes it will run faster, and so on. And for each Challenge, the site maintains a leaderboard that shows who’s got the smallest, the fastest, etc. solution so far.

What does it take to be able to tackle Challenges on the site? If you’ve read my An Elementary Introduction to the Wolfram Language, for example, you should be well prepared—maybe with some additional help on occasion from the main Wolfram Language documentation. But even if you’re more of a beginner, you should still be able to do simpler Challenges, perhaps looking at parts of my book when you need to. (If you’re an experienced programmer, a good way to jump-start yourself is to look at the Fast Introduction for Programmers.)

How It Works

There are lots of different kinds of Challenges on the site. Each Challenge is tagged with topic areas. And on the front page there are a number of “tracks” that you can use as guides to sequences of related Challenges. Here are the current Challenges in the Real-World Data track:

Click one you want to try—and you’ll get a webpage that explains the Challenge:

Now you can choose either to download the Challenge notebook to the desktop, or just open it directly in your web browser in the Wolfram Cloud. (It’s free to use the Wolfram Cloud for this, though you’ll have to have a login—otherwise the system won’t be able to give you credit for the Challenges you’ve solved.)

Here’s the cloud version of this particular notebook:

You can build up your solution in the Scratch Area, and try it out there. Then when you’re ready, put your code where it says “Enter your code here”. Then press Submit.

What Submit does is to send your solution to the Wolfram Cloud—where it’ll be tested to see if it’s correct. If it’s not correct, you’ll get something like this:

But if it’s correct, you’ll get this, and you’ll be able to go to the leaderboard and see how your solution compared to other people’s. You can submit the same Challenge as many times as you want. (By the way, you can pick your name and icon for the leaderboard from the Profile tab.)

The Range of Challenges

The range of Challenges on the site is broad both in terms of difficulty level and topic. (And, by the way, we’re planning to progressively grow the site, not least through material from outside contributors.)

Here’s an example of a simple Challenge, that for example I can personally solve in a few seconds:

Here’s a significantly more complicated Challenge, that took me a solid 15 minutes to solve at all well:

Some of the Challenges are in a sense “pure algorithm challenges” that don’t depend on any outside data:

And some of the Challenges are “math-y”, and make use of the math capabilities of the Wolfram Language:

Pre-launch Experience

We’ve been planning to launch a site like Wolfram Challenges for years, but it’s only now, with the current state of the Wolfram Cloud, that we’ve been able to set it up as we have today—so that anyone can just open a web browser and start solving Challenges.

Still, we’ve had unannounced preliminary versions for about three years now—complete with a steadily growing number of Challenges. And in fact, a total of 270 people have discovered the preliminary version—and produced in all no less than 11,400 solutions. Some people have solved the same Challenge many times, coming up with progressively shorter or progressively faster solutions. Others have moved on to different Challenges.

It’s interesting to see how diverse the solutions to even a single Challenge can be. Here are word clouds of the functions used in solutions to three different Challenges:

And when it comes to lengths of solutions (here in characters of code), there can be quite a variation for a particular Challenge:

Here’s the distribution of solution lengths for all solutions submitted during the pre-launch period, for all Challenges:

It’s not clear what kind of distribution this is (though it seems close to lognormal). But what’s really nice is how concentrated it is on solutions that aren’t much more than a line long. (81% of them would even fit in a 280-character tweet!)

And in fact what we’re seeing can be viewed as a great tribute to the Wolfram Language. In any other programming language most Challenges—if one could do them at all—would take pages of code. But in the Wolfram Language even sophisticated Challenges can often be solved with just tweet-length amounts of code.

Why is this? Well, basically it’s because the Wolfram Language is a different kind of language: it’s a knowledge-based language where lots of knowledge about computation and other things is built right into the language (thanks to 30+ years of hard work on our part).

But then are the Challenges still “real”? Of course! It’s just that the Wolfram Language lets one operate at a higher level. One doesn’t have to worry about writing out the low-level mechanics of how even sophisticated operations get implemented—one can just concentrate on the pure high-level computational thinking of how to get the Challenge done.

Under the Hood

OK, so what have been some of the challenges in setting up the Wolfram Challenges site? Probably the most important is how to check whether a particular solution is correct. After all, we’re not just asking to compute some single result (say, 42) that we can readily compare with. We’re asking to create a function that can take a perhaps infinite set of possible arguments, and in each case give the correct result.

So how can we know if the function is correct? In some simple cases, we can actually see if the code of the function can be transformed in a meaning-preserving way into code that we already know is correct. But most of the time—like in most practical software quality assurance—the best thing to do is just to try test cases. Some will be deterministically chosen—say based on checking simple or corner cases. Others can be probabilistically generated.

But in the end, if we find that the function isn’t correct, we want to give the user a simple case that demonstrates this. Often in practice we may first see failure in some fairly complicated case—but then the system tries to simplify the failure as much as possible.

OK, so another issue is: how does one tell whether a particular value of a function is correct? If the value is just something like an integer (say, 343) or a string (say, “hi”), then it’s easy. But what if it’s an approximate number (say, 3.141592…)? Well, then we have to start worrying about numerical precision. And what if it’s a mathematical expression (say, 1 + 1/x)? What transformations should we allow on the expression?

There are many other cases too. If it’s a network, we’ll probably want to say it’s correct if it’s isomorphic to what we expect (i.e. the same up to relabeling nodes). If it’s a graphic, we’ll probably want to say it’s correct if it visually looks the same as we expected, or at least is close enough. And if we’re dealing with real-world data, then we have to make sure to recompute our expected result, to take account of data in our knowledgebase that’s changed because of changes out there in the real world.

Alright, so let’s say we’ve concluded that a particular function is correct. Well now, to fill in the leaderboard, we have to make some measurements on it. First, how long is the code?

We can just format the code in InputForm, then count characters. That gives us one measure. One can also apply ByteCount to just count bytes in the definition of the function. Or we can apply LeafCount, to count the number of leaves in the expression tree for the definition. The leaderboard separately tracks the values for all these measures of “code size”.

OK, so how about the speed of the code? Well, that’s a bit tricky. First because speed isn’t something abstract like “total number of operations on a Turing machine”—it’s actual speed running a computer. And so it has be normalized for the speed of the computer hardware. Then it has to somehow discard idiosyncrasies (say associated with caching) seen in particular test runs, as achieved by RepeatedTiming. Oh, and even more basically, it has to decide which instances of the function to test, and how to average them. (And it has to make sure that it won’t waste too much time chasing an incredibly slow solution.)

Well, to actually do all these things, one has to make a whole sequence of specific decisions. And in the end what we’ve done is to package everything up into a single “speed score” that we report in the leaderboard.

A final metric in the leaderboard is “memory efficiency”. Like “speed score”, this is derived in a somewhat complicated way from actual test runs of the function. But the point is that within narrow margins, the results should be repeatable between identical solutions. (And, yes, the speed and memory leaderboards might change when they’re run in a new version of the Wolfram Language, with different optimizations.)

Backstory

We first started testing what’s now the Wolfram Challenges site at the Wolfram Summer School in 2016—and it was rapidly clear that many people found the kinds of Challenges we’d developed quite engaging. At first we weren’t sure how long—and perhaps whimsical—to make the Challenges. We experimented with having whole “stories” in each Challenge (like some math competitions and things like Project Euler do). But pretty soon we decided to restrict Challenges to be fairly short to state—albeit sometimes giving them slightly whimsical names.

We tested our Challenges again at the 2017 Wolfram Summer School, as well as at the Wolfram High School Summer Camp—and we discovered that the Challenges were addictive enough that some people systematically went through trying to solve all of them.

We were initially not sure what forms of Challenges to allow. But after a while we made the choice to (at least initially) concentrate on “write a function to do X”, rather than, for example, just “compute X”. Our basic reason was that we wanted the solutions to the Challenges to be more open-ended.

If the challenge is “compute X”, then there’s typically just one final answer, and once you have it, you have it. But with “write a function to do X”, there’s always a different function to write—that might be faster, smaller, or just different. At a practical level, with “compute X” it’s easier to “spoil the fun” by having answers posted on the web. With “write a function”, yes, there could be one version of code for a function posted somewhere, but there’ll always be other versions to write—and if you always submit versions that have been seen before it’ll soon be pretty clear you have to have just copied them from somewhere.

As it turns out, we’ve actually had quite a bit of experience with the “compute X” format. Because in my book An Elementary Introduction to the Wolfram Language all 655 exercises are basically of the form “write code to compute X”. And in the online version of the book, all these exercises are automatically graded.

Now, if we were just doing “cheap” automatic grading, we’d simply look to see if the code produces the correct result when it runs. But that doesn’t actually check the code. After all, if the answer was supposed to be 42, someone could just give 42 (or maybe 41 + 1) as the “code”.

Our actual automatic grading system is much more sophisticated. It certainly looks at what comes out when the code runs (being careful not to blindly evaluate Quit in a piece of code—and taking account of things like random numbers or graphics or numerical precision). But the real meat of the system is the analysis of the code itself, and the things that happen when it runs.

Because the Wolfram Language is symbolic, “code” is the same kind of thing as “data”. And the automatic grading system makes extensive use of this—not least in applying sequences of symbolic code transformations to determine whether a particular piece of code that’s been entered is equivalent to one that’s known to represent an appropriate solution. (The system has ways to handle “completely novel” code structures too.)

Code equivalence is a difficult (in fact, in general, undecidable) problem. A slightly easier problem (though still in general undecidable) is equivalence of mathematical expressions. And a place where we’ve used this kind of equivalence extensively is in our Wolfram Problem Generator:

Of course, exactly what equivalence we want to allow may depend on the kind of problem we’re generating. Usually we’ll want 1 + x and x + 1 to be considered equivalent. But (1 + x)/x might or might not want to be considered equivalent to 1 + 1/x. It’s not easy to get these things right (and many online grading systems do horribly at it). But by using some of the sophisticated math and symbolic transformation capabilities available in the Wolfram Language, we’ve managed to make this work well in Wolfram Problem Generator.

Contribute New Challenges!

The Wolfram Challenges site as it exists today is only the beginning. We intend it to grow. And the best way for it to grow—like our long-running Wolfram Demonstrations Project—is for people to contribute great new Challenges for us to include.

At the bottom of the Wolfram Challenges home page you can download the Challenges Authoring Notebook:

Fill this out, press “Submit Challenge”—and off this will go to us for review.

Beyond Challenges

I’m not surprised that Wolfram Challenges seem to appeal to people who like solving math puzzles, crosswords, brain teasers, sudoku and the like. I’m also not surprised that they appeal to people who like gaming and coding competitions. But personally—for better or worse—I don’t happen to fit into any of these categories. And in fact when we were first considering creating Wolfram Challenges I said “yes, lots of people will like it, but I won’t be one of them”.

Well, I have to say I was wrong about myself. Because actually I really like doing these Challenges—and I’m finding I have to avoid getting started on them because I’ll just keep doing them (and, yes, I’m a finisher, so there’s a risk I could just keep going until I’ve done them all, which would be a very serious investment of time).

So what’s different about these Challenges? I think the answer for me is that they feel much more real. Yes, they’ve been made up to be Challenges. But the kind of thinking that’s needed to solve them is essentially just the same as the kind of thinking I end up doing all the time in “real settings”. So when I work on these Challenges, I don’t feel like I’m “just doing something recreational”; I feel like I’m honing my skills for real things.

Now I readily recognize that not everyone’s motivation structure is the same—and many people will like doing these Challenges as true recreations. But I think it’s great that Challenges can also help build real skills. And of course, if one sees that someone has done lots of these Challenges, it shows that they have some real skills. (And, yes, we’re starting to use Challenges as a way to assess applicants, say, for our summer programs.)

It’s worth saying there are some other nice “potentially recreational” uses of the Wolfram Language too.

One example is competitive livecoding. The Wolfram Language is basically unique in being a language in which interesting programs can be written fast enough that it’s fun to watch. Over the years, I’ve done large amounts of (non-competitive) livecoding—both in person and livestreamed. But in the past couple of years we’ve been developing the notion of competitive livecoding as a kind of new sport.

We’ve done some trial runs at our Wolfram Technology Conference—and we’re working towards having robust rules and procedures. In what we’ve done so far, the typical challenges have been of the “compute X” form—and people have taken between a few seconds and perhaps ten minutes to complete them. We’ve used what’s now our Wolfram Chat functionality to distribute Challenges and let contestants submit solutions. And we’ve used automated testing methods—together with human “refereeing”—to judge the competitions.

A different kind of recreational application of the Wolfram Language is our Tweet-a-Program service, released in 2014. The idea here is to write Wolfram Language programs that are short enough to fit in a tweet (and when we launched Tweet-a-Program that meant just 128 characters)—and to make them produce output that is as interesting as possible:

We’ve also had a live analog of this at our Wolfram Technology Conference for some time: our annual One-Liner Competition. And I have to say that even though I (presumably) know the Wolfram Language well, I’m always amazed at what people actually manage to do with just a single line of Wolfram Language code.

At our most recent Wolfram Technology Conference, in recognition of our advances in machine learning, we decided to also do a “Machine-Learning Art Competition”—to make the most interesting possible restyled “Wolfie”:

In the future, we’re planning to do machine learning challenges as part of Wolfram Challenges too. In fact, there are several categories of Challenges we expect to add. We’ve already got Challenges that make use of the Wolfram Knowledgebase, and the built-in data it contains. But we’re also planning to add Challenges that use external data from the Wolfram Data Repository. And we want to add Challenges that involve creating things like neural networks.

There’s a new issue that arises here—and that’s actually associated with a large category of possible Challenges. Because with most uses of things like neural networks, one no longer expects to produce a function that definitively “gets the right answer”. Instead, one just wants a function that does the best possible job on a particular task.

There are plenty of examples of Challenges one can imagine that involve finding “the lowest-cost solution”, or the “best fit”. And it’s a similar setup with typical machine learning tasks: find a function (say based on a neural network) that performs best on classifying a certain test set, etc.

And, yes, the basic structure of Wolfram Challenges is well set up to handle a situation like this. It’s just that instead of it definitively telling you that you’ve got a correct solution for a particular Challenge, it’ll just tell you how your solution ranks relative to others on the leaderboard.

The Challenges in the Wolfram Challenges site always have very well-defined end goals. But one of the great things about the Wolfram Language is how easy it is to use it to explore and create in an open-ended way. But as a kind of analog of Challenges one can always give seeds for this. One example is the Go Further sections of the Explorations in Wolfram Programming Lab. And other examples are the many kinds of project suggestions we make for things like our summer programs.

What is the right output for an open-ended exploration? I think a good answer in many cases is a computational essay, written in a Wolfram Notebook, and “telling a story” with a mixture of ordinary text and Wolfram Language code. Of course, unlike Challenges, where one’s doing something that’s intended to be checked and analyzed by machine, computational essays are fundamentally about communicating with humans—and don’t have right or wrong “answers”.

The Path Forward

One of my overarching goals in creating the Wolfram Language has been to bring computational knowledge and computational thinking to as many people as possible. And the launch of the Wolfram Challenges site is the latest step in the long journey of doing this.

It’s a great way to engage with programming and computational thinking. And it’s set up to always let you know how you’re getting on. Did you solve that Challenge? How did you do relative to other people who’ve also solved the Challenge?

I’m looking forward to seeing just how small and efficient people can make the solutions to these Challenges. (And, yes, large numbers of equivalent solutions provide great raw material for doing machine learning on program transformations and optimization.)

Who will be the leaders on the leaderboards of Wolfram Challenges? I think it’ll be a wide range of people—with different backgrounds and education. Some will be young; some will be old. Some will be from the most tech-rich parts of the world; some, I hope, will be from tech-poor areas. Some will already be energetic contributors to the Wolfram Language community; others, I hope, will come to the Wolfram Language through Challenges—and perhaps even be “discovered” as talented programmers and computational thinkers this way.

But most of all, I hope lots of people get lots of enjoyment and fulfillment out of Wolfram Challenges—and get a chance to experience that thrill that comes with figuring out a particularly clever and powerful solution that you can then see run on your computer.

7 comments. Show all »

It’s a wonderful idea. Have done some problems at Project Euler. Here one can learn more, people submit their code and one can learn how other people do it.

Having solved a couple of challenges I come to ToMorseCode and have a couple of suggestions. There should be a way to disagree with the tests. The verification should do “happy cases” if it supposed to be recreational coding. If you want to hire someone then input validation is a good thing. In this case the code is tested with the string “they’re” which is supposed to produce morse coding including the apostrophe. Having spaces around the text is not part of the “happy case” coding. One could argue that it should be part of the result or not, but having to trim or not trim the input isn’t part of the brain teasing.

Trying to submit the results from within Mathematica often results in a 401 http error code, even with the correct credentials.

The “evaluation box” does not allow me to copy the results, e.g. to do a diff.