If this is your first visit, be sure to
check out the FAQ by clicking the
link above. You may have to register
before you can post: click the register link above to proceed. To start viewing messages,
select the forum that you want to visit from the selection below.

General Programming Question - Self Enumerating Pangrams

I'm trying to create a self enumerating pangram, here is an example of one:

This first pangram has five a's, one b, one c, two d's, twenty-nine e's, six f's, four g's, eight h's, twelve i's, one j, one k, three l's, two m's, nineteen n's, twelve o's, two p's, one q, eight r's, twenty-six s's, twenty t's, three u's, five v's, nine w's, three x's, four y's, and one z.

I've tried two different approaches.

The first approach was to generate a sentence with random numbers. I would compare the randomly generated numbers to the actual number of occurrences of each letter, update the sentence and repeat. This method quickly finds potential pangrams with an error of between 6 - 8 numbers, but there doesn't seem to be much more progress after that.

The second approach was to create a "gene pool" of random potential pangrams. The fittest pangrams can reproduce (by simulating crossover), and there are random mutations, and new individuals are created and destroyed on a random basis. This method takes longer, but has produced potential pangrams with an error of between 1 - 4 numbers.

Are there any other approaches which I haven't thought of? Which of the above two approaches has the most potential of working? I'm nearly going insane trying to figure this out!

Re: General Programming Question - Self Enumerating Pangrams

Originally Posted by SamstaUK

Are there any other approaches which I haven't thought of?

One approach often used in these high-dimensional problems is Monte Carlo.

It's quite similar to the genetic approach. The pangram is slightly modified in each iteration and if the modification resulted in a better pangram it's accepted. But, and this is the defining feature of Monte Carlo, even if the modified pangram is worse than the existing it's acceped with a certain (small) probability. This "walking about in pangram space" continues until a correct panagram is found (or the computer is switched off )

There are a few open questions still to think about. How should the pangram be "modified"? And how does one know whether a pangram is "better or worse" than another pangram?

I cannot guarantee Monte Carlo will be better than what you already tried but it's a well established standard method.

Re: General Programming Question - Self Enumerating Pangrams

Thank you for the monte carlo suggestion, I'll have a read about it

I'm using a sort of hybrid technique, I'm getting *really* close now, only 1 number is off, but still no solution.

What I'm doing is choosing a random number, and setting every letter to that number. Then I update the pangram with the actual count. I keep updating the pangram for 5 seconds, if no progress is made, then it is stuck in a loop, so I mutate it 15000 to 1000000 times, updating the pangram between mutations. (I choose this figure before running the program). If no improvement is made from the mutations, I revert back to the previous best and start mutating again.

I'm trying to find the right balance between mutations and updates. If I update too often I get stuck in a loop, if I mutate too much I might miss a potential solution.

I've turned my spare room into an oven, running this on two poweredge servers :P

EDIT: Solution Found!

It turns out that the string you use to start your pangram determines whether it is solvable or not. I changed the base string, and within minutes had a solution

After much tribulation, Sam Kennedy made this pangram which contains: nine a's, two b's, four c's, four d's, twenty nine e's, nine f's, two g's, seven h's, fifteen i's, one j, two k's, two l's, five m's, twenty nine n's, sixteen o's, two p's, one q, ten r's, thirty one s's, twenty four t's, seven u's, four v's, ten w's, three x's, six y's, and one z.

This is a really interesting exercise if anyone is ever bored and wants something to do!

Re: General Programming Question - Self Enumerating Pangrams

Originally Posted by SamstaUK

This is a really interesting exercise if anyone is ever bored and wants something to do!

I'll have that in mind and then I'll try the Monte Carlo approach.

But there's another approach I'd be eager to investigate. It's when you use one pangram to generate the next, and the next and so on recursively. It would resemble a random generator for pangrams. With luck the process approaches a correct pangram and you know you're there when two consecutive pangrams are equal. It's because a correct pangram generates itself. But the process may also end up cycling, and it definately will if no correct pangram exists.

Regarding the tuning of your genetic algorithm. There will be the same situation in Monte Carlo where you accept a worse pangram with a certain fixed probability. If it's set too low the simulation may get stuck locally. If it's too high a correct pangram may be overlooked.

Finally the question of whether a correct pangram exists at all. I have a feeling it depends on the fixed part of the pangram. The variable part will have to adapt to that and that may not be possible. But even without a fixed part a pangram may not have a solution (it should be language dependent I guess because numbers have different letter representations) so the fixed part must be decisive for the question of existence.

Re: General Programming Question - Self Enumerating Pangrams

hmmm... I just tried solving this same problem, I first used monte carlo but that seemed too slow, I then tried a mixed method of monte carlo and then updating the numbers that the pangram has iteratively (as I think OP had in method 1). however I haven't been able to make much progress, I couldn't see your code could I? do let me know, I've spent all day on this and now I'll be really annoyed if I can't get a working solution. (Hope you're still on this forum/still have the code a year later, fingers crossed)

Re: General Programming Question - Self Enumerating Pangrams

Originally Posted by crashandburn4

I first used monte carlo but that seemed too slow,

Well, this looks like the archetypical needle-in-a-haystack problem. If it's NP-complete which I suspect then it's bound to be slow regardless of which solution method you apply. You're not even guaranteed to ever find an exact solution regardless of how long you try. And if you do it can be ascribed mostly to luck.

I've tried a Monte Carlo strategy. It finds a solution in both cases the OP tried, also the one he abandoned as lacking a solution. But we're talking many millions, almost billions, of iterations and that takes time. As always with Monte Carlo it's a delicate balance between sticking around in areas with high chance of a solution and at the same time avoid getting stuck. It's an art rather than a science really. Even changing seed in the random generator may be the difference between finding a solution or not. That's why it's hard to know whether an improved algorithm was an improvement or if it just was the random walk that took a different path and happened to strike gold this time.

I've introduced a data structure called an LFV, a letter frequency vector. It's simply an array with one entry for each letter, each entry holding the frequency of that letter (how often it occurs). In this case there are 26 entries, one for each English letter with frequencies in the 0 to 99 range.

Now a Pangram can be turned into an LFV, that's obvious, any text can. But in this case the opposite is true too. An LFV holding the variable part of a Pangram can be used to generate the whole Pangram text. Say the frequency for the letter 't' is 42, then that would correspond to the text "fourty-two t's" in the Pangram.

Say we have an LFV called the Variate and a function called the PangramFunction which takes an LFV and turns it into a Pangram and then returns an LFV of that Pangram. Then the problem of finding a Self-Enumerating Pangram can be expressed with this equality,

Variate == PangramFunction(Variate)

It's simply a question of finding the Variate that transforms into itself when passing the Pangram function. The Variate has a total of 26^100 combinations so finding the solution Variate is not exactly easy picking to say the least.

To determine how close a Variate is to a solution the squared distance between Variate and PangramFunction(Variate) is used. When this distance is 0 they're equal and that's the ultimate goal.

Some experimentation shows that a good strategy is to make very small frequency changes to a few letters of the Variate in each iteration. If this results in a lower squared distance the modified Variate is accepted. To prevent this process from getting stuck occasionally a Variate with higher squared distance is accepted. But that's standard procedure for Monte Carlo.

Re: General Programming Question - Self Enumerating Pangrams

This is the result of some "recreational programming" this weekend.

It's written using VS2012 Express for Windows Desktop. It's started like this,

SEP sep;
sep.simulation();

Unfortunately I missed there should be an "and" before the last letter in a Pangram text. I didn't want to modify the program at that point so the fix is instead to add an "and" to the constant part of the Pangram. Interestingly this has a huge impact on solution time. The "After much tribulation, Sam Kennedy made this pangram which contains: " Pangram took 62 million iterations without the "and " but 892 million with it. This really demonstrates the true probablistic nature of Monte Carlo simulations.

This program has found a solution in under 1 billion iterations to all Pangrams I've tested. But it's still only an approximate algorithm and sooner or later this promising track record will be broken.

The equality in my previous post #6 can be implemented using a logic network consisting of Boolean high-level elements such a ROM memories, adders and comparators which all in their turn can be implemented with simple NAND and NOR gates. This means the logic network constitutes a big Boolean expression. Presented with a Variate input (the 26 letter frequencies in binary form) this network will churn out a single Boolean output telling whether the equality is true or false.

Finding which inputs to a Boolean expression will yield true is a Satisfiability problem. It's known to be NP complete so the Self-Enumerating Pangram problem should be NP complete too. This means it's a hard problem but also that efficient solution strategies can be sought among general SAT-solvers. But that's a project for another weekend.

By the way if you're interested in the foundation of computing I can recommend The Golden Ticket by Lance Fortnow. It's a popular science book but it's written by an insider (and not some journalist who's trying to explain what he doesn't understand). It's by far the most accessible still deep book in this category I've ever seen.

Re: General Programming Question - Self Enumerating Pangrams

Originally Posted by nuzzle

It's known to be NP complete so the Self-Enumerating Pangram problem is NP complete too.

given an NP-complete problem A and a problem B, in order to prove that B is at least NP-complete you should solve A in terms of B; if I read your post correctly, you did the opposit, so you didn't prove that the pangram problem is NP-complete.

Re: General Programming Question - Self Enumerating Pangrams

Originally Posted by superbonzo

given an NP-complete problem A and a problem B, in order to prove that B is at least NP-complete you should solve A in terms of B; if I read your post correctly, you did the opposit, so you didn't prove that the pangram problem is NP-complete.

It's not a formal proof. I just presented the evidence that finally convinced me of NP completness. But a proof shouldn't be that hard. I promise to clearly indicate if I attempt one.

But more importantly, the fact that the Self-Enumerating Pangram problem can be easily reformulated as a SAT problem (specifically CIRCUIT-SAT) means it can be attacked with SAT-solvers.

And if you feel you could easily prove something in one way or another please feel free to do so. Staying Mr. Besserwisser can't be that satisfying.

* The Perfect Platform for Game Developers: Android
Developing rich, high performance Android games from the ground up is a daunting task. Intel has provided Android developers with a number of tools that can be leveraged by Android game developers.

* The Best Reasons to Target Windows 8
Learn some of the best reasons why you should seriously consider bringing your Android mobile development expertise to bear on the Windows 8 platform.