OK, now I have done the thing with output.txt:There are in fact 245,745 entries there, with a lot of repetitions. Thanks for pointing that out.The repetitions don't carry over to the returned array, and neither do the not-existing combinations.

If Not Store[n].HasKey(Keyword) That's where the filtering happens for non-existing combinations.Store[n, Keyword] := A_Index That's where the filtering happens for repetitions.

I might be wrong, though, but that's what I had in mind when writing it.Edit: strike out of obvous error, need closer inspection.

wolf_II wrote:OK, now I have done the thing with output.txt:There are in fact 245,745 entries there, with a lot of repetitions. Thanks for pointing that out.The repetitions don't carry over to the returned array, and neither do the not-existing combinations.

If Not Store[n].HasKey(Keyword) That's where the filtering happens for non-existing combinations.Store[n, Keyword] := A_Index That's where the filtering happens for repetitions.

I might be wrong, though, but that's what I had in mind when writing it.Edit: strike out of obvous error, need closer inspection.

I understand that, but you see now that there is a way to generate unique permutations and nothing else (without having to add another .HasKey check for duplicates themselves) My script doesn't need those extra checks because it can't make duplicates. There's your puzzle if you choose to tackle it I want to hear about it!

;-------------------------------------------------------------------------------ShowAnagrams:; live display all anagrams of String found in Dictionary;-------------------------------------------------------------------------------GuiControlGet, String WordLength :=StrLen(String)SB_SetText("Length: " WordLength,1)

;-------------------------------------------------------------------------------ShowAnagrams:; live display all anagrams of String found in Dictionary;-------------------------------------------------------------------------------GuiControlGet, String WordLength :=StrLen(String)SB_SetText("Length: " WordLength,1)

;-------------------------------------------------------------------------------ShowAnagrams:; live display all anagrams of String found in Dictionary;-------------------------------------------------------------------------------GuiControlGet, String WordLength :=StrLen(String)SB_SetText("Length: " WordLength,1)

Yes, the last entry of size 4 in output.txt is at line 238,711. Got it.I will come back to all the issues you mentioned, one by one, when I get the time.(sort alphabetically for each size separately, as a start)

In the meantime I am gonna build a script modifier that can check the time for every line to execute and dump it. Know of anything?

You mean like a tracer? I don't know of any, sorry. I use A_TickCount, or QPC() for higher precision. But they both are only useful for testing specific parts of code, as you know, I'm sure.

Edit: I just spotted a compliment in an earlier post of yours. "I gotta start trimming my code or really examine yours, I don't know why it's so great!" Thank you very much, but I can't take credit for that. It is all thanks to Helgef's teaching.

wolf_II wrote:Yes, the last entry of size 4 in output.txt is at line 238,711. Got it.I will come back to all the issues you mentioned, one by one, when I get the time.(sort alphabetically for each size separately, as a start)

In the meantime I am gonna build a script modifier that can check the time for every line to execute and dump it. Know of anything?

You mean like a tracer? I don't know of any, sorry. I use A_TickCount, or QPC() for higher precision. But they both are only useful for testing specific parts of code, as you know, I'm sure.

Edit: I just spotted a compliment in an earlier post of yours. "I gotta start trimming my code or really examine yours, I don't know why it's so great!" Thank you very much, but I can't take credit for that. It is all thanks to Helgef's teaching.

Helgef really set us up pretty nicely and I'm glad you were inspired to build this thing for yourself after coming to help me with my problems. I'm having tons of fun

oh and I finished the tracer and I'm sure there are tons of them out there but here's mine incase it saves you anytime

;-------------------------------------------------------------------------------; Anagrams.ahk; by wolf_II;-------------------------------------------------------------------------------; simple Anagram searcher; written with massive help by Helgef from AHK forum; https://autohotkey.com/boards/viewtopic.php?p=158284#p158284;-------------------------------------------------------------------------------

;-----------------------------------------------------------------------; Each word in Dictionary gets pushed into the respective DICT[L, K].; Push() function returns the index of the most recently pushed value.; We then use this index to correct the size downwards,; i.e. save space or adjust the capacity of a field in DICT[L, K].;-----------------------------------------------------------------------; E.g. let's say we pushed the 7th word to DICT[L, K] not knowing the 7.; Push() gives us back the index = 7, and we can adjust the 7th field; of DICT[L, K] to ByteSize = # characters (respecting ANSI/Unicode);-----------------------------------------------------------------------; We do this exercise on each word in Dictionary,; impact on performance is negligible, but:; impact on memory used is noticeable (129 MB vs. 96 MB) on 370 k words; impact on memory used is noticeable ( 40 MB vs. 30 MB) on 110 k words;----------------------------------------------------------------------- index := DICT[WordLength, Keyword].Push(A_LoopField) DICT[WordLength, Keyword].SetCapacity(index,(WordLength +1)* WA)}

;-------------------------------------------------------------------------------ShowAnagrams:; live display all anagrams of String found in Dictionary;-------------------------------------------------------------------------------GuiControlGet, String WordLength :=StrLen(String)SB_SetText("Length: " WordLength,1)

littlegandhi1199 wrote:Helgef really set us up pretty nicely and I'm glad you were inspired to build this thing for yourself after coming to help me with my problems. I'm having tons of fun

I couldn't agree more! I'm having fun too.

Things for me to do:1) check out you tracer - I'm looking forward to that. 2) check out your Anagram Solver, that thing is hard to read. I think I wait a bit, you said you will start "trimming that code".

littlegandhi1199 wrote:Helgef really set us up pretty nicely and I'm glad you were inspired to build this thing for yourself after coming to help me with my problems. I'm having tons of fun

I couldn't agree more! I'm having fun too.

Things for me to do:1) check out you tracer - I'm looking forward to that. 2) check out your Anagram Solver, that thing is hard to read. I think I wait a bit, you said you will start "trimming that code".

Definitely, I am trying to pull the slowest parts of my script out and until then if you want to delve into solving it yourself I started out by manually comparing 2 letter dumps to 3 letter dumps to 4 letter dumps to 5.1 indicates the respective sorted query was brand new and hadn't been generated of yet.The next part is showing the actual unsorted query and it all lines up so you can find the patterns that are there ;DThat's how I found themDIY

Hi littlegandhi1199,just a quick note, an update on "my progress".I want to demonstrate that I finally understand what you are talking about, or so I hope. I have added to my script a time measurement again. There was one back in v1.02, which I did not release.

;-------------------------------------------------------------------------------; Anagrams.ahk; by wolf_II;-------------------------------------------------------------------------------; simple Anagram searcher; written with massive help by Helgef from AHK forum; https://autohotkey.com/boards/viewtopic.php?p=158284#p158284;-------------------------------------------------------------------------------

;-----------------------------------------------------------------------; Each word in Dictionary gets pushed into the respective DICT[L, K].; Push() function returns the index of the most recently pushed value.; We then use this index to correct the size downwards,; i.e. save space or adjust the capacity of a field in DICT[L, K].;-----------------------------------------------------------------------; E.g. let's say we pushed the 7th word to DICT[L, K] not knowing the 7.; Push() gives us back the index = 7, and we can adjust the 7th field; of DICT[L, K] to ByteSize = # characters (respecting ANSI/Unicode);-----------------------------------------------------------------------; We do this exercise on each word in Dictionary,; impact on performance is negligible, but:; impact on memory used is noticeable (129 MB vs. 96 MB) on 370 k words; impact on memory used is noticeable ( 40 MB vs. 30 MB) on 110 k words;----------------------------------------------------------------------- index := DICT[WordLength, Keyword].Push(A_LoopField) DICT[WordLength, Keyword].SetCapacity(index,(WordLength +1)* WA)}

;-------------------------------------------------------------------------------ShowAnagrams:; live display all anagrams of String found in Dictionary;-------------------------------------------------------------------------------GuiControlGet, String WordLength :=StrLen(String)SB_SetText("Length: " WordLength,1)

/*--------- end of file --------------------------------------------------------

I will not update the fist post with this.

Using that script, I started putting in up to 18 letters, and I saw what's going on. Most of the time spent in extreme conditions is in fact inside the Combinations() function. Just like you keep telling me. There I do the 238k computations, where 32k should be enough. I finally got your point. Thanks for continuously rubbing it in.

littlegandhi1199 wrote:There's your puzzle if you choose to tackle it I want to hear about it!

Now I think the time has come to tackle the problem. I know you want to hear about my attempt, not just me rambling. But I admit, I was somewhat reluctant to even consider the possibility, since I would have been forever happy with my script's performance at up to 15-letters input, where my declared goal was 12 to begin with. I guess I was lazy, and thinking "why bother?". I see now, that I should bother indeed. There may be room to raise Max_Input_Length := 18 for good, not just for this exercise.

wolf_II wrote:Hi littlegandhi1199,just a quick note, an update on "my progress".I want to demonstrate that I finally understand what you are talking about, or so I hope. I have added to my script a time measurement again. There was one back in v1.02, which I did not release.

;-------------------------------------------------------------------------------; Anagrams.ahk; by wolf_II;-------------------------------------------------------------------------------; simple Anagram searcher; written with massive help by Helgef from AHK forum; https://autohotkey.com/boards/viewtopic.php?p=158284#p158284;-------------------------------------------------------------------------------

;-----------------------------------------------------------------------; Each word in Dictionary gets pushed into the respective DICT[L, K].; Push() function returns the index of the most recently pushed value.; We then use this index to correct the size downwards,; i.e. save space or adjust the capacity of a field in DICT[L, K].;-----------------------------------------------------------------------; E.g. let's say we pushed the 7th word to DICT[L, K] not knowing the 7.; Push() gives us back the index = 7, and we can adjust the 7th field; of DICT[L, K] to ByteSize = # characters (respecting ANSI/Unicode);-----------------------------------------------------------------------; We do this exercise on each word in Dictionary,; impact on performance is negligible, but:; impact on memory used is noticeable (129 MB vs. 96 MB) on 370 k words; impact on memory used is noticeable ( 40 MB vs. 30 MB) on 110 k words;----------------------------------------------------------------------- index := DICT[WordLength, Keyword].Push(A_LoopField) DICT[WordLength, Keyword].SetCapacity(index,(WordLength +1)* WA)}

;-------------------------------------------------------------------------------ShowAnagrams:; live display all anagrams of String found in Dictionary;-------------------------------------------------------------------------------GuiControlGet, String WordLength :=StrLen(String)SB_SetText("Length: " WordLength,1)

/*--------- end of file --------------------------------------------------------

I will not update the fist post with this.

Using that script, I started putting in up to 18 letters, and I saw what's going on. Most of the time spent in extreme conditions is in fact inside the Combinations() function. Just like you keep telling me. There I do the 238k computations, where 32k should be enough. I finally got your point. Thanks for continuously rubbing it in.

littlegandhi1199 wrote:There's your puzzle if you choose to tackle it I want to hear about it!

Now I think the time has come to tackle the problem. I know you want to hear about my attempt, not just me rambling. But I admit, I was somewhat reluctant to even consider the possibility, since I would have been forever happy with my script's performance at up to 15-letters input, where my declared goal was 12 to begin with. I guess I was lazy, and thinking "why bother?". I see now, that I should bother indeed. There may be room to raise Max_Input_Length := 18 for good, not just for this exercise.

Now where did I put my thinking hat?

Yes!! And I wanted you to do it because once yours is doing the essentials it's got to be blazing fast and your memory usage will be almost non-existent like mine too.I think I found my slowest part too. I load from the dictionary every time it searches which from what I'm seeing might just be all 876ms that's left for each solve. Here's hoping

Edit. Okay! Now it's down to 109ms BUT I STILL CAN DO BETTER! V1.03 which is standalone and no longer requires 2 scripts is updated on my post now.

@littlegandhi1199: I have tackled and I have failed. Well not so much as in did not find a way, but I made my script's performance much, much worse with everything I tried.

I admit defeat

The only "benefit" from my attempts came from the fact, while I have times displayed, I could test if and by how much my custom sort improves when compacted to a one-liner, similar to the examples in the docs. It appears to be about 20% faster. And that is in the most insignificant part of the code. And by the way, I found out that if I disable the input during longer calculations, it seems somewhat natural/obvious to me (or the user) to type slower. To limit the flashing, I only do that with input length > 12.

;-------------------------------------------------------------------------------; Anagrams.ahk; by wolf_II;-------------------------------------------------------------------------------; simple Anagram searcher; written with massive help by Helgef from AHK forum; https://autohotkey.com/boards/viewtopic.php?p=158284#p158284;-------------------------------------------------------------------------------

;-----------------------------------------------------------------------; Each word in Dictionary gets pushed into the respective DICT[L, K].; Push() function returns the index of the most recently pushed value.; We then use this index to correct the size downwards,; i.e. save space or adjust the capacity of a field in DICT[L, K].;-----------------------------------------------------------------------; E.g. let's say we pushed the 7th word to DICT[L, K] not knowing the 7.; Push() gives us back the index = 7, and we can adjust the 7th field; of DICT[L, K] to ByteSize = # characters (respecting ANSI/Unicode);-----------------------------------------------------------------------; We do this exercise on each word in Dictionary,; impact on performance is negligible, but:; impact on memory used is noticeable (129 MB vs. 96 MB) on 370 k words; impact on memory used is noticeable ( 40 MB vs. 30 MB) on 110 k words;----------------------------------------------------------------------- index := DICT[WordLength, Keyword].Push(A_LoopField) DICT[WordLength, Keyword].SetCapacity(index,(WordLength +1)* WA)}

;-------------------------------------------------------------------------------ShowAnagrams:; live display all anagrams of String found in Dictionary;-------------------------------------------------------------------------------GuiControlGet, String WordLength :=StrLen(String)SB_SetText("Length: " WordLength,1)

;-------------------------------------------------------------------------------LengthSort(a, b){; word length specific custom sort;-------------------------------------------------------------------------------; Compacting this function to a one-liner, similar to the examples; in the help file, runs about 20% faster than before.;---------------------------------------------------------------------------

Edit I am looking at v1.10 before any memory saving which slows down performance considering the perfect solves won't make 85 billion words to make your memory usage huge anyways to make my changes for a perfect solve on your script.

Edit2 So I'll look at that and tell you if I give up, but for now can you just explain what LengthSort(a, b) is doing for you?

Edit3 Okay so this would be the original loops and all their permutations for ABCDE at length output of 2. 0 indicates a alphabetical repeat and 1 something brand new.The non-alphabetical outputs are shown with the same spacing to compare it to the 1s and 0s. Then the list shows us only the unique (non-alphabetical) outputs.

breakdown of input ABCDE for 2 letter outputs!!aaabacadaex4 The first loop defines A as taken and moves into 2nd loop. 2nd loop sees it's taken for it's first iteration but succeeds on it's 2nd iteration to take B (since it's looping through the input index anyways) and finishes it's last 3 iterations with no problem either.

!ba!!bbbcbdbex3 Now back to master loop and it wouldn't grab A because again it's looping through the index so it takes B next. Now the next inner loop should take A because it only sees B as taken and become BA. But since A has already appeared with every other possible letter we don't want it to do that so.... we say that it needs to respect B but also not to take anything below B either IN THE INPUT INDEX.This is why the only thing that needs satisfied isn't != previous entry but now !<= previous entry1 rule)if (curINDEX <= takenINDEX) continue

!ca!cb!!cccdcex2In implementation this time it sees C as the starter letter. On the 2nd loop's first iteration it sees C >= A... second iteration it sees C >= B... and third C >= C. (this is all referring to the index entries of these letters) then it's last two iterations are able to take D and then E

!da!db!dc!!dddex1

!ea!eb!ec!ed!!eex0Now 1 other rule that I found is to save just times spent checking used indexes.As you can see the last master loop and all the things it produces are repeats so if we call the last master loop #12 rule)If (master_loop# < output_LENGTH-(output_LENGTH-1))So no matter the input length if the output length is 3 then the last two master loops are all 0s and if the output length is 15 (my current max) then the last 14 lines are all 0s

Two rules, the second isn't necessary to only make perfect permutations (without checking any previous permutations) but it does shave off more and more loops (not queries) the higher you go.

Last edited by littlegandhi1199 on 13 Jul 2017, 21:27, edited 10 times in total.

littlegandhi1199 wrote:can you just explain what LengthSort(a, b) is doing for you?

Sure, it's called by this line Sort, WordList, D| F LengthSort. It's a custom sort "instruction" which is explained better than I could in Sort, under the option F. It's function is to sort longer input higher in the sorted list than shorter input. When input a has the same length as input b, they will be sorted alphabetically. The final 0 is returned, when length(a)=length(b) and a=b, the trivial case, which does not happen, since we do not sort anything with repetitions. Returning nothing at all or the empty string is treated by the Sort command like returning 0

littlegandhi1199 wrote:can you just explain what LengthSort(a, b) is doing for you?

Sure, it's called by this line Sort, WordList, D| F LengthSort. It's a custom sort "instruction" which is explained better than I could in Sort, under the option F. It's function is to sort longer input higher in the sorted list than shorter input. When input a has the same length as input b, they will be sorted alphabetically. The final 0 is returned, when length(a)=length(b) and a=b, the trivial case, which does not happen, since we do not sort anything with repetitions. Returning nothing at all or the empty string is treated by the Sort command like returning 0

Option D| is defining the pipe as a delimiter.

I updated my post to show you how it works but still will try to understand your script now. Thank you for that custom sort explanation. I've read about doing that before but forgot.Let's see who will perfect your script first!

Edit But please tell me if you need more explanation before you start into it. I don't want to have to understand your script if you can implement the above 2 rules for me.

Thank you very much for your explanation of how this algorithm works. In my attempt to implement it I only had success half way. I can correctly produce 2**n -1 substrings for n-letters input (e.g. 15 subs for 4 letters). However I don't yet get how I should avoid repetitions. See the following implementation with two examples:

;-------------------------------------------------------------------------------Combinations(String){; return an array with all the combinations of String;-------------------------------------------------------------------------------If(Len :=StrLen(String))=1Return,[String]

The code can still be compacted a bit more, but I like to be able to correct it before that. I'm sorry to be a pain, I have looked at the diff from above again, and i just don't get it. You have had the patience of an angel with me so far, I hope you can help me to see what I can't see yet (I'm sure you have explained it before)The code above is definitely better than my current function (at least less convoluted and more readable) but those pesky repetitions I can not handle. If I go again the route of storing the strings as keywords with a dummy value, I still don't avoid computing the repetitions in the first place.I could just check if a candidate already exists and only include non-preexisting ones, but then I'm back to square one.

wolf_II wrote:Thank you very much for your explanation of how this algorithm works. In my attempt to implement it I only had success half way. I can correctly produce 2**n -1 substrings for n-letters input (e.g. 15 subs for 4 letters). However I don't yet get how I should avoid repetitions. See the following implementation with two examples:

;-------------------------------------------------------------------------------Combinations(String){; return an array with all the combinations of String;-------------------------------------------------------------------------------If(Len :=StrLen(String))=1Return,[String]

The code can still be compacted a bit more, but I like to be able to correct it before that. I'm sorry to be a pain, I have looked at the diff from above again, and i just don't get it. You have had the patience of an angel with me so far, I hope you can help me to see what I can't see yet (I'm sure you have explained it before)The code above is definitely better than my current function (at least less convoluted and more readable) but those pesky repetitions I can not handle. If I go again the route of storing the strings as keywords with a dummy value, I still don't avoid computing the repetitions in the first place.I could just check if a candidate already exists and only include non-preexisting ones, but then I'm back to square one.

It's okay, I'm still excited. And not to mention the fact that I don't even understand your script LET ALONE how to implement these custom rules into it.Ok, so now give me the same thing but without whichever rule you just squeezed in. (the first one is the only one that makes it perfect though, I hope you got that one)Exactly and those are the pitfalls of making changes incorrectly. You've repeated yourself but either the same speed or worse lol

Helgef wrote:I see little need to improve something that works more or less instantly, in a on-user-typing context.

I still think this is true, however, trying to improve serves other purposes, such as learning and just having fun So I will try to contribute with some ideas.

First some general comments. When we try to improve performance, we need to start where it matters. To know where that is, we have to benchmark, or measure what we can, and see where we actually spend our time. For example, I found that for input string"abcdefghijklmno" the line

So combinations() is the culprit, no surprise really. The sort we needn't think about.The problem with combinations() as you noted, is the repetions, there are minor issues with the implementation, but really the problem is the algorthim. About the implementation, one thing that stands out is that you do

combine(items){; Combines all entries in the items array, without repetion. (Each entry is considered unique); Output is on the from all := [l1,...,ln], where lk is an; array of all k-length combinations, ∀k ∈ [1,n], where n is the number of items of the input.; Eg,; items:=["a","b","c"] -> l1:=[ ["a"],["b"],["c"] ], l2:=[ [["a"],["b"]], [["a"],["c"]], [["b"],["c"]] ], l3:= [ ["a","b","c"] ]; Total number of combinations are 2**n-1 (excluding the choise where nothing is choosen), each level, lk, in the out put hold nChoosek(n,k) combinations.

/* About the algorithm:

We build the combinations from level 1 and upwards to level n, that is

In code we store an index along side each combination, denoted nI, of where to begin the choises on the next level. Choises are only made from the items array, the input. eg items[nI]. Each combination is allowed to choose from all items[k] ∀ k >= nI up to n (number of items)

when we build the next level, that is l2, which is all combinations of length 2 we look at level 1, l1, and look at the first combination, which is comb1 := {1:a, nextInd:2} here we see that this combination starts its choises on index 2, (nI:=comb1.nextInd = 2), which is items[2] -> b

n:=items.length(); Number of items all:=[]; Result storage, output. all.setcapacity(n); level:=[]; Level k, holds all combinations of lenght k. (i.e., the combining of k items)for k, item in items level.push({1:item,nextInd:k+1}); k=1 is first, lk:=[item1,...,itemn], an index is stored along side each item, for tracking purposes, it is removed when not needed.loop % n-1{; there are n levels all.push(level); Store level k, k=1 is compeleted on loop entry, next level is completed at bottom of loop nextLevel:=[]; Create next level array nextLevel.SetCapacity(nchoosek(n,A_Index+1)); And set capacity, it holds nchoosek(n,k) items.for k, comb in level {; go through all combinations in level k. nI:=comb.nextInd ; Get the combinations next index. comb.delete("nextInd"); Delete it from the comb-arrayif(nI>n); if next index is greater than the number items, n, no choise is available, continue to next combination.continueloop % n-nI+1{; Combination is allowed to make choises newComb:=comb.clone(); clone combination newComb.nextInd:=nI+A_Index; Increment nextInd for the new combination newComb.push(items[nI+(A_Index-1)]); Add next item according to index. (adding one item from items per new combination) nextLevel.push(newComb); Store new combination}} level:=nextLevel ; nextLevel is done, set level to nextLevel, repeat...} all.push([items]); The last level is trivial, it is the input.return all ; Done.}; Help functionnchoosek(n,k){; n!/k!(n-k)!, n>=k>0 m:=n-k,p:=1loop, % m p*=(n-(A_Index-1))/A_Indexreturnround(p)}; For visualisation, only suitable for string itemscombinationToString(combArr,delItem:="",delLevel:="`n"){for k, level in combArr{for l, items in level {for m, item in items str.=item . delItem str:=rtrim(str,delItem). delLevel }}returntrim(str,delLevel)}

I try to explain in comments. It is general because it doesn't concatenate strings, it takes any collection of items and do all combinations, without repeats and organise the result in arrays. So this is not directly usable for your case, but could easily be adepted, I know because I have done it .Now, reading littlegandhi1199 s explanation I think this is the same idea. However, @littlegandhi1199, regarding repetions, there are no repetions for unique items, but if items are not unique you get repetions. So you will have to sort that out anyways, as wolf_II have noticed here. I think your implemention is correct there wolf_II, in my implemention, I worked with strings, not arrays, then I do Sort, result, U to remove duplicates.

@Helgef: I was too tired to realize what I did and didn't do. When I woke up, somehow I knew exactly why I get repetitions with input=ABBA. Cause the repeated letters of the input. D'oh. Those can't possibly be avoided, They MUST be filtered. Then I checked for responses in this thread, and there it was: "if items are not unique you get repetitions. So you will have to sort that out anyway".

I need some time to read your general solution, I will implement the hiding of the Listbox when updating it, I will check out sort, result, U and and and ... Thank you.

I write some more in this thread, but I want to write code first.

Last edited by wolf_II on 14 Jul 2017, 08:45, edited 3 times in total.