I'm programming with C on Linux to get all possible combinations of particular sequence related to biology. But I don't know how I implement it because it seems to be very complicated to make rule. I'm going to use those combination to make different sequences to test.

What I'd like to know is how to make all possible combinations with those sequence to make it different sequence from original one. I think there are lots of patterns to extract letters from different positions. By the way, the length of sequence is up to 100 approximately.

For example, I need to choose one letter to replace it.

W
S
N
Y
N
W
S

I need to choose two letters.

WS WN WY WN WY WN WN WW WS: start with W at first position
SN SY SN SY SN SN SW SS: start with S at second position
NY NN NY NN NN NW NS: start with N at third position
....
....
WS: start with W at second position from last

Next, I need to choose three letters. It's lost of possible to make combinations

So you have values you want to compute all the different possible combinations of, of strings of length 100 chars. Have you decided on a first or base string? Is there more than one possible base string?

Right now I'm thinking a recursive call to a choosing function might be a good way to do this (I know I did this a bit with trees before), but the N (four possible options) might be tricky... I'd also recommend the singleton design pattern for your result set.

So you have values you want to compute all the different possible combinations of, of strings of length 100 chars. Have you decided on a first or base string? Is there more than one possible base string?

I'm sorry I don't understand exactly because I'm not good at English.
But I think I need to explain the procedure of running of this program.
If this does not meet your question, please rephrase your question.

Step1: User input sequence of nucleotides.
For examples: WSNYTNTGGYTNCCNWSNGRGCNA.... (size is up to 200 or 300)
This sequence will be different.

Step2: My program will extract ambiguous letters from input.
Ambiguous letters are all letters of input other than A,C,G and T
so info_table[0] will contain WSNYNYNNWS... (size is up to 100)

Step3: Create all the different possible combinations from info_table[0]
For examples:
ACACACAAAC
T----------
-G--------
....

I'm sorry I don't understand exactly because I'm not good at English.
But I think I need to explain the procedure of running of this program.
If this does not meet your question, please rephrase your question.

Step1: User input sequence of nucleotides.
For examples: WSNYTNTGGYTNCCNWSNGRGCNA.... (size is up to 200 or 300)
This sequence will be different.

Step2: My program will extract ambiguous letters from input.
Ambiguous letters are all letters of input other than A,C,G and T
so info_table[0] will contain WSNYNYNNWS... (size is up to 100)

Step3: Create all the different possible combinations from info_table[0]
For examples:
ACACACAAAC
T----------
-G--------
....

I need algorithms to make all different possible combination with replaceable letters like above,

Step4: Modify original sequence with result from Step3 - I think I can do this
For examples:
ACACTATGGCTACCAACAGAGCAA....: with only A,C,G and T
TCACTATGGCTACCAACAGAGCAA....
....

Thank you.

Listen, I want to try my best not to do your work for you (ethically, not to mention the forum guidelines), but I'll try and present you with the logic that works and try and help lead you to the correct solution to the problem.

Ok, so you even said what you actually have to do: store/print (why do you want to print 16,000+ combinations?) all combinations of a sequence.

Here's what you know:
-The sequence has characters in it that can be replaced (These can be represented either by WNYS... or basically all characters not equal to A,C,G, or T)
-These replaceable characters have specific characters that can replace them:
"W: A,T
S: C,G
N: A,C,G,T
Y: C,T" (Question: Why are you repeating this over and over? why not write it only once for W,N,Y,S ?)

-To store/create EVERY combination (you said this up top with the case example) you would have to go through every replaceable character, so like, change the first one, same as the rest, next, next, then change the first one and the second, the same for the rest, change the second again, etc. (you already said that, it's exactly what you need to do)

-So things you'll probably need to keep in mind if you did this by hand (I always ask that, what would I need to do this by hand, after all, the computer needs to know the same things you would):
-The length of the inputed string (it could be 200 or 300 or whatever, but you need to know that number)
-The number of replaceable characters - sift through the string and count every one (use a for loop, go through every single character one at a time if it's one of the replaceable characters simply count++ or whatever)
-The number of possible combinations (you may not need this, but for the way I'm imagining it, you would)- you calculated it up there, you know your statistics ;)

I hope that helps. Use that and perhaps get started, come up with something, and come back for more help.

-To store/create EVERY combination (you said this up top with the case example) you would have to go through every replaceable character, so like, change the first one, same as the rest, next, next, then change the first one and the second, the same for the rest, change the second again, etc. (you already said that, it's exactly what you need to do)

I'm afraid there may need more complicate consideration in there.
If I need to change 3 or more letters in given sequence, there are lots of possibilities to make different values from original one.

For examples:
Supposing there are only 10 characters in the sequence.
And if I want to change different values with five replaceable letters.

-To store/create EVERY combination (you said this up top with the case example) you would have to go through every replaceable character, so like, change the first one, same as the rest, next, next, then change the first one and the second, the same for the rest, change the second again, etc. (you already said that, it's exactly what you need to do)

I'm afraid there may need more complicate consideration in there.
If I need to change 3 or more letters in given sequence, there are lots of possibilities to make different values from original one.

For examples:
Supposing there are only 10 characters in the sequence.
And if I want to change different values with five replaceable letters.

Mark 'v' means that marked position can be replaced with replaceable letters.
Those patterns can shift to right.
I don't know how to find out those all patterns.

I'd like to explain what I want clearly, but my English doesn't help me to do that.

Thank you.

First of all, may I ask what language you do speak fluently? Just curious.

Alright, well let me ask you two yes/no questions, maybe they will help me understand better:

Are you saying that the program asks how many replaceable characters you want changed?

For example, could this be a computer output:

Computer Output: What sequence of DNA am I spitting out combinations for?

User Input: WNYSACGTWNYS

Computer Output: Good, now how many characters can I replace?

User Input: 2

Computer Output: The combinations have been stored, they are:

AAYSACGTWNYS
ACYSACGTWNYS
AGYSACGTWNYS
ATYSACGTWNYS

ANCSACGTWNYS
ANTSACGTWNYS

ANYCACGTWNYS
ANYGACGTWNYS

ANYSACGTANYS
ANYSACGTTNYS

ANYSACGTWAYS
ANYSACGTWCYS
ANYSACGTWGYS
ANYSACGTWTYS

ANYSACGTWNCS
ANYSACGTWNTS

ANYSACGTWNYC
ANYSACGTWNYG

TAYSACGTWNYS
TCYSACGTWNYS
TGYSACGTWNYS
TTYSACGTWNYS

.
.
.

OR

Are you saying that the possible combinations can include characters that stay the same, another words are not replaced?

Now, let me just say no matter what, this can be done, just approach it calmly. After all, if you had just paper and as much time as you needed, you could do them all by hand right? That means you can tell a computer how to do it.

First of all, may I ask what language you do speak fluently? Just curious.

Alright, well let me ask you two yes/no questions, maybe they will help me understand better:

Are you saying that the program asks how many replaceable characters you want changed?

For example, could this be a computer output:

Computer Output: What sequence of DNA am I spitting out combinations for?

User Input: WNYSACGTWNYS

Computer Output: Good, now how many characters can I replace?

User Input: 2

Computer Output: The combinations have been stored, they are:

AAYSACGTWNYS
ACYSACGTWNYS
AGYSACGTWNYS
ATYSACGTWNYS

ANCSACGTWNYS
ANTSACGTWNYS

ANYCACGTWNYS
ANYGACGTWNYS

ANYSACGTANYS
ANYSACGTTNYS

ANYSACGTWAYS
ANYSACGTWCYS
ANYSACGTWGYS
ANYSACGTWTYS

ANYSACGTWNCS
ANYSACGTWNTS

ANYSACGTWNYC
ANYSACGTWNYG

TAYSACGTWNYS
TCYSACGTWNYS
TGYSACGTWNYS
TTYSACGTWNYS

.
.
.

OR

Are you saying that the possible combinations can include characters that stay the same, another words are not replaced?

Now, let me just say no matter what, this can be done, just approach it calmly. After all, if you had just paper and as much time as you needed, you could do them all by hand right? That means you can tell a computer how to do it.

-blazed

First of all, may I ask what language you do speak fluently? Just curious.
=>My mother language is Korean.

Are you saying that the program asks how many replaceable characters you want changed?
=>No (At present), But I guess I need it later.

Are you saying that the possible combinations can include characters that stay the same, another words are not replaced?
=>Actually output values must consist of A,C,G and T not ambiguous letters such as W,N,S and etc. That's why I replaced ambiguous letters with A,C,G or T first and then I planed to replace them.
For examples:
before:WSNYNYNNWS,
after: ACACACAAAC

you could do them all by hand right?
=>No, I could not do that by hand because the number of possibilities is up to 16.000 when size of input is only 10.

Computer Output: Good, now how many characters can I replace?
User Input: 2
=>How about 5 or 6? I think I can make some logic for 1 or 2, but I don't think I'm able to make some logic for more than 3.

First of all, may I ask what language you do speak fluently? Just curious.
=>My mother language is Korean.

Are you saying that the program asks how many replaceable characters you want changed?
=>No (At present), But I guess I need it later.

Are you saying that the possible combinations can include characters that stay the same, another words are not replaced?
=>Actually output values must consist of A,C,G and T not ambiguous letters such as W,N,S and etc. That's why I replaced ambiguous letters with A,C,G or T first and then I planed to replace them.
For examples:
before:WSNYNYNNWS,
after: ACACACAAAC

you could do them all by hand right?
=>No, I could not do that by hand because the number of possibilities is up to 16.000 when size of input is only 10.

Computer Output: Good, now how many characters can I replace?
User Input: 2
=>How about 5 or 6? I think I can make some logic for 1 or 2, but I don't think I'm able to make some logic for more than 3.

I will use all output values as input in other program.

I hope this can help you understand more.

Thank you for your asking.
Alex.

Well then, if I'm understanding correctly, the input is only of characters W, N, Y, or S and CANNOT be A, C, G, or T

Another example, you tell me how close to true this is (I'll use very small numbers)

Computer Output: What sequence of DNA am I spitting out combinations for?

User Input: WNYS

Computer Output: The combinations have been stored (32 Possible combinations), they are:

AACC
AACG

AATC
AATG

ACCC
ACCG
ACTC
ACTG

AGCC
AGCG
AGTC
AGTG

ATCC
ATCG
ATTC
ATTG

TACC
TACG

TATC
TATG

TCCC
TCCG
TCTC
TCTG

TGCC
TGCG
TGTC
TGTG

TTCC
TTCG
TTTC
TTTG

Is this the correct way of "doing it by hand" exactly with that input, would this be the output?

Well then, if I'm understanding correctly, the input is only of characters W, N, Y, or S and CANNOT be A, C, G, or T

Another example, you tell me how close to true this is (I'll use very small numbers)

Computer Output: What sequence of DNA am I spitting out combinations for?

User Input: WNYS

Computer Output: The combinations have been stored (32 Possible combinations), they are:

AACC
AACG

AATC
AATG

ACCC
ACCG
ACTC
ACTG

AGCC
AGCG
AGTC
AGTG

ATCC
ATCG
ATTC
ATTG

TACC
TACG

TATC
TATG

TCCC
TCCG
TCTC
TCTG

TGCC
TGCG
TGTC
TGTG

TTCC
TTCG
TTTC
TTTG

Is this the correct way of "doing it by hand" exactly with that input, would this be the output?

-blazed

Is this the correct way of "doing it by hand" exactly with that input, would this be the output?
=>Yes, you're right.
So can you make logic for that in case size of input data is up to 100?

if I'm understanding correctly, the input is only of characters W, N, Y, or S and CANNOT be A, C, G, or T
=>Actually, Input data consists of all letters but you can assume like that. Because I'll extract only letters from input data except A,C,G and T, then I'll make new values by combining with original input data.

Even though I've been spending two days on this problem I still don't know how I approach it.

Is this the correct way of "doing it by hand" exactly with that input, would this be the output?
=>Yes, you're right.
So can you make logic for that in case size of input data is up to 100?

if I'm understanding correctly, the input is only of characters W, N, Y, or S and CANNOT be A, C, G, or T
=>Actually, Input data consists of all letters but you can assume like that. Because I'll extract only letters from input data except A,C,G and T, then I'll make new values by combining with original input data.

Even though I've been spending two days on this problem I still don't know how I approach it.

Thank you.
Alex

Well, now that I know what you're doing, yes, I've noticed a pattern. Here, let me try and show you:

While I was typing out the pattern of those letter output combinations, did you notice I had spaces between certain places? Here, take a look:

Expand|Select|Wrap|Line Numbers

AACC

AACG

<--One Here

AATC

AATG

<--One Here

ACCC

ACCG

ACTC

ACTG

<--One Here

AGCC

AGCG

AGTC

AGTG

<--One Here

ATCC

ATCG

ATTC

ATTG

<--Two Here

<--Two Here

TACC

TACG

<--One Here

TATC

TATG

<--One Here

TCCC

TCCG

TCTC

TCTG

<--One Here

TGCC

TGCG

TGTC

TGTG

<--One Here

TTCC

TTCG

TTTC

TTTG

I did that naturally, without noticing it... till I took a good look. Why you might ask? Well, I didn't want to type out EVERY SINGLE LETTER. I'm lazy, so I copied and pasted parts, but to keep track of course I put spaces between parts I copied. Try and guess what parts I copied first (it'll help you to understand the logic).

Now, do you start to see a small pattern like I do? I didn't just randomly change letters one at a time till I luckily got 32 combinations, there was a pattern, yes?

Let's take another look, this time I'll point out what kind of pattern I can see.

Here, check it out:

Expand|Select|Wrap|Line Numbers

AACC

AACG <-- Same thing as last one, just change the 4th letter

AATC <

AATG <These two are the same as above, but only 3rd letter has changed

ACCC <

ACCG <

ACTC <

ACTG <These 4 are the same as above 4, but 2nd letter has changed

AGCC <

AGCG <

AGTC <

AGTG <Again, for another possible combo of 2nd letter

ATCC <

ATCG <

ATTC <

ATTG <Again, for another possible combo of 2nd letter

NOW, everything is the same as above except first letter has changed

TACC

TACG

TATC

TATG

TCCC

TCCG

TCTC

TCTG

TGCC

TGCG

TGTC

TGTG

TTCC

TTCG

TTTC

TTTG

Tell me if you see the same pattern. If so, if this is the way I did it (by copying and pasting parts, and changing others) do you see a possible solution forming?

If so, please try and put together some logic or pseudocode. I hope we're getting somewhere...