Hello everyone, I have just started studying Q-learning and would like to know whether it can be used to solve my problem.

Problem: I need to detect certain combinations of data. I have four matrices that act as inputs to my system, and I have already categorised each input as either Low (L) or High (H). I need to detect certain input types, for example LLLH, LLHH, HHHH, etc.

NOTE:
1) LLLH means the first input is L, the second input is L, the third input is L and the fourth input is H.
2) I have labelled each input type as a state, for example LLLL is state 1, LLLH is state 2, and so on.
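
To make the labelling in the note concrete, here is a minimal Python sketch of the numbering. It assumes the states follow the order LLLL, LLLH, LLHL, ... (only the first two are fixed above, the rest of the ordering is my assumption), and the names `patterns` and `state_of` are just illustrative. It only covers the 16 combinations of four L/H inputs; any additional states would need their own labels.

```python
from itertools import product

LEVELS = "LH"  # L = Low, H = High

# Enumerate every combination of four inputs: LLLL, LLLH, LLHL, ..., HHHH.
# The ordering past LLLH is an assumption, not something fixed by the problem.
patterns = ["".join(p) for p in product(LEVELS, repeat=4)]

# Map each pattern to a 1-based state number (LLLL -> 1, LLLH -> 2, ...).
state_of = {pattern: i for i, pattern in enumerate(patterns, start=1)}

print(state_of["LLLL"], state_of["LLLH"], len(state_of))
```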

From what I have studied about Q-learning, most examples have a single goal (only one state is the goal), which makes it easier for the agent to learn and to build the Q-matrix from the R-matrix. In my problem I have many goals (many states act as goals and need to be detected). I don't know how to design the states, how to create the reward matrix (R-matrix) with many goals, or how the agent will learn. Can you please help me understand how I can use Q-learning in this kind of situation? Keep in mind that I have about 16 goals among 20+ states.