This is a trickier problem than it at first appears, and indeed there are several pitfalls which prevent us from using more “standard” techniques to arrive at a solution.

There are perhaps two main (hidden) obstacles, neither of which was immediately obvious from the examples I gave. Firstly, we are prevented from using a construction involving a SEARCH-approach (e.g. by locating occurrences of each substring of the four types *????*, †????*, *????† and †????†, as John Jairo V attempted), since this presumes that there is only one occurrence of each of those substring types within our string, a presumption which cannot be made.

Secondly, we have to be careful with attempted solutions which involve tests for numericalness of a string which do not test each of the individual characters within that string.

Although it is true that, as I said, there will only ever be one occurrence of a string of the type XabcdY, where a, b, c and d are numbers, this should not lead us to the false conclusion that testing just the first of these characters as to whether it is a number will be sufficient to conclude that all of b, c and d are also numbers.

As I pointed out in the article here, this approach is risky and not guaranteed to work. One example string which will cause it to fail (at least with a regional language setting of English) is “3Jan”, since Excel is able to coerce that string to a date, and hence to a number.

As such, we need to be a little more subtle in our approach, and some of the techniques we will use are not at all obvious.

And so we have created an array of integers from 1 up to a value 5 less than the length of the string in A2, i.e. 45, to pass to MID as our start_num parameter. This of course makes perfect sense, since we are here interested in substrings consisting of six characters only (the last such string will occupy positions 45 to 50 inclusive).
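For readers who find it easier to follow this step outside of Excel, here is a minimal Python sketch of the same idea. It is only an illustration, not the worksheet formula: the 50-character string `sample` is a hypothetical stand-in for the contents of A2, and the names `WINDOW`, `starts` and `substrings` are my own.

```python
# Hypothetical stand-in for the string in A2, padded out to 50 characters.
sample = "ab*12x4*cd†5678*ef".ljust(50, "z")

WINDOW = 6  # one delimiter + four characters + one delimiter

# 1-based start positions, as MID would receive them: 1 up to len - 5.
starts = list(range(1, len(sample) - WINDOW + 2))

# Counterpart of taking six characters with MID at every start position.
substrings = [sample[s - 1 : s - 1 + WINDOW] for s in starts]

print(len(starts), starts[-1])  # 45 45
print(substrings[10])           # †5678*
```

The last start position is 45 because a six-character substring beginning there occupies positions 45 to 50 inclusive.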

Since we are going to be checking for strings beginning and ending in either an asterisk or an obelisk, it seems logical to make our task simpler by first substituting all cases of one for the other, thus reducing the complexity of the resulting checks we have to make.
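A small Python sketch of this substitution, again on a hypothetical sample string rather than the real contents of A2, shows why it is harmless: swapping one single character for another leaves every position in the string unchanged, so any position we find in the substituted string is equally valid in the original.

```python
# Hypothetical stand-in for the string in A2.
sample = "ab*12x4*cd†5678*ef".ljust(50, "z")

# In the spirit of Excel's SUBSTITUTE: every obelisk becomes an asterisk.
normalised = sample.replace("†", "*")

# Positions (1-based) at which the two strings differ.
changed = [i + 1 for i, (a, b) in enumerate(zip(sample, normalised)) if a != b]

print(len(sample) == len(normalised))  # True
print(changed)                         # [11]
```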

Note that the array of six elements which we passed as MID’s start_num parameter here necessarily had an orientation orthogonal to that of the large array being processed.

Since that array is a single-column array consisting of 45 elements, we thus need to ensure that our array of start_num parameters is a single-row array. As such, the matrix resulting from this operation will be a 45-row-by-6-column array, the 270 entries of which correspond to applying MID with a start_num parameter of, in turn, each of the values from 1 to 6 to each of the 45 entries in that large array.
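The effect of pairing a column of 45 substrings with a row of six offsets can be sketched in Python as a nested loop. This is an illustration under assumptions only: the sample string is hypothetical, and `offsets` and `grid` are names I have chosen for the row of start_num values and the resulting matrix.

```python
# Hypothetical sample, with obelisks already replaced by asterisks.
sample = "ab*12x4*cd†5678*ef".ljust(50, "z").replace("†", "*")

# The "single-column" array: 45 substrings of six characters each.
substrings = [sample[i : i + 6] for i in range(len(sample) - 5)]

# The "single-row" array of start_num values.
offsets = [1, 2, 3, 4, 5, 6]

# One character per (row, column) pair, as MID(substring, offset, 1) would give.
grid = [[sub[k - 1] for k in offsets] for sub in substrings]

print(len(grid), len(grid[0]))         # 45 6
print(sum(len(row) for row in grid))   # 270
print(grid[10])                        # ['*', '5', '6', '7', '8', '*']
```

Had both arrays shared the same orientation, Excel would instead have paired them element by element rather than producing every combination.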

And so we have created an array consisting of all six characters for each of the 45 substrings of length six from our string in A2. What we now need to do is, for each of these 45 groups, to test each of the six characters within them according to the criteria we have laid out.

And those criteria are that the first and last characters be an asterisk (recall that we first substituted out the obelisks) and that the 2nd, 3rd, 4th and 5th characters be numeric.

Now, we could of course apply six separate tests to that effect to each of the groups within our large array. However, such an approach would make any solution extremely cumbersome, not to mention lengthy.

What we really want to be able to do is to somehow perform each of these six tests simultaneously over all 45 groups in our array.

Things are perhaps made a tad more amenable to such an approach by virtue of the fact that we have, in total, just two different tests to perform, i.e. one for an asterisk (to 2 of the characters) and one for numericalness (to the remaining – middle – 4 characters).

Now, in order to perform such simultaneous testing, we need to resort to some logical acrobatics. Although such an approach may appear a little convoluted, it is justified (in my opinion) by virtue of the fact that we are able to construct a formula set-up in perhaps a fifth of the characters that we would have were we to instead employ multiple, individual tests and then somehow collate those results.

Allow me to demonstrate how we can achieve such simultaneous criteria testing here.

The reason for the choice of 42 is nothing to do with it purportedly being the answer to the ultimate question of life, the universe and everything, but rather due to the fact that 42 is the ASCII code for the asterisk (*).

Hence, by subtracting 42 from our array of values, we know that any zeroes in there must relate to asterisks within our substrings. And the reason we wish to associate zeroes with those characters is that we can then employ an abridged IFERROR technique in our solution.

the point being that any zeroes from the original array result in #DIV/0! whilst non-zero values remain unchanged. Readers may see here for more on this technique if they wish.
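The behaviour of this reciprocal-of-a-reciprocal construction can be mimicked in Python as follows. This is a rough sketch only: Python raises an exception where Excel returns #DIV/0!, so the hypothetical `DIV0` constant and the helper `reciprocal_of_reciprocal` are stand-ins of my own.

```python
DIV0 = "#DIV/0!"  # hypothetical stand-in for Excel's error value


def reciprocal_of_reciprocal(x):
    """Mimic 1/(1/x): zero gives an error, any other value comes back unchanged
    (up to floating-point rounding)."""
    try:
        return 1 / (1 / x)
    except ZeroDivisionError:
        return DIV0


# Character codes of one six-character substring, less 42 (the code for "*").
shifted = [ord(ch) - 42 for ch in "*5678*"]
resolved = [reciprocal_of_reciprocal(v) for v in shifted]

print(shifted)                   # [0, 11, 12, 13, 14, 0]
print(resolved[0], resolved[5])  # #DIV/0! #DIV/0!
```

Only the two asterisks, whose shifted codes are zero, end up as errors; the four middle values survive the round trip.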

Now, since we are interested in identifying not only asterisks in our substring but also numerics, we can use the fact that the ASCII codes for the digits 0-9 range from 48 to 57 inclusive.

The usual way in which we would test an array of ASCII codes as to whether they fall somewhere in this range would be to subtract them from 52.5, take the absolute value of the results and test whether these are less than 5 (52.5-48=4.5 and 52.5-57=-4.5, whereas the neighbouring codes 47 and 58 each give an absolute value of 5.5).

(Readers who have not seen this method of employing a single test using the absolute values rather than two separate tests – one involving less than and one involving greater than – should take note of the abbreviation it offers us.)
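To convince ourselves that the single absolute-value test really is equivalent to the usual pair of comparisons, we can check every code from 0 to 255 in a few lines of Python. The function names here are my own and purely illustrative.

```python
def is_digit_code(code):
    """Single test: the code lies within 4.5 of 52.5, the midpoint of 48..57."""
    return abs(52.5 - code) < 5


def is_digit_code_two_tests(code):
    """The longer alternative: one 'at least' and one 'at most' comparison."""
    return code >= 48 and code <= 57


# Compare the two approaches over every single-byte character code.
agree = all(is_digit_code(c) == is_digit_code_two_tests(c) for c in range(256))

print(agree)                                 # True
print(is_digit_code(47), is_digit_code(58))  # False False
```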

Here, however, we have already subtracted 42 from our array of ASCII codes and so, instead of the above-mentioned 52.5 we will need to use 10.5, which means that:

We will shortly be testing each of the six elements in each of the 45 groups in this large array as to whether they meet our criteria. Firstly, though, we need to resolve our #DIV/0!s. And we need to resolve them to a value such that, when we perform our test on them, we ensure that only those values satisfy whatever criterion we apply.

Hence my choice of an arbitrarily large negative value of -1000 for the IFERROR. If we now resolve this IFERROR, we see that:

And if we now perform a comparison, for each of these 45 groups, as to whether each of the six elements within that group is less than the corresponding element in the following array:

{-999,5,5,5,5,-999}

we know that only an asterisk in the original string will satisfy the criterion of having a value less than -999 in the array we have created. We also know, from the discussion earlier, that the only other characters with a value less than 5 are the numerics. (Strictly speaking, the -1000 which now represents an asterisk is itself less than 5, so the test on the middle four positions relies on no asterisk falling among them.)

All in all, then, if any group of six elements satisfies all of the above criteria, we can conclude that it must begin and end with an asterisk and also have a digit from 0-9 for each of its middle four characters.
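The whole test can be put together in a short Python sketch. As before, this is an illustration under assumptions rather than the worksheet formula itself: the sample string is hypothetical, and `score`, `matches` and `THRESHOLDS` are names of my own. The values 42, 10.5, -1000 and the threshold array are those discussed above.

```python
THRESHOLDS = [-999, 5, 5, 5, 5, -999]


def score(ch):
    """Value a single character ends up with after the steps described above."""
    shifted = ord(ch) - 42
    if shifted == 0:
        # An asterisk: 1/(1/0) would be #DIV/0!, which we resolve to -1000.
        return -1000
    return abs(10.5 - shifted)  # 10.5 = 52.5 - 42


def matches(sub):
    """True if every character scores below its corresponding threshold."""
    return all(score(c) < t for c, t in zip(sub, THRESHOLDS))


# Hypothetical sample, with obelisks already replaced by asterisks.
sample = "ab*12x4*cd†5678*ef".ljust(50, "z").replace("†", "*")

# 1-based start positions of every six-character group passing all six tests.
hits = [i + 1 for i in range(len(sample) - 5) if matches(sample[i : i + 6])]

print(hits)               # [11]
print(matches("*12x4*"))  # False
print(matches("**123*"))  # True
```

The group beginning at position 3, "*12x4*", is rejected because of the "x", so only the group at position 11 qualifies. The last line shows the one edge case in this sketch: an asterisk in a middle position also scores -1000 and so slips under the threshold of 5.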

We then subtract 1 from this value, since we are now going to apply LEFT to our original string using this value as a parameter and clearly we do not wish to have a result which ends in an asterisk or obelisk.

and it is now a relatively straightforward task to extract our desired string using a standard technique. I will not give a detailed explanation of the remainder of the resolution of this formula, though readers may see here for a breakdown of how this technique with TRIM, SUBSTITUTE and REPT works if they wish.