And the merge should have the rows of Table1 and the rows of Table2 where the "prefix" of Table2.Column2 is Table1.Column2. Take into account that I do not know the number of chars of the prefix, I mean, Table1.Column2 can have any lenght not greater than 7

Re: Complex PowerQuery Merge query using substrings

A different way to perform a lookup is to use "Table.SelectRows", although this might be slower - so it is essential that you use the Table.Buffer and for very large table try to partition also. But the basic principle can look as follows:

Re: Complex PowerQuery Merge query using substrings

well there needs to be a rule about the length of the merged characters. It is not important how long the second table are, but it is important how long the characters in the first table are. Your example suggests that there can be 3 or 4 character lenght in the first table. Assuming that is a typo, I would do this.

Load table 2

duplicate column 2

split column 2 at 4 characters

delete the remainder column

join table 1 on table 2 with the new column.

If your 3 character 443 in table is not a typo, you are in a world of trouble - unless you can identify a rule on how to split the column in table 2.

* Matt is a Microsoft MVP (Power BI) and author of the Power BI Book Supercharge Power BI.

Re: Complex PowerQuery Merge query using substrings

A different way to perform a lookup is to use "Table.SelectRows", although this might be slower - so it is essential that you use the Table.Buffer and for very large table try to partition also. But the basic principle can look as follows:

A way you could possibly do this would be to duplicate your query 6 times, split off a different number of characters in each query, and merge only matching rows to the original table. Then at the end you'd need to append all the tables.

Re: Complex PowerQuery Merge query using substrings

This is awesome solution.I've spend few days trying different ways of achieving same result, but this is by far the best - fastest and cleanest.

I wonder if this could be made into custom function?

Also, is there a solution to speed up subsequent actions after this join? To be honest, performance of PQ after this kind of join falls of the cliff and even simple filter takes ages (and this is for a dimension kind of table, with just about 1000 rows)

Re: Complex PowerQuery Merge query using substrings

The solution below I applied similarly in an Excel / Power Query application I recently developed ( @ImkeF knows: the account matching application), which is used in an interactive way, so it had to be fast.

You can add key columns to Table 2, each with the first n (1-7) characters of the value in Column2, acting as key columns. I did so in below query Table2With7Keys.

Next, you can add a column to Table1 with the length of the value in Column2, and group on that length, so you have partitioned tables for each key length.

Now you can merge those nested tables with Table2, using the key column with the corresponding key length.