Andrzej Kozlowski wrote:
>
> This is very efficient indeed and beutifully illustrates the usefullnes of
> the amazing Sequence function. The only slight weakness that I can see is
> when dealing with very long lists with a small number of distinct elements.
> Since we know that the final output must have the same number of elements as
> given by the (very fast) built-in Union function, for such lists one may do
> better by a code which stops checking elements of li once it has constructed
> a list of length Length[Union[li]]. Here is a modified code that does this:
>
> In[2]:=
> OrderedUnion1[li_]:=Block[{i,counter=0,l0={},m=Length[Union[li]]},
> i[n_]:=(i[n]=Sequence[];counter=counter+1;n);
> Scan[If[counter<m,l0={l0,i[#]},Return[Flatten[l0]]]&,li];Flatten[l0]]
>
> On very long lists with a small number of distinct elements this is somewhat
> faster than OrderedUnion:
>
> In[3]:=
> list1=Table[Random[Integer,{1,9}],{10000}];
>
> In[4]:=
> OrderedUnion[list1];//Timing
> Out[4]=
> {0.166667 Second, Null}
>
> In[5]:=
> OrderedUnion1[list1];//Timing
> Out[5]=
> {0.0833333 Second, Null}
>
> (Mac PowerBook G3 233 Mhz).
>
> However, in the case of lists with very few or no repetitions the balance is
> naturally reversed:
>
> In[6]:=
> list2=Range[1000];
>
> In[7]:=
> OrderedUnion[list2];//Timing
> Out[7]=
> {0.15 Second, Null}
>
> In[8]:=
> OrderedUnion1[list2];//Timing
> Out[8]=
> {0.25 Second, Null}
Use of Union or Sort will make the asymptotic complexity worse
regardless of size of result, but the crossover may be so large when
dealing with lists of machine integers or machine reals that it hardly
matters, at least in version 4. I'd not be surprised if it is on the
order of a few million.
> What's worse, on my Mac if I try to apply OrderedUnion1 to
> list2=Range[10000] I get a crash, caused presumably by the Kernel running
> out of memory while building up a huge nested list.
This problem was brought to my attention by Colin Rose. I need to look
into it.
> For those with a taste for weird Mathematica programs here is an unusual
> (and unfortunately inefficient) example:
>
> In[10]:=
> OrderedUnion2[l_]:=Module[{a},Reverse[GroebnerBasis[Map[a,l]]]/.a[k_]->k]
>
> e.g.
>
> In[11]:=
> OrderedUnion2[{3,2,4,5,a,a,3,6,5,2,8}]
> Out[11]=
> {3, 2, 4, 5, a, 6, 8}
This relies on the fact that GroebnerBasis orders its variables as they
appear in the input. Not really safe to rely on this as it is not
documented behavior. Specifically, default ordering is unspecified.
> --
> Andrzej Kozlowski
> Toyama International University
> JAPAN
> http://sigma.tuins.ac.jp
> http://eri2.tuins.ac.jp
>
Daniel Lichtblau
Wolfram Research
PS I agree Carl Woll's method is really nice. I played around with the
problem before I saw his, also had an O(n+m) method where n = size in
and m = size out. His method gave timings consistently around 2/3 of
mine on large problems.