Seth Chandler wrote:
>
> Suppose one has a list of ordered pairs oplist such as
> {{1,5},{1,6},{2,3},{2,3} ... }. All the values in the list are positive
> integers. The highest value in oplist is known to be some value z.
>
> Now one wishes to form a matrix such that the value at element i,j is equal
> to
> Count[oplist,{i,j}]. One terribly slow way to do this is
>
> Table[Count[oplist,{i,j}],{i,1,z},{j,1,z}]
>
> A faster way to do this is as follows:
> Fold[ReplacePart[#,Extract[#,#2]+1,{#2}]&,Table[0,{z},{z}],oplist]
>
> Does anyone have a considerably faster way? A speed up of 100% or more would
> be very helpful.
>
> For what it's worth Count[oplist,{something_,_}] is the same for all values
> of something.
>
> P.S. The problem arises in converting a representation of a directed graph
> into a Markov transition matrix.
>
> Thanks,
>
> Seth J. Chandler
> Associate Professor of Law
> University of Houston Law Center
A straightforward binning procedure will be about as fast as you can
get. Since everything is simple machine integer arithmetic, I use
Compile and that appears to make things 5-10 times faster than
otherwise.
n = 100;
edges = Table[{Random[Integer,{1,n}],Random[Integer,{1,n}]}, {n^3}];
convertToMarkovMatrix = Compile[{{n,_Integer}, {edges,_Integer,2}},
Module[{mat = Table[0,{n},{n}]},
Map[(mat[[#[[1]],#[[2]]]] += 1)&, edges];
mat
]];
In[39]:= Timing[mat1 = convertToMarkovMatrix[n,edges];]
Out[39]= {3.77 Second, Null}
To see how it is without Compile, try:
Timing[mat2 = Module[{mat = Table[0,{n},{n}]},
Map[(mat[[#[[1]],#[[2]]]] += 1)&, edges];
mat
];]
I am unable to compare this to your Fold[...] code because that gives
error messages such as
ReplacePart::psl:
Position specification {{6, 58}} in
ReplacePart[{{0, 0, 0, <<95>>, 0, 0}, <<99>>}, <<2>>]
is not an integer or a list of integers.
Extract::partw:
Part 56 of ReplacePart[{{0, 0, 0, <<95>>, 0, 0}, <<99>>}, <<2>>]
does not exist.
For this example,
Timing[mat3 = Table[Count[edges,{i,j}],{i,1,n},{j,1,n}]]
will take longer than I am willing to wait. So I'll use only n^2 edges
(1% of them).
In[42]:= e2 = Take[edges,n^2];
In[43]:= Timing[mat3 = Table[Count[e2,{i,j}],{i,1,n},{j,1,n}];]
Out[43]= {199.84 Second, Null}
We'll compare it to the bin count method to see just what improvement
factor we get in this case.
In[44]:= Timing[mat4 = convertToMarkovMatrix[n,e2];]
Out[44]= {0.04 Second, Null}
In[45]:= mat4===mat3
Out[45]= True
The bin count method is O(max(n^2,#edges)) for speed, which is what we
want for this problem. Though if the edge set is small one might do
better with a sparse representation of the matrix, as you could then get
rid of the n^2 term.
Daniel Lichtblau
Wolfram Research