{-# LANGUAGE BangPatterns #-}
import qualified Data.Map as M
import Debug.Trace
{-
I'm trying to run a HAppS web site with a large amount of data: stress
testing happstutorial.com.
Well, 20 million records doesn't sound that large by today's
standards, but anyway that's my goal for now.
I have a standard Data.Map.Map as the base structure for one of my
macid data tables (jobs), but I noticed something
that is probably causing problems for me.
Even a simple 20 million record with int/int key values causes an out
of memory error for me in ghci,
on a computer with 256M of ram.
I'm wondering if there is a data structure that might be more suitable
for large recordsets.
Or do you just have to use a database, or some sort of file-based
serialization, once your records
are in the millions?
Or is this some weird subtlety of lazy evalution, or some other haskell gotcha?
-}
size = 2 * 10^7
-- out of memory error
t = (M.! size) . myFromList . map (\i->(i,i)) $ [1..size]
-- Lists are no problem
{-
*Main> :! time ghc -e tL testMap.hs
(20000000,20000000)
3.38user 0.09system 0:03.53elapsed 98%CPU (0avgtext+0avgdata 0maxresident)k
-}
tL = (!! (size-1)) . map (\i->(i,i)) $ [1..size]
t2 = (M.fromList . map (\i->(i,i)) $ [1..10] )
M.\\ (M.fromList . map (\i->(i,i)) $ [6..15])
-- does this evaluate all of list l, or just whnf?
myFromList (!l) = M.fromList l