From: ice
Date: September 28 2013 12:49pm
Subject: Re: Tablespaces large enough, but still get table full error
List-Archive: http://lists.mysql.com/cluster/8594
Message-Id:
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit
Giovane gmail.com> writes:
>
> Hi David,
>
> Thanks for your answers.
>
> Just for curiosity, may I ask the size of your largest table in the
> cluster, and how much you reserved for DataMemory and IndexMemory?
>
> And yeah, I've tried partitioning and it works fine. But you're still
> stuck with a single machine, that is, a memory and CPU bottleneck.
>
> But the thing is that I'm interested in a more scientific
> measurement, so I'm comparing two scenarios: storing the data
> centralized (both with and without partitioning) and
> distributed (with 2, 3, 4, ... data nodes).
>
> But I still cannot make the cluster work, that is, import the 4GB of data.
> And I read all the documentation, created the tablespaces, set the
> tables to store on disk, ran it many times, and it's still not working :(
>
> To be honest, I do not know what else I could do to make it work.
>
> Thanks and regards,
>
> Giovane
>
> On Thu, Jun 25, 2009 at 11:10 AM, David Ennis techwest.ie> wrote:
> > Hi Giovane.
> >
> > What's described makes sense to me - and looks similar to simple trials I've
> > done before with millions of lines inserted and various fields in disk-based
> > storage - all starting from the simple tutorials.
> > However, a few (non-cluster) items I thought I'd share.
> > Sorry.. They don't answer your questions below, but for what you describe,
> > if you aren't looking for fault tolerance, perhaps a non-NDB solution will
> > work..
> >
> > Regarding cluster speed (someone please correct me here, if needed):
> > - Disk-based storage in cluster is not particularly fast
> > - To use disk-based storage, those fields cannot be part of any index - so
> > if those fields are an integral part of any query, the lack of indexes will
> > itself cause poor query performance
> > - NDB is not very good at all with any queries involving joins
> >
> > If you are purely looking for a performance increase on reads only,
> > depending on how you use the data, I'd suggest partitioning (or maybe
> > subpartitioning) your table.
> > Just take your existing CREATE TABLE statement, add the partition-specific
> > clauses to it, and then SELECT INTO the (new) partitioned table - I believe
> > this would be faster than altering the existing table and adding partitions
> > under it.
> >
> > - Perhaps a range partition on your 'end_time' field or something
> > - Partitioning works with NDB, MyISAM and InnoDB, but is still fastest on
> > MyISAM
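[Inline note from ice: a sketch of the partition-into-a-new-table approach David describes, using the column names already mentioned in this thread. The table names, partition boundaries, and column types are illustrative guesses, not Giovane's actual schema. One gotcha worth knowing: MySQL requires the partitioning column to appear in every unique key, including the primary key, which is why `end_time` is added to the PK here.]

```sql
-- Hypothetical sketch: a new MyISAM table range-partitioned on end_time.
-- MySQL requires every unique key (incl. the PK) to contain the
-- partitioning column, so end_time is appended to the primary key.
CREATE TABLE flows_part (
  start_time INT UNSIGNED NOT NULL,
  flowid     INT UNSIGNED NOT NULL,
  end_time   INT UNSIGNED NOT NULL,
  PRIMARY KEY (start_time, flowid, end_time)
) ENGINE=MyISAM
PARTITION BY RANGE (end_time) (
  PARTITION p2009h1 VALUES LESS THAN (1246406400),  -- 2009-07-01 UTC
  PARTITION p2009h2 VALUES LESS THAN (1262304000),  -- 2010-01-01 UTC
  PARTITION pmax    VALUES LESS THAN MAXVALUE
);

-- Populate from the existing table rather than ALTERing it in place,
-- as David suggests:
INSERT INTO flows_part SELECT start_time, flowid, end_time FROM flows;
```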
> >
> > -David
> >
> >
> >
> >
> > Giovane wrote:
> >>
> >> Hi Andrew and Augusto,
> >>
> >> Thanks for your replies.
> >>
> >> Firstly I've included a primary key: PRIMARY KEY (`start_time`,`flowid`).
> >>
> >> Then, I've played a bit with DataMemory and IndexMemory and tried to
> >> insert the 4GB data.
> >> My input file has 60M lines, and what I got for each case was:
> >>
> >> Case 1:
> >> DataMemory: 400M
> >> IndexMemory: 1200M
> >> # of lines inserted before full table error: 12M
> >>
> >> Case 2:
> >> DataMemory: 800M
> >> IndexMemory: 800M
> >> # of lines inserted before full table error: 25M
> >>
> >> Case 3:
> >> DataMemory: 1400M
> >> IndexMemory: 300M
> >> # of lines inserted before full table error:46M
> >>
> >> The conclusion from this is that the DataMemory parameter matters more
> >> than IndexMemory while inserting data.
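[Inline note from ice: Giovane's three trials actually support that conclusion quite consistently; rows inserted before the error scale almost linearly with DataMemory, at roughly 30-33 thousand rows per MB. A quick check of the quoted numbers:]

```python
# Rows inserted before the "table is full" error, per MB of DataMemory,
# using the three trial results quoted above.
trials = [
    (400, 12_000_000),   # Case 1: DataMemory 400M
    (800, 25_000_000),   # Case 2: DataMemory 800M
    (1400, 46_000_000),  # Case 3: DataMemory 1400M
]

for data_memory_mb, rows in trials:
    # -> 30000, 31250, 32857 rows per MB: nearly constant, so DataMemory
    # is the binding limit, not IndexMemory.
    print(f"DataMemory {data_memory_mb}M -> {rows // data_memory_mb} rows per MB")
```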
> >>
> >> I'll explain what I'm trying to do, and I would like to hear from you
> >> whether you think it is feasible (I checked the MySQL docs and they say
> >> it is, but so far I've only been failing).
> >>
> >> I installed the MySQL server on a single machine, not a cluster. Then I
> >> created a table using the MyISAM engine, and finally I imported the 4GB
> >> input file. No problem at all (and the machine has 3GB RAM).
> >>
> >> Then I thought: how would MySQL Cluster improve my response time? Can I
> >> store the 4GB file on 2 machines (the same model as the one used in
> >> the "centralized" case) and use the cluster to parallelize a single
> >> query, improving performance? Again, the documentation says yes.
> >>
> >> So I tried to do all the tablespace setup to get this working (because
> >> my data > RAM), but it's not working.
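[Inline note from ice: for reference, the standard disk-data setup looks roughly like the sketch below; the names and sizes are illustrative, not taken from Giovane's actual config. Note that even with STORAGE DISK, primary-key and indexed columns still live in DataMemory/IndexMemory - only non-indexed columns go to the tablespace - which is one reason an import can still hit memory limits.]

```sql
-- Illustrative MySQL Cluster disk-data setup (names/sizes are examples):
CREATE LOGFILE GROUP lg_1
  ADD UNDOFILE 'undo_1.log'
  INITIAL_SIZE 512M
  UNDO_BUFFER_SIZE 32M
  ENGINE NDBCLUSTER;

CREATE TABLESPACE ts_1
  ADD DATAFILE 'data_1.dat'
  USE LOGFILE GROUP lg_1
  INITIAL_SIZE 3G
  ENGINE NDBCLUSTER;

-- Non-indexed columns are stored on disk; PK/indexed columns stay in memory.
CREATE TABLE flows (
  start_time INT UNSIGNED NOT NULL,
  flowid     INT UNSIGNED NOT NULL,
  end_time   INT UNSIGNED,
  PRIMARY KEY (start_time, flowid)
)
TABLESPACE ts_1 STORAGE DISK
ENGINE NDBCLUSTER;
```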
> >>
> >> Does anyone know if there is something wrong with this approach?
> >>
> >> Thanks and regards,
> >>
> >> Giovane
> >>
> >>
> >>
> >
> >
> > --
> > MySQL Cluster Mailing List
> > For list archives: http://lists.mysql.com/cluster
> > To unsubscribe: http://lists.mysql.com/cluster?unsub=gufrgs@gmail.com
> >
> >
>
Hi,
I have the same issue here.
After some tests, I saw that a file imported into MySQL Cluster
disk storage needs about 8-9 times more space in the data file.
The same table on InnoDB has this size:
12K test_table.frm
489M test_table.ibd
After dumping the table, the dump file has this size:
171M test_table.sql
After importing the table into the cluster, with a 3G data file, I get this:
+-------------------+-------------------+---------------------+--------------+----------------+
| FILE_NAME         | Total Extent Size | Total Free In Bytes | % Free Space | EXTRA          |
+-------------------+-------------------+---------------------+--------------+----------------+
| ./data/data_1.dat | 3221225472        | 1871708160          | 58.1055      | CLUSTER_NODE=3 |
| ./data/data_1.dat | 3221225472        | 1871708160          | 58.1055      | CLUSTER_NODE=2 |
+-------------------+-------------------+---------------------+--------------+----------------+
This means it is using about 1300 MB per node:
about 3 times more space than InnoDB and about 8 times bigger than the dump file.
I find this very strange, and I couldn't find anything related to it
on the internet.
Why does a table need so much space in disk storage?
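For anyone who wants to check the arithmetic on the numbers above, a small script (the 489M and 171M figures are the InnoDB .ibd and dump file sizes I reported earlier):

```python
# Space accounting from the tablespace query output above (one data node).
total_extent_size = 3221225472   # bytes, Total Extent Size
total_free        = 1871708160   # bytes, Total Free In Bytes

used_bytes = total_extent_size - total_free
used_mib   = used_bytes / 2**20

print(f"used: {used_bytes} bytes = {used_mib:.0f} MiB")       # ~1287 MiB
print(f"free: {100 * total_free / total_extent_size:.4f} %")  # 58.1055 %

# Compare against the InnoDB .ibd (489M) and the dump file (171M):
print(f"vs InnoDB: {used_mib / 489:.1f}x")   # ~2.6x
print(f"vs dump:   {used_mib / 171:.1f}x")   # ~7.5x
```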
@Giovane: if you need to import a 4GB file, then you need a data file larger
than 32GB to avoid getting a Full Table error.
Of course, if somebody knows of a fix for this, or a way to
reduce the space used in the data files, please let us know.
I want to import tables bigger than 100GB into MySQL Cluster, and
I don't have the disk space for them.
Regards,
ice