Tuesday, April 28, 2009

The CCK ‘display fields’ settings are very useful for theming nodes by content type, but I often find myself having to get further into customizing node output field by field.

Once I decide there’s no way around taking finer control of my node’s fields, I create a node-{type}.tpl.php by copying my existing node.tpl.php and open up Devel’s ‘Node Render’ tab. Then I delete the $content variable from the template and start adding things like

<?php print $node->content['field_image']['#value']; ?>

and

<?php print $node->content['body']['#value']; ?>

The $content variable is nice for amateurs, but we need the unrendered $node->content to get good control of a node.

While this gives me complete control over the output of the node’s fields, the drawback is that now if I add a new field to the node, or if I add a module that adds something to $node->content, I have to go back to my template and add in this new element. Because I often do my theming and development in parallel, this can be rather annoying, and there is also a danger that I could overlook doing it.

Therefore I think it may be more practical to use code like this in the node template:

Instead of printing each field that is wanted in the template, I can instead unset the fields that are not wanted and then print all the fields. This is faster to write and it lets any new fields show up. After unsetting unwanted fields it looks like this:
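The code sample that belongs here did not survive, but based on the description the sketch below shows the idea. The field names are examples, and the use of drupal_render() to print everything not yet unset is an assumption about the author's approach, not their exact code:

```php
<?php
// Sketch for a node-{type}.tpl.php: remove the elements we do NOT want,
// then render everything that is left. New fields and module additions
// to $node->content will still show up without editing the template.
unset($node->content['field_image']);
unset($node->content['field_internal_notes']);  // hypothetical field name

print drupal_render($node->content);
?>
```

Because drupal_render() marks each element as printed, anything rendered explicitly earlier in the template is automatically skipped here.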

Now I know that these template files are not supposed to get all mucked up with php, so this method is really just a hack and won’t be appreciated if you are working with a designer who dislikes php. But if you, like me, were finding yourself dissecting $node->content into crazy pieces with tons of php in your node templates already, perhaps this method could be a slight improvement.

Comments

sirkitree (not verified) on February 12, 2008 - 3:40am

Yes it is better to get this done in template.php and then have your .tpl file clean for a designer, but this is a key step that should be learned on the way there. After that, if you find your template.php file getting overcrowded, you can split out your $vars into node-[content-type].vars.php.


jody on February 12, 2008 - 10:09am

I wonder if we could combine these two methods. In _phptemplate_variables maybe we could assign all the fields we want as $vars and also assign a variable to represent all the fields we neither assigned nor unset, like a catch-all.
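A hedged sketch of how that combined idea might look in Drupal 5's template.php (the hook name is real; the variable names and field list are illustrative only):

```php
<?php
// Illustrative only: expose chosen fields as individual $vars and
// collect everything not explicitly rendered into a catch-all.
function _phptemplate_variables($hook, $vars = array()) {
  if ($hook == 'node') {
    $node = $vars['node'];
    // Fields we want to place individually in the template.
    foreach (array('field_image', 'body') as $key) {
      if (isset($node->content[$key])) {
        $vars[$key . '_rendered'] = drupal_render($node->content[$key]);
      }
    }
    // drupal_render() skips already-printed elements, so this picks up
    // only what was neither rendered above nor unset.
    $vars['remaining_content'] = drupal_render($node->content);
  }
  return $vars;
}
?>
```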


adrinux (not verified) on February 12, 2008 - 6:25am

Fantastic write up! I think I’ll be using this.


moshe weitzman (not verified) on February 12, 2008 - 8:14am

very nice writeup. we want to improve this experience for themers in D7. they will have to use a few simple php functions but the temptation to override $content will be much smaller. we will introduce render() and hide() so you can put pieces of $content where you want without wrecking future enhancements. your writeup is current state of the art though - thanks for sharing.


jody on February 12, 2008 - 10:00am

Eaton’s goals in that thread seem to be exactly right - I’m really glad to see people are working hard on this issue.


Alan Burke (not verified) on February 12, 2008 - 8:45am

There’s a better way. See the theme folder of the CCK module. There’s a readme txt file which explains how to have individual template files for each CCK field.

Regards, Alan

I Thought I had added it to Drupal.org…but I can’t find it now


jody on February 12, 2008 - 9:36am

Hi Alan,

I checked out that CCK theme README and yes it does explain how to make template files for individual fields. I think that most of the time that won’t be helpful to me. A good example of what I’m most often trying to do with my fields is to just stick a few of them in the same div in the node template. I think a better example of what I’m doing may be:
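The example itself is missing here; a sketch of the kind of thing described (a few fields wrapped in one div, with hypothetical field names) might be:

```php
<?php
// Hypothetical example: group two CCK fields into a single div,
// then render the rest of the node as usual.
?>
<div class="product-summary">
  <?php print $node->content['field_price']['#value']; ?>
  <?php print $node->content['field_sku']['#value']; ?>
</div>
<?php
// Unset them so they are not printed a second time below.
unset($node->content['field_price'], $node->content['field_sku']);
print drupal_render($node->content);
?>
```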

I have a quick question. How would you do this for CCK fields that are inside a fieldgroup?

Thanks again :)


jody on February 13, 2008 - 8:44pm

That’s a good question- using fieldgroups makes this even more complex. Here’s an idea showing some fields from a fieldgroup being explicitly displayed, and then anything else in the fieldgroup gets printed last.
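The code did not survive in this copy; a sketch of the idea (group and field names are examples, and it assumes the fieldgroup appears in $node->content as a nested element):

```php
<?php
// Print a couple of the fieldgroup's fields explicitly...
$group = &$node->content['group_specs'];
print drupal_render($group['field_weight']);
print drupal_render($group['field_height']);

// ...then anything else left in the fieldgroup...
print drupal_render($group);

// ...and finally the rest of the node's content.
print drupal_render($node->content);
?>
```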

Thank you Jody! I was struggling with the fieldsets for quite a while. This is perfect :)


John (not verified) on March 3, 2008 - 12:45am

Thank you for this info. I have used it in a site and it has been helpful. I ran into an issue that unfortunately limits its usefulness however, in that the node content bit that I have pulled into a block is not rendering PHP. I have PHP enabled as the input format on both the node and the block. The PHP [a call to base_path()] shows in the source but is not processed as PHP. Has anyone experienced this and found a solution?


John (not verified) on March 12, 2008 - 1:38pm

I think I was trying to do too much with this. While I never did really find an answer to my issue, I did resolve the situation by using “CCK Blocks” to get my additional fields into blocks that would evaluate PHP.

So your technique was helpful in that I could add fields to a node but keep some of them from displaying, for example in a typical content-in-center-column layout. Then through CCK Blocks I could use those fields elsewhere on the page (left column in my case).

Thanks again for taking the time to post this helpful technique.


jody on March 12, 2008 - 1:46pm

I would be concerned if PHP was not evaluating in my blocks. I hope that you did the obvious and enclosed it in php tags. I have never heard of this issue.

As regards simply not showing certain of your cck fields, you can take care of this by visiting the ‘display settings’ tab of your content type’s administration. The technique posted here is intended for more advanced theming needs.


Martijn (not verified) on March 19, 2008 - 8:48am

Hi Jody,

Excellent writeup. This helps a lot with Drupal theming! You yourself have a great theme also!

greetings, Martijn


Chacha (not verified) on March 25, 2008 - 4:39pm

Hey!! So good to find your website. This write up is great. I am now going to try unset on my dumb annoying fields right now!


Marcus (not verified) on May 30, 2008 - 6:57pm

Jody,

Thanks for posting these node theming tips. The example code you set forth has been immensely helpful.

How can I alter my timeout setting?

Hey there, I have a rather long script that makes several connections to a MySQL database. Sometimes, if there is a problem with the script, or a bug in my code, or something else unexpected, the connections don’t get closed. Is there a way that I can set MySQL to perceive when a thread has been idle for, say, 5 minutes and kill it for me? Like a timeout function. I have googled and searched around here, but I am having trouble finding out just where to do this. I have MySQL Administrator and phpMyAdmin that I can use if it’s any easier. Thanks

It does. The wait_timeout setting defaults to 28,800 seconds, and MySQL will kill inactive threads after this. You can set it while MySQL is running with "SET wait_timeout = $value;", or set it directly in your my.ini file.

The problem was in the my.ini file (for Windows only) that was to be edited, not the config.ini. In this file we have to set the timeout by setting innodb_lock_wait_timeout.

The InnoDB error codes page lists the following information:

1205 (ER_LOCK_WAIT_TIMEOUT)

Lock wait timeout expired. Transaction was rolled back.

So how can we prevent that from happening?

This happens because the value of innodb_lock_wait_timeout was not enough for InnoDB to acquire the lock. On my servers it was set to default: 50.

The manual has following information:

The timeout in seconds an InnoDB transaction may wait for a lock before being rolled back. InnoDB automatically detects transaction deadlocks in its own lock table and rolls back the transaction. Beginning with MySQL 4.0.20 and 4.1.2, InnoDB notices locks set using the LOCK TABLES statement. Before that, if you use the LOCK TABLES statement, or other transaction-safe storage engines than InnoDB in the same transaction, a deadlock may arise that InnoDB cannot notice. In cases like this, the timeout is useful to resolve the situation. The default is 50 seconds.
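If 50 seconds is not enough for your workload, the value can be raised in the server configuration; the 120 below is just an example value:

```ini
[mysqld]
# Seconds an InnoDB transaction waits for a row lock before being rolled back.
innodb_lock_wait_timeout = 120
```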

Quick and Dirty MySQL Backup

Until recently, the MySQL databases I work with contained data that could be retrieved from other sources. Most of the data is either batched in from flat files or another database. It would be inconvenient to reload a couple months worth of data, but since these databases are not mission critical, the business could operate without them for a couple days. Lately, we've been implementing some semi-critical systems that rely on a somewhat expedient recovery.

The requirements for the project were that the database must remain up during the backup and losing a day's worth of data was acceptable. All of my regular Oracle readers are cringing at the moment, but hey, those were the rules I was working with.

My first thought was to use mysqlhotcopy because it backed up the actual physical files. However, mysqlhotcopy only allows you to backup MyISAM tables and we extensively use InnoDB.

My next choice was mysqldump. mysqldump basically takes the entire database and dumps a text file containing DDL and DML that will re-create your database. Coming from an Oracle background, I knew there were shortcomings to dumping the entire database, but hopefully I could mitigate them.

The first hurdle was security. I specifically turn off unauthenticated root access on my databases, but I needed to be able to read all the tables to do a backup. I don't want to hard-code my root password or any password in a script as I don't have suicidal tendencies (diagnosed, anyway). So I created a user called backup that could only login from the server machine, but could login unauthenticated.
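The post never shows the statement used; a hedged sketch of such an account on MySQL 5.0 (the empty password gives the unauthenticated local login described, and the privilege list is an assumption) might be:

```sql
-- Hypothetical sketch: a 'backup' account that can connect only from
-- the server itself, with no password and read-only privileges.
CREATE USER 'backup'@'localhost';
GRANT SELECT, LOCK TABLES ON *.* TO 'backup'@'localhost';
FLUSH PRIVILEGES;
```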

The next thing I had to figure out was how to get a consistent view of the data. I knew that my developers preferred InnoDB for its Referential Integrity features and getting inconsistent data would be disastrous. Fortunately, one of the flags to mysqldump is --single-transaction, which essentially takes a snapshot in time.

So I wrote a script around mysqldump and --single-transaction and dumped my entire database to disk. Every now and again, I encountered an "Error 2013: Lost connection to MySQL server during query when dumping table `XYZ` at row: 12345". The row number changed each time, so I figured it had something to do with either activity in the database or memory. I could rerun the command and it usually finished the second or third time.

After the third day straight of my backup failing, I decided to research it a little more. mysqldump has a flag called --quick which bypasses the cache and writes directly to disk. I put this flag in my backup script and the script started finishing more consistently.

The last hurdle was having enough space on disk to store my backups. Since the backup file is really a text file, I decided to pipe the output through gzip to reduce its size.

Currently, my quick and dirty backup script is a wrapper around the following command:
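The command itself is not shown in this copy; based on the flags discussed above, it presumably looked something like the following (the user, path, and date-stamped filename are assumptions):

```shell
#!/bin/sh
# Consistent InnoDB snapshot, unbuffered row-by-row dump, gzipped to disk.
mysqldump --user=backup --single-transaction --quick --all-databases \
  | gzip > /backups/mysql-$(date +%Y%m%d).sql.gz
```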

We're adopting MySQL at a blistering pace, so I'm sure I'll need to make changes in the future. For right now, though, it gets the job done.

Posted by Jeff Hunter at 8:43 PM

10 comments:

Anonymous said...

Hi,

Can you post SQL string you used to create backup user please.

Roland 11/07/2006 06:56:00 AM

gandolf989 said...

Under what situations is MySQL better than Oracle? 11/07/2006 10:10:00 AM

Anonymous said...

While your Database is still relatively small, you might want to look into replication, and backing up the replicated machine.

The Replicated machine will contain an exact snapshot of the live machine in case it goes down, and the backups of the replicated machine are there in case someone accidently deletes or over writes data.

/bryan 11/07/2006 10:43:00 AM

Anonymous said...

Not great I'm afraid. This will cause MyISAM tables to lock for the duration of the dump and will take ages to do InnoDB tables. You should really consider moving everything to InnoDB and buying InnoDB Hot Backup. 11/07/2006 12:20:00 PM

Jeff Hunter said...

"This will cause MyISAM tables to lock for the duration of the dump and will take ages to do InnoDB tables. You should really consider moving everything to InnoDB and buying InnoDB Hot Backup."

We don't use MyISAM and I don't think 15 minutes is ages. I've investigated InnoDB backup and when the time is right, we'll probably get it. But right now, the quick and dirty method works well with my business rules. 11/07/2006 06:08:00 PM

Anonymous said...

As Roland said above, what was the command to create the backup user? 12/25/2006 02:44:00 AM

Anonymous said...

What if you want to restore only one database? That would be a pain using --all-databases and gzipping everything.

I wrote a small PHP script to handle all my backup needs, I also feel that it's enough with a nightly backup and it's ok to lose one day of databases if we have a crash (at least so far), so I made this script: http://iblins01.quidnovo.se/~sebjma/backup.phps

The script is called via crontab during low traffic times (usually at night) and will produce gzipped files for each database on the system. Backups older than a predefined time limit are deleted from the filesystem.

Saturday, April 25, 2009

You can store each InnoDB table and its indexes in its own file. This feature is called “multiple tablespaces” because in effect each table has its own tablespace.

Using multiple tablespaces can be beneficial to users who want to move specific tables to separate physical disks or who wish to restore backups of single tables quickly without interrupting the use of other InnoDB tables.

To enable multiple tablespaces, start the server with the --innodb_file_per_table option. For example, add a line to the [mysqld] section of my.cnf:

[mysqld]
innodb_file_per_table

With multiple tablespaces enabled, InnoDB stores each newly created table into its own tbl_name.ibd file in the database directory where the table belongs. This is similar to what the MyISAM storage engine does, but MyISAM divides the table into a tbl_name.MYD data file and a tbl_name.MYI index file. For InnoDB, the data and the indexes are stored together in the .ibd file. The tbl_name.frm file is still created as usual.

You cannot freely move .ibd files between database directories as you can with MyISAM table files. This is because the table definition that is stored in the InnoDB shared tablespace includes the database name, and because InnoDB must preserve the consistency of transaction IDs and log sequence numbers.

If you remove the innodb_file_per_table line from my.cnf and restart the server, InnoDB creates tables inside the shared tablespace files again.

The --innodb_file_per_table option affects only table creation, not access to existing tables. If you start the server with this option, new tables are created using .ibd files, but you can still access tables that exist in the shared tablespace. If you start the server without this option, new tables are created in the shared tablespace, but you can still access any tables that were created using multiple tablespaces.

Note: InnoDB always needs the shared tablespace because it puts its internal data dictionary and undo logs there. The .ibd files are not sufficient for InnoDB to operate.

To move an .ibd file and the associated table from one database to another, use a RENAME TABLE statement:

RENAME TABLE db1.tbl_name TO db2.tbl_name;

If you have a “clean” backup of an .ibd file, you can restore it to the MySQL installation from which it originated as follows:

1. Issue this ALTER TABLE statement to delete the current .ibd file:

ALTER TABLE tbl_name DISCARD TABLESPACE;

2. Copy the backup .ibd file to the proper database directory.

3. Issue this ALTER TABLE statement to tell InnoDB to use the new .ibd file for the table:

ALTER TABLE tbl_name IMPORT TABLESPACE;

In this context, a “clean” .ibd file backup is one for which the following requirements are satisfied:

* There are no uncommitted modifications by transactions in the .ibd file.
* There are no unmerged insert buffer entries in the .ibd file.
* Purge has removed all delete-marked index records from the .ibd file.
* mysqld has flushed all modified pages of the .ibd file from the buffer pool to the file.

You can make a clean backup .ibd file using the following method:

1. Stop all activity from the mysqld server and commit all transactions.

2. Wait until SHOW ENGINE INNODB STATUS shows that there are no active transactions in the database, and the main thread status of InnoDB is Waiting for server activity. Then you can make a copy of the .ibd file.

Another method for making a clean copy of an .ibd file is to use the commercial InnoDB Hot Backup tool:

1. Use InnoDB Hot Backup to back up the InnoDB installation.

2. Start a second mysqld server on the backup and let it clean up the .ibd files in the backup.

Note that the advice above regarding DISCARD TABLESPACE and IMPORT TABLESPACE only works when you restore the .ibd file to the "installation from which it originated." If you try to copy .ibd files into a new database instance, then when you run

ALTER TABLE <table_name> IMPORT TABLESPACE

you will get an error like this:

InnoDB: Error: tablespace id in file './test/<ibd_file_name.ibd>' is 38, but in the InnoDB data dictionary it is 401.

It is possible to copy InnoDB tables between two databases using mysqldump and mysqlimport.

Posted by Ricardo Marques on May 24 2006 5:48pm

Note that .ibd files are always single auto-extending files.

So it's NEITHER possible to set an initial size for .ibd files NOR to set a maximum file size for them.

This information was posted by Heikki Tuuri (creator of the InnoDB transactional storage engine) in the InnoDB forum of the MySQL web site, in January 2006, at:

I am actually using this feature to allow me to manage disk usage. Here are details on what I am trying to achieve; hopefully it will help somebody: http://crazytoon.com/2007/04/03/mysql-ibdata-files-do-not-shrink-on-database-deletion-innodb/

Posted by Dario Souza on September 19 2007 4:00am

If you need to move some or all of your tables to separate .ibd files, you may try the following:

1) Add the "innodb_file_per_table" option to my.cnf
2) Restart mysqld
3) Do an ALTER TABLE <table> ENGINE=InnoDB; on each table you want to move apart. This will recreate the table in its separate .ibd file.

The reverse also applies. Coalescing and/or moving apart tables only requires you to restart mysqld with or without the innodb_file_per_table option flag.

Posted by murtuja khokhar on March 28 2008 8:52am

Thanks Dario Souza,

This query will generate ALTER TABLE ENGINE=InnoDB; type statements for all tables.
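The query itself did not survive in this copy; one way to generate such statements from information_schema (available in MySQL 5.0+) is the following sketch:

```sql
-- Generates one ALTER TABLE ... ENGINE=InnoDB; statement per InnoDB table,
-- skipping the system schemas. Run the generated statements afterwards.
SELECT CONCAT('ALTER TABLE `', table_schema, '`.`', table_name, '` ENGINE=InnoDB;')
FROM information_schema.tables
WHERE engine = 'InnoDB'
  AND table_schema NOT IN ('mysql', 'information_schema');
```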

Recently, a customer wondered if they should start using the innodb_file_per_table option, or if they should continue to use the large InnoDB tablespace files created by the innodb_data_file_path option in the my.cnf option file.

Many people still use the older innodb_data_file_path option because it is the default for MySQL server. So, what are the benefits of using innodb_file_per_table instead?

The innodb_file_per_table option makes for easier-to-manage files. With this option each InnoDB table has its own data and index file under the database directory. As an example, if you had table foo located in database xyz, the InnoDB data file for table foo would be /var/lib/mysql/data/xyz/foo.ibd. Each table would have its own .ibd file in the appropriate database directory. This is in contrast to using the innodb_data_file_path option with (typically) one large file in the root of your data directory, for example /var/lib/mysql/data/ibdata1. All table data and indexes would be stored in this one file, and it can be very large and unwieldy. I don’t recall the largest ibdata file I have seen, but what do you do if you have a 100 gig InnoDB tablespace file? It can, and does, happen. The file contains what amounts to all the data of all your databases on the server.

Also, with the innodb_file_per_table option, you know what table is in each file. You know that table foo is in the foo.ibd file. No other table’s data or index information is going to be in there. There is still a shared tablespace, but it is going to be much smaller and more manageable. It contains the shared meta-data of all the databases, not individual table data. With the shared tablespace option, all data is typically going to be in one file, and you can’t differentiate between databases or tables.

Are there any advantages to using innodb_data_file_path? Glad you asked! There is one. You can specify multiple data files in the innodb_data_file_path option in my.cnf, and “spread the load” over multiple partitions or hard drives. That is the theory. In practice, it is very rarely worth it.

Another question that often comes up is, if we are already using the single tablespace format, can we convert to the innodb_file_per_table option? You can convert from one to the other, but it will require that you stop writes to the server, perform a complete logical backup (i.e. mysqldump) of the server, shut down the server, change the my.cnf configuration to file per table, and then restart the server. You will then need to drop all the current databases and then reimport all your data. Fun!

One last benefit of using innodb_file_per_table is that defragmenting the table (using OPTIMIZE TABLE) will actually free up space in the data file, whereas right now, if you defragment an InnoDB table, the ibdata file will not shrink.

There are more disadvantages of using file per table - you end up consuming significantly more file descriptors (which is a problem particularly for MySQL on Windows). The main InnoDB thread that wakes up every few seconds will also have to flush a lot more files when it checkpoints.

You don’t need to mysqldump the data to convert from the shared global tablespace to the file-per-table model (but you don’t reclaim space unless you do). A CREATE TABLE new_table LIKE old_table; INSERT INTO new_table SELECT * FROM old_table; DROP TABLE old_table; RENAME TABLE new_table TO old_table; is the alternative.

2. Gavin Towey says: June 20, 2008 at 4:14 pm

You don’t even have to do that much to convert tables from the single tablespace to the individual files. After enabling “innodb_file_per_table” and restarting mysql just do “ALTER TABLE foo ENGINE=InnoDB” and it will rebuild the table and pull it out of the shared tablespace.

However if you ever want to reclaim all that shared tablespace then you’ll have to dump the db, shutdown mysql & delete the tablespace / log & .frm files then edit the my.cnf to re-adjust the shared tablespace file sizes and then restart and let mysql build empty files, then import your data back in.
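A hedged sketch of that reclaim procedure as a sequence of commands (file names assume a default InnoDB layout, paths are placeholders, and this is an outline rather than a tested script):

```shell
#!/bin/sh
# Outline of reclaiming the shared tablespace, per the comment above.
mysqldump --all-databases --single-transaction > all.sql   # 1. full logical backup
mysqladmin shutdown                                        # 2. stop the server
rm /var/lib/mysql/ibdata1 /var/lib/mysql/ib_logfile*       # 3. delete shared tablespace and logs
# 4. edit my.cnf: enable innodb_file_per_table and re-adjust the
#    innodb_data_file_path sizes before restarting
mysqld_safe &                                              # 5. restart; InnoDB builds empty files
mysql < all.sql                                            # 6. re-import the data
```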

It’s also been shown that the shared tablespace has a bit faster read/write performance.

3. Xaprb says: June 22, 2008 at 7:34 am

With file-per-table, the shared tablespace is still configurable as usual. It just doesn’t have as much data in it, so you can configure it to be smaller. It does have more than meta-data, though. For example: undo logs, insert buffer, doublewrite buffer, etc. Lots of “real” data — the kind of stuff that will cause lost data if you delete it.

4. phasnox says: July 2, 2008 at 12:15 pm

Hi, great article.. you said “You can specify multiple file paths in my.cnf, and ‘spread the load’ over multiple partitions or hard drives.”

With MyISAM one could easily copy a database directory from one server to another and there it could be used immediately.

Will this work with an InnoDB database that has been split into files too?

7. ICD says: August 6, 2008 at 3:27 pm

On version 5.0.36sp1-enterprise-gpl MySQL Enterprise Server (GPL):

I have innodb_file_per_table set on without innodb_data_file_path set, and the following is true when I check the global variables:

have_innodb has a value of YES
innodb_data_file_path has a value of ibdata1:10M:autoextend
innodb_file_per_table has a value of ON

This to me seems to be a bug. If there will be individual .ibd files, why would I have an ibdata1 created? Secondly, anytime I try setting innodb_data_file_path it turns off the InnoDB engine and have_innodb winds up having a value of NO. The error when I do set innodb_data_file_path is: 080805 7:16:57 [ERROR] InnoDB: syntax error in innodb_data_file_path

I was getting the error “ERROR 1286 (42000): Unknown table engine ‘InnoDB’” when trying to create/alter tables with the InnoDB engine. You can check for this warning by issuing the create or alter statements and then running show warnings; in the same mysql client session.

My problem is that this works on some clients, but not others. It just flashes a command prompt screen indicating it is connecting, but then the screen closes and disappears.

Does anyone have any suggestions? I am at a loss.

Thanks, Lisa

Trammel posted 05-23-2006 05:46 PM: I have had this experience with psexec trying to execute it on remote machines. I have found that using psexec.exe to execute a batch file is more reliable. Not only that, you can stick a pause in the batch file. This will stop the cmd window from disappearing.

Obviously in the batch file is where you put your wuauclt /detectnow command
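A hedged sketch of that setup (the computer name, account, and paths are placeholders, not the poster's actual values):

```bat
rem --- detect.bat: runs the Windows Update detection and pauses so the
rem --- window stays open long enough to read any errors.
wuauclt /detectnow
pause

rem --- invoked from the admin workstation, for example:
rem psexec \\REMOTEPC -u DOMAIN\adminuser -p password "C:\Scripts\detect.bat"
```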

1isaann (unregistered) posted 05-25-2006 06:05 AM: Thanks, Trammel. I created a batch file located at C:\Program Files\Hyena\Scripts\wuaucltdetect with the following commands:

@echo on
wuauclt /detectnow
pause
@echo off

I'm new at batch files, so I was wondering if this is all I needed. Also, when I set the tool up in Hyena, should it read like this:

I wasn't sure if I had to list the entire path to the batch file. Is the syntax correct? It is doing the same thing.

Thanks for your help. Lisa

1isaann (unregistered) posted 05-25-2006 10:44 AM: I have just tried running psexec \\computername cmd.exe from the RUN command and the same thing happens. The window closes as fast as it opens. I guess this means that it isn't my wuauclt command or Hyena. Any other thoughts as to why this is happening on most of my clients?

I use a sharepoint for all of my tools instead of my local machine. You should be able to figure out the path difference and get yours to work. Make the batch file and then you can use a command window, paste or type in the command. This is how you can ensure the command is going to work. Once you have it working, then you can remove the pause

Hope this helps.

[This message has been edited by Trammel (edited 05-25-2006).]

1isaann (unregistered) posted 05-26-2006 05:04 AM: Trammel, you rock! Thanks for your help. The problem was that I was not logged on as an administrator and was not adding the admin username and password to the command line. Once I did that, everything worked great.

rtmcmullen (Junior Member) posted 09-19-2006 12:18 PM: Is there any way to have this dump into a log file for a status?

Trammel posted 09-19-2006 06:00 PM: You could add a command in the batch file to append a text file. Add this line to your batch file and point it to a network share instead of the C drive.
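For example, the appended line might look like this (the share path is a placeholder):

```bat
rem Log each run to a central share so you can check status per machine.
wuauclt /detectnow
echo %COMPUTERNAME% finished at %DATE% %TIME% >> \\server\share\wuauclt-status.log
```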

Thursday, April 23, 2009

Sometimes it is necessary to keep a variable within a certain range, e.g. 0-255, 0-127, etc. I will show you an easy way to do modulo with base-2 numbers using bitwise masking.

First off, a little background on what modulus does. Modulus finds the remainder after division of two numbers. So the result of a % b will always be in between 0 and one less than b.

Secondly, you need to know how the bitwise AND, &, works. It will compare the corresponding bits in each term, and only set the bit in the result if both bits in each term are set:

    0101
AND 0011
  = 0001

Now that we understand modulus and binary AND, let's combine the two concepts.

Since the result of the modulus is always at most one less than the value you are dividing by, we can use a bitwise mask to perform the same operation on a base-2 number. A bitwise mask is just a row of consecutive bits used to mask certain bits in a byte (00001111). The masks can be located anywhere inside the byte(s) and be of any length, but for the purposes of modulo they will always start at bit 0.

Now if we were to do a modulo of 8, the result would be in the range 0-7, 0000-0111 in binary. That looks like a mask, right? It is, and this is how we will perform the modulo: we will set up a mask of the number minus 1.
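The code sample is along these lines (a sketch in PHP; the same trick works in any language with a bitwise AND operator):

```php
<?php
// Modulo by a power of two via bitwise AND: x % n == x & (n - 1)
// whenever n is a power of 2. The mask for 8 is 7 (binary 0111).
function mod_pow2($x, $n) {
  return $x & ($n - 1);
}

echo mod_pow2(300, 256);  // 44, same as 300 % 256 (keeps 0-255 range)
echo "\n";
echo mod_pow2(13, 8);     // 5, same as 13 % 8
?>
```

Note this only works when the divisor is a power of two; for any other divisor you still need the % operator.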

Don't like having 'Shortcut to...' appended to every Shortcut? You're not alone. With this tip you can stop Windows from adding this text when creating links.

Open your registry and find the key below.

Create a new BINARY value named 'link', or modify the existing value, to equal '00 00 00 00' to have the Shortcut text disabled.

Additionally, if you happen to have a dollar-sign in your password (or username, or DSN) -- you must avoid using double quotes. This is a normal PHP gotcha, but worth mentioning.

Won't work: $con = mssql_connect("MyDSN", "MyUserID", "MyPa$$W0rd");
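A quick illustration of why: inside a double-quoted string PHP tries to interpolate $W0rd as a variable, while single quotes leave the text alone (the credentials here are obviously fake):

```php
<?php
$W0rd = 'oops';              // imagine any stray variable with this name
$double = "MyPa$$W0rd";      // first $ is literal, then $W0rd interpolates
$single = 'MyPa$$W0rd';      // single quotes: no interpolation at all

echo $double . "\n";         // MyPa$oops  -- not the password you typed
echo $single . "\n";         // MyPa$$W0rd -- intact
?>
```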

DBConvert Product Line is a constantly growing range of efficient tools strongly focused on data migration between different Database Management Platforms. It includes a number of powerful converters and synchronizers that perform accurate and convenient data transfer from one server to another. DBConvert tools handle the process of transforming and replicating data between most of the leading database engines: MS Access, MS SQL, MS FoxPro, MySQL, PostgreSQL and others.

/**
 * Check if a number is an odd number.
 */
function is_odd($var) {
  return ($var & 1);
}

/**
 * Check if a number is an even number.
 */
function is_even($var) {
  return !($var & 1);
}

/**
 * Generate a random alphanumeric password.
 */
function random_password($length = 10) {
  // This variable contains the list of allowable characters for the
  // password. Note that the number 0 and the letter 'O' have been
  // removed to avoid confusion between the two. The same is true
  // of 'I', 1, and 'l'.
  $allowable_characters = 'abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ23456789';
  // Build the password by picking a random allowable character until
  // the requested length is reached.
  $password = '';
  for ($i = 0; $i < $length; $i++) {
    $password .= $allowable_characters[mt_rand(0, strlen($allowable_characters) - 1)];
  }
  return $password;
}

Copy and paste the result. However, there is a reason that I recommended rebuilding from scripts. ALTER TABLE only takes one ALTER COLUMN, so if there are several varchar columns in the same table, SQL Server has to rebuild the table several times. Since nvarchar takes up twice the space of varchar, this is no simple metadata change. For large tables, this could take a long time. Building a new database may be the faster route.

But maybe some of the columns are indexed? In such case, you need todrop these indexes first.

Change database columns from varchar to nvarchar if not already nvarchar

Hello, I am in a situation where I must update an existing database structure from varchar to nvarchar using a script. Since this script is run every time a configuration application is run, I would rather determine if a column has already been changed to nvarchar and not perform an ALTER on the table. The databases which I must support are SQL Server 2000, 2005 and 2008.
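A hedged sketch of such a guard (the table and column names and the length are placeholders; INFORMATION_SCHEMA.COLUMNS is available on SQL Server 2000 through 2008):

```sql
-- Only widen the column if it has not already been converted to nvarchar.
IF EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS
           WHERE TABLE_NAME = 'MyTable'
             AND COLUMN_NAME = 'MyColumn'
             AND DATA_TYPE = 'varchar')
BEGIN
    ALTER TABLE MyTable ALTER COLUMN MyColumn nvarchar(255);
END
```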

SQLyog Sets New Standards for Data Synchronization Speed
peter_laursen, June 21st 2008. Posted to MySQL, SQLyog.
In this blog post I will concentrate on the achievements with data synchronization in SQLyog 7.0 (beta 1, released just before the weekend).

SQLyog 7 provides a 2-8 times speed improvement (depending on the data) for a ‘mixed-type’ sync job (involving INSERTs, UPDATEs, and DELETEs) as compared to SQLyog 6. A few comparison examples (SQLyog 6, SQLyog 7, and Maatkit) with four test cases:

(All these examples use a ‘developer machine’ configuration as defined by the configuration wizard bundled with the Windows installer for MySQL server 5.0.51b. The sync is between two databases on the same server. Client and server run on the same machine - a 3 GHz Intel Pentium 4 with 1 GB RAM running Windows XP SP2 - with no other significant load at the time of sync. The Maatkit version is 1877. The Perl environment for running Maatkit was created with ActivePerl for Windows.)

We would like to credit Baron ‘Xaprb’ Schwartz (author of Maatkit) for the Maatkit algorithms (this post in particular was a challenge for us), from which we learned a lot. For the most typical PK setup (a single-column integer PK), the somewhat better performance of SQLyog 7 as compared to Maatkit is probably only due to the fact that SQLyog/SJA is a multithreaded compiled (C++) binary and not a server-side (Perl) script. However, with other (more unusual) PK setups the difference is larger, to SQLyog's advantage.

When syncing to an empty table we use a special high-speed code branch (a rather simple copy, actually). Other sync tools (including Maatkit) waste a lot of time looping while testing things that really need not be tested in this situation! SQLyog will be even faster the larger the max_allowed_packet setting in the server configuration, as this setting is detected and BULK INSERTs as large as possible are generated (optionally). With the above test SQLyog was ~50 times faster than Maatkit when syncing to an empty target (though with the two largest cases we interrupted Maatkit before it had finished!)
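As a rough illustration of the packing idea (not SQLyog's actual code), here is a Python sketch that groups row tuples into multi-row INSERT statements while keeping each statement under a max_allowed_packet budget; a real implementation would also quote and escape the values:

```python
def bulk_insert_chunks(table, rows, max_allowed_packet):
    """Pack single-row VALUES tuples into multi-row INSERT statements,
    each kept under max_allowed_packet bytes (a simplified sketch:
    values are stringified naively, with no escaping or quoting)."""
    prefix = "INSERT INTO %s VALUES " % table
    stmts, current = [], []
    size = len(prefix)
    for row in rows:
        values = "(" + ",".join(str(v) for v in row) + ")"
        # +1 accounts for the comma separating tuples in the statement
        if current and size + len(values) + 1 > max_allowed_packet:
            stmts.append(prefix + ",".join(current))
            current, size = [], len(prefix)
        current.append(values)
        size += len(values) + 1
    if current:
        stmts.append(prefix + ",".join(current))
    return stmts
```

With a larger max_allowed_packet the same rows collapse into fewer, larger statements, which is where the speed-up comes from.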

You can download the four test cases if you want to verify and test in your own environment. And if you think it is not fair to compare with Maatkit on Windows, you can do the same on Linux with the SJA (SQLyog Job Agent) for Linux as well.

Download links (zip archives with SQL dumps for target and source for each case):
case1 (~20 MB): http://webyog.com/downloads/blogresources/datasynctest/datasync_testcase_1.zip
case2 (~3 MB): http://webyog.com/downloads/blogresources/datasynctest/datasync_testcase_2.zip
case3 (~2 MB): http://webyog.com/downloads/blogresources/datasynctest/datasync_testcase_3.zip
case4 (~350 KB): http://webyog.com/downloads/blogresources/datasynctest/datasync_testcase_4.zip
(One detail to observe: these dumps do not contain a USE statement - be careful about where you import them!)

Download SQLyog and SJA from: http://webyog.com/en/downloads.php (note that data synchronization is included in the SQLyog Enterprise and Enterprise Trial versions only - and in SJA for Linux).

ABOUT SIEGE
Siege is an HTTP regression testing and benchmarking utility. It was designed to let web developers measure the performance of their code under duress, to see how it will stand up to load on the internet. Siege supports basic authentication, cookies, and the HTTP and HTTPS protocols. It allows the user to hit a web server with a configurable number of concurrent simulated users. Those users place the web server "under siege."

PLATFORM SUPPORT
Siege was written on GNU/Linux and has been successfully ported to AIX, BSD, HP-UX and Solaris. It should compile on most System V UNIX variants and on most newer BSD systems. Because Siege relies on POSIX.1b features not supported by Microsoft, it will not run on Windows. Of course, you can use Siege to test a Windows HTTP server.

NEWS & RELEASE INFORMATION
SIEGE-2.67, April 15, 2008 07:15 EDT. This is the current stable release. It implements HTTP/1.1 cache validation. Since its primary job is to simulate load, cache validation is off by default. You may activate it in your new .siegerc file. Developers, beta testers and adventurous souls, click HERE for the latest BETA version. It is basically a code cleanup. This is the final beta release. Look for version 2.66 soon.

"This is an excellent application that does something expensive suites don't. ie. It does what the users expects it to do, and does it simply."-- Sgt. Scrub

Saturday, April 18, 2009

Speed up your web pages with YSlow
YSlow analyzes web pages and tells you why they're slow, based on the rules for high-performance web sites. YSlow is a Firefox add-on integrated with the popular Firebug web development tool. YSlow gives you:

An algorithm to find and resolve data differences between MySQL tables

I’ve been designing an algorithm to resolve data differences between MySQL tables, specifically so I can ‘patch’ a replication slave that has gotten slightly out of sync without completely re-initializing it. I intend to create a tool that can identify which rows are different and bring them into sync. I would like your thoughts on this.

Background and requirements

I see this as the next step in my recent series of posts on MySQL tools and techniques to keep replication running reliably and smoothly. Sometimes slaves “drift” a little bit, even when there don’t seem to be any issues with replication (this is one reason I submitted a bug report to add checksums on binlog events). Once a table differs on the slave, it gets more and more different from the master, possibly causing other tables to differ too.

I need a tool that, given a table known to differ on the master and slave(s), will efficiently compare the tables and resolve the differences. Finding tables that differ is easy with MySQL Table Checksum, but I am not sure of the best way to find which rows differ.

Here are my requirements. The algorithm needs to be:

- Designed for statement-based replication, which means no temp tables, no expensive queries that will propagate to the slave, and so forth.
- Efficient in terms of network load and server load, both when finding and when resolving differences. No huge tables or un-indexed data, no high-impact INSERT.. SELECT locking, etc.
- Efficient on the client side where the tool is executed.
- Able to work well on “very large” tables.

Some things I assume:

- Tables must have primary keys. Without primary keys, it’s hard or a waste of time at best, and a disaster at worst.
- It is not a good idea to do this unless the fraction of rows that differ is very small. If much of the table is different, then mysqldump is a better idea.
Other tools I’ve found

I’ve found a number of tools that are either not complete or don’t quite address the need, but reading the source code has been very productive. There’s Giuseppe Maxia’s work on remote MySQL table comparison. I based the MySQL Table Checksum tool on some of this work. Read the comments on that link, and you’ll see some follow-up from Fabien Coelho, who wrote pg_comparator. The documentation for this tool is an excellent read, as it goes into great detail on the algorithm used.

There are also a few projects that don’t do what I’m looking for. datadiff does a two-way in-server comparison of two tables with OUTER JOIN, a fine technique but inherently limited to two tables on one server, and not practical for extremely large tables. coldiff is a more specialized variant of that tool. mysqldiff diffs the structure of two tables, which I mention for completeness though it is not the problem I’m trying to solve.

The Maxia/Coelho bottom-up algorithm

Without restating everything these smart people have written, here’s a high-level overview of the algorithm as presented by Maxia and Coelho:

- Compute a “folding factor” based on the number of rows in the table and/or user parameters.
- Build successive levels of checksum tables bottom-up, starting at row-level granularity and decreasing granularity by the “folding factor” with each level, until the final table has a single row. Each row in the first table contains the key column(s), a checksum of the key column(s), and a checksum of the whole row. Each row in an intermediate-level summary table contains checksums for a group of rows in the next more granular level of summary data. Groups are defined by taking checksums from the previous level modulo the folding factor.
- Beginning at the most aggregated level, walk the “tree” looking for the differences, homing in eventually on the offending rows.

The “folding factor” is really a “branching factor” for the tree of summary tables.
If the factor is 128, each level in an intermediate summary table will contain the groupwise checksum of about 128 rows in the next most granular summary table.

This algorithm has many strengths. For example, it uses a logarithmic search to find rows that differ. It makes no assumptions about key distributions; the modulo operation on the checksum should randomize the distribution of which rows need to be fixed. It’s also very generic, which means it works pretty much the same on all tables. There’s no need to think about the “best way to do it” on a given table.

I am concerned about a few things, though. There’s a lot of data in all these summary tables. The first summary table contains as many rows as the table to analyze. If I were to calculate and store these rows for a table with lots of relatively narrow rows, I might be better off just copying the whole table from one server to the other. Also, creating these tables is not replication-friendly; the queries that run on the master will run on the slave too. This might not be a problem for everyone, but it would not be acceptable for my purposes.

The second part of the algorithm, walking the “tree” of summary tables to find rows that differ, doesn’t use any indexes in the implementations I’ve seen. Suppose I have a table with 128 million rows I want to analyze on two servers, using a branching factor of 128 (the default). The first checksum table has 128 million rows; the second has 1 million, and so on. Repeated scans on these tables will be inefficient, and given the randomization caused by the summary checksums, will cause lots of random I/O. Indexes could be added on the checksum modulo the branching factor, but that’s another column, plus an index - this makes the table even bigger.

The checksum/modulo approach has another weakness. It defeats any optimizations I might be able to make based on knowledge of where in the table the rows differ.
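To make the structure concrete, here is a toy in-memory Python model of the bottom-up summary levels. The real implementations build actual SQL tables; CRC32 and XOR-combining here are stand-ins for whatever checksums the tools actually use, and collisions are ignored:

```python
import zlib

def crc(value):
    """Checksum stand-in: CRC32 of the value's string form."""
    return zlib.crc32(str(value).encode())

def build_levels(rows, fold=128):
    """Toy model of the bottom-up summary tables. `rows` maps primary
    key to row text. Level 0 keys each row's checksum by a checksum of
    its key; each later level buckets entries by key modulo a shrinking
    modulus and XOR-combines the members' checksums, until a single
    summary row remains."""
    level = {crc(pk): crc(text) for pk, text in rows.items()}
    levels = [level]
    modulus = max(len(level) // fold, 1)
    while len(level) > 1:
        grouped = {}
        for key, checksum in level.items():
            grouped[key % modulus] = grouped.get(key % modulus, 0) ^ checksum
        level = grouped
        levels.append(level)
        modulus = max(modulus // fold, 1)
    return levels
```

Comparing the same levels built on master and slave, a mismatch at the single top row signals a difference, and the walk then descends level by level; note how the modulo bucketing scatters neighboring keys, which is exactly the locality problem discussed above.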
If the differences are grouped at the end of the table, for example in an append-only table that just missed a few inserts on the slave, the algorithm will distribute the “pointers” to these corrupt rows randomly through the summary tables, even though the rows really live near each other. Likewise, if my table contains client data and only one client is bad, the same situation will happen. This is a major issue, especially in some large tables I work with where we do things a client or account at a time. These and other spatial and temporal locality scenarios are realistic, because lots of real data is unevenly distributed. The checksum/modulo approach isn’t optimal for this.

Finally, the bottom-up approach doesn’t allow for early optimization or working in memory. It builds the entire tree, then does the search. There’s no chance to “prune” the tree or try to keep a small working set. The flip side of this is actually a strength: assuming that the whole tree needs to be built, bottom-up is optimal. But most of my data isn’t like that. If much of the table is corrupt, I’m going to do a mysqldump instead, so I want to optimize for cases where I’ll be able to prune the tree.

One solution: a top-down approach

Given that I won’t even be looking at a table unless the global checksum has already found it differs, I am considering the following top-down approach, or some variation thereof:

- Generate groupwise checksums for the whole table in a top-level grouping (more on that later).
- If more than a certain fraction of the groups differ, quit. Too much of the table is different.
- Otherwise descend depth-first into each group that has differences.

I think this algorithm, with some tuning, will address most of my concerns above. In particular, it will allow a smart DBA to specify how the grouping and recursion should happen. The choice of grouping is actually the most complicated part.

I’d do this client-side, not server-side.
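The top-down steps above can be sketched as client-side Python. This is only a sketch under simplifying assumptions: both tables are already fetched into key-to-row-text mappings (in practice the groupwise checksums would be computed server-side), the grouping functions are supplied by the DBA, and the last one is the identity so the recursion bottoms out at individual rows:

```python
import zlib

def group_checksums(rows, group_of):
    """Groupwise checksums: XOR of per-row checksums within each group.
    `group_of` maps a primary key to its group (e.g. day, id // 5000)."""
    sums = {}
    for pk, text in rows.items():
        g = group_of(pk)
        sums[g] = sums.get(g, 0) ^ zlib.crc32(("%s:%s" % (pk, text)).encode())
    return sums

def diff_keys(master, slave, groupings):
    """Compare with the coarsest grouping first, then recurse only into
    the groups whose checksums differ. `groupings` lists key->group
    functions from coarse to fine; the last must be the identity."""
    suspects = [(master, slave)]
    for group_of in groupings:
        next_suspects = []
        for m_rows, s_rows in suspects:
            m_sums = group_checksums(m_rows, group_of)
            s_sums = group_checksums(s_rows, group_of)
            for g in set(m_sums) | set(s_sums):
                if m_sums.get(g) != s_sums.get(g):
                    next_suspects.append((
                        {k: v for k, v in m_rows.items() if group_of(k) == g},
                        {k: v for k, v in s_rows.items() if group_of(k) == g},
                    ))
        suspects = next_suspects
    # after the identity grouping, each suspect bucket holds one key
    return sorted(k for m, s in suspects for k in set(m) | set(s))
```

If the differences cluster (one client, one day), whole groups checksum clean on the first pass and are never touched again, which is the locality win the text argues for.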
I’d generate the checksums server-side, but then fetch them back to the client code and keep them in memory. Given a good grouping, this shouldn’t require much network traffic or memory client-side, and it will avoid locks, eliminate scratch tables, and keep the queries from replicating.

In the best case, all other things being equal, it will require the server to read about as many rows as the bottom-up approach, but it will exploit locality - a client at a time, a day at a time, and so on. This is a huge help, in my opinion; reducing random I/O is a high priority for me. Given all this, I think top-down is better if there are not many changes to resolve, or if they’re grouped tightly together.

Some of the weaknesses I see are complexity, a proliferation of recursion and grouping strategies, perhaps more network traffic, and susceptibility to edge cases. Whereas the bottom-up approach has identical best and worst cases for different distributions of corrupt rows (assuming the number of corrupt rows is constant), the top-down approach suffers if there’s no locality to exploit. I’m a bit worried about edge cases causing this to happen more than I think it ought to. Finally, and this could be either a strength or a weakness, this approach lets every level of the recursion have a different branching factor, which might be appropriate or not - the DBA needs to decide.

Smart grouping and recursion

I think the hardest part is choosing appropriate ways to group and “drill down” into the table. Here are some possible strategies:

- Date groupings. We have a lot of data in InnoDB tables with day-first or week-first primary keys, which as you know creates a day-first or week-first clustered index. The first checksum I’d run on these tables would be grouped by day.
- Numeric groupings. Tables whose primary key is an auto-incremented number would probably be best grouped by division, for example floor(id/5000) to group about 5000 neighboring rows together at a time.
- Character groupings. If the primary key is a character string, I might group on the first few letters of the string.
- Drill-down. Take for example one of our tables that is primary-keyed on IDs, which are auto-incremented numbers, and client account numbers. The best way to do the table I’m thinking of is by account number, then numerically within that on ID. For the day-first table, I’d group by day, then account number, and then by ID.
- Exploit append-only tables. If a table is append-only, then corruption is likely in the most recent data, and I might try to examine only that part of the table. If there are updates and deletes to existing rows, this approach might not work.
- Use defaults if the DBA doesn’t specify anything. If there’s a multi-column primary key, recurse one column at a time. If there’s a single-column key, look for another key whose cardinality is less, and recurse from that to the primary key instead.

I think the DBA will have to choose the best strategy on a table-by-table basis, because I can’t think of a good automatic way to do it. Even analyzing the index structures on the table, and then trying to decide which are good choices, is too risky to do automatically. For example, SHOW INDEX will show estimated index cardinalities, but they’re based on random dives into the index tree and can be off by an order of magnitude or more.

How to resolve the differences

Again assuming that this reconciliation is taking place between a master and slave server, it’s important to fix the rows without causing more trouble while the fixing happens. For example, I don’t want to do something that’ll propagate to another slave that’s okay, and thereby mess it up, too. Fixing the rows on the master, and letting the fixes propagate to the slave via the normal means, might actually be a good idea. If a row doesn’t exist or is different on the slave, REPLACE or INSERT .. ON DUPLICATE KEY UPDATE should fix the row on the slave without altering it on the master.
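Generating those master-side fix statements could be sketched like this (hypothetical table and key-column names; values are interpolated naively, with no quoting or escaping, and rows are simple tuples keyed by primary key):

```python
def fix_statements(table, key_col, master_rows, slave_rows):
    """Sketch of resolving differences by writing on the master only.
    REPLACE re-pushes rows the slave lost or corrupted; DELETE on the
    master handles rows that exist only on the slave."""
    stmts = []
    for pk, row in master_rows.items():
        if slave_rows.get(pk) != row:
            cols = ", ".join(str(v) for v in (pk,) + row)
            stmts.append("REPLACE INTO %s VALUES (%s)" % (table, cols))
    for pk in slave_rows:
        if pk not in master_rows:
            # the row exists only on the slave: this DELETE matches
            # nothing on the master but replicates to the slave,
            # removing the extra row there
            stmts.append("DELETE FROM %s WHERE %s = %s"
                         % (table, key_col, pk))
    return stmts
```

With statement-based replication, every statement here runs harmlessly (or as a no-op) on the master and has the intended corrective effect when it replays on the slave.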
If the row exists on the slave but not the master, DELETE on the master should delete it on the slave. Peripheral benefits of this approach are that I don’t need to set up an account with write privileges on the slave. Also, if more than one slave has trouble with the same rows, this should fix them all at the same time.

Issues I need to research are whether the different number of rows affected on the slave will cause trouble, and whether this can be solved with a temporary slave-skip-errors setting. The manual may document this, but I can’t find it.

Next steps

I’m looking forward to your feedback, and then I plan to build a tool that’ll implement whatever algorithm emerges from that discussion. At this point, assuming the above algorithm is as good as we can come up with together, I’m planning to actually implement both the top-down and bottom-up approaches in the tool, so the DBA can decide which to use. The tool will, like the rest of the scripts in the MySQL Toolkit, be command-line friendly (there are lots of proprietary “visual tools” to compare and sync tables, but they don’t interest me - plus, why would I ever trust customer data to something I can’t see source code for?). I also understand that not everyone has the same narrowly defined use case of re-syncing a slave, so of course I’ll make the tool more generic.

For my own use, ideally I’ll be making sure the tool is rock-solid, then defining rules for tables that frequently drift, and running a cron job to automatically find which tables are different and fix them. If the MySQL Table Checksum tool finds a table is out of sync and I don’t have a rule for it, it’ll just notify me and not try to fix it.

Summary

In this article I proposed some ideas for a top-down, in-client, replication-centric way to compare a table known to differ on a master and slave, find the rows that differ, and resolve them.
I’m thinking about building a tool to implement this algorithm, and would like your feedback on efficient ways to do this.

Written by Xaprb, March 5th, 2007 at 11:46 pm. Posted in SQL.
« MySQL Table Checksum bug fix | What to do when MySQL says skip-innodb is defined »

10 Responses to 'An algorithm to find and resolve data differences between MySQL tables'

Nice article! We have a similar tool called SQLyog Job Agent which incorporates most of what you have discussed in this article. Unfortunately, it is not open-source. We are always trying to improve the algorithm and look forward to more articles on this topic!

Rohit, 6 Mar 07 at 2:32 am

Very interesting. I’ve already developed a simple tool to do just that, albeit based on a simple row-by-row comparison, with the correcting actions being inserts or updates directly on the slave. It can recurse all the databases and tables to repair an entire database, or just operate on a single table. I’d not considered doing the corrective action on the master, but it’s an interesting idea. You’re absolutely correct that tables that lack a primary key are barely worth attempting, and so far my script ignores them.

One thing that I’ve found in practice is that you must perform the synchronisation while replication is actually running. If you don’t, you will inevitably end up replicating ahead of the normal replication process and breaking it. I’ve found that this tool is useful for bringing up a new replica where it’s impossible for the master to be affected in any way, such as through table locking. If you do develop a working implementation of your own, do let me know!

James

James Holden, 6 Mar 07 at 7:48 am

Rohit, if I read your comment right, you’re subtly saying I’m going in a good direction, which is encouraging :-) James, you’ve reinforced my belief that lots of people need a tool that doesn’t disrupt replication.

Xaprb, 6 Mar 07 at 8:57 am

I finally read the stuff (this article and the source code). The various discussions are very interesting. I’m not sure that I fully understand the “index” issue with the bottom-up approach. Basically each summary table is built (once) and then it is scanned just once, so having an index built on some attribute would not be amortized. The only exception may be for bulk deletes or inserts, but that should not happen.

On the “exploit append-only tables” idea, the bottom-up approach can have a “where” clause on the initial table so that the comparison is only performed on part of the data. Moreover, if the candidate tuples are somehow an identifiable fraction of the table, it might be simpler to just download them directly for comparison; that would be a third algorithm :-) Do you have performance figures for your tool in different settings?

Fabien Coelho, 15 May 07 at 5:06 am

Hello Fabien. The issue with the indexing is not scans, but lookups from a child table to its parent tables, including the group-by queries. These happen potentially many times. I could benchmark with and without indexes fairly easily and see for sure, but after writing all the queries I’m satisfied the index is important. The WHERE clause has proven to be very important, as you guessed. I did some testing with real data, and the results are here: Comparison of Table Sync Algorithms.

Xaprb, 15 May 07 at 7:57 am

I have a problem with a SQL query that I would like you to correct. This is my query:

$queryselect = "select Distinct student.id_student, first_name, last_name, promotion, class from student, applicationrecieve where student.id_student applicationrecieve.id_student";

I want to select the rows from the student table that do not appear in the applicationrecieve table. Thanks, regards, chhivhorng

chhivhorng, 27 Jun 07 at 5:57 am

Nice article. I’ve studied the algorithm before, but it wasn’t so clear to me. Actually I have the same need for keeping my databases synchronized, and for some days I have been trying to build a suitable algorithm for it. The top-down algorithm is good, but, as you mentioned, it is too hard to find a good grouping. And what is to be done if there is only one indexed column, the primary key? So I find the bottom-up approach more suitable for that purpose, but with some differences: I’m going to build a B*-tree based on row checksums for each table, keep it saved locally (I think an XML structure is a good way), and, to save time and traffic, do the comparison locally too. I’m not sure it’s the best way to do that, but I want to try. Best regards.

Negruzzi Cristian, 9 Oct 07 at 8:38 am

Please take a look at MySQL Table Sync in the MySQL Toolkit (http://mysqltoolkit.sourceforge.net). It may save you a lot of work. I’ve implemented both algorithms there.

Xaprb, 9 Oct 07 at 8:50 am

But by all means, explore your algorithm too! I don’t mean to say you shouldn’t. It may be a much better way.

Xaprb, 9 Oct 07 at 8:52 am

i want the algorithm to find the table of any no.