Description

I've compared ZendAMF to AMFPHP by returning a dataset with 5000+ rows from a table, using the exact same code on both sides (PHP and AS3). ZendAMF averages a 16 second return time, whereas AMFPHP returns in 3 seconds. I'm guessing the added time comes from serializing the data to send back to Flash.

Comments

Posted by Matthew Grippo (mgrippo) on 2009-10-27T18:42:09.000+0000

I'd like to confirm this issue. Spent the day trying to chase down why the performance difference was so drastic.
With 20000 rows/3 columns:
AMFPHP - approx 5 seconds
ZendAMF - approx 70 seconds

I was able to narrow it down to the Serializer class (serialization of objects), specifically AMF3 object / array writing. It is probably related to the Stream class.
I was also able to confirm this was not an "output" issue at the Apache/PHP level to the browser.

Posted by Andreas Adam (acadam71) on 2009-12-23T03:34:29.000+0000

I'd like to confirm this issue too: we don't send thousands of rows but we create complex objects with 10 or more arrays, e.g.:
class person
var vehicles: Array;
var groups: Array;
var a1: Array;
var a2: Array;
...
In each array there are also some complex objects, but not many.
It takes more than 3 seconds to get an object person from PHP to Flex.

Posted by Mark Reidenbach (mreiden) on 2009-12-30T08:34:25.000+0000

I just finished working on a project which showed a drastic performance hit when switching from AMFphp to Zend_Amf. Our serialization time jumped from 2.3 seconds to 25.5 seconds.

I'm going to attach two patches to this issue which brought the serialization time down to roughly the same speed as AMFphp.

Any comments on whether these patches help others would be appreciated.

Posted by Mark Reidenbach (mreiden) on 2009-12-30T08:47:57.000+0000

These two patches were tested on PHP 5.3 and are against Zend AMF 1.9.6. After applying both, the serialization time for our project, which contains many multi-dimensional arrays, decreased from 25.5 seconds to 2.3 seconds, matching the speed of the older AMFphp serialization. Comments on whether these patches help others would be appreciated.

Amf.noref-writeString.diff:

This patch changes Zend_Amf to always write strings without doing an array_search to see if a string has already been written. This is how AMFphp handled strings and did result in the download size increasing from around 179kb to 287kb for our project, but that's acceptable to us for the increased serialization speed. Perhaps a better approach would be to use some sort of hash for string lookups to decrease the number of array elements array_search must search through.

Amf.data-as-refs.diff:

This patch passes data by reference rather than by value in the serialization routines. This prevents making copies of large data while doing serialization.

Posted by Wade Arnold (wadearnold) on 2010-01-04T11:38:21.000+0000

I will run this through the unit tests tonight. If everything passes I will submit a patch for the next mini release. Mark, thanks for your patch!!

Posted by Mark Reidenbach (mreiden) on 2010-01-04T22:57:29.000+0000

After running through the AllTests.php unit tests from http://framework.zend.com/svn/framework/…, the attached patch provides the same results as testing against Zend_Amf 1.9.6 with the data-as-refs patch applied.

The noref-writeString patch causes 3 errors, but I believe this is due to not using references to some strings (larger payload, much reduced response time).

Posted by Wade Arnold (wadearnold) on 2010-01-06T12:34:00.000+0000

Zend Framework requires that we follow PHP strict mode standards. If we apply this patch I get all kinds of "Only variables should be passed by reference" errors, so I guess we need to jump into this and figure out where we don't need to explicitly pass by reference because PHP is already passing it that way.
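
For reference, a minimal sketch of what typically triggers that message: PHP raises it when the result of a function call, rather than a variable, is handed to a by-reference parameter. The function names below (countItems, loadRows) are hypothetical and are not part of Zend_Amf; the offending call is left commented out.

```php
<?php
// Hypothetical helper names; only the by-reference mechanics matter here.
function countItems(array &$data)
{
    return count($data);
}

function loadRows()
{
    return array('a', 'b', 'c');
}

// Passing a function result straight into a by-reference parameter is what
// raises "Only variables should be passed by reference":
// countItems(loadRows());

$rows = loadRows();      // assign the result to a variable first
$n = countItems($rows);  // a variable can be bound to the reference parameter
```

Assigning the value to a variable first, or dropping the & from the signature where the function never modifies its argument, avoids the message.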

Posted by Mark Reidenbach (mreiden) on 2010-01-06T15:00:22.000+0000

Wade, did you apply patch #3 [Amf.Response_Body_By_Value.diff] when you ran your tests? I'm not seeing any of the "Only variables should be passed by reference" errors when I run the tests with all three patches applied.

Without patch #3 I get many of the passed by reference errors and a summary of "Tests: 160, Assertions: 289, Failures: 2, Errors: 72."

With all three patches I get a summary of "OK (160 tests, 416 assertions)" which matches what I get when run against 1.9.6.

Posted by Mark Reidenbach (mreiden) on 2010-01-06T17:00:49.000+0000

Wade, if you're still getting the "Only variables should be passed by reference" errors with the previous 3 patches, here's an initial try at passing non-objects by reference as the first parameter and php objects (objects, dates, xml, etc) as a new third parameter of the writeTypeMarker serialization methods.

I haven't gotten to do much testing with this latest patch, but it does pass the unit tests and initial testing of our project.

Posted by Wade Arnold (wadearnold) on 2010-01-17T12:11:43.000+0000

So you're suggesting applying both the Amf.Response_Body_By_Value.diff and Amf.Combined-NoObjectsByRef.diff patches?

Posted by Wade Arnold (wadearnold) on 2010-01-17T12:14:43.000+0000

Nope, just Amf.Combined-NoObjectsByRef.diff, it looks like.

Posted by Mark Reidenbach (mreiden) on 2010-01-18T08:50:17.000+0000

Wade, I had some time to do more testing on this over the weekend.

I created a unit test for serializing a large array and after using this test, I believe the only patch that is needed is the one to not reference strings.

My previous patch [Amf.noref-writeString.diff] causes some test failures because it never references strings, which some of the tests expect. So here's a new patch based on a recommendation: in_array can become very slow as more elements are added to an array, while checking whether an array key exists takes roughly constant time.

Hopefully this much simpler patch will fix this performance issue and it passes the unit tests.

Patches:
Amf.perform.ref-writeString.diff : Use the string as the array key and store the reference number as the value for much quicker lookup performance.

Amf.ResponseTest.php.diff : Unit test to make sure the large array serialization time hasn't ballooned by a factor of 10. (Is there a better way of testing the speed other than comparing against a "high enough" number that works on today's hardware?)

largeArrayData.bin : This is simply my test dataset compressed with gzcompress that consists of several large arrays containing almost . It is 624kB in size though, so maybe this isn't acceptable to include for unit testing.

The only change needed to greatly increase performance is to use an associative array and array_key_exists instead of array_search when writing referenced strings in the AMF3 serializer.

The strict type checking used in array_search is unnecessary since only string data is ever passed to the writeString method and checking if an array key exists is much faster than searching an array as the array becomes increasingly large.
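
A minimal sketch of the lookup change described above, assuming a simplified string reference table. The class StringRefTable and the function lookupOrAddLinear are hypothetical illustrations, not the actual Zend_Amf code or the attached patch. Both variants return the reference index of a string that was seen before, or false after registering a new one.

```php
<?php
// Hypothetical stand-in for the string reference table of an AMF3 serializer.
final class StringRefTable
{
    /** @var array string => reference index */
    private $refs = array();

    /**
     * Returns the existing reference index for $s, or false after
     * registering $s as a new entry.
     */
    public function lookupOrAdd($s)
    {
        if (array_key_exists($s, $this->refs)) {  // hash lookup by key
            return $this->refs[$s];
        }
        $this->refs[$s] = count($this->refs);     // next index = current size
        return false;
    }
}

// Linear-search variant, shown only for comparison: strings are stored as
// values, so every lookup scans the array.
function lookupOrAddLinear(array &$table, $s)
{
    $ref = array_search($s, $table, true);
    if ($ref === false) {
        $table[] = $s;
    }
    return $ref;
}

$fast = new StringRefTable();
$slow = array();
$results = array();
foreach (array('name', 'city', 'name', 'zip', 'city') as $s) {
    // Each entry holds the answer from the keyed table and the linear table.
    $results[] = array($fast->lookupOrAdd($s), lookupOrAddLinear($slow, $s));
}
```

Both variants assign the same indices in this run, so the encoded output should not change; only the cost of each lookup differs as the table grows.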

Matthew,
I just downloaded 1.10.4 and it seems that the patch from Mark Reidenbach hasn't been applied.
I am looking at Serializer.php line 231, function writeString.
The string is not stored as a key but as a value.
Can you please check it out?
Thanks

Posted by James Spurin (funkjames) on 2010-05-23T08:28:19.000+0000

I also downloaded 1.10.4 today.

As Philippe mentioned, the patch from Mark Reidenbach mentioned at 04/Mar/10 12:09 PM does not seem to have been applied. I manually made the changes to 1.10.4 using the details provided by Mark and noticed substantial performance improvements.

On a 70,000 row table the query took 13 seconds as opposed to 60+ seconds prior to the patch being applied.

Posted by Otto (wedge) on 2010-06-03T01:10:10.000+0000

I also tried Mark's patch from 04/Mar/10. It's only slightly faster here though. Getting 400 objects I go from 2300ms -> 2000ms.
Is the count part of his patch really necessary? I've modified it, and it appears to work and also appears to be slightly faster.