[ https://issues.apache.org/jira/browse/AVRO-557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12874777#action_12874777 ]
Kevin Oliver commented on AVRO-557:
-----------------------------------
We do a fair amount of one-time use of BinaryDecoders and GenericDatumReaders. When we upgraded
to Avro 1.3 we saw a significant performance regression in decoding. A profiler showed the
issue pretty quickly.
Basically, it boiled down to two issues:
1) Having GenericDatumReader always create a ResolvingDecoder is too expensive for one-time
usage.
2) BinaryDecoder now creates a bunch of arrays and has become more complicated, again significantly
slowing down one-time usage.
I'm attaching a patch that has a somewhat hacky workaround. I've resurrected the BinaryDecoder
code from v1.2 (more or less). I've also created a GenericDatumReaderWithOptionalResolver
class that basically forks GenericDatumReader to allow for reading directly from the supplied
decoder.
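To illustrate the optional-resolver idea, here is a minimal sketch. It is not the attached patch and does not use Avro's API: every name in it (OptionalResolverSketch, ToyReader, ToyResolver, decoderOver) is hypothetical, schemas are stood in for by plain strings, and the "expensive" resolver setup is only counted, not simulated. The assumption is that when the writer and reader schemas are the same, the reader can pull values straight from the supplied decoder and skip building the resolving layer.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntSupplier;

public class OptionalResolverSketch {
    /** Counts how often the (pretend) expensive resolver gets built. */
    static final AtomicInteger resolversBuilt = new AtomicInteger();

    /** Hypothetical stand-in for a resolving layer wrapped around a decoder. */
    static final class ToyResolver implements IntSupplier {
        private final IntSupplier in;

        ToyResolver(String writerSchema, String readerSchema, IntSupplier in) {
            // A real resolver would build its parsing tables here; we only count the setup.
            resolversBuilt.incrementAndGet();
            this.in = in;
        }

        @Override
        public int getAsInt() {
            return in.getAsInt();
        }
    }

    /** Hypothetical reader that builds a resolver only when the schemas differ. */
    static final class ToyReader {
        private final String writerSchema;
        private final String readerSchema;

        ToyReader(String writerSchema, String readerSchema) {
            this.writerSchema = writerSchema;
            this.readerSchema = readerSchema;
        }

        int[] read(IntSupplier decoder, int fieldCount) {
            // Same schema: read directly from the supplied decoder, no resolver setup.
            IntSupplier source = writerSchema.equals(readerSchema)
                    ? decoder
                    : new ToyResolver(writerSchema, readerSchema, decoder);
            int[] record = new int[fieldCount];
            for (int i = 0; i < fieldCount; i++) {
                record[i] = source.getAsInt();
            }
            return record;
        }
    }

    /** Hypothetical "decoder" that hands back pre-decoded values in order. */
    static IntSupplier decoderOver(int... values) {
        int[] next = {0};
        return () -> values[next[0]++];
    }

    public static void main(String[] args) {
        new ToyReader("v1", "v1").read(decoderOver(1, 2, 3), 3);
        System.out.println("same schema, resolvers built: " + resolversBuilt.get());
        new ToyReader("v1", "v2").read(decoderOver(4, 5, 6), 3);
        System.out.println("different schema, resolvers built: " + resolversBuilt.get());
    }
}
```

The trade-off in this sketch is that the direct path is only safe when no schema resolution is actually needed, which is why the check sits in the reader and not in the caller.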
Running the newly added 'Perf -GoneTimeUse' you can see the stark difference:
GenericReaderOneTimeUsage12Test: 2175 ms, 1.9147720770945649 million entries/sec. 0.008961491780473783 million bytes/sec
GenericReaderOneTimeUsage13Test: 13152 ms, 0.3167766318368232 million entries/sec. 0.0014825739399539307 million bytes/sec
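For a rough sense of scale, the slowdown can be worked out from the two elapsed times above. This is only a sketch: the class and method names are made up for illustration, and it assumes both tests did roughly the same amount of work, so that elapsed time is a fair comparison.

```java
import java.util.Locale;

public class OneTimeUseSlowdown {
    /** Ratio of the regressed elapsed time to the baseline elapsed time. */
    static double slowdown(long baselineMs, long regressedMs) {
        return (double) regressedMs / baselineMs;
    }

    public static void main(String[] args) {
        // Elapsed times reported for the 1.2-style and 1.3 one-time-usage tests.
        double ratio = slowdown(2175, 13152);
        System.out.println(String.format(Locale.ROOT,
                "1.3 path takes %.2fx as long as the 1.2-style path", ratio));
    }
}
```

By this measure the 1.3 code path takes about six times as long for one-time usage.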
I don't believe we should commit the patch as is. But I'd like some feedback on how to go
from here to get this performance back.
> Speed up one-time data decoding
> -------------------------------
>
> Key: AVRO-557
> URL: https://issues.apache.org/jira/browse/AVRO-557
> Project: Avro
> Issue Type: Improvement
> Components: java
> Affects Versions: 1.3.2
> Reporter: Kevin Oliver
> Assignee: Kevin Oliver
> Fix For: 1.4.0
>
>
> There are big gains to be had in performance when using a BinaryDecoder and a GenericDatumReader
> just one time. This is due to the relatively expensive parsing and initialization that came
> with 1.3. Patch with example code and a Perf harness to follow.