In the past when I have burned a CD with EAC (with read and write offset correction values consistent with Andre Wiethoff's reference), I have been able to rip files from it identical to the source files.

Based on what I could gather from this forum, I used to be of the conviction that with a drive capable of overreading and overwriting, the exact same files that were burned to a CD can be extracted from it again, and that the only disadvantage of using a drive not capable of overreading was that a few samples at the beginning or end of the first or last audio file would be replaced with silence, regardless of what the samples originally contained.

Recently, however, it has come to my understanding that it is very hard (or even impossible) to extract from a mass-produced CD the original files that were used to create it.

My question is simple: How is it that I can rip files identical to the source files from a CD I have burned myself, but not from one that is factory-pressed? And, perhaps more importantly, how is it that I managed to achieve this using 'incorrect' offset values?

I apologise in advance for any difficulty I might have understanding any mathematical arguments you put forth. I fear I am slightly dyscalculic.

(I have decided, henceforth, to rip using the 'ideal scenario' reference which is 30 samples before that established by Mr Wiethoff, but I am not sure if it is relevant.)

Do the offsets have anything to do with anything except timing? The audio data will be identical when extracted correctly, regardless of whether or not you pay any attention to the offsets. If the extracted tracks are not exact duplicates, bit for bit, of the source from which the CD was created, it is because something on the CD can't be read correctly, and that has nothing to do with offsets.

I think the majority of "mass produced" CDs are still pressed, not written with a laser on CD-R media. The read error rates are generally significantly higher on pressed CDs than on good computer-written CD-Rs, but generally well within the range where perfect, bit-for-bit playback happens. I've never heard about any difficulty with perfect extraction either. However, without the original source file to compare to your extraction from the CD, you have no way to know if your extraction is indeed identical.

I suppose these on-line databases that you can use to verify your extraction are based on other people's extractions. The assumption is that if everyone's work shows the same result, it is indeed correct. I would say that is a good assumption, and worrying about it beyond that is not really different from worrying about stepping on sidewalk cracks.

There's subcode, HTOA, offsets, pregaps etc which you'll have varying degrees of success reading, depending on your drive and method - but apart from these, you can get the exact audio data back, and you are getting the exact audio data that was pressed.

Some audiophools don't believe this is the case, but some mastering engineers check exactly that: that the audio data from the pressed CD matches what they sent off. Sometimes it doesn't, but that's because another stage of unwanted signal manipulation was added to the chain by someone down the line - and that's exactly why they check it.

There's a study somewhere investigating the supposedly different "sound" of CDs glass mastered + pressed at different factories. They looked at the issue of the CDs (and master tape) being bit-identical very carefully.

If your drive has a positive offset correction value and the number of "zero" samples at the end of the CD is less than the offset value, then you need overread into the lead-out to rip the CD exactly, and overwrite into the lead-out to write it exactly to CD-R.

If your drive has a negative offset correction value and the number of "zero" samples at the beginning of the CD is less than the offset value, then you need overread into the lead-in to rip the CD exactly, and overwrite into the lead-in to write it exactly to CD-R.

If the number of "zero" samples at the beginning/end of the CD is more than the drive's offset value, then the audio data will not be affected.
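Rollin's rule can be sketched as a small function (a minimal illustration in Python; the sample counts in the example calls are made-up values, not measurements from any real disc):

```python
def needs_overread(offset_correction, leading_zeros, trailing_zeros):
    """Which overread capability, if any, is needed for an exact rip.

    offset_correction: the drive's read offset correction, in samples
    leading_zeros / trailing_zeros: digital-silence samples at the very
    start and end of the disc's audio
    """
    if offset_correction > 0 and trailing_zeros < offset_correction:
        return "overread into lead-out required"
    if offset_correction < 0 and leading_zeros < -offset_correction:
        return "overread into lead-in required"
    return "no overread needed; audio data unaffected"

# A +30 drive and a disc with only 10 zero samples before the lead-out:
print(needs_overread(30, leading_zeros=500, trailing_zeros=10))
# The same drive with plenty of silence at both ends:
print(needs_overread(30, leading_zeros=500, trailing_zeros=600))
```

The same comparison, mirrored, applies to writing: overwrite is needed exactly when the offset pushes real samples past the edge of the programme area.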


I am not sure if I understood this correctly.

I use one of the old famous Plextor PX-716A drives. It's one of those units that has an offset of +30 (and therefore an offset of 0 according to the theory referred to by the OP).

Until now I have thought that, despite having overread in & out, I would lose a few samples if I used a +30 offset. So I had to choose between losing a few samples (which would be silence most of the time) while being able to use AR, or ripping with an offset of 0, keeping all the "theoretical samples" and the supposed absolute offset, but not being able to use AR.

According to Rollin, there wouldn't be any difference in the number of kept samples, thanks to the overreading capability. Is that assumption right?

The greatest number of samples that can be replicated to a cd-r/rw using that drive will be done when the read offset correction and write offset are both set to zero. This has everything to do with the limitations of the drive and nothing to do with the measured reference.

OT: I have that drive. It is pretty good, but although it was actually made by Plextor with a Sanyo chipset, it is not one of the "famous" ones. Regardless, the hype over old Plextor drives is just that. Ironically, the PX-230 appears to have a better track record of delivering error-free rips, according to statistics from the AccurateRip database. It was not made by Plextor.

This post has been edited by greynol: Sep 6 2012, 19:09

--------------------

Breath is found in waveform and spectral plots; DR figures too, of course.

There are a few links relevant to the topic. I am not sure if any of them explicitly states that the differences between these two references (which I will readily admit I do not understand in the first place) prevent extraction of the same files that were burned to the CD, but that was how I interpreted it.

The bottom line, as far as I can tell, seems to be that it is hard to tell where the actual data on the CD starts. (I have previously shared with greynol that this strikes me as unintuitive, as I feel that the CD drive should have no problem recognising where the little pits on the surface of the disc start and where they end - but that is an aside.)

Again, my problem is that I have a hard time understanding how this is consistent with the fact that my own experiments showed me that I could indeed get the same files back from a CD that I burned to it.

Another, albeit related, thing that confuses me is how I was able to accomplish this despite my drive consistently reading 30 samples prior to where it 'should' be reading. I suspect the answer lies in basic math, but nonetheless, I would be glad if someone could clarify.

QUOTE

There are a few links relevant to the topic. I am not sure if any of them explicitly states that the differences between these two references (which I will readily admit I do not understand in the first place) […]

IpseDixit’s later calculated reference is verifiable as being exactly correct, meaning that a drive calibrated to it will not exhibit any offset between the times of the data that are requested and the data that are actually returned. Andre’s is a bit off (30 samples, to be exact), but most applications supporting offsets and verification of tracks according to them (read: about three) were firmly entrenched in using it by the time IpseDixit presented his findings, so they tend to offer the new offset as an optional extra if at all.
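The relationship between the two references is plain subtraction; a one-line sketch in Python (the function name is mine, and 30 is the gap stated above):

```python
REFERENCE_GAP = 30  # samples: the documented gap between the two references

def to_new_reference(wiethoff_offset):
    """Convert a drive offset quoted against Andre Wiethoff's reference
    to the later, verifiably correct reference."""
    return wiethoff_offset - REFERENCE_GAP

print(to_new_reference(30))  # 0: a '+30' drive has no offset under the new reference
print(to_new_reference(6))   # -24
```

So a drive's correction value under the newer reference is always exactly 30 samples lower than its familiar AccurateRip/EAC figure.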

QUOTE

[…] prevent extraction of the same files that were burned to the CD, but that was how I interpreted it. […] Again, my problem is that I have a hard time understanding how this is consistent with the fact that my own experiments showed me that I could indeed get the same files back from a CD that I burned to it.

No one has said anything about not being able to extract the same audio, but you must account for offsets to ensure that you don’t miss a bit at the start or end, by accounting for the offsets of both the present reading drive and the antecedent writing drive.

QUOTE

The bottom line, as far as I can tell, seems to be that it is hard to tell where the actual data on the CD starts. (I have previously shared with greynol that this strikes me as unintuitive, as I feel that the CD drive should have no problem recognising where the little pits on the surface of the disc start and where they end - but that is an aside.)

It’s quite simple. The audio is written as a continuous stream in the main data, whereas the information on which parts of it correspond to which tracks is written elsewhere. Your drive can get information on what to read, either from the table of contents or from your instruction, but it’s not necessarily true that it can then seek to exactly the right place in the nondescript stream of audio.
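To sketch why (a toy model in Python; the per-drive skew values are hypothetical): the table of contents addresses audio by sector, where one sector is 588 stereo samples, and nothing forces every drive to map a given sector address to exactly the same sample in the continuous stream.

```python
SAMPLES_PER_SECTOR = 588  # 1/75 of a second of audio at 44100 Hz

def returned_samples(stream, sector, drive_skew, n_sectors=1):
    """Model of a seek: the drive finds the requested sector, but its fixed
    internal skew decides which sample of the stream counts as its start."""
    start = sector * SAMPLES_PER_SECTOR + drive_skew
    return stream[start:start + n_sectors * SAMPLES_PER_SECTOR]

stream = list(range(5000))  # stand-in for the continuous audio stream
drive_a = returned_samples(stream, 2, drive_skew=0)
drive_b = returned_samples(stream, 2, drive_skew=30)
print(drive_a[0], drive_b[0])  # 1176 1206: same request, different samples
```

The skew is constant for a given drive, which is why a single correction value per drive is enough to line everything up again.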

QUOTE

Again, my problem is that I have a hard time understanding how this is consistent with the fact that my own experiments showed me that I could indeed get the same files back from a CD that I burned to it.

Again, no one has said that this is impossible, so please do not portray anyone as having done so. It’s perfectly possible, just maybe with a little preparation.

QUOTE

Another, albeit related, thing that confuses me, is how I were able to accomplish this despite my drive consistently reading 30 samples prior to where it 'should' be reading. I suspect the answer lies in basic math, but nonetheless, I would be glad if someone could clarify.

Did you apply offset correction during reading and writing, or, if not, does your drive have inverse read (e.g. +30) and write (e.g. -30) offsets? Again, this isn’t very complex.
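A toy model of the arithmetic behind that question (the sign convention and the values are illustrative assumptions, not a statement about any particular drive): treat the write offset as how far the audio is displaced when it lands on the disc, and the read offset as how far it is displaced again on the way back. If the two are inverses, a burn-and-rip round trip lands exactly on the original samples even with no correction applied.

```python
def round_trip_shift(write_offset, read_offset):
    """Net displacement (in samples) of a burn-then-rip round trip.

    write_offset: how many samples later the audio lands on the disc when burning.
    read_offset: how many samples later the data comes back when ripping.
    A result of 0 means the ripped file lines up exactly with the source file.
    """
    return write_offset + read_offset

# Inverse offsets, as suggested above, cancel out:
print(round_trip_shift(write_offset=-30, read_offset=30))  # 0: exact round trip
# Correct only one side and the rip is shifted:
print(round_trip_shift(write_offset=0, read_offset=30))    # 30
```

This is presumably the "basic math" suspected earlier: two fixed shifts of equal size and opposite sign are invisible in the result.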

Didn't we just have another discussion saying fairly clearly that pits and lands do not directly translate to bits in the audio stream and that the data for each frame of audio is distributed over an area so errors can be corrected perfectly from minor damage?

It shouldn't be too much of a stretch to accept that such a scheme would result in different hardware resolving the data to any specific address differently; especially when the specification does not require address resolution to an exact sample.

Combine this with pressings that differ by thousands, if not tens of thousands of samples (or more!) on a track by track basis and you quickly realize that this reference business is much ado about nothing.

This post has been edited by greynol: Sep 7 2012, 06:07


QUOTE

Didn't we just have another discussion saying fairly clearly that pits and lands do not directly translate to bits in the audio stream and that the data for each frame of audio is distributed over an area so errors can be corrected perfectly from minor damage?

It shouldn't be too much of a stretch to accept that such a scheme would result in different hardware resolving the data to any specific address differently; especially when the specification does not require address resolution to an exact sample.

Combine this with pressings that differ by thousands, if not tens of thousands of samples (or more!) on a track by track basis and you quickly realize that this reference business is much ado about nothing.

That is not quite correct. The spreading is part of the encoder, the CIRC scheme (consisting of delay and interleave stages), and it is fixed, i.e. the same spread is applied to all samples/sectors. The problem arises because the specification:

1. has the subchannel, which contains the addressing (location) data, treated as a separate stream from the audio sample data (the main channel) (as mentioned by db1989);
2. does not require that the subchannel be perfectly aligned, from subchannel address 00:00.00 to the first audio sample, so hardware designers found it easier to simply leave the skew (offset) created by the buffering used for the "spreading" of bytes (CIRC).
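To illustrate just the "fixed spread" point (this is emphatically not real CIRC, which is Reed-Solomon coding over 24-byte frames with much larger delays; it is only a minimal delay interleaver): the mapping is a deterministic constant, so every drive undoes it identically, and the spreading itself contributes no drive-to-drive variation.

```python
def interleave(frames, delays):
    """Delay interleaver: byte i of each frame is delayed by delays[i] frames.
    The delays are fixed constants, the same for every disc and every drive."""
    n, depth = len(frames), max(delays)
    out = [[0] * len(delays) for _ in range(n + depth)]
    for t, frame in enumerate(frames):
        for i, byte in enumerate(frame):
            out[t + delays[i]][i] = byte
    return out

def deinterleave(stream, delays, n):
    """Invert the fixed delays to recover the original frames."""
    return [[stream[t + delays[i]][i] for i in range(len(delays))]
            for t in range(n)]

frames = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
delays = [0, 1, 2]  # toy values; the real delays are defined by the spec
assert deinterleave(interleave(frames, delays), delays, len(frames)) == frames
```

The drive-specific offset enters elsewhere, in the buffering skew described above, not in this (identical everywhere) spreading step.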

From my understanding, it appears you are implying randomness? Not quite: the offsets are a result of the buffering skew and are fixed values for each drive, whereas the CIRC "spread" is the same for all drives. The randomness only appears when comparing different drives (ones using different buffering skew).

Pressed CDs have recorded offsets because this applies to pressing-plant recorders (LBRs) as well: they are simply glorified (higher-quality) CD writers, many with chipsets from ordinary consumer manufacturers (e.g. TEAC, Sony), so these LBRs naturally have a write offset. Further complications arise because every plant uses a different LBR, so each will record the CD glass master with a different write offset.

Not at all, no. While I wasn't specific about the mechanism (which was already given in the cdfreaks discussion that the OP read), I did say the spec does not impose the requirement that there be sample-accurate indexing. Rather, I was simply trying to point out that the pits and lands are not laid out in the same sequential way that analog audio is laid out on tape or vinyl.

This post has been edited by greynol: Sep 7 2012, 12:35


So if I understand this correctly, any uncertainty in the situation stems from the fact that, in order to extract from any given CD the unaltered files that were written to it, one would need to know the write offset of the drive that wrote it (or, in the case of commercial CDs, the write offset of the drive that wrote the master CD)? I can certainly understand how that is not a feasible scenario!

My drive has an AccurateRip-configured read offset correction of +6. Does that mean it should be changed to -24, should I wish to follow the newer reference?

QUOTE

in order to extract from any given CD the unaltered files that were written to it, one would need to know the write offset of the drive that wrote it (or, in the case of commercial CDs, the write offset of the drive that wrote the master CD)?

Yes. So the +30 or -30 won't matter much, and you can rather stick to the EAC suggestion.

Unless you know the write offset used by the mastering plant / writer:
- your files will be offset by an unknown number
- if you take care of the offset when burning, you will still get the same audio as on the pressed CD
- ... which is also offset by an unknown number.

Your files and the CD will differ by 30 samples, and two wrongs don't make a right, but I wouldn't (and I don't) worry.