I should begin by saying that I really want to support the concept of open access. I read lots of open access papers, in particular in PLOS, BMC journals, etc. I want to do everything I can to get people to be more open and make research accessible; I even encourage scientists to put their data in such journals, on Figshare, SlideShare, etc. However, I am currently conflicted, and this comes from several experiences over the past year. In the interests of full disclosure, I should also say that I am currently on the editorial board of Pharmaceutical Research (Springer), Drug Discovery Today (Elsevier), Mutation Research Reviews (Elsevier) and Chem-Bio Informatics (CBI-Society).

I have one more example. This one did not take as long, but for me it documents how a good paper can be held back for no apparent reason. On October 26th 2012 we submitted a paper, ‘Enhancing Hit Identification in Mycobacterium tuberculosis Drug Discovery Using Validated Dual-Event Bayesian Models’, to PLOS Computational Biology. We used Bayesian machine learning models to prospectively predict compounds with potential activity, then tested hundreds of molecules and found a >20% hit rate. We felt the choice of journal was appropriate given the computational nature of the work and because such work on this scale had not previously been published. We had a companion paper, “Dual-event machine learning models to accelerate drug discovery” (which would eventually be published as “Bayesian models leveraging bioactivity and cytotoxicity information for drug discovery”), which used additional TB Bayesian models and combined retrospective and prospective evaluation as well as in vivo work; we sent that one to Chemistry & Biology. This journal is part of the Cell Press family of journals owned by Elsevier, so it is a closed journal, whereas PLOS is an open access publisher. What follows is our experience with PLOS.

We did not have to wait long for a response:

Dear Dr. Ekins,

Thank you very much for submitting your manuscript ‘Enhancing Hit Identification in Mycobacterium tuberculosis Drug Discovery Using Validated Dual-Event Bayesian Models’ for review by PLOS Computational Biology. I apologize for the delay in handling your manuscript. We have held a consultation among the editors. Unfortunately the general opinion was that while the problem is important and the results are useful, we do not feel that the manuscript is suitable for PLOS Computational Biology.

The main strength of the manuscript is in the validated database that it provides. The method (software) is not provided as an open source as we require, and it is also unclear to what extent it is novel rather than incremental to what was published earlier. As such we believe that it would fit better into a drug, or medicinal chemistry journal.

We are sorry that we cannot be more positive on this occasion, but we hope that you appreciate the reasons for this decision and that you will consider PLOS Computational Biology for other submissions in the future.

While we cannot consider your manuscript for publication in PLOS Computational Biology, we very much appreciate your wish to present your work in an Open Access publication. We therefore want to alert you to an alternative that you may find attractive. Our sister journal PLOS ONE (www.plosone.org) is a swift, high-volume, efficient and economical system for the publication of peer-reviewed research in all areas of science and medicine; a unique publishing forum that exploits the full potential of the web to make the most of every piece of research.

If you would like to submit your work to PLOS ONE we can transfer your files directly into PLOS ONE’s manuscript handling system; please contact the PLOS ONE publication staff (transfers@editorialoffice.co.uk) now, citing your manuscript tracking number. If you would like more information about submitting to PLOS ONE please either visit its website or email plosone@plos.org.

Thank you again for your support of PLOS Computational Biology and open-access publishing.

Sincerely,

Ruth Nussinov
Editor-in-Chief
PLOS Computational Biology

So PLOS Computational Biology prevents people from publishing drug discovery work done with commercial software. How lame is that? What next, banning Microsoft Word? We even provide the models we created to anyone who wants them and have already shared them with several pharmas. We took their advice, and by Nov 6th the manuscript was in PLOS ONE for review. We did not expect the following on Dec 14th:

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we have decided that your manuscript does not meet our criteria for publication and must therefore be rejected.

Two independent reviewer reports were fully in contrast (see below). Therefore, I asked to a third reviewer, who has found major limitations and problems with the study in the current status. Specifically, this reviewer cannot help but notice a somewhat disturbing lack of completeness in this report. The authors constantly refer to a submitted but not published (as far as the reviewer could tell) manuscript (ref. 30, Ekins S, Reynolds R, Kim H, Koo M-S, Ekonomidis M, et al. (2012) Dual-event machine learning models to accelerate drug discovery. Submitted.), in which the Bayesian procedure applied here should be reported in full details together with a retrospective validation and a preliminary prospective application. Without ref. 30 published – or, at the very least, confidentially available to reviewers – a fair assessment of the scientific quality of the results presented here, is impossible. And, also in this case, it would remain to be established how big and general an improvement the results presented here would make with respect to those presented in ref. 30.

I am sorry that we cannot be more positive on this occasion, but hope that you appreciate the reasons for this decision.

Yours sincerely,

Andrea Cavalli, PhD
Academic Editor
PLOS ONE

Reviewers’ comments:

Reviewer #1: The manuscript entitled “Enhancing Hit Identification in Mycobacterium tuberculosis Drug Discovery Using Validated Dual-Event Bayesian Models” by Ekins et al. describes an series of three prospective validation experiments using commercially available molecules from Asinex, Maybridge and Selleck Chemicals kinase libraries. In the process of this evaluation, the authors have identified valuable starting points for the discovery of novel antitubercular agents: 124 actives against Mycobacterium tuberculosis, including two families built around drug-like heterocyclic cores and several FDA-approved human kinase-targeting drugs.

1. The research article describes a method that provides hits more efficiently, thus decreasing the time and cost involved in their evolution to leads and eventually a clinical candidate. The methodological approach well sounds.
2. The results reported in the manuscript have not been published elsewhere.
3. The in vitro antitubercular experiments of the selected molecules have been conducted rigorously with appropriate controls, although without replication. The cross-validation of the best models reveals a good predictive capability, substantiating their robustness. Methods and reagents have been described in sufficient detail.
4. The data presented in the manuscript support the conclusions drawn and the results appear exhaustively discussed.
5. The article is presented in an intelligible fashion and is written in standard English.
6. The research meets all applicable standards for the ethics of experimentation and research integrity.
7. The article adheres to appropriate reporting guidelines and community standards for data availability.

For these reasons, I recommend the publication of the manuscript on PlosOne without alteration.

Reviewer #2: The authors state that they have generated a “Bayesian model with enhanced predictive capability” but at the moment the cited paper (30) is only submitted. In this paper, they are validating again a model not published yet and unavailable for the readers/reviewers. In my opinion the acceptance of this paper has to follow the publication of paper cited in 30. In addition, while I have no reason to question the results described in this manuscript, the discussion has to be written in a more intelligible way. On the other hand, I am not convinced that the paper has significant novelty to justify publication in PONE. I recommend publication in a more specialized journal. Moreover, authors have to check more carefully the references before submission, i.e. in references 9, 18, 29, 32, 33 some details are missing.

I had to request the following review because it was omitted from the email:

Reviewer #3: The manuscript submitted by Ekins and colleagues describes the application of a Bayesian classifier aimed at selecting molecules displaying inhibitory activity toward Mtb. The screening procedure carried out on three libraries of commercially available compounds returned an average hit rate of 22.5%.

Despite a certain tendency toward overselling a little bit the described results, this hit rate would be quite interesting. Yet, I cannot help but notice a somewhat disturbing lack of completeness in this report. The authors constantly refer to a submitted but not published (as far as I could tell) manuscript (ref. 30, Ekins S, Reynolds R, Kim H, Koo M-S, Ekonomidis M, et al. (2012) Dual-event machine learning models to accelerate drug discovery. Submitted.) in which the Bayesian procedure applied here should be reported in full details together with a retrospective validation and a preliminary prospective application. Without ref.30 published – or, at the very least, confidentially available to this reviewer – a fair assessment of the scientific quality of the results presented here, is impossible.

And, also in this case, it would remain to be established how big and general an improvement the results presented here would make with respect to those presented in ref. 30. It is my opinion that, in the present form, relying too heavily on unpublished data, the manuscript should be rejected for publication on Plos ONE but maybe it could be addressed to a more specific and technical forum shifting the focus more on the hits rather than on the procedure that generated them.

This opinion is solely based on my scientific expertise and it is not biased by any competing interest.

—

So our paper was rejected because we referenced our other paper, which was submitted at roughly the same time, and did not provide it with the submission, even though the content of the two manuscripts is different. On Dec 15th Chemistry & Biology wrote to us asking for major revisions to the companion paper, which we made, and it was accepted on Jan 3rd. In the meantime…

Dear Dr Cavalli, I am happy to send ref 30, which deals with a totally separate test set. I had no idea that was required along with submission of this paper. Our previous papers on single event Bayesian models using retrospective and prospective validation have all been published.

I thought PLOS was meant to focus on the science …novelty was not an issue?

Sincerely,
Sean Ekins

On the 18th of Dec 2012 I received the following:

Dear Dr. Ekins,

Thank you for contacting PLOS ONE.

I am writing to inquire whether you would be interested in formally appealing the original decision rendered through PLOS ONE regarding the manuscript PONE-D-12-34738. While I cannot guarantee that your appeal will be approved by our in house editors, they will consider appeals via the formal appeals process when you submit a detailed rebuttal letter.

Appeal requests should be made in writing, not by telephone, and should be addressed to plosone@plos.org with the word “appeal” in the subject line. Authors should provide detailed reasons for the appeal and point-by-point ‘responses to the reviewers’ and/or Academic Editor’s comments. Decisions on appeals are final without exception.

If you have any further questions or concerns, please do not hesitate to contact us.

Kind regards,

Dianne Cartwright
Staff EO
PLOS ONE

Case Number: 01546111 ref:_00DU0Ifis._500U069JTc:ref

So on Dec 20th we submitted both manuscripts to PLOS ONE with our appeal, reproduced below.

Dear PLOS ONE,

We are writing to formally appeal the original decision for PONE-D-12-34738. After reading the reviews sent to us, it became clear that the basis of rejection was questionable and did not hinge on failures in any of the seven criteria listed as publication criteria (http://www.plosone.org/static/publication).

It should be noted that our manuscript (30) is currently under revision at Chemistry and Biology and is enclosed with this appeal. We now include a rebuttal of all 3 reviews now in our possession.

Rebuttal

Letter from Editor

Thank you for submitting your manuscript to PLOS ONE. After careful consideration, we have decided that your manuscript does not meet our criteria for publication and must therefore be rejected.

Two independent reviewer reports were fully in contrast (see below). Therefore, I asked to a third reviewer, who has found major limitations and problems with the study in the current status. Specifically, this reviewer cannot help but notice a somewhat disturbing lack of completeness in this report. The authors constantly refer to a submitted but not published (as far as the reviewer could tell) manuscript (ref. 30, Ekins S, Reynolds R, Kim H, Koo M-S, Ekonomidis M, et al. (2012) Dual-event machine learning models to accelerate drug discovery. Submitted.), in which the Bayesian procedure applied here should be reported in full details together with a retrospective validation and a preliminary prospective application. Without ref. 30 published – or, at the very least, confidentially available to reviewers – a fair assessment of the scientific quality of the results presented here, is impossible. And, also in this case, it would remain to be established how big and general an improvement the results presented here would make with respect to those presented in ref. 30.

I am sorry that we cannot be more positive on this occasion, but hope that you appreciate the reasons for this decision.

Yours sincerely,

Andrea Cavalli, PhD
Academic Editor
PLOS ONE

RESPONSE: The editor suggests reviewer #3 as the reason for their decision as follows “Specifically, this reviewer cannot help but notice a somewhat disturbing lack of completeness in this report.” After reading the comments from reviewer #3 below it would appear that this would refer to the manuscript referenced as (30) being submitted elsewhere. In the manuscript, we clearly stated reference 30 was submitted. At any time during the editing or review process we would have been happy to supply reference 30 and it is enclosed with this appeal. We have previously published papers outlining the validation of single-event Bayesian models, which are more appropriate for comparison to the dual-event Bayesian models described in this current work. We contend that the current manuscript stands alone and, therefore, a fair assessment of the scientific quality of the results is possible and does not require reference 30. For example, references 20, 21, and 22 could be used for comparison. However, as stated previously, had we been notified at any point during the review process prior to the decision being rendered, we would have supplied the manuscript under review.

Reviewers’ comments:

Reviewer #1: The manuscript entitled “Enhancing Hit Identification in Mycobacterium tuberculosis Drug Discovery Using Validated Dual-Event Bayesian Models” by Ekins et al. describes an series of three prospective validation experiments using commercially available molecules from Asinex, Maybridge and Selleck Chemicals kinase libraries. In the process of this evaluation, the authors have identified valuable starting points for the discovery of novel antitubercular agents: 124 actives against Mycobacterium tuberculosis, including two families built around drug-like heterocyclic cores and several FDA-approved human kinase-targeting drugs.

1. The research article describes a method that provides hits more efficiently, thus decreasing the time and cost involved in their evolution to leads and eventually a clinical candidate. The methodological approach well sounds.
2. The results reported in the manuscript have not been published elsewhere.
3. The in vitro antitubercular experiments of the selected molecules have been conducted rigorously with appropriate controls, although without replication. The cross-validation of the best models reveals a good predictive capability, substantiating their robustness. Methods and reagents have been described in sufficient detail.
4. The data presented in the manuscript support the conclusions drawn and the results appear exhaustively discussed.
5. The article is presented in an intelligible fashion and is written in standard English.
6. The research meets all applicable standards for the ethics of experimentation and research integrity.
7. The article adheres to appropriate reporting guidelines and community standards for data availability.

For these reasons, I recommend the publication of the manuscript on PlosOne without alteration.

RESPONSE: We thank the reviewer for their comments, which support publication based on the 7 criteria for publication listed above.

Reviewer #2:

Comment #1: The authors state that they have generated a “Bayesian model with enhanced predictive capability” but at the moment the cited paper (30) is only submitted. In this paper, they are validating again a model not published yet and unavailable for the readers/reviewers. In my opinion the acceptance of this paper has to follow the publication of paper cited in 30.

RESPONSE: We thank the reviewer for their comments. To clarify, the models used in this study are fully described in this manuscript and supporting information. These models are available upon request as described in the supporting information. However, it should be clear we are not validating this work against reference 30, they are distinct manuscripts that are independent. This current manuscript can stand alone. We have demonstrated in this manuscript that we can expand the concept of dual-event Bayesian models across many datasets, using the models to select compounds without human intervention. In this study, we have tested 550 molecules and identified 124 actives. Paper 30 is currently with the editors of Chemistry and Biology after we addressed changes they requested. We have provided the manuscript with this rebuttal. In it we provide one small example of the use of a dual-event Bayesian model to test 7 compounds and identify 5 hits. So the manuscripts are clearly distinct in scope with this manuscript probing the antitubercular efficacy of nearly 80X more molecules.

Comment #2: In addition, while I have no reason to question the results described in this manuscript, the discussion has to be written in a more intelligible way. On the other hand, I am not convinced that the paper has significant novelty to justify publication in PONE. I recommend publication in a more specialized journal.

RESPONSE: We thank the reviewer for their comments. Based on the PLOSONE 7 criteria for publication, novelty is not listed as one of them. We would disagree that our manuscript is not novel when, in fact, we have clearly performed the largest prospective testing of any computational models for TB drug discovery. Our manuscript details testing of 550 molecules and identification of 124 actives. We have written this manuscript to appeal to a broad readership and have attempted to make it generally accessible.

Comment #3: Moreover, authors have to check more carefully the references before submission, i.e. in references 9, 18, 29, 32, 33 some details are missing.

RESPONSE: At the time of writing, full reference information for references 9, 18, 29, 33 was unavailable as these were in press – ref 18 is still in press. The full reference for reference 32 is correct as provided.

Reviewer #3:

Comment #1: The manuscript submitted by Ekins and colleagues describes the application of a Bayesian classifier aimed at selecting molecules displaying inhibitory activity toward Mtb. The screening procedure carried out on three libraries of commercially available compounds returned an average hit rate of 22.5%. Despite a certain tendency toward overselling a little bit the described results, this hit rate would be quite interesting. Yet, I cannot help but notice a somewhat disturbing lack of completeness in this report. The authors constantly refer to a submitted but not published (as far as I could tell) manuscript (ref. 30, Ekins S, Reynolds R, Kim H, Koo M-S, Ekonomidis M, et al. (2012) Dual-event machine learning models to accelerate drug discovery. Submitted.) in which the Bayesian procedure applied here should be reported in full details together with a retrospective validation and a preliminary prospective application. Without ref.30 published – or, at the very least, confidentially available to this reviewer – a fair assessment of the scientific quality of the results presented here, is impossible.

RESPONSE: We thank the reviewer for their comments. We do not believe we have made any efforts to oversell the research and would appreciate more specific examples so we can address them. Paper 30 is currently with the editors of Chemistry and Biology after we addressed changes they requested. We have provided the manuscript with this rebuttal and the full methods for Bayesian model development are included in both manuscripts (as well as those previously published by us on single-event models -references 20, 21, and 22). Please note that the current manuscript also has unique models for a dataset resulting from screening Mtb against a human kinase compound library not described in reference 30. In reference 30, we provide one small example of the use of a dual-event Bayesian model to test 7 compounds and identify 5 hits. The remainder of reference 30 focuses on prospective evaluation of single-event Bayesian models. We have demonstrated in the current manuscript that we can expand the concept of dual-event Bayesian models across many datasets, using the models to select compounds without human intervention. In this study, we have tested 550 molecules and identified 124 actives.

We would also assert that the scientific quality of the results in the current manuscript is independent of reference 30.

Comment #2: And, also in this case, it would remain to be established how big and general an improvement the results presented here would make with respect to those presented in ref. 30. It is my opinion that, in the present form, relying too heavily on unpublished data, the manuscript should be rejected for publication on Plos ONE but maybe it could be addressed to a more specific and technical forum shifting the focus more on the hits rather than on the procedure that generated them.

RESPONSE: As we reference on page 10, “Previous work highlighted a 14% hit rate when applying a single-event Bayesian model and a dual-event model afforded 5/7 hits with an MIC ≤ 2 mg/ml [1].” In contrast, in the current study we assayed a total of 550 molecules in vitro (compared with the 7 in reference 30), and identified 124 actives (compared with 5 in reference 30). The current study is therefore more useful as an example to compare to traditional HTS as well as previous single-event models that were validated retrospectively (references 20, 21, and 22). The current study greatly expands on the efforts in reference 30. We are not aware of any other published TB computational studies that have made such extensive prospective predictions using machine learning models followed up by in vitro screening for validation.

We are not relying on unpublished data in paper 30; the data for each paper are distinct and independent. This current manuscript can stand alone. We have demonstrated in this manuscript that we can expand the concept of dual-activity Bayesian models across many datasets, using the models to select compounds without human intervention.

We recognize that other scientists (readers of PLOSONE) who would otherwise perform HTS should understand that prior HTS data can be readily used with commercially available and increasingly open software (Gupta et al., Drug Metab Dispos. 2010 Nov;38(11):2083-90) to build models that can be used to dramatically narrow the number of compounds that need to be tested to find active compounds. By publishing in a more technical journal, we would miss the very scientists we are trying to reach as an audience in PLOSONE. We have gone into detail on some of the hits in this manuscript and are in the process of following them up for further optimization.

Comment #3: This opinion is solely based on my scientific expertise and it is not biased by any competing interest.

RESPONSE: We thank the reviewer for that statement.

We conclude that the manuscript was not rejected based on the 7 publication criteria. We asserted in the manuscript as submitted (and described above) how this study differed from that disclosed in reference 30 which was under review elsewhere at the same time and that the current study is independent of it. We were not requested to submit the manuscript 30 to PLOSONE at any time during the editorial or review process. If omission of reference 30 was so critical, surely reviewer # 1 would have raised the issue too. Instead, they noted that we complied with the seven criteria for publication. It appears there were no other comments of note from reviewers #2 and #3 that require addressing. We have dealt with the minor updates in references that were in press (or their information was not available in PubMed) during submission of the manuscript.

We hope that this manuscript is now acceptable and we trust that PLOSONE will overturn the initial decision which we feel was made hastily without any recourse for us to provide additional information.

Yours sincerely,

…….

After getting no response to several follow-up emails (there are no phone numbers that get you through to any editors at PLOS), on Jan 24th 2013 we received the following.

Dear Dr. Ekins,

Please accept our apologies for not replying sooner. Unfortunately we had a technical issue with our inbox, we do apologise for this technical error. I can confirm that all five of your emails have now been reinstated and we are dealing with this as a matter of urgency.

Your request for appeal has been forwarded to our internal editors who will determine whether or not PLOS ONE will consider your appeal. We may also need to consult additional editorial board members.

We will notify you via email once a decision has been made on whether we will consider your appeal. If your request for an appeal is granted, your manuscript will go out for review a second time and may incur additional review time. Please be aware that appeals typically take longer to review than new submissions due to the complexity of their history. Decisions on appeals are final without exception. Details about appeals can be found under “Editorial and Peer-Review Process” at: http://www.plosone.org/static/information.action.

We will be in touch again soon with more information, but in the meantime, please do not hesitate to contact us if you have any further questions or concerns.

Thank you for your email regarding the above manuscript, I apologise for not replying sooner.

I understand from your last email that the submission to Chemistry & Biology is now in press. In the light of this, I agree that it is not necessary to include a full description of the Bayesian models in the current manuscript.

I would be grateful if you could please revise the manuscript to update the citations to the publication in Chemistry & Biology and provide a brief description of how the current study relates to the work in that publication and resubmit your manuscript files online. While I appreciate that you note that two papers are different in scope, I believe it will still be relevant for the further evaluation of the work -and eventually for readers if the manuscript is accepted – to have a some details regarding the relationship between the two studies outlined in the current manuscript.

Once we receive the resubmitted files, the manuscript will be re-evaluated as per the appeal process, we will aim to reach a decision without further external review, but please note that the re-evaluation will be handled by Academic Editors and thus the steps involved will depend on their consideration.

I hope this is of help but please do let me know if you have any further queries.

I am writing to update you on the status of your appeal submission to the PLOS ONE journal.

Currently, your manuscript is being discussed by two editorial board members. We have been in communication with these individuals, and we are doing all we can to ensure that your paper will receive a timely and proper decision.

Many thanks for your patience, and as always, please feel free to email us any further questions or concerns that may arise.

Regards,

Kate Massingham
PLOS ONE

Then on Feb 27th came the following:

Dear Dr. Ekins,

I have been contacted by the Academic Editor handling your recent submission PONE-D-12-34738 requesting a copy of your article currently in press at Chemistry & Biology, Bayesian Models Leveraging Bioactivity and Cytotoxicity Information for Drug Discovery. Would you please send me a copy of this article to forward to the Academic Editor to facilitate his review?

Please let me know if you have any questions regarding this request.

Kind regards, Tara

PLOS | OPEN FOR DISCOVERY
Tara Garnett | Publications Assistant

So, after Joel resent the files while I was out of the country, I responded on the 6th of March.

Dear Tara,

I am following up on your last email as Joel was able to send the paper requested last week (which we had already submitted on 2 separate occasions). Has any decision been made yet, please?

Also please note this case has involved emails to Dianne Cartwright and this appeal was submitted on the 20th of Dec 2012, it is now March 6. I think if we do not have a response in the next week we will take this elsewhere.

Thank you for your email which has been forwarded to me. Currently, your manuscript is being discussed by two editorial board members. We are in constant communication with these individuals, and we are doing all we can to ensure that your paper will receive a timely and proper decision.

Many thanks for your patience, and as always, please feel free to email us any further questions or concerns that may arise.

Best regards,
Kate Massingham
PLOS ONE

Finally, on March 14th, we got a decision… but wait, it went out for review all over again!!!

CC: daniel.sem@cuw.edu

PONE-D-12-34738R1
Enhancing Hit Identification in Mycobacterium tuberculosis Drug Discovery Using Validated Dual-Event Bayesian Models
PLOS ONE

Dear Dr. Ekins,

We have carefully reviewed your manuscript, and also sent it out for external review. We note that the previous concern regarding reference #30 has now been addressed, since you have provided access to that manuscript. Furthermore, the reviewer was in favor of publishing with modification, and we are in agreement.

Please address the reviewer concerns, and also provide an update on the citation for the manuscript cited as reference 30, as appropriate.

We encourage you to submit your revision within forty-five days of the date of this decision. When your files are ready, please submit your revision by logging on to http://pone.edmgr.com/ and following the Submissions Needing Revision link. Do not submit a revised manuscript as a new submission. Before uploading, you should proofread your manuscript very closely for mistakes and grammatical errors. Should your manuscript be accepted for publication, you may not have another chance to make corrections as we do not offer pre-publication proofs.

If you would like to make changes to your financial disclosure, please include your updated statement in your cover letter. Please also include a rebuttal letter that responds to each point brought up by the academic editor and reviewer(s). This letter should be uploaded as a Response to Reviewers file.

In addition, please provide a marked-up copy of the changes made from the previous article file as a Manuscript with Tracked Changes file. This can be done using ‘track changes’ in programs such as MS Word and/or highlighting any changes in the new document.

If you choose not to submit a revision, please notify us.

Yours sincerely,

Daniel Sem, PhD
Associate Professor of Pharmaceutical Sciences
Concordia University Wisconsin
Academic Editor
PLOS ONE

Yung-fu Chang, DVM, Ph.D
Professor of Molecular Microbiology, College of Veterinary Medicine, Cornell University
Academic Editor
PLOS ONE

Reviewers’ comments:

Reviewer’s Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass this form and submit your “Accept” recommendation.

Reviewer #1: (No Response)

Please explain (optional).

Reviewer #1: This is an interesting article by Ekins et al. showing how they can used a dual-event Bayesian model to identify Mtb actives in commercially available compound libraries.

Overall, their results are interesting and the TB community will benefit from sharing this information. In addition, there will be widespread interest to apply such Bayesian methodologies to drug discovery for other pathogens.

Issues to address:

1) This article erroneously compares hit rates using Bayesian methodology and single point HTS. As they state, single point HTS is often performed at ~10 and 25 uM. However, this article defines a “hit” as one that shows a MIC of 90% growth inhbition at 100 uM using re-ordered compounds (page 9, 4 lines from bottom, “where a hit demonstrated >90% inhibition at 100 ug/mL”). The mistake in logic is that hit rates do not increase in a linear manner as one changes the cutoff threshold, as the authors imply, but instead they can be closer to exponential. This is also observed when one decreases the % inhibition cut-off from, let’s say, 90% to 20%. The 90% cutoff is standard, but also arbitrary for large screens and it is very common for both academic and big pharma groups to analyze data well below 90% inhibition. Thus, if someone using standard HTS has a 1% hit rate at 10 uM, they may have a 25% hit rate if they screen at 100 uM (it could also be lower — a lot depends on the library). The reason most academic and industrial labs screen at 10 to 25 uM is that they usually aim for a hit rate of 0.5-1%, which limits the number of compounds to re-confirm in cherry picking assays. There is a big difference between designing an HTS and cut-off rates to obtain a 1% hit rate, and claiming that an HTS hit rate of 1% (as a fixed value).

The authors should compare apples to apples. I’d recommend that they describe their hit rate using an MIC of 10 or 25 uM as a cutoff in a manner similar to a normal single-point HTS. Since their data was determined in MIC format, they have the data (the MIC must have been tested at 100, 50, 25, 12.5, 6 ? uM). What is their hit rate at 10 uM? At 25 uM? Most of the MIC’s for compounds shown in this article fall in the 50-100 uM category and most likely their hit rate will be very low if using 10 uM as an MIC cutoff. This could be summarized in a table in which the authors could present hit rates for 100 uM and also 10 uM; this will allow readers to draw their own conclusions.

Taking account such higher hit rates of “regular” HTS may impact other figures in the paper, for example supplementary Figure S9.

2) Even if the author’s hit rate decreases significantly after performing the analysis suggested in point 1, obviously their Bayesian predictive methodology still has immense value in TB drug discovery. However, the authors would be better served re-writing this article to tone down the overselling of their results, tone down their negative perception of standard screening methods (which, for all their listed negatives, still work), and consider presenting their work in a different context – Bayesian virtual screening of libraries is a complementary technique to standard methods. Bayesian virtual screening may pick up weak actives missed by standard HTS, and pick up interesting pharmacophores (often weak hits) that could be developed to generate improved activity with medicinal chemistry. Ironically, they also have to take into account that their Bayesian learning method uses data generated by standard HTS.

Finally, although their method is obviously very rapid, they still have to reorder compounds and test them in standard assays. Many pharmaceutical companies have technologies to screen upwards of 100K compounds in whole cells screens per day – including Mtb (with a 1-2 day wait for reading plates).

3) The LORA assay is well accepted and appreciated in the TB field. However, the Bayesian models were generated using publically available data for Mtb grown at 20% O2, not data from a low-oxygen screen. Perhaps the authors may want to clarify in the text why they chose to test compounds by LORA as well – or at least point out that their model should not be relevant to the LORA data (my understanding is that it should not be predictive of LORA activity since the model was based on replicating activity at 20% O2).

4) Was there a quality control for the identity and purity of each chemical re-ordered? I did not see that in the methods. Were there solubility issues observed at the higher concentrations that gave MIC values?

5) Were MICs/LORA assays performed once? In triplicates? How did control compounds perform in the assays? Please comment and/or update Methods.

6) Please included calculated selectivity index (SI) for each compound in tables; one SI for MABA, one SI for LORA – this will make data much easier to go through (this is common for this type of data).

7) A minor point – the technical jargon used is highly specific for Bayesian analysis and will be lost on many readers. I found some information in the Methods. The article may be easier to understand, and thus be more accessible to readers from other fields, if the authors spell out terms and give brief explanations of their meaning/significance. For example, “LOO ROC”, pg7, etc.

2. Is the manuscript technically sound, and do the data support the conclusions?
The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Please explain (optional).

Reviewer #1: This article erroneously compares hit rates using Bayesian methodology and single point HTS. As they state, single point HTS is often performed at ~10 and 25 uM. However, this article defines a “hit” as one that shows a MIC of 90% growth inhbition at 100 uM using re-ordered compounds (page 9, 4 lines from bottom, “where a hit demonstrated >90% inhibition at 100 ug/mL”). The mistake in logic is that hit rates do not increase in a linear manner as one changes the cutoff threshold, as the authors imply, but instead they can be closer to exponential. This is also observed when one decreases the % inhibition cut-off from, let’s say, 90% to 20%. The 90% cutoff is standard, but also arbitrary for large screens and it is very common for both academic and big pharma groups to analyze data well below 90% inhibition. Thus, if someone using standard HTS has a 1% hit rate at 10 uM, they may have a 25% hit rate if they screen at 100 uM (it could also be lower — a lot depends on the library). The reason most academic and industrial labs screen at 10 to 25 uM is that they usually aim for a hit rate of 0.5-1%, which limits the number of compounds to re-confirm in cherry picking assays. There is a big difference between designing an HTS and cut-off rates to obtain a 1% hit rate, and claiming that an HTS hit rate of 1% (as a fixed value).

The authors should compare apples to apples. I’d recommend that they describe their hit rate using an MIC of 10 or 25 uM as a cutoff in a manner similar to a normal single-point HTS. Since their data was determined in MIC format, they have the data (the MIC must have been tested at 100, 50, 25, 12.5, 6 ? uM). What is their hit rate at 10 uM? At 25 uM? Most of the MIC’s for compounds shown in this article fall in the 50-100 uM category and most likely their hit rate will be very low if using 10 uM as an MIC cutoff. This could be summarized in a table in which the authors could present hit rates for 100 uM and also 10 uM; this will allow readers to draw their own conclusions.

Taking account such higher hit rates of “regular” HTS may impact other figures in the paper, for example supplementary Figure S9.

3. Has the statistical analysis been performed appropriately and rigorously?

Reviewer #1: No

4. Does the manuscript adhere to standards in this field for data availability?

Authors must follow field-specific standards for data deposition in publicly available resources and should include accession numbers in the manuscript when relevant. The manuscript should explain what steps have been taken to make data available, particularly in cases where the data cannot be publicly deposited.

Reviewer #1: Yes

Please explain (optional).

Reviewer #1: (No Response)

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors below.

Reviewer #1: Yes

6. Additional Comments to the Author (optional)

Please offer any additional comments here, including concerns about dual publication or research or publication ethics.

Reviewer #1: There is concern regarding a second article (reference 30) that is in publication elsewhere, it appears that it may be very similar to the current publication.

7. If you would like your identity to be revealed to the authors, please include your name here (optional).

Your name and review will not be published with the manuscript.

Reviewer #1: (No Response)

So we resubmitted AGAIN!! March 20th

Reviewer’s Responses to Questions

Comments to the Author

1. If the authors have adequately addressed your comments raised in a previous round of review and you feel that this manuscript is now acceptable for publication, you may indicate that here to bypass this form and submit your “Accept” recommendation.

Reviewer #1: (No Response)

Please explain (optional).

Reviewer #1: This is an interesting article by Ekins et al. showing how they can used a dual-event Bayesian model to identify Mtb actives in commercially available compound libraries.

RESPONSE: Thank you.

Overall, their results are interesting and the TB community will benefit from sharing this information. In addition, there will be widespread interest to apply such Bayesian methodologies to drug discovery for other pathogens.

RESPONSE: Thank you.

Issues to address:

1) This article erroneously compares hit rates using Bayesian methodology and single point HTS. As they state, single point HTS is often performed at ~10 and 25 uM. However, this article defines a “hit” as one that shows a MIC of 90% growth inhbition at 100 uM using re-ordered compounds (page 9, 4 lines from bottom, “where a hit demonstrated >90% inhibition at 100 ug/mL”). The mistake in logic is that hit rates do not increase in a linear manner as one changes the cutoff threshold, as the authors imply, but instead they can be closer to exponential. This is also observed when one decreases the % inhibition cut-off from, let’s say, 90% to 20%. The 90% cutoff is standard, but also arbitrary for large screens and it is very common for both academic and big pharma groups to analyze data well below 90% inhibition. Thus, if someone using standard HTS has a 1% hit rate at 10 uM, they may have a 25% hit rate if they screen at 100 uM (it could also be lower — a lot depends on the library). The reason most academic and industrial labs screen at 10 to 25 uM is that they usually aim for a hit rate of 0.5-1%, which limits the number of compounds to re-confirm in cherry picking assays. There is a big difference between designing an HTS and cut-off rates to obtain a 1% hit rate, and claiming that an HTS hit rate of 1% (as a fixed value).

The authors should compare apples to apples. I’d recommend that they describe their hit rate using an MIC of 10 or 25 uM as a cutoff in a manner similar to a normal single-point HTS. Since their data was determined in MIC format, they have the data (the MIC must have been tested at 100, 50, 25, 12.5, 6 ? uM). What is their hit rate at 10 uM? At 25 uM? Most of the MIC’s for compounds shown in this article fall in the 50-100 uM category and most likely their hit rate will be very low if using 10 uM as an MIC cutoff. This could be summarized in a table in which the authors could present hit rates for 100 uM and also 10 uM; this will allow readers to draw their own conclusions.

Taking account such higher hit rates of “regular” HTS may impact other figures in the paper, for example supplementary Figure S9.

RESPONSE: Thank you for this suggestion. To be clear we do not perform a direct comparison between single point and MIC hit rates, the reviewer would appear to be conflating text from different parts of the manuscript. We do not state or imply anywhere in the manuscript that hit rates increase in a linear manner as one changes the cutoff threshold. As described in the paper molecules were tested from the Asinex Library, and the Maybridge Library at a single concentration of 100 mg/mL. The kinase library was screened at 50 mg/mL. In all three cases we looked at compounds that resulted in greater than or equal to 90% inhibition of Mtb activity. We only have MIC data for MABA and LORA for the selected hits so the proposed suggested approach of using MIC of 10 or 25 uM for all compounds does not appear relevant or possible based on the data we have produced for this paper. In fact, the process of single point screening followed by MIC testing of hits is routine. In addition we show in Table S5 for comparison hit rates at the Institute for Tuberculosis Research for libraries of 1000 to 100,000 molecules performed at 10-50uM for which there appears to be no relationship in hit rate and concentration. It is likely that this hit rate will be influenced by diversity of the libraries as well rather than any concentration used for screening. We are also no aware of others looking at hit rates at different concentrations.

2) Even if the author’s hit rate decreases significantly after performing the analysis suggested in point 1, obviously their Bayesian predictive methodology still has immense value in TB drug discovery. However, the authors would be better served re-writing this article to tone down the overselling of their results, tone down their negative perception of standard screening methods (which, for all their listed negatives, still work), and consider presenting their work in a different context – Bayesian virtual screening of libraries is a complementary technique to standard methods. Bayesian virtual screening may pick up weak actives missed by standard HTS, and pick up interesting pharmacophores (often weak hits) that could be developed to generate improved activity with medicinal chemistry. Ironically, they also have to take into account that their Bayesian learning method uses data generated by standard HTS.

Finally, although their method is obviously very rapid, they still have to reorder compounds and test them in standard assays. Many pharmaceutical companies have technologies to screen upwards of 100K compounds in whole cells screens per day – including Mtb (with a 1-2 day wait for reading plates).

RESPONSE: We agree that Bayesian methods will be important for tuberculosis research as we have demonstrated in this manuscript and previous papers since 2010. However this manuscript is novel because it presents a truly prospective and large scale analysis of predictions unlike any of our previous studies or those from other groups. Contrary to the reviewers suggestion the hit rate calculated is in line with previous calculations generated across different libraries in our hands. We do not believe that we have ‘oversold’ the results, we are presenting the data in a transparent manner across this paper and ref 30. We are quite open about some Bayesian models not performing well in some cases as described in the results and in the discussion e.g. see bottom of page 14. We are also continuing the development of the models we started with in 2010. So this study represents the evolution of these in terms of refining the data including cytotoxicity. The reality is that no other groups are doing prospective computational prediction and testing for Mtb activity. Also while the technique is complimentary to HTS we are saying that there is enough HTS screening data now for TB that we can start with Bayesian models and pick through vendor libraries and compounds that have received little attention. Instead of screening every compound we could feasibly pick less than 1% and test it. It should also be clear that the data we are using is coming from HTS screens of the various collaborators, so this opinion expressed in the manuscript is coming from screeners and modelers. Neither of the previous two reviewers had any issues with the manner in which the data was presented or how the manuscript written.

Our approach is indeed rapid and reordering, supply and testing of compounds can all be achieved in 1-2 weeks. Finally, as further evidence of the value of these models, we suggest screening 100k compounds is unnecessary and have provided the models in the manuscript to two pharmaceutical companies so they can cherry pick thousands of compounds from upwards of 500K-1M compound plus libraries.

3) The LORA assay is well accepted and appreciated in the TB field. However, the Bayesian models were generated using publically available data for Mtb grown at 20% O2, not data from a low-oxygen screen. Perhaps the authors may want to clarify in the text why they chose to test compounds by LORA as well – or at least point out that their model should not be relevant to the LORA data (my understanding is that it should not be predictive of LORA activity since the model was based on replicating activity at 20% O2).

RESPONSE: LORA and MABA MIC data was only generated on hit compounds, we were not predicting LORA activity. The LORA assay was used to see if any of the hits from single point screening were also active in this assay at low oxygen concentrations.

4) Was there a quality control for the identity and purity of each chemical re-ordered? I did not see that in the methods. Were there solubility issues observed at the higher concentrations that gave MIC values?

RESPONSE: These compounds were used as is which is frequently the case in such studies. It would be prohibitively expensive to QC such a high number of compounds and the samples are supplied within claimed QC ranges by the commercial supplier. Furthermore, several of the libraries were supplied in very small amounts that would make it difficult to run a thorough QC. We have published numerous papers in reputable journals based on initial HTS screening data of commercial samples without additional verification of the supplied sample and quality. It is routine to run and report these types of studies on small library samples and publish the data based on the claimed QC by the company. Any follow up work would typically require purchasing or preparing enough material to confirm sample structure, quality, and activity in the screens. A statement has been added to the paper regarding the fact that compounds were used as supplied from the commercial company. No overt solubility issues were identified or the samples would have been eliminated from the screens or a lower dose used.

5) Were MICs/LORA assays performed once? In triplicates? How did control compounds perform in the assays? Please comment and/or update Methods.

RESPONSE: LORA MIC data, was only run once with 8 concentrations and is in line with standard protocols used by this laboratory for many other libraries and studies.

6) Please included calculated selectivity index (SI) for each compound in tables; one SI for MABA, one SI for LORA – this will make data much easier to go through (this is common for this type of data).

RESPONSE: Thank you for pointing this out. We have now added the SI and additional text to tables S2-S4.

7) A minor point – the technical jargon used is highly specific for Bayesian analysis and will be lost on many readers. I found some information in the Methods. The article may be easier to understand, and thus be more accessible to readers from other fields, if the authors spell out terms and give brief explanations of their meaning/significance. For example, “LOO ROC”, pg7, etc.

RESPONSE: Thank you for this suggestion. Leave-one-out cross-validation receiver operator curve (LOO ROC) has been added in results section, it was also defined in the methods section previously. It appears that this is the only definition that needed amending.

2. Is the manuscript technically sound, and do the data support the conclusions?

The manuscript must describe a technically sound piece of scientific research with data that supports the conclusions. Experiments must have been conducted rigorously, with appropriate controls, replication, and sample sizes. The conclusions must be drawn appropriately based on the data presented.

Reviewer #1: Partly

Please explain (optional).

Reviewer #1: This article erroneously compares hit rates using Bayesian methodology and single point HTS. As they state, single point HTS is often performed at ~10 and 25 uM. However, this article defines a “hit” as one that shows a MIC of 90% growth inhbition at 100 uM using re-ordered compounds (page 9, 4 lines from bottom, “where a hit demonstrated >90% inhibition at 100 ug/mL”). The mistake in logic is that hit rates do not increase in a linear manner as one changes the cutoff threshold, as the authors imply, but instead they can be closer to exponential. This is also observed when one decreases the % inhibition cut-off from, let’s say, 90% to 20%. The 90% cutoff is standard, but also arbitrary for large screens and it is very common for both academic and big pharma groups to analyze data well below 90% inhibition. Thus, if someone using standard HTS has a 1% hit rate at 10 uM, they may have a 25% hit rate if they screen at 100 uM (it could also be lower — a lot depends on the library). The reason most academic and industrial labs screen at 10 to 25 uM is that they usually aim for a hit rate of 0.5-1%, which limits the number of compounds to re-confirm in cherry picking assays. There is a big difference between designing an HTS and cut-off rates to obtain a 1% hit rate, and claiming that an HTS hit rate of 1% (as a fixed value).

The authors should compare apples to apples. I’d recommend that they describe their hit rate using an MIC of 10 or 25 uM as a cutoff in a manner similar to a normal single-point HTS. Since their data was determined in MIC format, they have the data (the MIC must have been tested at 100, 50, 25, 12.5, 6 ? uM). What is their hit rate at 10 uM? At 25 uM? Most of the MIC’s for compounds shown in this article fall in the 50-100 uM category and most likely their hit rate will be very low if using 10 uM as an MIC cutoff. This could be summarized in a table in which the authors could present hit rates for 100 uM and also 10 uM; this will allow readers to draw their own conclusions.

Taking account such higher hit rates of “regular” HTS may impact other figures in the paper, for example supplementary Figure S9.

RESPONSE: As described previously above in 1-

Thank you for this suggestion. To be clear we do not perform a direct comparison between single point and MIC hit rates, the reviewer would appear to be conflating text from different parts of the manuscript. We do not state or imply anywhere in the manuscript that hit rates increase in a linear manner as one changes the cutoff threshold. As described in the paper molecules were tested from the Asinex Library, and the Maybridge Library at a single concentration of 100 mg/mL. The kinase library was screened at 50 mg/mL. In all three cases we looked at compounds that resulted in greater than or equal to 90% inhibition of Mtb activity. We only have MIC data for MABA and LORA for the selected hits so the proposed suggested approach of using MIC of 10 or 25 uM for all compounds does not appear relevant or possible based on the data we have produced for this paper. In fact, the process of single point screening followed by MIC testing of hits is routine. In addition we show in Table S5 for comparison hit rates at the Institute for Tuberculosis Research for libraries of 1000 to 100,000 molecules performed at 10-50uM for which there appears to be no relationship in hit rate and concentration. It is likely that this hit rate will be influenced by diversity of the libraries as well rather than any concentration used for screening. We are also no aware of others looking at hit rates at different concentrations.

3. Has the statistical analysis been performed appropriately and rigorously?

Thank you for this suggestion. To be clear we do not perform a direct comparison between single point and MIC hit rates, the reviewer would appear to be conflating text from different parts of the manuscript. We do not state or imply anywhere in the manuscript that hit rates increase in a linear manner as one changes the cutoff threshold. As described in the paper molecules were tested from the Asinex Library, and the Maybridge Library at a single concentration of 100 mg/mL. The kinase library was screened at 50 mg/mL. In all three cases we looked at compounds that resulted in greater than or equal to 90% inhibition of Mtb activity. We only have MIC data for MABA and LORA for the selected hits so the proposed suggested approach of using MIC of 10 or 25 uM for all compounds does not appear relevant or possible based on the data we have produced for this paper. In fact, the process of single point screening followed by MIC testing of hits is routine. In addition we show in Table S5 for comparison hit rates at the Institute for Tuberculosis Research for libraries of 1000 to 100,000 molecules performed at 10-50uM for which there appears to be no relationship in hit rate and concentration. It is likely that this hit rate will be influenced by diversity of the libraries as well rather than any concentration used for screening. We are also no aware of others looking at hit rates at different concentrations.

4. Does the manuscript adhere to standards in this field for data availability?

Authors must follow field-specific standards for data deposition in publicly available resources and should include accession numbers in the manuscript when relevant. The manuscript should explain what steps have been taken to make data available, particularly in cases where the data cannot be publicly deposited.

Reviewer #1: Yes

Please explain (optional).

Reviewer #1: (No Response)

5. Is the manuscript presented in an intelligible fashion and written in standard English?

PLOS ONE does not copyedit accepted manuscripts, so the language in submitted articles must be clear, correct, and unambiguous. Any typographical or grammatical errors should be corrected at revision, so please note any specific errors below.

Reviewer #1: There is concern regarding a second article (reference 30) that is in publication elsewhere, it appears that it may be very similar to the current publication.

RESPONSE: This paper is different to ref 30 (which is due to be published in the next week) and has been provided to the editors of PLOSONE previously– this is the whole reason for this re-review. We have previously extensively documented how the two are unique.

7. If you would like your identity to be revealed to the authors, please include your name here (optional).

Your name and review will not be published with the manuscript.

Reviewer #1: (No Response)

RESPONSE: Thank you for reviewing it.

But they had made it more difficult in the interim by changing Editorial Manager. March 21:

Dear Sean,

Thank you for contacting PLOS ONE.

We apologize for the difficulty you experienced while attempting to resubmit your manuscript. We have recently updated the Editorial Manager system to require co-author zip codes. This means that authors of revisions that previously included sufficient co-author details in their original submission will be presented with an error message when they try to submit their manuscript. Please go back to the details page and fill in the zip code for each co-author before you attempt to submit your manuscript again.

We appreciate your submission to PLOS ONE and if we may be of further assistance, please do not hesitate to get in touch.

Kind Regards,

Jackie

Jackie Surplice

EO Staff

PLOS ONE

Case Number: 01878752 ref:_00DU0Ifis._500U07B37m:ref

After several more emails on formatting and the addition of our other paper, now published. April 3rd:

I am pleased to inform you that your manuscript has been deemed suitable for publication in PLOS ONE.

Your manuscript will now be passed on to our Production staff, who will check your files for correct formatting and completeness. After this review, they may return your manuscript to you so that you can make necessary alterations and upload a final version.

Before uploading, you should check the PDF of your manuscript very closely. THERE IS NO AUTHOR PROOFING. You should therefore consider the corrected files you upload now as equivalent to a production proof. The text you supply at this point will be faithfully represented in your published manuscript exactly as you supply it. This is your last opportunity to correct any errors that are present in your manuscript files.

In addition, now that your manuscript has been accepted, please log into EM at http://www.editorialmanager.com/pone and update your profile. Click on the “Update My Information” link at the top of the page. Please update your user information to ensure an efficient production and billing process.

If you or your institution will be preparing press materials for this manuscript, you must inform our press team in advance. Please contact them at ONEpress@plos.org.

Please contact us at plosone@plos.org if you have any questions, concerns, or problems, and thank you for submitting your work to our journal.

With kind regards,

Daniel S. Sem

Academic Editor

PLOS ONE

I will spare you all the emails back and forth on production around heading levels, figure resolutions etc. But needless to say, we probably dealt with most people at PLOS. One point of contact would have been ideal. Dianne Cartwright is acknowledged for giving us the opportunity to appeal and Daniel Sem for the addition to his editing workload. But honestly, based on our responses, it would have been faster to submit the paper elsewhere and get it published. The paper came out May 7th. The Chemistry & Biology paper came out March 21.