Abstract

Monoclonal antibodies have proven potential as biological therapeutics. OKT3, Herceptin, Keytruda and others have positively impacted healthcare. Antibodies evolved naturally to provide high specificity and high affinity once mature. These characteristics can make them useful as therapeutics. However, we may be missing characteristics that are not obvious. We present a means of measuring antibodies in an unbiased manner that may highlight therapeutic activity. We propose using a microarray of random peptides to assess antibody properties. We tested twenty-four different commercial antibodies to gain some perspective about how much information can be derived from binding antibodies to random peptide libraries. Some monoclonals preferred to bind shorter peptides, some longer; some preferred motifs closer to the C-terminus, some nearer the N-terminus. We then tested antibodies with clinical activity whose function was blinded to us at the time. We were provided with thirteen different monoclonal IgM antibodies, five mouse and eight human. These antibodies produced a variety of binding patterns on the random peptide arrays. When unblinded, the antibodies with polyspecific binding were the ones with the greatest therapeutic activity. The protein target of these therapeutic monoclonals is still unknown, but using common sequence motifs from the peptides we predicted several human and mouse proteins. The same four top-ranked proteins appeared in both the mouse and human lists.

Data Availability: Data are available from the Gene Expression Omnibus (GEO) at NCBI. Platforms: GPL14921, GPL17490 and GPL17679; Series: GSE49217, GSE50044 and GSE50045.

Funding: Ameneh Zare was supported by a Du Pre Grant from MS International Federation (MSIF). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Monoclonal antibodies can bind a variety of targets: lipids, LPS, sugar moieties, phosphorylated or myristoylated residues, conformational or multimeric targets. Therapeutic antibodies often possess characteristics that promote some desired activity in vivo. Measurements of affinity can be done using ELISA, SPR or other biochemical assays. However, the human body is a complex environment of proteins, buffers, pH, temperatures, and competitors. Specificity is a measure of off-target binding under very controlled conditions. In the human body, if specificity is high, the antibody might not bind its target in vivo due to competition or unsuitable presentation of the target. If too low it might riddle an otherwise effective therapeutic antibody with side effects.

Peptide microarrays have long been used to analyze antibodies against linear epitopes. Tiling epitope arrays can demonstrate the specificity of both polyclonal and monoclonal antibodies [1]. Phage display uses even larger libraries of peptides providing more epitopes to pan, but in both cases the results generally answer the same question: which linear peptide sequences bind to a given monoclonal? A few limitations should be mentioned: antibodies to non-linear epitopes may bind peptides, but they would likely be a mimotope, a sequence unrelated to the sequence of the antigen. Additionally, the peptide arrays likely contain only those peptides needed to cover a single proteome or even just the protein(s) under investigation.

We think it is possible to use a peptide microarray of random sequences to characterize antibodies to both linear and non-linear epitopes. Mimotopes can bind an antibody as strongly as, or more strongly than, its original antigen and may exist in a random peptide array of only a few hundred thousand sequences [2]. We have demonstrated that antibodies to linear sequences can find motifs that match their antigen.

When peptides are arrayed on a solid surface, such as a microarray, peptide-antibody interactions can be measured by detecting bound antibody after stringently washing the array. Typically, an antibody is bound strongly when 4–5 residues make a perfect match, generally ~50 kcal/mol. With fewer matching residues, antibodies are washed off. This is how most tiling arrays work; it is undesirable to retain antibodies from imperfect matches. We found that peptides spaced <1 nm apart on a solid surface could create a dense forest in which weakly captured antibodies are trapped and re-bind to neighboring peptides. This creates such high local avidity that antibodies with only 2–3 perfectly matching residues are retained even after a stringent wash [3]. This allowed us to see thousands of binding events on arrays with only 125,000 to 330,000 peptides [4].

Materials and methods

Training on commercial antibodies

To understand how different antibodies behave on random-peptide arrays, we selected four different peptide microarrays that use random-sequence peptides but differ in the lengths and numbers of peptides (see Table 1). We then purchased 24 different commercial monoclonal antibodies to test on these four peptide microarrays (see Table 2). The epitopes for these antibodies varied substantially, by design. One target is a hapten, three targets are proteins in the phosphorylated and dephosphorylated form, eleven are linear peptides, four of which are <13 residues, and seven are putative but unmapped regions of proteins. We focused our analysis of linear epitopes on the eleven monoclonals raised against linear peptides.

Four libraries were constructed using mask-based lithography synthesis. Each library is synthesized on silicon wafers coated with silicon oxide. Every library is made in a 0.49 cm² area; 24 separate arrays are arranged on a standard 25 mm x 75 mm slide in an 8x3 layout. Every array was assayed under the same conditions.

Ab Name is the title of the antibody as sold by the manufacturer. Protein target is the gene name of the protein that served as the antigen. Ab Clone is the nomenclature the manufacturer uses to identify the hybridoma cell lineage. Epitope is the target of the antibody, generally the position in the Protein Target (when known) that contains the exact epitope. Generally, a short Epitope implies a linear peptide was used as the immunogen. aa is the number of amino acids included in the linear immunogen. Source is the manufacturer, along with the catalog number. Host is the mammalian animal host. Last, isotype is the class of antibody produced.

Testing blinded therapeutic antibodies

Our collaborators at the Mayo Clinic in Rochester, MN supplied us with thirteen different therapeutic antibodies (Table 3). The characteristics of these antibodies were blinded to us; we were told only that they were of mouse or human origin and that they were IgM. These antibodies had previously been tested for their ability to remyelinate the central nervous system (CNS) as an approach to remediate symptoms of multiple sclerosis (MS) [5]. These monoclonals were sourced from Waldenstrom’s myeloma cells, and are IgM rather than IgG [6]. They were selected for their unusual properties, which cause remyelinating activity in vivo [7, 8]. Mayo’s experiments revealed that the most efficacious of these antibodies did not halt demyelination; rather, they initiated remyelination of neurons during periods of remission. This effect could potentially be leveraged to restore function to humans recovering from MS. Five mouse and eight human antibodies were selected, deidentified, and sent to Arizona State University for processing on the 330,000 peptide immunosignature array.

These antibodies were shipped from Mayo (Rochester, MN) to ASU (Phoenix, AZ) blinded, labeled only by the source (human or mouse) and the number (Antibody ID). The Code, Specificity in CNS and the Function were known only to Mayo prior to unblinding. Human IgM-6 (Code 22) is in human trials for remyelination. Mouse IgM-5 (Code 94.03) was shown to promote remyelination, and was identified as a natural autoantibody.

We followed the same procedure used for the commercial monoclonals: exploratory data analysis of the data distributions, general patterns and commonality among the antibodies.

Peptide synthesis and array printing

Four different random peptide microarray libraries were used (Table 1), two with ~330,000 peptides and two with ~125,000 peptides. Some libraries had peptides of shorter or longer length; some had fixed-length peptides; some had peptides of variable length. Each library was treated the same with respect to sample processing and analysis. Each array is repeated 24 times on one standard slide with a gasket separating the assays. Synthesis of the peptides on the silicon wafers was performed as described [9], using shadow-mask lithography and BOC peptide synthesis. The assay is performed as follows: first, arrays are incubated in the presence of sample buffer (SB = 1x PBS pH 7.3 + 0.05% Tween20 (Sigma-Aldrich, St. Louis, MO)) for one hour at 25°C with gentle agitation. Antibodies are added by multichannel pipette to the arrays to a final concentration of 4 nM in 150 µl of sample buffer. The primary incubation is done at 37°C in a rotating hybridization oven (Agilent, Santa Clara, CA) for 1 hr. The gasket is removed, the slide washed 3x in SB for 5 minutes with agitation, then 3x5 minutes each in deionized 50MW water with agitation. The slides are placed in a 5 ml tray without the gasket, where 2 ml of SB + 5 mg/ml casein (Fisher Scientific, Fair Hills, WI) at pH 7.3 is added and fluorescent anti-mouse (Jackson Lab AlexaFluor555 goat anti-mouse IgM Fc) or anti-human (Life Technologies AlexaFluor555 mouse anti-human Fc) secondary antibody is added to a final concentration of 4 nM. The secondary binds to the primary antibodies for 1 hr at 25°C with gentle agitation. Slides are washed as above, dried by centrifugation at 1500g for 10 minutes, then scanned at 1 µm resolution in an Innopsys Innoscan 910 two-channel scanner at high laser power, 20% PMT. A 16-bit TIFF image is stored for each array (24 images per slide) and aligned using GenePix Pro 6.0; data are analyzed using GeneSpring 7.3.1 (Agilent, Santa Clara, CA) or R (CRAN Repository).

Analysis methods

We first asked whether there were any generalizable measures of binding that could differentiate the 24 antibodies. To answer this, we analyzed the data distributions for trends or patterns using EDA (exploratory data analysis) across all four peptide libraries. We then asked whether there were differences in these trends on libraries with shorter or longer peptides (330K long and 330K short libraries), or on arrays with fixed length (HT124K) vs. variable length (CIM125K). We compared Shannon’s entropy, data distribution, binding promiscuity, and dynamic range. Finally, we asked whether, for those monoclonals raised to a linear epitope, we could find evidence of that eliciting epitope in the short, random peptides. Once we completed these preliminary experiments, we tested a collection of blinded antibodies provided by a collaborator at Mayo, Rochester.
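As an illustration of one of these measures, binding promiscuity can be sketched as the fraction of peptides whose signal exceeds background by more than 2 standard deviations, the threshold quoted in the Fig 1 legend. This is a minimal sketch under stated assumptions, not the analysis code used in the study: the function name, the use of a separate set of background features, and the example values are all hypothetical.

```python
from statistics import mean, stdev

def promiscuity(intensities, background):
    """Fraction of peptides with signal > mean(background) + 2 * SD(background).

    Assumption: 'background' is a list of intensities from features taken
    to represent non-specific signal; the source does not say how
    background was estimated.
    """
    threshold = mean(background) + 2 * stdev(background)
    hits = sum(1 for x in intensities if x > threshold)
    return hits / len(intensities)

# Hypothetical example values (not data from this study).
background = [100, 110, 90, 105, 95]
signals = [98, 102, 500, 130, 91, 2500, 107, 115]
fraction_bound = promiscuity(signals, background)
print(f"{fraction_bound:.3f} of peptides bind above threshold")
```

A restricted binder would give a fraction near zero, while a promiscuous antibody would give a much larger value on the same library.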

To capture sequence information from the peptides per monoclonal, we analyzed the top 200 peptides that uniquely bound to each monoclonal. These peptides and their binding intensities are shown in the heatmaps in Fig 4. To identify peptides specific to each monoclonal, we computed a correlation score between each peptide’s binding profile across all monoclonals and a hypothetical vector representing the highest signal for that monoclonal and the lowest possible signal for every other monoclonal. Any peptide that matched this hypothetical pattern produced a high correlation and appeared at the top of the list. This list contained peptides that bind strongly to just one monoclonal. We used CLUSTALW (GNU General Public License, v2) to group the peptide sequences into clusters, and used GLAM2 [10] to align the peptides from CLUSTALW, with gaps if necessary, to find the most conserved positions in a motif. The resulting motifs were compared with the eliciting peptide sequence when known. This was done for 10 of the monoclonals (bold/underlined in Table 2).
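The pattern-matching filter described above can be sketched as follows. This is an illustrative sketch only: it assumes Pearson correlation as the correlation score and a 1/0 template vector as the idealized pattern, and the function names, peptide labels, and intensities are hypothetical rather than taken from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no pattern information
    return cov / (sx * sy)

def unique_peptides(intensity, antibody_index, top_n=200):
    """Rank peptides by similarity to 'high for one antibody, low for all others'.

    intensity: dict mapping peptide -> list of signals, one per antibody.
    """
    n_antibodies = len(next(iter(intensity.values())))
    template = [0.0] * n_antibodies
    template[antibody_index] = 1.0  # idealized, perfectly discriminatory pattern
    scored = [(pearson(profile, template), pep) for pep, profile in intensity.items()]
    scored.sort(reverse=True)
    return [pep for _, pep in scored[:top_n]]

# Hypothetical intensities for four peptides across four antibodies.
intensity = {
    "PEPA": [9000, 100, 120, 90],      # bound only by antibody 0
    "PEPB": [8000, 7500, 8200, 7900],  # bound by every antibody
    "PEPC": [100, 9500, 110, 130],     # bound only by antibody 1
    "PEPD": [5000, 300, 2500, 200],    # partly shared
}
print(unique_peptides(intensity, 0, top_n=2))
```

Because Pearson correlation ignores scale, a peptide bound strongly by every antibody scores low even though its raw signal is high, which is the behavior the filter is meant to have.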

Results

Immunosignature training on 24 monoclonals

The peptide arrays produce binding data between antibody and peptides. This pool of data is called an immunosignature. An immunosignature is a compilation of the steady-state binding affinities between the library peptides and the antibody (or collection of antibodies, as found in serum). Immunosignatures are generally log10 normal [3, 11]. However, a single antibody can have a broad range of binding characteristics and the binding data may deviate from log10 normality. This measure of the data distribution can be considered the first of many general observations. As in epitope mapping experiments, the highest-binding peptide sequences for antibodies raised against linear peptides can be measured, clustered, and examined for motifs that should correspond to the linear epitope for that antibody.

Fig 1 shows images from the four peptide libraries that were used. The 330K (short) library consisted of peptides of varying lengths, with an average length of 12.2 residues. The 330K (long) library consisted of peptides of length 17 residues. The HT 124K is a commercial peptide array made by HealthTell (San Ramon, CA) with a mean length of 9 residues. The CIM 125K is similar to the HealthTell arrays, with a mean length of 12 aa. Previous reports indicate that even short motifs found in an immunosignature peptide can be statistically relevant to the linear epitope of an antibody [12–16], but it was unknown how the size of the library impacted the ability to predict epitopes.

Fig 1. Raw images of a small portion of the upper-left portion of four different peptide microarrays showing four different monoclonal antibodies.

The X axis is four different peptide libraries; CIM330K (short) is a library of 330,000 random-sequence peptides of length 12.2 residues; CIM 330K (long) is a library of 330,000 random-sequence peptides of length 17 residues. HT124K is a library of 124,000 random-sequence peptides of length 9 residues; CIM125K is a library of 125,000 peptides of mean length 12 residues. The Y axis is four commercially-sourced monoclonal antibodies. Row 1: anti-human TP-53 (Ab1) has low binding to most peptides but very high binding to a small subset of peptides, especially to those containing the sequence RHSVV. Row 2: anti-human hnRNP monoclonal has an intermediate binding prevalence, with approximately 15% of the total peptides binding at >2SD above background. Row 3: anti-human p38 monoclonal has low binding pattern but at least 15% of all peptides bind at least 2SD above background with a few high binders. Row 4: anti-AKT1(7) monoclonal has more promiscuous binding with >40% of all peptides binding >2SD above background for HT124K and CIM125K. This visual display is intended to demonstrate qualitatively how diverse the binding patterns are.

Table 4 and Fig 2 illustrate a method that uses Information Theory to determine whether there is a difference in the diversity or randomness found in the binding pattern. Table 4 lists several descriptors of data distributions (mean, stdev, upper 95th percentile, skewness, kurtosis, dynamic range). We settled upon Shannon’s Entropy as it provided a wide range of values that should indicate diversity of information content. We applied Shannon’s Entropy [17] to the patterns of data for each monoclonal for each random peptide library. Entropy is highly dependent on the composition of the library, thus there are larger differences in the score between libraries than between antibodies using the same library. Entropy scores are constrained by each library. Therefore, the rank order of scores for antibodies is the best way to compare each antibody across the different peptide libraries. Entropy was found to be more sensitive to subtle changes in the relationship between the peptide binding pattern and biological associations. For example, the entropy score for an immunosignature increases (more randomness) as additional monoclonals were added to the solution that bound to the microarray. Thus, for a single monoclonal, an entropy score would be high if that monoclonal were more promiscuous, but also a strong binder [17]. The antibody p53 Ab8 had the highest relative entropy score across all libraries and has shown strong and very specific binding to very few peptides, generally when at least 2–3 amino acid residues of its cognate epitope are present and only when they contain the tryptophan (SDLWKY). By comparison, anti-BrdU antibody was raised against a small hapten and binds many peptides strongly.
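To make the entropy measure concrete, the sketch below computes Shannon’s entropy for one antibody on one library. It rests on an assumption the text does not confirm: that the peptide intensities are normalized to sum to 1 and treated as a probability distribution, giving H = -Σ p_i log2(p_i). The function name and example values are hypothetical.

```python
import math

def shannon_entropy(intensities):
    """Shannon entropy (bits) of intensities normalized to a probability vector.

    Assumption: p_i = intensity_i / sum(intensities). The study may have
    used a different discretization (for example, histogram bins).
    """
    total = sum(intensities)
    if total <= 0:
        raise ValueError("intensities must have a positive sum")
    entropy = 0.0
    for x in intensities:
        if x > 0:  # zero-intensity features contribute nothing
            p = x / total
            entropy -= p * math.log2(p)
    return entropy

# Hypothetical four-peptide arrays (not data from this study).
promiscuous = [10, 10, 10, 10]   # signal spread evenly over all peptides
restricted = [1000, 1, 1, 1]     # signal concentrated on one peptide
print(shannon_entropy(promiscuous), shannon_entropy(restricted))
```

Under this definition the maximum possible score is log2 of the number of peptides, which is one reason absolute scores depend on the library and only rank order is comparable across libraries.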

Fig 2. Entropy measures of each of the 24 different monoclonals tested.

Shannon’s Entropy was calculated for each of the monoclonals and each of the 3 different peptide libraries. Since each peptide library is different, entropy calculations differ as well; however, a general trend shows that p53Ab8 has generally high measured entropy and anti-BrdU the lowest.

Table 4. List of selected characteristics of the data distributions from each of the peptide array formats.

The table below lists seven different numerical descriptors of the full dataset from each peptide microarray library (330K short peptide, 330K long peptide, 125K short peptide, 124K short peptide), and for each antibody. In order, the descriptors are: Shannon’s Entropy [17], 95th percentile coefficient of variance (calculated from the standard deviation / mean of the 95th upper percentile of all observed fluorescence intensities for each antibody per peptide library), the mean of all intensities, the stdev of all intensities, the measured kurtosis of all raw intensities, the skewness of the raw intensity data distribution, and the dynamic range of the raw intensities.
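Most of the descriptors in this legend are standard moment-based statistics, and a minimal sketch of how they could be computed from one array’s raw intensities is given below. Several choices here are assumptions, since the legend does not define them: kurtosis is computed as the non-excess moment ratio (3.0 for a normal distribution), skewness as the population moment ratio, and dynamic range as the ratio of maximum to minimum intensity. The function name and example values are hypothetical.

```python
import math

def describe(intensities):
    """Moment-based descriptors of one array's raw intensities.

    Assumes at least two distinct, strictly positive values.
    """
    n = len(intensities)
    mean = sum(intensities) / n
    # Central moments (population form).
    m2 = sum((x - mean) ** 2 for x in intensities) / n
    m3 = sum((x - mean) ** 3 for x in intensities) / n
    m4 = sum((x - mean) ** 4 for x in intensities) / n
    return {
        "mean": mean,
        "stdev": math.sqrt(m2 * n / (n - 1)),  # sample standard deviation
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,              # assumption: non-excess kurtosis
        "dynamic_range": max(intensities) / min(intensities),  # assumption: max/min
    }

# Hypothetical intensities (not data from this study).
summary = describe([1, 2, 3, 4, 5])
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```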

Fig 3 shows the distributions plotted as density maps. It is easier to view these densities as general patterns rather than try to examine the finer details per distribution. As seen, there is substantial variation in the shape of the curves, and the breadth. The wider the distribution plot, the higher the dynamic range of the peptide binding scores. The plots also give an indication of normality, with some plots like hnRNP A1, p21, and BAX showing the most deviation from log10 normality on all libraries tested. Further, the CIM125K library and the HT124K library tended to produce distributions closer to log10 normality, but also provided less dynamic range.

In the lower half of Fig 3, the data distributions are plotted together, which highlights how few antibodies deviate from the general plot shape. For the libraries with variable peptide length, the distribution of lengths is plotted as a histogram within the density plots. The CIM330K short library has a wider distribution of lengths than the HT124K library. This effect is seen in the total number of peptides for a given length; the 330K library has a higher percentage of lengths at different discrete values relative to the mean length than the HT124K library, meaning that there is a wider range of peptide lengths.

Fig 4 is a heatmap showing the relationship of the binding intensity per peptide per monoclonal relative to those peptides binding to the other 23. This serves to highlight the specificity of the assay–for each monoclonal there are 200 peptides that bind only to that antibody and do not bind any other antibody.

Fig 4. Hierarchical clustering of the top 50 peptides for each of the 24 monoclonals tested.

The top 200 peptides for each monoclonal were selected by filtering via pattern-matching to a perfectly discriminatory pattern (i.e. high for each monoclonal, low for the other 23 monoclonals). This filter produced peptides that are unique to each monoclonal, if possible. The values for these 50*24 = 1200 peptides are shown for the three microarray libraries. The peptides were clustered using Pearson’s correlation coefficient to group peptides on the Y axis; the X axis lists each monoclonal and was ordered manually. Common reactivity is seen as colored bars off the diagonal axis.

To understand whether this method can provide information about unknown antibodies, we asked a collaborator for a collection of clinical monoclonals that were tested for their ability to remyelinate the central nervous system (CNS) in patients who were in remission from multiple sclerosis. These monoclonals were sourced from human Waldenstrom’s myeloma cells. Notably they are IgM rather than IgG. These were selected for their remyelinating activity in vivo. Their effects in laboratory mice infected with Theiler’s virus, a picornavirus that persists in the CNS and causes demyelination, demonstrated that no antibody stopped demyelination but that they initiated remyelination of damaged neurons. This is the clinical effect that was being sought. If the immunosignature data from the commercial monoclonals are relevant, the approach should work for any monoclonal, not just well-characterized IgG molecules. We assayed the eight human and five mouse antibodies, but were left blinded to their clinical data.

One of the antibodies (HIgM22) completed phase I clinical trials in multiple sclerosis patients without any serious complications. Whether this antibody, which promotes remyelination in animals, also promotes remyelination in multiple sclerosis patients is unknown. The dataset was analyzed blinded and reported to the collaborator, who then interpreted the immunosignature (IMS) data relative to each antibody’s efficacy in promoting CNS remyelination.

The concept that antibodies can promote remyelination comes from previous experiments in the Mayo/Rodriguez lab, where adoptive transfer of antisera raised against purified mouse spinal cord homogenate induced remyelination in animals with CNS demyelination caused by Theiler’s virus. As a result, spleens from the animals that produced these remyelinating antisera were fused to produce mouse monoclonal antibodies. These monoclonals were then screened for their ability to bind to CNS by immunofluorescence. Those that bound to myelin were then injected into mice infected with Theiler’s virus to determine which antibodies promote remyelination. It was shown that the antibodies that promote repair were polyreactive and had DNA sequences similar to germline, making them natural antibodies. Once this was known, monoclonal antibodies from human patients with monoclonal gammopathies were screened for their ability to bind CNS myelin by immunofluorescence and then to promote remyelination in the Theiler’s virus model of demyelination. One of these antibodies that promoted consistent remyelination (rHIgM22) was sequenced and cloned to obtain a recombinant protein, which is now being used in the multiple sclerosis clinical trial.

Informatic analysis of 24 monoclonals

We first wished to test the general characteristics of the 24 monoclonals. Antibodies were assayed according to Materials and Methods. An image of 3 sample antibodies was taken from each of the three libraries. The 9 images are shown in Fig 1. Note that some antibodies have very restricted binding patterns, as exemplified by JNK2, where few peptides are bound by the monoclonal. This is the opposite of p21, where many peptides are bound by the monoclonal. Fig 2 shows a bar-chart of the 24 antibodies’ entropy calculation. Values range from 5 to 7 for all three libraries, but the range of values differs by the peptide library. The entropy scores should be compared non-parametrically using rank rather than absolute scores. The composition of the peptide library has a large effect on entropy, more than the differences across antibodies. Therefore each library should be analyzed relative to the antibodies tested, rather than a direct comparison of entropy across libraries. The highest entropy for all libraries was p21. The lowest for the two 125K libraries was actin B and phospho-p21. The lowest for the 330K library was anti-human p53 Ab8.
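The rank-based comparison recommended above can be sketched as follows. The use of Spearman’s rank correlation is our illustrative choice of a non-parametric statistic, not a method named in the text; the antibody labels and entropy values are hypothetical, and ties are not handled.

```python
def ranks(scores):
    """Map each antibody to its rank by entropy (1 = highest). Ties are not handled."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}

def spearman(scores_a, scores_b):
    """Spearman rank correlation between two libraries' entropy scores."""
    rank_a, rank_b = ranks(scores_a), ranks(scores_b)
    n = len(rank_a)
    d_squared = sum((rank_a[name] - rank_b[name]) ** 2 for name in rank_a)
    return 1 - 6 * d_squared / (n * (n * n - 1))

# Hypothetical entropy scores: absolute values differ by library,
# but the ordering of antibodies is the same.
library_a = {"Ab1": 6.9, "Ab2": 6.1, "Ab3": 5.2}
library_b = {"Ab1": 5.8, "Ab2": 5.5, "Ab3": 5.1}
print(ranks(library_a))
print(spearman(library_a, library_b))
```

In this example the two libraries give different absolute scores yet an identical ordering, so the rank correlation is 1 even though a direct comparison of values would suggest disagreement.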

Fig 3 shows the density plots for each of the monoclonals. Density distributions reflect the binding variance and deviation from log-normal. As before, the CIM330K library shows some difference in the antibody profiles, suggesting that the older array may have properties unique to that library and synthesis method. The two newest array platforms show similar profiles.

Fig 4 is a display of the peptide intensities for 50 of the most unique peptides for each antibody. 1200 peptides are shown in total. As seen, there is little overlap in the peptides that each antibody bound well.

Fig 5 is an analysis of the Shannon’s entropy score for the Mayo monoclonal antibodies. In this figure the highest entropy scores resulted from Human 6 and Mouse 5.

Experimental IgM monoclonals used for therapeutic remyelination in human and mouse. IgM6, IgM1 and IgM5 were shown by clinical trial to have significant efficacy in remyelinating human and mouse neurons, respectively. No other monoclonal showed efficacy.

Data for all 125,000 peptides from CIM125K are shown as a density plot, either one by one (top plots) or side-by-side (bottom plot). For the distributions shown along the bottom, blue color indicates low intensity binding while yellow and red indicate higher binding at least above the median signal for that array. Antibodies are shown in the same order as Fig 4. Wide/broad distributions match promiscuous binding of specific antibodies, narrow distributions suggest more specific binding. No other serological test performed on any of these antibodies led investigators to predictions of efficacy, but therapeutic efficacy of these IgM antibodies correlated perfectly with the broad distributions and relatively high entropy scores.

In Fig 7 we show the results from the same method used previously to obtain peptides that bind uniquely to each monoclonal. The heatmap on the left shows that only 1/3 of the peptides selected to be uniquely bound by each monoclonal were in fact unique–the rest were generally common to all of the tested antibodies. This was most apparent in the mouse monoclonals (right side of heatmap). The heatmap on the far right shows the peptides that were selected for the human and the mouse antibodies. The common high binders were different for mouse and for human, suggesting that there were different epitopes being bound by the antibodies. These 200 peptides for mouse and separately 200 peptides for human were BLASTed against their respective proteomes using BLASTP with a low stringency E<0.01 cutoff. Each of the 200 peptides was used to obtain a list of matching proteins. For the human antibodies, there were 118 different protein targets found, for mouse there were 172 different proteins found. Table 5 lists the top four proteins that were found more often than any other protein. These four common proteins which were found both in mouse and human were identified by 400 different peptides. We do not know the identity of the actual biological targets for these antibodies but the common targets for both mouse and human appear to be cytoskeletal in nature. It is possible the size of these proteins plays a role in finding them using this probabilistic approach, but there are other large proteins in the human and mouse proteomes and both found similar proteins at approximately the same rate.
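The tallying step that follows the BLASTP search can be sketched as below. The sketch assumes the search results have already been parsed into a mapping from each peptide to the proteins it aligned with, counting every alignment; the function names, peptide labels, and hit lists are hypothetical and no real BLAST output is shown.

```python
from collections import Counter

def tally_hits(hits_per_peptide):
    """Count how often each protein is hit across all query peptides.

    hits_per_peptide: dict mapping peptide -> list of protein names,
    one entry per alignment passing the E-value cutoff.
    """
    counts = Counter()
    for proteins in hits_per_peptide.values():
        counts.update(proteins)
    return counts

def shared_top(human_counts, mouse_counts, k=4):
    """Proteins that appear in the top-k lists of both species."""
    top_human = {protein for protein, _ in human_counts.most_common(k)}
    top_mouse = {protein for protein, _ in mouse_counts.most_common(k)}
    return sorted(top_human & top_mouse)

# Hypothetical parsed hits (not results from this study).
human_hits = {
    "pep1": ["titin", "titin", "dynein heavy chain"],
    "pep2": ["myosin", "titin", "dynein heavy chain"],
    "pep3": ["albumin"],
}
mouse_hits = {
    "pepA": ["titin", "dynein heavy chain"],
    "pepB": ["titin", "dynein heavy chain", "titin", "keratin"],
}
human_counts = tally_hits(human_hits)
mouse_counts = tally_hits(mouse_hits)
print(shared_top(human_counts, mouse_counts, k=2))
```

Because large or repetitive proteins offer more opportunities for a short peptide to align, raw counts of this kind favor them; normalizing by protein length would be one way to probe the size effect raised above.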

Each of the clinical monoclonals was tested exactly like the 24 commercial antibodies, to find 50 peptides that were unique for each antibody (see Fig 4). Left: For each of the antibodies, some unique peptides were identified, but for the human antibodies many peptides overlapped, suggesting a common target. The mouse antibodies had less overlap with either the human or other mouse antibodies. Right: We applied a general filter for high-binding peptides. Here there are 200 peptides identified for the human antibodies (left) and 200 for the mouse antibodies (right). These high-binding peptides overlap within each species but not between mouse and human, reinforcing the possibility that these two sets of antibodies are against different protein targets. These 200 peptides were used to BLAST all human and all mouse proteins, respectively (see Table 5).

Table 5. List of most common hits from mouse and human peptides (from Fig 7).

The Mayo monoclonals from Table 3 were tested on the 330K immunosignature array. 200 peptides that were common for the human and 200 peptides common for the mouse antibodies were searched against the GenBank human (hs) or mouse (mm) protein database using BLASTP and an E-value cutoff of 0.01. The protein hits for each peptide were compiled and sorted. The table below contains proteins from both mouse and human that were hit at least 2-fold more often than the next most common protein. The first column lists the protein common name; the second and third columns list the number of times the 200 peptides aligned with each protein for human and mouse, respectively. The next highest number of hits for human was 61 and for mouse was 47.

Fig 8 is an example of how a library of random-sequence peptides can identify a linear sequence of protein that defines the eliciting epitope, similar to the way standard epitope mapping experiments work. This relates to the previous analysis of the Mayo monoclonals suggesting that there may be some capacity of the random peptides to find actual epitopes from antibodies–we previously explored this capacity [12, 14, 16] with similar results.

JNK2 and DM1A were processed on 3 microarray platforms. 125K, 124K and 330K array data were used to find epitopes using CLUSTALW and GLAM2. The large text figures represent the GLAM2 output. These motifs are similar to the actual linear epitope shown underlined in the protein sequence. Right: Guitope [13] was used to identify a region of either JNK2 (top three graphs) or tubulin (bottom three graphs). The red lines indicate the noise threshold, generated by testing all random peptides from each of the peptide libraries. The green line is the signal from the 200 selected peptides unique to that antibody. The black vertical line indicates the position within the protein where the epitope is likely to reside. For both proteins, the Guitope analysis predicted the exact location of the start of the epitope sequence: 72 residues from the C-terminus of JNK2 and 23 residues from the C-terminus of tubulin.

Discussion

We introduced a method for investigating binding properties of monoclonal antibodies. We examined the binding of 24 different commercially sourced monoclonal IgG antibodies to four different random-peptide immunosignature microarrays. Some monoclonals had published and defined targets; most reported only the protein used as the immunogen. DM1A is a human anti-Tubulin IgG that was created by immunizing with the peptide AALEKDY, while others like the Akt1 antibody were raised to a segment of protein representing amino acids 345–480 in human Akt1. Some targets were not detailed by the manufacturer, a common practice for many commercial antibodies. We analyzed all of the antibodies but could not confirm whether our information about each target is correct. Some targets such as p-p21 were phosphorylated, and the non-phosphorylated counterpart, i.e. p21, was also tested.

We first used a general exploratory analysis. We asked whether information theory could provide some insight into the behavior of the monoclonals. Here we used Shannon’s entropy [17] to explore how many independent binding events could be examined together to form a picture of antibody behavior. When examined along with the histograms of the data, a picture emerges of how a given antibody responds to random sequences. Some antibodies prefer to bind to many different peptides strongly, some to many peptides weakly, but most bound to several hundred to several thousand peptides strongly, regardless of the peptide library that was used. The 330K long array highlighted that some antibodies prefer to bind to longer peptides, as evidenced by the increased number of high-binding events on the 330K long vs. the HT 124K library. A sequence analysis of the 200 highest binding peptides per antibody revealed that some monoclonals bound to similar motifs but that those motifs could appear closer to the C-terminus (p53 Ab8) or to the N-terminus (p53 Ab1). These sorts of observations are difficult to obtain using classic antibody binding measurements like ELISA. Phage display can provide information like this, but unlike phage display, the immunosignatures can display non-binding information. p53 Ab8 was raised to SDLWKLL and binds to peptides that differ substantially from this sequence as long as there is a tryptophan near the middle of the peptide. Almost no peptide lacking tryptophan bound to p53 Ab8.

We then examined 13 IgM antibodies from Mayo. The molecular target for the remyelination antibodies is unknown. Neither the human nor the mouse antibody panel has yielded a confirmed in vivo target. In Hecker et al. [18], a tiled peptide microarray made by JPT Peptides (Berlin, Germany) detected several candidate proteins with high antibody reactivity in relapsing-remitting multiple sclerosis (RRMS); ACTB and ACTG (human actin beta and actin gamma) were identified by several high-binding peptides in a majority of MS case samples. CRYAB and S100A1, a heat shock protein and a calcium-binding protein, respectively, were also identified by commonly binding peptides present in these proteins. Given the wide range of possible targets for these therapeutic remyelination antibodies, we followed an unbiased search using the peptides that bound the monoclonals at the highest intensity.

Mayo provided the antibodies blinded. Analysis showed that IgMhu6 and IgMMm5 had the highest entropy values and the broadest and most non-normal density distributions. The patterns in Fig 7 indicate that few peptides were completely unique to each antibody. The test to pick unique peptides is quite stringent, as seen for the commercial antibodies in Fig 4. The human IgM antibodies showed a great deal of overlap, even though the selection process actively discouraged any overlap. This may suggest a common target. IgMhu6 showed little overlap with other antibodies. This may be due to many peptides binding simultaneously, decreasing specificity for a given set of peptides. The same pattern appears in mmIgM6. It is worth noting that the mouse antibodies had more specificity across the different clones than the human antibodies. There was little commonality between the mouse antibodies in general and the human antibodies, as shown in Fig 8. Mayo demonstrated that Human 6 and Mouse 5 were most efficacious in promoting remyelination in animal models of demyelination, including both the Theiler’s virus model and direct lysolecithin injection into the cord. Mayo tested lipid panels, pull-downs, western blots, and other discovery methods, but the target remained elusive. Without a candidate protein, it is difficult to align motifs to obtain a confident target, so we took an ab initio approach and simply ran BLAST searches with the peptides against the human and mouse proteomes, respectively. We identified Kelch-like (123 times in human and 126 times in mouse), dynein heavy chain (535 in human and 768 in mouse), myosin family protein (350 in human and 322 in mouse), and titin (202 in human and 88 in mouse). It may be that these overlapping proteins contain the target sequence from another protein, or that their repeating units enhanced off-target alignments, but it is equally likely that the Mayo antibodies are actually binding or stabilizing certain cytoskeletal components, allowing remyelination. Dynein had hits along the length of the protein, but dynein is a large protein (nearly 5000 amino acids), so peptides are likely to match it by random chance. Nevertheless, a western blot study of these proteins might prove informative.
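The caution about protein length can be made concrete with a rough back-of-the-envelope sketch. It assumes exact matching of short motifs and uniform, independent amino acid frequencies, which is a deliberate simplification and not how BLAST scores alignments; the function name `expected_chance_matches` and its parameters are hypothetical.

```python
def expected_chance_matches(protein_length, motif_length, n_motifs=1, alphabet_size=20):
    """Expected number of exact motif matches arising by chance.

    Assumptions (illustrative only): residues are independent and
    uniformly distributed over alphabet_size letters, and a hit is an
    exact match of the whole motif at one position.
    """
    if motif_length < 1:
        raise ValueError("motif_length must be at least 1")
    if protein_length < motif_length:
        return 0.0  # the motif cannot fit in the protein
    windows = protein_length - motif_length + 1  # possible start positions
    return n_motifs * windows / alphabet_size ** motif_length


# A protein of about 5000 residues (the approximate size quoted for dynein)
# versus a hypothetical 500-residue protein, for a 3-residue motif.
large = expected_chance_matches(5000, 3)
small = expected_chance_matches(500, 3)
```

Under these assumptions the expected number of chance matches grows almost linearly with protein length, so the larger protein collects roughly ten times as many chance hits as the smaller one. Raw hit counts for very large proteins are therefore best read relative to their length.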

Conclusions

The approach presented here can be applied to any antibody. Epitope binning is an important first characterization of therapeutic antibodies, but immunosignature analysis may provide insights not available with standard techniques. For example, information about which peptide sequences bind can reveal motifs that resemble the actual epitope. However, information about which peptides ablate binding can be just as important. Single-residue changes that turn a strong binder into a weak binder reveal a great deal about the paratope and the epitope determinant. Information about promiscuity or polyreactivity can be obtained in a single experiment on an immunosignature microarray. These facets of antibody character, as demonstrated here, could profoundly affect clinical efficacy, such as the ability to promote remyelination. A single value, entropy, could be used as a proxy for polyreactivity.
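The idea of single-residue changes that switch a strong binder to a weak one can be sketched as a simple search over peptide pairs. The function names, the thresholds, and the intensity values in `demo` are hypothetical and chosen only for illustration; they are not measurements from this study.

```python
from itertools import combinations


def single_difference(a, b):
    """Return the index of the only differing position, or None."""
    if len(a) != len(b):
        return None
    diffs = [i for i, (x, y) in enumerate(zip(a, b)) if x != y]
    return diffs[0] if len(diffs) == 1 else None


def binding_switches(intensity_by_peptide, strong, weak):
    """List (strong_peptide, weak_peptide, position) for peptide pairs that
    differ at exactly one residue and fall on opposite sides of the cutoffs."""
    switches = []
    for (p1, s1), (p2, s2) in combinations(intensity_by_peptide.items(), 2):
        pos = single_difference(p1, p2)
        if pos is None:
            continue
        if s1 >= strong and s2 <= weak:
            switches.append((p1, p2, pos))
        elif s2 >= strong and s1 <= weak:
            switches.append((p2, p1, pos))
    return switches


# Made-up intensities, loosely modeled on the tryptophan dependence of p53 Ab8.
demo = {"SDLWKLL": 9000, "SDLAKLL": 150, "SDLWKLV": 8500, "GGGGGGG": 100}
result = binding_switches(demo, strong=5000, weak=500)
```

The all-pairs comparison is quadratic in the number of peptides, so a library of several hundred thousand peptides would need an indexed approach, for example grouping peptides by their sequence with one position masked.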

There are several immediate practical benefits that arise from this study. First, epitope binning is time consuming. A rapid screen with a random peptide microarray may be able to narrow thousands of candidate monoclonals to a few that can be investigated more thoroughly. Polyreactivity was a strong indicator of clinical efficacy in this case, but many other readouts are possible. The breadth of data provided by immunosignatures lends itself to machine learning. An algorithm trained on data from successful or clinically useful monoclonals could capture their binding pattern, no matter how complex or convoluted, in a high-throughput manner. This increase in speed is the key to screening thousands or millions of antibodies, one of which could be the next major blockbuster.