Correlating technical replicates #18

yaaminiv · 2017-10-11T17:43:02Z

As @emmats suggested, I regressed my technical replicates against each other for each transition to see if some transitions were messier than others. You can see my work in my lab notebook entry.

There are definitely some transitions with lower adjusted R squared values than others. My first instinct is to establish some sort of R-squared cutoff, remove transitions lower than this cutoff, and then remake my NMDS plot. While I'm going through each transition, I can also see if there are certain outliers or leverage points that could be influencing the R-squared values (for those close to the cutoff).
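For readers following along, the cutoff idea can be sketched roughly as below. This is a minimal illustration, not the script from the notebook: the function names, the dictionary layout, and the peak areas are all made up for the example, and the 0.6 cutoff is just one of the values discussed in this thread.

```python
from statistics import mean

def adjusted_r_squared(x, y):
    """Adjusted R^2 for a simple linear regression of y on x."""
    n = len(x)
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r2 = 1 - ss_res / ss_tot
    # One predictor, so the adjustment uses n - 2 residual degrees of freedom.
    return 1 - (1 - r2) * (n - 1) / (n - 2)

def filter_transitions(replicates, cutoff):
    """Keep transitions whose adjusted R^2 meets the cutoff.

    replicates: dict mapping transition name -> (rep1 areas, rep2 areas).
    """
    kept = {}
    for name, (rep1, rep2) in replicates.items():
        if adjusted_r_squared(rep1, rep2) >= cutoff:
            kept[name] = (rep1, rep2)
    return kept

# Made-up peak areas for two hypothetical transitions.
example = {
    "transition_A": ([10, 20, 30, 40, 50], [11, 19, 31, 41, 49]),
    "transition_B": ([10, 20, 30, 40, 50], [30, 12, 45, 20, 33]),
}
print(sorted(filter_transitions(example, cutoff=0.6)))
```

Only the transitions that survive the filter would then feed into the NMDS step.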

Any suggestions for what that cutoff should be?

emmats · 2017-10-11T17:47:32Z

What is the range? I would go pretty high with the cut-off. Your replicates should be right on top of each other. Maybe have @laurahspencer run the same script and figure out what her range of R2 values is? Off the cuff, I would say the cut-off should be at least 0.85.

yaaminiv · 2017-10-11T17:49:42Z

The range is .2 to .9 (there are examples of each in my notebook), with the majority being above 0.6.

yaaminiv · 2017-10-11T17:51:37Z

Some examples for context. Peak area from the first batch of technical replicates on the x-axis, peak area from the second batch of technical replicates on the y-axis. Points are labelled with the oyster sample ID.

sr320 · 2017-10-11T20:52:33Z

To me it should be some defined range around a line that has a slope of 1.
Thus it would be based on replicates and not proteins.

laurahspencer · 2017-10-12T03:53:31Z

I did a quick work-up using Yaamini's script. Summary data for R^2:

Mean: 0.8636
Min^: 0.6507
Max: 0.9679
Median: 0.9016
^One peptide from Superoxide Dismutase had an awful R^2 for 2 transitions (<0.1), which were outliers.

NOTE: This wasn't using the full data set. I have 17 samples with 3 reps, and 3 samples with 4 reps; only the first 2 reps run are represented here, which likely skews things a bit (didn't want to dig too deep into modifying the code).

yaaminiv · 2017-10-12T04:03:02Z

@emmats Maybe I can start with a 0.65 R-squared cutoff. If that doesn't improve anything, work up to a 0.85 cutoff?

@sr320 can you elaborate on your suggestion? From what I understand, I would plot an x = y line in addition to a linear regression, and then consolidate the two somehow?

sr320 · 2017-10-12T14:31:45Z

Let's discuss in class.

emmats · 2017-10-12T16:22:49Z

I think the 0.6 cut-off sounds safe. But, of course, @sr320 makes the final call.

This is pretty informative for me. I've never done this before.

yaaminiv · 2017-10-12T21:42:46Z

@emmats Here's my plan:

Normalize all my values by TIC to reduce any external variation (the plots I've made so far are not normalized...what are your thoughts on this step?)
Use a 0.6 cutoff and discard any transitions with an adjusted R-squared value below this. Remake an NMDS plot and examine clustering

AT THE SAME TIME...

Plot an x = y line on each plot, as well as a 95% confidence interval. @sr320 and I discussed the value of this during class. Since our technical replicates should have the same protein abundances, we expect the best fit model to be a 1:1 ratio.
Discard transitions with fewer than 95% of the points (43 points) within the confidence interval. Remake an NMDS plot and examine clustering
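The second half of this plan could look something like the sketch below. It is only an illustration under an assumption: the band around the x = y line is a fixed relative tolerance (`rel_tol`, a hypothetical parameter), not a fitted 95% confidence interval, because the thread does not settle how that interval would be computed. All names and numbers are invented for the example.

```python
def fraction_within_band(rep1, rep2, rel_tol):
    """Fraction of points whose rep2 value lies within rel_tol of rep1.

    The band is rep1 * (1 - rel_tol) <= rep2 <= rep1 * (1 + rel_tol),
    which forms a wedge around the x = y line.
    """
    inside = sum(
        1 for a, b in zip(rep1, rep2)
        if abs(b - a) <= rel_tol * abs(a)
    )
    return inside / len(rep1)

def keep_transition(rep1, rep2, rel_tol=0.2, min_fraction=0.95):
    """Keep a transition only if enough points fall inside the band."""
    return fraction_within_band(rep1, rep2, rel_tol) >= min_fraction

# Made-up peak areas: the last point is far from the x = y line.
rep1 = [100, 200, 300, 400]
rep2 = [105, 190, 310, 800]
print(fraction_within_band(rep1, rep2, rel_tol=0.2))
print(keep_transition(rep1, rep2))
```

A relative band widens with peak area, which may or may not suit this data; an absolute band would be an equally reasonable variant.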

Thoughts?

emmats · 2017-10-12T21:44:20Z

I think that sounds good. I don't think you need to normalize by TICs. If your TICs vary widely between technical replicates, then you have other problems.

yaaminiv · 2017-10-15T00:21:04Z

@emmats @sr320
Notebook

I went through the first part of my plan and used R-squared cutoffs to eliminate transitions and remake NMDS plots. I used a combination of three cutoffs (0.6, 0.7 and 0.8) and normalized/nonnormalized data. I found normalizing made my plots look a little better. Overall this helped a bit, but the technical replication still doesn't look fantastic.

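0.6, normalized: [image: 0.6-normalized-NMDS] <https://raw.githubusercontent.com/RobertsLab/project-oyster-oa/master/analyses/DNR_SRM_20170902/2017-10-10-Troubleshooting/2017-10-10-Transition-Replicate-Correlations/2017-10-13-NMDS-TechnicalReplication-Normalized-Cutoff1.jpeg> [image: 0.6-normalized-distances] <https://raw.githubusercontent.com/RobertsLab/project-oyster-oa/master/analyses/DNR_SRM_20170902/2017-10-10-Troubleshooting/2017-10-10-Transition-Replicate-Correlations/2017-10-13-NMDS-TechnicalReplication-Ordination-Distances-Normalized-Cutoff1.jpeg>

0.7, normalized: [image: 0.7-normalized-NMDS] <https://raw.githubusercontent.com/RobertsLab/project-oyster-oa/master/analyses/DNR_SRM_20170902/2017-10-10-Troubleshooting/2017-10-10-Transition-Replicate-Correlations/2017-10-13-NMDS-TechnicalReplication-Normalized-Cutoff2.jpeg> [image: 0.7-normalized-distances] <https://raw.githubusercontent.com/RobertsLab/project-oyster-oa/master/analyses/DNR_SRM_20170902/2017-10-10-Troubleshooting/2017-10-10-Transition-Replicate-Correlations/2017-10-13-NMDS-TechnicalReplication-Ordination-Distances-Normalized-Cutoff2.jpeg>

0.8, normalized: [image: 0.8-normalized-NMDS] <https://raw.githubusercontent.com/RobertsLab/project-oyster-oa/master/analyses/DNR_SRM_20170902/2017-10-10-Troubleshooting/2017-10-10-Transition-Replicate-Correlations/2017-10-13-NMDS-TechnicalReplication-Normalized-Cutoff3.jpeg> [image: 0.8-normalized-distances] <https://raw.githubusercontent.com/RobertsLab/project-oyster-oa/master/analyses/DNR_SRM_20170902/2017-10-10-Troubleshooting/2017-10-10-Transition-Replicate-Correlations/2017-10-13-NMDS-TechnicalReplication-Ordination-Distances-Normalized-Cutoff3.jpeg>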

I'll try the second part soon, but it may take me a bit longer since making a confidence interval around a line in a for loop is a bit more tedious. Any thoughts about these results?

sr320 · 2017-10-15T08:38:32Z

I would not necessarily expect an r2 threshold to improve reps, e.g. you could have an r2 of 1 and the slope could be 0.
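A small made-up example of this point: the second replicate below is a compressed copy of the first, so the fit is perfect even though the replicates disagree badly. A slope of 0.1 is used rather than exactly 0, since a perfectly flat line leaves no variance in y for R^2 to explain. The function name and the numbers are illustrative only.

```python
from statistics import mean

def fit_line(x, y):
    """Least-squares slope, intercept, and R^2 for y regressed on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return slope, intercept, 1 - ss_res / ss_tot

# Hypothetical peak areas: rep2 tracks rep1 exactly, but at a tenth of the scale.
rep1 = [100.0, 200.0, 300.0, 400.0, 500.0]
rep2 = [0.1 * a + 5.0 for a in rep1]

slope, intercept, r2 = fit_line(rep1, rep2)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R^2={r2:.2f}")
```

An R-squared cutoff alone would keep this transition, which is why the slope needs its own check.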

emmats · 2017-10-16T15:54:25Z

I'm still pretty suspicious of these data. It just doesn't make sense that the technical replicates don't look the same.

sr320 · 2017-10-24T17:29:00Z

@yaaminiv can you please provide a csv with the respective technical replicates in adjacent columns?

yaaminiv · 2017-10-24T17:33:26Z

@sr320 csv

I just tried playing with slopes and confidence intervals. I'm going to try one more thing on that front and then write it up in a lab nb post/possibly post a new issue

yaaminiv · 2017-10-24T18:19:58Z

Notebook

I was following @sr320's suggestion to look at slopes and plot a 95% confidence interval around an x = y line. I ran into some issues doing that (more details in my nb), so I can only really plot an x = y line and a prediction line (same intercept as the regression, but a slope of 1) along with my data.

Any suggestions for how to move forward? A few of my issues: there are large intercepts for the regression, so an x = y line is far removed from the data, and creating a confidence interval around an x = y line or prediction line is essentially impossible with my skill set because neither of those has any error (so plotting a CI would just lead to an upper and lower bound falling directly on top of the original line). I could look at the slope of the original regression and, if it falls outside some cutoff (1 ± some undetermined error value), remove the transition and remake an NMDS?
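The slope check floated here could be sketched as follows, keeping a transition only when its fitted slope is within a tolerance of 1. The tolerance of 0.1 is a placeholder, since no value was agreed on in this thread, and the function names and peak areas are invented for the example. Note that this looks at the slope only, so a large intercept would still pass.

```python
from statistics import mean

def fitted_slope(x, y):
    """Least-squares slope of y regressed on x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    return sxy / sxx

def slope_near_one(rep1, rep2, tolerance=0.1):
    """True when the fitted slope lies within 1 +/- tolerance."""
    return abs(fitted_slope(rep1, rep2) - 1.0) <= tolerance

# Made-up peak areas for two hypothetical transitions.
rep1 = [100, 200, 300, 400, 500]
agreeing_rep2 = [110, 195, 310, 405, 490]   # fitted slope is about 0.97
compressed_rep2 = [15, 25, 35, 45, 55]      # fitted slope is about 0.1

print(slope_near_one(rep1, agreeing_rep2))
print(slope_near_one(rep1, compressed_rep2))
```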

Thoughts? (esp from @emmats since you think this data is suspicious?) I'm stumped, and the only thing I think may work now might be rerunning samples (but I don't know how possible that is)...

yaaminiv · 2017-10-24T18:34:27Z

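There are also transitions that have poor R squared values but slopes close to 1. What should I do about those?

[image: choyp_psa 1 1 m 27259 yfqiayplpk y4 confint] <https://user-images.githubusercontent.com/22335838/31961321-3d222c32-b8af-11e7-9735-19268b06278b.jpeg>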

sr320 · 2017-10-24T18:40:19Z

Provide the new data sheet I mentioned above and I can provide feedback

yaaminiv · 2017-10-24T18:44:28Z

@sr320 The one with tech reps in adjacent columns? I linked you to that!

sr320 · 2017-10-24T18:48:50Z

Sorry - I need it in just two columns with the sample IDs in a column...

yaaminiv · 2017-10-24T19:05:18Z

@sr320 I think I'm confused...so sample IDs in one column, transitions in another column?

sr320 · 2017-10-24T19:06:50Z

Col1-transition | Col2-sampleID | Col3-rep1 | Col4-rep2
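One way to get from a wide sheet to this four-column layout is sketched below. The input format is a guess: it assumes a `transition` column followed by columns named `<sampleID>-1` and `<sampleID>-2` for the two replicates, which may not match the actual csv. The sample IDs, transition names, and peak areas are made up.

```python
import csv
import io

def wide_to_long(wide_csv_text):
    """Turn one row per transition into one row per transition and sample.

    Assumes (hypothetically) a 'transition' column followed by columns
    named '<sampleID>-1' and '<sampleID>-2' for the two replicates.
    """
    reader = csv.DictReader(io.StringIO(wide_csv_text))
    rows = []
    for record in reader:
        transition = record["transition"]
        # Recover the sample IDs by stripping the replicate suffix.
        sample_ids = sorted(
            {col.rsplit("-", 1)[0] for col in record if col != "transition"}
        )
        for sample_id in sample_ids:
            rows.append({
                "transition": transition,
                "sampleID": sample_id,
                "rep1": float(record[f"{sample_id}-1"]),
                "rep2": float(record[f"{sample_id}-2"]),
            })
    return rows

wide = """transition,O01-1,O01-2,O02-1,O02-2
peptideA y4,1200,1150,900,950
peptideB y7,300,320,410,390
"""
for row in wide_to_long(wide):
    print(row)
```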

sr320 · 2017-10-24T19:07:35Z

Column1 and Column2 could be switched....

yaaminiv · 2017-10-24T19:08:14Z

normalized or not normalized?

sr320 · 2017-10-24T19:10:27Z

How about both...

yaaminiv · 2017-10-24T21:23:42Z

Normalized
Not normalized

sr320 · 2017-10-24T21:47:39Z

Use this data to start making graphs - simply average reps.

http://d.pr/f/OtdTD

This is just the normalized data with a coefficient of variation less than 20.
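A rough sketch of this kind of filter is below. It rests on two assumptions that the thread does not confirm: that the coefficient of variation is computed per transition and sample across the two replicates, and that "20" means 20 percent. The row layout, names, and values are invented for illustration.

```python
from statistics import mean, stdev

def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return stdev(values) / mean(values) * 100

def average_consistent_reps(rows, max_cv=20.0):
    """Average rep1 and rep2 for rows whose CV is below max_cv percent."""
    averaged = []
    for row in rows:
        reps = [row["rep1"], row["rep2"]]
        if coefficient_of_variation(reps) < max_cv:
            averaged.append({
                "transition": row["transition"],
                "sampleID": row["sampleID"],
                "mean_area": mean(reps),
            })
    return averaged

# Made-up rows in the transition | sampleID | rep1 | rep2 layout.
rows = [
    {"transition": "peptideA y4", "sampleID": "O01", "rep1": 1200.0, "rep2": 1150.0},
    {"transition": "peptideA y4", "sampleID": "O02", "rep1": 300.0, "rep2": 600.0},
]
for row in average_consistent_reps(rows):
    print(row)
```

Here the second row is dropped because its two replicates differ by a factor of two, and the first is reduced to a single averaged value.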

yaaminiv · 2017-10-26T16:57:48Z

Notebook

Used CV filtering to redo NMDS/ANOSIM analyses. Slight improvement in technical replication, ANOSIM/NMDS indicates no significant clustering pattern.

Going to filter data with CV ≤ 10 and repeat. Will also look at expression of individual proteins by making boxplots, etc. Interested in your thoughts @emmats.

yaaminiv · 2017-10-27T01:43:28Z

Seeing how we've answered my original question, I'm going to continue the current conversation in #35.

yaaminiv added help wanted question labels Oct 11, 2017

laurahspencer mentioned this issue Oct 12, 2017

Regress technical replicates for each transition, find R2 range RobertsLab/Paper-DNR-Geoduck-Proteomics#10

Closed

yaaminiv closed this as completed Oct 24, 2017

yaaminiv reopened this Oct 24, 2017

yaaminiv mentioned this issue Oct 27, 2017

Coefficient of variation filtering for transitions #35

Closed

yaaminiv closed this as completed Oct 27, 2017
