Currently in V0.11.0 we use a scaling procedure to reduce any given metric to a range between 0 and 1, which can then be weighted and combined with the other scaled and weighted metrics to give a final score for the assembly. An issue with this approach is that assembly projects often sweep large ranges of k values and produce at least one useless assembly. Excluding that assembly could change the relative ranking of the 1st- and 2nd-placed assemblies (depending on the weights given to the other metrics). Scaling by the range means that the weight of a metric can be decided entirely by two outliers.
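A minimal sketch of the problem (plain Python, hypothetical N50 values, not the actual V0.11.0 code): with min-max scaling, a single useless assembly stretches the range and compresses the scores of all the good ones.

```python
def min_max_scale(values):
    """Scale values into [0, 1] by the observed range."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # all assemblies identical for this metric
    return [(v - lo) / (hi - lo) for v in values]

# N50s from a hypothetical k-sweep; the last assembly is useless.
n50s = [48_000, 50_000, 52_000, 1_200]

print(min_max_scale(n50s))       # outlier present: good assemblies all land near 1.0
print(min_max_scale(n50s[:-1]))  # outlier excluded: the same assemblies now spread over [0, 1]
```

The relative gaps between the good assemblies change completely depending on whether the outlier is included, which is exactly what can flip a weighted combined ranking.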
An alternative approach, taken by Abbas et al., 2014 (http://link.springer.com/article/10.1186/1471-2164-15-S9-S10/fulltext.html), is to rank the assemblies on each metric before weighting and combining the results. A problem with this method is that it can give too much credence to a metric for which all assemblies have similar values.
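A rough sketch of the rank-then-combine idea (not necessarily Abbas et al.'s exact procedure; metric names, values, and weights are hypothetical): each assembly is ranked per metric and the weighted ranks are summed, so a metric with near-identical values still swings the combined score as much as any other.

```python
def ranks(values, higher_is_better=True):
    """1 = best; ties broken by input order (simple ranking)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=higher_is_better)
    r = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        r[i] = rank
    return r

metrics = {                                     # three assemblies, two metrics
    "n50":         [50_000, 50_100, 50_200],    # nearly identical values...
    "num_contigs": [900, 400, 1_500],
}
weights = {"n50": 1.0, "num_contigs": 1.0}
higher_better = {"n50": True, "num_contigs": False}

scores = [0.0, 0.0, 0.0]                        # lower combined rank = better
for name, vals in metrics.items():
    for i, r in enumerate(ranks(vals, higher_better[name])):
        scores[i] += weights[name] * r

print(scores)  # ...yet n50 contributes the full rank spread of 1..3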
We are therefore looking for a better means of scaling assembly metrics. Methods based on the interquartile range or standard deviation of the metric may work better. This thread is for discussing this topic.
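To make the two candidates concrete, here is a hedged sketch of both scalings on the same hypothetical sweep: centring on the median and dividing by the interquartile range, versus a classic z-score (mean and standard deviation). Pure Python, no dependencies; the quartile convention is one of several possible choices.

```python
import statistics

def iqr_scale(values):
    """(v - median) / IQR; one common robust-scaling convention."""
    qs = statistics.quantiles(values, n=4)  # Python 3.8+
    iqr = qs[2] - qs[0]
    med = statistics.median(values)
    if iqr == 0:
        return [0.0 for _ in values]
    return [(v - med) / iqr for v in values]

def z_scale(values):
    """(v - mean) / stdev."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    if sd == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]

n50s = [44_000, 46_000, 48_000, 50_000, 52_000, 1_200]  # hypothetical sweep
print(iqr_scale(n50s))  # the useless assembly is flagged with a large negative
print(z_scale(n50s))    # score rather than dictating the spacing of the rest
```

Neither is immune to outliers with only a handful of assemblies, but the spacing of the good assemblies is no longer determined solely by the two extreme values.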
In addition to ranking assemblies by each of these metrics and then calculating an average rank (Additional file 2: Figures S9–S11), we also calculated z-scores for each key metric and summed these. This has the benefit of rewarding/penalizing those assemblies with exceptionally high/low scores in any one metric. One way of addressing the reliability and contribution of each of these key metrics is to remove each metric in turn and recalculate the z-score. This can be used to produce error bars for the final z-score, showing the minimum and maximum z-score that might have occurred if we had used any combination of nine (bird and snake) or six (fish) metrics.
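A small sketch of the procedure that quote describes: z-score each key metric across assemblies, sum the z-scores, then drop each metric in turn and recompute to get min/max "error bars" on the final score. The metric names and values below are hypothetical.

```python
import statistics

def z_scores(values):
    mu, sd = statistics.mean(values), statistics.stdev(values)
    return [0.0] * len(values) if sd == 0 else [(v - mu) / sd for v in values]

def summed_z(metrics):
    """metrics: {name: [value per assembly]} -> total z-score per assembly."""
    n = len(next(iter(metrics.values())))
    totals = [0.0] * n
    for vals in metrics.values():
        for i, z in enumerate(z_scores(vals)):
            totals[i] += z
    return totals

metrics = {
    "n50":         [48_000, 50_000, 52_000],
    "num_contigs": [-900, -400, -1_500],   # negated so that higher is better
    "genes_found": [8_000, 8_200, 7_900],
}

full = summed_z(metrics)
# Leave-one-out: recompute the total with each metric removed, giving the
# min/max total each assembly could have had under any (n-1)-metric subset.
loo = [summed_z({k: v for k, v in metrics.items() if k != drop})
       for drop in metrics]
for i, total in enumerate(full):
    lo = min(run[i] for run in loo)
    hi = max(run[i] for run in loo)
    print(f"assembly {i}: z = {total:+.2f}  (range {lo:+.2f} to {hi:+.2f})")
```

A wide leave-one-out range flags an assembly whose overall score depends heavily on a single metric, which speaks directly to the outlier concern above.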