First pillar of Dubito: intervals
In Dubito, we treat data as intervals rather than points ([3,4] versus 3.5, for example). An interval is a range of possible values rather than a single value. For example, if we measure someone's height with a ruler that reads only to the nearest inch, we might report that they are between 5'5'' and 5'6''. Traditional analysis would record this height as a single number (5'5.5'', for example), but that overstates what we know, since the ruler cannot measure with that precision. An interval consists of a lower bound and an upper bound; the true value lies somewhere in between.
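As a rough illustration of the idea, an interval can be represented as a pair of bounds. The sketch below is not Dubito's actual data structure; the `Interval` class and its `contains` and `width` members are hypothetical names chosen for this example, and the height is expressed in inches (5'5'' = 65 in, 5'6'' = 66 in).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """A closed interval [lower, upper] (hypothetical helper, not Dubito's API)."""
    lower: float
    upper: float

    def __post_init__(self):
        # An interval only makes sense if the bounds are ordered.
        if self.lower > self.upper:
            raise ValueError("lower bound must not exceed upper bound")

    def contains(self, value: float) -> bool:
        """Return True if value lies between the bounds, inclusive."""
        return self.lower <= value <= self.upper

    @property
    def width(self) -> float:
        """Width of the interval, a simple measure of the imprecision."""
        return self.upper - self.lower


# The ruler example: the height is somewhere between 65 and 66 inches.
height = Interval(65.0, 66.0)
```

The point value 65.5 is one candidate inside this interval, but the interval itself makes no claim about which value in the range is the true one.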
Second pillar of Dubito: robust Bayes
Robust Bayes is a family of methods that Dubito uses to perform statistical analysis on interval data. The output of these techniques is generally another interval. For example, the average of an interval dataset is itself an interval: its lower bound is the average of the lower bounds of the data, and its upper bound is the average of the upper bounds. The following documents introduce how robust Bayes and interval statistics work.
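The interval average described above can be sketched in a few lines. This is a minimal illustration that assumes each data point is a `(lower, upper)` tuple; the function name `interval_mean` and the sample data are invented for the example and are not part of Dubito.

```python
def interval_mean(data):
    """Average a list of (lower, upper) intervals.

    The result is itself an interval: the mean of the lower bounds
    paired with the mean of the upper bounds.
    """
    if not data:
        raise ValueError("cannot average an empty dataset")
    n = len(data)
    lower = sum(lo for lo, _ in data) / n
    upper = sum(hi for _, hi in data) / n
    return lower, upper


# Two illustrative measurements: [3, 4] and [5, 7].
average = interval_mean([(3, 4), (5, 7)])  # (4.0, 5.5)
```

This works because the mean never decreases when any input increases, so its extremes over all possible true values are reached at the lower and upper bounds respectively.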
Third pillar of Dubito: automatic checking
On top of the built-in interval analysis, Dubito implements automatic and invisible cross-validation and ancillary sensitivity analyses. In the sensitivity analyses, calculations are repeated under different assumptions to assess how robust the results are. Cross-validation addresses the error of the predictive model rather than the error of the data. Even if the data were perfectly precise, there would still be uncertainty in the prediction, since every model is only an approximation of the actual process generating the data. Dubito applies this same concept, which is standard practice among statisticians, to the interval results generated by the predictive algorithms.
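To make the sensitivity idea concrete, the sketch below repeats one calculation (a "typical value" of an interval dataset) under two different assumptions, the mean and the median, and then reports the envelope that covers both answers. It is only an illustration: the function names, the choice of estimators, and the sample data are assumptions made here, and Dubito's own checks are not documented in this text.

```python
from statistics import mean, median


def interval_estimate(data, estimator):
    """Apply an estimator to the lower bounds and to the upper bounds.

    This bound-wise shortcut is valid for estimators such as the mean
    and the median, which never decrease when an input increases.
    """
    lowers = [lo for lo, _ in data]
    uppers = [hi for _, hi in data]
    return estimator(lowers), estimator(uppers)


def sensitivity(data, estimators):
    """Repeat the calculation once per estimator and return all results
    together with the envelope that spans every one of them."""
    results = {name: interval_estimate(data, f) for name, f in estimators.items()}
    envelope = (
        min(lo for lo, _ in results.values()),
        max(hi for _, hi in results.values()),
    )
    return results, envelope


# Illustrative data with one unusually large measurement.
data = [(3.0, 4.0), (5.0, 7.0), (4.0, 4.5), (20.0, 22.0)]
results, envelope = sensitivity(data, {"mean": mean, "median": median})
# results["mean"]   is (8.0, 9.375)
# results["median"] is (4.5, 5.75)
# envelope          is (4.5, 9.375)
```

A wide envelope relative to the individual intervals, as here, signals that the conclusion depends heavily on the assumption chosen, which is the kind of fragility a sensitivity analysis is meant to expose.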
Natural language processing