Page 53 - Flytxt

statistics of the data must be dealt with on an ongoing basis for infinite data sets. Extra-sensory, unrecorded events such as political upheaval and natural disaster, and changes in macro-economic, cultural and other trends, are inevitable, however, and these lead to non-homogeneity and non-stationarity of the distribution. Furthermore, data set representations will change over time for infinite data sets, including global notions of outliers and patterns, and local notions of improved or degraded ability of sensors and variability in the availability of streams.

"Infinite data sets require data scientists to advance models and test them at a speed comparable to the arrival of new data. Also, multiple models may work on multiple data representations; hence the ability to have many models being evaluated in parallel is necessary."

Data scientists today build their models assuming homogeneity, stationarity and regularity of representation in the data set. They then retrain the models when their (perhaps manual) observation of the target variables belies these assumptions.

Target values are used for decisions that affect personal and professional lives. Human training of models, however, implies an allowance for inefficient decisions until the training is complete and the model is retrained. Thus models needing frequent manual training don’t suit the cadence and assurance of decisions required for infinite data sets, and data science must evolve to address this problem.

Methods to deal with infinite data

Infinite data sets require data scientists to advance models and test them at a speed comparable to the arrival of new data, and to balance this speed with the accuracy of the decision the models
          INSIGHTZ - VOLUME 03, 2018                                                                         53