gpdFit               package:fExtremes               R Documentation

_G_e_n_e_r_a_l_i_z_e_d _P_a_r_e_t_o _D_i_s_t_r_i_b_u_t_i_o_n _M_o_d_e_l_l_i_n_g

_D_e_s_c_r_i_p_t_i_o_n:

     This is a collection of functions to model the Generalized Pareto 
     Distribution, GPD, based on R's EVIS package. Two approaches for
     parameter estimation are proved: Maximum likelihood estimation and
     the probability weighted moment method. 

     The functions are:

        1  'gpdSim'           generates data from the GPD,
        2  'gpdFit'           fits empirical or simulated data to the distribution,
        3  'print'            print method for a fitted GPD object of class ...,
        4  'plot'             plot method for a fitted GPD object,
        5  'summary'          summary method for a fitted GPD object,
        6  'gpdqPlot'         estimation of high quantiles,
        7  'gpdquantPlot'     variation of high quantiles with threshold,
        8  'gpdriskmeasures'  prescribed quantiles and expected shortfalls,
        9  'gpdsfallPlot'     expected shortfall with confidence intervals,
       10  'gpdshapePlot'     variation of shape with threshold,
       11  'gpdtailPlot'      plot of the tail.

_U_s_a_g_e:

     gpdSim(model = list(shape = 0.25, location = 0, scale = 1), n = 1000)
     gpdFit(x, threshold = NA, nextremes = NA, type = c("mle", "pwm"),
         information = c("observed", "expected"), ...)

     ## S3 method for class 'gpdFit':
     print(x, ...)
     ## S3 method for class 'gpdFit':
     plot(x, which = "all", ...)
     ## S3 method for class 'gpdFit':
     summary(object, doplot = TRUE, which = "all", ...)

     gpdqPlot(x, pp, ci.type = c("likelihood", "wald"), ci.p = 0.95, 
             like.num = 50)
     gpdquantPlot(data, p = 0.99, models = 30, start = 15, end = 500, 
             reverse = TRUE, ci = 0.95, autoscale = TRUE, labels = TRUE, ...)
     gpdriskmeasures(x, plevels)
     gpdsfallPlot(x, pp, ci.p = 0.95, like.num = 50)
     gpdshapePlot(data, models = 30, start = 15, end = 500, reverse = TRUE,
         ci = 0.95, autoscale = TRUE, labels = TRUE, ...) 
     gpdtailPlot(fit, optlog = NA, extend = 1.5, labels = TRUE, ...)

_A_r_g_u_m_e_n_t_s:

autoscale: whether or not plot should be automatically scaled; if not,
          xlim and ylim graphical parameters may be entered. 

      ci: the probability for asymptotic confidence band; for no 
          confidence band set to zero. 

    ci.p: the probability for confidence interval (must be less  than
          0.999). 

 ci.type: the method for calculating a confidence interval: 
          '"likelihood"' or '"wald"'. 

    data: a numeric vector of data. 

  doplot: a logical. Should the results be plotted? 

  extend: optional argument for plots 1 and 2 expressing how far x-axis
           should extend as a multiple of the largest data value. This 
          argument must take values greater than 1 and is useful for
          showing estimated quantiles beyond data. 

     fit: [print][plot][summary] -  print method, a fitted object of
          class '"gpd"'. 

information: whether standard errors should be calculated with
          '"observed"' or '"expected"' information. This only applies
          to the maximum likelihood method; for the
          probability-weighted moments method '"expected"' information
          is used if possible. 

  labels: optional argument for plots 1 and 2  specifying whether or
          not  axes should be labelled. 

like.num: the number of times to evaluate profile likelihood. 

   model: [gpdsim] -  a list with components 'shape', 'location' and 
          'scale' giving the parameters of the GPD distribution. By
          default the shape parameter has the value 0.25, the location
          is zero and the scale is one.

  models: the number of consecutive gpd models to be fitted. 

       n: [gpdsim] - lnumber of generated data points, an integer
          value. 

nextremes: [gpdFit] -  the number of upper extremes to be used (either
          this or  'threshold' must be given but not both). 

  object: [summary] - a fitted object of class '"gpdFit"'. 

  optlog: optional argument for plots 1 and 2 giving a particular
          choice  of logarithmic axes: '"x"' x-axis only; '"y"' y-axis 
          only; '"xy"' both axes; '""' neither axis. 

plevels, p, pp: a vector of probability levels, the desired probability
          for the  quantile estimate (e.g. 0.99 for the 99th
          percentile). 

 reverse: should plot be by increasing threshold ('TRUE') or number  of
          extremes ('FALSE'). 

start, end: the lowest and maximum number of exceedances to be
          considered. 

threshold: a threshold value (either this or 'nextremes' must be given 
          but not both). 

    type: a character string selecting the desired estimation mehtod,
          either '"mle"' for the maximum likelihood mehtod or '"pwm"'
          for  the probability weighted moment method. By default, the
          first will  be selected. Note, the function 'gpd' uses
          '"ml"'.     

   which: if 'which' is set to '"ask"' the function will  interactively
          ask which plot should be displayed. By default this value is
          set to 'FALSE' and then those plots will be displayed for
          which the elements in the logical vector 'which' ar set to
          'TRUE'; by default all four elements are set to '"all"'. 

       x: [gpdFit] - the data vector. Note, there are two different
          names for the first argument 'x' and 'data' depending  which
          function name is used, either 'gpdFit' or the  EVIS synonyme
          'gpd'. 
           [print][plot] - a fitted object of class '"gpdFit"'. 

     ...: control parameters and plot parameters optionally passed to
          the  optimization and/or plot function. Parameters for the
          optimization function are passed to components of the
          'control' argument of 'optim'.   

_D_e_t_a_i_l_s:

     *Simulation:* 

      'gpdSim' simulates data from a Generalized Pareto  distribution. 

     *Parameter Estimation:* 

      'gpdFit' fits the model parameters either by the probability 
     weighted moment method or the maxim log likelihood method.  The
     function returns an object of class '"gpd"'  representing the fit
     of a generalized Pareto model to excesses over  a high threshold.
     The fitting functions use the probability weighted  moment method,
     if method 'method="pwm"' was selected, and the  the general
     purpose optimization function 'optim' when the  maximum likelihood
     estimation, 'method="mle"' or 'method="ml"'  is chosen. 

     *Methods:* 

      'print.gpd', 'plot.gpd' and 'summary.gpd' are print,  plot, and
     summary methods for a fitted object of class 'gpdFit'.  The plot
     method provides four different plots for assessing fitted  GPD
     model.  

     *gpd* Functions:* 

      'gpdqPlot' calculates quantile estimates and confidence intervals
      for high quantiles above the threshold in a GPD analysis, and
     adds a  graphical representation to an existing plot. The GPD
     approximation in  the tail is used to estimate quantile. The
     '"wald"' method uses  the observed Fisher information matrix to
     calculate confidence interval.  The '"likelihood"' method
     reparametrizes the likelihood in terms  of the unknown quantile
     and uses profile likelihood arguments to  construct a confidence
     interval.  

     'gpdquantPlot' creates a plot showing how the estimate of a  high
     quantile in the tail of a dataset based on the GPD approximation 
     varies with threshold or number of extremes. For every model 
     'gpdFit' is called. Evaluation may be slow. Confidence intervals 
     by the Wald method may be fastest. 

     'gpdriskmeasures' makes a rapid calculation of point estimates  of
     prescribed quantiles and expected shortfalls using the output of
     the function 'gpdFit'. This function simply calculates point
     estimates  and (at present) makes no attempt to calculate
     confidence intervals for  the risk measures. If confidence levels
     are required use 'gpdqPlot'  and 'gpdsfallPlot' which interact
     with graphs of the tail of a loss distribution and are much
     slower.   

     'gpdsfallPlot' calculates expected shortfall estimates, in other
     words tail conditional expectation and confidence intervals for
     high   quantiles above the threshold in a GPD analysis. A
     graphical  representation to an existing plot is added. Expected
     shortfall is  the expected size of the loss, given that a
     particular quantile of the  loss distribution is exceeded. The GPD
     approximation in the tail is used  to estimate expected shortfall.
     The likelihood is reparametrised  in  terms of the unknown
     expected shortfall and profile likelihood arguments  are used to
     construct a confidence interval.  

     'gpdshapePlot' creates a plot showing how the estimate of shape 
     varies with threshold or number of extremes. For every model 
     'gpdFit' is called. Evaluation may be slow.   

     'gpdtailPlot' produces a plot of the tail of the underlying 
     distribution of the data.

_V_a_l_u_e:

     'gpdSim' 
      returns a vector of datapoints from the simulated  series.

     'gpdFit'  
      returns an object of class '"gpd"' describing the  fit including
     parameter estimates and standard errors. 

     'gpdquantPlot' 
      returns invisible a table of results.

     'gpdshapePlot' 
      returns invisible a table of results.

     'gpdtailPlot' 
      returns invisible a list object containing  details of the plot
     is returned invisibly. This object should be  used as the first
     argument of 'gpdqPlot' or 'gpdsfallPlot'  to add quantile
     estimates or expected shortfall estimates to the  plot.

_A_u_t_h_o_r(_s):

     This function is based on Alec Stephenson's R-package 'evir' 
     ported from the 'EVIS' library, _Extreme Values in S_, written by
     Alexander McNeil. The 'fExtremes' port and the  change and
     addition of some functions were done by Diethelm Wuertz.

_R_e_f_e_r_e_n_c_e_s:

     Hosking J.R.M., Wallis J.R., (1987); _Parameter and quantile
     estimation for the generalized Pareto distribution_,   
     Technometrics 29, 339-349.

_S_e_e _A_l_s_o:

     'gevFit'.

_E_x_a_m_p_l_e_s:

     ## Load Data:
        data(danish)
        data(bmw)
        
     ## Simulate GPD Data:
        xmpExtremes("\nStart: Simulate a GPD Distributed Sample > ")
        x = gpdSim(model = list(shape = 0.25, location = 0, scale = 1), n = 1000)
        # Fit GPD Data by Probability Weighted Moments:
        fit = gpdFit(x, nextremes = length(x), type = "pwm") 
        print(fit)
        # Summarize Results:
        par(mfcol = c(4, 2), cex = 0.5)
        summary(fit)  
        
     ## Fit GPD Data by Max Log Likelihood Method:
        xmpExtremes("\nNext: Estimate Parameters > ")
        fit = gpdFit(x, nextremes = length(x), type = "mle") 
        print(fit)
        # Summarize Results:
        summary(fit) 

     ## Fit GPD to excess losses over 10 for the 
        xmpExtremes("\nNext: Fit to Excess Losses > ")
        # Danish Fire insurance data:
        fit = gpdFit(danish, 10, type = "mle") 
        print(fit)
        par(mfrow = c(2, 2), cex = 0.7)
        summary(fit)  
        
     ## Make Example Function:
        xmpExtremes("\nNext: Write Example Function > ")
        examples = function(n) {
          par(mfrow = c(2, 2), cex = 0.7)
        
     ## gpdqPlot - 
        # Estimate 99.5th percentile:
        # BMW Stock Data: 
          par(mfrow = c(2, 2), cex = 0.7)
          plot(-bmw, ylim = c(0, 0.25), type = "h", main = "BMW Stock Data")
          fit = gpdFit(-bmw, nextremes = floor(length(bmw)/10), type = "mle")
          tail = gpdtailPlot(fit)
          gpdqPlot(tail, 0.995)
          title(main = "BMW: 99.5th Percentile")
      
     ## gpdqPlot - 
        # Estimates 0f the 99.5th percentile 
        # Danish Fire Data:
          fit = gpdFit(danish, threshold = 10, type = "mle")
          tail = gpdtailPlot(fit)
          gpdqPlot(tail, 0.995)
          title(main = "Danish Data: 99.5th Percentile") }

     ## gpdquantPlot - 
        # Estimates of the 99.9th percentile:
        # BMW Stock Data
          plot(-bmw, ylim = c(0, 0.25), type = "h", main = "BMW Stock Data")
          gpdquantPlot(-bmw, p = 0.999)
          title(sub = "BMW: GPD High Quantile", cex.sub = 0.75)

     ## gpdquantPlot - 
        # Estimates of the 99.9th percentile:
        # Danish Fire Data:
          plot(danish, type = "h", main = "Danish Fire")
          gpdquantPlot(danish, p=0.999)
          title(sub = "Danish Fire: GPD High Quantile", cex.sub = 0.75) 
        

