Auxiliary Functions for Phenological Data Analysis: Package 'pheno'

The R package 'pheno' provides auxiliary functions for time series analyses of phenological data. It includes functions for finding connected sets in a matrix, calculating day length, counting the days between two dates, converting between Julian dates and string or integer dates, testing for leap years, converting between numeric matrices and data frames, and finding the maximal connected set in a matrix. These functions are particularly useful in phenological analysis, where experimental designs are often unbalanced because of missing data.

Package ‘pheno’
April 19, 2009
Title Auxiliary functions for phenological data analysis
Version 1.4
Date 16.01.2008
Author Joerg Schaber
Description Provides some easy-to-use functions for time series analyses of (plant-) phenological data
sets. These functions mainly deal with the estimation of combined phenological time series and
are usually wrappers for functions that are already implemented in other R packages adapted to
the special structure of phenological data and the needs of phenologists. Some date conversion
functions to handle Julian dates are also provided.
Maintainer Joerg Schaber
Depends R (>= 2.1), nlme, SparseM, quantreg
License GPL (>= 2)
Repository CRAN
Date/Publication 2009-01-18 12:09:05
R topics documented:
connectedSets
date2jul1
date2jul2
daylength
daysbetween
DWD
jul2date1
jul2date2
leapyear
matrix2raw
maxConnectedSet
maxdaylength

pheno.ddm
pheno.flm.fit
pheno.lad.fit
pheno.mlm.fit
raw2matrix
Searle
seqMK
Simple
tau


connectedSets Connected sets in a matrix

Description

Finds connected data sets, i.e. connected rows and columns of a numeric matrix M.

Usage

connectedSets(M)

Arguments

M Numeric matrix with missing values assumed to be NA or 0.

Details

In a two-way classification of linear models, independent sets of normal equations are sometimes obtained because of missing data in the experimental design, i.e. the complete design matrix is not of full rank and thus no solution can be found. However, solutions of the independent sets of normal equations can still exist. This phenomenon is called ’connectedness’ of the data. In phenological analysis in particular, experimental designs are almost always unbalanced because of missing data. Thus, when combined time series are to be estimated, it is worth checking for and finding connected data sets for which combined time series can then be estimated.

Example (also see example data(Simple) and the example in ’maxConnectedSet’): in the following matrix, dots represent missing values and X represents observations:

    X X . .
    X X . .
    . . X X

In this matrix, the observations in rows 1 and 2 (or columns 1 and 2) form one connected set. Likewise, row 3 (or columns 3 and 4) forms another connected set.
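The idea of connectedness can be illustrated with a minimal base R sketch. The helper label_connected_rows is hypothetical and not part of 'pheno', and it is not claimed to be the algorithm behind connectedSets; it only groups rows that share observed columns, directly or through other rows.

```r
# Hypothetical helper (not part of 'pheno'): gives each row of M the number
# of the connected set it belongs to. Missing values are NA or 0.
label_connected_rows <- function(M) {
  obs <- !is.na(M) & M != 0                 # TRUE where an observation exists
  row_set <- rep(NA_integer_, nrow(M))
  current <- 0L
  for (start in seq_len(nrow(M))) {
    # Skip rows that are already labelled or contain no observation
    if (!is.na(row_set[start]) || !any(obs[start, ])) next
    current <- current + 1L
    rows <- start
    repeat {
      # Columns observed in the current rows, then all rows observed there
      cols <- which(colSums(obs[rows, , drop = FALSE]) > 0)
      new_rows <- which(rowSums(obs[, cols, drop = FALSE]) > 0)
      if (length(new_rows) == length(rows)) break   # no growth: set is complete
      rows <- new_rows
    }
    row_set[rows] <- current
  }
  row_set
}

# The 3 x 4 example: rows 1 and 2 share columns 1 and 2, row 3 stands alone
M <- matrix(c(1,  1,  NA, NA,
              1,  1,  NA, NA,
              NA, NA, 1,  1), nrow = 3, byrow = TRUE)
label_connected_rows(M)   # 1 1 2
```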


date2jul2 Converts a date (day,month,year) to Julian date

Description

Converts an integer date (day,month,year) into a Julian day of year (DOY). If y is missing, 2000 is assumed.

Usage

date2jul2(d,m,y)

Arguments

d Day of month, numeric coerced into an integer.

m Month of year, numeric coerced into an integer.

y Year, numeric coerced into an integer, default 2000.

Value

doy Day of year as integer.

year Year as integer.

Author(s)

Joerg Schaber

Examples

date2jul2(31,5,1970)
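For comparison, the same day of year can be obtained with base R date handling. This is only a cross-check under the assumption that DOY counts from 1 on January 1; it is not the implementation used by date2jul2.

```r
# Base R cross-check (not the 'pheno' implementation)
d <- as.Date("1970-05-31")
doy <- as.integer(format(d, "%j"))   # "%j" is the day of year, 001-366
doy                                  # 151
```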

daylength Daylength at julian day i on latitude l

Description

Calculates daylength [h] and declination angle delta [radians] on day i [julian day of year] for latitude l [degrees].

Usage

daylength(i,l)


Arguments

i Integer as julian day of year (1-365)

l Float as latitude [degrees]

Value

dl daylength [h]

delta declination angle [degrees]

Author(s)

Joerg Schaber

Examples

daylength(as.integer(120),63)
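The kind of calculation involved can be sketched with a common textbook approximation. The helper approx_daylength is hypothetical, and both the cosine formula for the declination and the constant 23.44 degrees are assumptions; the formula that daylength() itself uses is not given in this manual, so results may differ.

```r
# Hypothetical helper (not the 'pheno' implementation).
# Assumes a simple cosine approximation of the solar declination.
approx_daylength <- function(i, l) {
  delta <- -23.44 * pi / 180 * cos(2 * pi * (i + 10) / 365)  # declination [radians]
  phi <- l * pi / 180                                        # latitude [radians]
  # Clamp to [-1, 1] so that polar day and polar night give 24 h and 0 h
  cos_h <- pmin(pmax(-tan(phi) * tan(delta), -1), 1)
  dl <- 24 / pi * acos(cos_h)                                # daylength [h]
  list(dl = dl, delta = delta)
}

approx_daylength(120, 63)
```

In this sketch delta is returned in radians; at the equator (l = 0) the result is 12 hours on every day.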

daysbetween Number of days between two dates

Description

Number of days between date1 and date2.

Usage

daysbetween(d1,d2)

Arguments

d1 Date as a character string ’DD.MM.YYYY’.

d2 Date as a character string ’DD.MM.YYYY’.

Value

ndays Number of days between d1 and d2.

Author(s)

Joerg Schaber

Examples

daysbetween('31.05.1970','10.03.2004')
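A base R cross-check for the same two dates is sketched below. It assumes that the result is the difference d2 minus d1 in whole days; the sign convention of daysbetween() is not stated in this manual.

```r
# Base R cross-check (not the 'pheno' implementation)
d1 <- as.Date("31.05.1970", format = "%d.%m.%Y")
d2 <- as.Date("10.03.2004", format = "%d.%m.%Y")
ndays <- as.integer(d2 - d1)   # difference in whole days
ndays
```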


Author(s)

Joerg Schaber

Examples

jul2date1(151,1970)

jul2date2 Converts Julian date to integers day,month,year

Description

Converts Julian day of year (DOY) into an integer date (day,month,year). If y is missing, a non-leap year is assumed.

Usage

jul2date2(d,y)

Arguments

d DOY, numeric coerced into an integer.

y Year, numeric coerced into an integer, default 2000.

Value

day Day of month as integer.

month Month of year as integer.

year Year as integer.

Author(s)

Joerg Schaber

Examples

jul2date2(151,1970)
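The reverse conversion can be cross-checked in base R by adding DOY minus one days to January 1 of the given year. This is a sketch under that assumption, not the implementation used by jul2date2.

```r
# Base R cross-check (not the 'pheno' implementation)
doy <- 151L
year <- 1970L
d <- as.Date(doy - 1L, origin = as.Date(sprintf("%d-01-01", year)))
c(day   = as.integer(format(d, "%d")),
  month = as.integer(format(d, "%m")),
  year  = as.integer(format(d, "%Y")))   # day 31, month 5, year 1970
```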


leapyear Boolean test for leap year

Description

Tests whether a given year is a leap year or not.

Usage

leapyear(y)

Arguments

y Year, numeric coerced into integer.

Value

TRUE leap year

FALSE non leap year

Author(s)

Joerg Schaber

Examples

leapyear(2000)
leapyear(2004)
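The Gregorian leap year rule can be written in a few lines of base R. The helper is_leap_year is hypothetical; that leapyear() applies exactly this rule is an assumption.

```r
# Hypothetical helper (not part of 'pheno'): Gregorian leap year rule
is_leap_year <- function(y) {
  y <- as.integer(y)
  # Divisible by 4 but not by 100, or divisible by 400
  (y %% 4L == 0L & y %% 100L != 0L) | (y %% 400L == 0L)
}

is_leap_year(c(1900, 1970, 2000, 2004))   # FALSE FALSE TRUE TRUE
```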

matrix2raw Converts numeric matrix to data frame

Description

Converts a numeric matrix M into a data frame D with three columns (x, factor 1, factor 2), where rows of M are ranks of factor 1 levels and columns of M are ranks of factor 2 levels. Missing values are assumed to be 0 or NA. The resulting data frame D has no missing values.

Usage

matrix2raw(M,l1,l2)
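The conversion described above can be sketched in base R as follows. The helper matrix_to_long is hypothetical; the arguments l1 and l2 are not documented in this excerpt, so the sketch leaves them out and uses plain row and column ranks.

```r
# Hypothetical helper (not part of 'pheno'): matrix to three-column data frame
matrix_to_long <- function(M) {
  keep <- !is.na(M) & M != 0          # drop missing values (NA or 0)
  data.frame(x  = M[keep],            # observed values
             f1 = row(M)[keep],       # rank of the factor 1 level (row index)
             f2 = col(M)[keep])       # rank of the factor 2 level (column index)
}

M <- matrix(c(120, NA,
              0,   131), nrow = 2, byrow = TRUE)
D <- matrix_to_long(M)
D   # two rows: x = 120 at (1, 1) and x = 131 at (2, 2)
```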


Details

In a two-way classification of linear models, independent sets of normal equations are sometimes obtained because of missing data in the experimental design, i.e. the complete design matrix is not of full rank and thus no solution can be found. However, solutions of the independent sets of normal equations can still exist. This phenomenon is called ’connectedness’ of the data. In phenological analysis in particular, experimental designs are almost always unbalanced because of missing data. Thus, when combined time series are to be estimated, it is worth checking for and finding connected data sets for which combined time series can then be estimated. This can also be interpreted to mean that overlapping time series are a prerequisite for obtaining a combined time series.

Example (also see example data(Searle) from Searle (1997), page 324 and the example in ’connectedSets’): in the following matrix, dots represent missing values and X represents observations:

    X . . . X . . .
    . . X . . . . X
    . X . . . X X .
    . X . . . X X .
    . . . . X . . .
    . . X . . . . X
    . . . X X . . .

In this matrix, the observations of rows 1, 5 and 7 (or columns 1, 4 and 5) form one connected set. Likewise, the observations of rows 2 and 6 (or columns 3 and 8) and of rows 3 and 4 (or columns 2, 6 and 7) form connected sets, respectively.

Value

ms maximal connected set as matrix or data frame, corresponding to the input.

maxl Number of observations in the maximal connected data set.

nsets Number of connected data sets.

lsets Vector with number of observations in each connected data set, i.e. lsets[i] is the number of observations in connected data set i.

Author(s)

Joerg Schaber

References

Searle (1997) ’Linear Models’. Wiley. page 318.


See Also

connectedSets

Examples

data(Searle)
maxConnectedSet(Searle)
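The quantities nsets, lsets and maxl can be illustrated with a small base R sketch. It rebuilds the pattern of observations described in the Details above (rows 1, 5 and 7 with columns 1, 4 and 5, and so on); whether this pattern matches data(Searle) exactly is an assumption, and the closure-by-matrix-product approach is chosen for brevity, not because maxConnectedSet works this way.

```r
# Observation pattern from the Details (7 rows, 8 columns); TRUE = observed
obs <- matrix(FALSE, 7, 8)
obs[cbind(c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 6, 6, 7, 7),
          c(1, 5, 3, 8, 2, 6, 7, 2, 6, 7, 5, 3, 8, 4, 5))] <- TRUE

A <- obs + 0                          # numeric copy for matrix products
reach <- (A %*% t(A)) > 0             # rows that share at least one column
repeat {                              # transitive closure of "shares a column"
  nxt <- ((reach + 0) %*% (reach + 0)) > 0
  if (identical(nxt, reach)) break
  reach <- nxt
}

key <- apply(reach, 1, paste, collapse = "")   # identical key = same set
set_id <- match(key, unique(key))              # connected set of each row

lsets <- as.vector(tapply(rowSums(obs), set_id, sum))  # observations per set
nsets <- length(lsets)                                 # number of sets
maxl <- max(lsets)                                     # size of the largest set
list(nsets = nsets, lsets = lsets, maxl = maxl)        # 3 sets: 5, 4 and 6 obs
```

In this pattern the maximal connected set consists of rows 3 and 4, with 6 observations.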

maxdaylength Maximal day length on latitude l

Description

Calculates maximal daylength maxdl [h] at a certain latitude l [degrees].

Usage

maxdaylength(l)

Arguments

l Latitude in degrees.

Value

maxdl Maximal daylength [h] at a certain latitude l [degrees]

Author(s)

Joerg Schaber

Examples

maxdaylength(60)


Examples

data(DWD)
ddm1 <- pheno.ddm(DWD)
attach(DWD)
y <- factor(DWD[[2]])
s <- factor(DWD[[3]])
ddm2 <- as.matrix.csr(model.matrix(~ y + s -1, contrasts=list(s=("contr.sum"))))
identical(ddm1$ddm,ddm2)

pheno.flm.fit Fits a two-way linear fixed model

Description

Fits a two-way linear fixed model. The model assumes both the first factor f1 and the second factor f2 to be fixed. Errors are assumed to be i.i.d. There is no general mean, and the sum of the f2 effects is constrained to be zero.

Usage

pheno.flm.fit(D)

Arguments

D Data frame with three columns (x, f1, f2) or a matrix where rows are ranks of factor f1 levels and columns are ranks of factor f2 levels and missing values are assumed to be NA or 0.

Details

This function is basically a wrapper for the lm() function, adapted for the estimation of combined phenological time series. In phenological applications, x should be the julian day of observation of a certain phase, factor f1 should be the observation year and factor f2 should be a station-id.

Value

f1 Estimated fixed effects f1, in phenology this is precisely the combined time series.

f1.se f1 estimated standard error.

f1.lev Levels of f1. Should be the same order as f1.

f2 Estimated fixed effects f2, in phenology these are the station effects.

f2.se f2 estimated standard error.

f2.lev Levels of f2. Should be the same order as f2.

resid Residuals

lclf1 Lower 95 percent confidence limit of factor f1.

uclf1 Upper 95 percent confidence limit of factor f1.

lclf2 Lower 95 percent confidence limit of factor f2.

uclf2 Upper 95 percent confidence limit of factor f2.

fit The fitted lm model object.

Author(s)

Joerg Schaber

References

Searle (1997) ’Linear Models’. Wiley.

Schaber J, Badeck F-W (2002) ’Evaluation of methods for the combination of phenological time series and outlier detection’. Tree Physiology 22:973-

See Also

lm

Examples

data(DWD)
R <- pheno.flm.fit(DWD) # parameter estimation
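The model structure described above (no general mean, sum-to-zero constraint on f2) can be sketched directly with lm(). The toy data frame and its values are invented for illustration, and the exact lm() call inside pheno.flm.fit is an assumption; only the formula and contrast follow the pheno.ddm example in this manual.

```r
# Toy data (invented): observation day x, year f1, station f2
D <- data.frame(
  x  = c(120, 118, 125, 123, 130, 127),
  f1 = factor(c(1990, 1990, 1991, 1991, 1992, 1992)),
  f2 = factor(c("A", "B", "A", "B", "A", "B"))
)

# No intercept ("- 1"), so each year gets its own effect;
# contr.sum constrains the station effects to sum to zero
fit <- lm(x ~ f1 + f2 - 1, data = D, contrasts = list(f2 = "contr.sum"))

coef(fit)                    # three year effects, then one station contrast
confint(fit, level = 0.95)   # 95 percent confidence limits
```

Because the toy design is balanced, the three year effects equal the yearly means (119, 124 and 128.5), which is the combined time series in the sense used above.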

pheno.lad.fit Fits a robust two-way linear model

Description

Fits a robust two-way linear model. The model assumes both factors (f1 and f2) to be fixed. Errors are assumed to be i.i.d. There is no general mean, and the sum of the f2 effects is constrained to be zero.

Usage

pheno.lad.fit(D)

Arguments

D Data frame with three columns (x, f1, f2) or a matrix where rows are ranks of factor f1 levels and columns are ranks of factor f2 levels and missing values are assumed to be NA or 0.


pheno.mlm.fit Fits a two-way linear mixed model

Description

Fits a two-way linear mixed model. The model assumes the first factor f1 to be fixed and the second factor f2 to be random. Errors are assumed to be i.i.d. There is no general mean, and the sum of the f2 effects is constrained to be zero.

Usage

pheno.mlm.fit(D)

Arguments

D Data frame with three columns (x, f1, f2) or a matrix where rows are ranks of factor f1 levels and columns are ranks of factor f2 levels and missing values are set to 0.

Details

This function is basically a wrapper for the lme() function of the nlme package, adapted for the estimation of combined phenological time series. Estimation method: restricted maximum likelihood (REML). In phenological applications, x should be the julian day of observation of a certain phase, factor f1 should be the observation year and factor f2 should be a station-id.

Value

fixed Estimated fixed effects, in phenology this is precisely the combined time series.

fixed.lev Levels of fixed effects. Should be the same order as fixed effects.

random Estimated random effects, in phenology these are the station effects.

random.lev Levels of random effects. Should be the same order as random effects.

SEf1 Standard error group f1, i.e. square root of variance component fixed effect.

SEf2 Standard error group f2, i.e. square root of variance component random effect.

lclf Lower 95 percent confidence limit of fixed effects.

uclf Upper 95 percent confidence limit of fixed effects.

fit The fitted lme model object.

Author(s)

Joerg Schaber

References

Searle (1997) ’Linear Models’. Wiley.

Schaber J, Badeck F-W (2002) ’Evaluation of methods for the combination of phenological time series and outlier detection’. Tree Physiology 22:973-


See Also

lme

Examples

data(DWD)
R <- pheno.mlm.fit(DWD) # parameter estimation
plot(levels(factor(DWD[[2]])),R$fixed,type="l") # plot combined time series
tr <- lm(R$fixed~rank(levels(factor(DWD[[2]])))) # trend estimation
summary(tr)$coef[2]
summary(tr)$coef[4]

raw2matrix Converts a numeric data frame to matrix

Description

Converts a numeric data frame D with three columns (x, factor 1, factor 2) to a numeric matrix M where rows are ranks of levels of factor 1 and columns are ranks of levels of factor 2, missing values are set to NA.

Usage

raw2matrix(D)

Arguments

D Data frame with three columns (x, factor 1, factor 2)

Value

M Numeric matrix where rows are ranks of levels of factor 1 and columns are ranks of levels of factor 2, missing values are set to NA.

Author(s)

Joerg Schaber

Examples

data(DWD)
raw2matrix(DWD)
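The conversion from a three-column data frame to a matrix can be sketched in base R. The toy data are invented, and labelling the rows and columns with the factor levels is an assumption; the manual does not say whether raw2matrix sets dimnames.

```r
# Toy data (invented): observation day x, year (factor 1), station (factor 2)
D <- data.frame(x       = c(120, 118, 125),
                year    = c(1990, 1990, 1991),
                station = c("A", "B", "A"))

f1 <- factor(D[[2]])
f2 <- factor(D[[3]])

# Start with all cells missing, then fill the observed (row, column) cells
M <- matrix(NA_real_, nrow = nlevels(f1), ncol = nlevels(f2),
            dimnames = list(levels(f1), levels(f2)))
M[cbind(as.integer(f1), as.integer(f2))] <- D[[1]]
M   # cell (1991, B) stays NA because it was never observed
```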


Value

prog Progressive row of Kendall’s normalized tau’s

retr Retrograde row of Kendall’s normalized tau’s

tp Boolean vector indicating at what indices of the original time series the prog and retr cross, i.e. TRUE at potential trend turning points.

Author(s)

Joerg Schaber

References

Kendall M, Gibbons JD (1990) ’Rank correlation methods’. Arnold.

Sneyers R (1990) ’On statistical analysis of series of observations’. Technical Note No 143. Geneva. Switzerland. World Meteorological Society.

Schaber J (2003) ’Phenology in German in the 20th Century: Methods, analyses and models’. Ph.D. Thesis. University of Potsdam. Germany. http://pub.ub.uni-potsdam.de/2002meta/0022/door.htm

Simple Simple example of a two-way classification table

Description

Simple example of a two-way classification table where missing data creates two distinct connected sets.

Usage

data(Simple)

Format

R source file

tau Kendall’s normalized tau

Description

Kendall’s normalized tau for time series x

Usage

tau(x)


Arguments

x Numeric vector x.

Details

Implicitly assumes an equidistant time series x.

Value

t Kendall’s normalized tau.
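One common way to normalize Kendall's statistic for a time series, as used in sequential Mann-Kendall analyses, is sketched below. The helper normalized_tau is hypothetical, and that tau() uses exactly this counting rule and these moments is an assumption; ties are simply not counted here.

```r
# Hypothetical helper (not the 'pheno' implementation)
normalized_tau <- function(x) {
  n <- length(x)
  # For each element, count the earlier elements that are smaller
  t_stat <- sum(vapply(seq_len(n),
                       function(i) sum(x[seq_len(i - 1)] < x[i]),
                       integer(1)))
  e_t <- n * (n - 1) / 4                    # expected value without trend
  var_t <- n * (n - 1) * (2 * n + 5) / 72   # variance without trend
  (t_stat - e_t) / sqrt(var_t)              # normalized statistic
}

normalized_tau(c(3, 1, 4, 1, 5, 9, 2, 6))
```

A strictly increasing series gives a positive value and a strictly decreasing one the same value with the opposite sign.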

Author(s)

Joerg Schaber

References

Kendall M, Gibbons JD (1990) ’Rank correlation methods’. Arnold.

Sneyers R (1990) ’On statistical analysis of series of observations’. Technical Note No 143. Geneva. Switzerland. World Meteorological Society.

Schaber J (2003) ’Phenology in German in the 20th Century: Methods, analyses and models’. Ph.D. Thesis. University of Potsdam. Germany. http://pub.ub.uni-potsdam.de/2002meta/0022/door.htm