














The R package 'pheno' provides auxiliary functions for time series analyses of phenological data. The package includes functions for finding connected sets in a matrix, calculating day length, counting the days between two dates, converting between Julian dates and string or integer dates, testing for leap years, converting between numeric matrices and data frames, and finding the maximal connected set in a matrix. These functions are particularly useful in phenological analysis, where experimental designs are often unbalanced because of missing data.















Title Auxiliary functions for phenological data analysis
Version 1.
Date 16.01.
Author Joerg Schaber
Description Provides some easy-to-use functions for time series analyses of (plant-) phenological data sets. These functions mainly deal with the estimation of combined phenological time series and are usually wrappers for functions that are already implemented in other R packages adapted to the special structure of phenological data and the needs of phenologists. Some date conversion functions to handle Julian dates are also provided.
Maintainer Joerg Schaber
Depends R (>= 2.1), nlme, SparseM, quantreg
License GPL (>= 2)
Repository CRAN
Date/Publication 2009-01-18 12:09:
connectedSets
date2jul1
date2jul2
daylength
daysbetween
DWD
jul2date1
jul2date2
leapyear
matrix2raw
maxConnectedSet
maxdaylength
pheno.ddm
pheno.flm.fit
pheno.lad.fit
pheno.mlm.fit
raw2matrix
Searle
seqMK
Simple
tau
connectedSets Connected sets in a matrix
Description
Finds connected data sets, i.e. connected rows and columns of a numeric matrix M.
Usage
connectedSets(M)
Arguments
M Numeric matrix with missing values assumed to be NA or 0.
Details
In a two-way classification of linear models, missing data in the experimental design sometimes yield independent sets of normal equations, i.e. the complete design matrix is not of full rank and thus no solution can be found. However, solutions of the independent sets of normal equations can still exist. This phenomenon is called 'connectedness' of the data. In phenological analysis in particular, experimental designs are almost always unbalanced because of missing data. Thus, when combined time series are to be estimated, it is worth checking for and finding connected data sets for which combined time series can then be estimated.
Example (see also data(Simple) and the example in 'maxConnectedSet'): in the following matrix, dots represent missing values, X represents observations and the lines join the connected sets:

    X___X . .
    |
    X___X . .

    . . X___X

Thus, in this matrix the observations in rows 1 and 2 (or columns 1 and 2) form one connected set. Likewise, row 3 (or columns 3 and 4) also forms one connected set.
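The idea of connectedness can be illustrated with a short base R sketch. It is not the package's implementation: the function name connected_sets_sketch and its return structure are assumptions made for illustration. It labels rows and columns by repeatedly spreading a set label along observed cells.

```r
# Sketch only: label connected sets in a two-way table.
# Row i and column j are linked whenever M[i, j] holds an observation
# (missing values are NA or 0, as in the argument description above).
connected_sets_sketch <- function(M) {
  obs <- !is.na(M) & M != 0          # TRUE where an observation exists
  row_set <- integer(nrow(M))        # 0 means "not yet assigned"
  col_set <- integer(ncol(M))
  n_sets <- 0L
  for (start in seq_len(nrow(M))) {
    if (row_set[start] != 0L || !any(obs[start, ])) next
    n_sets <- n_sets + 1L
    row_set[start] <- n_sets
    repeat {
      # unassigned columns observed in any row of the current set
      new_cols <- colSums(obs[row_set == n_sets, , drop = FALSE]) > 0 & col_set == 0L
      col_set[new_cols] <- n_sets
      # unassigned rows observed in any column of the current set
      new_rows <- rowSums(obs[, col_set == n_sets, drop = FALSE]) > 0 & row_set == 0L
      row_set[new_rows] <- n_sets
      if (!any(new_cols) && !any(new_rows)) break
    }
  }
  list(rows = row_set, cols = col_set, nsets = n_sets)
}

# The 3 x 4 example matrix from the text above
M <- rbind(c(1, 1, NA, NA),
           c(1, 1, NA, NA),
           c(NA, NA, 1, 1))
res <- connected_sets_sketch(M)
res$rows   # 1 1 2
res$cols   # 1 1 2 2
```

Rows 1 and 2 share a label with columns 1 and 2, and row 3 shares a second label with columns 3 and 4, matching the two connected sets described above.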
date2jul2 Converts a date (day,month,year) to Julian date
Description
Converts an integer date (day,month,year) into a Julian day of year (DOY). If y is missing, 2000 is assumed.
Usage
date2jul2(d,m,y)
Arguments
d Day of month, numeric coerced into an integer.
m Month of year, numeric coerced into an integer.
y Year, numeric coerced into an integer, default 2000.
Value
doy Day of year as integer.
year Year as integer.
Author(s)
Joerg Schaber
Examples
date2jul2(31,5,1970)
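For readers without the package at hand, the same conversion can be sketched in base R. The helper doy_sketch is a hypothetical name, and it returns only the day of year rather than the doy/year pair documented above.

```r
# Sketch only: day of year from (day, month, year) using base R dates.
doy_sketch <- function(d, m, y = 2000) {
  date <- as.Date(sprintf("%04d-%02d-%02d",
                          as.integer(y), as.integer(m), as.integer(d)))
  as.integer(format(date, "%j"))   # "%j" is the day of year (001-366)
}

doy_sketch(31, 5, 1970)   # 151
```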
daylength Daylength at julian day i on latitude l
Description
Calculates daylength [h] and declination angle delta [radians] on day i [julian day of year] for latitude l [degrees].
Usage
daylength(i,l)
Arguments
i Integer as julian day of year (1-365).
l Float as latitude [degrees].
Value
dl daylength [h]
delta declination angle [degrees]
Author(s)
Joerg Schaber
Examples
daylength(as.integer(120),63)
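The manual does not state which formula the function uses. As a rough illustration only, the sketch below uses a common textbook approximation (a cosine model for the solar declination and the sunrise equation); the name daylength_sketch, the constant 23.44 degrees and the choice to return delta in radians are all assumptions, so its results may differ from those of daylength().

```r
# Sketch only: approximate day length from day of year i and latitude l [degrees].
daylength_sketch <- function(i, l) {
  # approximate solar declination [radians]
  delta <- -23.44 * pi / 180 * cos(2 * pi * (i + 10) / 365)
  # cosine of the sunrise hour angle, clamped to cover polar day and polar night
  x <- -tan(l * pi / 180) * tan(delta)
  x <- pmin(pmax(x, -1), 1)
  dl <- 24 / pi * acos(x)          # day length [h]
  list(dl = dl, delta = delta)
}

daylength_sketch(120, 63)$dl   # well above 12 h at the end of April at 63 degrees north
daylength_sketch(120, 0)$dl    # 12 h at the equator
```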
daysbetween Number of days between two dates
Description
Number of days between date1 and date2.
Usage
daysbetween(d1,d2)
Arguments
d1 Date as a character string 'DD.MM.YYYY'.
d2 Date as a character string 'DD.MM.YYYY'.
Value
ndays Number of days between d1 and d2.
Author(s)
Joerg Schaber
Examples
daysbetween('31.05.1970','10.03.2004')
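A comparable calculation can be sketched with base R dates. The helper days_between_sketch is hypothetical, and the sign convention (d2 minus d1) is an assumption that the package may not share.

```r
# Sketch only: days between two 'DD.MM.YYYY' strings using base R.
days_between_sketch <- function(d1, d2) {
  as.numeric(as.Date(d2, format = "%d.%m.%Y") -
             as.Date(d1, format = "%d.%m.%Y"))
}

days_between_sketch("01.01.2001", "01.01.2002")   # 365
days_between_sketch("28.02.2004", "01.03.2004")   # 2 (2004 is a leap year)
```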
Author(s)
Joerg Schaber
Examples
jul2date1(151,1970)
jul2date2 Converts Julian date to integers day,month,year
Description
Converts Julian day of year (DOY) into an integer date (day,month,year). If y is missing, a non-leap year is assumed.
Usage
jul2date2(d,y)
Arguments
d DOY, numeric coerced into an integer.
y Year, numeric coerced into an integer, default 2000.
Value
day Day of month as integer.
month Month of year as integer.
year Year as integer.
Author(s)
Joerg Schaber
Examples
jul2date2(151,1970)
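The reverse conversion can be sketched in base R as follows. The helper doy_to_date_sketch is a hypothetical name; its default year of 2000 follows the argument description and therefore treats a missing year as a leap year.

```r
# Sketch only: (day, month, year) from day of year d and year y.
doy_to_date_sketch <- function(d, y = 2000) {
  # day 1 is 1 January, so add d - 1 days to the first day of the year
  date <- as.Date(as.integer(d) - 1L,
                  origin = sprintf("%04d-01-01", as.integer(y)))
  list(day   = as.integer(format(date, "%d")),
       month = as.integer(format(date, "%m")),
       year  = as.integer(format(date, "%Y")))
}

doy_to_date_sketch(151, 1970)   # day 31, month 5, year 1970
```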
leapyear Boolean test for leap year
Description
Tests whether a given year is a leap year or not.
Usage
leapyear(y)
Arguments
y Year, numeric coerced into integer.
Value
TRUE leap year
FALSE non leap year
Author(s)
Joerg Schaber
Examples
leapyear(2000)
leapyear(2004)
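The Gregorian leap year rule behind such a test can be written as a one-line sketch. The name is_leap_sketch is hypothetical, and whether leapyear() is implemented this way is an assumption.

```r
# Sketch only: Gregorian leap year rule.
is_leap_sketch <- function(y) {
  y <- as.integer(y)
  # divisible by 4 but not by 100, or divisible by 400
  (y %% 4L == 0L & y %% 100L != 0L) | (y %% 400L == 0L)
}

is_leap_sketch(c(1900, 2000, 2001, 2004))   # FALSE TRUE FALSE TRUE
```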
matrix2raw Converts numeric matrix to data frame
Description
Converts a numeric matrix M into a data frame D with three columns (x, factor 1, factor 2), where rows of M are ranks of factor 1 levels and columns of M are ranks of factor 2 levels. Missing values are assumed to be 0 or NA. The resulting data frame D has no missing values.
Usage
matrix2raw(M,l1,l2)
Details
In a two-way classification of linear models, missing data in the experimental design sometimes yield independent sets of normal equations, i.e. the complete design matrix is not of full rank and thus no solution can be found. However, solutions of the independent sets of normal equations can still exist. This phenomenon is called 'connectedness' of the data. In phenological analysis in particular, experimental designs are almost always unbalanced because of missing data. Thus, when combined time series are to be estimated, it is worth checking for and finding connected data sets for which combined time series can then be estimated. This can also be interpreted in the way that a prerequisite for obtaining a combined time series is to have overlapping time series.
Example (see also data(Searle), from Searle (1997), page 324, and the example in 'connectedSets'): in the matrix of this example, dots represent missing values, X represents observations and lines join the connected sets (diagram not reproduced).
Thus, in this matrix the observations of rows 1, 5 and 7 (or columns 1, 4 and 5) form one connected set. Likewise observations of rows 2 and 6 (or columns 3 and 8) and rows 3 and 4 (or columns 2, 6 and
Value
ms Maximal connected set as matrix or data frame, corresponding to the input.
maxl Number of observations in the maximal connected data set.
nsets Number of connected data sets.
lsets Vector with the number of observations in each connected data set, i.e. lsets[i] is the number of observations in connected data set i.
Author(s)
Joerg Schaber
References
Searle (1997) ’Linear Models’. Wiley. page 318.
See Also
connectedSets
Examples
data(Searle)
maxConnectedSet(Searle)
maxdaylength Maximal day length on latitude l
Description
Calculates maximal daylength maxdl [h] at a certain latitude l [degrees].
Usage
maxdaylength(l)
Arguments
l Latitude in degrees.
Value
maxdl Maximal daylength [h] at a certain latitude l [degrees]
Author(s)
Joerg Schaber
Examples
maxdaylength(60)
Examples
data(DWD)
ddm1 <- pheno.ddm(DWD)
attach(DWD)
y <- factor(DWD[[2]])
s <- factor(DWD[[3]])
ddm2 <- as.matrix.csr(model.matrix(~ y + s -1, contrasts=list(s=("contr.sum"))))
identical(ddm1$ddm,ddm2)
pheno.flm.fit Fits a two-way linear fixed model
Description
Fits a two-way linear fixed model. The model assumes both the first factor f1 and the second factor f2 to be fixed. Errors are assumed to be i.i.d. There is no general mean, and the sum of f2 is constrained to be zero.
Usage
pheno.flm.fit(D)
Arguments
D Data frame with three columns (x, f1, f2) or a matrix where rows are ranks of factor f1 levels and columns are ranks of factor f2 levels and missing values are assumed to be NA or 0.
Details
This function is basically a wrapper for the lm() function, adapted for the estimation of combined phenological time series. In phenological applications, x should be the julian day of observation of a certain phase, factor f1 should be the observation year and factor f2 should be a station-id.
Value
f1 Estimated fixed effects f1, in phenology this is precisely the combined time series.
f1.se f1 estimated standard error.
f1.lev Levels of f1. Should be the same order as f1.
f2 Estimated fixed effects f2, in phenology these are the station effects.
f2.se f2 estimated standard error.
f2.lev Levels of f2. Should be the same order as f2.
resid Residuals.
lclf1 Lower 95 percent confidence limit of factor f1.
uclf1 Upper 95 percent confidence limit of factor f1.
lclf2 Lower 95 percent confidence limit of factor f2.
uclf2 Upper 95 percent confidence limit of factor f2.
fit The fitted lm model object.
Author(s)
Joerg Schaber
References
Searle (1997) 'Linear Models'. Wiley.
Schaber J, Badeck F-W (2002) 'Evaluation of methods for the combination of phenological time series and outlier detection'. Tree Physiology 22:973-
See Also
lm
Examples
data(DWD)
R <- pheno.flm.fit(DWD) # parameter estimation
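The model described above (no general mean, f2 effects summing to zero) can be sketched directly with lm() on synthetic data. The data, the variable names and the way the coefficients are unpacked are assumptions for illustration; this is not the wrapper's actual code.

```r
# Sketch only: two-way fixed model x ~ f1 + f2 without intercept,
# with sum-to-zero contrasts on f2 (synthetic, noise-free data).
year_eff    <- c(120, 118, 125, 121, 117)   # "combined time series"
station_eff <- c(-3, 1, 2)                  # station effects, sum to zero

D <- expand.grid(f1 = factor(1991:1995), f2 = factor(c("A", "B", "C")))
D$x <- year_eff[as.integer(D$f1)] + station_eff[as.integer(D$f2)]
D <- D[-c(2, 9), ]                          # drop two cells: unbalanced but still connected

fit <- lm(x ~ f1 + f2 - 1, data = D, contrasts = list(f2 = "contr.sum"))

f1_hat  <- unname(coef(fit)[1:5])           # year effects
f2_free <- unname(coef(fit)[6:7])           # first two station effects
f2_hat  <- c(f2_free, -sum(f2_free))        # last one follows from the constraint
```

Because the synthetic data contain no noise, f1_hat reproduces year_eff and f2_hat reproduces station_eff, which shows that the year effects absorb the overall level when the station effects are constrained to sum to zero.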
pheno.lad.fit Fits a robust two-way linear model
Description
Fits a robust two-way linear model. The model assumes both factors (f1 and f2) to be fixed. Errors are assumed to be i.i.d. There is no general mean, and the sum of f2 is constrained to be zero.
Usage
pheno.lad.fit(D)
Arguments
D Data frame with three columns (x, f1, f2) or a matrix where rows are ranks of factor f1 levels and columns are ranks of factor f2 levels and missing values are assumed to be NA or 0.
pheno.mlm.fit Fits a two-way linear mixed model
Description
Fits a two-way linear mixed model. The model assumes the first factor f1 to be fixed and the second factor f2 to be random. Errors are assumed to be i.i.d. There is no general mean, and the sum of f2 is constrained to be zero.
Usage
pheno.mlm.fit(D)
Arguments
D Data frame with three columns (x, f1, f2) or a matrix where rows are ranks of factor f1 levels and columns are ranks of factor f2 levels and missing values are set to 0.
Details
This function is basically a wrapper for the lme() function of the nlme package, adapted for the estimation of combined phenological time series. Estimation method: restricted maximum likelihood (REML). In phenological applications, x should be the julian day of observation of a certain phase, factor f1 should be the observation year and factor f2 should be a station-id.
Value
fixed Estimated fixed effects, in phenology this is precisely the combined time series.
fixed.lev Levels of fixed effects. Should be the same order as fixed effects.
random Estimated random effects, in phenology these are the station effects.
random.lev Levels of random effects. Should be the same order as random effects.
SEf1 Standard error group f1, i.e. square root of variance component fixed effect.
SEf2 Standard error group f2, i.e. square root of variance component random effect.
lclf Lower 95 percent confidence limit of fixed effects.
uclf Upper 95 percent confidence limit of fixed effects.
fit The fitted lme model object.
Author(s)
Joerg Schaber
References
Searle (1997) 'Linear Models'. Wiley.
Schaber J, Badeck F-W (2002) 'Evaluation of methods for the combination of phenological time series and outlier detection'. Tree Physiology 22:973-
See Also
lme
Examples
data(DWD)
R <- pheno.mlm.fit(DWD) # parameter estimation
plot(levels(factor(DWD[[2]])),R$fixed,type="l") # plot combined time series
tr <- lm(R$fixed~rank(levels(factor(DWD[[2]])))) # trend estimation
summary(tr)$coef[2]
summary(tr)$coef[4]
raw2matrix Converts a numeric data frame to matrix
Description
Converts a numeric data frame D with three columns (x, factor 1, factor 2) to a numeric matrix M, where rows are ranks of levels of factor 1 and columns are ranks of levels of factor 2. Missing values are set to NA.
Usage
raw2matrix(D)
Arguments
D Data frame with three columns (x, factor 1, factor 2)
Value
M Numeric matrix where rows are ranks of levels of factor 1 and columns are ranks of levels of factor 2; missing values are set to NA.
Author(s)
Joerg Schaber
Examples
data(DWD)
raw2matrix(DWD)
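The reshaping described above can be sketched in base R with matrix indexing. The helper raw_to_matrix_sketch and the small data frame are hypothetical, and adding the factor levels as dimnames is an assumption that the package function may not share.

```r
# Sketch only: three-column data frame (x, factor 1, factor 2) to a matrix,
# with combinations that were never observed left as NA.
raw_to_matrix_sketch <- function(D) {
  f1 <- factor(D[[2]])
  f2 <- factor(D[[3]])
  M <- matrix(NA_real_, nrow = nlevels(f1), ncol = nlevels(f2),
              dimnames = list(levels(f1), levels(f2)))
  # one (row, column) index pair per observation
  M[cbind(as.integer(f1), as.integer(f2))] <- D[[1]]
  M
}

D <- data.frame(x = c(100, 110, 105),
                year = c(1990, 1991, 1990),
                station = c("a", "a", "b"))
M <- raw_to_matrix_sketch(D)
M   # 2 x 2 matrix; the cell for year 1991 at station "b" is NA
```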
Value
prog Progressive row of Kendall's normalized tau's.
retr Retrograde row of Kendall's normalized tau's.
tp Boolean vector indicating at what indices of the original time series prog and retr cross, i.e. TRUE at potential trend turning points.
Author(s)
Joerg Schaber
References
Kendall M, Gibbons JD (1990) 'Rank correlation methods'. Arnold.
Sneyers R (1990) 'On statistical analysis of series of observations'. Technical Note No 143. Geneva. Switzerland. World Meteorological Society.
Schaber J (2003) 'Phenology in Germany in the 20th Century: Methods, analyses and models'. Ph.D. Thesis. University of Potsdam. Germany. http://pub.ub.uni-potsdam.de/2002meta/0022/door.htm
Simple Simple example of a two-way classification table
Description
Simple example of a two-way classification table where missing data creates two distinct connected sets.
Usage
data(Simple)
Format
R source file
tau Kendall’s normalized tau
Description
Kendall’s normalized tau for time series x
Usage
tau(x)
Arguments
x Numeric vector x.
Details
Implicitly assumes an equidistant time series x.
Value
t Kendall’s normalized tau.
Author(s)
Joerg Schaber
References
Kendall M, Gibbons JD (1990) 'Rank correlation methods'. Arnold.
Sneyers R (1990) 'On statistical analysis of series of observations'. Technical Note No 143. Geneva. Switzerland. World Meteorological Society.
Schaber J (2003) 'Phenology in Germany in the 20th Century: Methods, analyses and models'. Ph.D. Thesis. University of Potsdam. Germany. http://pub.ub.uni-potsdam.de/2002meta/0022/door.htm
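The manual does not spell out the normalization used by tau(). As an illustration under that caveat, the sketch below computes the Mann-Kendall statistic S for an equidistant series and divides it by its standard deviation under the no-trend hypothesis (no tie or continuity correction). The name mk_z_sketch is hypothetical, and the result may differ from what tau() returns.

```r
# Sketch only: Mann-Kendall S normalized by its null standard deviation.
mk_z_sketch <- function(x) {
  n <- length(x)
  d <- outer(x, x, function(a, b) sign(b - a))   # d[i, j] = sign(x[j] - x[i])
  s <- sum(d[upper.tri(d)])                      # sum over all pairs with i < j
  s / sqrt(n * (n - 1) * (2 * n + 5) / 18)
}

mk_z_sketch(1:10)          # strictly increasing series: large positive value
mk_z_sketch(rep(5, 10))    # constant series: 0
```

For the increasing series 1:10 every one of the 45 pairs is concordant, so S = 45 and the value is 45 / sqrt(125).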