R:pems.utils Generic Functions

Karl Ropkins

2024-12-23

Background

The R package pems.utils uses two main data types: pems.elements, data-series with assigned units, and pems, data sets of simultaneously logged pems.elements.

This document provides an overview of their generic handling in R.

For a quick and more general introduction to pems.utils, see the [>pems.utils introduction], or return to the [>website index].

If you have any suggestions on how to make either pems.utils or this document better, or you have any problems using either, please let me know [>email me].

Unless you have set up R to load pems.utils automatically, you will need to load it, e.g. using library(pems.utils), to use the functions described here.

Generic functions are a special category of R functions that identify the object type being supplied and, where available, automatically apply object-specific methods to handle these potentially very different objects appropriately.

This means that many functions that R users are already familiar with can be used with specialist object classes like pems and pems.elements.

For those already familiar with R object classes, when used with generic functions, pems behave in a similar fashion to other (row and column) data set objects like data.frames and data.tables, and pems.elements behave in a similar fashion to other vector objects like numerics, characters and factors.

See Chambers (2008) [1] for further discussion of generic functions in R.
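The dispatch mechanism described above can be sketched in base R. This is only an illustration and not pems.utils code: the generic describe and its methods are hypothetical names introduced here.

```r
# 'describe' is a hypothetical generic used only for illustration
describe <- function(x, ...) UseMethod("describe")

# fallback used when no class-specific method is found
describe.default <- function(x, ...) "an object with no dedicated method"

# class-specific methods, named <generic>.<class>
describe.numeric <- function(x, ...) paste("a numeric of length", length(x))
describe.data.frame <- function(x, ...)
  paste("a data.frame with", nrow(x), "rows and", ncol(x), "columns")

describe(c(1.5, 2.5))                    # handled by describe.numeric
describe(data.frame(a = 1:3, b = 4:6))   # handled by describe.data.frame
describe("text")                         # falls back to describe.default
```

The user only ever calls describe; R selects the method from the class of the supplied object.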

print is perhaps the most heavily used generic, and is a useful example of how generics work.

When an object is evaluated and the result is not assigned, it is printed. This means when, for example, you enter pems.1, the name of the example pems data set provided with pems.utils, you are actually seeing the output of print(pems.1):

pems.1 
## pems (1000x25)
##               time.stamp  local.time  conc.co  conc.co2  conc.hc  conc.nox
##        [Y-M-D H:M:S GMT]         [s]   [vol%]    [vol%]  [ppmC6]     [ppm]
##   1  2005-09-08 11:46:07           0        0         0        0    20.447
##   2  2005-09-08 11:46:08           1        0         0        0    21.973
##   3  2005-09-08 11:46:09           2        0         0        0    20.752
##   4  2005-09-08 11:46:10           3        0         0        0    22.583
##   5  2005-09-08 11:46:11           4        0         0        0    20.142
##   6  2005-09-08 11:46:12           5        0         0        0    20.142
##  ... not showing: 994 rows; 19 cols (elements) 
##  ... other cols: afr; exh.flow.rate[L/min]; exh.temp[degC]; exh.press[kPa];
##       amb.temp[degC]; amb.press[kPa]; amb.humidity[%]; velocity[km/h];
##       revolution[rpm]; option.1[V]; option2[V]; option.3[V];
##       latitude[d.degLat]; longitude[d.degLon]; altitude[m];
##       gps.velocity[km/h]; satellite; n.s; w.e

Here, print identifies pems.1 as a pems and passes this to a hidden function called print.pems that actually handles how the object is shown to the user.

While this might seem a little involved, it is all handled silently by R and means that the user gets object-specific responses without having to check the object class and then run a dedicated print.class function themselves.

It also allows package developers to align the behaviour of their packages more closely with that of other packages.
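As a rough sketch of the same idea in base R, the example below writes a print method for a made-up class. The class toy.series and its units attribute are assumptions for illustration only, and print.pems itself is likely to be considerably more involved.

```r
# 'toy.series' is a hypothetical class: a numeric vector with a units attribute
toy <- structure(c(20.4, 21.9, 20.8), units = "ppm", class = "toy.series")

# print method, found by print() through the print.<class> naming convention
print.toy.series <- function(x, ...) {
  cat("toy.series [n=", length(x), "; units=", attr(x, "units"), "]\n", sep = "")
  print(as.vector(unclass(x)), ...)   # show the values without class or attributes
  invisible(x)
}

toy          # auto-printing calls print(toy), which dispatches to print.toy.series
print(toy)   # the same output, requested explicitly
```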

pems.utils Generic Functions

as.data.frame converts a supplied pems or pems.element into a data.frame, for example:

data.frame.1 <- as.data.frame(pems.1)

As part of this process, as.data.frame discards pems.utils information, like units, that is not tracked by data.frames.

While this might seem a disadvantage, it means that functions that cannot handle such extra object structure can convert the object to a data.frame and work with that instead. The linear model function lm, for example, uses as.data.frame to allow it to work with many other classes of objects.
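The lm behaviour can be illustrated with a small base R sketch. The class toy.set and its as.data.frame method are hypothetical stand-ins, not pems.utils code; they only show how supplying an as.data.frame method lets lm work with an otherwise unknown class.

```r
# 'toy.set' is a hypothetical class: a list of equal-length columns plus units
toy <- structure(
  list(x = 1:10, y = 2 * (1:10) + 1),
  units = c(x = "s", y = "m"),
  class = "toy.set"
)

# as.data.frame method: keeps the columns, drops the units
as.data.frame.toy.set <- function(x, ...) {
  data.frame(x = x$x, y = x$y)
}
# explicit registration, so the method is also found if this code
# is not run in the global environment
registerS3method("as.data.frame", "toy.set", as.data.frame.toy.set)

df <- as.data.frame(toy)
attr(df, "units")          # NULL: the units did not survive the conversion

# lm() knows nothing about toy.set, but the data argument is
# converted with as.data.frame() before the model is fitted
fit <- lm(y ~ x, data = toy)
coef(fit)
```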

as.pems converts data set classes like data.frames into pems, for example:

pems.2 <- as.pems(data.frame.1)

as.pems.element works similarly to convert vector classes to pems.elements.

Both as.pems and as.pems.element will accept extra arguments to associate pems.utils-specific structure like units. So, the as… functions can be used in combination to work with functions that do not ‘like’ pems.utils classes, for example:

#get units
units.1 <- units(pems.1)                
#make pems into data.frame
data.frame.1 <- as.data.frame(pems.1)      
#do whatever you cannot do with a pems with your data.frame...
#then if this makes a new data.frame you want to keep
#(update units if any changed or extra columns added)
#re-build the pems
pems.2 <- as.pems(data.frame.1, units=units.1)

dim reports the dimensions, rows then columns, of a supplied pems, for example:

dim(pems.1)
## [1] 1000   25

Related functions ncol and nrow report the numbers of columns and rows in a pems, respectively.

Similar to other vector classes, dim(pems.element), ncol(pems.element) and nrow(pems.element) all return NULL.
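For comparison, this is the base R behaviour for an ordinary numeric vector. The sketch uses only base functions and does not involve pems.utils objects.

```r
v <- c(10.2, 11.5, 9.8)      # a plain numeric vector

dim(v)      # NULL
nrow(v)     # NULL
ncol(v)     # NULL
length(v)   # 3

# NROW() and NCOL() treat a vector as a one-column matrix
NROW(v)     # 3
NCOL(v)     # 1
```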

head shows the top rows of a supplied pems or the start of a supplied pems.element. The extra argument n changes the number of pems rows or pems.element values shown, for example:

head(pems.1)
## pems (6x25)
##               time.stamp  local.time  conc.co  conc.co2  conc.hc  conc.nox
##        [Y-M-D H:M:S GMT]         [s]   [vol%]    [vol%]  [ppmC6]     [ppm]
##   1  2005-09-08 11:46:07           0        0         0        0    20.447
##   2  2005-09-08 11:46:08           1        0         0        0    21.973
##   3  2005-09-08 11:46:09           2        0         0        0    20.752
##   4  2005-09-08 11:46:10           3        0         0        0    22.583
##   5  2005-09-08 11:46:11           4        0         0        0    20.142
##   6  2005-09-08 11:46:12           5        0         0        0    20.142
##  ... not showing: 19 cols (elements) 
##  ... other cols: afr; exh.flow.rate[L/min]; exh.temp[degC]; exh.press[kPa];
##       amb.temp[degC]; amb.press[kPa]; amb.humidity[%]; velocity[km/h];
##       revolution[rpm]; option.1[V]; option2[V]; option.3[V];
##       latitude[d.degLat]; longitude[d.degLon]; altitude[m];
##       gps.velocity[km/h]; satellite; n.s; w.e
head(pems.1, n=3)
## pems (3x25)
##               time.stamp  local.time  conc.co  conc.co2  conc.hc  conc.nox
##        [Y-M-D H:M:S GMT]         [s]   [vol%]    [vol%]  [ppmC6]     [ppm]
##   1  2005-09-08 11:46:07           0        0         0        0    20.447
##   2  2005-09-08 11:46:08           1        0         0        0    21.973
##   3  2005-09-08 11:46:09           2        0         0        0    20.752
##  ... not showing: 19 cols (elements) 
##  ... other cols: afr; exh.flow.rate[L/min]; exh.temp[degC]; exh.press[kPa];
##       amb.temp[degC]; amb.press[kPa]; amb.humidity[%]; velocity[km/h];
##       revolution[rpm]; option.1[V]; option2[V]; option.3[V];
##       latitude[d.degLat]; longitude[d.degLon]; altitude[m];
##       gps.velocity[km/h]; satellite; n.s; w.e

The related function tail works similarly but shows the end of a supplied pems or pems.element.

na.omit removes any rows of a supplied pems that contain NAs, for example:

a <- pems.1[1:4, 1:3]; a[2,2] <- NA; a
## pems (4x3)
##               time.stamp  local.time  conc.co
##        [Y-M-D H:M:S GMT]         [s]   [vol%]
##   1  2005-09-08 11:46:07           0        0
##   2  2005-09-08 11:46:08          NA        0
##   3  2005-09-08 11:46:09           2        0
##   4  2005-09-08 11:46:10           3        0
na.omit(a)
## pems (3x3)
##               time.stamp  local.time  conc.co
##        [Y-M-D H:M:S GMT]         [s]   [vol%]
##   1  2005-09-08 11:46:07           0        0
##   3  2005-09-08 11:46:09           2        0
##   4  2005-09-08 11:46:10           3        0

Functions that cannot handle datasets that contain missing values may use na.omit to ‘clean’ datasets.
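A minimal base R sketch of this pattern, using a plain data.frame rather than a pems; the helper cleanMeans is a hypothetical name introduced for illustration.

```r
# hypothetical helper: column means after dropping incomplete rows
cleanMeans <- function(data) {
  cleaned <- na.omit(data)        # remove rows containing any NA
  colMeans(cleaned)
}

df <- data.frame(speed = c(10, NA, 30, 40), flow = c(1, 2, 3, 5))
nrow(na.omit(df))                 # 3 rows remain
cleanMeans(df)                    # means of the three complete rows
```

Note that a whole row is dropped even when only one of its values is missing, so the flow value in row 2 is also excluded here.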

names reports the names of pems.elements in a supplied pems, for example:

names(pems.1)
##  [1] "time.stamp"    "local.time"    "conc.co"       "conc.co2"     
##  [5] "conc.hc"       "conc.nox"      "afr"           "exh.flow.rate"
##  [9] "exh.temp"      "exh.press"     "amb.temp"      "amb.press"    
## [13] "amb.humidity"  "velocity"      "revolution"    "option.1"     
## [17] "option2"       "option.3"      "latitude"      "longitude"    
## [21] "altitude"      "gps.velocity"  "satellite"     "n.s"          
## [25] "w.e"

It can also be used to change the names of the pems.elements in a supplied pems, for example:

names(pems.1)[3:6] <- c("CO", "CO2", "HC", "NOX")
names(pems.1)
##  [1] "time.stamp"    "local.time"    "CO"            "CO2"          
##  [5] "HC"            "NOX"           "afr"           "exh.flow.rate"
##  [9] "exh.temp"      "exh.press"     "amb.temp"      "amb.press"    
## [13] "amb.humidity"  "velocity"      "revolution"    "option.1"     
## [17] "option2"       "option.3"      "latitude"      "longitude"    
## [21] "altitude"      "gps.velocity"  "satellite"     "n.s"          
## [25] "w.e"

As with other vector classes, names(pems.element) returns NULL.

plot generates a scatterplot matrix for a supplied pems. By default, the matrix is limited to three pems.elements, but plotted cases can be changed with the extra arguments n (number of plot cases), id (names of columns to use) and ignore (names of columns to ignore). It also accepts common plot arguments (col, pch, cex, etc) to modify the appearance of the plot, for example:

plot(pems.1)
plot(pems.1, id=c("velocity", "revolution", "exh.flow.rate"), 
     col="red", pch=20, cex=0.3)

Note: this output is selected to be consistent with plot(data.frame) handling.

plot generates a conventional plot of a supplied pems.element. It accepts common plot arguments (col, pch, cex, etc.) and automatically adds units (if set) to axis labels, for example:

plot(pems.1$velocity)
plot(pems.1$local.time, pems.1$velocity, type="l", col="red")

Although the generic plot function is enabled for both pems and pems.elements, the package was written to support other plotting options in R.

See also [>pems.utils plots] or R help documentation (?pems.units) for more about plotting pems and pems.elements.

print (described in the introductory example above) generates a print report of a supplied pems or pems.element. Both are foreshortened to produce a screen-friendly summary report using a similar strategy to tibbles [>for more about tibbles].

Additional arguments can also be used to modify outputs, for example:

print(pems.1)
## pems (1000x25)
##               time.stamp  local.time      CO     CO2       HC     NOX     afr
##        [Y-M-D H:M:S GMT]         [s]  [vol%]  [vol%]  [ppmC6]   [ppm]        
##   1  2005-09-08 11:46:07           0       0       0        0  20.447  199.85
##   2  2005-09-08 11:46:08           1       0       0        0  21.973  199.89
##   3  2005-09-08 11:46:09           2       0       0        0  20.752  199.91
##   4  2005-09-08 11:46:10           3       0       0        0  22.583  199.88
##   5  2005-09-08 11:46:11           4       0       0        0  20.142  199.88
##   6  2005-09-08 11:46:12           5       0       0        0  20.142  199.85
##  ... not showing: 994 rows; 18 cols (elements) 
##  ... other cols: exh.flow.rate[L/min]; exh.temp[degC]; exh.press[kPa];
##       amb.temp[degC]; amb.press[kPa]; amb.humidity[%]; velocity[km/h];
##       revolution[rpm]; option.1[V]; option2[V]; option.3[V];
##       latitude[d.degLat]; longitude[d.degLon]; altitude[m];
##       gps.velocity[km/h]; satellite; n.s; w.e
print(pems.1, rows=7, cols=4)
## pems (1000x25)
##               time.stamp  local.time      CO     CO2
##        [Y-M-D H:M:S GMT]         [s]  [vol%]  [vol%]
##   1  2005-09-08 11:46:07           0       0       0
##   2  2005-09-08 11:46:08           1       0       0
##   3  2005-09-08 11:46:09           2       0       0
##   4  2005-09-08 11:46:10           3       0       0
##   5  2005-09-08 11:46:11           4       0       0
##   6  2005-09-08 11:46:12           5       0       0
##   7  2005-09-08 11:46:13           6       0       0
##  ... not showing: 993 rows; 21 cols (elements) 
##  ... other cols: HC[ppmC6]; NOX[ppm]; afr; exh.flow.rate[L/min];
##       exh.temp[degC]; exh.press[kPa]; amb.temp[degC]; amb.press[kPa];
##       amb.humidity[%]; velocity[km/h]; revolution[rpm]; option.1[V];
##       option2[V]; option.3[V]; latitude[d.degLat]; longitude[d.degLon];
##       altitude[m]; gps.velocity[km/h]; satellite; n.s; w.e

subset extracts the rows of a supplied pems that satisfy a sub-sampling (or filtering) condition supplied in the same call, for example:

subset(pems.1, local.time < 4) 
## pems (4x25)
##               time.stamp  local.time      CO     CO2       HC     NOX     afr
##        [Y-M-D H:M:S GMT]         [s]  [vol%]  [vol%]  [ppmC6]   [ppm]        
##   1  2005-09-08 11:46:07           0       0       0        0  20.447  199.85
##   2  2005-09-08 11:46:08           1       0       0        0  21.973  199.89
##   3  2005-09-08 11:46:09           2       0       0        0  20.752  199.91
##   4  2005-09-08 11:46:10           3       0       0        0  22.583  199.88
##  ... not showing: 18 cols (elements) 
##  ... other cols: exh.flow.rate[L/min]; exh.temp[degC]; exh.press[kPa];
##       amb.temp[degC]; amb.press[kPa]; amb.humidity[%]; velocity[km/h];
##       revolution[rpm]; option.1[V]; option2[V]; option.3[V];
##       latitude[d.degLat]; longitude[d.degLon]; altitude[m];
##       gps.velocity[km/h]; satellite; n.s; w.e
subset(pems.1, CO==max(CO, na.rm=TRUE))
## pems (1x25)
##                time.stamp  local.time      CO     CO2       HC     NOX     afr
##         [Y-M-D H:M:S GMT]         [s]  [vol%]  [vol%]  [ppmC6]   [ppm]        
##   39  2005-09-08 11:46:45          38  2.9745  11.727   415.19  138.55  13.867
##  ... not showing: 18 cols (elements) 
##  ... other cols: exh.flow.rate[L/min]; exh.temp[degC]; exh.press[kPa];
##       amb.temp[degC]; amb.press[kPa]; amb.humidity[%]; velocity[km/h];
##       revolution[rpm]; option.1[V]; option2[V]; option.3[V];
##       latitude[d.degLat]; longitude[d.degLon]; altitude[m];
##       gps.velocity[km/h]; satellite; n.s; w.e

As with other vectors, subset is not intended for use with pems.elements.
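For readers less familiar with subset, the base R sketch below shows what the filtering condition does, using an ordinary data.frame with made-up values rather than a pems.

```r
df <- data.frame(local.time = 0:5,
                 CO = c(0, 0.5, NA, 2.9, 1.2, 0))

# rows where the condition is TRUE; column names are used directly
early <- subset(df, local.time < 4)

# the row holding the largest CO value
peak <- subset(df, CO == max(CO, na.rm = TRUE))

# equivalent bracket form: names need the df$ prefix and
# NA results have to be excluded explicitly
peak2 <- df[!is.na(df$CO) & df$CO == max(df$CO, na.rm = TRUE), ]
```

Rows where the condition evaluates to NA are dropped by subset, which is why the bracket form needs the extra !is.na() term to give the same result.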

summary generates a conventional R data summary for a supplied pems or pems.element, for example:

summary(pems.1$velocity)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
##    0.00    0.20   15.55   22.27   46.23   69.70

units gets or sets the units of a supplied pems or pems.element, for example:

units(pems.1)
##        time.stamp local.time   CO  CO2    HC NOX afr exh.flow.rate exh.temp
## 1 Y-M-D H:M:S GMT          s vol% vol% ppmC6 ppm             L/min     degC
##   exh.press amb.temp amb.press amb.humidity velocity revolution option.1
## 1       kPa     degC       kPa            %     km/h        rpm        V
##   option2 option.3 latitude longitude altitude gps.velocity satellite n.s w.e
## 1       V        V d.degLat  d.degLon        m         km/h
units(pems.1)[1:2]
##        time.stamp local.time
## 1 Y-M-D H:M:S GMT          s
units(pems.1)["afr"]
##   afr
## 1
units(pems.1)["afr"] <- "ratio"
units(pems.1)
##        time.stamp local.time   CO  CO2    HC NOX   afr exh.flow.rate exh.temp
## 1 Y-M-D H:M:S GMT          s vol% vol% ppmC6 ppm ratio         L/min     degC
##   exh.press amb.temp amb.press amb.humidity velocity revolution option.1
## 1       kPa     degC       kPa            %     km/h        rpm        V
##   option2 option.3 latitude longitude altitude gps.velocity satellite n.s w.e
## 1       V        V d.degLat  d.degLon        m         km/h

See also [>pems.utils units] or R help documentation (?pems.units) for more on pems units handling.

with evaluates an expression using a supplied pems as the data source, so the user can work with contained pems.elements directly, for example:

#rather than
#diff(pems.1$velocity)/diff(pems.1$local.time)
with(pems.1, diff(velocity)/diff(local.time))
## pems.element [n=999]
##   [1]   0.0   0.2   0.0  -0.1   0.2  -0.1   0.4  -0.6   0.1   0.0   0.0  -0.1
##  [13]   0.1  -0.1   0.2  -0.1   0.1  -0.1   0.2  -0.3   0.1   0.0  -0.1   0.0
##  [25]   0.0   0.2  -0.1  -0.1   0.0   0.1   0.1   0.1  -0.3   0.2   0.1  -0.3
##   ... not showing: 81 rows
##   ... <numeric>

As with other vectors, with is not intended for use with pems.elements.
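The same shorthand works with ordinary data.frames in base R. A small sketch with made-up values:

```r
df <- data.frame(local.time = 0:4,
                 velocity = c(0, 2, 5, 5, 3))

# explicit form: every column needs the df$ prefix
a1 <- diff(df$velocity) / diff(df$local.time)

# with() evaluates the expression inside df, so columns are used by name
a2 <- with(df, diff(velocity) / diff(local.time))

identical(a1, a2)   # TRUE
```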

Other Generic Functions

There are numerous generic functions in R, and pems and pems.element versions are only written for those where a need was identified.
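To check whether a class-specific method exists for a given generic, base R tools such as getS3method and methods can be used. The sketch below uses a hypothetical class toy.log and the base data.frame class; with pems.utils loaded, the same calls with class "pems" should report the pems methods.

```r
# hypothetical class with no methods written for it
obj <- structure(list(a = 1:3), class = "toy.log")

# getS3method() returns NULL (with optional = TRUE) when no method exists,
# in which case the generic falls back to its default method
is.null(utils::getS3method("summary", "toy.log", optional = TRUE))      # TRUE
is.null(utils::getS3method("summary", "data.frame", optional = TRUE))   # FALSE

# list the generics that do have methods for a given class
methods(class = "data.frame")
```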

If you think any other generic functions would be useful, please let me know [>email me].

Similarly, if you have any suggestions on how to make either pems.utils or this document better, or you have any problems using either, please let me know [>email me].

Return to the [>website index] or [>introduction].


  1. Chambers, J., 2008. Software for data analysis: programming with R. Springer Science & Business Media.