The R package pems.utils uses two main data types: pems.elements, data-series with assigned units, and pems, data sets of simultaneously logged pems.elements.
This document provides an overview of their generic handling in R.
For a quick and more general introduction to pems.utils, see the [>pems.utils introduction], or return to the [>website index].
If you have any suggestions on how to make either pems.utils or this document better, or you have any problems using either, please let me know. [>email me]
Unless you have set up R to load pems.utils automatically, you will need to load it, e.g. using
library(pems.utils)
to use the functions described here.
Generic functions are a special category of R functions that identify the object type being supplied and, where available, automatically apply object-specific methods to handle these potentially very different objects appropriately.
This means that many functions that R users are already familiar with can be used with specialist object classes like pems and pems.elements.
For those already familiar with R object classes, when used with generic functions, pems behave in a similar fashion to other (row and column) data set objects like data.frames and data.tables, and pems.elements behave in a similar fashion to other vector objects like numerics, characters and factors.
See Chambers (2008) for further discussion of generic functions in R.
print is perhaps the most heavily used generic, and a useful example of how generics work.
When an object is evaluated and the result is not assigned, it is printed. This means that when, for example, you enter pems.1, the name of the example pems data set provided with pems.utils, you are actually seeing the output of print(pems.1):
## pems (1000x25)
## time.stamp local.time conc.co conc.co2 conc.hc conc.nox
## [Y-M-D H:M:S GMT] [s] [vol%] [vol%] [ppmC6] [ppm]
## 1 2005-09-08 11:46:07 0 0 0 0 20.447
## 2 2005-09-08 11:46:08 1 0 0 0 21.973
## 3 2005-09-08 11:46:09 2 0 0 0 20.752
## 4 2005-09-08 11:46:10 3 0 0 0 22.583
## 5 2005-09-08 11:46:11 4 0 0 0 20.142
## 6 2005-09-08 11:46:12 5 0 0 0 20.142
## ... not showing: 994 rows; 19 cols (elements)
## ... other cols: afr; exh.flow.rate[L/min]; exh.temp[degC]; exh.press[kPa];
## amb.temp[degC]; amb.press[kPa]; amb.humidity[%]; velocity[km/h];
## revolution[rpm]; option.1[V]; option2[V]; option.3[V];
## latitude[d.degLat]; longitude[d.degLon]; altitude[m];
## gps.velocity[km/h]; satellite; n.s; w.e
Here, print identifies pems.1 as a pems and passes it to a hidden function called print.pems that actually handles how the object is shown to the user.
While this might seem a little involved, it is all handled silently by R and means that the user gets object-specific responses without having to check the object class and then run a dedicated print.class function themselves.
It also allows package developers to align the use of their own packages more closely with that of other packages.
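The dispatch step described above can be sketched in base R with a toy class. The class name toy.series and its contents are hypothetical and are not part of pems.utils; the sketch only shows how R picks a print method from an object's class.

```r
# Hypothetical toy class, used only to illustrate S3 dispatch
toy <- structure(list(values = c(1.5, 2.5, 4.0), units = "m/s"),
                 class = "toy.series")

# A print method for the toy class; R finds it by its name, print.<class>
print.toy.series <- function(x, ...) {
  cat("toy.series [n=", length(x$values), "; units=", x$units, "]\n", sep = "")
  print(x$values, ...)
  invisible(x)
}

toy                  # auto-printing dispatches to print.toy.series
print(toy)           # the explicit call does the same
print(unclass(toy))  # with the class removed, the default list printing is used
```

The user only ever calls print; which method runs is decided by the class attribute of the object.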
as.data.frame converts a supplied pems or pems.element into a data.frame.
As part of this process, as.data.frame discards pems.utils information, such as units, that is not tracked by data.frames.
While this might seem a disadvantage, it does mean that functions that do not like such extra object structure have the option to convert the object to a data.frame and work with that. The linear model function lm, for example, uses as.data.frame to allow it to work with many other classes of objects.
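As a minimal sketch of the conversion, assuming pems.utils is installed and using the example data set pems.1 (the object name df.1 is arbitrary):

```r
library(pems.utils)

# convert the example pems to a plain data.frame
df.1 <- as.data.frame(pems.1)

class(df.1)         # now a data.frame rather than a pems
dim(df.1)           # rows and columns are unchanged
head(df.1[, 1:3])   # ordinary data.frame printing, without the units row
```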
as.pems converts data set classes like data.frames into pems.
as.pems.element works similarly to convert vector classes to pems.elements.
Both as.pems and as.pems.element will accept extra arguments to associate pems.utils-specific structure like units. So, the as… functions can be used in combination to work with functions that do not ‘like’ pems.utils classes, for example:
#get units
units.1 <- units(pems.1)
#make pems into data.frame
data.frame.1 <- as.data.frame(pems.1)
#do whatever you cannot do with a pems with your data.frame...
#then if this makes a new data.frame you want to keep
#(update units if any changed or extra columns added)
#re-build the pems
pems.2 <- as.pems(data.frame.1, units=units.1)
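A smaller sketch of as.pems and as.pems.element on their own is given here. The data values and object names are made up for illustration, and it is an assumption that units can be supplied as a character vector with one entry per column, in column order.

```r
library(pems.utils)

# a small, made-up data.frame (values are illustrative only)
df <- data.frame(speed = 1:10, emissions = (1:10) / 10)

# build a pems, supplying units for each column
# (assumed: a character vector, one entry per column, in order)
pems.small <- as.pems(df, units = c("m/s", "g/s"))
pems.small
units(pems.small)

# build a pems.element from a numeric vector
speed <- as.pems.element(df$speed, units = "m/s")
speed
```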
dim reports the dimensions, rows then columns, of a supplied pems, for example:
dim(pems.1)
## [1] 1000 25
Related functions ncol and nrow report the numbers of columns and rows in a pems, respectively.
Similar to other vector classes, dim(pems.element), ncol(pems.element) and nrow(pems.element) all return NULL.
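The related functions and the vector-like behaviour of pems.elements can be checked along these lines, assuming the example data set pems.1 (the object name velocity is arbitrary):

```r
library(pems.utils)

nrow(pems.1)   # number of rows (logged time points)
ncol(pems.1)   # number of columns (pems.elements)

# a pems.element is vector-like, so it has no dimensions
velocity <- pems.1$velocity
dim(velocity)
nrow(velocity)
length(velocity)   # use length to get the number of values instead
```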
head shows the top rows of a supplied pems or the start of a supplied pems.element. The extra argument n changes the number of pems rows or pems.element values shown, for example:
head(pems.1)
## pems (6x25)
## time.stamp local.time conc.co conc.co2 conc.hc conc.nox
## [Y-M-D H:M:S GMT] [s] [vol%] [vol%] [ppmC6] [ppm]
## 1 2005-09-08 11:46:07 0 0 0 0 20.447
## 2 2005-09-08 11:46:08 1 0 0 0 21.973
## 3 2005-09-08 11:46:09 2 0 0 0 20.752
## 4 2005-09-08 11:46:10 3 0 0 0 22.583
## 5 2005-09-08 11:46:11 4 0 0 0 20.142
## 6 2005-09-08 11:46:12 5 0 0 0 20.142
## ... not showing: 19 cols (elements)
## ... other cols: afr; exh.flow.rate[L/min]; exh.temp[degC]; exh.press[kPa];
## amb.temp[degC]; amb.press[kPa]; amb.humidity[%]; velocity[km/h];
## revolution[rpm]; option.1[V]; option2[V]; option.3[V];
## latitude[d.degLat]; longitude[d.degLon]; altitude[m];
## gps.velocity[km/h]; satellite; n.s; w.e
head(pems.1, n=3)
## pems (3x25)
## time.stamp local.time conc.co conc.co2 conc.hc conc.nox
## [Y-M-D H:M:S GMT] [s] [vol%] [vol%] [ppmC6] [ppm]
## 1 2005-09-08 11:46:07 0 0 0 0 20.447
## 2 2005-09-08 11:46:08 1 0 0 0 21.973
## 3 2005-09-08 11:46:09 2 0 0 0 20.752
## ... not showing: 19 cols (elements)
## ... other cols: afr; exh.flow.rate[L/min]; exh.temp[degC]; exh.press[kPa];
## amb.temp[degC]; amb.press[kPa]; amb.humidity[%]; velocity[km/h];
## revolution[rpm]; option.1[V]; option2[V]; option.3[V];
## latitude[d.degLat]; longitude[d.degLon]; altitude[m];
## gps.velocity[km/h]; satellite; n.s; w.e
The related function tail works similarly but shows the end of a supplied pems or pems.element.
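A short sketch of tail, and of head and tail applied to a pems.element, assuming the example data set pems.1:

```r
library(pems.utils)

# last three rows of the pems
tail(pems.1, n = 3)

# first and last ten values of a single pems.element
head(pems.1$velocity, n = 10)
tail(pems.1$velocity, n = 10)
```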
na.omit removes any rows of a supplied pems that contain NAs, for example:
## pems (4x3)
## time.stamp local.time conc.co
## [Y-M-D H:M:S GMT] [s] [vol%]
## 1 2005-09-08 11:46:07 0 0
## 2 2005-09-08 11:46:08 NA 0
## 3 2005-09-08 11:46:09 2 0
## 4 2005-09-08 11:46:10 3 0
## pems (3x3)
## time.stamp local.time conc.co
## [Y-M-D H:M:S GMT] [s] [vol%]
## 1 2005-09-08 11:46:07 0 0
## 3 2005-09-08 11:46:09 2 0
## 4 2005-09-08 11:46:10 3 0
Functions that cannot handle datasets that contain missing values may use na.omit to ‘clean’ datasets.
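The code that built the small pems in the example above is not shown. One way to construct a similar case, as a sketch only, is to go via a data.frame and rebuild the pems with the as… functions described earlier. The object names df and temp are arbitrary, and it is an assumption that the first three entries of units(pems.1) can be selected with [1:3].

```r
library(pems.utils)

# take the first four rows and three columns as a plain data.frame
df <- as.data.frame(pems.1)[1:4, 1:3]

# introduce a missing value (for illustration only)
df$local.time[2] <- NA

# rebuild a small pems, reusing the matching units
temp <- as.pems(df, units = units(pems.1)[1:3])
temp

# drop any rows that contain NAs
na.omit(temp)
```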
names reports the names of pems.elements in a supplied pems, for example:
names(pems.1)
## [1] "time.stamp" "local.time" "conc.co" "conc.co2"
## [5] "conc.hc" "conc.nox" "afr" "exh.flow.rate"
## [9] "exh.temp" "exh.press" "amb.temp" "amb.press"
## [13] "amb.humidity" "velocity" "revolution" "option.1"
## [17] "option2" "option.3" "latitude" "longitude"
## [21] "altitude" "gps.velocity" "satellite" "n.s"
## [25] "w.e"
It can also be used to change the names of the pems.elements in a supplied pems, for example:
## [1] "time.stamp" "local.time" "CO" "CO2"
## [5] "HC" "NOX" "afr" "exh.flow.rate"
## [9] "exh.temp" "exh.press" "amb.temp" "amb.press"
## [13] "amb.humidity" "velocity" "revolution" "option.1"
## [17] "option2" "option.3" "latitude" "longitude"
## [21] "altitude" "gps.velocity" "satellite" "n.s"
## [25] "w.e"
As with other vector classes, names(pems.element) returns NULL.
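The renaming shown above could be done along the following lines. This is a sketch that works on a copy (the name temp is arbitrary) so that pems.1 itself is left unchanged; the outputs later in this document suggest the original example renamed pems.1 directly.

```r
library(pems.utils)

temp <- pems.1   # work on a copy

# rename the four concentration columns
names(temp)[3:6] <- c("CO", "CO2", "HC", "NOX")
names(temp)

# the renamed pems.elements are then accessed by their new names
head(temp$CO)
```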
plot generates a scatterplot matrix for a supplied pems. By default, the matrix is limited to three pems.elements, but plotted cases can be changed with the extra arguments n (number of plot cases), id (names of columns to use) and ignore (names of columns to ignore). It also accepts common plot arguments (col, pch, cex, etc.) to modify the appearance of the plot, for example:
plot(pems.1)
plot(pems.1, id=c("velocity", "revolution", "exh.flow.rate"),
col="red", pch=20, cex=0.3)
Note: this output is selected to be consistent with plot(data.frame) handling.
plot generates a conventional plot of a supplied pems.element. It accepts common plot arguments (col, pch, cex, etc.) and automatically adds units (if set) to axes labels.
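A minimal sketch of plotting a single pems.element, assuming the example data set pems.1 and reusing the plot arguments from the pems example above (the object name velocity is arbitrary):

```r
library(pems.utils)

# extract one pems.element from the example pems
velocity <- pems.1$velocity

# default plot
plot(velocity)

# same plot with common plot arguments
plot(velocity, col = "red", pch = 20, cex = 0.3)
```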
Although the generic plot function is enabled for both pems and pems.elements, the package was written to support other plotting options in R.
See also [>pems.utils plots] or R help documentation (?pems.units) for more about plotting pems and pems.elements.
print (described in the introductory example above) generates a print report of a supplied pems or pems.element. Both are foreshortened to produce a screen-friendly summary report, using a strategy similar to that used for tibbles. [>for more about tibbles]
Additional arguments can also be used to modify outputs, for example:
## pems (1000x25)
## time.stamp local.time CO CO2 HC NOX afr
## [Y-M-D H:M:S GMT] [s] [vol%] [vol%] [ppmC6] [ppm]
## 1 2005-09-08 11:46:07 0 0 0 0 20.447 199.85
## 2 2005-09-08 11:46:08 1 0 0 0 21.973 199.89
## 3 2005-09-08 11:46:09 2 0 0 0 20.752 199.91
## 4 2005-09-08 11:46:10 3 0 0 0 22.583 199.88
## 5 2005-09-08 11:46:11 4 0 0 0 20.142 199.88
## 6 2005-09-08 11:46:12 5 0 0 0 20.142 199.85
## ... not showing: 994 rows; 18 cols (elements)
## ... other cols: exh.flow.rate[L/min]; exh.temp[degC]; exh.press[kPa];
## amb.temp[degC]; amb.press[kPa]; amb.humidity[%]; velocity[km/h];
## revolution[rpm]; option.1[V]; option2[V]; option.3[V];
## latitude[d.degLat]; longitude[d.degLon]; altitude[m];
## gps.velocity[km/h]; satellite; n.s; w.e
## pems (1000x25)
## time.stamp local.time CO CO2
## [Y-M-D H:M:S GMT] [s] [vol%] [vol%]
## 1 2005-09-08 11:46:07 0 0 0
## 2 2005-09-08 11:46:08 1 0 0
## 3 2005-09-08 11:46:09 2 0 0
## 4 2005-09-08 11:46:10 3 0 0
## 5 2005-09-08 11:46:11 4 0 0
## 6 2005-09-08 11:46:12 5 0 0
## 7 2005-09-08 11:46:13 6 0 0
## ... not showing: 993 rows; 21 cols (elements)
## ... other cols: HC[ppmC6]; NOX[ppm]; afr; exh.flow.rate[L/min];
## exh.temp[degC]; exh.press[kPa]; amb.temp[degC]; amb.press[kPa];
## amb.humidity[%]; velocity[km/h]; revolution[rpm]; option.1[V];
## option2[V]; option.3[V]; latitude[d.degLat]; longitude[d.degLon];
## altitude[m]; gps.velocity[km/h]; satellite; n.s; w.e
subset extracts a subset from a supplied pems selected using a sub-sampling (or filtering) argument supplied in the same call, for example:
## pems (4x25)
## time.stamp local.time CO CO2 HC NOX afr
## [Y-M-D H:M:S GMT] [s] [vol%] [vol%] [ppmC6] [ppm]
## 1 2005-09-08 11:46:07 0 0 0 0 20.447 199.85
## 2 2005-09-08 11:46:08 1 0 0 0 21.973 199.89
## 3 2005-09-08 11:46:09 2 0 0 0 20.752 199.91
## 4 2005-09-08 11:46:10 3 0 0 0 22.583 199.88
## ... not showing: 18 cols (elements)
## ... other cols: exh.flow.rate[L/min]; exh.temp[degC]; exh.press[kPa];
## amb.temp[degC]; amb.press[kPa]; amb.humidity[%]; velocity[km/h];
## revolution[rpm]; option.1[V]; option2[V]; option.3[V];
## latitude[d.degLat]; longitude[d.degLon]; altitude[m];
## gps.velocity[km/h]; satellite; n.s; w.e
## pems (1x25)
## time.stamp local.time CO CO2 HC NOX afr
## [Y-M-D H:M:S GMT] [s] [vol%] [vol%] [ppmC6] [ppm]
## 39 2005-09-08 11:46:45 38 2.9745 11.727 415.19 138.55 13.867
## ... not showing: 18 cols (elements)
## ... other cols: exh.flow.rate[L/min]; exh.temp[degC]; exh.press[kPa];
## amb.temp[degC]; amb.press[kPa]; amb.humidity[%]; velocity[km/h];
## revolution[rpm]; option.1[V]; option2[V]; option.3[V];
## latitude[d.degLat]; longitude[d.degLon]; altitude[m];
## gps.velocity[km/h]; satellite; n.s; w.e
As with other vectors, subset is not intended for use with pems.elements.
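The calls behind the outputs above are not shown. The following sketch uses two illustrative filters, written with the original pems.1 column names; if columns have been renamed, as earlier in this document, the new names should be used in the filter instead. The 60 km/h threshold and the object names are arbitrary.

```r
library(pems.utils)

# rows logged before local.time reaches 4 seconds
first.rows <- subset(pems.1, local.time < 4)
first.rows

# rows where the vehicle was travelling faster than 60 km/h
fast <- subset(pems.1, velocity > 60)
dim(fast)
```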
summary generates a conventional R data summary for a supplied pems or pems.element, for example:
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 0.00 0.20 15.55 22.27 46.23 69.70
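The call that produced the summary above is not shown. A call of the following form gives a summary of the same shape for a single pems.element; choosing velocity here is an assumption made for illustration.

```r
library(pems.utils)

# summary of one pems.element
vel.summary <- summary(pems.1$velocity)
vel.summary

# summary of the whole pems
summary(pems.1)
```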
units gets or sets the units of a supplied pems or pems.element, for example:
## time.stamp local.time CO CO2 HC NOX afr exh.flow.rate exh.temp
## 1 Y-M-D H:M:S GMT s vol% vol% ppmC6 ppm L/min degC
## exh.press amb.temp amb.press amb.humidity velocity revolution option.1
## 1 kPa degC kPa % km/h rpm V
## option2 option.3 latitude longitude altitude gps.velocity satellite n.s w.e
## 1 V V d.degLat d.degLon m km/h
## time.stamp local.time
## 1 Y-M-D H:M:S GMT s
## afr
## 1
## time.stamp local.time CO CO2 HC NOX afr exh.flow.rate exh.temp
## 1 Y-M-D H:M:S GMT s vol% vol% ppmC6 ppm ratio L/min degC
## exh.press amb.temp amb.press amb.humidity velocity revolution option.1
## 1 kPa degC kPa % km/h rpm V
## option2 option.3 latitude longitude altitude gps.velocity satellite n.s w.e
## 1 V V d.degLat d.degLon m km/h
See also [>pems.utils units] or R help documentation (?pems.units) for more on pems units handling.
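As a sketch of getting and setting units, assuming the example data set pems.1: the object name afr is arbitrary, and the "ratio" label follows the output shown above. The exact calls used to produce that output are not shown, so this is one possible form rather than the original code.

```r
library(pems.utils)

# units of every pems.element in the pems
units(pems.1)

# units of a single pems.element
afr <- pems.1$afr
units(afr)

# set (or replace) the units of that pems.element
units(afr) <- "ratio"
units(afr)
```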
with evaluates an expression within a supplied pems, so the user can work with the pems.elements it contains directly by name, for example:
#rather than
#diff(pems.1$velocity)/diff(pems.1$local.time)
with(pems.1, diff(velocity)/diff(local.time))
## pems.element [n=999]
## [1] 0.0 0.2 0.0 -0.1 0.2 -0.1 0.4 -0.6 0.1 0.0 0.0 -0.1
## [13] 0.1 -0.1 0.2 -0.1 0.1 -0.1 0.2 -0.3 0.1 0.0 -0.1 0.0
## [25] 0.0 0.2 -0.1 -0.1 0.0 0.1 0.1 0.1 -0.3 0.2 0.1 -0.3
## ... not showing: 81 rows
## ... <numeric>
As with other vectors, with is not intended for use with pems.elements.
There are numerous generic functions in R, and pems and pems.element versions are only written for those where a need was identified.
If you think any other generic functions would be useful, please let me know. [>email me]
Similarly, if you have any suggestions on how to make either pems.utils or this document better, or you have any problems using either, please let me know. [>email me]
Return to the [>website index] or [>introduction].
Chambers, J., 2008. Software for data analysis: programming with R. Springer Science & Business Media.