Title: | Miscellaneous Utilities Developed by the Predictive Ecology Group |
---|---|
Description: | Miscellaneous utilities developed by the Predictive Ecology Group (<http://predictiveecology.org>). |
Authors: | Eliot J B McIntire [aut, cre] , Alex M Chubaty [aut] , His Majesty the King in Right of Canada, as represented by the Minister of Natural Resources Canada [cph] |
Maintainer: | Eliot J B McIntire <[email protected]> |
License: | GPL-3 |
Version: | 0.0.4.9011 |
Built: | 2024-11-07 02:42:00 UTC |
Source: | https://github.com/PredictiveEcology/pemisc |
pemisc package
Miscellaneous utilities developed by the Predictive Ecology Group (https://predictiveecology.org).
This reports the 'available' memory from a system call: free on Linux, vm_stat on macOS, or wmic on Windows.
If none of these is installed on the system, returns NULL.
availableMemory()
Numeric of class "object_size", so it can be reported in any units with format, e.g., format(availableMemory(), units = "GB").
See man free for a description of available memory estimation.
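As a minimal sketch of how an "object_size" value can be reported in different units, the snippet below builds a stand-in value by hand instead of calling availableMemory(), so it runs without pemisc installed. The variable names and the 8 GiB figure are illustrative assumptions only.

```r
# Stand-in for the value returned by availableMemory(); the number is made up.
mem <- structure(8 * 1024^3, class = "object_size")

# availableMemory() may return NULL, so guard before formatting.
if (!is.null(mem)) {
  memGB   <- format(mem, units = "GB")
  memMB   <- format(mem, units = "MB")
  memAuto <- format(mem, units = "auto")
  print(c(memGB, memMB, memAuto))
}
```

The same format() call should work on the real return value, since base R dispatches on the "object_size" class.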
Calculates tree biomass based on DBH, or on DBH and height.
biomassCalculation(species, DBH, includeHeight, height, equationSource)

## S4 method for signature 'character,numeric,logical,numeric,character'
biomassCalculation(species, DBH, includeHeight, height, equationSource)

## S4 method for signature 'character,numeric,missing,numeric,character'
biomassCalculation(species, DBH, height, equationSource)

## S4 method for signature 'character,numeric,logical,numeric,missing'
biomassCalculation(species, DBH, includeHeight, height)

## S4 method for signature 'character,numeric,missing,numeric,missing'
biomassCalculation(species, DBH, height)

## S4 method for signature 'character,numeric,missing,missing,character'
biomassCalculation(species, DBH, equationSource)

## S4 method for signature 'character,numeric,missing,missing,missing'
biomassCalculation(species, DBH)
species |
Character string giving the species name. |
DBH |
Numeric. The tree's diameter at breast height (DBH, cm). |
includeHeight |
Logical. Whether the biomass is calculated based on DBH and height.
If |
height |
Numeric. The tree's height (m). |
equationSource |
Character. Determine the sources of equations.
Currently, this function has two options, i.e., |
Biomass (kg) and a missedSpecies list of the species for which biomass was not calculated.
Yong Luo
## Not run:
DBH <- seq(1, 100, 5)
species <- c(rep("jack pine", 10), rep("black spruce", 10))
species[1] <- "wrongSpecies"
height <- seq(20, 40, length.out = 20)

# without height information, taking the equations from Lambert 2005
biomass1 <- biomassCalculation(species = species, DBH = DBH)

# with height information, taking the equations from Lambert 2005
biomass2 <- biomassCalculation(species = species, DBH = DBH, includeHeight = TRUE, height = height)
## End(Not run)
Create a .prj file
For cases where a shapefile is missing its associated .prj file.
createPrjFile( shpFile, urlForProj = "http://spatialreference.org/ref/epsg/nad83-utm-zone-11n/prj/" )
shpFile |
The filename of a shapefile to add |
urlForProj |
The url from which to fetch the projection, e.g.,
|
raster::factorValues()
Note there is an option (na.rm) to remove the NAs; setting it to TRUE makes this MUCH faster.
factorValues2(x, v, layer, att, append.names, na.rm = FALSE)
x |
Raster* object |
v |
integer cell values |
layer |
integer > 0 indicating which layer to use (in a RasterStack or RasterBrick) |
att |
numeric or character. Which variable(s) in the RAT table should be used. If |
append.names |
logical. Should names of the data.frame returned be a combination of the name of the layer and the RAT variables? (can be useful for multilayer objects) |
na.rm |
Logical. If |
Search among local objects (which will often be arguments passed into a function) as well as dot objects to match the formals needed by fn.
If localFormalArgs is named, then it will match the formal (name of localFormalArgs) with the local object, e.g., localFormalArgs = c(x = "obj") will find the object in the local environment called "obj", and this will be found because it matches the x argument in fn.
getLocalArgsFor(fn, localFormalArgs, envir, dots)
fn |
Function name(?) |
localFormalArgs |
A (named) character vector or arguments to |
envir |
The environment in which to (???) |
dots |
TODO: need description |
List of named objects. The names are the formals in fn, and the objects are the values for those formals.
This can easily be passed to do.call(fn, args1).
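The matching of local objects to formals described above can be sketched in base R. This is a hypothetical helper (localArgsSketch), not the pemisc implementation; it handles only the envir lookup and ignores dots, and it assumes that unnamed entries of localFormalArgs name both the formal and the object.

```r
# Hypothetical sketch: look up local objects and rename them to fn's formals.
localArgsSketch <- function(fn, localFormalArgs, envir = parent.frame()) {
  objNames <- unname(localFormalArgs)
  formalNames <- names(localFormalArgs)
  if (is.null(formalNames)) formalNames <- objNames
  # Assumption: an unnamed entry means the object has the same name as the formal
  formalNames[formalNames == ""] <- objNames[formalNames == ""]

  # Keep only entries that correspond to a formal of fn
  keep <- formalNames %in% names(formals(fn))
  objNames <- objNames[keep]
  formalNames <- formalNames[keep]

  # Keep only objects that actually exist in envir
  found <- vapply(objNames, exists, logical(1), envir = envir, inherits = FALSE)
  stats::setNames(mget(objNames[found], envir = envir), formalNames[found])
}

fn <- function(x, n) rep(x, n)
wrapper <- function() {
  obj <- "a"
  n <- 3
  args1 <- localArgsSketch(fn, c(x = "obj", "n", "unused"), envir = environment())
  do.call(fn, args1)
}
wrapper()
```

Here "obj" is passed as x, "n" is passed as n, and "unused" is dropped because fn has no such formal.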
Get the package name from a GitHub repo/package@branch string
ghPkgName(x)
x |
character vector of package names |
a named character vector
pkgs <- c("dplyr", "PredictiveEcology/pemisc", "PredictiveEcology/SpaDES.core@development")
ghPkgName(pkgs) ## "dplyr" "pemisc" "SpaDES.core"
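For readers without pemisc installed, the same extraction can be approximated with two regular expressions in base R. The function name ghPkgNameSketch is hypothetical, and this is only an assumption about how such a string might be parsed, not the package's own code.

```r
# Hypothetical sketch: strip "@branch" first, then everything up to the last "/".
ghPkgNameSketch <- function(x) {
  noBranch <- sub("@.*$", "", x)
  pkgName <- sub("^.*/", "", noBranch)
  stats::setNames(pkgName, x)
}

pkgs <- c("dplyr", "PredictiveEcology/pemisc", "PredictiveEcology/SpaDES.core@development")
pkgNames <- ghPkgNameSketch(pkgs)
pkgNames
```

Removing the branch before the repository prefix keeps branch names that themselves contain a / from confusing the second substitution.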
When running arbitrary functions inside other functions, there is a common construct in R to use .... It does not work, however, in the general case to write do.call(fn, list(...)) because not all fn themselves accept .... So this will fail if too many arguments are supplied to the .... In the general case, we want to write: do.call(fn, list(onlyTheArgumentsThatAreNeeded)). This function helps to find the onlyTheArgumentsThatAreNeeded by determining a) what is needed by the fn (which can be a list of many fn), and b) where to find values, either in an arbitrary environment or passed in via dots.
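The failure mode described above, and the general remedy of filtering by the formals of fn, can be reproduced in a few lines of base R. The names addXY, naiveWrapper, onlyNeeded and filteredWrapper are illustrative and not part of pemisc.

```r
addXY <- function(x, y) x + y

# Passing everything in ... fails because addXY has no argument z and no ...
naiveWrapper <- function(...) do.call(addXY, list(...))
naiveResult <- tryCatch(
  naiveWrapper(x = 1, y = 2, z = 3),
  error = function(e) conditionMessage(e)
)
naiveResult  # an "unused argument" error message

# Keep only the arguments whose names match the formals of the target function
onlyNeeded <- function(fn, dots) dots[names(dots) %in% names(formals(fn))]
filteredWrapper <- function(...) do.call(addXY, onlyNeeded(addXY, list(...)))
filteredResult <- filteredWrapper(x = 1, y = 2, z = 3)
filteredResult
```

This sketch only filters by name; it does not look for values in an environment, which the function documented here also does.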
identifyVectorArgs(fn, localFormalArgs, envir, dots)
fn |
A function or list of functions from which to run |
localFormalArgs |
A vector of possible objects, e.g., from |
envir |
The environment to find the objects named in |
dots |
Generally list(...), which would be an alternative place to find
|
A list of length 2, named argsSingle and argsMulti, which can be passed to e.g., MapOrDoCall(fn, multiple = args1$argsMulti, single = args1$argsSingle)
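To illustrate the shape of the returned list, the sketch below splits a list of arguments by length. The rule used here (length greater than 1 means "multi") is an assumption made for illustration; the actual criteria used by identifyVectorArgs are not stated in this page, and splitArgsSketch is a hypothetical name.

```r
# Hypothetical sketch: separate arguments to iterate over from constant ones.
splitArgsSketch <- function(args) {
  isMulti <- lengths(args) > 1
  list(argsSingle = args[!isMulti], argsMulti = args[isMulti])
}

args1 <- splitArgsSketch(list(x = 1:3, y = 10, label = "a"))
str(args1)
```

The two elements map directly onto the single and multiple arguments of MapOrDoCall.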
Determines whether a string that may correspond to a package name (e.g., repo/package@branch) could be a package installed from GitHub.
This is determined solely by the presence of a / in the string.
See example below.
isGitHubPkg(x)
x |
character vector of package names |
a named logical vector
pkgs <- c("dplyr", "PredictiveEcology/pemisc", "PredictiveEcology/SpaDES.core@development")
isGitHubPkg(pkgs) ## FALSE TRUE TRUE
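Because the test is solely the presence of a /, an equivalent check can be written with grepl. The sketch below is a base R stand-in under that stated rule (the variable names are illustrative) and shows one way the result might be used to separate CRAN-style names from GitHub-style ones.

```r
pkgs <- c("dplyr", "PredictiveEcology/pemisc", "PredictiveEcology/SpaDES.core@development")

# fixed = TRUE treats "/" as a literal character rather than a regular expression
isGH <- stats::setNames(grepl("/", pkgs, fixed = TRUE), pkgs)

cranLike <- pkgs[!isGH]
ghLike <- pkgs[isGH]
isGH
```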
makeForkCluster with random seed set
This will set different random seeds on the clusters (not the default) with makeForkCluster.
It also defaults to creating a logfile with message of where it is.
makeClusterRandom(
  ...,
  type = "SOCK",
  iseed = NULL,
  libraries = NULL,
  objects = NULL,
  envir = parent.frame()
)

makeForkClusterRandom(..., iseed = NULL)

makeSockClusterRandom(..., iseed = NULL)
... |
passed to |
type |
One of the supported types: see ‘Details’. |
iseed |
passed to |
libraries |
A character vector of libraries to load in the SOCK cluster. This is ignored if a "FORK" cluster |
objects |
a character string of objects that are required inside the SOCK cluster. Ignored if type != "SOCK" |
envir |
Required if |
makeIpsForNetworkCluster is a simple wrapper around makeIps.
makeIpsForNetworkCluster(
  ipStart = "10.20.0",
  ipEnd = c(68, 97, 189, 213, 220, 58, 106, 184, 217),
  availableCores = c(50, 50, 50, 50, 50, 50, 23, 23, 23),
  availableRAM = c(950, 500, 500, 500, 500, 500, 245, 245, 245),
  nProcess = 8,
  proc = "cores",
  internalProcesses = 10,
  sizeGbEachProcess = 35,
  localHostEndIp = 68
)

makeIps(machines, ipStart, proc, nProcess, sizeGbEachProcess)
ipStart |
Network address prefix (i.e., the first three octets of the IP address) |
ipEnd |
Host IP address identifier (i.e., the final octet of the IP address) |
availableCores |
the number of available threads on each machine. |
availableRAM |
the available RAM on each machine in GB |
nProcess |
the number of processes |
proc |
one of |
internalProcesses |
DESCRIPTION NEEDED |
sizeGbEachProcess |
the size in GB of each process |
localHostEndIp |
the address in |
machines |
|
A vector of IP addresses associated with each machine in the network cluster.
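As a rough illustration of how the arguments relate, the sketch below assembles full addresses from ipStart and ipEnd and estimates how many processes each machine could host. The capacity rule (the smaller of the thread count and the number of processes that fit in RAM) is an assumption made for this example, not the documented allocation logic; the values are three of the machines from the defaults shown in the usage.

```r
ipStart <- "10.20.0"
ipEnd <- c(68, 97, 106)
availableCores <- c(50, 50, 23)
availableRAM <- c(950, 500, 245)   # GB
sizeGbEachProcess <- 35

# Full addresses: prefix plus final part
ips <- paste(ipStart, ipEnd, sep = ".")

# Assumed rule: a machine is limited by whichever runs out first, threads or RAM
capacity <- pmin(availableCores, floor(availableRAM / sizeGbEachProcess))

data.frame(ip = ips, capacity = capacity)
```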
Given the size of a problem, it may not be useful to create a cluster. This will make a fork cluster (so Linux only).
makeOptimalCluster( useParallel = getOption("pemisc.useParallel", FALSE), MBper = 500, maxNumClusters = parallelly::availableCores(constraints = "connections"), assumeHyperThreads = FALSE, ... )
useParallel |
Logical or numeric. If |
MBper |
Numeric. Passed to |
maxNumClusters |
Numeric or Integer. The theoretical upper limit for number of nodes to use with the cluster. |
assumeHyperThreads |
Logical. If |
... |
Passed to |
Map and parallel::clusterMap together
This will send to Map or clusterMap, depending on whether cl is provided.
Because they use different argument names for the main function to call, leave that argument unnamed.
Map2(f, ..., cl = NULL)
f |
passed as |
... |
passed to |
cl |
A cluster object, passed to |
## Not run:
a <- 1:5
Map2(a, f = function(x) x)
## End(Not run)
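The dispatch described for Map2 can be sketched as a thin wrapper. In base R, Map names its function argument f while parallel::clusterMap names it fun, which is why the function is passed positionally here. map2Sketch is a hypothetical name and only the serial path is exercised below.

```r
# Hypothetical sketch: serial Map when no cluster is given, clusterMap otherwise.
map2Sketch <- function(f, ..., cl = NULL) {
  if (is.null(cl)) {
    Map(f, ...)
  } else {
    parallel::clusterMap(cl, f, ...)
  }
}

sums <- map2Sketch(function(x, y) x + y, 1:3, 4:6)
unlist(sums)
```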
Map/lapply all in one
Usually run after identifyVectorArgs, which will separate the arguments into vectors of values for a call to Map, and arguments that have only one value (passed to MoreArgs in Map). If all are single length arguments, then it will pass to lapply. If a cl is provided and is non-NULL, then it will pass all arguments to clusterMap or clusterApply.
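A simplified base R sketch of this dispatch is shown below. It assumes the arguments have already been split into multiple and single lists, uses a plain do.call in place of the lapply path, and omits the cluster and cache handling entirely, so it should be read as an illustration of the Map/MoreArgs pattern only. mapOrDoCallSketch and pasteSum are hypothetical names.

```r
# Hypothetical sketch: iterate over `multiple`, hold `single` constant.
mapOrDoCallSketch <- function(fn, multiple, single) {
  if (length(multiple) == 0) {
    # Nothing to iterate over: one ordinary call (stand-in for the lapply path)
    return(do.call(fn, single))
  }
  do.call(Map, c(list(f = fn), multiple, list(MoreArgs = single)))
}

pasteSum <- function(x, y, label) paste0(label, x + y)
mapped <- mapOrDoCallSketch(pasteSum, multiple = list(x = 1:3), single = list(y = 10, label = "a"))
single <- mapOrDoCallSketch(function(y) y * 2, multiple = list(), single = list(y = 2))
unlist(mapped)
```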
MapOrDoCall(fn, multiple, single, useCache = FALSE, cl = NULL)
fn |
The function that will be run via |
multiple |
This is a list of the arguments that Map will cycle over. |
single |
Passed to |
useCache |
Logical indicating whether to use the cache. |
cl |
A cluster object or |
identifyVectorArgs
Sends to message, but in a structured way so that a data.frame-like object can be cleanly sent to messaging.
messageDF(df, round, colour = NULL)
df |
A data.frame, data.table, matrix |
round |
An optional numeric to pass to |
colour |
An optional colour to use from |
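One common way to route a printed table through message is to capture the printed lines and send them as a single message. The sketch below shows that technique with a hypothetical messageDFSketch; it is an assumption about the general approach, omits the colour option, and is not the package's implementation.

```r
# Hypothetical sketch: print a data.frame-like object via message().
messageDFSketch <- function(df, round = NULL) {
  df <- as.data.frame(df)
  if (!is.null(round)) {
    isNum <- vapply(df, is.numeric, logical(1))
    if (any(isNum)) {
      # base::round is called explicitly because `round` is also an argument name
      df[isNum] <- lapply(df[isNum], base::round, digits = round)
    }
  }
  printed <- utils::capture.output(print(df))
  message(paste(printed, collapse = "\n"))
}

messageDFSketch(data.frame(id = 1:2, value = c(1.2345, 2.3456)), round = 2)
```

Because the output goes through message, it can be silenced with suppressMessages or redirected like any other message.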
RasterStack
Rescales the values of each RasterLayer to the range [0,1].
normalizeStack(x)
x |
A |
Tati Micheletti
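The rescaling itself is ordinary min-max normalization applied layer by layer. The sketch below applies it to a plain list of matrices standing in for raster layers, so it runs without the raster package; normalize01 is a hypothetical helper, and returning zeros for a constant layer is an assumption of this example.

```r
# Hypothetical sketch: min-max rescaling of one layer's values to [0, 1].
normalize01 <- function(v) {
  rng <- range(v, na.rm = TRUE)
  if (rng[1] == rng[2]) return(v * 0)  # constant layer: assumed to map to 0
  (v - rng[1]) / (rng[2] - rng[1])
}

# Matrices stand in for RasterLayer values here
layers <- list(
  a = matrix(c(2, 4, 6, NA), nrow = 2),
  b = matrix(c(-1, 0, 1, 3), nrow = 2)
)
normalized <- lapply(layers, normalize01)
normalized
```

Each layer is rescaled against its own minimum and maximum, and NA cells stay NA.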
This uses ps -ef, so it only works on unix-alikes. It will search for the percent CPU use and select only those above minCPU (50 by default).
numActiveThreads(pattern = "--slave", minCPU = 50)
pattern |
Character string that will be matched to the |
minCPU |
A numeric indicating what percent is the minimum to be considered "active" |
A numeric of the number of active threads that match the pattern
Eliot McIntire
## Not run:
## Determine how many threads are used in each remote machine in a cluster
cores <- "localhost" # put other machine names here
uniqueCores <- unique(cores)
cl <- future::makeClusterPSOCK(uniqueCores, revtunnel = TRUE)
parallel::clusterExport(cl, "numActiveThreads")
out <- parallel::clusterEvalQ(cl, {
  numActiveThreads()
})
names(out) <- uniqueCores
unlist(out)
parallel::stopCluster(cl)
## End(Not run)
Optimally determine the number of cores to use to set up a new cluster, based on:
the number of cores available (see note);
the amount of free memory available on the local machine;
the number of cores requested vs. the number available, such that if requesting more cores than available, the number of cores used will be adjusted to be a multiple of the number of cores needed, so jobs can be run in approximately-even-sized batches. (E.g., if 16 cores available but need 50, the time taken to run 3 batches of 16 plus a single batch of 2 – i.e., 4 batches total – is the same as running 4 batches of 13.)
optimalClusterNumGeneralized(
  memRequiredMB = 500,
  maxNumClusters = parallelly::availableCores(constraints = "connections"),
  NumCoresAvailable = parallelly::availableCores(constraints = "connections"),
  availMem = pemisc::availableMemory()/1e+06
)

optimalClusterNum(
  memRequiredMB = 500,
  maxNumClusters = parallelly::availableCores(constraints = "connections")
)
memRequiredMB |
The amount of memory needed in MB |
maxNumClusters |
The number of nodes needed (requested) |
NumCoresAvailable |
The number of cores available on the local machine (see note). |
availMem |
The amount of free memory (RAM) available to use. |
integer specifying the number of cores
R hardcodes the maximum number of socket connections it can use (currently set to 128 in R 4.1). Three of these are reserved for the main R process, so practically speaking, a user can create at most 125 connections e.g., when creating a cluster. See https://github.com/HenrikBengtsson/Wishlist-for-R/issues/28.
We limit this a bit further here just in case the user already has open connections.
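The batching arithmetic in the third criterion above can be worked through in a few lines. The sketch below is a hypothetical helper (optimalCoresSketch) that combines a memory cap with the even-batch adjustment; the way the memory limit is combined with the core count is an assumption, and the connection-limit safety margin mentioned in the note is left out.

```r
# Hypothetical sketch of the even-batch rule described above.
optimalCoresSketch <- function(nNeeded, nAvailable, availMemMB, memRequiredMB) {
  # Assumed memory cap: how many processes of memRequiredMB fit in free memory
  byMemory <- floor(availMemMB / memRequiredMB)
  usable <- max(1, min(nAvailable, byMemory))
  if (nNeeded <= usable) return(as.integer(nNeeded))
  # Number of batches needed at full width, then the smallest width that
  # still finishes in that many batches
  nBatches <- ceiling(nNeeded / usable)
  as.integer(ceiling(nNeeded / nBatches))
}

# 50 jobs on 16 cores: 4 batches either way, so 13 cores suffice
nCores <- optimalCoresSketch(nNeeded = 50, nAvailable = 16, availMemMB = 64000, memRequiredMB = 500)
nCores
```

With 16 cores, 50 jobs need ceiling(50 / 16) = 4 batches, and ceiling(50 / 4) = 13 cores are enough to finish in those same 4 batches.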
Uses igraph and Require::pkgDep.
pkgDepsGraph( pkgs = c("LandR", "pemisc", "map", "SpaDES", "SpaDES.tools", "SpaDES.core", "SpaDES.addins", "SpaDES.shiny", "reproducible", "quickPlot"), plot.it = TRUE )
pkgs |
A character vector of package names. Default is
|
plot.it |
Logical. If |
A list of 2: dt, a data.table of the dependencies, and dtGraph, an igraph object that can be plotted with plot().
Downloads data from CWFIS Datamart at http://cwfis.cfs.nrcan.gc.ca/datamart.
This runs prepInputs internally, so users can pass studyArea, etc.
prepFireCanada( year, type = c("NBAC", "Polygon", "Point"), urlBase = "http://cwfis.cfs.nrcan.gc.ca/downloads/nbac/", ... )
year |
Numeric, length 1. Which year, from 1986 to 2018 (currently) to download |
type |
Either "NBAC", "Polygon" or "Point" to get the National Burn Area Composite, the Polygon or the Point datasets. |
urlBase |
The url of the directory where the NBAC are stored. Default is the currently known url. If this url becomes stale, please notify the predictive ecology team. |
... |
Additional arguments. |
A SpatialPolygonsDataFrame plus several downloaded files, including the ‘.zip’ archive and the extracted files.
Because it is running prepInputs, checksumming is occurring too.
## Not run:
library(sf)
# This will download 2 recent years
NBAC <- lapply(2016:2017, function(yr) prepFireCanada(yr))
yr <- 2017
Points <- prepFireCanada(yr, type = "Point", fun = "st_read")
Polygons <- prepFireCanada(yr, type = "Polygon")
## End(Not run)
This extracts or creates a new raster layer, intended to be used as the rasterToMatch argument in further prepInputs calls.
rasterToMatch(x, ...)

## S4 method for signature 'Raster'
rasterToMatch(x, studyArea, ...)

## S4 method for signature 'SpatialPolygonsDataFrame'
rasterToMatch(x, studyArea, rasterToMatch, ...)
x |
A Raster Layer with correct resolution and origin. |
... |
Additional arguments |
studyArea |
A |
rasterToMatch |
The raster to match in a |
A RasterLayer object.
Deprecated functionality
reproducibilityReceipt(title)
title |
Header title for the inserted details section. |
Similar to terms, but this is used on a quoted model and will only return unique matches in data.
termsInData(model, data)
model |
A quoted model statement |
data |
A data.frame-like object with column names in which to match terms in
|
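The idea of matching the variables of a quoted model against the columns of a data object can be sketched with all.vars. The helper termsInDataSketch is hypothetical, and whether the real function also returns the response variable, as this sketch does, is not stated in this page.

```r
# Hypothetical sketch: unique variables of a quoted model that are columns of data.
termsInDataSketch <- function(model, data) {
  unique(intersect(all.vars(model), colnames(data)))
}

model <- quote(y ~ x + log(z) + x:w)
dat <- data.frame(y = 1:3, x = 4:6, w = 7:9)
matched <- termsInDataSketch(model, dat)
matched
```

Here z is dropped because dat has no such column, and x appears once even though the model uses it twice.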