FUSE model in RHydro package (part 4: HydroMAD compatibility)

This is the fourth of a series of tutorials on the FUSE implementation within the RHydro package. The script for this tutorial is available here. If you are interested in following the discussion related to this post and see how it evolves, join the R4Hydrology community on Google+!

If you want to know what FUSE is, how to prepare your data and run a simple simulation go to the first post of the series, for a basic calibration example (using 1 model structure) go to the second post, while for an example of multi-model calibration got to the third post.

RHydro-HydroMAD compatibility

HydroMAD is an excellent framework for hydrological modelling, optimization, sensitivity analysis and assessment of results. It contains a large set of soil moisture accounting modules and routing functions.

Thanks to Joseph Guillaume (hydromad’s maintainer), FUSE-RHydro is now compatible with HydroMAD, therefore using and calibrating FUSE becomes even easier! Joseph provided many of the examples below, many thanks for that too!

In this tutorial I will show how to:

  1. set up FUSE and its parameter ranges using the hydromad approach
  2. run a simulation
  3. calibrate FUSE using one of the hydromad’s algorithms

Recap from previous posts

Load the package and prepare your default list of models

library(hydromad) 
library(RHydro)
data(modlist)

Read sample data in

temp <- read.csv("dummyData.csv") 

Convert to date the first column and then convert to zoo object

temp[,1] <- as.Date(temp[,1],format="%Y-%m-%d") 
DATA <- read.zoo(temp)

Step 1:  set up FUSE and its parameter ranges using the hydromad approach

HydroMAD allows to specify a rainfall-runoff model of choice. This is achieved by using the hydromad() function and specifying the soil moisture accounting model and routing function to use.

Set the parameter ranges using hydromad.options

hydromad.options(fusesma fusesma.ranges(),
                 fuserouting fuserouting.ranges())
Set up the model
modspec <- hydromad(DATA,
                    sma = "fusesma"                    routing = "fuserouting", 
                    mid = 1:1248, 
                    modlist = modlist)
# Randomly generate 1 parameter set
myNewParameterSet <- parameterSetscoef(modspecwarn=FALSE), 
                                    1                                    method="random" )

Step 2: run a single simulation

Run a simulation using the parameter set generated above
modx <- update(modspec,
               newpars = myNewParameterSet)
Generate a summary of the result
summary(modx)
The instantaneous runoff is
U <- modx$U
The routed discharge is
Qrout <- modx$fitted.values
Plot the Observed vs Simulated value
hydromad:::xyplot.hydromad(modx)
Add the precipitation to the above plot
hydromad:::xyplot.hydromad(modx, with.P=TRUE)

Step 3: calibrate FUSE using one of hydromad’s algorithms

Hydromad provide the “fitBy” method to calibrate using a specified algorithm. As an example, the Shuffled Complex Evolution method can be used as shown below. Please note that the procedure is likely to take a LONG time.

modfit <- fitBySCE(modspec)

Get a summary of the result

summary(modfit)

If you want to use the latest version of fuse, the above steps can be adapted as follows:

library(hydromad) 
library(fuse)
data(modlist)

# Load data
temp <- read.csv("dummyData.csv") 
temp[,1] <- as.Date(temp[,1],format="%Y-%m-%d") 
DATA <- read.zoo(temp)

# Set the parameter ranges using hydromad.options
hydromad.options(fusesma = fusesma.ranges(),
                 fuserouting = fuserouting.ranges())

# Set up the model
modspec <- hydromad(DATA,
                    sma = "fusesma", 
                    routing = "fuserouting",
                    mid = 1:1248,
                    deltim = 1/24)

# Calibrate FUSE using one of hydromad’s algorithms
modfit <- fitBySCE(modspec)

# Get a summary of the result
summary(modfit)

What’s next?

This tutorials was just a brief introduction to a topic that can be explored in many different directions. From a technical point of view, it could be worth to invest some effort into code optimization and/or parallelisation. This would also have an impact on the scientific side facilitating experiments on sensitivity analysis, regionalisation of catchment characteristics and model structure variability.

Some of those ideas are already on their way, some others are just random thoughts. Therefore…watch this space!

Advertisements

FUSE model in RHydro package (part 3: ensemble)

This is the third of a series of tutorials on the FUSE implementation within the RHydro package.  The script for this tutorial is available here. If you are interested in following the discussion related to this post and see how it evolves, join the R4Hydrology community on Google+!

If you want to know what FUSE is, how to prepare your data and run a simple simulation go to the first post of the series, while for a basic calibration example (using 1 model structure) go to the second post.

 

Recap from previous posts

Load the package and prepare your data:

library(RHydro)
temp <- read.csv("dummyData.csv") 
DATA <- zooreg(temp[,2:4], order.by=temp[,1])
myDELTIM <- 1

 

FUSE ensemble

Very often hydrologists decide to use a particular hydrological model based on code availability, familiarity and experience rather than based on hydrological suitability. The real advantage of using FUSE is the possibility to work with an ensemble of multiple models so that uncertainties related to the model structure can be quantified.

The input that defines the model structure is called mid (model identification number) and its value ranges between 1 and 1248. When the most suitable model structure(s) is not known a priori,  the mid can be added to the list of parameters and calibrated.

Adding the full mid-range implies the need to increase significantly the sampling space. There are, however, 4 model structures (called parent models) from which all the other model combinations are derived. In this tutorial I will only consider those 4 model structures.

In this tutorial I will show how to:

A. define the update sampling space (parameter + mid ranges)

B. run a multi-model calibration

C. compare results

 

Step A: define the parameter ranges + mid range

The parent models are as follows:

60 = TOPMODEL,

230 = ARNOXVIC,

342 = PRMS

426 = SACRAMENTO

Therefore mid can be one of those 4 values:

mids <- c(60, 230, 342, 426)

The parameter ranges are defined as in the previous post.

require(tgp)
DefaultRanges <- data.frame(rbind(rferr_add = c(0,0),
                                  rferr_mlt = c(1,1), 
                                  maxwatr_1 = c(25,500), 
                                  maxwatr_2 = c(50,5000),
                                  fracten = c(0.05,0.95), 
                                  frchzne = c(0.05,0.95),
                                  fprimqb = c(0.05,0.95), 
                                  rtfrac1 = c(0.05,0.95), 
                                  percrte = c(0.01,1000), 
                                  percexp = c(1,20), 
                                  sacpmlt = c(1,250), 
                                  sacpexp = c(1,5), 
                                  percfrac = c(0.05,0.95), 
                                  iflwrte = c(0.01,1000), 
                                  baserte = c(0.001,1000), 
                                  qb_powr = c(1,10), 
                                  qb_prms = c(0.001,0.25), 
                                  qbrate_2a = c(0.001,0.25), 
                                  qbrate_2b = c(0.001,0.25), 
                                  sareamax = c(0.05,0.95), 
                                  axv_bexp = c(0.001,3), 
                                  loglamb = c(5,10), 
                                  tishape = c(2,5), 
                                  timedelay = c(0.01,5) )
)
names(DefaultRanges) <- c("Min","Max")
nRuns <- 100
parameters <- lhs( nRuns, as.matrix(DefaultRanges) )
parameters <- data.frame(parameters)
names(parameters) <- c("rferr_add","rferr_mlt","maxwatr_1","maxwatr_2","fracten","frchzne","fprimqb","rtfrac1","percrte","percexp","sacpmlt","sacpexp","percfrac","iflwrte","baserte","qb_powr","qb_prms","qbrate_2a","qbrate_2b","sareamax","axv_bexp","loglamb","tishape","timedelay")

Step B: run a multi-model calibration

Use the Nash-Sutcliffe efficiency as objective function and run the model 4*nRuns times (for each mid and  parameter set).

require(qualV)
indices <- rep(NA,4*nRuns)
discharges <- matrix(NA,ncol=4*nRuns,nrow=dim(DATA)[1])
kCounter <- 0
for (m in 1:4){
  myMID <- mids[m]
  for (pid in 1:nRuns){
  kCounter <- kCounter + 1
  ParameterSet <- as.list(parameters[pid,])
  # Run FUSE Soil Moisture Accounting module
  Qinst <-   fusesma.sim(DATA,
                         mid=myMID,
                         modlist,
                         deltim=myDELTIM,
                         states=FALSE, fluxes=FALSE, fracstate0=0.25,
                         ParameterSet$rferr_add, 
                         ParameterSet$rferr_mlt, 
                         ParameterSet$frchzne, 
                         ParameterSet$fracten, 
                         ParameterSet$maxwatr_1,
                         ParameterSet$percfrac,
                         ParameterSet$fprimqb, 
                         ParameterSet$qbrate_2a, 
                         ParameterSet$qbrate_2b, 
                         ParameterSet$qb_prms, 
                         ParameterSet$maxwatr_2, 
                         ParameterSet$baserte, 
                         ParameterSet$rtfrac1, 
                         ParameterSet$percrte, 
                         ParameterSet$percexp, 
                         ParameterSet$sacpmlt, 
                         ParameterSet$sacpexp, 
                         ParameterSet$iflwrte,  
                         ParameterSet$axv_bexp, 
                         ParameterSet$sareamax, 
                         ParameterSet$loglamb, 
                         ParameterSet$tishape,
                         ParameterSet$qb_powr)

    # Run FUSE Routing module
    Qrout <- fuserouting.sim(Qinst, mid=myMID, 
                             modlist=modlist, 
                             timedelay=ParameterSet$timedelay, 
                             deltim=myDELTIM)

    indices[kCounter] <- EF(DATA$Q,Qrout)
    discharges[,kCounter] <- Qrout
}
}

 

Step C: compare results

Deterministically, the best simulation according to the Nash-Sutcliffe efficiency, is the one with the maximum factor.

bestRun <- which(indices == max(indices))

This corresponds to the model ARNOXVIC, in fact:

bestModel <- function(runNumber){
 if (runNumber<(nRuns+1)) myBestModel <- "TOPMODEL"
 if (runNumber>(nRuns+1) & runNumber<(2*nRuns+1)) myBestModel <- "ARNOXVIC"
 if (runNumber>(2*nRuns+1) & runNumber<(3*nRuns+1)) myBestModel <- "PRMS"
 if (runNumber>(3*nRuns+1) & runNumber<(4*nRuns+1)) myBestModel <- "SACRAMENTO"
 return(myBestModel)
}
bestModel(bestRun)

plot(coredata(DATA$Q),type="l",xlab="",ylab="Streamflow [mm/day]", lwd=0.5)
for(pid in 1:(4*nRuns)){
 lines(discharges[,pid], col="gray", lwd=3)
}
lines(coredata(DATA$Q),col="black", lwd=1)
lines(discharges[,bestRun],col="red", lwd=1)

The plot below shows the observed streamflow in black, all the simulated results in grey and the “best” simulated streamflow in red.

As you can see, using multiple model structures inflates the uncertainty.

FUSE simulations (4 models)

FUSE simulations (4 models)

 

How the best simulation of each model structure compare to each other?

bestRun0060 <- which(indices[1:nRuns] == max(indices[1:nRuns]))
bestRun0230 <- nRuns + which(indices[(nRuns+1):(2*nRuns)] == max(indices[(nRuns+1):(2*nRuns)]))
bestRun0342 <- 2*nRuns + which(indices[(2*nRuns+1):(3*nRuns)] == max(indices[(2*nRuns+1):(3*nRuns)]))
bestRun0426 <- 3*nRuns + which(indices[(3*nRuns+1):(4*nRuns)] == max(indices[(3*nRuns+1):(4*nRuns)]))

plot(coredata(DATA$Q),type="l",xlab="",ylab="Streamflow [mm/day]", lwd=1)
lines(discharges[,bestRun0060], col="green", lwd=1)
lines(discharges[,bestRun0230], col="blue", lwd=1)
lines(discharges[,bestRun0342], col="pink", lwd=1)
lines(discharges[,bestRun0426], col="orange", lwd=1)

legend("top", 
        c("TOPMODEL", "ARNOXVIC", "PRMS","SACRAMENTO"), 
        col = c("green", "blue", "pink", "orange"),
        lty = c(1, 1, 1, 1))

Best simulation for each model structure

Best simulation for each model structure

The plot above shows that TOPMODEL seems the least affected by the initial conditions, while PRMS is the most affected.

What’s next?

FUSE-RHydro is also compatible with HydroMAD, therefore model calibration and assessment becomes even easier (see an example here)!