I need to reconstruct missing data for unmonitored boats. For an example boat, "GONCALO", I created a table of the days (and the time of day, either morning or afternoon) that have to be reconstructed from my dataset. My data table contains the monitored boats, the dates they were moving, and the trip number of each trip. So if, for example, "06-05-2018, morning" has to be reconstructed for "GONCALO", I need to find out which trip number from my data to use for the reconstruction, under the following conditions:

  • if "FAIDOCA" is available for this day, I want to use its trip number
  • if "FAIDOCA" is not available, I want to use "RISSO" for this day
  • if neither is available, I take the trip number of any boat that was sampled on this day, preferably at the same time of day, otherwise at any time of day
  • if my data contain no sample for this day at all, a trip number should be picked at random from any other day of "FAIDOCA"
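To make the priority order concrete, here is a minimal sketch of how I imagine these rules as a lookup. Several things in it are assumptions: the helper name `pick_row` is hypothetical, the toy data (including the boat "ALBA") are invented for illustration, the two preferred boats are matched on date and time of day, and the first candidate is taken when several trips qualify.

```r
# Toy data, invented for illustration; column names match my own tables.
data <- data.frame(
  name         = c("FAIDOCA", "RISSO", "RISSO", "ALBA", "FAIDOCA"),
  date         = c("06-05-2018", "06-05-2018", "07-05-2018", "08-05-2018", "09-05-2018"),
  time_of_day  = c("morning", "morning", "morning", "afternoon", "afternoon"),
  numberoftrip = c(11, 12, 13, 14, 15),
  stringsAsFactors = FALSE
)
goncalo <- data.frame(
  date        = c("06-05-2018", "07-05-2018", "08-05-2018", "10-05-2018"),
  time_of_day = c("morning", "morning", "morning", "morning"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: index of the row in `data` to use for one day/time slot.
pick_row <- function(data, day, tod) {
  on_day  <- data$date == day
  in_slot <- on_day & data$time_of_day == tod
  # Candidate rows in order of preference (assumed interpretation of the rules)
  candidates <- list(
    which(in_slot & data$name == "FAIDOCA"),  # 1. FAIDOCA, same day and time
    which(in_slot & data$name == "RISSO"),    # 2. RISSO, same day and time
    which(in_slot),                           # 3. any boat, same day and time
    which(on_day)                             # 4. any boat, same day, any time
  )
  for (rows in candidates) {
    if (length(rows) > 0) return(rows[1])     # assumption: take the first match
  }
  # 5. nothing sampled on this day: random FAIDOCA trip from any day
  faidoca <- which(data$name == "FAIDOCA")
  if (length(faidoca) == 0) return(NA_integer_)
  faidoca[sample.int(length(faidoca), 1)]
}

rows <- vapply(
  seq_len(nrow(goncalo)),
  function(i) pick_row(data, goncalo$date[i], goncalo$time_of_day[i]),
  integer(1)
)

# One row per slot to reconstruct, with the boat and trip number used for it
result <- data.frame(
  date         = goncalo$date,
  time_of_day  = goncalo$time_of_day,
  name         = data$name[rows],
  source_date  = data$date[rows],
  numberoftrip = data$numberoftrip[rows]
)
print(result)
```

Indexing the random fallback through `sample.int()` rather than calling `sample()` on the trip numbers directly avoids the case where `sample(x, 1)` with a single number `x` draws from `1:x` instead of returning `x`.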

Example data:

Unmonitored dates:

Sample data:

This is my code. To keep it simple, I tried to build the loop with fewer conditions first, so some of the conditions listed above are missing. But even at this point it doesn't give me the correct trip number.


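out <- rep(NA, NROW(goncalo))  # one trip number per row to reconstruct

for (i in seq_len(NROW(goncalo))) {
  for (j in seq_len(NROW(data))) {
    # same date and time of day, and the sampled boat is "FAIDOCA"
    if (goncalo$date[i] == data$date[j] && goncalo$time_of_day[i] == data$time_of_day[j] && data$name[j] == "FAIDOCA") {
      out[i] <- data$numberoftrip[j]
    }
  }
  # no match found: fall back to a single randomly chosen trip number
  if (is.na(out[i])) {
    out[i] <- data$numberoftrip[sample.int(NROW(data), 1)]
  }
}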

My aim is to get a table with the trip number, the date the trip was made, and the name of the boat, so that I know which data (trip number) to take to reconstruct the missing dates. I would appreciate any ideas!
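One idea I am considering for the table itself is a left join with `merge()` instead of the nested loop. The sketch below covers only the simplified "FAIDOCA" case; the toy data are invented for illustration, and it assumes at most one "FAIDOCA" trip per date and time of day.

```r
# Toy data, invented for illustration; column names match my own tables.
data <- data.frame(
  name         = c("FAIDOCA", "RISSO", "FAIDOCA"),
  date         = c("06-05-2018", "06-05-2018", "09-05-2018"),
  time_of_day  = c("morning", "morning", "afternoon"),
  numberoftrip = c(11, 12, 15),
  stringsAsFactors = FALSE
)
goncalo <- data.frame(
  date        = c("06-05-2018", "07-05-2018"),
  time_of_day = c("morning", "morning"),
  stringsAsFactors = FALSE
)

# Keep only the preferred boat, then left-join on date and time of day.
faidoca <- data[data$name == "FAIDOCA", ]
result  <- merge(goncalo, faidoca, by = c("date", "time_of_day"), all.x = TRUE)

# Slots without a FAIDOCA trip keep NA in name and numberoftrip,
# so they would still need the fallback rules.
print(result)
```

With `all.x = TRUE` every row of `goncalo` is kept, so the result already has the shape I am after (date, time of day, boat name, trip number), with `NA` marking the slots that still have to be filled.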
