I have a data frame called wind_DNSeason that contains a date in POSIXct format, hourly wind speed and hourly wind direction. My goal is to get frequency counts of wind speed according to the Beaufort scale, by season and daylight. The data look like this:
date wspd_havg10m_kn avg_wdir
1 2013-12-06 00:25:00 9.835853 50
2 2013-12-06 01:25:00 10.506479 56
3 2013-12-06 02:25:00 11.847732 55
4 2013-12-06 03:25:00 8.494600 53
5 2013-12-06 04:25:00 13.188985 47
6 2013-12-06 05:25:00 13.188985 60
Adding season based on the date:
wind_DNSeason$season<-time2season(wind_DNSeason$date, out.fmt="seasons", type="default")
Then I am cutting the data into daylight and nighttime using the openair package:
wind_DNSeason$daylight <- cutData(wind, type = "daylight", local.hour.offset = -8, latitude = 54.312519, longitude = -130.305405, local.tz= "Canada/Pacific")
I am aware of the function aggregate but I doubt I am using it correctly:
aggregate(wspd_havg10m_kn ~ season + daylight, wind_DNSeason, length)
That gives me the count of occurrences, but that is not what I want. Am I trying to do too much in one step?
I need the counts of the occurring wind speeds grouped by the breaks below, per season and split into day and night, because I would like to create barplots of the different frequencies.
breaks=c(0,1,3,6,10,16, 21, 27, 33, 40, 47)
Could I get something that looks roughly like this, from which I could then easily calculate the percentages for the barplots?
season daylight total_count wspd<=1 wspd>1,<=3 wspd>3,<=6 etc
1 autumm daylight 854 151 34 56
2 spring daylight 2580 456 56 98
3 summer daylight 1722 34 344 09
4 winter daylight 852 545 55 55
5 autumm nighttime 1030 55 6 777
6 spring nighttime 1825 89 89 344
7 summer nighttime 827 344 55 66
8 winter nighttime 1533 34 66 777
Any ideas? Thanks for any help!
I tried using dplyr and I think I am really close but somehow it doesn't seem to add up the frequencies correctly. This is how I applied the suggested code:
a<-wind_DNSeason %>% group_by(season,daylight) %>%
mutate(count=n(),"wspd<=1" = sum(wspd_havg10m_kn<=1),
"wspd>1,<=3" = sum(wspd_havg10m_kn > 1 & wspd_havg10m_kn <= 3, na.rm=TRUE),
"wspd>3,<=6" = sum(wspd_havg10m_kn > 3 & wspd_havg10m_kn <= 6,na.rm=TRUE),
"wspd>6,<=10" = sum(wspd_havg10m_kn > 6 & wspd_havg10m_kn <= 10,na.rm=TRUE),
"wspd>10,<=16" = sum(wspd_havg10m_kn > 10 & wspd_havg10m_kn <= 16,na.rm=TRUE),
"wspd>16,<=21" = sum(wspd_havg10m_kn > 16 & wspd_havg10m_kn <= 21,na.rm=TRUE),
"wspd>21,<=27" = sum(wspd_havg10m_kn > 21 & wspd_havg10m_kn <= 27,na.rm=TRUE),
"wspd>27,<=33" = sum(wspd_havg10m_kn > 27 & wspd_havg10m_kn <= 33,na.rm=TRUE),
"wspd>33,<=40" = sum(wspd_havg10m_kn > 33 & wspd_havg10m_kn <= 40,na.rm=TRUE),
"wspd>40,<=47" = sum(wspd_havg10m_kn > 33 & wspd_havg10m_kn <= 47,na.rm=TRUE))
The output looks like this. I selected some of the unique rows, because the result is duplicated across the whole df (e.g. for winter daylight and nighttime):
date wspd_havg10m_kn avg_wdir daylight season count wspd<=1 wspd>1,<=3 wspd>3,<=6 wspd>6,<=10 wspd>10,<=16 wspd>16,<=21 wspd>21,<=27 wspd>27,<=33 wspd>33,<=40 wspd>40,<=47
1 2013-12-06 00:25:00 9.8358531 50 nighttime winter 2751 NA 59 185 315 551 260 106 47 6 6
2 2013-12-06 12:25:00 7.3768898 57 daylight winter 1449 NA 13 73 251 322 133 46 13 0 0
Shouldn't the frequencies of the different groups add up to the total count? The total df contains 13368 timesteps, but if I add up the frequencies for each group I only get 11165. There are no wind speeds larger than the upper bound of the largest group. What am I missing?
Here's a dplyr solution: you can add on as many columns for wind strengths as you want, filling out the names and requirements.
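As a minimal sketch of what such a dplyr summary could look like (not necessarily the exact code this answer had in mind): the data frame `toy` and its values are made up for illustration, the `n_missing` column is my own addition, and `.groups = "drop"` assumes dplyr 1.0 or later. It uses `summarise()` rather than `mutate()` so that each season/daylight group collapses to one row, passes `na.rm = TRUE` in every bin so that missing wind speeds do not turn a count into NA, and starts the last bin at 40 so it does not overlap the previous one.

```r
library(dplyr)

# Toy data standing in for wind_DNSeason (values are invented for illustration)
toy <- data.frame(
  season   = c("winter", "winter", "winter", "winter", "summer", "summer"),
  daylight = c("daylight", "daylight", "nighttime", "nighttime",
               "daylight", "daylight"),
  wspd_havg10m_kn = c(0.5, 2.0, 5.0, NA, 2.5, 45.0)
)

freq <- toy %>%
  group_by(season, daylight) %>%
  summarise(
    total_count = n(),                             # every row, including NA speeds
    n_missing   = sum(is.na(wspd_havg10m_kn)),     # rows that fall into no bin
    `wspd<=1`    = sum(wspd_havg10m_kn <= 1, na.rm = TRUE),
    `wspd>1,<=3` = sum(wspd_havg10m_kn > 1 & wspd_havg10m_kn <= 3, na.rm = TRUE),
    `wspd>3,<=6` = sum(wspd_havg10m_kn > 3 & wspd_havg10m_kn <= 6, na.rm = TRUE),
    # ... the bins between 6 and 40 follow the same pattern ...
    `wspd>40,<=47` = sum(wspd_havg10m_kn > 40 & wspd_havg10m_kn <= 47, na.rm = TRUE),
    .groups = "drop"
  )

print(freq)
```

If the bin counts plus `n_missing` equal `total_count` in every row, nothing is being lost or double counted. With a long list of breaks, base R's `cut()` applied to the `breaks` vector from the question, followed by a count per season, daylight and bin, may be less error-prone than typing each condition by hand.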