Hi, I have a data frame with the following structure:
> df
LATITUDE1 LONGITUDE1 LATITUDE2 LONGITUDE2 X V Y W Cell1 Cell2
1 -71.2 -180 -71.344 178.97 -72 -72 -180 178 -26100 -25742
2 -71.0 -180 -71.300 177.70 -71 -72 -180 177 -25740 -25743
3 -70.8 -180 -71.300 177.70 -71 -72 -180 177 -25740 -25743
4 -70.6 -180 -71.444 174.30 -71 -72 -180 174 -25740 -25746
5 -70.4 -180 -71.040 175.76 -71 -72 -180 175 -25740 -25745
6 -70.2 -180 -70.499 176.33 -71 -71 -180 176 -25740 -25384
7 -70.0 -180 -70.350 177.03 -70 -71 -180 177 -25380 -25383
8 -69.8 -180 -70.995 176.40 -70 -71 -180 176 -25380 -25384
9 -69.6 -180 -71.309 171.87 -70 -72 -180 171 -25380 -25749
10 -69.4 -180 -71.015 171.42 -70 -72 -180 171 -25380 -25749
I have some R-code that summarizes non-zero transition probabilities from Cell1-levels to Cell2-levels:
counts <- by(df, df$Cell1, function(d) c(table(d$Cell2)/nrow(d)))
> counts1
df$Cell1: -26100
-25742 -25743 -25746 -25745 -25384 -25383 -25749
1 0 0 0 0 0 0
------------------------------------------------------------
df$Cell1: -25740
-25742 -25743 -25746 -25745 -25384 -25383 -25749
0.0 0.4 0.2 0.2 0.2 0.0 0.0
------------------------------------------------------------
df$Cell1: -25380
-25742 -25743 -25746 -25745 -25384 -25383 -25749
0.00 0.00 0.00 0.00 0.25 0.25 0.50
I would like to build a sparse matrix of transition probabilities (zero and non-zero) from this list. Since my list elements are of unequal length, this is rather difficult. I have tried do.call,
but the result is not acceptable, since I would have to look up every Cell level "manually" and determine whether or not it should be zero.
> do.call(rbind, counts)
-25746 -25745 -25743 -25384
-26100 1.0 1.00 1.00 1.0
-25740 0.2 0.20 0.40 0.2
-25380 0.5 0.25 0.25 0.5
Thank you.
EDIT: Using akrin's code below, I get a matrix of the form
do.call(rbind, counts)
-25742 -25743 -25746 -25745 -25384 -25383 -25749
-26100 1 0.0 0.0 0.0 0.00 0.00 0.0
-25740 0 0.4 0.2 0.2 0.20 0.00 0.0
-25380 0 0.0 0.0 0.0 0.25 0.25 0.5
I am expecting results of the form
A B C D
A aa 0 ac 0
B ba bb 0 bd
C 0 cb 0 0
D 0 db 0 0
The table function creates one entry per level when given factors.
If I understood correctly, this is what you want:
This yields:
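A minimal sketch of the factor approach described above. It assumes that only the Cell1 and Cell2 columns of the question's df matter, and that the shared level set lev (a name introduced here for illustration) is simply every cell that appears in either column. Because both factors use the same levels, table returns a square matrix with zeros for transitions that never occur.

```r
# Cell1/Cell2 columns copied from the question's example data
df <- data.frame(
  Cell1 = c(-26100, -25740, -25740, -25740, -25740,
            -25740, -25380, -25380, -25380, -25380),
  Cell2 = c(-25742, -25743, -25743, -25746, -25745,
            -25384, -25383, -25384, -25749, -25749)
)

# Assumption: the level set is every cell seen in either column
lev <- sort(unique(c(df$Cell1, df$Cell2)))

# With factors, table() keeps one row and one column per level,
# including levels with zero counts, so the result is square
t <- table(factor(df$Cell1, levels = lev),
           factor(df$Cell2, levels = lev))

print(t)
```

Rows are the Cell1 levels and columns the Cell2 levels, so t["-25740", "-25743"] holds the number of transitions from cell -25740 to cell -25743.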
If you know that the cells are between e.g. -30000 and 30000 you can simply set
levels=-30000:30000
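As a rough illustration of fixing the levels up front, the sketch below uses a much narrower, hypothetical range that just covers the example cells; cell1 and cell2 are made-up stand-ins for the two columns. Note that the table is dense, with one row and one column per level, so a range as wide as -30000:30000 would give a very large object.

```r
# Hypothetical known range; these bounds merely cover the example cells
lev <- -26100:-25380

# Stand-in vectors for the Cell1 and Cell2 columns
cell1 <- c(-26100, -25740, -25380)
cell2 <- c(-25742, -25743, -25749)

# Every value in the range becomes a level, whether observed or not
t <- table(factor(cell1, levels = lev),
           factor(cell2, levels = lev))

print(dim(t))
```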
EDIT: If you want the probabilities, just normalize the rows, or use prop.table(t, 1) to do it.
With prop.table, however, you end up with NaN in the rows with no entries. You should normalize the rows yourself or, if you prefer the quick and dirty way,
t[is.nan(t)] <- 0
So that you end up with:
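A self-contained sketch of the normalization step, under the same assumptions as before: only the Cell1 and Cell2 columns are reproduced from the question, and lev and p are names chosen here for illustration. Rows with no outgoing transitions produce 0/0 = NaN in prop.table, and are then set to zero.

```r
# Cell1/Cell2 columns copied from the question's example data
df <- data.frame(
  Cell1 = c(-26100, -25740, -25740, -25740, -25740,
            -25740, -25380, -25380, -25380, -25380),
  Cell2 = c(-25742, -25743, -25743, -25746, -25745,
            -25384, -25383, -25384, -25749, -25749)
)

# Assumption: the level set is every cell seen in either column
lev <- sort(unique(c(df$Cell1, df$Cell2)))
t <- table(factor(df$Cell1, levels = lev),
           factor(df$Cell2, levels = lev))

# Row-wise proportions; rows that sum to zero become NaN (0/0)
p <- prop.table(t, 1)

# Quick and dirty fix: treat cells with no outgoing transitions as all-zero rows
p[is.nan(p)] <- 0

print(round(p, 2))
```

Each row that has at least one observation sums to 1, and the remaining rows are all zero, which matches the square layout asked for in the question.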