I want to create a single data frame from the following website: http://www.arrs.net/MaraList/ML_2014.htm
Unfortunately, I am not sure how to turn what seem to be tab-delimited fields into columns of data. The code below produces multiple character strings, but I'm having trouble working out how to keep multi-word names together in a single column, as they are shown on the site.
library(XML)
url<-"http://www.arrs.net/MaraList/ML_2014.htm"
data<-readLines(url)
data<-sub("</FONT></b><FONT SIZE=\"2\" <FONT COLOR=\"#00000\" FACE=\"Courier New, Courier\">","",data)
data<-sub("<B><FONT COLOR=\"#0066FF\" FACE=\"Arial\">","",data)
data<-read.table(textConnection(data),stringsAsFactors=FALSE)
data<-data[11:40000,1]
I'm not sure the code I have can get me there. Any information or links to prior posts would be appreciated.
Here's one approach to read this in (using two packages I maintain and the terrific splitstackshape package). You'll need the dev version of qdapTools.
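As a rough base-R illustration of the splitting step only (this is not the package-based approach mentioned above), here is a minimal sketch. It assumes that fields on each line are separated by two or more spaces while words inside a name are separated by a single space. The sample lines, the field count, and the column names are all hypothetical placeholders, not data from the page.

```r
# Hypothetical sample lines standing in for rows of the page (not real results).
lines <- c(
  "<B>2:10:00  Runner One        USA  01 Jan 80  1  Some City</B>",
  "2:11:30  Another Runner X  KEN  02 Feb 85  2  Other Town"
)

# Strip any HTML tags, then trim leading and trailing whitespace.
clean <- trimws(gsub("<[^>]+>", "", lines))

# Split on runs of two or more spaces, so single-spaced names stay in one field.
parts <- strsplit(clean, " {2,}")

# Keep only rows with the expected number of fields (assumed to be 6 here).
n_fields <- 6
parts <- parts[lengths(parts) == n_fields]

# Bind the rows into a character matrix and convert it to a data frame.
df <- as.data.frame(do.call(rbind, parts), stringsAsFactors = FALSE)
names(df) <- c("time", "name", "country", "birth_date", "place", "city")  # assumed names
```

If the real page turns out to use strictly fixed-width columns, `read.fwf()` with explicit widths may be a better fit than splitting on whitespace.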