I have a VCorpus "oanc" and I want to convert all the words to lower case, so I use the following call:
oanc1 <- tm_map(oanc, content_transformer(tolower))
But I got a warning:
Warning message:
In mclapply(content(x), FUN, ...) :
scheduled cores 2 encountered errors in user code, all values of the jobs will be affected
The VCorpus "oanc" is 586 MB in size, while "oanc1" is only 4 MB. In addition, all the contents except the first text are broken, and when I run
writeLines(as.character(oanc1[[2]]))
I got
Error in FUN(content(x), ...) :
invalid input 'O<8c><be>BĭĪ<e2>=<f3><81>̡@>9<c2>Au<b7>l<99><c5>u <c4>%<a0>[,<9c><93><b8><90>w<b7><97><f7>58<e3><d7>><91><bf>"~WD<cf>2<c3><84>1GQ<dd><ed>ـ\<e2><fb><f3><d3>X]<fe>5t!<9f><89>ٍdH<e3><d6>Zu<bc><e8><b6>_RS<f0><f7><81><eb>E<f0><bd>Ԗ2o<b4>G<a7><b9><d2><fc><8a><f2><89>3<a8>ؗ<d6><c0>.w,<l<b7>}<f8>J<8f><f1><f1>����{p<94><a3>x<9e><89><da>e'<8c><ca>}y<d1><ca>V<f7>v<c3>>S^`<9e><86><f1><b1>E<b8>)<cd>ꅹ<e5><ab><<80><eb><8e>z<d0>}<a3>C<86>(%r<86><f4><e3>i*<da>i V{<94>'<f6>i<f6><a7>{dh<d0>jG۾wO<dd>?<<f7>i<c5>c<84>G<dc>3<bb>-E<e9>L<b1><b6>XG<f5>F<81><97><b1><e5><de>ln<b1><d6><f5><f6><90> DŽ<b2>/j<fc><d9>{£<83><f1><c5>;n7<bb>ɰEG<a9><b0><87>!<b5>5]9<b9><e6><fe>_Q<aa>U<a8><c0><cf>,<d9><dc>wܒ<ba>ɑ<f1>Q<c9>:r<e4><b4><ea>w<be>PCb' in 'utf8towcs'
Can anyone help me? My operating system is Ubuntu 14.04 LTS, and my R version is 3.2.0.
First, make sure the text is encoded in UTF-8 (if you can open the file in a text editor, you should be able to change the encoding when you save it). If that doesn't fix the problem, try passing the argument mc.cores = 1 to tm_map so that the transformation runs on a single core instead of being split across parallel jobs.
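If re-saving the files is not practical, the encoding can also be repaired inside R before lowercasing. The following is only a rough sketch in base R, not a tested fix for this corpus: the sample string, the helper name to_lower_utf8 and the guess that the source bytes are Latin-1 are all assumptions made for illustration.

```r
# Hypothetical sample: "Caf<e9> noir" stored as Latin-1 bytes.
# The lone byte 0xE9 is not valid UTF-8.
bad <- rawToChar(as.raw(c(0x43, 0x61, 0x66, 0xe9, 0x20,
                          0x6e, 0x6f, 0x69, 0x72)))

# Option 1 (assumption: the text really is Latin-1):
# convert it to UTF-8, then lowercase.
fixed <- iconv(bad, from = "latin1", to = "UTF-8")
lowered_fixed <- tolower(fixed)

# Option 2 (source encoding unknown): replace every byte that is
# not valid UTF-8 with "?" so that tolower() no longer fails.
to_lower_utf8 <- function(x) {
  tolower(iconv(x, from = "UTF-8", to = "UTF-8", sub = "?"))
}
lowered_clean <- to_lower_utf8(bad)

print(lowered_fixed)
print(lowered_clean)
```

A helper like to_lower_utf8 could then be wrapped in content_transformer() and passed to tm_map in place of tolower, together with mc.cores = 1 on tm versions where tm_map forwards extra arguments to mclapply (as the warning in the question suggests it does here).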