PredictionIO: very large task size warnings


I am using the recommendation engine and have modified my dataset. A few lines from my dataset are shown below:

4695::132687::5
4695::132688::5
4835::132689::5
3691::132690::5

I can successfully build, train, and deploy the engine. However, when I run pio train I get many "very large task size" warnings. I assume this is not a serious issue, since I can deploy the engine and use the REST API without problems. Part of the output is pasted below.

[INFO] [Engine$] Data santiy check is on.
[INFO] [Engine$] com.marlabs.TrainingData does not support data sanity check. Skipping check.
[INFO] [Engine$] com.marlabs.PreparedData does not support data sanity check. Skipping check.
[WARN] [BLAS] Failed to load implementation from: com.github.fommil.netlib.NativeSystemBLAS
[WARN] [BLAS] Failed to load implementation from: com.github.fommil.netlib.NativeRefBLAS
[WARN] [TaskSetManager] Stage 16 contains a task of very large size (611 KB). The maximum recommended task size is 100 KB.
[Stage 17:>                                                         (0 + 0) / 4]
[WARN] [TaskSetManager] Stage 17 contains a task of very large size (614 KB). The maximum recommended task size is 100 KB.
[WARN] [LAPACK] Failed to load implementation from: com.github.fommil.netlib.NativeSystemLAPACK
[WARN] [LAPACK] Failed to load implementation from: com.github.fommil.netlib.NativeRefLAPACK
[WARN] [TaskSetManager] Stage 18 contains a task of very large size (615 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 19 contains a task of very large size (615 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 20 contains a task of very large size (616 KB). The maximum recommended task size is 100 KB.
[Stage 21:>                                                         (0 + 0) / 4]
[WARN] [TaskSetManager] Stage 21 contains a task of very large size (617 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 22 contains a task of very large size (618 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 23 contains a task of very large size (619 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 24 contains a task of very large size (619 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 25 contains a task of very large size (620 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 26 contains a task of very large size (621 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 27 contains a task of very large size (622 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 28 contains a task of very large size (623 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 29 contains a task of very large size (624 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 30 contains a task of very large size (624 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 31 contains a task of very large size (625 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 32 contains a task of very large size (626 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 33 contains a task of very large size (627 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 34 contains a task of very large size (628 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 35 contains a task of very large size (628 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 36 contains a task of very large size (629 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 37 contains a task of very large size (630 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 38 contains a task of very large size (631 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 39 contains a task of very large size (632 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 40 contains a task of very large size (633 KB). The maximum recommended task size is 100 KB.
[WARN] [TaskSetManager] Stage 41 contains a task of very large size (633 KB). The maximum recommended task size is 100 KB.

Also, does the URL http://localhost:7070/events.json?accessKey=<Access_Key> show all events or only some of them? I have imported more than 20k events, but the URL shows only about 50.


There is 1 answer

Kenneth Chan (best answer)

As described here, it should be safe to ignore this warning for ALS.

If you want to dig into the details of these warnings, you can start a Spark standalone cluster, then enable the event log and configure the log directory when you run "pio train". For example:

pio train -- --master <YOUR spark master URL> --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=/your_spark_event_log_directory/event_log

Then you can open the Spark UI (http://localhost:8080/ by default) and look at the stage details of the job.

The URL shows only part of the events: querying the event server at http://localhost:7070/events.json?accessKey=<Access_Key> returns 20 events by default. You can pass the limit parameter to get more events.

For example, to get 100 events, use "http://localhost:7070/events.json?accessKey=<Access_Key>&limit=100". Please see here for more details.
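
If you would rather script the query than paste URLs into a browser, here is a minimal Python sketch of the same request. The helper names (build_events_url, fetch_events) and the placeholder access key are my own, not part of PredictionIO; only the events.json endpoint and the accessKey and limit parameters come from the answer above. As far as I recall, the Event API documentation also describes limit=-1 as returning all events, but check that against your version before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_events_url(access_key, limit=20, base="http://localhost:7070"):
    """Build the events.json query URL (hypothetical helper, not a PredictionIO API)."""
    # urlencode keeps dict insertion order, so accessKey comes before limit.
    query = urlencode({"accessKey": access_key, "limit": limit})
    return f"{base}/events.json?{query}"


def fetch_events(access_key, limit=20, base="http://localhost:7070"):
    """GET the events and decode the JSON response.

    Needs a running event server at `base`; assumes the response body is JSON.
    """
    with urlopen(build_events_url(access_key, limit, base)) as resp:
        return json.load(resp)


if __name__ == "__main__":
    # Only prints the URL; call fetch_events(...) once the event server is up.
    print(build_events_url("YOUR_ACCESS_KEY", limit=100))
```

Calling fetch_events("YOUR_ACCESS_KEY", limit=100) against a running event server should then give you up to 100 events as parsed JSON, and len() on the result is a quick way to confirm how many came back.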