I am trying to load a dataset in Weka. I have tried several approaches, such as converting it to ARFF format and adjusting the commas, but all of them failed. Could anyone give me a working solution, or load this dataset in the correct format?
CSV file not recognized as csv, reason nominal value not declared in header
Asked by Arsh
There are 2 answers
fracpete
Instead of using Weka's functionality for reading CSV files, you could use ADAMS (developed at the same university; I'm the lead developer).
Download the adams-ml-app snapshot and then use the Weka Investigator to load/save the file:
- Load it as ADAMS Spreadsheets (.csv, .csv.gz)
- Save it as Arff data files (.arff, .arff.gz) or Simple ARFF data files (.arff, .arff.gz)
The Reviews column contains an erroneous value, 3.0M, which prevents the column from being treated as numeric.
If you want an introduction to the Weka Investigator, take a look at my talk from the Weka User Conference 2021: Taking Weka to the next level with ADAMS.
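A stray value like this can be located before loading the file into Weka. The following is a minimal Python sketch, not part of ADAMS or Weka, that lists the values in a column that cannot be parsed as numbers. The function name `find_non_numeric` and the sample data are made up for illustration; only the `Reviews` column name and the `3.0M` value come from the file discussed here.

```python
import csv
import io


def find_non_numeric(rows, column):
    """Return (record_number, value) pairs whose `column` value is not a number.

    `rows` is an iterable of dicts, e.g. a csv.DictReader.
    Record numbers start at 2 because the header is record 1.
    """
    bad = []
    for record_no, row in enumerate(rows, start=2):
        value = row.get(column)
        try:
            float(value)
        except (TypeError, ValueError):
            # Missing or non-numeric values end up here.
            bad.append((record_no, value))
    return bad


# Tiny illustrative sample; the real file would be opened with open(...).
sample = io.StringIO("App,Reviews\nA,159\nB,3.0M\nC,967\n")
bad_rows = find_non_numeric(csv.DictReader(sample), "Reviews")
print(bad_rows)  # [(3, '3.0M')]
```

Empty cells are reported as well, so the output may need a second look before deciding which values to correct.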

There are too many issues with lines in this file. I made the following edits:
- In line 23, I eliminated the odd-looking brackets.
- I removed all single quotes (').
- I eliminated all repeated double quotes ("").
- In line 10474, the first two fields (before the number) didn't seem to be separated, so I added a comma.
This allowed the file to go through initial screening, but...
The file contains a lot of odd emojis. I started to eliminate them one by one, but there are clearly more of these than I wish to deal with. Each time I got rid of one, it would read farther into the file, then stop at the next one.
If I just try to read the top of the file, the first 20 lines before we get to any of these problems, it reads fine.
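Structural problems such as the unseparated fields on line 10474 can be located without reading the file by eye. Here is a minimal Python sketch, assuming the file is comma-separated with a header row; the function name and the sample rows are invented for illustration.

```python
import csv
import io


def rows_with_wrong_field_count(lines):
    """Return (line_number, field_count) for rows that do not match the header.

    `lines` is any iterable of text lines, e.g. an open file object.
    """
    reader = csv.reader(lines)
    header = next(reader)
    mismatches = []
    for row in reader:
        if len(row) != len(header):
            # reader.line_num is the last source line read for this row.
            mismatches.append((reader.line_num, len(row)))
    return mismatches


# Illustrative sample: the third line is missing a comma between two fields.
sample = io.StringIO(
    "App,Category,Rating\n"
    "A,ART,4.1\n"
    "B ART,4.5\n"
    "C,GAME,4.0\n"
)
mismatches = rows_with_wrong_field_count(sample)
print(mismatches)  # [(3, 2)]
```

For a real file, pass `open(path, newline="", encoding="utf-8")` instead of the in-memory sample.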
My partial editing can be found here: https://www.dropbox.com/s/ij707mb23dt1jvz/googleplaystore3.csv?dl=0. I think if you clear up the remaining emojis, the file should be usable.
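Removing the emojis one by one can be avoided with a small script. The following Python sketch is one possible approach, not something I have run against the full file: it drops characters outside the Basic Multilingual Plane, characters in the Unicode "Symbol, other" category, and the two invisible joiner characters that often accompany emojis. The function names and file paths are placeholders, and the filter is an assumption about what counts as an emoji; it may also remove legitimate symbols.

```python
import unicodedata

# Invisible characters that commonly accompany emojis:
# zero-width joiner and variation selector-16.
EMOJI_HELPERS = {"\u200d", "\ufe0f"}


def strip_emojis(text):
    """Remove emoji-like characters from `text`, keeping everything else."""
    kept = []
    for ch in text:
        if ord(ch) > 0xFFFF:
            continue  # most emojis live above the Basic Multilingual Plane
        if unicodedata.category(ch) == "So":
            continue  # "Symbol, other" covers many older pictographs
        if ch in EMOJI_HELPERS:
            continue
        kept.append(ch)
    return "".join(kept)


def clean_file(src_path, dst_path):
    """Write a copy of src_path to dst_path with emoji-like characters removed."""
    with open(src_path, encoding="utf-8", errors="replace") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            dst.write(strip_emojis(line))


# Accented letters survive; the pictograph does not.
print(strip_emojis("Caf\u00e9 \U0001F525 Pro"))
```

Calling `clean_file("googleplaystore3.csv", "googleplaystore_clean.csv")` (both names hypothetical) would produce a cleaned copy, leaving the original untouched.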