My question has three parts:

(1) Can a feedforward neural network handle mixed input features, where some are categorical (discrete-valued, e.g., Low, Med, High) and some are real-valued? There are about 80 to 90 input feature variables in total, and I wish to solve a (supervised) classification problem.

(2) If the answer to part (1) is yes: I have read about using binary codes (Low = 001, Med = 010, High = 100, etc.) to represent discrete-valued input features in other contexts. Will that work for NNs as well? I am concerned about scaling / normalization of the whole input feature vector (which I suppose is recommended). How should I scale/normalize the whole mixed feature vector, or is that not required?

(3) Someone suggested that I use Random Forest (RF). I am not that familiar with RFs. What are the pros and cons of using RF versus NNs in the given context?
Neural Nets Mixed Real-valued and Categorical Input Features
864 views Asked by H W At
As far as point 2 goes: if you transform each categorical input into a k-vector (with k = the number of categories that input can take), you are just introducing k new inputs, each of which is either 0 or 1. So if your real-valued input features are themselves scaled to the range [0, 1], you're pretty much okay.
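To make that concrete, here is a minimal sketch in plain Python. The helper names (`one_hot`, `min_max_scale`) and the toy data are made up for illustration, and min-max scaling is just one reasonable way to bring real-valued features into [0, 1]:

```python
def one_hot(value, categories):
    """Return a k-vector with 1.0 at the position of `value`, 0.0 elsewhere."""
    return [1.0 if value == c else 0.0 for c in categories]


def min_max_scale(column):
    """Linearly rescale a list of real values into [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant column: no spread to rescale, so map everything to 0.0.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Hypothetical toy data: one categorical feature and one real-valued feature.
levels = ["Low", "Med", "High"]
risk = ["Low", "High", "Med"]
temperature = [10.0, 30.0, 20.0]

scaled_temp = min_max_scale(temperature)

# Each row is the k one-hot inputs followed by the scaled real-valued input.
inputs = [one_hot(r, levels) + [t] for r, t in zip(risk, scaled_temp)]
```

In a real setup you would probably compute the min and max on the training set only and reuse them for validation and test data.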
Note that if you are using a tanh activation function (whose outputs range from -1 to 1), you should encode your categorical input features with -1 and 1 instead of 0 and 1, so that they match that range. For example, with k = 3:
0 should become <1, -1, -1>
1 should become <-1, 1, -1>
2 should become <-1, -1, 1>
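The same mapping as a short sketch in plain Python. The function names are hypothetical, and rescaling the real-valued features to [-1, 1] as well is my assumption, by analogy with the [0, 1] case above:

```python
def bipolar_one_hot(index, k):
    """Encode category `index` (0-based) as a k-vector of -1.0 with a single 1.0."""
    return [1.0 if i == index else -1.0 for i in range(k)]


def scale_to_bipolar(column):
    """Linearly rescale a list of real values into [-1, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Constant column: map everything to the midpoint of the range.
        return [0.0 for _ in column]
    return [2.0 * (x - lo) / (hi - lo) - 1.0 for x in column]


# The three categories from the example above (k = 3).
codes = [bipolar_one_hot(i, 3) for i in range(3)]

# Hypothetical real-valued feature, rescaled to the same range.
scaled = scale_to_bipolar([10.0, 30.0, 20.0])
```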
Hope I'm clear about that.