My dataset has a column 'sales_method' with three values, A, B, and C, but their counts differ: 7400, 4900, and 2500 rows. Should I oversample the smaller categories until they all have 7400 rows, or downsample the larger ones to only 2500 rows? I have other numeric columns to compare across the groups, such as 'revenue', and I worry the comparison may be skewed so that a bigger group means bigger revenue. (I'm assuming I should sample whole rows, without permuting values within a column.) The goal may be just to make NumPy functions work when they need inputs of the same size.
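To make the two options concrete, here is a minimal sketch. It assumes the data lives in a pandas DataFrame called `df` (a hypothetical name), and the 'revenue' values are randomly generated placeholders, not real data. It samples whole rows per group with `DataFrameGroupBy.sample`, which needs pandas 1.1 or later.

```python
import numpy as np
import pandas as pd

# Placeholder data with the group sizes from the question.
# The revenue values are made up purely for illustration.
rng = np.random.default_rng(0)
counts = {"A": 7400, "B": 4900, "C": 2500}
df = pd.DataFrame({
    "sales_method": np.repeat(list(counts), list(counts.values())),
    "revenue": rng.gamma(shape=2.0, scale=50.0, size=sum(counts.values())),
})

sizes = df["sales_method"].value_counts()
n_min = int(sizes.min())  # 2500
n_max = int(sizes.max())  # 7400

# Option 1: downsample every group to the smallest size.
# Whole rows are drawn without replacement, so each row keeps its own revenue.
down = df.groupby("sales_method").sample(n=n_min, random_state=0)

# Option 2: oversample every group to the largest size.
# Rows are drawn with replacement, so smaller groups contain duplicates.
up = df.groupby("sales_method").sample(n=n_max, replace=True, random_state=0)

print(down["sales_method"].value_counts())
print(up["sales_method"].value_counts())
```

Per-group summaries such as `df.groupby("sales_method")["revenue"].mean()` run on the original, unequal groups, so equal sizes are only needed where a function really does require same-length inputs.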
Should I sample variables of different sizes until they all have the same size?
28 views Asked by L'Ri Talent At