Checking for empty column in sample data?

Question

132 views Asked by Tyrone_Slothrop At 06 January 2025 at 01:11

My script below takes a sample from an Excel file, calculates a sample size based on some criteria, and writes out a CSV file. My issue is with the part of the script that checks whether a certain column is empty. I have tried .empty and .isnull. .isnull doesn't throw an error, but it doesn't do what I want, and .empty gives me a keyword error. How can I combine an if statement with a check for an empty column?

if df2['Subcategory'].isnull:
    def sample_per(df2):
        if len(df2) >= 15000:
            return (df2.groupby('Category').apply(lambda x: x.sample(frac=0.01)))
        elif len(df2) < 15000 and len(df2) > 10000:
            return (df2.groupby('Category').apply(lambda x: x.sample(frac=0.03)))
        else:
            return (df2.groupby('Category').apply(lambda x: x.sample(frac=0.05)))

else:
    def sample_per(df2):
        if len(df2) >= 15000:
            return (df2.groupby('Subcategory').apply(lambda x: x.sample(frac=0.01)))
        elif len(df2) < 15000 and len(df2) > 10000:
            return (df2.groupby('Subcategory').apply(lambda x: x.sample(frac=0.03)))
        else:
            return (df2.groupby('Subcategory').apply(lambda x: x.sample(frac=0.05)))

**Masso** · Answer 1 · 2020-04-08T23:36:28+00:00

.isnull() checks for missing values: NaN (Not a Number), None, and NaT.

If by empty column you mean a column of NaN...

You can use either the .isna() or the .isnull() method of a Series; the two are aliases. (Series has no .isnan() method; isnan is a NumPy function, np.isnan.)

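Watch out! In if df2['Subcategory'].isnull you didn't call .isnull(), meaning you didn't write the parentheses. Without them the expression is the method object itself, which is always truthy, so the if branch runs no matter what the column contains.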
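To make the difference concrete, here is a small sketch. The two-row df2 is made-up sample data, not the asker's file, and the variable names are only for illustration.

```python
import pandas as pd

# Made-up sample data standing in for the asker's DataFrame.
df2 = pd.DataFrame({"Category": ["a", "b"], "Subcategory": ["x", None]})

# No parentheses: this is the bound method object, not a result.
method_obj = df2["Subcategory"].isnull
print(callable(method_obj))  # it is something you can call
print(bool(method_obj))      # truthy, whatever the column holds

# With parentheses: the method runs and returns one Boolean per row.
flags = df2["Subcategory"].isnull()
print(flags.tolist())
```

So the original `if` never looked at the data at all; it only tested whether the method exists.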

Once you do call it, you get back a Series of Boolean values, one per row.

If you want to know whether all of the rows in that column are NaN, reduce that Series to a single True or False with .all():

if df2['Subcategory'].isnull().all():
    # rest of the code
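Putting that check into the script from the question could look roughly like the sketch below. The helper names pick_group_column and sample_fraction are hypothetical, the "column is missing" test is an assumption about what the asker may also need, and groupby(...).sample(frac=...) is swapped in for the original groupby(...).apply(lambda x: x.sample(...)); it needs pandas 1.1 or newer. Note that .isnull() on its own is not enough inside an if: a Boolean Series has no single truth value and pandas raises a ValueError, which is why the .all() is there.

```python
import numpy as np
import pandas as pd

def pick_group_column(df, preferred="Subcategory", fallback="Category"):
    """Return `preferred` unless that column is absent or entirely NaN."""
    # Assumption: a missing column should be treated like an empty one.
    if preferred not in df.columns or df[preferred].isnull().all():
        return fallback
    return preferred

def sample_fraction(n_rows):
    """Same thresholds as the question: 1%, 3% or 5% depending on row count."""
    if n_rows >= 15000:
        return 0.01
    if n_rows > 10000:
        return 0.03
    return 0.05

def sample_per(df):
    """Sample the same fraction from every group of the chosen column."""
    col = pick_group_column(df)
    frac = sample_fraction(len(df))
    # Rows whose group key is NaN are dropped by groupby by default.
    return df.groupby(col).sample(frac=frac)

# Made-up data: Subcategory is entirely NaN, so Category is used instead.
demo = pd.DataFrame({
    "Category": ["a"] * 100 + ["b"] * 100,
    "Subcategory": [np.nan] * 200,
})
sampled = sample_per(demo)
```

Choosing the column once and keeping a single sample_per avoids defining the same function twice in the if and else branches.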

If by empty you mean filled with "" (empty strings), then you could do this:

df2['Subcategory'].apply(lambda x: not x).all()

This evaluates to True if every value in "Subcategory" is falsy, which covers empty strings but also 0 and None. NaN is truthy in Python, so a column of NaN gives False here.
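If the column might mix NaN, empty strings and whitespace-only cells, a stricter check may be safer than relying on truthiness. The sketch below is one way to do it; is_blank_column is a hypothetical helper name, and treating whitespace-only strings as empty is an assumption about what "empty" should mean.

```python
import pandas as pd

def is_blank_column(series):
    """True if every value is missing or a string that is empty after stripping."""
    # isnull() catches NaN/None; the string comparison catches "" and "   ".
    blank = series.isnull() | (series.astype(str).str.strip() == "")
    return bool(blank.all())

# Made-up examples.
all_blank = is_blank_column(pd.Series(["", "  ", None]))
has_text = is_blank_column(pd.Series(["", "x"]))
zeros = is_blank_column(pd.Series([0, 0]))
```

Unlike the `not x` version, this does not count a column of zeros as empty.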

P.S. Use .any() instead of .all() to check whether at least one value is True.
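A quick side-by-side of the two reductions, on a made-up three-row column with one missing value:

```python
import numpy as np
import pandas as pd

# Made-up column: one of three values is missing.
col = pd.Series(["x", np.nan, "y"], name="Subcategory")
mask = col.isnull()

every_row_missing = bool(mask.all())  # are all rows missing?
some_row_missing = bool(mask.any())   # is at least one row missing?
missing_count = int(mask.sum())       # how many rows are missing
```

Here .all() is False and .any() is True, and .sum() on the Boolean mask gives the number of missing rows.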
