Understanding Chi Square test on Titanic dataset

568 views Asked by ApaarBawa At 09 October 2020 at 07:41

I am currently working on hypothesis testing on datasets.

While reading about chi-square tests, I found this notebook through Kaggle:

https://github.com/viswanathanc/statistics/blob/master/Titanic%20Chi%20Square%20test%20-%20PClass%20vs%20Survied.ipynb

It performs a chi-square hypothesis test on the Titanic dataset.

To test the relationship between passenger class and survival, the author used this code:

1) For getting contingency table (observed values)

PClass_survd = pd.pivot_table(data,index=['Pclass'],columns=['Survived'],aggfunc='size')

2) How class and survival are distributed

pct_class = PClass_survd.sum(axis=1)/891

pct_survived = PClass_survd.sum(axis=0)/891

3) To calculate the expected values

pct_class.to_frame()@(pct_survived.to_frame().T)

I don't understand how the expected values are calculated in step 3. I know that Series.to_frame() converts a Series to a DataFrame.

Can anyone please explain step 3 in detail, or more generally how expected values can be calculated from a dataset without using the chi-square function from stats (with an example if possible)?

Thanks in advance
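
For reference, here is a minimal sketch of how the expected values in step 3 could be computed by hand. Under the null hypothesis that class and survival are independent, the expected proportion in each cell is the product of its row proportion and its column proportion, so a (3 x 1) column of class proportions multiplied by a (1 x 2) row of survival proportions gives a 3 x 2 table. Multiplying that table by the total count turns proportions into expected counts. The observed counts below are hard-coded as an assumption (they are meant to stand in for the pivot table from step 1 and sum to 891), and the names observed, expected and chi2 are illustrative, not taken from the notebook.

```python
import numpy as np
import pandas as pd

# Observed counts (rows: Pclass, columns: Survived).
# Assumed values standing in for the pivot table from step 1;
# replace them with your own contingency table.
observed = pd.DataFrame(
    [[80, 136], [97, 87], [372, 119]],
    index=pd.Index([1, 2, 3], name="Pclass"),
    columns=pd.Index([0, 1], name="Survived"),
)

n = observed.to_numpy().sum()                 # total number of passengers
row_totals = observed.sum(axis=1).to_numpy()  # passengers per class
col_totals = observed.sum(axis=0).to_numpy()  # died / survived totals

# Outer product of the marginal totals divided by n.
# This is the same as (row_totals / n) outer (col_totals / n), times n,
# i.e. the matrix product from step 3 scaled to counts.
expected = pd.DataFrame(
    np.outer(row_totals, col_totals) / n,
    index=observed.index,
    columns=observed.columns,
)

# Chi-square statistic: sum over all cells of (O - E)^2 / E.
chi2 = float(((observed - expected) ** 2 / expected).to_numpy().sum())

# Degrees of freedom: (rows - 1) * (columns - 1).
dof = (observed.shape[0] - 1) * (observed.shape[1] - 1)

print(expected.round(2))
print("chi2 =", round(chi2, 2), "dof =", dof)
```

The expected table keeps the same row and column totals as the observed one; only the way the counts are spread across the cells changes. The statistic can then be compared against a chi-square distribution with dof degrees of freedom.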
