Overview

Dataset statistics

Number of variables: 12
Number of observations: 1190
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 272
Duplicate rows (%): 22.9%
Total size in memory: 327.0 KiB
Average record size in memory: 281.4 B
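
The percentage and per-record figures above follow from the raw counts. The short check below is only a sketch: it copies the numbers from this overview and assumes that 1 KiB is 1024 bytes and that the report rounds to one decimal place.

```python
# Figures copied from the dataset statistics in this report.
n_rows = 1190
duplicate_rows = 272
total_kib = 327.0  # already rounded in the report, so results are approximate

# Share of rows flagged as duplicates, in percent.
duplicate_pct = 100 * duplicate_rows / n_rows

# Average bytes per record, assuming 1 KiB = 1024 B.
avg_record_bytes = total_kib * 1024 / n_rows

print(f"Duplicate rows (%): {duplicate_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```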

Variable types

Numeric: 5
Categorical: 7

Warnings

Dataset has 272 (22.9%) duplicate rows [Duplicates]
ST_Slope is highly correlated with Ex_Angina and 3 other fields [High correlation]
Ex_Angina is highly correlated with ST_Slope and 4 other fields [High correlation]
Max_Heart is highly correlated with Ex_Angina and 1 other field [High correlation]
Cardio is highly correlated with ST_Slope and 4 other fields [High correlation]
Old_Peak is highly correlated with ST_Slope and 2 other fields [High correlation]
Chest_Pain is highly correlated with ST_Slope and 2 other fields [High correlation]
ST_Slope is highly correlated with Cardio [High correlation]
Cardio is highly correlated with ST_Slope and 1 other field [High correlation]
Chest_Pain is highly correlated with Cardio [High correlation]
Cholesterol has 172 (14.5%) zeros [Zeros]
Old_Peak has 455 (38.2%) zeros [Zeros]

Reproduction

Analysis started: 2021-07-09 14:51:24.208754
Analysis finished: 2021-07-09 14:51:39.665732
Duration: 15.46 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

Age
Real number (ℝ≥0)

Distinct: 50
Distinct (%): 4.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 53.72016807
Minimum: 28
Maximum: 77
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 28
5-th percentile: 38
Q1: 47
median: 54
Q3: 60
95-th percentile: 68
Maximum: 77
Range: 49
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 9.358202798
Coefficient of variation (CV): 0.1742027833
Kurtosis: -0.4117756705
Mean: 53.72016807
Median Absolute Deviation (MAD): 7
Skewness: -0.1921112798
Sum: 63927
Variance: 87.5759596
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
54                 67     5.6%
58                 58     4.9%
57                 50     4.2%
52                 47     3.9%
59                 47     3.9%
51                 47     3.9%
55                 47     3.9%
56                 47     3.9%
62                 46     3.9%
60                 44     3.7%
Other values (40)  690    58.0%

Value  Count  Frequency (%)
28     1      0.1%
29     4      0.3%
30     1      0.1%
31     2      0.2%
32     5      0.4%
33     2      0.2%
34     9      0.8%
35     14     1.2%
36     6      0.5%
37     13     1.1%

Value  Count  Frequency (%)
77     3      0.3%
76     3      0.3%
75     3      0.3%
74     8      0.7%
73     1      0.1%
72     4      0.3%
71     8      0.7%
70     11     0.9%
69     16     1.3%
68     13     1.1%

Sex
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 67.5 KiB
Category counts: 1 (909), 0 (281)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1190
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 0
5th row: 1

Common Values

Value  Count  Frequency (%)
1      909    76.4%
0      281    23.6%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
1      909    76.4%
0      281    23.6%

Most occurring characters

Value  Count  Frequency (%)
1      909    76.4%
0      281    23.6%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  1190   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      909    76.4%
0      281    23.6%

Most occurring scripts

Value   Count  Frequency (%)
Common  1190   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      909    76.4%
0      281    23.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1190   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      909    76.4%
0      281    23.6%

Chest_Pain
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 67.5 KiB
Category counts: 4 (625), 3 (283), 2 (216), 1 (66)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1190
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 3
3rd row: 2
4th row: 4
5th row: 3

Common Values

Value  Count  Frequency (%)
4      625    52.5%
3      283    23.8%
2      216    18.2%
1      66     5.5%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
4      625    52.5%
3      283    23.8%
2      216    18.2%
1      66     5.5%

Most occurring characters

Value  Count  Frequency (%)
4      625    52.5%
3      283    23.8%
2      216    18.2%
1      66     5.5%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  1190   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
4      625    52.5%
3      283    23.8%
2      216    18.2%
1      66     5.5%

Most occurring scripts

Value   Count  Frequency (%)
Common  1190   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
4      625    52.5%
3      283    23.8%
2      216    18.2%
1      66     5.5%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1190   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
4      625    52.5%
3      283    23.8%
2      216    18.2%
1      66     5.5%

Rest_Systolic
Real number (ℝ≥0)

Distinct: 67
Distinct (%): 5.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 132.1537815
Minimum: 0
Maximum: 200
Zeros: 1
Zeros (%): 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 106
Q1: 120
median: 130
Q3: 140
95-th percentile: 160
Maximum: 200
Range: 200
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 18.36882342
Coefficient of variation (CV): 0.1389958214
Kurtosis: 2.757899211
Mean: 132.1537815
Median Absolute Deviation (MAD): 10
Skewness: 0.29346243
Sum: 157263
Variance: 337.4136737
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
120                166    13.9%
130                149    12.5%
140                137    11.5%
110                76     6.4%
150                73     6.1%
160                61     5.1%
125                39     3.3%
128                27     2.3%
135                26     2.2%
138                26     2.2%
Other values (57)  410    34.5%

Value  Count  Frequency (%)
0      1      0.1%
80     1      0.1%
92     1      0.1%
94     4      0.3%
95     6      0.5%
96     1      0.1%
98     1      0.1%
100    19     1.6%
101    2      0.2%
102    5      0.4%

Value  Count  Frequency (%)
200    5      0.4%
192    2      0.2%
190    2      0.2%
185    1      0.1%
180    15     1.3%
178    5      0.4%
174    2      0.2%
172    3      0.3%
170    16     1.3%
165    3      0.3%

Cholesterol
Real number (ℝ≥0)

ZEROS

Distinct: 222
Distinct (%): 18.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 210.3638655
Minimum: 0
Maximum: 603
Zeros: 172
Zeros (%): 14.5%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 188
median: 229
Q3: 269.75
95-th percentile: 330
Maximum: 603
Range: 603
Interquartile range (IQR): 81.75

Descriptive statistics

Standard deviation: 101.420489
Coefficient of variation (CV): 0.4821193449
Kurtosis: 0.8242640545
Mean: 210.3638655
Median Absolute Deviation (MAD): 41
Skewness: -0.7816458178
Sum: 250333
Variance: 10286.1156
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
0                   172    14.5%
254                 16     1.3%
234                 13     1.1%
204                 13     1.1%
211                 13     1.1%
223                 12     1.0%
230                 12     1.0%
219                 12     1.0%
197                 11     0.9%
243                 11     0.9%
Other values (212)  905    76.1%

Value  Count  Frequency (%)
0      172    14.5%
85     1      0.1%
100    2      0.2%
110    1      0.1%
113    1      0.1%
117    1      0.1%
123    1      0.1%
126    3      0.3%
129    1      0.1%
131    1      0.1%

Value  Count  Frequency (%)
603    1      0.1%
564    2      0.2%
529    1      0.1%
518    1      0.1%
491    1      0.1%
468    1      0.1%
466    1      0.1%
458    1      0.1%
417    2      0.2%
412    1      0.1%

Fast_Glucose
Categorical

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 67.5 KiB
Category counts: 0 (936), 1 (254)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1190
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      936    78.7%
1      254    21.3%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      936    78.7%
1      254    21.3%

Most occurring characters

Value  Count  Frequency (%)
0      936    78.7%
1      254    21.3%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  1190   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      936    78.7%
1      254    21.3%

Most occurring scripts

Value   Count  Frequency (%)
Common  1190   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      936    78.7%
1      254    21.3%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1190   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      936    78.7%
1      254    21.3%

Rest_ECG
Categorical

Distinct: 3
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 67.5 KiB
Category counts: 0 (684), 2 (325), 1 (181)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1190
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0      684    57.5%
2      325    27.3%
1      181    15.2%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      684    57.5%
2      325    27.3%
1      181    15.2%

Most occurring characters

Value  Count  Frequency (%)
0      684    57.5%
2      325    27.3%
1      181    15.2%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  1190   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      684    57.5%
2      325    27.3%
1      181    15.2%

Most occurring scripts

Value   Count  Frequency (%)
Common  1190   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      684    57.5%
2      325    27.3%
1      181    15.2%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1190   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      684    57.5%
2      325    27.3%
1      181    15.2%

Max_Heart
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 119
Distinct (%): 10.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 139.7327731
Minimum: 60
Maximum: 202
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 9.4 KiB

Quantile statistics

Minimum: 60
5-th percentile: 97
Q1: 121
median: 140.5
Q3: 160
95-th percentile: 179
Maximum: 202
Range: 142
Interquartile range (IQR): 39

Descriptive statistics

Standard deviation: 25.51763555
Coefficient of variation (CV): 0.1826173988
Kurtosis: -0.4603229879
Mean: 139.7327731
Median Absolute Deviation (MAD): 19.5
Skewness: -0.2330977118
Sum: 166282
Variance: 651.149724
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value               Count  Frequency (%)
150                 49     4.1%
140                 46     3.9%
120                 39     3.3%
130                 36     3.0%
160                 35     2.9%
125                 28     2.4%
170                 25     2.1%
122                 24     2.0%
162                 23     1.9%
110                 23     1.9%
Other values (109)  862    72.4%

Value  Count  Frequency (%)
60     1      0.1%
63     1      0.1%
67     1      0.1%
69     1      0.1%
70     1      0.1%
71     2      0.2%
72     2      0.2%
73     1      0.1%
77     1      0.1%
78     1      0.1%

Value  Count  Frequency (%)
202    2      0.2%
195    2      0.2%
194    2      0.2%
192    2      0.2%
190    3      0.3%
188    3      0.3%
187    2      0.2%
186    4      0.3%
185    5      0.4%
184    5      0.4%

Ex_Angina
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 67.5 KiB
Category counts: 0 (729), 1 (461)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1190
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
0      729    61.3%
1      461    38.7%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
0      729    61.3%
1      461    38.7%

Most occurring characters

Value  Count  Frequency (%)
0      729    61.3%
1      461    38.7%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  1190   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
0      729    61.3%
1      461    38.7%

Most occurring scripts

Value   Count  Frequency (%)
Common  1190   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
0      729    61.3%
1      461    38.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1190   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
0      729    61.3%
1      461    38.7%

Old_Peak
Real number (ℝ)

HIGH CORRELATION
ZEROS

Distinct: 53
Distinct (%): 4.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.9227731092
Minimum: -2.6
Maximum: 6.2
Zeros: 455
Zeros (%): 38.2%
Negative: 13
Negative (%): 1.1%
Memory size: 9.4 KiB

Quantile statistics

Minimum: -2.6
5-th percentile: 0
Q1: 0
median: 0.6
Q3: 1.6
95-th percentile: 3
Maximum: 6.2
Range: 8.8
Interquartile range (IQR): 1.6

Descriptive statistics

Standard deviation: 1.086337219
Coefficient of variation (CV): 1.177252791
Kurtosis: 1.408578814
Mean: 0.9227731092
Median Absolute Deviation (MAD): 0.6
Skewness: 1.094005883
Sum: 1098.1
Variance: 1.180128552
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value              Count  Frequency (%)
0                  455    38.2%
1                  98     8.2%
2                  84     7.1%
1.5                58     4.9%
1.2                40     3.4%
0.2                33     2.8%
3                  32     2.7%
1.4                31     2.6%
0.8                27     2.3%
1.8                27     2.3%
Other values (43)  305    25.6%

Value  Count  Frequency (%)
-2.6   1      0.1%
-2     1      0.1%
-1.5   1      0.1%
-1.1   1      0.1%
-1     2      0.2%
-0.9   1      0.1%
-0.8   1      0.1%
-0.7   1      0.1%
-0.5   2      0.2%
-0.1   2      0.2%

Value  Count  Frequency (%)
6.2    2      0.2%
5.6    2      0.2%
5      1      0.1%
4.4    1      0.1%
4.2    4      0.3%
4      10     0.8%
3.8    2      0.2%
3.7    1      0.1%
3.6    8      0.7%
3.5    3      0.3%

ST_Slope
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): 0.3%
Missing: 0
Missing (%): 0.0%
Memory size: 67.5 KiB
Category counts: 2 (582), 1 (526), 3 (81), 0 (1)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1190
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.1%

Sample

1st row: 1
2nd row: 2
3rd row: 1
4th row: 2
5th row: 1

Common Values

Value  Count  Frequency (%)
2      582    48.9%
1      526    44.2%
3      81     6.8%
0      1      0.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
2      582    48.9%
1      526    44.2%
3      81     6.8%
0      1      0.1%

Most occurring characters

Value  Count  Frequency (%)
2      582    48.9%
1      526    44.2%
3      81     6.8%
0      1      0.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  1190   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
2      582    48.9%
1      526    44.2%
3      81     6.8%
0      1      0.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  1190   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
2      582    48.9%
1      526    44.2%
3      81     6.8%
0      1      0.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1190   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
2      582    48.9%
1      526    44.2%
3      81     6.8%
0      1      0.1%

Cardio
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 67.5 KiB
Category counts: 1 (629), 0 (561)

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 1190
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value  Count  Frequency (%)
1      629    52.9%
0      561    47.1%

Length

Histogram of lengths of the category

Pie chart

Value  Count  Frequency (%)
1      629    52.9%
0      561    47.1%

Most occurring characters

Value  Count  Frequency (%)
1      629    52.9%
0      561    47.1%

Most occurring categories

Value           Count  Frequency (%)
Decimal Number  1190   100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1      629    52.9%
0      561    47.1%

Most occurring scripts

Value   Count  Frequency (%)
Common  1190   100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1      629    52.9%
0      561    47.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1190   100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1      629    52.9%
0      561    47.1%

Interactions

[Figures: 25 interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
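
The calculation described above can be sketched in plain Python. This is a minimal illustration, not the code the report itself runs; the function name pearson_r and the sample values are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y over the product of their standard deviations."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator,
    # so raw sums of products and squares are sufficient.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Illustrative values only (not taken from the dataset).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 10.0]

r_same = pearson_r(x, y)
# Shifting x and rescaling it by a positive factor leaves r unchanged.
r_shifted = pearson_r([10 * v + 3 for v in x], y)
# Reversing the direction of y flips the sign.
r_neg = pearson_r(x, [-v for v in y])
print(r_same, r_shifted, r_neg)
```

The second call illustrates the invariance under location and scale mentioned in the description.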

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
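
As a sketch of that recipe, the code below ranks both variables and then applies the covariance-over-standard-deviations formula to the ranks. Giving tied values the mean of their positions is an assumption of this example (the description above does not say how ties are ranked), and the helper names are hypothetical.

```python
import math

def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, made 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's formula applied to the rank variables."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)

# Illustrative values only: y grows monotonically but not linearly with x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
tied = average_ranks([10, 20, 20, 30])
print(rho, tied)
```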

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
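
The pair-counting definition above can be written down directly. The sketch below implements exactly that formula (often called tau-a), in which tied pairs count as neither concordant nor discordant; library implementations commonly use a tie-adjusted variant, and which one produced this report is not stated here. The function name and data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # sign == 0 means a tie in x or y; it adds to neither count.
    return (concordant - discordant) / len(pairs)

# Illustrative values only: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, giving (5 - 1) / 6.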

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
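
A minimal sketch of a bias-corrected Cramér's V follows, using the commonly cited form of Bergsma's correction: the chi-squared based φ² and the table dimensions are each adjusted by terms in 1/(n-1). It is an illustration under that assumption, not the report's own code; the function name and the contingency tables are hypothetical.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms: shrink phi2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    if denom <= 0:
        return 0.0  # degenerate table (a single row or column)
    return math.sqrt(phi2_corr / denom)

# Illustrative tables only (not derived from the dataset).
v_independent = cramers_v_corrected([[10, 20], [20, 40]])  # rows proportional
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # one-to-one mapping
print(v_independent, v_perfect)
```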

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

Index  Age  Sex  Chest_Pain  Rest_Systolic  Cholesterol  Fast_Glucose  Rest_ECG  Max_Heart  Ex_Angina  Old_Peak  ST_Slope  Cardio
0      40   1    2           140            289          0             0         172        0          0.0       1         0
1      49   0    3           160            180          0             0         156        0          1.0       2         1
2      37   1    2           130            283          0             1         98         0          0.0       1         0
3      48   0    4           138            214          0             0         108        1          1.5       2         1
4      54   1    3           150            195          0             0         122        0          0.0       1         0
5      39   1    3           120            339          0             0         170        0          0.0       1         0
6      45   0    2           130            237          0             0         170        0          0.0       1         0
7      54   1    2           110            208          0             0         142        0          0.0       1         0
8      37   1    4           140            207          0             0         130        1          1.5       2         1
9      48   0    2           120            284          0             0         120        0          0.0       1         0

Last rows

Index  Age  Sex  Chest_Pain  Rest_Systolic  Cholesterol  Fast_Glucose  Rest_ECG  Max_Heart  Ex_Angina  Old_Peak  ST_Slope  Cardio
1180   63   1    4           140            187          0             2         144        1          4.0       1         1
1181   63   0    4           124            197          0             0         136        1          0.0       2         1
1182   41   1    2           120            157          0             0         182        0          0.0       1         0
1183   59   1    4           164            176          1             2         90         0          1.0       2         1
1184   57   0    4           140            241          0             0         123        1          0.2       2         1
1185   45   1    1           110            264          0             0         132        0          1.2       2         1
1186   68   1    4           144            193          1             0         141        0          3.4       2         1
1187   57   1    4           130            131          0             0         115        1          1.2       2         1
1188   57   0    2           130            236          0             2         174        0          0.0       2         1
1189   38   1    3           138            175          0             0         173        0          0.0       1         0

Duplicate rows

Most frequently occurring

Index  Age  Sex  Chest_Pain  Rest_Systolic  Cholesterol  Fast_Glucose  Rest_ECG  Max_Heart  Ex_Angina  Old_Peak  ST_Slope  Cardio  # duplicates
0      29   1    2           130            204          0             2         202        0          0.0       1         0       2
1      34   0    2           118            210          0             0         192        0          0.7       1         0       2
2      34   1    1           118            182          0             2         174        0          0.0       1         0       2
3      35   0    4           138            183          0             0         182        0          1.4       1         0       2
4      35   1    4           120            198          0             0         130        1          1.6       2         1       2
5      35   1    4           126            282          0             2         156        1          0.0       1         1       2
6      37   0    3           120            215          0             0         170        0          0.0       1         0       2
7      37   1    3           130            250          0             0         187        0          3.5       3         0       2
8      38   1    1           120            231          0             0         182        1          3.8       2         1       2
9      39   0    3           94             199          0             0         179        0          0.0       1         0       2
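
One way to find rows like these is to count how often each complete row occurs. The sketch below does that with the standard library; the helper name duplicate_summary is hypothetical, the three input rows are illustrative, and how the report itself tallies duplicates is not stated here.

```python
from collections import Counter

def duplicate_summary(rows):
    """Return rows that occur more than once, and how many surplus copies exist."""
    counts = Counter(rows)
    repeated = {row: c for row, c in counts.items() if c > 1}
    # Copies beyond the first occurrence of each repeated row.
    surplus = sum(c - 1 for c in repeated.values())
    return repeated, surplus

# Column order: Age, Sex, Chest_Pain, Rest_Systolic, Cholesterol, Fast_Glucose,
# Rest_ECG, Max_Heart, Ex_Angina, Old_Peak, ST_Slope, Cardio.
row_a = (29, 1, 2, 130, 204, 0, 2, 202, 0, 0.0, 1, 0)
row_b = (40, 1, 2, 140, 289, 0, 0, 172, 0, 0.0, 1, 0)

# Hypothetical input: row_a appears twice, row_b once.
repeated, surplus = duplicate_summary([row_a, row_b, row_a])
print(repeated[row_a], surplus)
```

Rows must be hashable (tuples here) so that identical records map to the same counter key.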