Overview

Dataset statistics

Number of variables: 13
Number of observations: 70000
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 6.9 MiB
Average record size in memory: 104.0 B
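
The two memory figures above can be cross-checked against each other. The short sketch below multiplies the reported average record size by the number of observations and converts the result to MiB; the variable names are illustrative, and only the numbers come from the report.

```python
# Figures copied from the dataset statistics above.
n_observations = 70_000
avg_record_bytes = 104.0

# Total size implied by the average record size (1 MiB = 1024**2 bytes).
total_mib = n_observations * avg_record_bytes / 1024**2
print(f"{total_mib:.1f} MiB")
```

The result rounds to the reported total, so the two figures are consistent.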

Variable types

Numeric: 6
Categorical: 7

Warnings

gender is highly correlated with height [High correlation]
height is highly correlated with gender [High correlation]
ap_hi is highly correlated with ap_lo [High correlation]
ap_lo is highly correlated with ap_hi [High correlation]
gender is highly correlated with smoke [High correlation]
alco is highly correlated with smoke [High correlation]
gluc is highly correlated with cholesterol [High correlation]
cholesterol is highly correlated with gluc [High correlation]
smoke is highly correlated with gender and 1 other field [High correlation]
ap_hi is highly skewed (γ1 = 85.29621386) [Skewed]
ap_lo is highly skewed (γ1 = 32.11408283) [Skewed]
id is uniformly distributed [Uniform]
id has unique values [Unique]

Reproduction

Analysis started: 2021-07-09 14:24:14.065404
Analysis finished: 2021-07-09 14:24:35.847881
Duration: 21.78 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

id
Real number (ℝ≥0)

UNIFORM
UNIQUE

Distinct: 70000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 49972.4199
Minimum: 0
Maximum: 99999
Zeros: 1
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 547.0 KiB

Quantile statistics

Minimum: 0
5th percentile: 4954.95
Q1: 25006.75
Median: 50001.5
Q3: 74889.25
95th percentile: 94937.15
Maximum: 99999
Range: 99999
Interquartile range (IQR): 49882.5

Descriptive statistics

Standard deviation: 28851.30232
Coefficient of variation (CV): 0.5773445109
Kurtosis: -1.198373591
Mean: 49972.4199
Median Absolute Deviation (MAD): 24944
Skewness: -0.001277812858
Sum: 3498069393
Variance: 832397645.7
Monotonicity: Strictly increasing
Histogram with fixed size bins (bins=50)
Common values

Value                  Count   Frequency (%)
2047                   1       < 0.1%
97146                  1       < 0.1%
78707                  1       < 0.1%
68468                  1       < 0.1%
66421                  1       < 0.1%
72566                  1       < 0.1%
70519                  1       < 0.1%
93048                  1       < 0.1%
95099                  1       < 0.1%
43968                  1       < 0.1%
Other values (69990)   69990   > 99.9%

Minimum 10 values

Value   Count   Frequency (%)
0       1       < 0.1%
1       1       < 0.1%
2       1       < 0.1%
3       1       < 0.1%
4       1       < 0.1%
8       1       < 0.1%
9       1       < 0.1%
12      1       < 0.1%
13      1       < 0.1%
14      1       < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
99999   1       < 0.1%
99998   1       < 0.1%
99996   1       < 0.1%
99995   1       < 0.1%
99993   1       < 0.1%
99992   1       < 0.1%
99991   1       < 0.1%
99990   1       < 0.1%
99988   1       < 0.1%
99986   1       < 0.1%
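
Several of the derived statistics for id can be recomputed from the others, which is a quick sanity check on a report like this one. The sketch below does so using only the figures copied from the id statistics; the variable names are illustrative.

```python
import math

# Figures copied from the id statistics above.
mean = 49972.4199
std = 28851.30232
q1, q3 = 25006.75, 74889.25
minimum, maximum = 0, 99999

cv = std / mean                   # coefficient of variation
iqr = q3 - q1                     # interquartile range
value_range = maximum - minimum   # range
variance = std ** 2               # variance from the standard deviation

print(f"CV={cv:.10f} IQR={iqr} range={value_range} variance={variance:.1f}")
```

Each recomputed value agrees with the reported one up to the rounding of the inputs.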

age
Real number (ℝ≥0)

Distinct: 8076
Distinct (%): 11.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 19468.86581
Minimum: 10798
Maximum: 23713
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 547.0 KiB

Quantile statistics

Minimum: 10798
5th percentile: 15069
Q1: 17664
Median: 19703
Q3: 21327
95th percentile: 23259
Maximum: 23713
Range: 12915
Interquartile range (IQR): 3663

Descriptive statistics

Standard deviation: 2467.251667
Coefficient of variation (CV): 0.1267280637
Kurtosis: -0.8234468445
Mean: 19468.86581
Median Absolute Deviation (MAD): 1711
Skewness: -0.3070553957
Sum: 1362820607
Variance: 6087330.79
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                 Count   Frequency (%)
19741                 32      < 0.1%
18236                 32      < 0.1%
20376                 31      < 0.1%
20442                 31      < 0.1%
18253                 31      < 0.1%
20457                 30      < 0.1%
21159                 30      < 0.1%
21892                 30      < 0.1%
20464                 30      < 0.1%
18184                 30      < 0.1%
Other values (8066)   69693   99.6%

Minimum 10 values

Value   Count   Frequency (%)
10798   1       < 0.1%
10859   1       < 0.1%
10878   1       < 0.1%
10964   1       < 0.1%
14275   1       < 0.1%
14277   1       < 0.1%
14282   1       < 0.1%
14284   1       < 0.1%
14287   1       < 0.1%
14291   3       < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
23713   1       < 0.1%
23701   1       < 0.1%
23692   1       < 0.1%
23690   1       < 0.1%
23687   1       < 0.1%
23684   1       < 0.1%
23678   1       < 0.1%
23677   1       < 0.1%
23675   2       < 0.1%
23673   2       < 0.1%
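
The report does not state the unit of age, but the range 10798 to 23713 would be consistent with age recorded in days. Under that assumption (not confirmed by the report), a minimal sketch of the conversion to years is shown below; `days_to_years` and `DAYS_PER_YEAR` are hypothetical names.

```python
DAYS_PER_YEAR = 365.25  # assumption: average length of a year including leap days


def days_to_years(days):
    """Convert an age in days to years, assuming the unit really is days."""
    return days / DAYS_PER_YEAR


# Minimum, median and maximum copied from the quantile statistics above.
for label, days in [("minimum", 10798), ("median", 19703), ("maximum", 23713)]:
    print(f"{label}: {days_to_years(days):.1f} years")
```

If the assumption holds, the observations span roughly three and a half decades of adult ages.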

gender
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 MiB
1: 45530
2: 24470

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 70000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 1
3rd row: 1
4th row: 2
5th row: 1

Common Values

Value   Count   Frequency (%)
1       45530   65.0%
2       24470   35.0%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
1       45530   65.0%
2       24470   35.0%

Most occurring characters

Value   Count   Frequency (%)
1       45530   65.0%
2       24470   35.0%

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   70000   100.0%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
1       45530   65.0%
2       24470   35.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   70000   100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
1       45530   65.0%
2       24470   35.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   70000   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
1       45530   65.0%
2       24470   35.0%

height
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 109
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 164.3592286
Minimum: 55
Maximum: 250
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 547.0 KiB

Quantile statistics

Minimum: 55
5th percentile: 152
Q1: 159
Median: 165
Q3: 170
95th percentile: 178
Maximum: 250
Range: 195
Interquartile range (IQR): 11

Descriptive statistics

Standard deviation: 8.210126365
Coefficient of variation (CV): 0.04995232964
Kurtosis: 7.943652579
Mean: 164.3592286
Median Absolute Deviation (MAD): 5
Skewness: -0.6421874522
Sum: 11505146
Variance: 67.40617492
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value               Count   Frequency (%)
165                 5853    8.4%
160                 5022    7.2%
170                 4679    6.7%
168                 4399    6.3%
164                 3396    4.9%
158                 3313    4.7%
162                 3257    4.7%
169                 2791    4.0%
156                 2755    3.9%
167                 2538    3.6%
Other values (99)   31997   45.7%

Minimum 10 values

Value   Count   Frequency (%)
55      1       < 0.1%
57      1       < 0.1%
59      1       < 0.1%
60      1       < 0.1%
64      1       < 0.1%
65      2       < 0.1%
66      1       < 0.1%
67      3       < 0.1%
68      2       < 0.1%
70      3       < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
250     1       < 0.1%
207     1       < 0.1%
200     1       < 0.1%
198     14      < 0.1%
197     4       < 0.1%
196     6       < 0.1%
195     6       < 0.1%
194     2       < 0.1%
193     6       < 0.1%
192     12      < 0.1%

weight
Real number (ℝ≥0)

Distinct: 287
Distinct (%): 0.4%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 74.20569
Minimum: 10
Maximum: 200
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 547.0 KiB

Quantile statistics

Minimum: 10
5th percentile: 55
Q1: 65
Median: 72
Q3: 82
95th percentile: 100
Maximum: 200
Range: 190
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 14.39575668
Coefficient of variation (CV): 0.1939980166
Kurtosis: 2.58682545
Mean: 74.20569
Median Absolute Deviation (MAD): 8
Skewness: 1.012070108
Sum: 5194398.3
Variance: 207.2378103
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count   Frequency (%)
65                   3850    5.5%
70                   3764    5.4%
68                   2831    4.0%
75                   2740    3.9%
60                   2710    3.9%
80                   2625    3.8%
72                   2303    3.3%
69                   2195    3.1%
78                   2090    3.0%
74                   1867    2.7%
Other values (277)   43025   61.5%

Minimum 10 values

Value   Count   Frequency (%)
10      1       < 0.1%
11      1       < 0.1%
21      1       < 0.1%
22      1       < 0.1%
23      1       < 0.1%
28      1       < 0.1%
29      1       < 0.1%
30      3       < 0.1%
31      1       < 0.1%
32      3       < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
200     2       < 0.1%
183     1       < 0.1%
181     1       < 0.1%
180     4       < 0.1%
178     3       < 0.1%
177     1       < 0.1%
175     1       < 0.1%
172     1       < 0.1%
171     1       < 0.1%
170     3       < 0.1%

ap_hi
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 153
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 128.8172857
Minimum: -150
Maximum: 16020
Zeros: 0
Zeros (%): 0.0%
Negative: 7
Negative (%): < 0.1%
Memory size: 547.0 KiB

Quantile statistics

Minimum: -150
5th percentile: 100
Q1: 120
Median: 120
Q3: 140
95th percentile: 160
Maximum: 16020
Range: 16170
Interquartile range (IQR): 20

Descriptive statistics

Standard deviation: 154.0114195
Coefficient of variation (CV): 1.19558038
Kurtosis: 7580.074738
Mean: 128.8172857
Median Absolute Deviation (MAD): 10
Skewness: 85.29621386
Sum: 9017210
Variance: 23719.51732
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count   Frequency (%)
120                  27699   39.6%
140                  9506    13.6%
130                  8961    12.8%
110                  8644    12.3%
150                  4450    6.4%
160                  3036    4.3%
100                  2581    3.7%
90                   982     1.4%
170                  717     1.0%
180                  695     1.0%
Other values (143)   2729    3.9%

Minimum 10 values

Value   Count   Frequency (%)
-150    1       < 0.1%
-140    1       < 0.1%
-120    2       < 0.1%
-115    1       < 0.1%
-100    2       < 0.1%
1       2       < 0.1%
7       1       < 0.1%
10      7       < 0.1%
11      28      < 0.1%
12      76      0.1%

Maximum 10 values

Value   Count   Frequency (%)
16020   1       < 0.1%
14020   4       < 0.1%
13010   2       < 0.1%
11500   1       < 0.1%
11020   1       < 0.1%
2000    1       < 0.1%
1620    1       < 0.1%
1500    1       < 0.1%
1420    2       < 0.1%
1409    1       < 0.1%
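
The extreme-value tables for ap_hi contain negative readings and readings in the thousands, which is what drives the skewness warning. A minimal sketch of a range check over those listed values follows. The bounds are illustrative assumptions rather than values taken from the report, and `is_plausible_reading` is a hypothetical helper.

```python
# (value, count) pairs copied from the two extreme-value tables above.
lowest = [(-150, 1), (-140, 1), (-120, 2), (-115, 1), (-100, 2),
          (1, 2), (7, 1), (10, 7), (11, 28), (12, 76)]
highest = [(16020, 1), (14020, 4), (13010, 2), (11500, 1), (11020, 1),
           (2000, 1), (1620, 1), (1500, 1), (1420, 2), (1409, 1)]


def is_plausible_reading(value, low=60, high=250):
    """Return True when value lies inside the assumed plausibility bounds."""
    return low <= value <= high


# Count the observations among the listed extremes that fail the check.
flagged = sum(count for value, count in lowest + highest
              if not is_plausible_reading(value))
print(flagged)
```

Because only the twenty listed extreme values are inspected, the count is a lower bound on how many observations such a check would flag in the full column.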

ap_lo
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
SKEWED

Distinct: 157
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 96.63041429
Minimum: -70
Maximum: 11000
Zeros: 21
Zeros (%): < 0.1%
Negative: 1
Negative (%): < 0.1%
Memory size: 547.0 KiB

Quantile statistics

Minimum: -70
5th percentile: 70
Q1: 80
Median: 80
Q3: 90
95th percentile: 100
Maximum: 11000
Range: 11070
Interquartile range (IQR): 10

Descriptive statistics

Standard deviation: 188.4725303
Coefficient of variation (CV): 1.950447296
Kurtosis: 1425.914585
Mean: 96.63041429
Median Absolute Deviation (MAD): 1
Skewness: 32.11408283
Sum: 6764129
Variance: 35521.89468
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values

Value                Count   Frequency (%)
80                   34847   49.8%
90                   14316   20.5%
70                   10245   14.6%
100                  4082    5.8%
60                   2727    3.9%
1000                 666     1.0%
110                  401     0.6%
79                   357     0.5%
85                   290     0.4%
75                   211     0.3%
Other values (147)   1858    2.7%

Minimum 10 values

Value   Count   Frequency (%)
-70     1       < 0.1%
0       21      < 0.1%
1       1       < 0.1%
6       2       < 0.1%
7       2       < 0.1%
8       2       < 0.1%
9       1       < 0.1%
10      7       < 0.1%
15      1       < 0.1%
20      15      < 0.1%

Maximum 10 values

Value   Count   Frequency (%)
11000   1       < 0.1%
10000   3       < 0.1%
9800    1       < 0.1%
9100    1       < 0.1%
9011    2       < 0.1%
8500    1       < 0.1%
8200    1       < 0.1%
8100    1       < 0.1%
8099    3       < 0.1%
8079    1       < 0.1%
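
The skewness values reported for ap_hi and ap_lo are large because a handful of extreme observations dominate the third moment. The sketch below illustrates this with the population form of the Fisher-Pearson coefficient, γ1 = m3 / m2^(3/2). The report's own estimator may apply a sample-size correction, so this is an illustration of the effect, not a reproduction of the reported numbers; the data are made up for the example.

```python
def skewness(values):
    """Population Fisher-Pearson skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical, symmetric readings around 80 (not taken from the dataset).
typical = [70, 80, 80, 80, 90] * 20
# The same readings plus one extreme value of the size seen in the ap_lo table.
with_outlier = typical + [11000]

print(skewness(typical))
print(skewness(with_outlier))
```

A single extreme value among a hundred symmetric readings is enough to move the coefficient from zero to a large positive number.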

cholesterol
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 MiB
1: 52385
2: 9549
3: 8066

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 70000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 3
3rd row: 3
4th row: 1
5th row: 1

Common Values

Value   Count   Frequency (%)
1       52385   74.8%
2       9549    13.6%
3       8066    11.5%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
1       52385   74.8%
2       9549    13.6%
3       8066    11.5%

Most occurring characters

Value   Count   Frequency (%)
1       52385   74.8%
2       9549    13.6%
3       8066    11.5%

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   70000   100.0%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
1       52385   74.8%
2       9549    13.6%
3       8066    11.5%

Most occurring scripts

Value    Count   Frequency (%)
Common   70000   100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
1       52385   74.8%
2       9549    13.6%
3       8066    11.5%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   70000   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
1       52385   74.8%
2       9549    13.6%
3       8066    11.5%

gluc
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 MiB
1: 59479
3: 5331
2: 5190

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 70000
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1

Common Values

Value   Count   Frequency (%)
1       59479   85.0%
3       5331    7.6%
2       5190    7.4%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
1       59479   85.0%
3       5331    7.6%
2       5190    7.4%

Most occurring characters

Value   Count   Frequency (%)
1       59479   85.0%
3       5331    7.6%
2       5190    7.4%

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   70000   100.0%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
1       59479   85.0%
3       5331    7.6%
2       5190    7.4%

Most occurring scripts

Value    Count   Frequency (%)
Common   70000   100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
1       59479   85.0%
3       5331    7.6%
2       5190    7.4%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   70000   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
1       59479   85.0%
3       5331    7.6%
2       5190    7.4%

smoke
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 MiB
0: 63831
1: 6169

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 70000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       63831   91.2%
1       6169    8.8%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
0       63831   91.2%
1       6169    8.8%

Most occurring characters

Value   Count   Frequency (%)
0       63831   91.2%
1       6169    8.8%

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   70000   100.0%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       63831   91.2%
1       6169    8.8%

Most occurring scripts

Value    Count   Frequency (%)
Common   70000   100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
0       63831   91.2%
1       6169    8.8%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   70000   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
0       63831   91.2%
1       6169    8.8%

alco
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 MiB
0: 66236
1: 3764

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 70000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value   Count   Frequency (%)
0       66236   94.6%
1       3764    5.4%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
0       66236   94.6%
1       3764    5.4%

Most occurring characters

Value   Count   Frequency (%)
0       66236   94.6%
1       3764    5.4%

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   70000   100.0%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       66236   94.6%
1       3764    5.4%

Most occurring scripts

Value    Count   Frequency (%)
Common   70000   100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
0       66236   94.6%
1       3764    5.4%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   70000   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
0       66236   94.6%
1       3764    5.4%

active
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 MiB
1: 56261
0: 13739

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 70000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 0
4th row: 1
5th row: 0

Common Values

Value   Count   Frequency (%)
1       56261   80.4%
0       13739   19.6%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
1       56261   80.4%
0       13739   19.6%

Most occurring characters

Value   Count   Frequency (%)
1       56261   80.4%
0       13739   19.6%

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   70000   100.0%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
1       56261   80.4%
0       13739   19.6%

Most occurring scripts

Value    Count   Frequency (%)
Common   70000   100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
1       56261   80.4%
0       13739   19.6%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   70000   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
1       56261   80.4%
0       13739   19.6%

cardio
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 3.9 MiB
0: 35021
1: 34979

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 70000
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

Value   Count   Frequency (%)
0       35021   50.0%
1       34979   50.0%

Length

Histogram of lengths of the category

Pie chart

Value   Count   Frequency (%)
0       35021   50.0%
1       34979   50.0%

Most occurring characters

Value   Count   Frequency (%)
0       35021   50.0%
1       34979   50.0%

Most occurring categories

Value            Count   Frequency (%)
Decimal Number   70000   100.0%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       35021   50.0%
1       34979   50.0%

Most occurring scripts

Value    Count   Frequency (%)
Common   70000   100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
0       35021   50.0%
1       34979   50.0%

Most occurring blocks

Value   Count   Frequency (%)
ASCII   70000   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
0       35021   50.0%
1       34979   50.0%

Interactions

[Figures: 36 interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that, for a linear relationship, the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
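
The calculation described above can be sketched in a few lines of plain Python. The function name `pearson_r` is illustrative, and the inputs are the ap_hi and ap_lo values from the ten "First rows" of the sample, so the result says nothing about the full dataset.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


# ap_hi and ap_lo from the ten first rows of the sample.
ap_hi = [110, 140, 130, 150, 100, 120, 130, 130, 110, 110]
ap_lo = [80, 90, 70, 100, 60, 80, 80, 90, 70, 60]

r = pearson_r(ap_hi, ap_lo)
# Shifting and rescaling one variable leaves r unchanged.
r_rescaled = pearson_r(ap_hi, [2 * y + 5 for y in ap_lo])
print(r, r_rescaled)
```

The second call illustrates the invariance under changes in location and scale mentioned above.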

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at capturing nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
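
A minimal sketch of that recipe: rank each variable, then apply the Pearson formula to the ranks. Ties receive the average of the ranks they span, which is a common convention and an assumption here, since the report does not say how ties are handled. The helper names are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they occupy."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / n)
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / n)
    return cov / (sx * sy)


def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# A monotonic but nonlinear relationship (made-up data).
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys), pearson_r(xs, ys))
```

For this monotonic but nonlinear example, ρ reaches its maximum while r stays below it.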

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
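
Taken literally, the definition above is the variant usually called tau-a. The sketch below implements exactly that. Statistical libraries often default to a tie-adjusted variant (tau-b), so values computed this way may differ from the report's figures when the data contain ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n = len(xs)
    return (concordant - discordant) / (n * (n - 1) / 2)


# Made-up data: two concordant pairs and one discordant pair out of three.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```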

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available in the Phik project documentation.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proved to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
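
A minimal sketch of the bias-corrected estimator follows, written from the formula as commonly stated for Bergsma's correction: φ² = χ²/n is reduced by (k-1)(r-1)/(n-1) and floored at zero, and the row and column counts are shrunk in the same spirit. The function name and the contingency tables are hypothetical, and the code assumes every row and column total is positive.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    r = len(table)
    k = len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    k_corr = k - (k - 1) ** 2 / (n - 1)
    r_corr = r - (r - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


# Hypothetical tables: exact independence and perfect association.
independent = [[10, 20], [20, 40]]
perfect = [[50, 0], [0, 50]]
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```

The two tables exercise the endpoints of the range described above.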

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.

Sample

First rows

         id    age  gender  height  weight  ap_hi  ap_lo  cholesterol  gluc  smoke  alco  active  cardio
0         0  18393       2     168    62.0    110     80            1     1      0     0       1       0
1         1  20228       1     156    85.0    140     90            3     1      0     0       1       1
2         2  18857       1     165    64.0    130     70            3     1      0     0       0       1
3         3  17623       2     169    82.0    150    100            1     1      0     0       1       1
4         4  17474       1     156    56.0    100     60            1     1      0     0       0       0
5         8  21914       1     151    67.0    120     80            2     2      0     0       0       0
6         9  22113       1     157    93.0    130     80            3     1      0     0       1       0
7        12  22584       2     178    95.0    130     90            3     3      0     0       1       1
8        13  17668       1     158    71.0    110     70            1     1      0     0       1       0
9        14  19834       1     164    68.0    110     60            1     1      0     0       0       0

Last rows

         id    age  gender  height  weight  ap_hi  ap_lo  cholesterol  gluc  smoke  alco  active  cardio
69990 99986  15094       1     168    72.0    110     70            1     1      0     0       1       1
69991 99988  20609       1     159    72.0    130     90            2     2      0     0       1       0
69992 99990  18792       1     161    56.0    170     90            1     1      0     0       1       1
69993 99991  19699       1     172    70.0    130     90            1     1      0     0       1       1
69994 99992  21074       1     165    80.0    150     80            1     1      0     0       1       1
69995 99993  19240       2     168    76.0    120     80            1     1      1     0       1       0
69996 99995  22601       1     158   126.0    140     90            2     2      0     0       1       1
69997 99996  19066       2     183   105.0    180     90            3     1      0     1       0       1
69998 99998  22431       1     163    72.0    135     80            1     2      0     0       0       1
69999 99999  20540       1     170    72.0    120     80            2     1      0     0       1       0