Overview

Dataset statistics

Number of variables: 9
Number of observations: 22750
Missing cells: 4622
Missing cells (%): 2.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.9 MiB
Average record size in memory: 364.3 B
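
The missing-cell figures can be cross-checked against the per-variable missing counts listed under Warnings. The following is a minimal sketch of that arithmetic; the variable names are illustrative and are not part of the report.

```python
# Figures copied from the Overview and Warnings sections of this report.
n_variables = 9
n_observations = 22750
missing_per_variable = {
    "Resource Allocation": 1381,
    "Mental Fatigue Score": 2117,
    "Burn Rate": 1124,
}

# Every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_cells = sum(missing_per_variable.values())
missing_pct = 100 * missing_cells / total_cells

print(f"total cells:   {total_cells}")
print(f"missing cells: {missing_cells}")
print(f"missing (%):   {missing_pct:.1f}%")
```

The three per-variable counts add up to the reported 4622 missing cells, which is about 2.3% of all cells.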

Variable types

Categorical: 4
Boolean: 1
Numeric: 4

Warnings

Employee ID has a high cardinality: 22750 distinct values [High cardinality]
Date of Joining has a high cardinality: 366 distinct values [High cardinality]
Designation is highly correlated with Resource Allocation and 2 other fields [High correlation]
Resource Allocation is highly correlated with Designation and 2 other fields [High correlation]
Mental Fatigue Score is highly correlated with Designation and 2 other fields [High correlation]
Burn Rate is highly correlated with Designation and 2 other fields [High correlation]
Resource Allocation has 1381 (6.1%) missing values [Missing]
Mental Fatigue Score has 2117 (9.3%) missing values [Missing]
Burn Rate has 1124 (4.9%) missing values [Missing]
Employee ID is uniformly distributed [Uniform]
Employee ID has unique values [Unique]
Designation has 1507 (6.6%) zeros [Zeros]
Burn Rate has 272 (1.2%) zeros [Zeros]

Reproduction

Analysis started: 2021-07-28 00:04:11.829279
Analysis finished: 2021-07-28 00:04:19.744020
Duration: 7.91 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json

Variables

Employee ID
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 22750
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 1.7 MiB

Value                      Count
fffe33003300310037003700   1
fffe3100340038003300       1
fffe31003800360037003700   1
fffe32003200300033003600   1
fffe31003800390038003800   1
Other values (22745)       22745

Length

Max length: 24
Median length: 24
Mean length: 22.73810989
Min length: 8

Characters and Unicode

Total characters: 517292
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 22750
Unique (%): 100.0%

Sample

1st row: fffe32003000360033003200
2nd row: fffe3700360033003500
3rd row: fffe31003300320037003900
4th row: fffe32003400380032003900
5th row: fffe31003900340031003600
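
The sampled Employee ID values look like hex dumps of UTF-16 little-endian text with a leading byte-order mark (fffe). This is an inference from the values shown, not something the report states. Under that assumption, a minimal sketch for recovering the underlying text could look as follows; `decode_employee_id` is a hypothetical helper name.

```python
def decode_employee_id(hex_id: str) -> str:
    """Decode a hex string assumed to hold BOM-prefixed UTF-16 text."""
    raw = bytes.fromhex(hex_id)
    # The "utf-16" codec reads the leading BOM (ff fe = little-endian) and drops it.
    return raw.decode("utf-16")


# Values copied from the Sample rows of the Employee ID variable.
samples = [
    "fffe32003000360033003200",  # 1st row
    "fffe3700360033003500",      # 2nd row
]
for hex_id in samples:
    print(hex_id, "->", decode_employee_id(hex_id))
```

This reading is consistent with the character counts in this section: each ID would contribute the character f three times and e once, matching 68250 = 3 × 22750 and 22750 respectively.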

Common Values

Value                      Count   Frequency (%)
fffe33003300310037003700   1       < 0.1%
fffe3100340038003300       1       < 0.1%
fffe31003800360037003700   1       < 0.1%
fffe32003200300033003600   1       < 0.1%
fffe31003800390038003800   1       < 0.1%
fffe31003000340031003200   1       < 0.1%
fffe31003500380033003300   1       < 0.1%
fffe33003400320031003900   1       < 0.1%
fffe31003800300037003800   1       < 0.1%
fffe32003700310032003800   1       < 0.1%
Other values (22740)       22740   > 99.9%

Length

[Figure: Histogram of lengths of the category]

Most occurring characters

Value              Count    Frequency (%)
0                  221917   42.9%
3                  119209   23.0%
f                  68250    13.2%
e                  22750    4.4%
2                  15983    3.1%
1                  15866    3.1%
4                  9422     1.8%
9                  8827     1.7%
5                  8825     1.7%
7                  8774     1.7%
Other values (2)   17469    3.4%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     426292   82.4%
Lowercase Letter   91000    17.6%

Most frequent character per category

Decimal Number
Value   Count    Frequency (%)
0       221917   52.1%
3       119209   28.0%
2       15983    3.7%
1       15866    3.7%
4       9422     2.2%
9       8827     2.1%
5       8825     2.1%
7       8774     2.1%
6       8741     2.1%
8       8728     2.0%

Lowercase Letter
Value   Count    Frequency (%)
f       68250    75.0%
e       22750    25.0%

Most occurring scripts

Value    Count    Frequency (%)
Common   426292   82.4%
Latin    91000    17.6%

Most frequent character per script

Common
Value   Count    Frequency (%)
0       221917   52.1%
3       119209   28.0%
2       15983    3.7%
1       15866    3.7%
4       9422     2.2%
9       8827     2.1%
5       8825     2.1%
7       8774     2.1%
6       8741     2.1%
8       8728     2.0%

Latin
Value   Count    Frequency (%)
f       68250    75.0%
e       22750    25.0%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   517292   100.0%

Most frequent character per block

ASCII
Value              Count    Frequency (%)
0                  221917   42.9%
3                  119209   23.0%
f                  68250    13.2%
e                  22750    4.4%
2                  15983    3.1%
1                  15866    3.1%
4                  9422     1.8%
9                  8827     1.7%
5                  8825     1.7%
7                  8774     1.7%
Other values (2)   17469    3.4%

Date of Joining
Categorical

HIGH CARDINALITY

Distinct: 366
Distinct (%): 1.6%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value                Count
2008-01-06           86
2008-05-21           85
2008-02-04           82
2008-07-16           81
2008-01-29           80
Other values (361)   22336

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 227500
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2008-09-30
2nd row: 2008-11-30
3rd row: 2008-03-10
4th row: 2008-11-03
5th row: 2008-07-24

Common Values

Value                Count   Frequency (%)
2008-01-06           86      0.4%
2008-05-21           85      0.4%
2008-02-04           82      0.4%
2008-07-16           81      0.4%
2008-01-29           80      0.4%
2008-07-13           80      0.4%
2008-02-18           79      0.3%
2008-09-28           79      0.3%
2008-09-14           78      0.3%
2008-05-10           78      0.3%
Other values (356)   21942   96.4%

Length

[Figure: Histogram of lengths of the category]

Most occurring characters

Value   Count   Frequency (%)
0       73377   32.3%
-       45500   20.0%
2       36061   15.9%
8       27006   11.9%
1       19664   8.6%
3       5292    2.3%
9       4201    1.8%
6       4110    1.8%
4       4105    1.8%
5       4093    1.8%

Most occurring categories

Value              Count    Frequency (%)
Decimal Number     182000   80.0%
Dash Punctuation   45500    20.0%

Most frequent character per category

Decimal Number
Value   Count   Frequency (%)
0       73377   40.3%
2       36061   19.8%
8       27006   14.8%
1       19664   10.8%
3       5292    2.9%
9       4201    2.3%
6       4110    2.3%
4       4105    2.3%
5       4093    2.2%
7       4091    2.2%

Dash Punctuation
Value   Count   Frequency (%)
-       45500   100.0%

Most occurring scripts

Value    Count    Frequency (%)
Common   227500   100.0%

Most frequent character per script

Common
Value   Count   Frequency (%)
0       73377   32.3%
-       45500   20.0%
2       36061   15.9%
8       27006   11.9%
1       19664   8.6%
3       5292    2.3%
9       4201    1.8%
6       4110    1.8%
4       4105    1.8%
5       4093    1.8%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   227500   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
0       73377   32.3%
-       45500   20.0%
2       36061   15.9%
8       27006   11.9%
1       19664   8.6%
3       5292    2.3%
9       4201    1.8%
6       4110    1.8%
4       4105    1.8%
5       4093    1.8%

Gender
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.3 MiB

Value    Count
Female   11908
Male     10842

Length

Max length: 6
Median length: 6
Mean length: 5.046857143
Min length: 4

Characters and Unicode

Total characters: 114816
Distinct characters: 6
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Female
2nd row: Male
3rd row: Female
4th row: Male
5th row: Female

Common Values

Value    Count   Frequency (%)
Female   11908   52.3%
Male     10842   47.7%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Value    Count   Frequency (%)
female   11908   52.3%
male     10842   47.7%

Most occurring characters

Value   Count   Frequency (%)
e       34658   30.2%
a       22750   19.8%
l       22750   19.8%
F       11908   10.4%
m       11908   10.4%
M       10842   9.4%

Most occurring categories

Value              Count   Frequency (%)
Lowercase Letter   92066   80.2%
Uppercase Letter   22750   19.8%

Most frequent character per category

Lowercase Letter
Value   Count   Frequency (%)
e       34658   37.6%
a       22750   24.7%
l       22750   24.7%
m       11908   12.9%

Uppercase Letter
Value   Count   Frequency (%)
F       11908   52.3%
M       10842   47.7%

Most occurring scripts

Value   Count    Frequency (%)
Latin   114816   100.0%

Most frequent character per script

Latin
Value   Count   Frequency (%)
e       34658   30.2%
a       22750   19.8%
l       22750   19.8%
F       11908   10.4%
m       11908   10.4%
M       10842   9.4%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   114816   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
e       34658   30.2%
a       22750   19.8%
l       22750   19.8%
F       11908   10.4%
m       11908   10.4%
M       10842   9.4%

Company Type
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.4 MiB

Value     Count
Service   14833
Product   7917

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 159250
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1

The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Service
2nd row: Service
3rd row: Product
4th row: Service
5th row: Service

Common Values

Value     Count   Frequency (%)
Service   14833   65.2%
Product   7917    34.8%

Length

[Figure: Histogram of lengths of the category]

Pie chart

[Figure: pie chart]

Value     Count   Frequency (%)
service   14833   65.2%
product   7917    34.8%

Most occurring characters

Value   Count   Frequency (%)
e       29666   18.6%
r       22750   14.3%
c       22750   14.3%
S       14833   9.3%
v       14833   9.3%
i       14833   9.3%
P       7917    5.0%
o       7917    5.0%
d       7917    5.0%
u       7917    5.0%

Most occurring categories

Value              Count    Frequency (%)
Lowercase Letter   136500   85.7%
Uppercase Letter   22750    14.3%

Most frequent character per category

Lowercase Letter
Value   Count   Frequency (%)
e       29666   21.7%
r       22750   16.7%
c       22750   16.7%
v       14833   10.9%
i       14833   10.9%
o       7917    5.8%
d       7917    5.8%
u       7917    5.8%
t       7917    5.8%

Uppercase Letter
Value   Count   Frequency (%)
S       14833   65.2%
P       7917    34.8%

Most occurring scripts

Value   Count    Frequency (%)
Latin   159250   100.0%

Most frequent character per script

Latin
Value   Count   Frequency (%)
e       29666   18.6%
r       22750   14.3%
c       22750   14.3%
S       14833   9.3%
v       14833   9.3%
i       14833   9.3%
P       7917    5.0%
o       7917    5.0%
d       7917    5.0%
u       7917    5.0%

Most occurring blocks

Value   Count    Frequency (%)
ASCII   159250   100.0%

Most frequent character per block

ASCII
Value   Count   Frequency (%)
e       29666   18.6%
r       22750   14.3%
c       22750   14.3%
S       14833   9.3%
v       14833   9.3%
i       14833   9.3%
P       7917    5.0%
o       7917    5.0%
d       7917    5.0%
u       7917    5.0%

WFH Setup Available
Boolean

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 22.3 KiB

Value   Count   Frequency (%)
True    12290   54.0%
False   10460   46.0%

Designation
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.178725275
Minimum: 0
Maximum: 5
Zeros: 1507
Zeros (%): 6.6%
Negative: 0
Negative (%): 0.0%
Memory size: 177.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 1
Median: 2
Q3: 3
95th percentile: 4
Maximum: 5
Range: 5
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.135144694
Coefficient of variation (CV): 0.5210132306
Kurtosis: -0.4149159961
Mean: 2.178725275
Median Absolute Deviation (MAD): 1
Skewness: 0.09242138479
Sum: 49566
Variance: 1.288553476
Monotonicity: Not monotonic
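
Several of the Designation statistics above are simple functions of one another. The following sketch recomputes them from the reported figures as a consistency check. It assumes the standard definitions (CV as standard deviation over mean, variance as squared standard deviation), which the report itself does not spell out.

```python
# Values copied from the Designation statistics above.
n_observations = 22750  # Designation has no missing values
total = 49566           # Sum
mean = 2.178725275
std = 1.135144694
q1, q3 = 1, 3
minimum, maximum = 0, 5

derived = {
    "mean = sum / n": total / n_observations,
    "variance = std^2": std ** 2,
    "CV = std / mean": std / mean,
    "IQR = Q3 - Q1": q3 - q1,
    "range = max - min": maximum - minimum,
}
for name, value in derived.items():
    print(f"{name}: {value:.10g}")
```

Each derived value agrees with the corresponding reported statistic to the precision shown in the report.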
[Figure: Histogram with fixed size bins (bins=6)]

Values by frequency

Value   Count   Frequency (%)
2       7588    33.4%
3       5985    26.3%
1       4881    21.5%
4       2391    10.5%
0       1507    6.6%
5       398     1.7%

Values in ascending order

Value   Count   Frequency (%)
0       1507    6.6%
1       4881    21.5%
2       7588    33.4%
3       5985    26.3%
4       2391    10.5%
5       398     1.7%

Values in descending order

Value   Count   Frequency (%)
5       398     1.7%
4       2391    10.5%
3       5985    26.3%
2       7588    33.4%
1       4881    21.5%
0       1507    6.6%

Resource Allocation
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 10
Distinct (%): < 0.1%
Missing: 1381
Missing (%): 6.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.481398287
Minimum: 1
Maximum: 10
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 177.9 KiB

Quantile statistics

Minimum: 1
5th percentile: 1
Q1: 3
Median: 4
Q3: 6
95th percentile: 8
Maximum: 10
Range: 9
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.047211121
Coefficient of variation (CV): 0.4568241851
Kurtosis: -0.4798840553
Mean: 4.481398287
Median Absolute Deviation (MAD): 1
Skewness: 0.2045727345
Sum: 95763
Variance: 4.191073372
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=10)]

Values by frequency

Value       Count   Frequency (%)
4           3893    17.1%
5           3861    17.0%
3           3192    14.0%
6           2943    12.9%
2           2075    9.1%
7           1965    8.6%
1           1791    7.9%
8           1044    4.6%
9           446     2.0%
10          159     0.7%
(Missing)   1381    6.1%

Values in ascending order

Value   Count   Frequency (%)
1       1791    7.9%
2       2075    9.1%
3       3192    14.0%
4       3893    17.1%
5       3861    17.0%
6       2943    12.9%
7       1965    8.6%
8       1044    4.6%
9       446     2.0%
10      159     0.7%

Values in descending order

Value   Count   Frequency (%)
10      159     0.7%
9       446     2.0%
8       1044    4.6%
7       1965    8.6%
6       2943    12.9%
5       3861    17.0%
4       3893    17.1%
3       3192    14.0%
2       2075    9.1%
1       1791    7.9%

Mental Fatigue Score
Real number (ℝ≥0)

HIGH CORRELATION
MISSING

Distinct: 101
Distinct (%): 0.5%
Missing: 2117
Missing (%): 9.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.728187854
Minimum: 0
Maximum: 10
Zeros: 171
Zeros (%): 0.8%
Negative: 0
Negative (%): 0.0%
Memory size: 177.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 2.3
Q1: 4.6
Median: 5.9
Q3: 7.1
95th percentile: 8.7
Maximum: 10
Range: 10
Interquartile range (IQR): 2.5

Descriptive statistics

Standard deviation: 1.920838688
Coefficient of variation (CV): 0.3353309523
Kurtosis: 0.17427663
Mean: 5.728187854
Median Absolute Deviation (MAD): 1.2
Skewness: -0.4308950579
Sum: 118189.7
Variance: 3.689621265
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Values by frequency

Value               Count   Frequency (%)
6                   470     2.1%
5.8                 464     2.0%
5.9                 458     2.0%
6.1                 457     2.0%
6.3                 454     2.0%
6.5                 448     2.0%
5.7                 448     2.0%
5.6                 441     1.9%
6.4                 432     1.9%
6.7                 428     1.9%
Other values (91)   16133   70.9%
(Missing)           2117    9.3%

Values in ascending order

Value   Count   Frequency (%)
0       171     0.8%
0.1     17      0.1%
0.2     23      0.1%
0.3     13      0.1%
0.4     19      0.1%
0.5     24      0.1%
0.6     25      0.1%
0.7     27      0.1%
0.8     28      0.1%
0.9     38      0.2%

Values in descending order

Value   Count   Frequency (%)
10      119     0.5%
9.9     27      0.1%
9.8     41      0.2%
9.7     40      0.2%
9.6     32      0.1%
9.5     59      0.3%
9.4     54      0.2%
9.3     67      0.3%
9.2     69      0.3%
9.1     84      0.4%

Burn Rate
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 101
Distinct (%): 0.5%
Missing: 1124
Missing (%): 4.9%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.4520054564
Minimum: 0
Maximum: 1
Zeros: 272
Zeros (%): 1.2%
Negative: 0
Negative (%): 0.0%
Memory size: 177.9 KiB

Quantile statistics

Minimum: 0
5th percentile: 0.12
Q1: 0.31
Median: 0.45
Q3: 0.59
95th percentile: 0.78
Maximum: 1
Range: 1
Interquartile range (IQR): 0.28

Descriptive statistics

Standard deviation: 0.1982263695
Coefficient of variation (CV): 0.4385486208
Kurtosis: -0.2615790285
Mean: 0.4520054564
Median Absolute Deviation (MAD): 0.14
Skewness: 0.04573737091
Sum: 9775.07
Variance: 0.03929369357
Monotonicity: Not monotonic
[Figure: Histogram with fixed size bins (bins=50)]

Values by frequency

Value               Count   Frequency (%)
0.47                475     2.1%
0.43                444     2.0%
0.41                434     1.9%
0.45                431     1.9%
0.5                 428     1.9%
0.48                426     1.9%
0.4                 419     1.8%
0.38                418     1.8%
0.39                414     1.8%
0.42                414     1.8%
Other values (91)   17323   76.1%
(Missing)           1124    4.9%

Values in ascending order

Value   Count   Frequency (%)
0       272     1.2%
0.01    27      0.1%
0.02    59      0.3%
0.03    56      0.2%
0.04    49      0.2%
0.05    61      0.3%
0.06    63      0.3%
0.07    57      0.3%
0.08    94      0.4%
0.09    93      0.4%

Values in descending order

Value   Count   Frequency (%)
1       85      0.4%
0.99    8       < 0.1%
0.98    18      0.1%
0.97    17      0.1%
0.96    13      0.1%
0.95    17      0.1%
0.94    22      0.1%
0.93    26      0.1%
0.92    26      0.1%
0.91    27      0.1%

Interactions

[Figures: 16 interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
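
The calculation described above can be sketched in plain Python. This is an illustrative implementation of the textbook formula, not the code pandas-profiling runs. The function name `pearson_r` is an assumption, and the sample values are the Mental Fatigue Score and Burn Rate columns of the first eight rows of the Sample section where both are present.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)


# Rows 0-7 of the "First rows" table (rows with a missing value are left out).
fatigue = [3.8, 5.0, 5.8, 2.6, 6.9, 3.6, 7.9, 4.4]
burn_rate = [0.16, 0.36, 0.49, 0.20, 0.52, 0.29, 0.62, 0.33]

r = pearson_r(fatigue, burn_rate)
# Shifting and rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([2 * x + 5 for x in fatigue], burn_rate)
print(f"r = {r:.3f}, after rescaling = {r_rescaled:.3f}")
```

Eight rows are far too few to characterise the dataset; the sketch only illustrates the formula and the invariance under changes in location and scale.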

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
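
A minimal sketch of this rank-based calculation follows. It assumes tied values receive the average of the ranks they span, a common convention that the text above does not specify. The helper names `average_ranks` and `spearman_rho` are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Pearson correlation computed on the rank variables of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# A monotonic but nonlinear relationship: the ranks agree perfectly.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys))
```

Because only the ordering of the values matters, any strictly increasing transformation of either variable leaves ρ unchanged.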

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
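
The pair-counting definition above can be sketched directly. Dividing by the total number of pairs gives the variant usually called tau-a; library implementations often default to a tie-adjusted variant instead, so results may differ when the data contain ties. The function name `kendall_tau_a` is illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables order the pair the same way
        elif sign < 0:
            discordant += 1  # the variables order the pair in opposite ways
        # sign == 0 means a tie in x or y; the pair counts as neither
    n = len(xs)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Three observations: two concordant pairs and one discordant pair.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

The quadratic loop over all pairs keeps the sketch close to the definition; it is not meant for large inputs.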

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
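
As an illustration of the bias correction, the sketch below computes Cramér's V from a contingency table using the correction commonly attributed to Bergsma (2013). It is a plain-Python sketch under that assumption, not the report's own code. The function name `cramers_v_corrected` and the example tables are hypothetical, and the function assumes no row or column of the table is entirely zero.

```python
from math import sqrt


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrected quantities: shrink phi^2 and the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


independent = [[20, 30], [40, 60]]  # rows are proportional: no association
perfect = [[50, 0], [0, 50]]        # each row determines the column
print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```

The `max(0.0, ...)` clamp is what keeps the corrected coefficient at 0 under independence, where the uncorrected estimator tends to report a small positive value.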

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly pick out patterns in data completion visually.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
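
Nullity correlation can be understood as an ordinary correlation computed on missing-value indicators rather than on the values themselves. The sketch below assumes that interpretation (Pearson correlation between 0/1 "is missing" flags). The function name and the small example columns are hypothetical and are not taken from the dataset.

```python
from math import sqrt


def nullity_correlation(col_a, col_b):
    """Pearson correlation between the missing-value indicators of two columns."""
    a = [1.0 if v is None else 0.0 for v in col_a]
    b = [1.0 if v is None else 0.0 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A column that is never (or always) missing has no defined correlation.
        return float("nan")
    return cov / sqrt(var_a * var_b)


# Hypothetical columns; None marks a missing value.
fatigue = [3.8, None, 5.8, None, 6.9, 3.6]
burn = [0.16, None, 0.49, 0.20, None, 0.29]
same_gaps = [1, None, 2, None, 3, 4]  # missing in exactly the same rows as fatigue

print(nullity_correlation(fatigue, burn))
print(nullity_correlation(fatigue, same_gaps))
```

A value near +1 means two variables tend to be missing in the same rows, a value near -1 means one tends to be present exactly when the other is missing, and columns with no missing values drop out because their indicator has no variance.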

Sample

First rows

Row    Employee ID               Date of Joining  Gender  Company Type  WFH Setup Available  Designation  Resource Allocation  Mental Fatigue Score  Burn Rate
0      fffe32003000360033003200  2008-09-30       Female  Service       No                   2.0          3.0                  3.8                   0.16
1      fffe3700360033003500      2008-11-30       Male    Service       Yes                  1.0          2.0                  5.0                   0.36
2      fffe31003300320037003900  2008-03-10       Female  Product       Yes                  2.0          NaN                  5.8                   0.49
3      fffe32003400380032003900  2008-11-03       Male    Service       Yes                  1.0          1.0                  2.6                   0.20
4      fffe31003900340031003600  2008-07-24       Female  Service       No                   3.0          7.0                  6.9                   0.52
5      fffe3300350037003500      2008-11-26       Male    Product       Yes                  2.0          4.0                  3.6                   0.29
6      fffe33003300340039003100  2008-01-02       Female  Service       No                   3.0          6.0                  7.9                   0.62
7      fffe32003600320037003400  2008-10-31       Female  Service       Yes                  2.0          4.0                  4.4                   0.33
8      fffe32003200300034003700  2008-12-27       Female  Service       No                   3.0          6.0                  NaN                   0.56
9      fffe31003600320030003200  2008-03-09       Female  Product       No                   3.0          6.0                  NaN                   0.67

Last rows

Row    Employee ID               Date of Joining  Gender  Company Type  WFH Setup Available  Designation  Resource Allocation  Mental Fatigue Score  Burn Rate
22740  fffe33003300380031003100  2008-09-05       Female  Product       No                   3.0          6.0                  7.3                   0.55
22741  fffe31003600350034003800  2008-01-07       Male    Product       No                   2.0          5.0                  6.0                   NaN
22742  fffe33003200310039003000  2008-07-28       Male    Product       No                   3.0          5.0                  8.1                   0.69
22743  fffe3300390030003600      2008-12-15       Female  Product       Yes                  1.0          3.0                  6.0                   0.48
22744  fffe32003500370033003200  2008-05-27       Male    Product       No                   3.0          7.0                  6.2                   0.54
22745  fffe31003500370039003100  2008-12-30       Female  Service       No                   1.0          3.0                  NaN                   0.41
22746  fffe33003000350031003800  2008-01-19       Female  Product       Yes                  3.0          6.0                  6.7                   0.59
22747  fffe390032003000          2008-11-05       Male    Service       Yes                  3.0          7.0                  NaN                   0.72
22748  fffe33003300320036003900  2008-01-10       Female  Service       No                   2.0          5.0                  5.9                   0.52
22749  fffe3400350031003800      2008-01-06       Male    Product       No                   3.0          6.0                  7.8                   0.61