Creating a metric based on some features


I want to create a new metric based on some features, but I don't know how to start. Basically, I want to create a "job satisfaction level" metric from features such as work hours, shift, whether the employee works on weekends, and so on. Ideally, I want to come up with a weight for each of these features, compute a final value, and then put that value into a job satisfaction level bucket. I then want to use this metric in my training model. Is there a methodology for doing this? Let's assume I have different warehouses with different values for those features, and I want to compute a "job content" or "job satisfaction" metric from the features mentioned above for all of these locations. I then want to use this newly computed metric, together with my other features, for employee resignation prediction. Any help is appreciated.

Thanks

asked Apr 1 at 15:25

Fatima

feature-engineering feature-construction


2 Answers

Indeed, there are methodologies that have been tested elsewhere, some with more success than others.

I will propose one of them for building a prediction of job satisfaction, which you can then use as an explanatory variable in a supervised model of employee resignation. You can review the methodology for that model in this tutorial with Python code that I wrote some time ago: HR analytics MVP

Methodology to generate a satisfaction level prediction: Deduce the importance of the variables from a score that represents the satisfaction declared by a subset of the members of your company

I think the best way to start a good MVP (minimum viable product), one that delivers results relatively quickly and incorporates elements specific to your company, is to derive the importance of the features from a dataset that contains your explanatory variables and a target: a score calculated from a declarative satisfaction survey given to the workers. To do this, follow these steps:

1.- Design a satisfaction survey that the workers will answer and from which you can calculate a score. The important things here are that the survey design is as complete as possible, that the number of respondents is large enough to draw statistically sound conclusions and, most importantly, that you can extract the raw data for those who answer the survey, so that you can later deduce which variables are the most relevant. Here are some resources that can give you some ideas of how to generate the satisfaction level index

2.- Then, using the dataset generated in step 1, do the feature engineering and establish which variables have the greatest impact on the satisfaction declared by the workers.

3.- Once step 2 is done, you can generate predictions of the score and apply your model to future periods and to other workers of the same company.
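The three steps above can be sketched roughly as follows. This is only an illustration, not the method from the linked tutorial: it assumes a plain ordinary least-squares fit written without external libraries, and the feature names (`weekly_hours`, `night_shift`, `works_weekends`), the survey scores, and the bucket thresholds are all made up. The synthetic scores are exactly linear in the features so that the recovered weights are easy to check.

```python
def fit_linear(X, y):
    """Ordinary least squares via the normal equations (intercept included)."""
    rows = [[1.0] + list(r) for r in X]
    n = len(rows[0])
    # Augmented matrix [A^T A | A^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(A[k][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for k in range(n):
            if k != col:
                f = A[k][col]
                A[k] = [a - f * b for a, b in zip(A[k], A[col])]
    return [A[i][n] for i in range(n)]  # [intercept, w1, w2, ...]


def predict(weights, features):
    """Predicted satisfaction score for one employee."""
    return weights[0] + sum(w * x for w, x in zip(weights[1:], features))


def bucket(score, low=4.0, high=7.0):
    """Map a score to a satisfaction bucket (thresholds are assumptions)."""
    if score < low:
        return "low"
    if score < high:
        return "medium"
    return "high"


# Step 1: surveyed employees, (weekly_hours, night_shift, works_weekends) -> score
surveyed_X = [(40, 0, 0), (50, 1, 0), (45, 0, 1), (60, 1, 1), (35, 0, 0), (55, 0, 1)]
surveyed_y = [8.0, 5.5, 6.5, 3.5, 8.5, 5.5]

# Step 2: learn how each feature moves the declared score
weights = fit_linear(surveyed_X, surveyed_y)

# Step 3: predict the score for an employee who was not surveyed
score = predict(weights, (48, 1, 1))
level = bucket(score)
```

The fitted weights are per-unit effects on the raw feature scales, so they are not directly comparable as "importance" unless the features are standardized first. The resulting `score` or `level` can then be added as a column to the resignation model.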

Important: Whenever you run the prediction for the next period, you should carry out a few satisfaction surveys in each iteration to confirm that the model is still valid, and use that data for ongoing retraining. In general, the model should remain useful as long as the company's context does not undergo major changes (mergers, significant deterioration of the work environment due to mass dismissals, etc.); in such cases you should try to capture the short- and long-term effects of these shocks.
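One simple way to run that validity check, sketched under assumptions: compare the model's predicted scores with the scores from the fresh surveys and flag retraining when the mean absolute error exceeds a tolerance. The function names, the sample values, and the tolerance of 1.0 are hypothetical; the right threshold depends on your score scale.

```python
from statistics import mean


def mean_absolute_error(predicted, surveyed):
    """Average absolute gap between predicted and freshly surveyed scores."""
    return mean(abs(p - s) for p, s in zip(predicted, surveyed))


def needs_retraining(predicted, surveyed, tolerance=1.0):
    """True when the model has drifted further than the chosen tolerance."""
    return mean_absolute_error(predicted, surveyed) > tolerance


predicted = [7.5, 5.0, 6.0]      # model output for three re-surveyed employees
stable_period = [7.0, 5.5, 6.0]  # fresh survey scores in a normal period
after_shock = [4.0, 3.0, 3.5]    # fresh survey scores after, e.g., mass dismissals

stable_flag = needs_retraining(predicted, stable_period)
shock_flag = needs_retraining(predicted, after_shock)
```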

Although this methodology is a good starting point, it omits many things that are difficult for a company to detect because they are exogenous to it, such as:

a.- The person changes their career interests and/or goals. Example: a software developer who wants to shift their career towards a more commercial role or another specialty such as Data Science or Data Engineering.

b.- The person changes their life objectives and/or how they prioritize them. Example: a person who wants to dedicate more time to their personal life after going through a crisis with their partner.

Here is an example of where they used that methodology: Mining the drivers of job satisfaction using algorithmic variable importance measures

PS: There are other lines of research that avoid deriving the satisfaction index from asking the employee directly and instead use other variables, such as equivalent income or time spent in the company, as a proxy metric. It is not my favorite approach, but here is an example: Using equivalent income as metric

answered Apr 1 at 16:38

iair linker

213

Thank you Iair. Good information. Unfortunately, 30% of my data comes from employees who have already resigned, and I cannot give them a survey to get their satisfaction level. However, it is a good idea to start collecting the data from now on to further improve my model.
– Fatima
Apr 1 at 17:28


Here are some starting points, all of which involve gathering "supervision" data.

One starting point is to gather satisfaction feedback from employees, such as "not satisfied" (-1), "average" (0), "satisfied" (1), or a 5-level score, etc. You can then tackle the problem as a classification/regression task. In the process, you will find out which metrics/combinations are important.
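As a rough sketch of the classification route: encode the survey answers as -1/0/1 and predict the label of a new employee from the most similar surveyed one. The 1-nearest-neighbour rule is just a stand-in for whatever classifier you prefer, and the features (weekly hours divided by 10, night shift, weekend work) and the three survey rows are invented for illustration.

```python
import math

# Encoding of the survey answers suggested above
LABELS = {"not satisfied": -1, "average": 0, "satisfied": 1}


def nearest_label(train, query):
    """Return the label of the training row closest to `query` (1-NN)."""
    _, label = min(train, key=lambda row: math.dist(row[0], query))
    return label


# (weekly_hours / 10, night_shift, works_weekends) -> survey answer
survey = [
    ((4.0, 0, 0), "satisfied"),
    ((4.5, 0, 1), "average"),
    ((6.0, 1, 1), "not satisfied"),
]
train = [(features, LABELS[answer]) for features, answer in survey]

# Predict the satisfaction label for an employee who was not surveyed
predicted = nearest_label(train, (5.8, 1, 1))
```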

Another starting point is to use expert knowledge. That is, experts (or any eligible person) read the employees' reports and assign a satisfaction level to each. Then proceed as in the first case.

An easier way to acquire data is in the form of comparisons rather than absolute satisfaction levels. For example, gather data in the form of "employee 1 is more satisfied than employee 2", then predict the difference between two employees, i.e. +1 or -1, using the previous steps. This way, for any given employee $e$ and a small group of representative employees $e_1$ to $e_5$ sorted by satisfaction level (as a measuring stick), you can feed each pair $(e, e_i)$ to the model to find where $e$ falls on the spectrum. For example, the 5 outputs $+1, +1, +1, -1, -1$ mean that the satisfaction level of $e$ is between $e_3$ and $e_4$. This way, you may not even need to extract the important features; just convert the output of the prediction model into a score.
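The measuring-stick idea can be sketched like this. The function `place_on_spectrum` and the comparator are hypothetical: in practice `more_satisfied` would be the trained pairwise model, whereas here it is replaced by a stand-in that compares a made-up hidden score so the example can run. The references are assumed to be sorted from least to most satisfied.

```python
def place_on_spectrum(employee, references, more_satisfied):
    """Compare `employee` against each reference and count the +1 outputs.

    `references` must be sorted from least to most satisfied.
    `more_satisfied(a, b)` returns +1 if a is more satisfied than b, else -1.
    A level of k means the employee falls between reference k and k + 1.
    """
    outputs = [more_satisfied(employee, ref) for ref in references]
    level = sum(1 for out in outputs if out == 1)
    return outputs, level


def compare_hidden_score(a, b):
    """Stand-in for the trained pairwise model (an assumption for this demo)."""
    return 1 if a["hidden_score"] > b["hidden_score"] else -1


# Five representative employees e_1..e_5, sorted by satisfaction
references = [{"name": f"e_{i}", "hidden_score": float(i)} for i in range(1, 6)]
employee = {"name": "e", "hidden_score": 3.5}

outputs, level = place_on_spectrum(employee, references, compare_hidden_score)
```

With these values the outputs match the example in the text, so `level` places $e$ between $e_3$ and $e_4$; that integer can be used directly as the satisfaction score.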

edited Apr 1 at 16:39

answered Apr 1 at 16:31

Esmailian

3,311420
