Predict how many days late or early someone will finish their work

I have a set of deadlines and people, along with a database recording when each person was given a piece of work, when they finished it, and how many days before or after the deadline that was. The work items are articles, so I also have the word count for each. Based on this historical data, how can I estimate how many days early or late somebody will most probably finish their next piece of work?

As a concrete example of the problem I'm trying to solve:

John finished his last 5 projects 5, 4, 3, 6 and 2 days late. What is the most probable number of days early or late he will finish his next one?

Basically, I'm looking for an appropriate machine learning algorithm to implement to calculate this probable end date.

edited Apr 7 at 12:48

asked Apr 7 at 10:10

GenRincewind

Very well written question, props! Roughly how many deadlines do you have per person and in total? Do you have access to other data, like a textual description of a task?
– jonnor
Apr 7 at 11:38

machine-learning time-series predictive-modeling probability markov-process

1 Answer

If we assume that task deliveries are independent of each other, and that the process does not change a lot over time (it is stationary), we can treat this as a standard regression problem.
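Under those two assumptions, the simplest possible "regression" ignores every feature and predicts a person's historical average delay. The sketch below applies that idea to the numbers from the question (John: 5, 4, 3, 6, 2 days late). The function name `baseline_delay` is made up for illustration, and this is only a reference point to beat, not a recommended final model.

```python
from statistics import mean, median, stdev

# Delays from the question's example: days late for John's last five projects.
john_delays = [5, 4, 3, 6, 2]


def baseline_delay(delays):
    """Summarise past delays as naive point estimates plus a spread."""
    return {
        "mean": mean(delays),      # naive point estimate
        "median": median(delays),  # less sensitive to one unusually late task
        "stdev": stdev(delays),    # sample standard deviation of the delays
    }


summary = baseline_delay(john_delays)
print(summary)
```

Any model built on the features discussed here should score better than this baseline on held-out tasks; if it does not, the features are probably not adding information.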

Since this is about deadlines, we expect that there might be variations over time, or patterns of delay tied to the season of the year or the day of the week. So time-based features might look something like:

|deadline_year|deadline_week_number|deadline_day_of_week|
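As a rough sketch of how these three columns could be derived, the snippet below uses only Python's standard `datetime` module. The function name `deadline_time_features` and the sample date are illustrative assumptions; the year is taken from the ISO calendar so that it stays consistent with the ISO week number around New Year.

```python
from datetime import date


def deadline_time_features(deadline):
    """Derive calendar features from a deadline date."""
    # isocalendar() yields (ISO year, ISO week number, ISO weekday).
    iso_year, iso_week, iso_weekday = deadline.isocalendar()
    return {
        "deadline_year": iso_year,
        "deadline_week_number": iso_week,
        "deadline_day_of_week": iso_weekday,  # 1 = Monday ... 7 = Sunday
    }


features = deadline_time_features(date(2019, 4, 7))
print(features)
```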

We also expect that the size of a delay might depend on the size of the task. So if you have the start date, or an estimate of the number of days needed, definitely include that. If people can have multiple tasks at the same time, include that also.

|workdays_between_start_and_deadline|workdays_estimated|concurrent_tasks|
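Two of these columns can be computed from start and deadline dates alone. The following is a minimal sketch under some assumptions: workdays are Monday to Friday with no public holidays, each task is a dict with hypothetical keys `owner`, `start` and `deadline`, and two tasks count as concurrent when their start-to-deadline windows overlap. `workdays_estimated` is an input you would already have, so it is not computed here.

```python
from datetime import date, timedelta


def workdays_between(start, deadline):
    """Count Monday-Friday days from start (inclusive) to deadline (exclusive)."""
    days = (deadline - start).days
    return sum(
        1
        for offset in range(days)
        if (start + timedelta(days=offset)).weekday() < 5  # 0-4 are Mon-Fri
    )


def concurrent_tasks(task, all_tasks):
    """Count the owner's other tasks whose time window overlaps this task's window."""
    return sum(
        1
        for other in all_tasks
        if other is not task
        and other["owner"] == task["owner"]
        and other["start"] < task["deadline"]
        and task["start"] < other["deadline"]
    )


# Made-up example tasks, for illustration only.
tasks = [
    {"owner": "John", "start": date(2019, 4, 1), "deadline": date(2019, 4, 8)},
    {"owner": "John", "start": date(2019, 4, 4), "deadline": date(2019, 4, 12)},
    {"owner": "Mary", "start": date(2019, 4, 1), "deadline": date(2019, 4, 8)},
]

print(workdays_between(tasks[0]["start"], tasks[0]["deadline"]))
print(concurrent_tasks(tasks[0], tasks))
```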

And we expect that delays may depend on the person who performs the task, and who created the task.

|task_owner|task_creator|
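Owner and creator are categorical, and most regression models need numeric inputs. One common option, which is an assumption here since no particular encoding is specified, is one-hot encoding: one 0/1 column per distinct person. A small pure-Python sketch, with a hypothetical helper `one_hot` and made-up names:

```python
def one_hot(values):
    """Turn categorical values into 0/1 indicator rows, one column per category."""
    categories = sorted(set(values))
    index = {category: i for i, category in enumerate(categories)}
    rows = []
    for value in values:
        row = [0] * len(categories)
        row[index[value]] = 1  # mark the column belonging to this value
        rows.append(row)
    return categories, rows


task_owner = ["John", "Mary", "John"]
columns, encoded = one_hot(task_owner)
print(columns)
print(encoded)
```

With many distinct people and few tasks per person, one-hot columns become sparse, so it may be worth grouping rarely seen owners into a single "other" category.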

Use Exploratory Data Analysis and your knowledge about the processes that created the data to find more of these possible relationships. Use scatterplots of each feature against the target days_delayed (negative = before the deadline, 0 = on time, positive = late).

One can start with a strong non-linear model like RandomForest. This gives estimates that can be scored (by mean squared error, for example), which indicates whether your features are predictive or not.

To get probability intervals, you can use a Bayesian model such as Bayesian Ridge Regression. This is a linear model, so you may have to spend more time on feature engineering to make the relationships between features and target (roughly) linear.
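A minimal sketch of both steps with scikit-learn, assuming it and NumPy are installed. The data is synthetic and exists only to make the example runnable: two made-up feature columns standing in for workdays_between_start_and_deadline and concurrent_tasks, and a target generated from them with noise. The 1.96 multiplier gives an approximate 95% interval only if the predictive distribution is close to Gaussian.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import BayesianRidge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (an assumption, not the asker's real data).
rng = np.random.default_rng(0)
workdays = rng.integers(1, 30, size=200)
concurrent = rng.integers(0, 5, size=200)
X = np.column_stack([workdays, concurrent])
# days_delayed: more concurrent tasks -> later, more available workdays -> earlier.
y = 2.0 * concurrent - 0.1 * workdays + rng.normal(0.0, 1.0, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Non-linear model: a quick check of whether the features carry signal.
forest = RandomForestRegressor(n_estimators=100, random_state=0)
forest.fit(X_train, y_train)
forest_mse = mean_squared_error(y_test, forest.predict(X_test))

# Reference score: always predict the mean delay seen in training.
baseline_mse = mean_squared_error(y_test, np.full_like(y_test, y_train.mean()))

# Bayesian linear model: predictive mean and standard deviation per task.
bayes = BayesianRidge()
bayes.fit(X_train, y_train)
mean_pred, std_pred = bayes.predict(X_test, return_std=True)
lower = mean_pred - 1.96 * std_pred  # approximate 95% interval bounds
upper = mean_pred + 1.96 * std_pred

print(f"forest MSE: {forest_mse:.2f}, baseline MSE: {baseline_mse:.2f}")
print(f"first task: {mean_pred[0]:.1f} days, interval [{lower[0]:.1f}, {upper[0]:.1f}]")
```

If the forest's error is not clearly below the mean-only baseline on held-out tasks, that suggests the features are not predictive yet and more feature work is needed before fitting the Bayesian model.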

edited Apr 7 at 14:48

answered Apr 7 at 12:26

jonnor

I added the additional information I possess.
– GenRincewind
Apr 7 at 12:49