Finding abnormal behavior over time
I want to detect abnormal behaviour in an oil pipe through which oil flows at a roughly constant pressure. A sensor monitors the pressure over time and pushes the readings to my cloud server.
I have the dataset. My requirement is to analyse it in Python and find abnormal patterns over time. I also need to report the abnormal behaviour present in the dataset; for example, if the pressure rises and falls between 1 pm and 3:30 pm today, that may be due to a leak in the pipe.
Can this be done with a simple statistical model, or is machine learning required?
Could you please suggest the most suitable machine learning algorithm for this scenario?
Please also include links to relevant resources.
Thanks
machine-learning dataset time-series statistics probability
asked Feb 13 '18 at 6:29 by JavaUser
Visit (sections of) the pipeline. Talk to the people who maintain it. There is only so much that you can see from data. You never know whether someone is digging near the pipe, which might cause fluctuations in the data that are not caused by leakage.
- phiver, Feb 13 '18 at 14:12
1 Answer
Your problem definition
You have time series data of pressure measurements from your sensor, and you wish to identify when the recordings are abnormal. This is best framed as an anomaly detection problem, although there are many ways to approach it.
I would use a sliding window and treat the samples in it as your feature space for estimating the distribution of your recordings. The window length $m$ is the first hyper-parameter you will need to tune.
Using a statistical model
Treat the retained time series $X \in \mathbb{R}^m$ as a first-in, first-out queue: when a new recording arrives, discard the oldest data point. For the current set $X$ of samples, compute the mean $\mu$ and the standard deviation $\sigma$. If the new point deviates from the mean by more than a multiple of $\sigma$, flag a state change. The multiple is usually set to $3\sigma$, but it depends on the expected variation of your sensor.
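The queue-based rule above can be sketched in a few lines of Python. This is a minimal illustration under assumptions, not a tuned detector: the function name `rolling_sigma_flags`, the default window length of 20 and the synthetic pressure trace are all made up for the example.

```python
from collections import deque
from statistics import mean, stdev


def rolling_sigma_flags(readings, window=20, k=3.0):
    """Return indices whose reading lies more than k sigma from the window mean."""
    buffer = deque(maxlen=window)  # FIFO queue: appending drops the oldest point
    flagged = []
    for i, value in enumerate(readings):
        # Only test once the window is full, so mean and sigma use m samples.
        if len(buffer) == window:
            mu = mean(buffer)
            sigma = stdev(buffer)
            # Skip the test for a perfectly flat window (sigma == 0).
            if sigma > 0 and abs(value - mu) > k * sigma:
                flagged.append(i)
        buffer.append(value)
    return flagged


# Synthetic pressure trace (assumed data): small ripple around 100, one sudden drop.
pressure = [100.0 + 0.1 * ((i * 7) % 5 - 2) for i in range(60)]
pressure[45] = 95.0
flags = rolling_sigma_flags(pressure, window=20, k=3.0)
```

Note that in this sketch a flagged reading still enters the window and inflates $\sigma$ for the next `window` samples; whether to exclude flagged points is a design choice that depends on how long a real state change is expected to last.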
You can extend this with the generalized likelihood ratio test (GLRT) to determine when a new sample falls significantly outside the distribution assumed under the null hypothesis. That indicates a state change: the new point came from a different distribution than the normal flow through the pipe.
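As one possible concrete form of the GLRT idea, the sketch below tests for a shift in the mean of Gaussian readings. It assumes the variance is known and equal to the variance of a reference window recorded during normal flow; under that assumption twice the log-likelihood ratio reduces to $n(\bar{x} - \mu_0)^2 / \sigma_0^2$, which follows a chi-square distribution with one degree of freedom under the null hypothesis. The function name, the threshold and the data are illustrative assumptions.

```python
from statistics import mean, pvariance


def glrt_mean_shift(reference, recent):
    """GLRT statistic for a mean shift, treating the reference variance as known.

    H0: recent samples ~ N(mu0, var0); H1: recent samples ~ N(mu, var0), mu free.
    Twice the log-likelihood ratio equals n * (mean(recent) - mu0)^2 / var0.
    """
    mu0 = mean(reference)
    var0 = pvariance(reference, mu0)
    if var0 == 0:
        raise ValueError("reference window has zero variance")
    n = len(recent)
    return n * (mean(recent) - mu0) ** 2 / var0


# 6.635 is the 0.99 quantile of a chi-square distribution with 1 degree of freedom.
THRESHOLD = 6.635

# Assumed data: a reference window from normal flow and two short recent windows.
reference = [100.0 + 0.1 * ((i * 7) % 5 - 2) for i in range(50)]
steady = [100.1, 99.9, 100.0, 100.2, 99.8]
leaking = [99.0, 98.9, 99.1, 98.8, 99.0]

steady_stat = glrt_mean_shift(reference, steady)
leaking_stat = glrt_mean_shift(reference, leaking)
state_change = leaking_stat > THRESHOLD
```

Because the variance is estimated from data rather than truly known, the chi-square threshold is only approximate here; it is a starting point to be adjusted against the false-alarm rate you can tolerate.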
Using machine learning
You can collect multiple instances of your time series and annotate them, based on your experience, as nominal or anomalous. You then have a supervised two-class classification problem. First attempt some feature extraction using PCA, LDA or similar techniques. Then you can try the algorithms available through scikit-learn (SVM, Random Forests, K-NN, etc.).
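To make the supervised route concrete, here is a small standard-library sketch of K-NN on labelled windows. It is a stand-in under assumptions: hand-picked summary statistics replace the PCA/LDA feature extraction mentioned above, the K-NN is written by hand rather than taken from scikit-learn (where `KNeighborsClassifier` would be the usual choice), and the labelled windows are invented for illustration.

```python
from math import dist
from statistics import mean, pstdev


def window_features(window):
    """Summarise one window as (mean, standard deviation, range)."""
    return (mean(window), pstdev(window), max(window) - min(window))


def knn_predict(train_features, train_labels, query, k=3):
    """Majority vote among the k nearest labelled windows (Euclidean distance)."""
    ranked = sorted(zip(train_features, train_labels), key=lambda p: dist(p[0], query))
    votes = [label for _, label in ranked[:k]]
    return max(set(votes), key=votes.count)


# Hypothetical hand-annotated windows of six pressure readings each.
labelled = [
    ([100.0, 100.1, 99.9, 100.0, 100.1, 99.9], "nominal"),
    ([100.2, 100.0, 99.8, 100.1, 99.9, 100.0], "nominal"),
    ([99.9, 100.0, 100.1, 100.2, 100.0, 99.8], "nominal"),
    ([100.0, 101.5, 98.0, 102.0, 97.5, 100.5], "anomalous"),
    ([100.1, 98.5, 96.0, 97.0, 99.0, 101.0], "anomalous"),
    ([99.8, 102.5, 103.0, 99.0, 96.5, 98.0], "anomalous"),
]
features = [window_features(w) for w, _ in labelled]
labels = [label for _, label in labelled]

calm = knn_predict(features, labels, window_features([100.1, 99.9, 100.0, 100.0, 99.8, 100.2]))
erratic = knn_predict(features, labels, window_features([100.0, 97.0, 101.8, 98.2, 102.1, 99.5]))
```

The features are left unscaled to keep the sketch short; with real data, standardising each feature before computing distances usually matters for K-NN.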
If there are significantly more nominal instances than anomalous ones, the class imbalance will bias your model towards the nominal class. Anomaly detection algorithms are better suited to this type of problem. They learn the distribution to which your nominal set should belong, then evaluate for each novel instance the probability that it came from that distribution. If the probability is small, the algorithm flags the instance as anomalous.
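A minimal sketch of that idea, assuming the nominal readings are roughly Gaussian: fit a normal distribution to readings from normal operation only, then flag a new reading when its two-sided tail probability under that distribution falls below a small `epsilon`. The function names, the value of `epsilon` and the data are assumptions made for this example; real anomaly detectors typically model richer, multi-dimensional distributions.

```python
from statistics import NormalDist


def fit_nominal(nominal_readings):
    """Estimate a Gaussian from readings recorded during normal operation only."""
    return NormalDist.from_samples(nominal_readings)


def tail_probability(model, value):
    """Two-sided probability of a reading at least this far from the nominal mean."""
    z = abs(value - model.mean) / model.stdev
    return 2.0 * (1.0 - NormalDist().cdf(z))


def is_anomalous(model, value, epsilon=1e-3):
    """Flag the reading when it is very unlikely under the nominal model."""
    return tail_probability(model, value) < epsilon


# Assumed nominal training data: small ripple around a pressure of 100.
nominal = [100.0 + 0.1 * ((i * 7) % 5 - 2) for i in range(50)]
model = fit_nominal(nominal)

ordinary = is_anomalous(model, 100.1)
suspect = is_anomalous(model, 98.5)
```

No anomalous examples are needed for training here, which is the practical advantage when leaks are rare in the recorded data.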
For more information on anomaly detection for time series, refer to:
Using time series data from a sensor for ML
How to train model to predict events 30 minutes prior, from multi-dimensionnal timeseries
answered Feb 13 '18 at 7:11 by JahKnows