Finding abnormal behavior over time














$begingroup$


I want to detect abnormal behaviour in an oil pipe through which oil flows at a roughly constant pressure. A sensor monitors the pressure over time and pushes the readings to my cloud server.

I have the dataset. My requirement is to analyse it in Python and find abnormal patterns over time. I also need to report the abnormal behaviour present in the dataset, for example: from 1 pm to 3:30 pm today the pressure rose and fell, possibly because of a leak in the pipe.

Can this be done with a simple statistical model, or is machine learning required?

Can you please suggest the most suitable machine learning algorithm for this scenario?

Please also include links to relevant resources.

Thanks



















$endgroup$







  • 1




    $begingroup$
    Visit (sections of) the pipeline. Talk to the people who maintain it. There is only so much that you can see from data. You never know if someone is digging near the pipe that might cause fluctuations in the data which are not caused by leakage.
    $endgroup$
    – phiver
    Feb 13 '18 at 14:12






















machine-learning dataset time-series statistics probability
















asked Feb 13 '18 at 6:29









JavaUser

















1 Answer












$begingroup$

Your problem definition



You have time series data of pressure measurements from your sensor, and you wish to identify when the recordings are abnormal. This is an anomaly detection problem, and there are many ways to approach it.



I would use a sliding window approach and use the windowed samples as the feature space for estimating the distribution of your recordings. The window length $m$ is the first hyper-parameter you will need to tune.



Using a statistical model



Treat the retained time series $X \in \mathbb{R}^m$ as a first-in, first-out queue: when a new recording arrives, discard the oldest data point. For the current set $X$ of samples, compute the mean $\mu$ and the standard deviation $\sigma$. If the new point deviates from the mean by more than a multiple of $\sigma$ (usually $3\sigma$, though this depends on the expected variation of your sensor), flag a state change.
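The queue-based check described above can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function name `rolling_sigma_flags`, the window length `m=20`, the multiplier `k=3.0` and the synthetic pressure values are all made up for the example, not taken from the question's dataset.

```python
from collections import deque
from statistics import mean, stdev


def rolling_sigma_flags(readings, m=20, k=3.0):
    """Return indices of readings that lie more than k standard deviations
    from the mean of the previous m readings (a first-in, first-out window)."""
    window = deque(maxlen=m)  # appending to a full deque drops the oldest point
    flagged = []
    for i, x in enumerate(readings):
        if len(window) == m:
            mu = mean(window)
            sigma = stdev(window)
            # Skip perfectly flat windows, where sigma is zero.
            if sigma > 0 and abs(x - mu) > k * sigma:
                flagged.append(i)
        window.append(x)
    return flagged


# Hypothetical readings: steady around 50 with one sudden drop at index 45.
pressure = [50.0 + 0.1 * ((i * 7) % 5 - 2) for i in range(60)]
pressure[45] = 47.0
print(rolling_sigma_flags(pressure))  # prints [45]
```

Note that in this sketch a flagged point still enters the window, which inflates $\sigma$ for the next $m$ readings. Whether to exclude flagged points from the window is a design choice that depends on how long anomalies are expected to last.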



You can extend this with the generalized likelihood ratio test (GLRT) to decide whether a new sample falls significantly outside the distribution assumed under the null hypothesis. If it does, this indicates a state change (the new point came from a different distribution than the normal flow through the pipe).
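As a rough sketch of one common form of the GLRT, the code below tests for a shift in the mean, assuming Gaussian noise whose variance is estimated from a reference window and then treated as known. The function name `glrt_mean_shift`, the sample values and the default threshold are assumptions for illustration. The threshold 10.83 is approximately the 99.9% quantile of a chi-square distribution with one degree of freedom, which is the distribution of the statistic under the null hypothesis in this setting.

```python
from statistics import mean, pvariance


def glrt_mean_shift(reference, recent, threshold=10.83):
    """GLRT for a mean shift between a reference window (assumed nominal)
    and the most recent samples. Returns (statistic, changed)."""
    mu0 = mean(reference)
    var0 = pvariance(reference)  # estimated once, then treated as known
    if var0 == 0:
        raise ValueError("reference window has zero variance")
    n = len(recent)
    # Under H1 the maximum-likelihood mean is the sample mean of `recent`,
    # which gives 2 * log(likelihood ratio) = n * (mean - mu0)^2 / var0.
    stat = n * (mean(recent) - mu0) ** 2 / var0
    return stat, stat > threshold


# Hypothetical pressure values for illustration only.
reference = [49.8, 50.0, 50.2, 49.9, 50.1] * 4
steady = [50.1, 49.9, 50.0, 50.2, 49.8]
leaking = [49.5, 49.4, 49.6, 49.5, 49.3]

print(glrt_mean_shift(reference, steady)[1])   # no state change expected
print(glrt_mean_shift(reference, leaking)[1])  # state change expected
```

Testing a small batch of recent samples rather than a single point makes the test more sensitive to small, sustained shifts, at the cost of a delay of a few samples.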



Using machine learning



You can collect multiple instances of your time series and annotate them, based on your experience, as nominal or anomalous. You then have a supervised two-class classification problem. First attempt some feature extraction using PCA, LDA or similar techniques. Then you can try the algorithms available through scikit-learn (SVM, Random Forests, K-NN, etc.).
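To make the supervised setup concrete, here is a minimal standard-library sketch of a k-nearest-neighbours classifier on labelled windows. It is a stand-in, not the scikit-learn workflow itself: hand-picked summary features (mean, standard deviation, range) replace PCA or LDA, the hand-written `knn_predict` replaces a library classifier, and all window values and labels are invented for illustration.

```python
from collections import Counter
from math import dist
from statistics import mean, pstdev


def window_features(window):
    """Summarise one window of readings as (mean, std, range)."""
    return (mean(window), pstdev(window), max(window) - min(window))


def knn_predict(train_X, train_y, x, k=3):
    """Label x by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: dist(pair[0], x))
    return Counter(label for _, label in ranked[:k]).most_common(1)[0][0]


# Hypothetical hand-labelled windows (values are made up).
nominal = [[50.0, 50.1, 49.9, 50.0], [50.2, 50.0, 49.8, 50.1], [49.9, 50.0, 50.1, 50.0]]
anomalous = [[50.0, 48.5, 47.9, 49.0], [50.1, 51.8, 48.2, 49.5], [49.0, 47.5, 48.8, 50.2]]

train_X = [window_features(w) for w in nominal + anomalous]
train_y = ["nominal"] * len(nominal) + ["anomalous"] * len(anomalous)

new_window = [50.0, 48.9, 48.1, 49.4]
print(knn_predict(train_X, train_y, window_features(new_window)))  # prints anomalous
```

The features here are left unscaled for brevity. With real data the features would normally be standardised first, since distance-based methods are sensitive to the units of each feature.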



If there are significantly more nominal instances than anomalous ones, this class imbalance will bias your model. Anomaly detection algorithms are better suited to this type of problem. These algorithms learn the distribution to which your nominal set should belong. For novel instances, they then evaluate the probability of the instance being contained in the learned distribution. If the probability is small, the algorithm flags the instance as anomalous.
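One simple instance of this idea is to fit a Gaussian to features of the nominal data only and to flag new instances whose density under that model is low. The sketch below assumes independent Gaussian features and an ad hoc threshold (the lowest density seen among the nominal training rows). The feature choice, the values and the threshold rule are all illustrative assumptions, not a recommendation from the answer.

```python
from math import log, pi
from statistics import mean, pstdev


def fit_gaussian(rows):
    """Fit an independent Gaussian (mean, std) to each feature column."""
    return [(mean(col), pstdev(col)) for col in zip(*rows)]


def log_density(model, row):
    """Log-probability density of one feature row under the fitted model."""
    return sum(
        -0.5 * log(2 * pi * s * s) - (x - m) ** 2 / (2 * s * s)
        for (m, s), x in zip(model, row)
    )


# Hypothetical nominal training rows: (window mean, window range).
nominal_rows = [(50.0, 0.2), (50.1, 0.4), (49.9, 0.3),
                (50.0, 0.3), (50.05, 0.25), (49.95, 0.35)]
model = fit_gaussian(nominal_rows)

# Assumed rule: flag anything less likely than the least likely nominal row.
threshold = min(log_density(model, r) for r in nominal_rows)


def is_anomalous(row):
    """True if the row is less likely than every nominal training row."""
    return log_density(model, row) < threshold


print(is_anomalous((50.0, 0.3)))  # a typical nominal window
print(is_anomalous((48.9, 1.9)))  # a window with a large pressure swing
```

No anomalous examples are needed to train this model, which is why the approach copes well with heavily imbalanced data.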



For more information on anomaly detection for time series refer to:



Using time series data from a sensor for ML



How to train model to predict events 30 minutes prior, from multi-dimensionnal timeseries















$endgroup$






















        answered Feb 13 '18 at 7:11









        JahKnows


























