Unsupervised learning for anomaly detection

I've started working on anomaly detection in Python.
My dataset is a time series: sensors on semiconductor manufacturing machines record and collect the data.

My dataset looks like this:

    ContextID  Time_ms       Ar_Flow_sccm  BacksGas_Flow_sccm
    7289973    09:12:48.502  49.56054688   1.953125
    7289973    09:12:48.603  49.56054688   2.05078125
    7289973    09:12:48.934  99.85351563   2.05078125
    7289973    09:12:49.924  351.3183594   2.05078125
    7289973    09:12:50.924  382.8125      1.953125
    7289973    09:12:51.924  382.8125      1.7578125
    7289973    09:12:52.934  382.8125      1.7578125
    7289999    09:15:36.434  50.04882813   1.7578125
    7289999    09:15:36.654  50.04882813   1.7578125
    7289999    09:15:36.820  50.04882813   1.66015625
    7289999    09:15:37.904  333.2519531   1.85546875
    7289999    09:15:38.924  377.1972656   1.953125
    7289999    09:15:39.994  377.1972656   1.7578125
    7289999    09:15:41.94   388.671875    1.85546875
    7289999    09:15:42.136  388.671875    1.85546875
    7290025    09:18:00.429  381.5917969   1.85546875
    7290025    09:18:01.448  381.5917969   1.85546875
    7290025    09:18:02.488  381.5917969   1.953125
    7290025    09:18:03.549  381.5917969   14.453125
    7290025    09:18:04.589  381.5917969   46.77734375

I have to apply an unsupervised learning technique to each parameter column individually and find any anomalies in it. The ContextID is more like a product number.

I would like to know which unsupervised learning techniques can be used for this task.

Thanks











  • Do you know the percentage of anomalies in your data?
    – Alireza Zolanvari, Mar 21 at 11:33










  • I'm afraid not, @AlirezaZolanvari.
    – Kashyap Maheshwari, Mar 21 at 11:43















Tags: machine-learning, deep-learning, time-series, unsupervised-learning, anomaly-detection






asked Mar 21 at 10:05 by Kashyap Maheshwari

1 Answer
You can use a density-based method such as Local Outlier Factor (LOF) for this.

If you want to identify outliers in each column independently of the other columns, apply the method separately to each column. (Note that this is different from identifying outlying data points while considering all columns together.)
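To make the difference between the two approaches concrete, here is a minimal sketch using scikit-learn's `LocalOutlierFactor`. It assumes pandas and scikit-learn are available and that the data is already in a DataFrame with numeric columns and no missing values. The helper names `lof_per_column` and `lof_joint` are made up for illustration, `n_neighbors=20` is simply scikit-learn's default, and `contamination="auto"` is used because the share of anomalies is not known in advance.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import LocalOutlierFactor
from sklearn.preprocessing import StandardScaler


def lof_per_column(df, columns, n_neighbors=20, contamination="auto"):
    """Run LOF on each column separately (hypothetical helper).

    Returns a boolean DataFrame: True where a value is flagged as an
    outlier within its own column, ignoring all other columns.
    """
    flags = {}
    for col in columns:
        # LOF expects a 2-D array, so keep the column as shape (n, 1).
        values = df[[col]].to_numpy(dtype=float)
        # n_neighbors must be smaller than the number of samples.
        k = min(n_neighbors, len(values) - 1)
        lof = LocalOutlierFactor(n_neighbors=k, contamination=contamination)
        # fit_predict returns -1 for outliers and 1 for inliers.
        flags[col] = lof.fit_predict(values) == -1
    return pd.DataFrame(flags, index=df.index)


def lof_joint(df, columns, n_neighbors=20, contamination="auto"):
    """Run LOF once on all columns together (hypothetical helper).

    Columns are standardised first (an assumption of this sketch) so
    that no single column dominates the distance computation.
    """
    values = StandardScaler().fit_transform(df[list(columns)].to_numpy(dtype=float))
    k = min(n_neighbors, len(values) - 1)
    lof = LocalOutlierFactor(n_neighbors=k, contamination=contamination)
    return pd.Series(lof.fit_predict(values) == -1, index=df.index)
```

Calling `lof_per_column(df, ["Ar_Flow_sccm", "BacksGas_Flow_sccm"])` gives one flag per value and column, while `lof_joint` gives one flag per row. Two caveats apply to this sketch. LOF ignores the time ordering and the ContextID grouping, so a normal ramp-up at the start of a run may be flagged. Many exactly repeated readings, as in the sample above, can distort the density estimate, so a larger `n_neighbors` may be needed.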












answered Mar 22 at 10:17 by raghu
