Unsupervised learning for anomaly detection
I've started working on anomaly detection in Python.
My dataset is a time series. The data is collected by sensors that record measurements on semiconductor manufacturing machines.
My dataset looks like this:
ContextID  Time_ms       Ar_Flow_sccm  BacksGas_Flow_sccm
7289973    09:12:48.502  49.56054688   1.953125
7289973    09:12:48.603  49.56054688   2.05078125
7289973    09:12:48.934  99.85351563   2.05078125
7289973    09:12:49.924  351.3183594   2.05078125
7289973    09:12:50.924  382.8125      1.953125
7289973    09:12:51.924  382.8125      1.7578125
7289973    09:12:52.934  382.8125      1.7578125
7289999    09:15:36.434  50.04882813   1.7578125
7289999    09:15:36.654  50.04882813   1.7578125
7289999    09:15:36.820  50.04882813   1.66015625
7289999    09:15:37.904  333.2519531   1.85546875
7289999    09:15:38.924  377.1972656   1.953125
7289999    09:15:39.994  377.1972656   1.7578125
7289999    09:15:41.94   388.671875    1.85546875
7289999    09:15:42.136  388.671875    1.85546875
7290025    09:18:00.429  381.5917969   1.85546875
7290025    09:18:01.448  381.5917969   1.85546875
7290025    09:18:02.488  381.5917969   1.953125
7290025    09:18:03.549  381.5917969   14.453125
7290025    09:18:04.589  381.5917969   46.77734375
What I have to do is apply an unsupervised learning technique to each parameter column individually and find any anomalies it may contain. The ContextID is more like a product number.
I would like to know which unsupervised learning techniques can be used for this task.
Thanks
machine-learning deep-learning time-series unsupervised-learning anomaly-detection
asked Mar 21 at 10:05
Kashyap Maheshwari
Do you know the percent of anomalies in your data? – Alireza Zolanvari Mar 21 at 11:33
I am afraid not @AlirezaZolanvari – Kashyap Maheshwari Mar 21 at 11:43
1 Answer
You can use a density-based method such as Local Outlier Factor (LOF) to do this.
If you want to identify outliers in each column independently of the other columns, apply the method separately to each column. (Note that this is different from identifying outlying data points by considering all columns together.)
answered Mar 22 at 10:17
raghu
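As a rough illustration of the answer above, here is a minimal sketch of Local Outlier Factor applied to each column separately, written in plain Python so it runs without extra libraries. It is not the answerer's code: the names `lof_scores` and `flag_outliers`, the neighbourhood size `k=5`, and the cut-off `threshold=1.5` are assumptions chosen for this example, and ties among equally distant neighbours are resolved by simply taking the first k. A library implementation such as scikit-learn's `LocalOutlierFactor` would likely be more convenient in practice.

```python
from statistics import mean


def lof_scores(values, k=5):
    """Local Outlier Factor for a single numeric column (1-D data).

    Scores near 1 mean a point is about as dense as its neighbours;
    scores well above 1 mean it is isolated relative to them.
    """
    n = len(values)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < len(values)")

    # For each point, find its k nearest neighbours (excluding itself).
    neighbours = []  # neighbours[i] -> list of (distance, index), nearest first
    k_distance = []  # k_distance[i] -> distance from i to its k-th neighbour
    for i, x in enumerate(values):
        dists = sorted((abs(x - y), j) for j, y in enumerate(values) if j != i)
        neighbours.append(dists[:k])
        k_distance.append(dists[k - 1][0])

    # Local reachability density: inverse of the mean reachability distance,
    # where reach_dist(i, j) = max(k_distance[j], distance(i, j)).
    eps = 1e-10  # assumption: small constant to avoid division by zero on duplicates
    lrd = []
    for i in range(n):
        reach = [max(k_distance[j], d) for d, j in neighbours[i]]
        lrd.append(1.0 / (mean(reach) + eps))

    # LOF: mean density of the neighbours divided by the point's own density.
    return [mean(lrd[j] for _, j in neighbours[i]) / lrd[i] for i in range(n)]


def flag_outliers(columns, k=5, threshold=1.5):
    """Run LOF on every column independently; return {name: [(row, value), ...]}."""
    flagged = {}
    for name, values in columns.items():
        scores = lof_scores(values, k)
        flagged[name] = [
            (i, v) for i, (v, s) in enumerate(zip(values, scores)) if s > threshold
        ]
    return flagged


# The two parameter columns from the sample rows in the question.
columns = {
    "Ar_Flow_sccm": [
        49.56054688, 49.56054688, 99.85351563, 351.3183594, 382.8125,
        382.8125, 382.8125, 50.04882813, 50.04882813, 50.04882813,
        333.2519531, 377.1972656, 377.1972656, 388.671875, 388.671875,
        381.5917969, 381.5917969, 381.5917969, 381.5917969, 381.5917969,
    ],
    "BacksGas_Flow_sccm": [
        1.953125, 2.05078125, 2.05078125, 2.05078125, 1.953125,
        1.7578125, 1.7578125, 1.7578125, 1.7578125, 1.66015625,
        1.85546875, 1.953125, 1.7578125, 1.85546875, 1.85546875,
        1.85546875, 1.85546875, 1.953125, 14.453125, 46.77734375,
    ],
}

for name, rows in flag_outliers(columns).items():
    print(name, rows)
```

On the sample rows, the BacksGas_Flow_sccm readings 14.453125 and 46.77734375 score far above 1, while the other readings in that column score about 1. Because each column is treated as an unordered set of values, the time ordering and the ContextID grouping are ignored, and the results depend heavily on the choice of k.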