NLP - How to detect the presence of a phrase and its derivatives
I have a dataset with a free-form text field as one of the variables. Essentially, I want to determine whether a record contains the phrase "The cat is not present". However, this phrase could be written as "cat is not present", "cat- not present", "There is no cat", "cat: not present", "no cat here to report", "report: no cat", and many other derivatives. I also want to exclude situations like "I was outside playing with my friend Bob. It was sunny. It was warm. Cat was not present. Overall, it was a good day" because this has "useful" context.
The end goal is to calculate the number of records that have "no cat is present" (minus instances where this phrase or its derivatives have context) vs. "cat is present".
machine-learning python nlp
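As a starting point before any machine learning, the requirement above can be sketched as a rule-based baseline: flag records that mention the cat together with a negation cue, and treat records with more than one sentence as having "context". The negation word list, the one-sentence cutoff, and the label names are assumptions made for illustration; they are not part of the question and would need tuning against real records.

```python
import re
from collections import Counter

# Assumed heuristics (not from the question): negation cues, a naive
# sentence splitter, and a sentence-count cutoff for "has context".
CAT = re.compile(r"\bcats?\b", re.IGNORECASE)
NEGATION = re.compile(r"\b(?:no|not|none|without|absent)\b|n't", re.IGNORECASE)
SENTENCE_END = re.compile(r"[.!?]+(?:\s+|$)")


def split_sentences(text):
    """Split on ., ! and ? and drop empty pieces."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def label_record(text, max_sentences=1):
    """Return one of four hypothetical labels for a free-text record."""
    sentences = split_sentences(text)
    cat_sentences = [s for s in sentences if CAT.search(s)]
    if not cat_sentences:
        return "no_mention"
    if not any(NEGATION.search(s) for s in cat_sentences):
        return "cat_present"
    # A negated cat mention inside a longer narrative counts as "context".
    if len(sentences) > max_sentences:
        return "cat_absent_with_context"
    return "cat_absent"


records = [
    "The cat is not present",
    "cat is not present",
    "cat- not present",
    "There is no cat",
    "cat: not present",
    "no cat here to report",
    "report: no cat",
    "I was outside playing with my friend Bob. It was sunny. It was warm. "
    "Cat was not present. Overall, it was a good day",
    "The cat is present",
]

# Tally the labels to get the counts the question asks for.
counts = Counter(label_record(r) for r in records)
print(counts)
```

A baseline like this is easy to audit, and the records it gets wrong give a concrete list of cases a learned model would have to handle.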
This is surely an NLP problem. Your last sentence, however, makes it slightly more difficult, because the solution has to understand context. Some options that come to mind: 1. Do a binary classification with a lot of such labeled data. 2. Do a multi-class classification. There is code available on the internet. Both of these solutions are simple. 3. Use some attention or semantic-similarity kind of solution.
– Sandeep B
Mar 21 at 13:02
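To make option 1 in the comment above concrete, here is a minimal sketch of binary text classification: a bag-of-words multinomial Naive Bayes classifier with add-one smoothing, written with the standard library only. The class name, the label names, and the ten training sentences are all invented for illustration; a real run would need many labeled records from the actual dataset, and a library implementation would normally be preferable.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and keep alphabetic tokens (apostrophes allowed)."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.word_counts = {}        # label -> Counter of token counts
        self.doc_counts = Counter()  # label -> number of training texts
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.doc_counts[label] += 1
            self.word_counts.setdefault(label, Counter()).update(tokens)
            self.vocab.update(tokens)
        return self

    def log_scores(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for label, counts in self.word_counts.items():
            total_tokens = sum(counts.values())
            score = math.log(self.doc_counts[label] / total_docs)
            for tok in tokens:
                score += math.log(
                    (counts[tok] + 1) / (total_tokens + len(self.vocab))
                )
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical labeled examples; real training data would come from the dataset.
train_texts = [
    "cat is not present", "there is no cat", "cat: not present",
    "no cat here to report", "report: no cat",
    "cat is present", "the cat was here", "cat seen in the yard",
    "saw the cat today", "cat present",
]
train_labels = ["absent"] * 5 + ["present"] * 5

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("cat- not present"))
print(model.predict("the cat was seen today"))
```

Adding a third label for records with "useful" context would turn this into the multi-class variant (option 2). With only a handful of examples the predictions are fragile, so some labeled records should be held out to check how often the model is right.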
Thank you for the reply. I am still a beginner in applying these types of solutions, so it might not be so simple! That being said, I would rather not copy, paste, and tweak. I would like to learn it and write it myself. Please excuse my ignorance, but are there more formal names for these types of NLP techniques? I would like to use the method that provides the most confidence (which is likely your third option). I don't think I will need to create a test / training set because we are using the whole population for a specific period of time.
– DataNoob7
Mar 21 at 20:13
I have looked up solutions 1 and 2, but I don't think they answer the question, given the variety of ways in which these instances can appear. Unless I am misunderstanding.
– DataNoob7
2 days ago
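One way to cope with that variety, in the spirit of the similarity idea from the first comment, is to compare each record against a few seed phrases and keep the best score. The sketch below uses Jaccard overlap of token sets, which is a purely lexical stand-in and not true semantic similarity; the seed list and function names are assumptions for illustration.

```python
import re

# Seed phrases taken from the variants listed in the question.
SEED_PHRASES = [
    "the cat is not present",
    "cat is not present",
    "there is no cat",
    "no cat here to report",
    "report: no cat",
]


def token_set(text):
    """Lowercased set of alphabetic tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(text, seeds=SEED_PHRASES):
    """Return (score, seed) for the seed phrase closest to the text."""
    tokens = token_set(text)
    return max((jaccard(tokens, token_set(s)), s) for s in seeds)


print(best_match("cat- not present"))  # (0.75, 'cat is not present')
print(best_match("cat is present"))    # (0.75, 'cat is not present')
```

Both calls return the same score, which shows the limit of word overlap: it cannot tell "cat is present" from "cat is not present". Handling negation needs either an explicit rule or a model that takes word order and meaning into account, such as sentence embeddings.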
asked Mar 20 at 22:22 by DataNoob7, edited Mar 22 at 20:25