NLP - How to detect the presence of a phrase and its derivatives













I have a dataset in which one of the variables is a free-form text field. Essentially, I want to determine whether a record contains the phrase "The cat is not present". However, this phrase could be written as "cat is not present", "cat- not present", "There is no cat", "cat: not present", "no cat here to report", "report: no cat", and many other derivatives. I also want to exclude records like "I was outside playing with my friend Bob. It was sunny. It was warm. Cat was not present. Overall, it was a good day", because there the phrase comes with "useful" context.

The end goal is to count the records that say "no cat is present" (excluding instances where this phrase or one of its derivatives appears with context) versus the records that say "cat is present".












machine-learning python nlp














asked Mar 20 at 22:22 by DataNoob7 (edited Mar 22 at 20:25)
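One way to get a first count before reaching for machine learning is a rule-based baseline. The sketch below is only an illustration built from the example phrasings in the question: the regular expression, the `FILLER` word list, the `MAX_EXTRA_TOKENS` threshold, and the label names are all assumptions, not an established method, and real data will likely need more patterns.

```python
import re
from collections import Counter

# Hypothetical pattern covering the variants listed in the question,
# applied after punctuation has been normalised to spaces.
NO_CAT = re.compile(r"\b(?:no cat|cat(?: (?:is|was))? not present)\b")

# Assumed filler words that should not count as "useful" context.
FILLER = {"the", "there", "is", "was", "here", "to", "report"}

# Assumed threshold: more leftover words than this means "has context".
MAX_EXTRA_TOKENS = 2


def normalize(text):
    """Lowercase the text and collapse punctuation/whitespace to single spaces."""
    return " ".join(re.sub(r"[^a-z0-9]+", " ", text.lower()).split())


def label(text):
    """Return 'no_cat', 'no_cat_with_context', or 'other' for one record."""
    norm = normalize(text)
    match = NO_CAT.search(norm)
    if match is None:
        return "other"
    # Everything outside the matched phrase, minus filler words.
    rest = (norm[:match.start()] + " " + norm[match.end():]).split()
    extra = [tok for tok in rest if tok not in FILLER]
    return "no_cat" if len(extra) <= MAX_EXTRA_TOKENS else "no_cat_with_context"


records = [
    "cat- not present",
    "report: no cat",
    "The cat is present",
    "I was outside playing with my friend Bob. It was sunny. It was warm. "
    "Cat was not present. Overall, it was a good day",
]
counts = Counter(label(r) for r in records)
```

Records labelled `other` are not necessarily "cat is present"; they only failed to match the assumed patterns, so it is worth reading a sample of them by hand before trusting the counts.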











  • This is certainly an NLP problem. Your last requirement makes it somewhat harder, because the method has to understand context. A few solutions come to mind: 1. Treat it as binary classification, trained on plenty of labeled examples. 2. Treat it as multi-class classification. Code for both is available online, and both are simple. 3. Use an attention-based or semantic-similarity approach.
    – Sandeep B
    Mar 21 at 13:02

  • Thank you for the reply. I am still a beginner in applying these types of solutions, so it might not be so simple! That said, I would rather not copy, paste, and tweak; I would like to learn it and write it myself. Please excuse my ignorance, but are there more formal names for these types of NLP techniques? I would like to use the method that provides the most confidence (which is likely your third option). I don't think I will need to create a test/training set, because we are using the whole population for a specific period of time.
    – DataNoob7
    Mar 21 at 20:13

  • I have looked up solutions 1 and 2, but I don't think they answer the question, given the variety of ways in which these instances can appear. Unless I am misunderstanding.
    – DataNoob7
    2 days ago
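The third suggestion in the comments goes by names such as semantic textual similarity or paraphrase detection, while the first two are text classification. As a rough illustration of the similarity idea only, the sketch below compares each record with a few reference phrasings from the question using bag-of-words cosine similarity. This is a lexical stand-in, not a true semantic model: in practice `vectorize` would be replaced by a sentence embedding. The `REFERENCES` list, the `THRESHOLD` value, and the explicit negation check are assumptions made for this example.

```python
import math
import re
from collections import Counter

# Reference phrasings taken from the question.
REFERENCES = [
    "the cat is not present",
    "there is no cat",
    "no cat here to report",
]

# Assumed cut-off; it should be tuned on a hand-labelled sample.
THRESHOLD = 0.5

# Bag-of-words scores "the cat is present" as very close to
# "the cat is not present", so negation is checked explicitly.
NEGATIONS = {"no", "not"}


def vectorize(text):
    """Token counts for the text (a crude stand-in for a sentence embedding)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_similarity(text):
    """Highest similarity between the record and any reference phrasing."""
    vec = vectorize(text)
    return max(cosine(vec, vectorize(ref)) for ref in REFERENCES)


def is_no_cat(text):
    """True if the record is negated and close to a reference phrasing."""
    has_negation = any(tok in NEGATIONS for tok in vectorize(text))
    return has_negation and best_similarity(text) >= THRESHOLD
```

A long record that mentions the phrase only in passing scores low because the other words dilute the overlap, which roughly matches the wish to exclude records with context. Even when the whole population is being scored, a small hand-labelled sample is still useful for choosing the threshold and checking how often the method is wrong.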

























