NLP - How to detect the presence of a phrase and its derivatives


I have a dataset with a free-form text field as one of the variables. Essentially, I want to determine whether a record contains the phrase "The cat is not present". However, this phrase could be written as "cat is not present", "cat- not present", "There is no cat", "cat: not present", "no cat here to report", "report: no cat", and many other derivatives. I also want to exclude situations like "I was outside playing with my friend Bob. It was sunny. It was warm. Cat was not present. Overall, it was a good day", because there the phrase is surrounded by "useful" context.

The end goal is to calculate the number of records that have "no cat is present" (minus instances where this phrase or its derivatives have context) vs. "cat is present".
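To make the goal concrete, here is a minimal rule-based baseline, not a full solution. It assumes that "useful context" simply means the record contains more than one sentence; the `NEGATION` patterns and the helpers `split_sentences` and `is_bare_no_cat` are hypothetical names, and the patterns only cover the variants listed above.

```python
import re

# Hypothetical patterns covering only the variants quoted in the question.
NEGATION = re.compile(
    r"\bno\s+cat\b"                                    # "There is no cat", "report: no cat"
    r"|\bcat\b\W*(?:(?:is|was)\s+)?not\s+present\b",   # "cat- not present", "cat is not present"
    re.IGNORECASE,
)


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?'; good enough for a sketch."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def is_bare_no_cat(text):
    """True when the record says the cat is absent and says nothing else.

    Assumption: any record with more than one sentence has "useful" context.
    """
    sentences = split_sentences(text)
    return len(sentences) == 1 and NEGATION.search(sentences[0]) is not None


records = [
    "The cat is not present",
    "cat is not present",
    "cat- not present",
    "There is no cat",
    "cat: not present",
    "no cat here to report",
    "report: no cat",
    "I was outside playing with my friend Bob. It was sunny. It was warm. "
    "Cat was not present. Overall, it was a good day",
    "Cat is present",
]

# Count the records that state only that the cat is absent.
no_cat_count = sum(is_bare_no_cat(r) for r in records)
print(no_cat_count, "of", len(records), "records are bare 'no cat' statements")
```

A baseline like this will miss any phrasing it has no pattern for, which is the main reason to consider a learned model instead.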

asked Mar 20 at 22:22 by DataNoob7 (edited Mar 22 at 20:25)

This is surely an NLP problem. Your last requirement, however, makes it slightly harder, because the solution has to understand context. A few options come to mind: 1. Do a binary classification with a lot of labeled data of this kind. 2. Do a multi-class classification. Code for both is available on the internet, and both are simple. 3. Use an attention-based or semantic-similarity approach.
– Sandeep B
Mar 21 at 13:02
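As a rough illustration of option 1 (binary classification on labeled records), here is a toy multinomial Naive Bayes classifier in plain Python. It is a sketch under stated assumptions, not a recommended implementation: the class `NaiveBayes`, the helper `tokenize`, the labels `"no_cat"` and `"other"`, and the eight training records are all made up for the example. A real model would need far more labeled data, and a library classifier would normally replace this hand-rolled one.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters as tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Toy multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            # Start from the log prior, then add per-token log likelihoods.
            score = math.log(self.doc_counts[c] / total_docs)
            for tok in tokenize(text):
                if tok in self.vocab:  # ignore words never seen in training
                    score += math.log(
                        (self.word_counts[c][tok] + 1)
                        / (total_words + len(self.vocab))
                    )
            scores[c] = score
        return max(scores, key=scores.get)


# Made-up labeled records; "other" covers both "cat is present" and
# mentions of the cat that come with surrounding context.
train_texts = [
    "cat is not present",
    "cat- not present",
    "There is no cat",
    "report: no cat",
    "Cat is present",
    "The cat is here",
    "I was outside playing with my friend Bob. It was sunny. It was warm. "
    "Cat was not present. Overall, it was a good day",
    "cat was sleeping on the sofa",
]
train_labels = ["no_cat"] * 4 + ["other"] * 4

model = NaiveBayes().fit(train_texts, train_labels)

for record in ["no cat here to report", "cat: not present"]:
    print(record, "->", model.predict(record))
```

With so little training data the model is easy to fool, so hand-labeling a sample of the real records and checking predictions against held-out labels is still worthwhile, even if the final count is run over the whole population.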

Thank you for the reply. I am still a beginner in applying these types of solutions, so it might not be so simple! That being said, I would rather not copy, paste, and tweak; I would like to learn it and write it myself. Please excuse my ignorance, but are there more formal names for these types of NLP techniques? I would like to use the method that provides the most confidence (which is likely your third option). I don't think I will need to create a test/training set, because we are using the whole population for a specific period of time.
– DataNoob7
Mar 21 at 20:13

I have looked up solutions 1 and 2, but I don't think they answer the question, given the variety of ways in which these instances can appear. Unless I am misunderstanding.
– DataNoob7
2 days ago

machine-learning python nlp

edited Mar 22 at 20:25

asked Mar 20 at 22:22

DataNoob7

193

edited Mar 22 at 20:25

asked Mar 20 at 22:22

DataNoob7

193

edited Mar 22 at 20:25

asked Mar 20 at 22:22

DataNoob7

193

asked Mar 20 at 22:22

DataNoob7

193

asked Mar 20 at 22:22

DataNoob7

193

$begingroup$
This is surely an NLP problem. Your last sentence however made it slightly difficult, and make it understand context. One easy solution that comes to my mind is: 1. do a binary classification with a lot of such labeled data. 2. Another solution can be do a multi class classification. There are codes avaibable on internet. Both of these solutions are simple. 3. Third solution can be to some use some attention or semantic similarity kind of solution.
$endgroup$
– Sandeep B
Mar 21 at 13:02

$begingroup$
Thank you for the reply. I am still a beginner in applying these types of solutions, so it might not be so simple! That being said, I would rather not copy and paste, and tweak. I would like to learn it and write it myself. Please excuse my ignorance, but is there more formal names for these type of NLP techniques? I would like to use the method that provides the most confidence (which is likely your third option). I don't think I will need to create a test / training set because we are using the whole population for a specific period of time.
$endgroup$
– DataNoob7
Mar 21 at 20:13

$begingroup$
I have looked up solutions 1 and 2, but I don't think it answers the question due to the variety of which these instances can appear. Unless I am misunderstanding.
$endgroup$
– DataNoob7
2 days ago

add a comment |

$begingroup$
This is surely an NLP problem. Your last sentence however made it slightly difficult, and make it understand context. One easy solution that comes to my mind is: 1. do a binary classification with a lot of such labeled data. 2. Another solution can be do a multi class classification. There are codes avaibable on internet. Both of these solutions are simple. 3. Third solution can be to some use some attention or semantic similarity kind of solution.
$endgroup$
– Sandeep B
Mar 21 at 13:02

$begingroup$
Thank you for the reply. I am still a beginner in applying these types of solutions, so it might not be so simple! That being said, I would rather not copy and paste, and tweak. I would like to learn it and write it myself. Please excuse my ignorance, but is there more formal names for these type of NLP techniques? I would like to use the method that provides the most confidence (which is likely your third option). I don't think I will need to create a test / training set because we are using the whole population for a specific period of time.
$endgroup$
– DataNoob7
Mar 21 at 20:13

$begingroup$
I have looked up solutions 1 and 2, but I don't think it answers the question due to the variety of which these instances can appear. Unless I am misunderstanding.
$endgroup$
– DataNoob7
2 days ago

This is surely an NLP problem. Your last sentence however made it slightly difficult, and make it understand context. One easy solution that comes to my mind is: 1. do a binary classification with a lot of such labeled data. 2. Another solution can be do a multi class classification. There are codes avaibable on internet. Both of these solutions are simple. 3. Third solution can be to some use some attention or semantic similarity kind of solution.

– Sandeep B
Mar 21 at 13:02

Thank you for the reply. I am still a beginner in applying these types of solutions, so it might not be so simple! That being said, I would rather not copy and paste, and tweak. I would like to learn it and write it myself. Please excuse my ignorance, but is there more formal names for these type of NLP techniques? I would like to use the method that provides the most confidence (which is likely your third option). I don't think I will need to create a test / training set because we are using the whole population for a specific period of time.

– DataNoob7
Mar 21 at 20:13

I have looked up solutions 1 and 2, but I don't think it answers the question due to the variety of which these instances can appear. Unless I am misunderstanding.

– DataNoob7
2 days ago

add a comment |

0

active

oldest

votes

Your Answer

StackExchange.ifUsing("editor", function ()
return StackExchange.using("mathjaxEditing", function ()
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix)
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\$","\$"]]);
);
);
, "mathjax-editing");

StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "557"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);

else
createEditor();

);

function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);

);

draft saved

draft discarded

StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fdatascience.stackexchange.com%2fquestions%2f47705%2fnlp-how-to-detect-the-presence-of-a-phrase-and-its-derivatives%23new-answer', 'question_page');

);

Post as a guest

Name

Required, but never shown

0

active

oldest

votes

0

active

oldest

votes

draft saved

draft discarded

Thanks for contributing an answer to Data Science Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Trjtdtk

0

Your Answer

Post as a guest

0

0

Post as a guest

Popular posts from this blog

0

Your Answer

Sign up or log in

Post as a guest

Post as a guest

0

0

Sign up or log in

Post as a guest

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Sign up or log in

Post as a guest

Popular posts from this blog