


How to dual encode two sentences to show similarity score


I've been trying to grasp the concept behind Google's Semantic Experiences, and I'm planning to use it to implement a semantic query tool.



With the Universal Sentence Encoder I can pre-encode all sentences and store them in the database. When a user performs a query, the input is also converted into a 512-dimensional vector, and we sequentially scan the whole database, comparing cosine similarities and picking the vector with the highest similarity. But this is extremely slow...
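For reference, this is roughly what my current brute-force lookup looks like. It is only a minimal sketch: the TF-Hub module URL is the standard Universal Sentence Encoder one, and the small in-memory list stands in for my database.

```python
import numpy as np
import tensorflow_hub as hub

# Placeholder corpus; in practice these sentences come from the database.
corpus = ["A man is playing a guitar.", "The weather is nice today."]

# Load the Universal Sentence Encoder (512-dimensional embeddings).
encoder = hub.load("https://tfhub.dev/google/universal-sentence-encoder/4")

# Pre-encode and L2-normalise the corpus once, so that a dot product
# between normalised vectors equals the cosine similarity.
corpus_emb = np.asarray(encoder(corpus))
corpus_emb /= np.linalg.norm(corpus_emb, axis=1, keepdims=True)

def query(sentence):
    q = np.asarray(encoder([sentence]))[0]
    q /= np.linalg.norm(q)
    sims = corpus_emb @ q          # cosine similarity against every stored row
    best = int(np.argmax(sims))    # sequential (brute-force) search
    return corpus[best], float(sims[best])

print(query("Someone is playing music."))
```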



Fortunately, on their Semantic Experiences page, they write the following:




The Universal Sentence Encoder model is very similar to what we're using in Talk to Books and Semantris, although those applications are using a dual-encoder approach that maximizes for response relevance, while the Universal Sentence Encoder is a single encoder that returns an embedding for the input, instead of a score on an input pair.





One of the simpler methods they use for transforming a sentence into an embedding vector is the DAN (deep averaging network).



From what I understand, the sentence is split into words, which are converted into vectors (word2vec), and then all of those vectors are averaged. The average is passed through one or more hidden feedforward layers and finally to an output layer with 512 neurons and a softmax activation.
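To check my understanding of that forward pass, here is a small numerical sketch. The word vectors and weights are random placeholders, and the layer sizes simply follow my description above, not Google's actual model.

```python
import numpy as np

EMB_DIM, HIDDEN, OUT = 300, 512, 512
rng = np.random.default_rng(0)

# Placeholder word2vec-style lookup table.
word_vectors = {"the": rng.normal(size=EMB_DIM),
                "cat": rng.normal(size=EMB_DIM),
                "sleeps": rng.normal(size=EMB_DIM)}

# Randomly initialised weights stand in for trained DAN parameters.
W1 = rng.normal(size=(EMB_DIM, HIDDEN)); b1 = np.zeros(HIDDEN)
W2 = rng.normal(size=(HIDDEN, OUT));     b2 = np.zeros(OUT)

def dan_encode(sentence):
    # 1. Split into words and look up their vectors.
    vecs = [word_vectors[w] for w in sentence.lower().split()]
    # 2. Average the word vectors.
    avg = np.mean(vecs, axis=0)
    # 3. Pass the average through the feedforward layers.
    h = np.tanh(avg @ W1 + b1)
    logits = h @ W2 + b2
    # 4. Softmax on the 512-unit output, as described above.
    e = np.exp(logits - logits.max())
    return e / e.sum()

print(dan_encode("The cat sleeps").shape)   # (512,)
```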



How can I create a dual encoder, though? Do I use two different neural networks, or does the output contain a single neuron that outputs the similarity?



Thank you!



P.S.

My apologies if I've said anything ignorant here; I'm quite confused by the concept of projecting a lower-dimensional space into a 512-dimensional vector space.










      neural-network deep-learning word-embeddings search vector-space-models






      asked Nov 26 '18 at 19:45









ShellRox

1 Answer







You should look into Siamese networks, or left/right feedforward networks, if you have a large amount of training data available, such as the SNLI or MultiNLI datasets.
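For illustration, here is a minimal Keras sketch of the Siamese (dual-encoder) idea: both inputs pass through one shared encoder, and the model output is the cosine similarity of the two encodings. The input dimensionality, layer sizes and loss below are placeholder choices, not a reference implementation.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

EMB_DIM = 512  # assumed dimensionality of precomputed sentence features

# One shared encoder tower; both sentences go through the same weights.
encoder = tf.keras.Sequential([
    layers.Dense(256, activation="relu"),
    layers.Dense(128),
])

left_in = layers.Input(shape=(EMB_DIM,))
right_in = layers.Input(shape=(EMB_DIM,))
left_vec = encoder(left_in)
right_vec = encoder(right_in)

# Cosine similarity of the two encodings is the training target.
similarity = layers.Dot(axes=1, normalize=True)([left_vec, right_vec])

model = Model([left_in, right_in], similarity)
model.compile(optimizer="adam", loss="mse")  # train on pair relatedness labels
model.summary()
```

After training on labelled sentence pairs, the shared tower is the piece you reuse: queries and stored sentences are encoded separately and compared by cosine similarity.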



Otherwise, you should use a form of latent semantic analysis to compress your vectors into a set that is faster to search, and then perform the full cosine-similarity comparison only on that candidate subset of your dataset.
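One way to sketch that second suggestion: compress the stored vectors with an LSA-style truncated SVD, shortlist candidates in the compressed space, then do exact cosine similarity on the shortlist only. The component count and shortlist size here are arbitrary illustration values, and the random matrix stands in for your stored embeddings.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

rng = np.random.default_rng(0)
corpus_emb = rng.normal(size=(10_000, 512))   # stand-in for stored sentence vectors

# Compress to a lower-dimensional space (LSA-style truncated SVD).
svd = TruncatedSVD(n_components=64, random_state=0)
compressed = svd.fit_transform(corpus_emb)
compressed /= np.linalg.norm(compressed, axis=1, keepdims=True)

def search(query_emb, top_k=50):
    # 1. Cheap pass: rank everything in the 64-d compressed space.
    q = svd.transform(query_emb.reshape(1, -1))[0]
    q /= np.linalg.norm(q)
    candidates = np.argsort(compressed @ q)[::-1][:top_k]
    # 2. Expensive pass: exact cosine similarity on the candidates only.
    subset = corpus_emb[candidates]
    subset = subset / np.linalg.norm(subset, axis=1, keepdims=True)
    qq = query_emb / np.linalg.norm(query_emb)
    return int(candidates[np.argmax(subset @ qq)])

print(search(rng.normal(size=512)))
```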






                answered Feb 13 at 15:24









Tristan Wise
