How to measure the similarity between two images?
I have two groups of images, one of cats and one of dogs, and each group contains 2000 images.
My goal is to cluster the images using k-means.
Assume image1 is $x$ and image2 is $y$. We need to measure the similarity between any two images: what is the common way to measure the similarity between two images?
machine-learning k-means similarity image
asked Apr 5 at 0:36 by jason
You can use Siamese networks; see "Face Recognition from Scratch using Siamese Networks and TensorFlow" by Shubham Panchal: link.medium.com/HT66TqDdCV. They output a similarity score, which is binary.
– Shubham Panchal, Apr 5 at 2:15
Thanks, that is a really cool method. Is there any other, more traditional machine learning method, i.e. one that does not use a neural network?
– jason, Apr 5 at 2:31
1 Answer
Check this handout!
Well, there are a few, so let's go:
Given two images $J[x,y]$ and $I[x,y]$ with $(x,y) \in \mathbb{N}^{N \times M}$...
A - Measures used in template matching:
Template matching is linear and is not invariant to rotation (actually not even robust to it), but it is pretty simple and robust to noise such as the kind found in photographs taken in low illumination.
You can easily implement these using OpenCV's template matching. Below are the mathematical definitions of some of the similarity measures (adapted for comparing two equal-sized images) used by cv2.matchTemplate:
1 - Sum of Squared Differences
$$ S_{sq} = \sum_{(n,m) \in \mathbb{N}^{M \times N}} \bigl(J[n,m] - I[n,m]\bigr)^2 $$
This can be normalized as
$$ \frac{S_{sq}}{\sqrt{\sum J[n,m]^2 \times \sum I[n,m]^2}} $$
2 - Cross-Correlation
$$ C_{crr} = \sum_{(n,m) \in \mathbb{N}^{M \times N}} \bigl(J[n,m] \times I[n,m]\bigr)^2 $$
This can be normalized as
$$ \frac{C_{crr}}{\sqrt{\sum J[n,m]^2 \times \sum I[n,m]^2}} $$
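As a minimal sketch of these measures with OpenCV's cv2.matchTemplate (the file names are just placeholders; when the two images have the same size, the result matrix collapses to a single value):

```python
import cv2
import numpy as np

# Placeholder file names; any two equal-sized grayscale images will do.
img_a = cv2.imread("cat1.jpg", cv2.IMREAD_GRAYSCALE)
img_b = cv2.imread("cat2.jpg", cv2.IMREAD_GRAYSCALE)
img_b = cv2.resize(img_b, (img_a.shape[1], img_a.shape[0]))  # enforce equal size

img_a = img_a.astype(np.float32)
img_b = img_b.astype(np.float32)

# With template and image the same size, each call returns a 1x1 result.
sq_diff      = cv2.matchTemplate(img_a, img_b, cv2.TM_SQDIFF)[0, 0]
sq_diff_norm = cv2.matchTemplate(img_a, img_b, cv2.TM_SQDIFF_NORMED)[0, 0]
ccorr_norm   = cv2.matchTemplate(img_a, img_b, cv2.TM_CCORR_NORMED)[0, 0]

print(f"SSD: {sq_diff:.1f}  SSD (normed): {sq_diff_norm:.3f}  CCORR (normed): {ccorr_norm:.3f}")
```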
B - Image descriptors/feature detectors:
Many descriptors have been developed for images; their main use is to register images/objects and search for them in other scenes. Still, they offer a lot of information about the image and have been used in pupil detection (A joint cascaded framework for simultaneous eye detection and eye state estimation), and I have even seen them used for lip reading (I can't point you to that work since I am not sure it has been published yet).
They detect points that can be considered features in images (relevant points); the local texture around these points, or even their geometrical position relative to each other, can be used as features.
You can learn more about this in Stanford's image processing classes (check the handouts for classes 12, 13 and 14). If you want to keep doing research in computer vision, I recommend you go through the whole course, and maybe Rich Radke's classes on Digital Image Processing and Computer Vision for Visual Effects; there is a lot of information there that will be useful for the kind of hands-on computer vision work you're taking on.
1 - SIFT and SURF:
These are scale-invariant methods. SURF is a sped-up and open version of SIFT; SIFT is proprietary.
2 - BRIEF, BRISK and FAST:
These are binary descriptors and are really fast (especially on processors with a popcount instruction), and they can be used in a similar way to SIFT and SURF. I've also used BRIEF features as a substitute for template matching in facial landmark detection, with a large gain in speed and no loss in accuracy for both the IPD and the KIPD classifiers, although I haven't published any of that yet (and it is just an incremental observation for future articles, so I don't think there is any harm in sharing).
3 - Histogram of Oriented Gradients (HoG):
This describes the local distribution of gradient orientations and is widely used for detection tasks such as face and pedestrian detection... A similarity score built from keypoint descriptors is sketched just below.
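A minimal sketch of descriptor-based similarity using ORB (a free binary descriptor in the same family as BRIEF/BRISK; I'm using it here instead of SIFT/SURF only because it ships with stock OpenCV). The "fraction of good matches" score and the file names are my own choices for illustration:

```python
import cv2

def orb_similarity(path_a, path_b, max_features=500, ratio=0.75):
    """Similarity = fraction of keypoints with a good descriptor match."""
    img_a = cv2.imread(path_a, cv2.IMREAD_GRAYSCALE)
    img_b = cv2.imread(path_b, cv2.IMREAD_GRAYSCALE)

    orb = cv2.ORB_create(nfeatures=max_features)
    kp_a, des_a = orb.detectAndCompute(img_a, None)
    kp_b, des_b = orb.detectAndCompute(img_b, None)
    if des_a is None or des_b is None:
        return 0.0

    # Hamming distance for binary descriptors, with Lowe's ratio test.
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING)
    matches = matcher.knnMatch(des_a, des_b, k=2)
    good = []
    for pair in matches:
        if len(pair) == 2 and pair[0].distance < ratio * pair[1].distance:
            good.append(pair[0])
    return len(good) / max(len(kp_a), len(kp_b))

# Hypothetical file names:
print(orb_similarity("cat1.jpg", "dog1.jpg"))
```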
C - Convolutional Neural Networks:
I know you don't want to use NNs, but I think it is fair to point out that they are really powerful; training a CNN with a triplet loss can be a really nice way to learn a representative feature space for clustering (and classification).
Check Wesley's GitHub for an example of their power in facial recognition, using a triplet loss to get features and then an SVM to classify.
Also, if your problem with deep learning is the computational cost, you can easily find pre-trained layers for cats and dogs around.
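For example, here is a minimal sketch of the "pre-trained layers" route, assuming TensorFlow/Keras and scikit-learn are available: a frozen ImageNet-pretrained MobileNetV2 (my choice; any pretrained backbone would do) is used as a feature extractor and the resulting vectors are fed straight into k-means. The paths are placeholders for your 4000 images:

```python
import numpy as np
from tensorflow.keras.applications import MobileNetV2
from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
from tensorflow.keras.preprocessing import image
from sklearn.cluster import KMeans

# Frozen ImageNet feature extractor; global average pooling gives one vector per image.
model = MobileNetV2(weights="imagenet", include_top=False, pooling="avg")

def embed(path):
    img = image.load_img(path, target_size=(224, 224))
    x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
    return model.predict(x, verbose=0)[0]          # 1280-d feature vector

paths = ["cats/cat_0001.jpg", "dogs/dog_0001.jpg"]  # ...extend to all images
features = np.stack([embed(p) for p in paths])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(labels)
```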
D - Check on previous work:
This cats-and-dogs fight has been going on for a long time... you can check solutions in Kaggle competitions (forum and kernels); there were two on cats and dogs: This One and That One.
E - Famous Measures:
- SSIM (Structural Similarity Index)
- L2 norm (or Euclidean distance)
- Mahalanobis distance
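A quick sketch of the first two with scikit-image and NumPy (placeholder file names; Mahalanobis needs a covariance estimated over a whole feature set, so it is shown in the PCA sketch further down):

```python
import numpy as np
from skimage.io import imread
from skimage.transform import resize
from skimage.metrics import structural_similarity as ssim

# Grayscale floats in [0, 1]; second image resized to match the first.
a = imread("cat1.jpg", as_gray=True)
b = resize(imread("cat2.jpg", as_gray=True), a.shape)

print("SSIM:", ssim(a, b, data_range=1.0))             # 1.0 means identical
print("L2  :", np.linalg.norm(a.ravel() - b.ravel()))   # Euclidean distance
```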
F - Check on other kinds of features:
Cats and dogs can be easy to identify by their ears and noses... size too, but I have had cats as big as dogs, so size is not really safe to use.
But you can try segmenting the images into animal and background and then doing region-property analysis...
Also, check this image similarity metrics toolkit page; it is in C but...
Check this paper on image similarity.
Take a look at this Stack Overflow question and this ResearchGate one.
If you have the time, this book, Feature Extraction & Image Processing for Computer Vision by Mark S. Nixon, has a lot of information on this kind of procedure.
You can try Fisher Discriminant Analysis or PCA to create a mapping and then evaluate with the Mahalanobis distance or the L2 norm, as sketched below.
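A minimal sketch of that last idea with scikit-learn and SciPy, assuming you have already stacked your images (flattened pixels or any of the features above) into an (n_images, n_features) array; the random array here is only a stand-in for that data:

```python
import numpy as np
from sklearn.decomposition import PCA
from scipy.spatial.distance import mahalanobis

features = np.random.rand(100, 4096)             # placeholder for your real feature matrix

# PCA mapping to a low-dimensional space.
pca = PCA(n_components=50).fit(features)
z = pca.transform(features)

# Mahalanobis distance uses the inverse covariance of the mapped data.
inv_cov = np.linalg.pinv(np.cov(z, rowvar=False))
d_mah = mahalanobis(z[0], z[1], inv_cov)
d_l2  = np.linalg.norm(z[0] - z[1])               # plain Euclidean distance
print(d_mah, d_l2)
```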
answered Apr 5 at 4:11 by Pedro Henrique Monforte