What is the range of values of the expected percentile ranking?
I'm currently reading

Hu, Koren, Volinsky: Collaborative Filtering for Implicit Feedback Datasets

One thing that confuses me is the "expected percentile ranking", a function the authors define to evaluate the quality of their recommendations. They define it in the "Evaluation methodology" section on page 6 as:

$$\overline{\text{rank}} = \frac{\sum_{u,i} r^t_{ui} \, \text{rank}_{ui}}{\sum_{u,i} r^t_{ui}}$$

where $u$ is a user, $i$ is an item (e.g. a TV show), and $r_{ui} \in [0, \infty)$ is how much user $u$ watched show $i$. $\text{rank}_{ui} \in [0, 1]$ is the percentile rank of item $i$ for user $u$. For example, it is 0 if item $i$ has the highest $r$ value for user $u$ and 1 if item $i$ has the lowest $r$ value for user $u$.

I'm not sure I understood it correctly.

The authors write that lower values of $\overline{\text{rank}}$ are more desirable and that random predictions lead to an expected value of $\overline{\text{rank}}$ of 0.5.

Examples

- Assume there is only one item. In this case $\text{rank} = 0$. Makes sense, as there cannot be any predictions.
- Assume there is only one user and two items with $r_{1,1} = 1$ and $r_{1,2} = 2$. Then:
$$\overline{\text{rank}} = \frac{1 \cdot \text{rank}_{1,1} + 2 \cdot \text{rank}_{1,2}}{1+2}$$
This means $\overline{\text{rank}} \in \{2/3, 1/3\}$.
- If there is only a single user and all $|I|$ values of $r_{ui}$ are the same, then $\overline{\text{rank}} = \sum_{ui} \text{rank}_{ui} = \frac{|I|}{2}$

Questions

- Is my understanding of the metric correct? In particular, my last example and the authors' statement that $\overline{\text{rank}} \geq 50\%$ indicates an algorithm no better than random seem off.
- What is $t$?

recommender-system math
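To make the definition concrete, here is a minimal Python sketch of the weighted average as it is written in the question. The function name `expected_percentile_rank` and the dictionary layout keyed by `(user, item)` are assumptions made for illustration; they do not come from the paper.

```python
def expected_percentile_rank(r_test, rank):
    """Weighted mean of percentile ranks, weighted by the test-set r values.

    r_test: dict mapping (user, item) -> r^t_ui, assumed to be >= 0
    rank:   dict mapping (user, item) -> percentile rank in [0, 1]
    Both names and the dict layout are illustrative assumptions.
    """
    total_weight = sum(r_test.values())
    if total_weight == 0:
        raise ValueError("the test set contains no positive r value")
    weighted_sum = sum(r * rank[key] for key, r in r_test.items())
    return weighted_sum / total_weight


# One user, two items with r = 1 and r = 2 (the two-item example).
r_test = {(1, 1): 1.0, (1, 2): 2.0}

# Item 2 is ranked first (percentile 0) and item 1 last (percentile 1).
good = expected_percentile_rank(r_test, {(1, 1): 1.0, (1, 2): 0.0})

# Item 1 is ranked first and item 2 last.
bad = expected_percentile_rank(r_test, {(1, 1): 0.0, (1, 2): 1.0})

print(good, bad)
```

The two orderings give 1/3 and 2/3 respectively, matching the two-item example. Because the result is a weighted average of values in $[0, 1]$ with non-negative weights, it always lies in $[0, 1]$ as well.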
asked Apr 2 at 7:13
Martin Thoma
1 Answer
What is $t$?

It denotes the observed $r_{ui}$ in the one-week test set (page 6, left column).

Is my understanding of the metric correct?

The first two examples are correct. For the third one, assume the user-item relation $r_{ui}^t$ is a constant $a$ for all items in the test set and the predicted ranks are spread uniformly across $[0, 1]$. Then:

$$\overline{\text{rank}} = \frac{\sum_{u,i} r^t_{ui} \, \text{rank}_{ui}}{\sum_{u,i} r^t_{ui}} = \frac{\sum_{u,i} a \, \text{rank}_{ui}}{\sum_{u,i} a} = \frac{1}{|I|} \sum_{u,i} \text{rank}_{ui} = \frac{1}{|I|} \cdot \frac{|I|}{2} = \frac{1}{2}$$

This makes sense. The items are indistinguishable from the user's point of view, so no model can do better than random guessing: there is no observed preference that would help the model favor one item over another. Of course, another assumption here is that the training set (4 weeks) and the test set (the following week) come from the same distribution.
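The value of 1/2 can also be checked numerically. The sketch below is only an illustration under stated assumptions: a single user, percentile ranks taken as `position / (n_items - 1)` so that they run from 0 to 1, and helper names (`random_percentile_ranks`, `expected_percentile_rank`) that are hypothetical rather than taken from the paper.

```python
import random


def random_percentile_ranks(n_items, rng):
    """Give the items percentile ranks 0, 1/(n-1), ..., 1 in random order.

    Assumes n_items >= 2 and a single user; the helper is hypothetical.
    """
    order = list(range(n_items))
    rng.shuffle(order)
    return {item: pos / (n_items - 1) for pos, item in enumerate(order)}


def expected_percentile_rank(weights, ranks):
    """Weighted mean of the ranks, weighted by the test-set r values."""
    total = sum(weights.values())
    return sum(w * ranks[item] for item, w in weights.items()) / total


rng = random.Random(0)  # fixed seed so the run is reproducible
n_items = 5

# Case 1: constant r = a for every item. Any ordering gives 1/2.
constant = {item: 3.0 for item in range(n_items)}
const_value = expected_percentile_rank(
    constant, random_percentile_ranks(n_items, rng)
)

# Case 2: unequal r values, averaged over many random orderings.
unequal = {item: float(item + 1) for item in range(n_items)}
trials = 20000
avg_value = sum(
    expected_percentile_rank(unequal, random_percentile_ranks(n_items, rng))
    for _ in range(trials)
) / trials

print(const_value, avg_value)
```

With constant weights the result is 1/2 for every ordering, which is the third example. With unequal weights a single random ordering can land above or below 1/2, but the average over many random orderings stays close to 1/2, which is the authors' statement about random predictions.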
answered Apr 2 at 8:23
Esmailian