A/B testing: How to calculate p-value on post test segments?
My question on A/B testing is about doing post-test segmentation analysis.

For example: I run an A/B test on my website to track bounce rate. In the treatment group, I put a video explaining my company; in the control group, just plain text. I pick a segment of users, first-time visitors from the USA, to be split 50/50 into the two groups.

Metric I am tracking: average bounce rate (assume 20%).
Power: 0.8.
Effect size I expect to see: 10% relative, so the bounce rate should fall to 20% - 0.10 × 20% = 18%.
Calculated sample size required: say 1,000 for each group.
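The power calculation described above can be sketched with only the Python standard library. The significance level alpha = 0.05 is an assumption (the post does not state it), and the "1,000 per group" figure in the post is illustrative rather than derived from these exact numbers:

```python
# A minimal sketch of the sample-size calculation, using Cohen's h
# (the arcsine-transformed effect size for two proportions).
# alpha = 0.05 is an assumption; the post does not state it.
from math import asin, sqrt, ceil
from statistics import NormalDist

def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Approximate n per group for a two-sided two-proportion z-test."""
    h = 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(((z_alpha + z_beta) / h) ** 2)

n = sample_size_per_group(0.20, 0.18)
print(n)  # roughly 3,000 per group for this effect size
```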
Say I run the test for the correct amount of time. At the end of the test I get a p-value of 0.06, so I do not reject the null hypothesis.

However, when I do post-test segmentation analysis, I see, for example, that among users who signed up for a free trial, 44% played the video.

In this case, how do I calculate whether the 44% is significant, while taking the multiple comparison problem into account? In the Airbnb experiment, for instance, they did post-test segmentation analysis on browser type and were able to calculate a p-value.

My approach

Does this mean that for every segment I want to analyze, I need at least 1,000 samples? Also, how would I recalculate the p-value, given that the overall A/B test's p-value was already 0.06?
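One standard way to take the multiple comparison problem into account is to correct the per-segment p-values. A sketch with invented p-values, showing both the Bonferroni correction and the slightly more powerful Holm step-down procedure:

```python
# Hypothetical sketch of multiple-comparison corrections across post-hoc
# segments. All p-values below are made up for illustration.
alpha = 0.05
segment_pvalues = {
    "free-trial signups": 0.011,
    "chrome users": 0.030,
    "mobile visitors": 0.290,
}
m = len(segment_pvalues)

# Bonferroni: compare every p-value against alpha / m.
bonferroni = {seg: p < alpha / m for seg, p in segment_pvalues.items()}

# Holm step-down: sort ascending; the k-th smallest (0-based) is compared
# against alpha / (m - k); once one test fails, all later ones fail too.
holm, failed = {}, False
for k, (seg, p) in enumerate(sorted(segment_pvalues.items(),
                                    key=lambda kv: kv[1])):
    failed = failed or p >= alpha / (m - k)
    holm[seg] = not failed

print(bonferroni)  # only "free-trial signups" survives alpha/3 ≈ 0.0167
print(holm)
```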
statistics ab-test experiments
You probably need to start by studying how hypothesis testing works: en.wikipedia.org/wiki/Statistical_hypothesis_testing. For instance, what is your null hypothesis? Your alternative hypothesis? Your test statistic? And I don't know where your "does this mean ... I need 1000 samples" is coming from; you might need to explain your reasoning. Finally, please ask only one question per post.
– D.W., Jun 12 '18 at 23:23
Cross-posted: datascience.stackexchange.com/q/24702/8560, stats.stackexchange.com/q/313582/2921. Please do not post the same question on multiple sites. Each community should have an honest shot at answering without anybody's time being wasted.
– D.W., Jun 12 '18 at 23:26
asked Nov 14 '17 at 2:14 by jxn
1 Answer
Well, if you want to answer the question of whether a single segment reaches the same level, and you ignore all the other segments' behavior, then this should be the required number (given that the initial performance of the segments was the same).

As a warning: when you use too many segments, this can happen: https://xkcd.com/882/
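For a single chosen segment, assuming the raw per-arm counts are available, the test itself could be a two-proportion z-test. All counts below are hypothetical:

```python
# Sketch of a two-sided two-proportion z-test for one segment's
# control/treatment bounce counts. All counts are invented.
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """p-value for H0: the two underlying proportions are equal."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# e.g. 44/200 bounces in treatment vs 60/200 in control for this segment
p = two_proportion_z_test(44, 200, 60, 200)
print(round(p, 3))  # 0.068 for these made-up counts
```

Note that each such per-segment p-value would still need a multiplicity correction (e.g. Bonferroni) before declaring significance, as the xkcd strip above illustrates.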
answered Nov 14 '17 at 11:00 by El Burro