Abstract
At some moment in our lives, we are probably faced with the following question: How likely is it that you would recommend [company X] to a friend or colleague?. This question is related to the Net Promoter Score (NPS), a simple measure used by several companies as indicator of customer loyalty. Even though it is a well-known measure in the business world, studies that address the statistical properties or the sample size determination problem related to this measure are still scarce. We adopt a Bayesian approach to provide point and interval estimators for the NPS and discuss the determination of the sample size. Computational tools were implemented to use this methodology in practice. An illustrative example with data from financial services is also presented.
Key words
Average length criterion; customer loyalty; Dirichlet distribution multinomial distribution; sample size
Introduction
Reichheld 2003REICHHELD FF. 2003. The One Number You Need to Grow. Harvard Bus Rev 81(12): 46–55. proposed a statistics called Net Promoter Score (NPS) that may be used by a company as an indicator of customer loyalty. The author applied a questionnaire with some questions related to loyalty to a sample of customers of some industries, and with the purchase history of each customer it was possible to determine which questions had the strongest statistical correlation with repeat purchase or referrals. One of these questions performed better in most industries: “How likely is it that you would recommend [company X] to a friend or colleague?“. Reichheld 2003REICHHELD FF. 2003. The One Number You Need to Grow. Harvard Bus Rev 81(12): 46–55. suggested that the response to the this questions must be on a 0 to 10 rating scale. Then, it is considered “promoters“ the customers who respond with 9 or 10, “passives“ the customers who respond with 7 or 8, and “detractors“ the customers who respond with 0 through 6. The idea is that the more “promoters“ company X has, the bigger its growth. An estimate of the NPS is computed as the difference between the proportions (or percentages) of “promoters“ and “detractors“.
Keiningham et al. 2008KEININGHAM TL, AKSOY L, COOIL B, ANDREASSEN TW & WILLIAMS L. 2008. A Holistic Examination of Net Promoter. J Database Mark Cust Strategy Manag 15(2): 79–90. discuss the claims that NPS is the single most reliable indicator of a company’s ability to grow, and that it is a superior metric to costumer satisfaction. Markoulidakis et al. 2021MARKOULIDAKIS I, RALLIS I, GEORGOULAS I, KOPSIAFTIS G, DOULAMIS A & DOULAMIS N. 2021. Multiclass Confusion Matrix Reduction Method and Its Application on Net Promoter Score Classification Problem. Technologies 9(4): 81. approach the customer experience as a NPS classification problem via machine learning algorithms. Rocks 2016ROCKS B. 2016. Interval Estimation for the “Net Promoter Score". Am Stat 70(4): 365–372. presents a brief summary of some critiques about the NPS, see references therein. We may also cite Eskildsen & Kristensen 2011ESKILDSEN JK & KRISTENSEN K. 2011. The Accuracy of the Net Promoter Score Under Different Distributional Assumptions. 2011 International Conference on Quality, Reliability, Risk, Maintenance, and Safety Engineering. and Kristensen & Eskildsen 2014KRISTENSEN K & ESKILDSEN J. 2014. Is the NPS a Trustworthy Performance Measure? TQM J 26(2): 202–214. for related work.
In the context of statistical modeling, Rocks 2016ROCKS B. 2016. Interval Estimation for the “Net Promoter Score". Am Stat 70(4): 365–372. focus on estimating intervals for the NPS in a frequentist approach via Wald intervals and Score methods. Also, the author performs a study simulation to assess the coverage probability of the proposed interval estimates, and conclude that variations on the adjusted Wald and an iterative Score method performed better.
In Section 2, we present the frequentist approach to make inference about the NPS. In Section 3, we present the Bayesian approach with the multinomial/Dirichlet model, the methodologies to obtain point and interval estimates for the NPS. Also in Section 3, the problem of the minimum sample size determination is discussed and a study simulation is conducted. In Section 4, we present an illustrative example with data on financial services. We conclude with some remarks in Section 5.
Frequentist approach
Let and the respective numbers of detractors, passives and promoters in a customer sample of size ; and the respective proportions in the customer population. Then, the parameter of interest is , where , and an estimator for this parameter is given by . The Wald interval with confidence level is given by , where is the quantile of probability of the standard normal distribution and (Rocks 2016ROCKS B. 2016. Interval Estimation for the “Net Promoter Score". Am Stat 70(4): 365–372. ). In this context, when NPS is used to estimate , the error is less than or equal to with confidence . Then, we may obtain so that we are confident that the error in estimating is less than or equal to a specified bound () on the error, i.e.,
which implies that Another interval proposed by Rocks 2016ROCKS B. 2016. Interval Estimation for the “Net Promoter Score". Am Stat 70(4): 365–372. and based on Goodman 1964GOODMAN LA. 1964. Simultaneous Confidence Intervals for Contrasts Among Multinomial Populations. Ann Math Stat 35(2): 716–725. is , where is the quantile of probability of the chi-squared distribution with two degrees of freedom. In the same way of the previous interval, we obtain thatSimilar expressions to (1) and (2) for may be obtained from the other intervals proposed by Rocks 2016ROCKS B. 2016. Interval Estimation for the “Net Promoter Score". Am Stat 70(4): 365–372. . But note that, in order to obtain the minimum we have to set a value for , which depends on unknown quantities ( and ). One solution is to set based on previous studies or consider the maximum value of , which is 1 (Rocks 2016ROCKS B. 2016. Interval Estimation for the “Net Promoter Score". Am Stat 70(4): 365–372. ). Another problem is that these intervals are based on the respective asymptotic distribution in each case, i.e., as the , and the computed minimum may not provide a good approximation to the asymptotic distribution.
In order to circumvent these problems, we propose a Bayesian model in order to make inference for the NPS and to establish a sample size determination methodology. Even though it is a well-known and widely used measure in the business world, studies that address the statistical properties or the sample size determination problem are still scarce. See Rossi & Allenby 2003ROSSI PE & ALLENBY GM. 2003. Bayesian Statistics and Marketing. Market Sci 22(3): 304–328. for an exposition of the usefulness of the Bayesian methods in marketing.
Bayesian approach
Let , where and are the proportions of detractors, passives and promoters in the customer population, respectively. Then, the NPS in the respective population is given by , where , which is the parameter of interest. In a sample of customers we count the number of customers in each category based on their responses for the aforementioned question. Let , where and are the numbers of customers categorized as detractors, passives and promoters, respectively, in the customer sample of size .
Multinomial/Dirichlet model
Given , we assume a multinomial distribution for the counts , and we denote . The respective probability distribution is given by
where such that , and .The natural (conjugate) choice for the prior distribution of is a Dirichlet distribution. We denote and the respective probability density function is given by
where , is the gamma function and is a vector of positive hyperparameters which must be set by the researcher and must reflect the prior knowledge about at the moment. In Figures 1 and 2 we present the ternary plots for the Dirichlet distribution for different values for . For values of close to zero the distribution of concentrates in the corners of the plot, which implies that the distribution of will mostly concentrate around the values -1, 0 and 1. As the values in vector increase but still smaller than 1, the distribution of will mostly concentrates on the edges. If the distribution of is equivalent to a uniform distribution over the 2-simplex. For values of , and greater than 1, the distribution of becomes increasingly concentrated in a given region of the plot as the values of , and increase, where the determination of the region of concentration depends on the values of , and .Let , Adcock 1987ADCOCK CJ. 1987. A Bayesian Approach to Calculating Sample Sizes for Multinomial Sampling. J R Stat Soc Series D 36: 155–159. proposes a method to determine before sampling if there are two independent estimates and for , and note that the expected value of is given by
where is estimated by , . For example, suppose that and , taking as an estimate for and solving the expected value equation for , we have that , and multiplying this value by , , we obtain . On the other hand, if we have that and , we obtain that and , i.e., increases as the distance between and decreases.
Given the multinomial distribution to model the counts and the Dirichlet distribution for , the model may be written hierarchically as follows
In this setting, given a observation of , we have that the posterior distribution for is a Dirichlet distribution with parameter , i.e., (Turkman et al. 2019TURKMAN MAA, PAULINO CD & MÜLLER P. 2019. Computational Bayesian Statistics: An Introduction. Cambridge: Cambridge University Press. ). Also, Bayesian updating becomes straightforward since the current parameters of the posterior distribution may be used as the hyperparameters of the prior distribution in the next sampling of . Given a way to generate random values from the Dirichlet distribution, this provide us a simple way to draw values from the posterior distribution of in order to obtain, approximately, posterior summaries as the mean, median, variance, quantiles, etc, and make inferences about the NPS.
An algorithm to obtain a sample of size from the posterior distribution of the is outlined as follows.
-
Set the values of , and (e.g., ).
-
Draw a value of from the Dirichlet distribution with parameter .
-
Compute and keep this value.
-
Repeat Steps 2-3 times.
It is well known that the marginal distributions of a Dirichlet distribution are beta distributions (Kotz et al. 2000KOTZ S, BALAKRISHNAN N & JOHNSON NL. 2000. Continuous Multivariate Distributions, Volume 1: Models and Applications. New York: Wiley & Sons. ). Let . Then, it follows that
which give us that the mean of the posterior distribution of the NPS is where . This mean may be used as a point estimator for the NPS. The respective variance is given byA credible interval that we may construct is based on (3)and (4), i.e., , where is a fixed constant that controls the number of posterior standard deviations within the credible interval. This type of credible interval may be derived from decision theory, where a loss function is composed of measures of bias and discrepancy, and is related to the calibration of the trade-off between these measures (Rice et al. 2008RICE KM, LUMLEY T & SZPIRO AA. 2008. Trading Bias for Precision: Decision Theory for Intervals and Sets. ). Also in the context of decision theory, the may be determined for a fixed coverage probability using asymptotic properties for posterior distributions. For example, see Costa et al. 2021COSTA EG, PAULINO CD & SINGER JM. 2021. Verifying Compliance with Ballast Water Standards: A Decision-Theoretic Approach. SORT 45(1): 19–32. , 5 for more details. We developed an Excel spreadsheet that computes this credible interval and a point estimate based on (3) and (4). See Supplementary Material for more details.
Another credible interval may be specified by the highest posterior density (HPD) interval. In this case we use a Monte Carlo approach to approximate the HPD interval. In other words, we use a sample drawn from the posterior distribution of , which may be easily done since the posterior distribution is a Dirichlet distribution. See Turkman et al. 2019TURKMAN MAA, PAULINO CD & MÜLLER P. 2019. Computational Bayesian Statistics: An Introduction. Cambridge: Cambridge University Press. , pgs. 47-48 for more details.
Minimum sample size
To determine the minimum sample size required to estimate with a pre-specified precision, we consider a criterion based on the average length of credible intervals. The posterior credible interval accounts for the magnitude of the NPS and this may help the company to know when to perform a gap analysis and create a business action plan in order to improve the NPS, i.e., increase the NPS until the company has more promoters than detractors ().
Let and be the lower and upper bounds of the HPD interval for . The rationale here is to set the minimum Bayesian coverage probability and obtain the minimum sample size by requiring that the length of the HPD interval be such that
where is the maximum admissible length for the HPD interval and the expected value is computed based on the marginal probability function of the outcomes (). This is called average length criterion (ALC). See Costa et al. 2021COSTA EG, PAULINO CD & SINGER JM. 2021. Sample Size for Estimating Organism Concentration in Ballast Water: A Bayesian Approach. Braz J Probab Stat 35(1): 158–171. doi:10.1214/20-BJPS470.10.1214/20-BJPS470... and references therein for more details about this criterion.
Since it is impractical to obtain analytically the lower and upper bounds of the HPD interval for , we use a Monte Carlo approach (Chen & Shao 1999CHEN M-H & SHAO Q-M. 1999. Monte Carlo Estimation of Bayesian Credible and HPD Intervals. J Comput Graph Stat 8(1): 69–92. ) to obtain the respective bounds as well as the the respective expected value.
An algorithm to obtain the minimum sample size satisfying this criterion is outlined as follows.
-
Set values for , , and take .
-
Draw a sample of size (e.g., ) of ; to draw , first draw one value of from the Dirichlet distribution with parameter and given this value, draw from the multinomial distribution with parameter .
-
Obtain the HPD interval of probability for each and then the respective interval length: for each vector drawn in Step 2, obtain the lower and upper bounds of the HPD interval of probability as indicated in Chen & Shao 1999CHEN M-H & SHAO Q-M. 1999. Monte Carlo Estimation of Bayesian Credible and HPD Intervals. J Comput Graph Stat 8(1): 69–92. . Then, compute the difference between the upper and lower bounds for each vector drawn, in order to obtain the interval lengths.
-
Compute the average of the HPD interval lengths.
-
If this average is lower or equal to , stop. The value obtained in this step is the required value. Otherwise, set and return to Step 2.
We developed an R package (R Core Team 2022R CORE TEAM. 2022. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing.
R: A Language and Environment for Statis...
) which provides a function to obtain point and interval estimates via Monte Carlo simulation as discussed in the above section. Also, the package have a function to compute the minimum sample size to estimate the NPS through HPD interval via ALC (see Supplementary Material).
In Tables I and II, we present the minimum sample size to estimate the NPS using the HPD interval computed via ALC for all the scenarios for the prior distribution of presented in Figures 1 and 2, and some values of and . In the algorithm to obtain the minimum , we consider and in the Step 3 we draw samples of size from the posterior distribution of the NPS in order to obtain the HPD interval bounds. For fixed (), the minimum decreases as () increases, as expected (Tables I and II).
Ternary plots for the Dirichlet distribution with: and in the top row; and in the middle row; and in the bottom row.
Ternary plots for the Dirichlet distribution with: and in the top row; and in the bottom row.
Also in Tables I and II, we observe that the minimum increases when changes from to , which makes sense since the first case represents a Dirichlet distribution concentrated in the corners that implies a prior distribution for mostly concentrated around the values -1, 0, and 1; and the second case represents a Dirichlet distribution mostly concentrated on the edges. For , , we observe that the minimum increases as increases until some point and after that point the minimum decreases. It seems that there is some value in which the minimum increases as approaches from left, and for the minimum decreases as approaches infinity. Also, we observe that the magnitude of depends on , the smaller the , the larger the , i.e., when the prior knowledge is not enough to make the minimum start decreasing and satisfy the ALC for the fixed values of and . This is consistent with the interpretation that is a “prior sample size“ and that it may be viewed as a measure of the quality of the prior knowledge (Adcock 1987ADCOCK CJ. 1987. A Bayesian Approach to Calculating Sample Sizes for Multinomial Sampling. J R Stat Soc Series D 36: 155–159. ).
For the adopted model parameters, the running time to compute the minimum varied from 47 seconds to 5.52 hours, depending on the setting. The smaller the values of and/or , the larger the running time. The computers that have been used have the following characteristics: (i) OS Linux Ubuntu 20.04, RAM 7.7 GB, processor AMD PRO A8-8600B; and (ii) OS Linux Ubuntu 22.04, RAM 5.6 GB, processor AMD Ryzen 5 5500U.
Simulation study
We conduct a simulation study to verify whether the HPD intervals obtained with the sample sizes proposed in Section 3.2 satisfy the ALC.
For each sample size obtained via the ALC displayed in Tables I and II, and the respective values of , and , we perform the following steps: (i) we draw from the Dirichlet distribution and compute the correspondent NPS ; (ii) given the , we draw from the multinomial distribution; (iii) given and , we draw a sample of size 1000 from the posterior distribution of and obtain the respective HPD interval with probability ; (iv) we compute the length of the respective HPD interval and verify if belongs to this interval; (v) we repeat the steps (i)-(iv) 1000 times, and we compute the average of the lengths of the HPD intervals and the proportion of times that belonged to the HPD interval.
ALC based minimum sample size to estimate the NPS through the HPD interval via multinomial/Dirichlet model with 𝛼 = (𝛼1, 𝛼2, 𝛼3).
ALC based minimum sample size to estimate the NPS through the HPD interval via multinomial/Dirichlet model with 𝛼 = (𝛼1, 𝛼2, 𝛼3).
In Tables III-VI we displayed the simulation results. As expected, the ALC based lengths and coverage probabilities of HPD intervals estimated via the simulation study are close to the respective values of and for each sample size in Tables I and II. For the estimated ALC based lengths of HPD intervals, in general the values obtained in the simulation study are slightly larger than the respective , with a difference in the third decimal place. On the other hand, for the estimated ALC based coverage probabilities, in general the values obtained are slightly smaller than the respective .
Table III ALC based length of HPD intervals estimated via simulation for some scenarios under the multinomial/Dirichlet model using sample sizes displayed in Table I.ℓmax\𝜌 | (0.1, 0.1, 0.1) | (0.99, 0.99, 0.99) | |||||
---|---|---|---|---|---|---|---|
0.01 | 0.05 | 0.10 | 0.01 | 0.05 | 0.10 | ||
0.02 | 0.0219 | 0.0216 | 0.0227 | 0.0203 | 0.0209 | 0.0210 | |
0.04 | 0.0429 | 0.0440 | 0.0440 | 0.0407 | 0.0417 | 0.0420 | |
0.06 | 0.0604 | 0.0660 | 0.0627 | 0.0612 | 0.0623 | 0.0632 | |
0.08 | 0.0821 | 0.0837 | 0.0828 | 0.0815 | 0.0846 | 0.0848 | |
0.10 | 0.1016 | 0.0986 | 0.1041 | 0.1005 | 0.1060 | 0.1040 | |
0.12 | 0.1217 | 0.1235 | 0.1202 | 0.1218 | 0.1239 | 0.1262 | |
0.14 | 0.1442 | 0.1391 | 0.1445 | 0.1417 | 0.1446 | 0.1441 | |
0.16 | 0.1542 | 0.1619 | 0.1677 | 0.1638 | 0.1650 | 0.1661 | |
0.18 | 0.1817 | 0.1861 | 0.1782 | 0.1825 | 0.1857 | 0.1845 | |
0.20 | 0.2034 | 0.2068 | 0.1959 | 0.2013 | 0.2093 | 0.2055 | |
ℓmax\𝜌 | (1, 1, 1) | (5, 5, 5) | |||||
0.01 | 0.05 | 0.10 | 0.01 | 0.05 | 0.10 | ||
0.02 | 0.0207 | 0.0212 | 0.0208 | 0.0206 | 0.0207 | 0.0208 | |
0.04 | 0.0407 | 0.0417 | 0.0426 | 0.0408 | 0.0415 | 0.0412 | |
0.06 | 0.0612 | 0.0626 | 0.0628 | 0.0608 | 0.0622 | 0.0625 | |
0.08 | 0.0840 | 0.0856 | 0.0819 | 0.0811 | 0.0829 | 0.0826 | |
0.10 | 0.1027 | 0.1034 | 0.1044 | 0.1022 | 0.1026 | 0.1035 | |
0.12 | 0.1230 | 0.1238 | 0.1231 | 0.1225 | 0.1237 | 0.1243 | |
0.14 | 0.1425 | 0.1463 | 0.1450 | 0.1432 | 0.1438 | 0.1445 | |
0.16 | 0.1630 | 0.1672 | 0.1665 | 0.1616 | 0.1647 | 0.1659 | |
0.18 | 0.1818 | 0.1834 | 0.1837 | 0.1821 | 0.1869 | 0.1853 | |
0.20 | 0.2044 | 0.2056 | 0.2075 | 0.2020 | 0.2051 | 0.2061 | |
ℓmax\𝜌 | (50, 50, 50) | (100, 100, 100) | |||||
0.01 | 0.05 | 0.10 | 0.01 | 0.05 | 0.10 | ||
0.02 | 0.0204 | 0.0207 | 0.0207 | 0.0203 | 0.0207 | 0.0207 | |
0.04 | 0.0408 | 0.0414 | 0.0412 | 0.0407 | 0.0413 | 0.0414 | |
0.06 | 0.0609 | 0.0618 | 0.0620 | 0.0609 | 0.0620 | 0.0622 | |
0.08 | 0.0813 | 0.0824 | 0.0826 | 0.0811 | 0.0827 | 0.0827 | |
0.10 | 0.1011 | 0.1031 | 0.1037 | 0.1013 | 0.1030 | 0.1033 | |
0.12 | 0.1220 | 0.1234 | 0.1234 | 0.1219 | 0.1238 | 0.1235 | |
0.14 | 0.1426 | 0.1444 | 0.1441 | 0.1415 | 0.1439 | 0.1446 | |
0.16 | 0.1626 | 0.1644 | 0.1648 | 0.1620 | 0.1645 | 0.1527 | |
0.18 | 0.1826 | 0.1853 | 0.1855 | 0.1824 | 0.1815 | 0.1525 | |
0.20 | 0.2017 | 0.2064 | 0.2056 | 0.2022 | 0.1812 | 0.1523 |
ℓmax\𝜌 | (0.1, 0.1, 0.1) | (0.99, 0.99, 0.99) | |||||
---|---|---|---|---|---|---|---|
0.01 | 0.05 | 0.10 | 0.01 | 0.05 | 0.10 | ||
0.02 | 0.982 | 0.952 | 0.892 | 0.992 | 0.941 | 0.894 | |
0.04 | 0.993 | 0.950 | 0.900 | 0.987 | 0.941 | 0.904 | |
0.06 | 0.989 | 0.946 | 0.899 | 0.981 | 0.942 | 0.901 | |
0.08 | 0.977 | 0.952 | 0.901 | 0.990 | 0.951 | 0.885 | |
0.10 | 0.990 | 0.944 | 0.887 | 0.982 | 0.945 | 0.901 | |
0.12 | 0.989 | 0.953 | 0.906 | 0.985 | 0.932 | 0.876 | |
0.14 | 0.987 | 0.951 | 0.896 | 0.991 | 0.948 | 0.885 | |
0.16 | 0.991 | 0.939 | 0.882 | 0.984 | 0.950 | 0.883 | |
0.18 | 0.986 | 0.941 | 0.882 | 0.987 | 0.944 | 0.907 | |
0.20 | 0.987 | 0.952 | 0.900 | 0.996 | 0.945 | 0.890 | |
ℓmax\𝜌 | (1, 1, 1) | (5, 5, 5) | |||||
0.01 | 0.05 | 0.10 | 0.01 | 0.05 | 0.10 | ||
0.02 | 0.992 | 0.942 | 0.889 | 0.982 | 0.939 | 0.889 | |
0.04 | 0.989 | 0.942 | 0.889 | 0.996 | 0.943 | 0.882 | |
0.06 | 0.986 | 0.944 | 0.901 | 0.987 | 0.944 | 0.894 | |
0.08 | 0.987 | 0.942 | 0.893 | 0.994 | 0.944 | 0.886 | |
0.10 | 0.989 | 0.955 | 0.898 | 0.988 | 0.948 | 0.904 | |
0.12 | 0.984 | 0.952 | 0.883 | 0.987 | 0.955 | 0.903 | |
0.14 | 0.988 | 0.950 | 0.878 | 0.992 | 0.944 | 0.908 | |
0.16 | 0.988 | 0.941 | 0.895 | 0.985 | 0.931 | 0.899 | |
0.18 | 0.980 | 0.949 | 0.908 | 0.990 | 0.943 | 0.890 | |
0.20 | 0.984 | 0.955 | 0.900 | 0.985 | 0.949 | 0.878 | |
ℓmax\𝜌 | (50, 50, 50) | (100, 100, 100) | |||||
0.01 | 0.05 | 0.10 | 0.01 | 0.05 | 0.10 | ||
0.02 | 0.985 | 0.937 | 0.906 | 0.985 | 0.942 | 0.897 | |
0.04 | 0.986 | 0.935 | 0.868 | 0.986 | 0.939 | 0.894 | |
0.06 | 0.994 | 0.953 | 0.883 | 0.991 | 0.933 | 0.910 | |
0.08 | 0.991 | 0.950 | 0.898 | 0.990 | 0.945 | 0.892 | |
0.10 | 0.990 | 0.939 | 0.911 | 0.991 | 0.931 | 0.891 | |
0.12 | 0.988 | 0.944 | 0.888 | 0.983 | 0.943 | 0.893 | |
0.14 | 0.988 | 0.947 | 0.894 | 0.988 | 0.932 | 0.889 | |
0.16 | 0.984 | 0.952 | 0.905 | 0.983 | 0.950 | 0.891 | |
0.18 | 0.990 | 0.942 | 0.901 | 0.985 | 0.963 | 0.898 | |
0.20 | 0.993 | 0.944 | 0.899 | 0.986 | 0.949 | 0.875 |
ℓmax\𝜌 | (2, 5, 8) | (4, 10, 16) | |||||
---|---|---|---|---|---|---|---|
0.01 | 0.05 | 0.10 | 0.01 | 0.05 | 0.10 | ||
0.02 | 0.0205 | 0.0209 | 0.0207 | 0.0205 | 0.0208 | 0.0207 | |
0.04 | 0.0409 | 0.0417 | 0.0414 | 0.0406 | 0.0415 | 0.0415 | |
0.06 | 0.0604 | 0.0625 | 0.0620 | 0.0608 | 0.0621 | 0.0620 | |
0.08 | 0.0818 | 0.0834 | 0.0823 | 0.0816 | 0.0826 | 0.0830 | |
0.10 | 0.1022 | 0.1038 | 0.1025 | 0.1021 | 0.1038 | 0.1029 | |
0.12 | 0.1220 | 0.1236 | 0.1232 | 0.1218 | 0.1239 | 0.1233 | |
0.14 | 0.1415 | 0.1438 | 0.1433 | 0.1419 | 0.1438 | 0.1442 | |
0.16 | 0.1623 | 0.1632 | 0.1664 | 0.1616 | 0.1650 | 0.1661 | |
0.18 | 0.1825 | 0.1845 | 0.1833 | 0.1840 | 0.1843 | 0.1852 | |
0.20 | 0.2003 | 0.2040 | 0.2061 | 0.2024 | 0.2083 | 0.2055 | |
ℓmax\𝜌 | (8, 20, 32) | (16, 40, 64) | |||||
0.01 | 0.05 | 0.10 | 0.01 | 0.05 | 0.10 | ||
0.02 | 0.0203 | 0.0207 | 0.0208 | 0.0203 | 0.0207 | 0.0208 | |
0.04 | 0.0408 | 0.0417 | 0.0414 | 0.0405 | 0.0413 | 0.0412 | |
0.06 | 0.0608 | 0.0620 | 0.0623 | 0.0610 | 0.0617 | 0.0620 | |
0.08 | 0.0821 | 0.0822 | 0.0829 | 0.0813 | 0.0826 | 0.0828 | |
0.10 | 0.1013 | 0.1029 | 0.1036 | 0.1012 | 0.1032 | 0.1035 | |
0.12 | 0.1217 | 0.1238 | 0.1232 | 0.1222 | 0.1234 | 0.1244 | |
0.14 | 0.1414 | 0.1447 | 0.1443 | 0.1419 | 0.1438 | 0.1439 | |
0.16 | 0.1616 | 0.1644 | 0.1646 | 0.1610 | 0.1645 | 0.1645 | |
0.18 | 0.1815 | 0.1852 | 0.1844 | 0.1819 | 0.1840 | 0.1847 | |
0.20 | 0.2028 | 0.2051 | 0.2061 | 0.2019 | 0.2065 | 0.2040 |
ℓmax\𝜌 | (2, 5, 8) | (4, 10, 16) | |||||
---|---|---|---|---|---|---|---|
0.01 | 0.05 | 0.10 | 0.01 | 0.05 | 0.10 | ||
0.02 | 0.990 | 0.947 | 0.898 | 0.981 | 0.946 | 0.889 | |
0.04 | 0.987 | 0.954 | 0.891 | 0.987 | 0.934 | 0.893 | |
0.06 | 0.985 | 0.930 | 0.906 | 0.986 | 0.946 | 0.898 | |
0.08 | 0.985 | 0.941 | 0.907 | 0.988 | 0.941 | 0.884 | |
0.10 | 0.987 | 0.948 | 0.899 | 0.986 | 0.949 | 0.891 | |
0.12 | 0.990 | 0.939 | 0.892 | 0.986 | 0.947 | 0.881 | |
0.14 | 0.981 | 0.948 | 0.909 | 0.989 | 0.942 | 0.887 | |
0.16 | 0.981 | 0.934 | 0.897 | 0.991 | 0.945 | 0.897 | |
0.18 | 0.986 | 0.956 | 0.892 | 0.980 | 0.938 | 0.894 | |
0.20 | 0.988 | 0.955 | 0.895 | 0.993 | 0.943 | 0.896 | |
ℓmax\𝜌 | (8, 20, 32) | (16, 40, 64) | |||||
0.01 | 0.05 | 0.10 | 0.01 | 0.05 | 0.10 | ||
0.02 | 0.990 | 0.944 | 0.885 | 0.989 | 0.944 | 0.902 | |
0.04 | 0.988 | 0.939 | 0.903 | 0.986 | 0.940 | 0.881 | |
0.06 | 0.993 | 0.944 | 0.898 | 0.994 | 0.939 | 0.891 | |
0.08 | 0.984 | 0.947 | 0.894 | 0.985 | 0.941 | 0.883 | |
0.10 | 0.983 | 0.943 | 0.890 | 0.989 | 0.930 | 0.874 | |
0.12 | 0.988 | 0.954 | 0.907 | 0.990 | 0.950 | 0.893 | |
0.14 | 0.989 | 0.951 | 0.893 | 0.987 | 0.938 | 0.879 | |
0.16 | 0.993 | 0.956 | 0.899 | 0.987 | 0.954 | 0.887 | |
0.18 | 0.990 | 0.945 | 0.899 | 0.984 | 0.956 | 0.890 | |
0.20 | 0.985 | 0.950 | 0.887 | 0.984 | 0.940 | 0.895 |
Illustrative example
In the situation where the sample is not obtained yet, we may determine the sample size to obtain a HPD interval for the NPS via ALC by setting , , , and . For example, if we consider , and , the minimum sample size is 655 (Table I), i.e., to obtain a HPD interval with maximum length equals to 0.10 and respective coverage probability equals to 0.95, we should ask 655 people the NPS question and then categorize them into detractors, passives and promoters in order to obtain the observed values of , and , respectively.
Given the difficult to obtain a real NPS dataset from a company because such a information is very sensitive, we consider a hypothetical dataset on financial services in three markets in the year of 2021 (see Supplementary Material) to mimic the application of the methods in a real situation.
To illustrate the methodology and the Bayesian updating, we consider the data from the first and second quarter of the Mexico market. For the first quarter we have no prior knowledge, then we set . For this quarter the numbers of detractors, passives and promoters are 136, 82 and 188, respectively, which implies a posterior Dirichlet distribution with vector parameter for . Drawing a sample of size from the respective posterior distribution of the NPS and computing its summaries, we have that a point estimate for the NPS is 0.127 and the HPD 95% interval is [0.038, 0.206]. For the second quarter, we may use the posterior parameter of the first quarter as the prior parameter for the current quarter, i.e., , and . For the second quarter the numbers of detractors, passives and promoters are 133, 96 and 190, respectively, which implies a posterior Dirichlet distribution with vector parameter for . Again, drawing a sample of size for the respective posterior distribution of the NPS, we have a point estimate of 0.131 for the NPS and the HPD 95% interval is [0.072, 0.192]. All these results were obtained via the developed R package. Another way to obtain point and interval estimates for the NPS without needing simulation methods is to compute (3) and (4) for this data, as discussed in Section 3.1.
Concluding remarks
We discussed the sample size determination for estimating the NPS. To approach this problem we consider a Bayesian approach via a multinomial/Dirichlet model and the average length criterion. We provide point and interval estimators for the NPS as in closed forms or via drawing a sample from the posterior distribution of the NPS and computing its summaries. A simulation study is conducted to verify whether the HPD intervals obtained with the proposed minimum sample sizes satisfy the ALC. Also, the Bayesian approach makes the inference updating becomes straightforward as illustrated in Section 4, i.e., a sequential procedure to estimate the NPS. Computational tools were developed to use these methodologies in practice.
Supplementary Material
The Excel spreadsheet is available at https://doi.org/10.5281/zenodo.7679211. The R package is available at https://github.com/eliardocosta/BayesNPS (DOI: 10.5281/zenodo.7617770). The data used in the illustrative example is available at https://www.kaggle.com/code/charlottetu/net-promoterscore/.
ACKNOWLEDGMENTS
The authors would like to thank the Associate Editor and the anonymous referees for the enlightening comments and constructive suggestions, which improved the paper considerably.
- ADCOCK CJ. 1987. A Bayesian Approach to Calculating Sample Sizes for Multinomial Sampling. J R Stat Soc Series D 36: 155–159.
- CHEN M-H & SHAO Q-M. 1999. Monte Carlo Estimation of Bayesian Credible and HPD Intervals. J Comput Graph Stat 8(1): 69–92.
- COSTA EG, PAULINO CD & SINGER JM. 2021. Sample Size for Estimating Organism Concentration in Ballast Water: A Bayesian Approach. Braz J Probab Stat 35(1): 158–171. doi:10.1214/20-BJPS470
- COSTA EG, PAULINO CD & SINGER JM. 2021. Verifying Compliance with Ballast Water Standards: A Decision-Theoretic Approach. SORT 45(1): 19–32.
- ESKILDSEN JK & KRISTENSEN K. 2011. The Accuracy of the Net Promoter Score Under Different Distributional Assumptions. 2011 International Conference on Quality, Reliability, Risk, Maintenance, and Safety Engineering.
- GOODMAN LA. 1964. Simultaneous Confidence Intervals for Contrasts Among Multinomial Populations. Ann Math Stat 35(2): 716–725.
- KEININGHAM TL, AKSOY L, COOIL B, ANDREASSEN TW & WILLIAMS L. 2008. A Holistic Examination of Net Promoter. J Database Mark Cust Strategy Manag 15(2): 79–90.
- KOTZ S, BALAKRISHNAN N & JOHNSON NL. 2000. Continuous Multivariate Distributions, Volume 1: Models and Applications. New York: Wiley & Sons.
- KRISTENSEN K & ESKILDSEN J. 2014. Is the NPS a Trustworthy Performance Measure? TQM J 26(2): 202–214.
- MARKOULIDAKIS I, RALLIS I, GEORGOULAS I, KOPSIAFTIS G, DOULAMIS A & DOULAMIS N. 2021. Multiclass Confusion Matrix Reduction Method and Its Application on Net Promoter Score Classification Problem. Technologies 9(4): 81.
- R CORE TEAM. 2022. R: A Language and Environment for Statistical Computing Vienna, Austria: R Foundation for Statistical Computing.
» R: A Language and Environment for Statistical Computing - REICHHELD FF. 2003. The One Number You Need to Grow. Harvard Bus Rev 81(12): 46–55.
- RICE KM, LUMLEY T & SZPIRO AA. 2008. Trading Bias for Precision: Decision Theory for Intervals and Sets.
- ROCKS B. 2016. Interval Estimation for the “Net Promoter Score". Am Stat 70(4): 365–372.
- ROSSI PE & ALLENBY GM. 2003. Bayesian Statistics and Marketing. Market Sci 22(3): 304–328.
- TURKMAN MAA, PAULINO CD & MÜLLER P. 2019. Computational Bayesian Statistics: An Introduction. Cambridge: Cambridge University Press.
Publication Dates
-
Publication in this collection
27 May 2024 -
Date of issue
2024
History
-
Received
1 Sept 2023 -
Accepted
9 Jan 2024