III. HOMEWORK PROBLEMS

8. GOAL SOFTENING (3)

It is well known that the accuracy of a sampling scheme depends only on the size of the sample and NOT on the size of the underlying population, which we shall assume to be infinite in this problem. Suppose the performance J(q) of a system, viewed over all values of a design variable q, is normally distributed, N(0, σ_J). We randomly sample N designs q1, q2, ..., qN in an experiment and observe the system performance J(qi) in additive noise N(0, σ_noise), i.e., J(qi)observed = J(qi) + noise. We now ask the question: what is
Prob{max[J(qi)observed, i = 1, ..., N] in top-5% of J(q)} = p = ?

assuming we are interested in maximizing performance.

  1. Purely as a test of your probabilistic intuition, what do you think is the likely value of p for the case σ_J = σ_noise (noise as large as the spread in performance) and N = 100?

       A. p < 0.4
       B. 0.4 <= p <= 0.75
       C. p > 0.75

     Choose one alternative and then calculate the answer to see if you are correct.

  2. Whatever the value of p in the above, now consider doing m independent experiments of N samples each and ask

       Prob{at least 1 of max[J(qi)observed, i = 1, ..., N] of the m experiments in top-5% of J(q)} = p(m) = ?

     Calculate p(m) as a function of m and p. What did you learn from this calculation? (Try out a couple of your guesses from the previous question for different values of m.)

  3. Suppose instead of doing m experiments we did ONE experiment with mN samples and asked

       Prob{max[J(qi)observed, i = 1, ..., mN] in top-5% of J(q)} = p* = ?

     Is p* > p(m) or p(m) > p*? Can you relate this to the ideas of Ordinal Optimization?
 
SOLUTION:
  1. The correct estimate is p ~ 0.52, which is not very good. In fact, if we increase the sample size N to 300, p increases only slightly (see 3. below). The easiest way to calculate this probability is by direct simulation.

  2. To have at least ONE of the best designs from m separate experiments, say m = 3, of 100 samples each belong to the top 5% is equivalent to at least one success in 3 Bernoulli trials, each with success probability p (obtained in the first question). This is given by

       p(3) = 1 - (1 - p)^3

     and, in general, p(m) = 1 - (1 - p)^m. For p = 0.4, p(3) = 0.784, which is considerably more interesting; for p ~ 0.52, p(3) ~ 0.889. Note also that once p is determined, we can calculate p(m) for different m without further simulation.

  3. We assert that p(3) > p*. This is achieved by asking a softer question (vs. asking that the best of 3*100 = 300 samples belong to the top 5%).
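
The direct simulation mentioned in 1. and the comparison in 3. can be sketched as follows. This is only an illustrative sketch under stated assumptions, not the simulation used to obtain the numbers above: it reads "max[J(qi)observed] in top-5% of J(q)" as "the TRUE performance of the design with the best OBSERVED performance exceeds the 95th percentile of an infinite N(0, σ_J) population", and it treats σ_J and σ_noise as standard deviations (for σ_J = σ_noise the result is the same if both are read as variances). The function names, the trial count and the seeds are hypothetical choices, and the estimates carry Monte Carlo error.

```python
import random
from statistics import NormalDist


def estimate_p(n_samples, sigma_j=1.0, sigma_noise=1.0,
               top_fraction=0.05, trials=2000, seed=0):
    """Monte Carlo estimate of Prob{observed best design is truly in the top 5%}."""
    rng = random.Random(seed)
    # Assumption: infinite population, so the top-5% cutoff is the
    # 95th percentile of N(0, sigma_j).
    threshold = NormalDist(0.0, sigma_j).inv_cdf(1.0 - top_fraction)
    hits = 0
    for _ in range(trials):
        best_observed = float("-inf")
        best_true = 0.0
        for _ in range(n_samples):
            true_j = rng.gauss(0.0, sigma_j)                  # J(qi)
            observed = true_j + rng.gauss(0.0, sigma_noise)   # J(qi)observed
            if observed > best_observed:
                best_observed, best_true = observed, true_j
        # Success if the design picked by its noisy score is truly good.
        if best_true > threshold:
            hits += 1
    return hits / trials


def p_at_least_one(p, m):
    """p(m) = 1 - (1 - p)^m: at least one success in m Bernoulli trials."""
    return 1.0 - (1.0 - p) ** m


if __name__ == "__main__":
    p = estimate_p(100)                 # one experiment, N = 100
    p_star = estimate_p(300, seed=1)    # one experiment, 3N = 300
    print("p      ~", round(p, 3))
    print("p(3)   ~", round(p_at_least_one(p, 3), 3))
    print("p*     ~", round(p_star, 3))
```

Once p has been estimated, p_at_least_one(p, m) can be evaluated for any m without further simulation, which is the point made in 2.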


