Two possible strategies:
- We could go to lots of lakes, take lots of samples, and so compute lots of values of p_samp_yes. We could then drain those lakes to learn the actual value of p_pop_yes, compare that value with all those values of p_samp_yes, and see if there is a connection.
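
Draining real lakes is impractical, but the same comparison can be imitated on a computer. The following is only a rough sketch, not part of the original argument. It assumes each simulated lake holds a population of individuals marked 1 (yes) or 0 (no); the names `simulate_lake`, `compare_lakes`, and `mean_abs_error`, along with the lake size, sample size, and range of true proportions, are all hypothetical choices made for illustration.

```python
import random

def simulate_lake(pop_size, p_target, sample_size, rng):
    """Build one simulated lake and return (p_pop_yes, p_samp_yes)."""
    # The population is fully known here, which stands in for "draining the lake".
    n_yes = round(pop_size * p_target)
    population = [1] * n_yes + [0] * (pop_size - n_yes)
    # Draw a sample without replacement, as a field sample would be.
    sample = rng.sample(population, sample_size)
    p_pop_yes = n_yes / pop_size
    p_samp_yes = sum(sample) / sample_size
    return p_pop_yes, p_samp_yes

def compare_lakes(n_lakes=200, pop_size=5000, sample_size=100, seed=0):
    """Visit many simulated lakes; collect (p_pop_yes, p_samp_yes) pairs."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_lakes):
        # Assumed: true proportions vary from lake to lake.
        p_target = rng.uniform(0.05, 0.95)
        results.append(simulate_lake(pop_size, p_target, sample_size, rng))
    return results

def mean_abs_error(results):
    """Average distance between the sample value and the actual value."""
    return sum(abs(p_samp - p_pop) for p_pop, p_samp in results) / len(results)

results = compare_lakes()
print(f"lakes visited: {len(results)}")
print(f"mean |p_samp_yes - p_pop_yes|: {mean_abs_error(results):.3f}")
```

If the sample values track the actual values closely, the mean absolute difference will be small, which is the kind of "connection" this strategy is looking for.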