How much more data do I need? Estimating requirements for downstream tasks