What does “improving exploration capacity” look like in a multi-armed bandit?
You could potentially model this as (a) an increase in the number of bandit pulls you can do in parallel (simple models assume only one pull at a time), (b) a decrease in the time between a bandit pull and the information being received (simple bandit models assume this is instantaneous), or (c) an increase in the accuracy of the information received from each bandit pull (simple models assume the information received is perfectly accurate).
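To make the three knobs concrete, here is a minimal sketch of a bandit simulation that exposes each one as a parameter. Everything in it is an illustrative assumption rather than a standard formulation: the function name `run_bandit`, the parameters `parallel_pulls`, `delay`, and `noise_sd`, the choice of epsilon-greedy as the policy, and Gaussian noise as the stand-in for inaccurate information.

```python
import random
from collections import deque


def run_bandit(true_means, rounds, parallel_pulls=1, delay=0,
               noise_sd=0.0, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit with three hypothetical capacity knobs.

    parallel_pulls: (a) pulls made per round, none seeing the others' results
    delay:          (b) extra rounds before a pull's feedback arrives
    noise_sd:       (c) std. dev. of Gaussian noise on each observation
    Returns (average expected reward per pull, per-arm estimates).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    pending = deque()  # (arrival_round, arm, observation), in arrival order
    total_reward = 0.0

    def estimate(arm):
        # Arms with no feedback yet look maximally attractive, so each gets tried.
        return sums[arm] / counts[arm] if counts[arm] else float("inf")

    def deliver(up_to_round):
        while pending and pending[0][0] <= up_to_round:
            _, arm, obs = pending.popleft()
            counts[arm] += 1
            sums[arm] += obs

    for t in range(rounds):
        deliver(t)  # only feedback whose delay has elapsed is visible
        for _ in range(parallel_pulls):
            if rng.random() < epsilon:
                arm = rng.randrange(n_arms)         # explore
            else:
                arm = max(range(n_arms), key=estimate)  # exploit
            total_reward += true_means[arm]  # score by the arm's true mean
            obs = true_means[arm] + rng.gauss(0.0, noise_sd)
            pending.append((t + 1 + delay, arm, obs))

    deliver(float("inf"))  # flush feedback still in flight
    estimates = [sums[a] / counts[a] if counts[a] else None
                 for a in range(n_arms)]
    return total_reward / (rounds * parallel_pulls), estimates


if __name__ == "__main__":
    arms = [0.2, 0.5, 0.8]
    for label, kwargs in [
        ("baseline", dict()),
        ("more parallel pulls", dict(parallel_pulls=8)),
        ("slower feedback", dict(delay=20)),
        ("noisier feedback", dict(noise_sd=1.0)),
    ]:
        avg, _ = run_bandit(arms, rounds=500, **kwargs)
        print(f"{label}: average reward per pull = {avg:.3f}")
```

With `parallel_pulls=1`, `delay=0`, and `noise_sd=0.0`, this reduces to the simple model described above: one pull at a time, feedback available before the next decision, and perfectly accurate observations. Changing one parameter at a time gives a rough way to compare how much each kind of capacity matters for a given set of arms.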