Let's start by thinking about what we need from the new implementation.
I think we definitely want to use multiprocessing instead of multithreading. One potential issue is that running 100 processes on a single machine may add bottlenecks different from the multithreading ones. We want either one beefy machine with many cores, or to spread the whole load across multiple machines. One good thing is that those 100 processes will all be sleeping most of the time, waiting for responses from Qserv, so we don't need 100 active cores; some small fraction of that should probably be enough. We could probably use master02 for this, as it has 28 true cores. For a multi-machine test we could use the verification cluster and scale it as necessary. Managing a multi-node test is more complicated, but it should not be too terrible since we don't need very many clients; five or six should probably be enough. We need to configure the clients so that each does only a part of the load, e.g. five clients each doing 1/5 of the LV queries and one client doing all the remaining query types (or something similar).
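A minimal sketch of the per-client process pool, to show why active core count can stay small: each worker spends most of its time blocked on the server, so CPU load is low even with many workers. All names here (`run_query`, `worker`, the query list) are hypothetical placeholders, not part of any existing harness.

```python
# Hypothetical sketch: each client machine runs N worker processes,
# each looping over its assigned slice of the query load.
import multiprocessing as mp
import random
import time

def run_query(sql):
    # Placeholder for a real Qserv call; simulate a server round-trip.
    # Workers block here, so they consume almost no CPU while "running".
    time.sleep(random.uniform(0.01, 0.05))
    return 0

def worker(worker_id, queries, n_iterations):
    # One worker = one simulated concurrent query stream.
    for _ in range(n_iterations):
        run_query(random.choice(queries))

if __name__ == "__main__":
    queries = ["SELECT ..."]  # this client's share of the load
    procs = [mp.Process(target=worker, args=(i, queries, 3)) for i in range(8)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```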
We need to make sure that the objectIds in queries correspond to existing objects, and we should use a very large set of objectIds to cover a large fraction of the chunks, not just a few of them. It is probably worth dumping a reasonably large set of objectIds from the database (the secondary index) and using that list with randomization.
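The dump-and-randomize idea could look like this; the one-id-per-line file format is an assumption.

```python
# Hypothetical sketch: load objectIds dumped from the secondary index
# (one id per line) and pick random subsets for query generation.
import random

def load_object_ids(path):
    # Read the full dump once at client startup.
    with open(path) as f:
        return [int(line) for line in f if line.strip()]

def random_object_ids(object_ids, n):
    # Sample without replacement so a single query does not repeat ids;
    # a large dump keeps the chunk coverage wide.
    return random.sample(object_ids, n)
```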
The same applies to region-based queries: we want randomness there too, but the regions should not fall into empty areas of the sky.
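One way to guarantee non-empty regions is to center them on known object positions from the same database dump. This is a sketch under that assumption; the exact spatial-constraint SQL (shown here in the sciSQL `scisql_s2PtInCircle` style commonly used with Qserv) and the table/column names are illustrative.

```python
# Hypothetical sketch: random circular regions centered on known (ra, dec)
# positions, so a region can never land on empty sky.
import random

def random_region(positions, max_radius_deg=1.0):
    # positions: list of (ra, dec) pairs dumped from the database.
    ra, dec = random.choice(positions)
    radius = random.uniform(0.01, max_radius_deg)
    return ra, dec, radius

def region_query(ra, dec, radius):
    # Illustrative spatial-constraint SQL; the real query template
    # would come from the harness config.
    return (f"SELECT COUNT(*) FROM Object WHERE "
            f"scisql_s2PtInCircle(ra, decl, {ra}, {dec}, {radius}) = 1")
```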
All useful info should be dumped to a file with exact timestamps so we can load it into a time series database (e.g. InfluxDB) and correlate it with anything that happens on the Qserv side. It would be nice to have that integrated into the NCSA Grafana monitor, but we do not have the ability to edit Grafana panels there. Should we ask whether we can feed our data into whatever backend they use (InfluxDB or anything else) and get a Grafana playground so we could mess around a bit?
The interesting data are: query execution time, how many queries are running, types of queries, number of rows and data sizes returned, and maybe something else. Each client will dump its own set of metrics; we'll need to merge them, which Grafana should be able to do easily. It would still be useful to identify each separate client to see if there are any correlations. And of course the query class should be a part of the metrics too.
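If InfluxDB is the target, one record per completed query in InfluxDB line protocol would cover the points above; the measurement, tag, and field names here are assumptions, not an agreed schema. Tagging by client id and query class makes the per-client and per-class correlations easy to pull out in Grafana.

```python
# Hypothetical sketch: one metrics record per query in InfluxDB line
# protocol, with the client id and query class as tags.
import time

def metric_line(client_id, query_class, exec_time_sec, rows, bytes_returned,
                timestamp_ns=None):
    if timestamp_ns is None:
        timestamp_ns = time.time_ns()  # exact timestamp for correlation
    tags = f"client={client_id},class={query_class}"
    fields = (f"exec_time={exec_time_sec},rows={rows}i,"
              f"bytes={bytes_returned}i")
    return f"query,{tags} {fields} {timestamp_ns}"
```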
The configuration for the new harness will be somewhat more complicated, so I think we should move most of it from Python code to a separate config file. YAML is probably the easiest format for what we need. Things that will go there: definitions of query classes, target rates, and per-class queries or query templates.
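A sketch of what such a config file might look like; the class names, keys, and rates are illustrative, not an agreed schema (only "LV" as a query class appears in these notes).

```yaml
# Hypothetical config structure: query classes, target rates,
# and per-class query templates.
queryClasses:
  LV:
    concurrency: 100      # target number of simultaneous queries
    targetRate: 50        # queries per second across all clients
    queries:
      - "SELECT * FROM Object WHERE objectId = {objectId}"
  FTSObj:
    concurrency: 5
    queries:
      - "SELECT COUNT(*) FROM Object WHERE flux > {flux}"
```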
Testing the harness
To debug this new harness it would be better to make it testable, e.g. able to run against a sort of mock database that internally generates responses.
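The mock could mimic the small slice of the DB-API the harness actually uses, returning canned rows after an optional fake delay; all class and method shapes here are hypothetical.

```python
# Hypothetical sketch: a mock DB-API-style connection/cursor pair that
# generates canned responses, so the harness can run without Qserv.
import time

class MockCursor:
    def __init__(self, latency=0.0):
        self.latency = latency
        self.rows = []

    def execute(self, query, params=None):
        time.sleep(self.latency)    # simulate server-side execution time
        self.rows = [(1, "dummy")]  # canned response, same shape every time

    def fetchall(self):
        return self.rows

class MockConnection:
    def cursor(self):
        return MockCursor()

    def close(self):
        pass
```

The harness would then take a connection factory as a parameter, so tests pass `MockConnection` while production passes the real Qserv connector.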