We apply some optimizations to the group partitioner to improve its speed.
In theory, since some partitions are already pre-grouped, it should be
faster than the capability-based partitioner. For example, a dynamically
quantized group could contain
```
[ choose_q_params, quantize, dequantize, op, dequantize_weight, weight, bias, get_item_scale, get_item_zp]
```
9 nodes. The capability-based partitioner would have to run DFS on all 9
nodes in order to group them together. Given the hints and purpose of the
group-based partitioner, we skip these checks and group all 9 nodes
directly, saving the time those checks would cost. Some stats from
partitioning the mobile BERT model:
```
elapsed time old partitioner: 65.3421
old_partitioner num partitions: 170
elapsed time new partitioner: 5.1964
new_partitioner num partitions: 170
```
We see a ~13x speedup in partitioning with the group-based partitioner,
while still producing the same number of partitions.
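To illustrate the difference, here is a minimal sketch (not the actual partitioner code; the function names and `is_supported` predicate are hypothetical) contrasting a capability-based pass, which re-verifies every node in a candidate group, with a group-based pass that trusts the pre-formed groups and emits them directly:

```python
def capability_based_partition(groups, is_supported):
    # Capability-style: every node in every candidate group is
    # re-checked before the group is accepted as a partition.
    partitions = []
    for group in groups:
        if all(is_supported(n) for n in group):  # O(len(group)) checks per group
            partitions.append(list(group))
    return partitions


def group_based_partition(groups):
    # Group-style: the groups were already formed upstream (e.g. by the
    # quantizer), so we skip the per-node checks and emit each group
    # as a partition directly.
    return [list(group) for group in groups]


# The 9-node dynamically quantized group from the example above.
dq_group = [
    "choose_q_params", "quantize", "dequantize", "op",
    "dequantize_weight", "weight", "bias", "get_item_scale", "get_item_zp",
]

assert group_based_partition([dq_group]) == \
    capability_based_partition([dq_group], is_supported=lambda n: True)
```

Both paths yield the same partitions here; the group-based version simply avoids the per-node membership checks, which is where the speedup comes from.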