@@ -6442,15 +6442,25 @@ static bool bfq_bfqq_may_idle(struct bfq_queue *bfqq)
* The value of the variable is computed considering that
* idling is usually beneficial for the throughput if:
* (a) the device is not NCQ-capable, or
* (b) regardless of the presence of NCQ, the request pattern
* for bfqq is I/O-bound (possible throughput losses
* caused by granting idling to seeky queues are mitigated
* by the fact that, in all scenarios where boosting
* throughput is the best thing to do, i.e., in all
* symmetric scenarios, only a minimal idle time is
* allowed to seeky queues).
* (b) regardless of the presence of NCQ, the device is rotational
* and the request pattern for bfqq is I/O-bound (possible
* throughput losses caused by granting idling to seeky queues
* are mitigated by the fact that, in all scenarios where
* boosting throughput is the best thing to do, i.e., in all
* symmetric scenarios, only a minimal idle time is allowed to
* seeky queues).
*
* Secondly, and in contrast to the above item (b), idling an
* NCQ-capable flash-based device would not boost the
* throughput even with intense I/O; rather it would lower
* the throughput in proportion to how fast the device
* is. Accordingly, the next variable is true if any of the
* above conditions (a) and (b) is true, and, in particular,
* happens to be false if bfqd is an NCQ-capable flash-based
* device.
*/
idling_boosts_thr = !bfqd->hw_tag || bfq_bfqq_IO_bound(bfqq);
idling_boosts_thr = !bfqd->hw_tag ||
	(!blk_queue_nonrot(bfqd->queue) && bfq_bfqq_IO_bound(bfqq));
/*
* The value of the next variable,
@@ -6491,14 +6501,16 @@ static bool bfq_bfqq_may_idle(struct bfq_queue *bfqq)
bfqd->wr_busy_queues == 0;
/*
* There is then a case where idling must be performed not for
* throughput concerns, but to preserve service guarantees. To
* introduce it, we can note that allowing the drive to
* enqueue more than one request at a time, and hence
* There is then a case where idling must be performed not
* for throughput concerns, but to preserve service
* guarantees.
*
* To introduce this case, we can note that allowing the drive
* to enqueue more than one request at a time, and hence
* delegating de facto final scheduling decisions to the
* drive's internal scheduler, causes loss of control on the
* drive's internal scheduler, entails loss of control on the
* actual request service order. In particular, the critical
* situation is when requests from different processes happens
* situation is when requests from different processes happen
* to be present, at the same time, in the internal queue(s)
* of the drive. In such a situation, the drive, by deciding
* the service order of the internally-queued requests, does
@@ -6509,51 +6521,97 @@ static bool bfq_bfqq_may_idle(struct bfq_queue *bfqq)
* the service distribution enforced by the drive's internal
* scheduler is likely to coincide with the desired
* device-throughput distribution only in a completely
* symmetric scenario where: (i) each of these processes must
* get the same throughput as the others; (ii) all these
* processes have the same I/O pattern (either sequential or
* random). In fact, in such a scenario, the drive will tend
* to treat the requests of each of these processes in about
* the same way as the requests of the others, and thus to
* provide each of these processes with about the same
* throughput (which is exactly the desired throughput
* distribution). In contrast, in any asymmetric scenario,
* device idling is certainly needed to guarantee that bfqq
* receives its assigned fraction of the device throughput
* (see [1] for details).
* symmetric scenario where:
* (i) each of these processes must get the same throughput as
* the others;
* (ii) all these processes have the same I/O pattern
* (either sequential or random).
* In fact, in such a scenario, the drive will tend to treat
* the requests of each of these processes in about the same
* way as the requests of the others, and thus to provide
* each of these processes with about the same throughput
* (which is exactly the desired throughput distribution). In
* contrast, in any asymmetric scenario, device idling is
* certainly needed to guarantee that bfqq receives its
* assigned fraction of the device throughput (see [1] for
* details).
*
* We address this issue by actually checking only the
* symmetry sub-condition (i), i.e., provided that
* sub-condition (i) holds, idling is not performed,
* regardless of whether sub-condition (ii) holds. In other
* words, only if sub-condition (i) does not hold is idling
* allowed, and the device tends to be prevented from queueing
* many requests, possibly of several processes. The reason
* for not checking sub-condition (ii) as well is that we
* exploit preemption to preserve guarantees in case of
* symmetric scenarios, even if (ii) does not hold, as
* explained in the next two paragraphs.
*
* Even if a queue, say Q, is expired when it remains idle, Q
* can still preempt the new in-service queue if the next
* request of Q arrives soon (see the comments on
* bfq_bfqq_update_budg_for_activation). If all queues and
* groups have the same weight, this form of preemption,
* combined with the hole-recovery heuristic described in the
* comments on function bfq_bfqq_update_budg_for_activation,
* is enough to preserve a correct bandwidth distribution in
* the mid term, even without idling. In fact, even if not
* idling allows the internal queues of the device to contain
* many requests, and thus to reorder requests, we can rather
* safely assume that the internal scheduler still preserves a
* minimum of mid-term fairness. The motivation for using
* preemption instead of idling is that, by not idling,
* service guarantees are preserved without sacrificing
* throughput in the least. In other words, both a high
* throughput and its desired distribution are obtained.
*
* More precisely, this preemption-based, idleless approach
* provides fairness in terms of IOPS, and not sectors per
* second. This can be seen with a simple example. Suppose
* that there are two queues with the same weight, but that
* the first queue receives requests of 8 sectors, while the
* second queue receives requests of 1024 sectors. In
* addition, suppose that each of the two queues contains at
* most one request at a time, which implies that each queue
* always remains idle after it is served. Finally, after
* remaining idle, each queue very quickly receives a new
* request. It follows that the two queues are served
* alternately, preempting each other if needed. This
* implies that, although both queues have the same weight,
* the queue with large requests receives a service that is
* 1024/8 = 128 times as high as the service received by the
* other queue.
*
* As for sub-condition (i), actually we check only whether
* bfqq is being weight-raised. In fact, if bfqq is not being
* weight-raised, we have that:
* - if the process associated with bfqq is not I/O-bound, then
* it is not either latency- or throughput-critical; therefore
* idling is not needed for bfqq;
* - if the process associated with bfqq is I/O-bound, then
* idling is already granted with bfqq (see the comments on
* idling_boosts_thr).
* On the other hand, device idling is performed, and thus
* pure sector-domain guarantees are provided, for the
* following queues, which are likely to need stronger
* throughput guarantees: weight-raised queues, and queues
* with a higher weight than other queues. When such queues
* are active, sub-condition (i) is false, which triggers
* device idling.
*
* We do not check sub-condition (ii) at all, i.e., the next
* variable is true if and only if bfqq is being
* weight-raised. We do not need to control sub-condition (ii)
* for the following reason:
* - if bfqq is being weight-raised, then idling is already
* guaranteed to bfqq by sub-condition (i);
* - if bfqq is not being weight-raised, then idling is
* already guaranteed to bfqq (only) if it matters, i.e., if
* bfqq is associated to a currently I/O-bound process (see
* the above comment on sub-condition (i)).
* According to the above considerations, the next variable is
* true (only) if sub-condition (i) does not hold. To compute
* the value of this variable, we not only use the return value
* of the function bfq_symmetric_scenario(), but also check
* whether bfqq is being weight-raised, because
* bfq_symmetric_scenario() does not take weight-raised queues
* into account (see comments on bfq_weights_tree_add()).
*
* As a side note, it is worth considering that the above
* device-idling countermeasures may however fail in the
* following unlucky scenario: if idling is (correctly)
* disabled in a time period during which the symmetry
* sub-condition holds, and hence the device is allowed to
* disabled in a time period during which all symmetry
* sub-conditions hold, and hence the device is allowed to
* enqueue many requests, but at some later point in time some
* sub-condition stops holding, then it may become impossible
* to let requests be served in the desired order until all
* the requests already queued in the device have been served.
asymmetric_scenario = bfqq->wr_coeff > 1;
asymmetric_scenario = bfqq->wr_coeff > 1 ||
	!bfq_symmetric_scenario(bfqd);
/*
* We have now all the components we need to compute the return