Swap readahead reads in a few pages regardless of whether the underlying device is busy. This may incur a long wait if the device is congested, and it may also exacerbate the congestion.

Use inode_read_congested() to check whether the underlying device is busy, as file page readahead does. Get the inode from swap_info_struct. Although we could add the inode information to swap_address_space (address_space->host), that may lead to unexpected side effects, e.g. it may break mapping_cap_account_dirty(). Using the inode from swap_info_struct seems simple and good enough.

Only do the check in swap_cluster_readahead(), since swap_vma_readahead() is only used for non-rotational devices, which are much less likely to be congested than traditional HDDs.
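
The idea can be sketched in userspace as follows. This is an illustrative mock only, not the kernel change itself: every mock_* name is hypothetical, the boolean field stands in for whatever inode_read_congested() reports, and the inode pointer stands in for the inode reachable from swap_info_struct. When the backing device reports read congestion, readahead is skipped and only the faulting page is read.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; the flag models the
 * answer the kernel's inode_read_congested() would give. */
struct mock_inode {
	bool read_congested;
};

/* Hypothetical stand-in for swap_info_struct; only the
 * inode of the swap backing store is modelled here. */
struct mock_swap_info {
	struct mock_inode *inode;
};

/* Mock of the congestion query (assumption: a plain flag). */
bool mock_inode_read_congested(const struct mock_inode *inode)
{
	return inode->read_congested;
}

/*
 * Decide how many pages to read for one swap-in fault.
 * 'window' is the readahead window the caller would like to use.
 * Returns 1 (the faulting page only) when readahead is disabled
 * or the backing device is congested, otherwise the full window.
 */
unsigned int mock_pages_to_read(const struct mock_swap_info *si,
				unsigned int window)
{
	if (window <= 1)
		return 1;

	/* Skip readahead rather than queue more I/O on a busy device. */
	if (si->inode && mock_inode_read_congested(si->inode))
		return 1;

	return window;
}
```

With a window of 8 pages, the mock returns 1 for a congested device and 8 for an idle one, so the faulting page is always read and only the speculative reads are dropped.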

Although swap slots may be consecutive on a swap partition, they may still be fragmented on a swap file. This check helps reduce excessive stalls in such cases.

Testing with page_fault1 from will-it-scale (tracing may sometimes show only runtest.py, the wrapper script for page_fault1), which basically launches NR_CPU threads that each generate 128MB of anonymous pages, on my virtual machine with a congested HDD shows that long-tail latency is reduced significantly.