
[1/1] dm-mpath: do not fail paths on ALUA state transitioning state

Message ID CAHZQxyKbksT=FrLvtPFyBUzGChsTHaRZ-+R0Uc1oDcedVHLTUg@mail.gmail.com
State New

Commit Message

Brian Bunker April 13, 2022, 5:50 p.m. UTC
I would like to revisit this patch since it continues to cause fallout
for us. Our best practice has always been to use the setting
no_path_retry of 0 in multipath-tools. This means that our customers
who have a previously working configuration file upgrade into this
problem.

My understanding of why this patch was not accepted the first time
is that some array vendors stay in the ALUA transitioning state
for a very long time. It doesn't seem to me that not failing the paths
leads to a problem, since the path checker and priority will protect
against continually using the transitioning paths, but I am not aware of
the array vendor that led to this patch in the first place. If this
patch is still not acceptable, can it be made acceptable with a flag
allowing this behavior?

Without this patch we have to reach out to all of our customers who
are at risk and let them know that a change of no_path_retry to some
non zero value is required before they upgrade. There is no good way
to reach them all before this issue is hit and they take an unexpected
outage.

The solution of no_path_retry is not a perfect fit for us either.
There are situations where getting to all paths down and having the
error bubble up as soon as possible is expected. There is no
distinction between getting there through the transitioning state and
getting there through some other state like unavailable or standby.
The fail path logic is the same.

If the answer is that multipath-tools should handle this, a
distinction in failing the path should be made so that
multipath-tools can queue on transitioning but fail on other states,
retaining the previous behavior without either regression
mentioned above.

Signed-off-by: Brian Bunker <brian@purestorage.com>
Acked-by: Krishna Kant <krishna.kant@purestorage.com>
Acked-by: Seamus Connor <sconnor@purestorage.com>
--

Patch

diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index bced42f082b0..28948cc481f9 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1652,12 +1652,12 @@ static int multipath_end_io(struct dm_target *ti, struct request *clone,
        if (error && blk_path_error(error)) {
                struct multipath *m = ti->private;

-               if (error == BLK_STS_RESOURCE)
+               if (error == BLK_STS_RESOURCE || error == BLK_STS_AGAIN)
                        r = DM_ENDIO_DELAY_REQUEUE;
                else
                        r = DM_ENDIO_REQUEUE;

-               if (pgpath)
+               if (pgpath && (error != BLK_STS_AGAIN))
                        fail_path(pgpath);

                if (!atomic_read(&m->nr_valid_paths) &&
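
For readers following the hunk, the decision it encodes can be modeled in a small userspace sketch. This is not kernel code: the enum values, struct, and function names below are hypothetical stand-ins for BLK_STS_*, DM_ENDIO_*, and multipath_end_io(), and the sketch assumes every non-OK status counts as a path error, whereas the kernel's blk_path_error() is more selective.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's BLK_STS_* codes (not the real values). */
enum sketch_status {
	SKETCH_STS_OK,
	SKETCH_STS_RESOURCE,
	SKETCH_STS_AGAIN,
	SKETCH_STS_TRANSPORT,
};

/* Hypothetical stand-ins for the DM_ENDIO_* dispositions. */
enum sketch_endio {
	SKETCH_ENDIO_DONE,
	SKETCH_ENDIO_REQUEUE,
	SKETCH_ENDIO_DELAY_REQUEUE,
};

struct sketch_result {
	enum sketch_endio r;	/* how the request is requeued */
	bool fail_path;		/* whether the path would be failed */
};

/*
 * Model of the patched branch. Assumption: any non-OK status is treated
 * as a path error here, which is simpler than blk_path_error().
 */
static struct sketch_result sketch_end_io(enum sketch_status error,
					  bool have_pgpath)
{
	struct sketch_result res = { SKETCH_ENDIO_DONE, false };

	if (error == SKETCH_STS_OK)
		return res;

	if (error == SKETCH_STS_RESOURCE || error == SKETCH_STS_AGAIN)
		res.r = SKETCH_ENDIO_DELAY_REQUEUE;
	else
		res.r = SKETCH_ENDIO_REQUEUE;

	/* With the patch, an "again" status no longer fails the path. */
	res.fail_path = have_pgpath && error != SKETCH_STS_AGAIN;
	return res;
}

/* Checks the three cases the hunk distinguishes. */
static void sketch_self_check(void)
{
	struct sketch_result res;

	res = sketch_end_io(SKETCH_STS_AGAIN, true);
	assert(res.r == SKETCH_ENDIO_DELAY_REQUEUE && !res.fail_path);

	res = sketch_end_io(SKETCH_STS_RESOURCE, true);
	assert(res.r == SKETCH_ENDIO_DELAY_REQUEUE && res.fail_path);

	res = sketch_end_io(SKETCH_STS_TRANSPORT, true);
	assert(res.r == SKETCH_ENDIO_REQUEUE && res.fail_path);
}
```

Calling sketch_self_check() from a main() exercises the three cases: the "again" status is requeued with a delay and leaves the path alone, the resource status is requeued with a delay but still fails the path, and any other error is requeued immediately and fails the path.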