diff mbox series

[2/2] scsi: tape: add unexpected rewind handling

Message ID 20230822181413.1210647-2-jmeneghi@redhat.com
State New
Headers show
Series [1/2] scsi: tape: add third party poweron reset handling | expand

Commit Message

John Meneghini Aug. 22, 2023, 6:14 p.m. UTC
Handle the unexpected condition where the tape drive reports
that tape is rewinding.

Patch one in this series was designed to handle an unexpected third
party reset condition on the tape device by setting pos_unknown
following a POR Unit Attention. Because we do not have access to
an Amazon VTL application Laurance and I tried to repoduce the
aforementioned POR data corruption problem by using a physical tape
drive with a multi-initiator iSCSI gateway.

We were easily able to issue the third party reset from initiator 1
while initiator 2 had a backup in progress. We saw the tape drive
automatically rewind following the reset, and the st driver on
initiator 2 attempt to write a filemark with MTEOM. However, we
discovered our tape drive (an HP Ultrium 5-SCSI Z64D) never sends a
Unit Attention of any kind. Instead, following the third party
reset, the tape drive continually returned "No Sense, Rewind
operation in progress".

Here are the test results w/out this patch.

<<< Rest by other initiator
st 33:0:0:0: [st0] Error: 2, cmd: a 0 0 28 0 0
st 33:0:0:0: [st0] Sense Key : No Sense [current]
st 33:0:0:0: [st0] Add. Sense: Rewind operation in progress
st 33:0:0:0: [st0] Error on write:
st 33:0:0:0: [st0] Number of r/w requests 35913, dio used in 35913...
st 33:0:0:0: [st0] Async write waits 0, finished 0.
st 33:0:0:0: [st0] Error: 2, cmd: 10 0 0 0 1 0     <<< write filemark
st 33:0:0:0: [st0] Sense Key : No Sense [current]
st 33:0:0:0: [st0] Add. Sense: Rewind operation in progress
st 33:0:0:0: [st0] Error on write filemark.
st 33:0:0:0: [st0] Buffer flushed, 1 EOF(s) written  <<< flush buffer
st 33:0:0:0: [st0] Rewinding tape.
st 33:0:0:0: [st0] Error: 2, cmd: 1 0 0 0 0 0
st 33:0:0:0: [st0] Sense Key : No Sense [current]
st 33:0:0:0: [st0] Add. Sense: Rewind operation in progress

With the patch:

<<< Rest by other initiator
st 32:0:0:0: [st0] Error: 8000002, cmd: a 0 0 28 0 0
st 32:0:0:0: [st0] Sense Key : No Sense [current]
st 32:0:0:0: [st0] Add. Sense: Rewind operation in progress
st 32:0:0:0: [st0] Error on write:
<<< no write filemark or flush buffer >>>
st 32:0:0:0: [st0] Number of r/w requests 1624, dio used in 1624...
st 32:0:0:0: [st0] Rewinding tape.
st 32:0:0:0: [st0] Error: 8000002, cmd: 1 0 0 0 0 0
st 32:0:0:0: [st0] Sense Key : No Sense [current]
st 32:0:0:0: [st0] Add. Sense: Rewind operation in progress

I'm providing this patch because I think it's valuable for testing
purposes and it should be safe. Any time the device unexpectedly
reports "Rewind is in progress", it should be safe to set
pos_unknown in the driver.

Tested-by: Laurence Oberman <loberman@redhat.com>
Signed-off-by: John Meneghini <jmeneghi@redhat.com>
---
 drivers/scsi/st.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Kai Mäkisara (Kolumbus) Aug. 24, 2023, 9:13 a.m. UTC | #1
> On 22. Aug 2023, at 21.14, John Meneghini <jmeneghi@redhat.com> wrote:
> 
> Handle the unexpected condition where the tape drive reports
> that tape is rewinding.
> 
> ...
> I'm providing this patch because I think it's valuable for testing
> purposes and it should be safe. Any time the device unexpectedly
> reports "Rewind is in progress", it should be safe to set
> pos_unknown in the driver.
> 
I am a bit hesitant about this, because it does not recognize if the rewind in
progress was initiated by the user or not. In immediate mode (ST_NOWAIT
option), a user rewind may be still in progress when a (impatient) user
tries to do something else.

One possibility would be to make this conditional on !STp->immediate.

Another, perhaps better, method would be to use the STps->rw state
variable. A new state ST_REWINDING could be introduced (or state
should be set to ST_IDLE when rewinding).

(Looking at the state, I think it should be set to something else than
ST_WRITING more frequently. This could, in some cases prevent
improper automatic writing of filemarks. See, for instance, the problem
with failing rewinds in the report with PATCH 1/2.)

Thanks, Kai


> Tested-by: Laurence Oberman <loberman@redhat.com>
> Signed-off-by: John Meneghini <jmeneghi@redhat.com>
> ---
> drivers/scsi/st.c | 3 +++
> 1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
> index 338aa8c42968..b641490ed9d1 100644
> --- a/drivers/scsi/st.c
> +++ b/drivers/scsi/st.c
> @@ -416,6 +416,9 @@ static int st_chk_result(struct scsi_tape *STp, struct st_request * SRpnt)
> STp->cleaning_req = 1; /* ASC and ASCQ => cleaning requested */
> if (cmdstatp->have_sense && scode == UNIT_ATTENTION && cmdstatp->sense_hdr.asc == 0x29)
> STp->pos_unknown = 1; /* ASC => power on / reset */
> + if (cmdstatp->have_sense && cmdstatp->sense_hdr.asc == 0
> + && cmdstatp->sense_hdr.ascq == 0x1a)
> + STp->pos_unknown = 1; /* ASCQ => rewind in progress */
> 
> STp->pos_unknown |= STp->device->was_reset;
> 
> -- 
> 2.39.3
>
John Meneghini Aug. 29, 2023, 10:21 p.m. UTC | #2
On 8/24/23 05:13, "Kai Mäkisara (Kolumbus)" wrote:
> 
> 
>> On 22. Aug 2023, at 21.14, John Meneghini <jmeneghi@redhat.com> wrote:
>>
>> Handle the unexpected condition where the tape drive reports
>> that tape is rewinding.
>>
>> ...
>> I'm providing this patch because I think it's valuable for testing
>> purposes and it should be safe. Any time the device unexpectedly
>> reports "Rewind is in progress", it should be safe to set
>> pos_unknown in the driver.
>>
> I am a bit hesitant about this, because it does not recognize if the rewind in
> progress was initiated by the user or not. In immediate mode (ST_NOWAIT
> option), a user rewind may be still in progress when a (impatient) user
> tries to do something else.

That's fine.  We can drop this patch if you are uncomfortable with it.  The real need it patch 1, which is and will affect 
customers using the AWS tape gateway.

> One possibility would be to make this conditional on !STp->immediate.
> 
> Another, perhaps better, method would be to use the STps->rw state
> variable. A new state ST_REWINDING could be introduced (or state
> should be set to ST_IDLE when rewinding).
> 
> (Looking at the state, I think it should be set to something else than
> ST_WRITING more frequently. This could, in some cases prevent
> improper automatic writing of filemarks. See, for instance, the problem
> with failing rewinds in the report with PATCH 1/2.)

Agreed. This patch was only a improvised way to run the code I needed to test in patch 1/2.

Let's leave this patch out.

Thanks,

/John
diff mbox series

Patch

diff --git a/drivers/scsi/st.c b/drivers/scsi/st.c
index 338aa8c42968..b641490ed9d1 100644
--- a/drivers/scsi/st.c
+++ b/drivers/scsi/st.c
@@ -416,6 +416,9 @@  static int st_chk_result(struct scsi_tape *STp, struct st_request * SRpnt)
 		STp->cleaning_req = 1; /* ASC and ASCQ => cleaning requested */
 	if (cmdstatp->have_sense && scode == UNIT_ATTENTION && cmdstatp->sense_hdr.asc == 0x29)
 		STp->pos_unknown = 1; /* ASC => power on / reset */
+	if (cmdstatp->have_sense && cmdstatp->sense_hdr.asc == 0
+			&& cmdstatp->sense_hdr.ascq == 0x1a)
+		STp->pos_unknown = 1; /* ASCQ => rewind in progress */
 
 	STp->pos_unknown |= STp->device->was_reset;