Message ID | 20230201180637.2102556-3-bvanassche@acm.org |
---|---|
State | New |
Headers | show |
Series | Use SYNCHRONIZE CACHE instead of FUA for UFS devices | expand |
Hi Bart, > From: Asutosh Das <asutoshd@codeaurora.org> > > UFS devices perform better when using SYNCHRONIZE CACHE command > instead of the FUA flag. Hence this patch. If you have, could you share the result when using SYNCHRONIZE CACHE command? Thanks, Daejun > > Signed-off-by: Asutosh Das <asutoshd@codeaurora.org> > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > [ bvanassche: modified a source code comment ] > --- > drivers/ufs/core/ufshcd.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c > index bf3cb12ef02f..461aa51cfccc 100644 > --- a/drivers/ufs/core/ufshcd.c > +++ b/drivers/ufs/core/ufshcd.c > @@ -5056,6 +5056,9 @@ static int ufshcd_slave_alloc(struct scsi_device *sdev) > /* WRITE_SAME command is not supported */ > sdev->no_write_same = 1; > > + /* Use SYNCHRONIZE CACHE instead of FUA to improve performance */ > + sdev->sdev_bflags = BLIST_BROKEN_FUA; > + > ufshcd_lu_init(hba, sdev); > > ufshcd_setup_links(hba, sdev); >
Hi Bart, I love your patch! Yet something to improve: [auto build test ERROR on mkp-scsi/for-next] [also build test ERROR on jejb-scsi/for-next linus/master v6.2-rc6 next-20230202] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Bart-Van-Assche/scsi-core-Introduce-the-BLIST_BROKEN_FUA-flag/20230202-021019 base: https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next patch link: https://lore.kernel.org/r/20230201180637.2102556-3-bvanassche%40acm.org patch subject: [PATCH 2/2] scsi: ufs: Use SYNCHRONIZE CACHE instead of FUA config: arm-randconfig-r046-20230130 (https://download.01.org/0day-ci/archive/20230202/202302021202.mM5iFxSO-lkp@intel.com/config) compiler: arm-linux-gnueabi-gcc (GCC) 12.1.0 reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/4edde0173805f04faa8e79aab4de3e929ea4b7c0 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Bart-Van-Assche/scsi-core-Introduce-the-BLIST_BROKEN_FUA-flag/20230202-021019 git checkout 4edde0173805f04faa8e79aab4de3e929ea4b7c0 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=arm SHELL=/bin/bash drivers/ufs/core/ If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): drivers/ufs/core/ufshcd.c: In function 'ufshcd_slave_alloc': >> drivers/ufs/core/ufshcd.c:5060:29: error: 'BLIST_BROKEN_FUA' undeclared (first use in this function) 5060 | sdev->sdev_bflags = BLIST_BROKEN_FUA; | ^~~~~~~~~~~~~~~~ drivers/ufs/core/ufshcd.c:5060:29: note: each undeclared identifier is reported only once for each function it appears in vim +/BLIST_BROKEN_FUA +5060 drivers/ufs/core/ufshcd.c 5031 5032 /** 5033 * ufshcd_slave_alloc - handle initial SCSI device configurations 5034 * @sdev: pointer to SCSI device 5035 * 5036 * Returns success 5037 */ 5038 static int ufshcd_slave_alloc(struct scsi_device *sdev) 5039 { 5040 struct ufs_hba *hba; 5041 5042 hba = shost_priv(sdev->host); 5043 5044 /* Mode sense(6) is not supported by UFS, so use Mode sense(10) */ 5045 sdev->use_10_for_ms = 1; 5046 5047 /* DBD field should be set to 1 in mode sense(10) */ 5048 sdev->set_dbd_for_ms = 1; 5049 5050 /* allow SCSI layer to restart the device in case of errors */ 5051 sdev->allow_restart = 1; 5052 5053 /* REPORT SUPPORTED OPERATION CODES is not supported */ 5054 sdev->no_report_opcodes = 1; 5055 5056 /* WRITE_SAME command is not supported */ 5057 sdev->no_write_same = 1; 5058 5059 /* Use SYNCHRONIZE CACHE instead of FUA to improve performance */ > 5060 sdev->sdev_bflags = BLIST_BROKEN_FUA; 5061 5062 ufshcd_lu_init(hba, sdev); 5063 5064 ufshcd_setup_links(hba, sdev); 5065 5066 return 0; 5067 } 5068
On 1/02/23 20:06, Bart Van Assche wrote: > From: Asutosh Das <asutoshd@codeaurora.org> > > UFS devices perform better when using SYNCHRONIZE CACHE command > instead of the FUA flag. Hence this patch. Hi It would be nice to get some clarification on what is going on for this case. This includes with Data Reliability enabled? In theory, WRITE+FUA should be at least as fast as WRITE+SYNCHRONIZE CACHE, right? Do we have any explanation for why that would not be true? In particular, is SYNCHRONIZE CACHE faster because it is not, in fact, providing Reliable Writes? > > Signed-off-by: Asutosh Das <asutoshd@codeaurora.org> > Signed-off-by: Bart Van Assche <bvanassche@acm.org> > [ bvanassche: modified a source code comment ] > --- > drivers/ufs/core/ufshcd.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c > index bf3cb12ef02f..461aa51cfccc 100644 > --- a/drivers/ufs/core/ufshcd.c > +++ b/drivers/ufs/core/ufshcd.c > @@ -5056,6 +5056,9 @@ static int ufshcd_slave_alloc(struct scsi_device *sdev) > /* WRITE_SAME command is not supported */ > sdev->no_write_same = 1; > > + /* Use SYNCHRONIZE CACHE instead of FUA to improve performance */ > + sdev->sdev_bflags = BLIST_BROKEN_FUA; > + > ufshcd_lu_init(hba, sdev); > > ufshcd_setup_links(hba, sdev);
Hi Bart, I love your patch! Yet something to improve: [auto build test ERROR on mkp-scsi/for-next] [also build test ERROR on jejb-scsi/for-next linus/master v6.2-rc6] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information] url: https://github.com/intel-lab-lkp/linux/commits/Bart-Van-Assche/scsi-core-Introduce-the-BLIST_BROKEN_FUA-flag/20230202-021019 base: https://git.kernel.org/pub/scm/linux/kernel/git/mkp/scsi.git for-next patch link: https://lore.kernel.org/r/20230201180637.2102556-3-bvanassche%40acm.org patch subject: [PATCH 2/2] scsi: ufs: Use SYNCHRONIZE CACHE instead of FUA config: x86_64-randconfig-a012-20230130 (https://download.01.org/0day-ci/archive/20230202/202302021626.GfHCwXIh-lkp@intel.com/config) compiler: clang version 14.0.6 (https://github.com/llvm/llvm-project f28c006a5895fc0e329fe15fead81e37457cb1d1) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/4edde0173805f04faa8e79aab4de3e929ea4b7c0 git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Bart-Van-Assche/scsi-core-Introduce-the-BLIST_BROKEN_FUA-flag/20230202-021019 git checkout 4edde0173805f04faa8e79aab4de3e929ea4b7c0 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 olddefconfig COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash If you fix the issue, kindly add following tag where applicable | Reported-by: kernel test robot <lkp@intel.com> All errors (new ones prefixed by >>): >> drivers/ufs/core/ufshcd.c:5060:22: error: use of undeclared identifier 'BLIST_BROKEN_FUA' sdev->sdev_bflags = BLIST_BROKEN_FUA; ^ drivers/ufs/core/ufshcd.c:9988:44: warning: shift count >= width of type [-Wshift-count-overflow] if (!dma_set_mask_and_coherent(hba->dev, DMA_BIT_MASK(64))) ^~~~~~~~~~~~~~~~ include/linux/dma-mapping.h:76:54: note: expanded from macro 'DMA_BIT_MASK' #define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL<<(n))-1)) ^ ~~~ 1 warning and 1 error generated. vim +/BLIST_BROKEN_FUA +5060 drivers/ufs/core/ufshcd.c 5031 5032 /** 5033 * ufshcd_slave_alloc - handle initial SCSI device configurations 5034 * @sdev: pointer to SCSI device 5035 * 5036 * Returns success 5037 */ 5038 static int ufshcd_slave_alloc(struct scsi_device *sdev) 5039 { 5040 struct ufs_hba *hba; 5041 5042 hba = shost_priv(sdev->host); 5043 5044 /* Mode sense(6) is not supported by UFS, so use Mode sense(10) */ 5045 sdev->use_10_for_ms = 1; 5046 5047 /* DBD field should be set to 1 in mode sense(10) */ 5048 sdev->set_dbd_for_ms = 1; 5049 5050 /* allow SCSI layer to restart the device in case of errors */ 5051 sdev->allow_restart = 1; 5052 5053 /* REPORT SUPPORTED OPERATION CODES is not supported */ 5054 sdev->no_report_opcodes = 1; 5055 5056 /* WRITE_SAME command is not supported */ 5057 sdev->no_write_same = 1; 5058 5059 /* Use SYNCHRONIZE CACHE instead of FUA to improve performance */ > 5060 sdev->sdev_bflags = BLIST_BROKEN_FUA; 5061 5062 ufshcd_lu_init(hba, sdev); 5063 5064 ufshcd_setup_links(hba, sdev); 5065 5066 return 0; 5067 } 5068
On 2/1/23 23:52, Adrian Hunter wrote: > On 1/02/23 20:06, Bart Van Assche wrote: >> UFS devices perform better when using SYNCHRONIZE CACHE command >> instead of the FUA flag. Hence this patch. > > It would be nice to get some clarification on what is > going on for this case. > > This includes with Data Reliability enabled? > > In theory, WRITE+FUA should be at least as fast as > WRITE+SYNCHRONIZE CACHE, right? > > Do we have any explanation for why that would not > be true? > > In particular, is SYNCHRONIZE CACHE faster because > it is not, in fact, providing Reliable Writes? Hi Adrian, Setting the FUA bit in a WRITE command is functionally equivalent to submitting a WRITE command without FUA and submitting a SYNCHRONIZE CACHE command afterwards. For both sequences the storage device has to guarantee that the written data will survive a sudden power loss event. It is not clear to me why WRITE + SYNCHRONIZE CACHE is faster than WRITE + FUA. All I know is that this behavior has been observed for multiple UFS devices from multiple vendors. I hope that one of the UFS vendors can provide more information. Bart.
On Thu, 2023-02-02 at 10:09 -0800, Bart Van Assche wrote: > On 2/1/23 23:52, Adrian Hunter wrote: > > On 1/02/23 20:06, Bart Van Assche wrote: > > > UFS devices perform better when using SYNCHRONIZE CACHE command > > > instead of the FUA flag. Hence this patch. > > > > It would be nice to get some clarification on what is > > going on for this case. > > > > This includes with Data Reliability enabled? > > > > In theory, WRITE+FUA should be at least as fast as > > WRITE+SYNCHRONIZE CACHE, right? > > > > Do we have any explanation for why that would not > > be true? > > > > In particular, is SYNCHRONIZE CACHE faster because > > it is not, in fact, providing Reliable Writes? > Hi Adrian, > > Setting the FUA bit in a WRITE command is functionally equivalent to > submitting a WRITE command without FUA and submitting a SYNCHRONIZE > CACHE command afterwards. For both sequences the storage device has > to guarantee that the written data will survive a sudden power loss > event. Well, that may not be true in all situations. Semantically FUA is a barrier: it can be implemented such that it destages only the current write plus the cache writes that occurred before the write with the FUA. It could also be implemented as you suggest above, which simply destages the entire cache, but it doesn't have to be. One of the reasons for FUA to exist is the potential difference between the two. James
On 2/2/23 10:46, James Bottomley wrote: > Well, that may not be true in all situations. Semantically FUA is a > barrier: it can be implemented such that it destages only the current > write plus the cache writes that occurred before the write with the > FUA. It could also be implemented as you suggest above, which simply > destages the entire cache, but it doesn't have to be. One of the > reasons for FUA to exist is the potential difference between the two. Hi James, Although support for the barrier concept has been removed from the block layer, would it be possible to tell me in which T10 document I can find more information about the barrier semantics? All I found in the latest SBC-5 draft (revision 4; 2023-01-24) about FUA is the following (section 5.40 WRITE (10)): "A force unit access (FUA) bit set to one specifies that the device server shall write the logical blocks to: a) the non-volatile cache, if any; or b) the medium. An FUA bit set to zero specifies that the device server shall write the logical blocks to: a) volatile cache, if any; b) non-volatile cache, if any; or c) the medium." To me the description of FUA in the SBC-3 draft from 11 November 2013 seems identical to the above text. Thanks, Bart.
On Thu, 2023-02-02 at 11:00 -0800, Bart Van Assche wrote: > On 2/2/23 10:46, James Bottomley wrote: > > Well, that may not be true in all situations. Semantically FUA is > > a barrier: it can be implemented such that it destages only the > > current write plus the cache writes that occurred before the write > > with the FUA. It could also be implemented as you suggest above, > > which simply destages the entire cache, but it doesn't have to be. > > One of the reasons for FUA to exist is the potential difference > > between the two. > > Hi James, > > Although support for the barrier concept has been removed from the > block layer, would it be possible to tell me in which T10 document I > can find more information about the barrier semantics? All I found > in the latest SBC-5 draft (revision 4; 2023-01-24) about FUA is the > following (section 5.40 WRITE (10)): I have only a vague recollection of manufacturers implementing this semantic but ... > "A force unit access (FUA) bit set to one specifies that the device > server shall write the logical blocks to: > a) the non-volatile cache, if any; or > b) the medium. > An FUA bit set to zero specifies that the device server shall write > the logical blocks to: > a) volatile cache, if any; > b) non-volatile cache, if any; or > c) the medium." > > To me the description of FUA in the SBC-3 draft from 11 November 2013 > seems identical to the above text. So what that says is the FUA write writes to the medium and *doesn't* flush the volatile cache (so any writeback data stays there). I assume this style is for metadata reservations, so we guarantee fs tree consistency but not necessarily file data consistency. However, that makes flushing everything a way bigger hammer than this behaviour. James
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c index bf3cb12ef02f..461aa51cfccc 100644 --- a/drivers/ufs/core/ufshcd.c +++ b/drivers/ufs/core/ufshcd.c @@ -5056,6 +5056,9 @@ static int ufshcd_slave_alloc(struct scsi_device *sdev) /* WRITE_SAME command is not supported */ sdev->no_write_same = 1; + /* Use SYNCHRONIZE CACHE instead of FUA to improve performance */ + sdev->sdev_bflags = BLIST_BROKEN_FUA; + ufshcd_lu_init(hba, sdev); ufshcd_setup_links(hba, sdev);