Message ID | 20220307184446.3149199-1-alex.bennee@linaro.org |
---|---|
State | New |
Series | [RFC] target/i386: for maximum rounding precision for fildll |
On 3/7/22 08:44, Alex Bennée wrote:
> The instruction description says "It is loaded without rounding
> errors." which implies we should have the widest rounding mode
> possible.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/888
> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> ---
>  target/i386/tcg/fpu_helper.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
> index cdd8e9f947..d986fd5792 100644
> --- a/target/i386/tcg/fpu_helper.c
> +++ b/target/i386/tcg/fpu_helper.c
> @@ -250,11 +250,15 @@ void helper_fildl_ST0(CPUX86State *env, int32_t val)
>  void helper_fildll_ST0(CPUX86State *env, int64_t val)
>  {
>      int new_fpstt;
> +    FloatX80RoundPrec old = get_floatx80_rounding_precision(&env->fp_status);
> +    set_floatx80_rounding_precision(floatx80_precision_x, &env->fp_status);
>
>      new_fpstt = (env->fpstt - 1) & 7;
>      env->fpregs[new_fpstt].d = int64_to_floatx80(val, &env->fp_status);
>      env->fpstt = new_fpstt;
>      env->fptags[new_fpstt] = 0; /* validate stack entry */
> +
> +    set_floatx80_rounding_precision(old, &env->fp_status);
>  }

Yep.

Need a similar fix for fildl_ST0, for the case floatx80_precision_s is
currently set (int32_t has more than the 23 bits of single-precision).

r~
Richard Henderson <richard.henderson@linaro.org> writes:

> On 3/7/22 08:44, Alex Bennée wrote:
>> The instruction description says "It is loaded without rounding
>> errors." which implies we should have the widest rounding mode
>> possible.
>> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/888
>> Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
>> ---
>>  target/i386/tcg/fpu_helper.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>> diff --git a/target/i386/tcg/fpu_helper.c
>> b/target/i386/tcg/fpu_helper.c
>> index cdd8e9f947..d986fd5792 100644
>> --- a/target/i386/tcg/fpu_helper.c
>> +++ b/target/i386/tcg/fpu_helper.c
>> @@ -250,11 +250,15 @@ void helper_fildl_ST0(CPUX86State *env, int32_t val)
>>  void helper_fildll_ST0(CPUX86State *env, int64_t val)
>>  {
>>      int new_fpstt;
>> +    FloatX80RoundPrec old = get_floatx80_rounding_precision(&env->fp_status);
>> +    set_floatx80_rounding_precision(floatx80_precision_x, &env->fp_status);
>>      new_fpstt = (env->fpstt - 1) & 7;
>>      env->fpregs[new_fpstt].d = int64_to_floatx80(val, &env->fp_status);
>>      env->fpstt = new_fpstt;
>>      env->fptags[new_fpstt] = 0; /* validate stack entry */
>> +
>> +    set_floatx80_rounding_precision(old, &env->fp_status);
>>  }
>
> Yep.
>
> Need a similar fix for fildl_ST0, for the case floatx80_precision_s is
> currently set (int32_t has more than the 23 bits of single-precision).

It can't hurt to convert with:

  set_floatx80_rounding_precision(floatx80_precision_x, &env->fp_status);

in that case as well right?

--8<---------------cut here---------------start------------->8---
target/i386: for maximum rounding precision for fildll

The instruction description says "It is loaded without rounding
errors." which implies we should have the widest rounding mode
possible.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/888
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

1 file changed, 13 insertions(+)
target/i386/tcg/fpu_helper.c | 13 +++++++++++++

modified   target/i386/tcg/fpu_helper.c
@@ -237,24 +237,37 @@ void helper_fldl_ST0(CPUX86State *env, uint64_t val)
     merge_exception_flags(env, old_flags);
 }
 
+static FloatX80RoundPrec tmp_maximise_precision(float_status *st)
+{
+    FloatX80RoundPrec old = get_floatx80_rounding_precision(st);
+    set_floatx80_rounding_precision(floatx80_precision_x, st);
+    return old;
+}
+
 void helper_fildl_ST0(CPUX86State *env, int32_t val)
 {
     int new_fpstt;
+    FloatX80RoundPrec old = tmp_maximise_precision(&env->fp_status);
 
     new_fpstt = (env->fpstt - 1) & 7;
     env->fpregs[new_fpstt].d = int32_to_floatx80(val, &env->fp_status);
     env->fpstt = new_fpstt;
     env->fptags[new_fpstt] = 0; /* validate stack entry */
+
+    set_floatx80_rounding_precision(old, &env->fp_status);
 }
 
 void helper_fildll_ST0(CPUX86State *env, int64_t val)
 {
     int new_fpstt;
+    FloatX80RoundPrec old = tmp_maximise_precision(&env->fp_status);
 
     new_fpstt = (env->fpstt - 1) & 7;
     env->fpregs[new_fpstt].d = int64_to_floatx80(val, &env->fp_status);
     env->fpstt = new_fpstt;
     env->fptags[new_fpstt] = 0; /* validate stack entry */
+
+    set_floatx80_rounding_precision(old, &env->fp_status);
 }
--8<---------------cut here---------------end--------------->8---

>
>
> r~
On 3/7/22 10:48, Alex Bennée wrote:
>> Need a similar fix for fildl_ST0, for the case floatx80_precision_s is
>> currently set (int32_t has more than the 23 bits of single-precision).
>
> It can't hurt to convert with:
>
>   set_floatx80_rounding_precision(floatx80_precision_x, &env->fp_status);
>
> in that case as well right?

s/can't hurt/is required/ was my point.

Followup patch looks good:
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

r~
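
The "is required" point for fildl can be checked outside QEMU. The sketch below is not QEMU code: `survives_float_round_trip` is a hypothetical helper, and it uses the host's `float` type as a stand-in for rounding at floatx80_precision_s, assuming an IEEE 754 host where `float` has a 24-bit significand (23 bits stored explicitly).

```c
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only (not part of QEMU).
 * Returns 1 if a 32-bit integer keeps its exact value after being
 * rounded to single precision, 0 if the rounding changed it.
 * Assumes an IEEE 754 host: float has a 24-bit significand.
 */
static int survives_float_round_trip(int32_t v)
{
    /* volatile forces a real store to float, avoiding excess precision */
    volatile float f = (float)v;

    /* compare in 64 bits so that a result of 2^31 is still representable */
    return (int64_t)f == (int64_t)v;
}
```

Under those assumptions, 16777216 (2^24) survives the round trip, but 16777217 (2^24 + 1) and INT32_MAX do not, which is why a 32-bit load cannot be left at single precision.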
diff --git a/target/i386/tcg/fpu_helper.c b/target/i386/tcg/fpu_helper.c
index cdd8e9f947..d986fd5792 100644
--- a/target/i386/tcg/fpu_helper.c
+++ b/target/i386/tcg/fpu_helper.c
@@ -250,11 +250,15 @@ void helper_fildl_ST0(CPUX86State *env, int32_t val)
 void helper_fildll_ST0(CPUX86State *env, int64_t val)
 {
     int new_fpstt;
+    FloatX80RoundPrec old = get_floatx80_rounding_precision(&env->fp_status);
+    set_floatx80_rounding_precision(floatx80_precision_x, &env->fp_status);
 
     new_fpstt = (env->fpstt - 1) & 7;
     env->fpregs[new_fpstt].d = int64_to_floatx80(val, &env->fp_status);
     env->fpstt = new_fpstt;
     env->fptags[new_fpstt] = 0; /* validate stack entry */
+
+    set_floatx80_rounding_precision(old, &env->fp_status);
 }
 
 uint32_t helper_fsts_ST0(CPUX86State *env)
The instruction description says "It is loaded without rounding
errors." which implies we should have the widest rounding mode
possible.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/888
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 target/i386/tcg/fpu_helper.c | 4 ++++
 1 file changed, 4 insertions(+)
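
As a rough illustration of why the conversion has to run at the widest precision, the sketch below models what rounding an integer magnitude to a given number of significand bits does. It is not softfloat code: `round_to_sig_bits` is a hypothetical helper, it assumes round-to-nearest-even, and it only handles magnitudes up to 2^63 (the range of an int64_t). The bit counts 24, 53 and 64 are the significand widths of the x87 single, double and extended precision-control settings.

```c
#include <stdint.h>

/*
 * Hypothetical model, for illustration only (not QEMU/softfloat code).
 * Rounds the magnitude v to `bits` significand bits, round-to-nearest-even.
 * Assumes v <= 2^63 so that rounding up cannot overflow uint64_t.
 */
static uint64_t round_to_sig_bits(uint64_t v, int bits)
{
    int width = 0;

    /* count how many bits the value actually occupies */
    for (uint64_t t = v; t != 0; t >>= 1) {
        width++;
    }
    if (width <= bits) {
        return v;                       /* fits: no rounding needed */
    }

    int shift = width - bits;           /* low bits that must be dropped */
    uint64_t unit = (uint64_t)1 << shift;
    uint64_t half = unit >> 1;
    uint64_t rem = v & (unit - 1);      /* the dropped part */
    uint64_t kept = v - rem;            /* the part that survives */

    /* round up when above half, or exactly half with an odd kept part */
    if (rem > half || (rem == half && (kept & unit) != 0)) {
        kept += unit;
    }
    return kept;
}
```

In this model any int64_t magnitude passes through 64 bits unchanged, while 2^53 + 1 rounded to 53 bits becomes 2^53 and INT64_MAX rounded to 53 bits becomes 2^63. That matches the commit message's reasoning: only the extended setting can load every 64-bit integer without rounding error.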