SHORT Half Precision MFLOP/s

EVENTSET
FIXC0 INSTR_RETIRED_ANY
FIXC1 CPU_CLK_UNHALTED_CORE
FIXC2 CPU_CLK_UNHALTED_REF
FIXC3 TOPDOWN_SLOTS
PMC0  FP_ARITH_INST_RETIRED2_SCALAR
PMC1  FP_ARITH_INST_RETIRED2_128B_PACKED_HALF
PMC2  FP_ARITH_INST_RETIRED2_256B_PACKED_HALF
PMC3  FP_ARITH_INST_RETIRED2_512B_PACKED_HALF

METRICS
Runtime (RDTSC) [s] time
Runtime unhalted [s] FIXC1*inverseClock
Clock [MHz] 1.0E-06*(FIXC1/FIXC2)/inverseClock
CPI FIXC1/FIXC0
HP [MFLOP/s] 1.0E-06*(PMC0+PMC1*8.0+PMC2*16.0+PMC3*32.0)/time
128B HP [MFLOP/s] 1.0E-06*(PMC1*8.0)/time
256B HP [MFLOP/s] 1.0E-06*(PMC2*16.0)/time
512B HP [MFLOP/s] 1.0E-06*(PMC3*32.0)/time
Packed [MUOPS/s] 1.0E-06*(PMC1+PMC2+PMC3)/time
Scalar [MUOPS/s] 1.0E-06*PMC0/time
Vectorization ratio [%] 100*(PMC1+PMC2+PMC3)/(PMC0+PMC1+PMC2+PMC3)

LONG
Formulas:
HP [MFLOP/s] = 1.0E-06*(FP_ARITH_INST_RETIRED2_SCALAR+FP_ARITH_INST_RETIRED2_128B_PACKED_HALF*8+FP_ARITH_INST_RETIRED2_256B_PACKED_HALF*16+FP_ARITH_INST_RETIRED2_512B_PACKED_HALF*32)/runtime
128B HP [MFLOP/s] = 1.0E-06*(FP_ARITH_INST_RETIRED2_128B_PACKED_HALF*8)/runtime
256B HP [MFLOP/s] = 1.0E-06*(FP_ARITH_INST_RETIRED2_256B_PACKED_HALF*16)/runtime
512B HP [MFLOP/s] = 1.0E-06*(FP_ARITH_INST_RETIRED2_512B_PACKED_HALF*32)/runtime
Packed [MUOPS/s] = 1.0E-06*(FP_ARITH_INST_RETIRED2_128B_PACKED_HALF+FP_ARITH_INST_RETIRED2_256B_PACKED_HALF+FP_ARITH_INST_RETIRED2_512B_PACKED_HALF)/runtime
Scalar [MUOPS/s] = 1.0E-06*FP_ARITH_INST_RETIRED2_SCALAR/runtime
Vectorization ratio [%] = 100*(FP_ARITH_INST_RETIRED2_128B_PACKED_HALF+FP_ARITH_INST_RETIRED2_256B_PACKED_HALF+FP_ARITH_INST_RETIRED2_512B_PACKED_HALF)/(FP_ARITH_INST_RETIRED2_SCALAR+FP_ARITH_INST_RETIRED2_128B_PACKED_HALF+FP_ARITH_INST_RETIRED2_256B_PACKED_HALF+FP_ARITH_INST_RETIRED2_512B_PACKED_HALF)
-
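Scalar and packed half-precision FLOP rates. The FP_ARITH_INST_RETIRED2 events
used here are new in Sapphire Rapids. Each packed instruction is weighted by the
number of 16-bit elements it operates on: 8 for 128-bit, 16 for 256-bit, and 32
for 512-bit vectors.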
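
The formulas above can be checked by hand against raw counter values. The
following is a minimal sketch in Python, not part of LIKWID. The function name
hp_metrics, its parameters, and the sample counts are hypothetical and chosen
only for illustration.

```python
# Hypothetical helper (not part of LIKWID): recompute this group's derived
# metrics from raw event counts and a runtime in seconds.

# Half-precision elements per packed instruction: vector width in bits / 16.
HP_ELEMENTS = {"128B": 8, "256B": 16, "512B": 32}


def hp_metrics(scalar, packed_128, packed_256, packed_512, runtime):
    """Return the derived metrics as a dict keyed by metric name."""
    packed = packed_128 + packed_256 + packed_512
    flops = (scalar
             + packed_128 * HP_ELEMENTS["128B"]
             + packed_256 * HP_ELEMENTS["256B"]
             + packed_512 * HP_ELEMENTS["512B"])
    total = scalar + packed
    return {
        "HP [MFLOP/s]": 1.0e-06 * flops / runtime,
        "Packed [MUOPS/s]": 1.0e-06 * packed / runtime,
        "Scalar [MUOPS/s]": 1.0e-06 * scalar / runtime,
        # Guard against a region that retired no counted instructions at all.
        "Vectorization ratio [%]": 100.0 * packed / total if total else 0.0,
    }


# Made-up counts: 1e6 scalar and 2e6 512-bit packed instructions in 2 seconds.
example = hp_metrics(scalar=1_000_000, packed_128=0, packed_256=0,
                     packed_512=2_000_000, runtime=2.0)
```

Note that the vectorization ratio is computed from instruction counts, not from
FLOPs, so in this made-up example two thirds of the instructions are packed
even though they account for almost all of the floating-point work.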

