To get good performance from the compiler-generated code, specify DFT(REORDER) together with either OPTIMIZE(2) or OPTIMIZE(3). Without that combination, the generated code will not perform well.
Also, do not use OPT(2) or OPT(3) without DFT(REORDER): the compiler will generate worse code and take much longer to do so.
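
As an illustration, the recommended combination might be set at the top of a source file with a *PROCESS statement. This is a minimal sketch that assumes the options are given in the source rather than in JCL or an options file. The procedure name `demo` is hypothetical, and the sketch assumes the default source margins, with `*PROCESS` starting in column 1 and the code starting in column 2.

```
*PROCESS DFT(REORDER) OPTIMIZE(2);
 /* DFT(REORDER) is paired with OPTIMIZE(2), as advised above. */
 /* The procedure name "demo" is a placeholder.                */
 demo: procedure options(main);
   put skip list('Compiled with DFT(REORDER) and OPTIMIZE(2)');
 end demo;
```

The same options could be passed wherever your build normally supplies compiler options. The point is only that DFT(REORDER) and the OPTIMIZE level are specified together.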