Elastic-substitution decoding (ESD), first introduced by Chiang (2010), can be important
for obtaining good results when applying labels to enrich hierarchical statistical machine
translation (SMT). However, an efficient implementation is essential for scalable application.
We describe how to achieve this, contributing essential details that were missing in the original
exposition. We compare ESD to strict matching and show its superiority for both reordering
and syntactic labels. To overcome the sub-optimal performance due to the late evaluation
of features marking label substitution types, we increase the diversity of the rules explored
during cube pruning initialization with respect to their labels. This approach gives
significant improvements over basic ESD and performs favorably compared to extending the
search by increasing the cube pruning pop-limit. Finally, we look at combining multiple
labels. The combination of reordering labels and target-side boundary-tags yields a significant
improvement in terms of the word-order-sensitive metrics Kendall reordering score and
METEOR. This confirms our intuition that the combination of reordering labels and syntactic
labels can yield improvements over either label by itself, despite increased sparsity.