To constrain the decoder's output to the reference sentences only, add the following feature to the configuration:
[feature]
....
ConstrainedDecoding path=ref.txt
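
For scripted experiments, the feature line can be appended to an existing configuration automatically. The following is a minimal sketch, not part of the decoder itself: the helper name `add_constrained_decoding` is hypothetical, and it assumes the configuration uses the usual layout in which `[feature]` sits on its own line and the section runs until the next `[...]` header.

```python
def add_constrained_decoding(ini_text: str, ref_path: str) -> str:
    """Return ini_text with a ConstrainedDecoding line added to [feature].

    Hypothetical helper: assumes section headers such as [feature] appear
    on their own line and that a section ends at the next "[...]" header.
    """
    lines = ini_text.splitlines()
    new_line = f"ConstrainedDecoding path={ref_path}"

    # Locate the [feature] section header.
    start = None
    for i, line in enumerate(lines):
        if line.strip() == "[feature]":
            start = i
            break
    if start is None:
        raise ValueError("no [feature] section found")

    # The section ends at the next header, or at the end of the file.
    end = len(lines)
    for i in range(start + 1, len(lines)):
        if lines[i].strip().startswith("["):
            end = i
            break

    # Step back over trailing blank lines so the new entry sits
    # directly after the existing feature lines.
    insert_at = end
    while insert_at > start + 1 and not lines[insert_at - 1].strip():
        insert_at -= 1

    lines.insert(insert_at, new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    example = "[feature]\nWordPenalty\n\n[weight]\nWordPenalty0= -1\n"
    print(add_constrained_decoding(example, "ref.txt"))
```

The helper only edits text; it does not check that ref.txt exists or that its contents match the input being decoded.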