The evaluation of the sum can be represented by a full binary tree (each internal node has two children) whose leaves are the numbers of the initial list (X_{i})_{i} and whose internal nodes represent the additions. Let D_{i} be the depth of the leaf X_{i} in the evaluation tree. It is not hard to see that the total cost of the evaluation is Sum_{i} X_{i}D_{i}: each X_{i} contributes to the sum computed at every internal node on its path to the root, i.e. exactly D_{i} times.
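As a small sanity check of this identity, here is a sketch in Python for the (assumed, illustrative) tree ((1+2)+3): the leaves 1 and 2 sit at depth 2, the leaf 3 at depth 1, and the formula matches the sum of the costs of the two additions.

```python
# Evaluation tree ((1+2)+3): leaves 1 and 2 at depth 2, leaf 3 at depth 1.
values = [1, 2, 3]
depths = [2, 2, 1]

# Cost predicted by the formula Sum_i X_i * D_i.
formula = sum(x * d for x, d in zip(values, depths))

# Cost computed directly: the addition 1+2 costs 3, then 3+3 costs 6.
direct = (1 + 2) + ((1 + 2) + 3)

print(formula, direct)  # both equal 9
```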

Let us prove that there exists a minimal-cost tree in which two of the smallest values are leaves that are children of the same node (at maximum depth). Indeed, consider a minimal-cost tree and any deepest internal node; its two children must be leaves, say of values a and b, both at the maximum depth D of the tree. Consider any other leaf c, at depth D_{c} ≤ D. Swapping a (or b, it does not matter) with c changes the cost by a(D_{c}-D) + c(D-D_{c}) = (c-a)(D-D_{c}). If c ≤ a, then (c-a)(D-D_{c}) ≤ 0, and minimality forces (c-a)(D-D_{c}) = 0 (otherwise we would have found a tree of lesser cost). Consequently, if a and b are not yet two of the smallest values, we can swap the smallest values into their places without changing the cost, hence building an optimal tree satisfying the property.

This proves that an optimal strategy can always start by adding the two smallest values, which gives the greedy strategy to follow: pick the two smallest values, replace them by their sum, and repeat on the new set of values.
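This greedy strategy can be sketched with a binary heap, which gives the two smallest values in O(log n) per extraction. A minimal Python sketch (the function name min_merge_cost is just for illustration):

```python
import heapq

def min_merge_cost(values):
    """Total cost of repeatedly replacing the two smallest
    values by their sum, accumulating each sum as a cost."""
    heap = list(values)
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)   # smallest value
        b = heapq.heappop(heap)   # second smallest value
        total += a + b            # cost of this addition
        heapq.heappush(heap, a + b)
    return total

print(min_merge_cost([1, 2, 3]))  # → 9 (cost 3 for 1+2, then 6 for 3+3)
```

With the cost-tree view above, each merge creates an internal node, so the heap loop builds exactly the optimal evaluation tree.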

NB: For a complete formal proof, look up Huffman coding: replacing the frequencies/probabilities of the letters by the input numbers leads to exactly this problem.