"What widely-used piece(s) of software originally needed this A20 hack?"
As Yuhong Bao briefly mentioned above: MS-DOS, for the CALL 5 entry point. For CP/M-80 compatibility, the 16-bit value at PSP bytes 6 and 7 needs to be as high as possible, because CP/M programs read it as the top of available memory (DOS seems to use 0xFEF0). But the same value is also treated as the offset of an entry point within MS-DOS. MS-DOS, certainly in the early versions, was less than 0xFEF0 bytes long, so the only way for 0xFEF0 to be the offset of an address in MS-DOS was to use the wrap. (In fact the entry point chosen was right at the start of memory, 0000:00C0, so the PSP gets set up with the address F01D:FEF0: 0xF01D0 + 0xFEF0 = 0x1000C0, which wraps to 0x000C0 on a 20-bit address bus.)
Incidentally, CP/M-86 didn't do this, because it didn't provide the CALL 5 entry point.
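To check the wrap arithmetic for F01D:FEF0, here is a minimal Python sketch of real-mode address calculation. The function name `linear_address` and the `a20_enabled` flag are made up for illustration; the only thing modelled is that with A20 masked, bit 20 of the address is dropped.

```python
def linear_address(segment: int, offset: int, a20_enabled: bool) -> int:
    """Real-mode physical address: segment * 16 + offset."""
    address = (segment << 4) + offset
    if not a20_enabled:
        # Only 20 address lines are effective, so bit 20 is dropped
        # and the address wraps around to the bottom of memory.
        address &= 0xFFFFF
    return address


# The far address stored in the PSP, as given above.
wrapped = linear_address(0xF01D, 0xFEF0, a20_enabled=False)
unwrapped = linear_address(0xF01D, 0xFEF0, a20_enabled=True)

print(hex(wrapped))    # 0xc0, i.e. 0000:00C0
print(hex(unwrapped))  # 0x1000c0, just past the 1 MB mark
```

With A20 enabled the same far call would land just above 1 MB instead of at 0000:00C0, which is why the wrap has to be preserved for CALL 5 to work.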