Calling merge(Path[] inFiles, Path outFile) on a SequenceFile.Sorter instance with two HDFS-only paths throws a "Disk Error Exception". The solutions suggested on the web are misleading: they recommend checking the available hard disk space on all cluster nodes. On the cluster I use, disk space is not an issue.
The problem is that the method requires writable space on the nodes' local disk under the same path as the HDFS output path. I worked around the issue by writing the merged file to /tmp, which is writable both in HDFS _and_ on the local file system. After the method completes, the file exists only in HDFS. I then move it to the desired location with fs.rename(tmpPath, destinationPath).
This might be a configuration issue, because for some reason the merge() method accepts a _remote_ HDFS path without requiring the same path to exist locally.
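
For reference, here is a minimal sketch of the workaround described above. It is not a definitive implementation: the class name MergeViaTmp, the helper mergeViaTmp, the temporary file name, and the Text/IntWritable key and value types are all assumptions made for illustration, so substitute the types your SequenceFiles actually use. It also assumes the input files are already sorted and that the temporary directory (for example /tmp) is writable both in HDFS and on the nodes' local file system.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class MergeViaTmp {

    /**
     * Merges already-sorted SequenceFiles into a temporary file under tmpDir,
     * then renames the result to its final destination.
     * Key/value classes (Text, IntWritable) are assumptions for this sketch.
     */
    public static void mergeViaTmp(FileSystem fs, Configuration conf,
                                   Path[] inFiles, Path tmpDir, Path destination)
            throws IOException {
        SequenceFile.Sorter sorter =
                new SequenceFile.Sorter(fs, Text.class, IntWritable.class, conf);

        // Hypothetical temporary name; merge() fails if the output already exists.
        Path tmpPath = new Path(tmpDir, "merged-" + System.currentTimeMillis() + ".seq");

        // Write the merged output to the path that is writable in both places.
        sorter.merge(inFiles, tmpPath);

        // Move the merged file to where it is actually wanted.
        if (!fs.rename(tmpPath, destination)) {
            throw new IOException("Could not rename " + tmpPath + " to " + destination);
        }
    }

    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);
        // args: two sorted input files, then the desired destination path.
        Path[] inFiles = { new Path(args[0]), new Path(args[1]) };
        mergeViaTmp(fs, conf, inFiles, new Path("/tmp"), new Path(args[2]));
    }
}
```

Checking the boolean returned by fs.rename() is worthwhile here, since a failed rename would otherwise leave the merged file sitting in the temporary directory without any error.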