External-Memory Sorting in Java: useful to sort very large files using multiple cores and an external-memory algorithm.

The 0.1 versions of the library are compatible with Java 6 and above. Versions 0.2 and above
require at least Java 8.

This code is used in Apache Jackrabbit Oak as well as in Apache Beam and in Spotify scio.
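
For readers unfamiliar with the external-memory approach mentioned above, the general technique can be sketched with the JDK alone: sort the input in batches that fit in memory, write each batch to a temporary file, then merge the sorted files with a priority queue. The class `ExternalSortSketch`, its method names, and the `maxLines` parameter are hypothetical names chosen for illustration. This is a single-threaded sketch of the idea, not the library's implementation.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical illustration of external merge sort; not part of the library. */
class ExternalSortSketch {

    /** Phase 1: read at most maxLines lines at a time, sort each batch in memory, write it to a temp file. */
    static List<Path> sortInBatches(Path input, int maxLines, Comparator<String> cmp) throws IOException {
        List<Path> runs = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(input, StandardCharsets.UTF_8)) {
            List<String> batch = new ArrayList<>();
            String line;
            while ((line = in.readLine()) != null) {
                batch.add(line);
                if (batch.size() == maxLines) {
                    runs.add(writeSortedRun(batch, cmp));
                    batch.clear();
                }
            }
            if (!batch.isEmpty()) {
                runs.add(writeSortedRun(batch, cmp));
            }
        }
        return runs;
    }

    /** Sorts one batch and writes it, one line per record, to a new temporary file. */
    private static Path writeSortedRun(List<String> batch, Comparator<String> cmp) throws IOException {
        batch.sort(cmp);
        Path run = Files.createTempFile("sortrun", ".txt");
        Files.write(run, batch, StandardCharsets.UTF_8);
        return run;
    }

    /** One open sorted run: its reader plus the line currently at its head. */
    private static final class RunCursor {
        final BufferedReader reader;
        String head;

        RunCursor(BufferedReader reader) throws IOException {
            this.reader = reader;
            this.head = reader.readLine();
        }
    }

    /** Phase 2: k-way merge of the sorted runs, always emitting the smallest head line next. */
    static void mergeRuns(List<Path> runs, Path output, Comparator<String> cmp) throws IOException {
        Comparator<RunCursor> byHead = (a, b) -> cmp.compare(a.head, b.head);
        PriorityQueue<RunCursor> queue = new PriorityQueue<>(byHead);
        List<BufferedReader> readers = new ArrayList<>();
        try (BufferedWriter out = Files.newBufferedWriter(output, StandardCharsets.UTF_8)) {
            for (Path run : runs) {
                BufferedReader reader = Files.newBufferedReader(run, StandardCharsets.UTF_8);
                readers.add(reader);
                RunCursor cursor = new RunCursor(reader);
                if (cursor.head != null) {
                    queue.add(cursor);
                }
            }
            while (!queue.isEmpty()) {
                RunCursor smallest = queue.poll();
                out.write(smallest.head);
                out.newLine();
                smallest.head = smallest.reader.readLine();
                if (smallest.head != null) {
                    queue.add(smallest); // re-insert with its new head line
                }
            }
        } finally {
            for (BufferedReader reader : readers) {
                reader.close();
            }
            for (Path run : runs) {
                Files.deleteIfExists(run); // temporary runs are no longer needed
            }
        }
    }
}
```

Only one batch plus one line per temporary file is held in memory at any time, which is why this kind of sort can handle files much larger than the available RAM.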

Code sample


//... inputfile: input file name
//... outputfile: output file name
// next command sorts the lines from inputfile to outputfile
ExternalSort.mergeSortedFiles(ExternalSort.sortInBatch(new File(inputfile)), new File(outputfile));
// you can also provide a custom string comparator, see API
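
As an illustration of the custom string comparator mentioned in the last comment, here is a minimal sketch of two line comparators written with the JDK only. The class name `LineComparators` and the "id, tab, payload" line layout are assumptions made for the example. How a comparator is passed to `ExternalSort` should be checked against the API documentation.

```java
import java.util.Comparator;

/** Hypothetical example comparators for text lines; not part of the library. */
class LineComparators {

    /**
     * Orders lines ignoring case. Ties fall back to exact (case-sensitive) order
     * so that lines differing only in case still sort deterministically.
     */
    static final Comparator<String> CASE_INSENSITIVE =
            String.CASE_INSENSITIVE_ORDER.thenComparing(Comparator.naturalOrder());

    /** Orders lines numerically by the integer before the first tab (assumed "id<TAB>payload" layout). */
    static final Comparator<String> BY_LEADING_ID =
            Comparator.comparingLong(LineComparators::leadingId);

    /** Parses the text before the first tab (or the whole line) as a long. */
    private static long leadingId(String line) {
        int tab = line.indexOf('\t');
        return Long.parseLong(tab < 0 ? line : line.substring(0, tab));
    }
}
```

A numeric comparator such as `BY_LEADING_ID` matters because plain string order would place "100" before "9".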

Code sample (CSV)

For sorting CSV files, it might be more convenient to use `CsvExternalSort`.


// provide a comparator
Comparator<CSVRecord> comparator = (op1, op2) -> op1.get(0).compareTo(op2.get(0));
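//... file: input file (a java.io.File)
//... outputfile: output file (a java.io.File)
// provide sort options
CsvSortOptions sortOptions = new CsvSortOptions
				.Builder(comparator, CsvExternalSort.DEFAULTMAXTEMPFILES, CsvExternalSort.estimateAvailableMemory())
				.numHeader(1)      // the CSV file starts with one header line
				.skipHeader(false) // do not exclude the header line
				.build();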
// container to store the header lines
ArrayList<CSVRecord> header = new ArrayList<CSVRecord>();

// next two lines sort the lines from inputfile to outputfile
List<File> sortInBatch = CsvExternalSort.sortInBatch(file, null, sortOptions, header);
// at this point you can access header if you'd like.
CsvExternalSort.mergeSortedFiles(sortInBatch, outputfile, sortOptions, true, header);


The `numHeader` parameter is the number of lines of headers in the CSV files (typically 1 or 0) and the `skipHeader` parameter indicates whether you would like to exclude these lines from the parsing.
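
To make the roles of these two parameters concrete, here is a small JDK-only sketch of one plausible reading: the first `numHeader` lines are kept out of the sort, and `skipHeader` decides whether they are dropped or written back on top of the output. The class `HeaderAwareSort` is hypothetical, it splits naively on commas without quoting support, and it does not claim to reproduce the library's exact behavior.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical illustration of header handling; not part of the library. */
class HeaderAwareSort {

    /**
     * Sorts the data lines by their first comma-separated field.
     * The first numHeader lines are never sorted; they are written back on top
     * of the result when skipHeader is false and dropped when it is true.
     */
    static List<String> sort(List<String> lines, int numHeader, boolean skipHeader) {
        int split = Math.min(numHeader, lines.size());
        List<String> header = new ArrayList<>(lines.subList(0, split));
        List<String> body = new ArrayList<>(lines.subList(split, lines.size()));
        body.sort(Comparator.comparing(HeaderAwareSort::firstField));

        List<String> result = new ArrayList<>();
        if (!skipHeader) {
            result.addAll(header);
        }
        result.addAll(body);
        return result;
    }

    /** Returns the text before the first comma, or the whole line if there is none. */
    private static String firstField(String line) {
        int comma = line.indexOf(',');
        return comma < 0 ? line : line.substring(0, comma);
    }
}
```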

API Documentation

Maven dependency

You can download the jar files from the Maven central repository:

You can also specify the dependency in the Maven "pom.xml" file:


How to build

- Get the Java JDK
- Install Maven 2
- `mvn install` - builds the jar (requires signing)
- or `mvn package` - builds the jar (does not require signing)
- `mvn test` - runs the tests

