PySpark

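PySpark is the Python API for Apache Spark, a distributed computing engine. It is useful, and often necessary, when a dataset is too large to be loaded entirely into the memory of a single machine: Spark splits the data into partitions and processes them in parallel across several machines.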
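
The idea of processing data that never sits in memory all at once can be sketched in plain Python. This is only an illustration of the partition-then-combine pattern, not Spark's API: the helper names (`partitions`, `partial_sum`, `distributed_style_sum`) and the chunk size are assumptions made up for this example, and everything runs in a single process.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def partitions(records: Iterable[int], size: int) -> Iterator[List[int]]:
    """Yield successive chunks of at most `size` records (hypothetical helper)."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def partial_sum(chunk: List[int]) -> int:
    """'Map' step: work that can be done independently on one partition."""
    return sum(chunk)


def distributed_style_sum(records: Iterable[int], size: int) -> int:
    """'Reduce' step: combine the small per-partition results."""
    return sum(partial_sum(chunk) for chunk in partitions(records, size))


# Only one chunk of 10,000 values is held in memory at any time.
total = distributed_style_sum(range(1_000_000), size=10_000)
print(total)
```

In this sketch the partitions are handled one after another on one machine. In a distributed engine such as Spark, the per-partition step would instead run in parallel on different machines, and only the small partial results would need to be brought together.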