Fuel
From charlesreid1
Basics
Fuel is a library for building machine learning data pipelines. It provides a common interface to datasets and lets you iterate over them in different orders (sequentially, shuffled, in batches).
Find Fuel on GitHub here: https://github.com/mila-udem/fuel
Overview of how it works: https://fuel.readthedocs.io/en/latest/overview.html
Prerequisites
Fuel uses HDF5, so you will need the HDF5 library and header files installed locally. Use your package manager, or follow the HDF5 installation instructions. On a Mac:
$ brew install hdf5
Now you can install Fuel.
Install Fuel from Source
$ git clone git@github.com:/mila-udem/fuel.git
$ cd fuel
$ python setup.py build
$ python setup.py install
Basic Usage
Summary:
- Datasets are the principal interface to data; the base Dataset class is abstract, so you work with its subclasses
- IterableDatasets (less powerful) allow sequential access to data, in a fixed order only
- IndexableDatasets (more powerful) allow random access to data
- Schemes allow iterating through IndexableDatasets in various orders (batch, sequential, shuffle, etc.)
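To make the role of a scheme concrete, here is a minimal plain-Python sketch of the idea. It does not use Fuel: `ToyIndexableDataset`, `sequential_batches` and `shuffled_batches` are hypothetical stand-ins written for this page. In Fuel itself the corresponding classes are `IndexableDataset` in `fuel.datasets` and `SequentialScheme` / `ShuffledScheme` in `fuel.schemes`.

```python
import random


class ToyIndexableDataset:
    """Hypothetical stand-in for an indexable dataset: random access by index."""

    def __init__(self, examples):
        self.examples = list(examples)
        self.num_examples = len(self.examples)

    def get_data(self, indices):
        # Fetch the requested examples in the order the indices were given.
        return [self.examples[i] for i in indices]


def sequential_batches(num_examples, batch_size):
    """Yield batches of indices in storage order."""
    for start in range(0, num_examples, batch_size):
        yield list(range(start, min(start + batch_size, num_examples)))


def shuffled_batches(num_examples, batch_size, seed=0):
    """Yield batches of indices after shuffling all indices once."""
    indices = list(range(num_examples))
    random.Random(seed).shuffle(indices)
    for start in range(0, num_examples, batch_size):
        yield indices[start:start + batch_size]


dataset = ToyIndexableDataset("abcdefg")
for batch in sequential_batches(dataset.num_examples, batch_size=3):
    print(batch, dataset.get_data(batch))
```

The dataset only answers "give me examples i, j, k"; the scheme decides which indices to ask for and in what order. That separation is why shuffled or batched iteration needs an indexable dataset, while an iterable dataset can only be read front to back.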
Wrapping Custom Datasets with Fuel
Wrapping a custom data set with Fuel is a two-step process:
- Specify how the original data should be downloaded, processed, and turned into a Fuel data set
- Specify how the Fuel data set should be loaded
The first step, defining how to turn the original data into Fuel data:
- Create a download wrapper, which tells Fuel how to download the original data ("briq" download?)
- Define a way to load a single piece of data (e.g., parameterized by name) and, optionally, paired/related pieces of data (e.g., two related images)
- Write a convert function that extracts all of the data and assembles it into a single HDF5 file (and removes the original data when finished)
The second step, specifying how the Fuel data set should be loaded:
- Create a Fuel dataset class (inheriting from, e.g., H5PYDataset)
- Define a way for that data to be loaded (example: make a universally-available load_data method in a package specific to your data set, as in lfw_fuel)
Flags
Fuel is a package for automatic loading of data for machine learning and neural networks.
- Basic usage and Fuel classes: Fuel/Usage
- Loading custom datasets with Fuel: Fuel/Custom Datasets
Category:Fuel · Category:Data Engineering