aiexperiments-ai-duet/server/magenta/scripts/README.md
2016-11-17 07:33:16 +03:00

Building your Dataset

After setting up your Magenta environment, you can build your first MIDI dataset. To do this, create a directory of MIDI files and convert them into NoteSequences. If you don't have any MIDI files handy, you can use the Lakh MIDI Dataset or find some at MidiWorld.

Build and run the script. The MIDI parser may print warnings when it encounters a malformed MIDI file; these can be safely ignored. MIDI files that cannot be parsed are skipped.

MIDI_DIRECTORY=<folder containing MIDI files; may contain subfolders>

# TFRecord file that will contain NoteSequence protocol buffers.
SEQUENCES_TFRECORD=/tmp/notesequences.tfrecord

bazel run //magenta/scripts:convert_midi_dir_to_note_sequences -- \
--midi_dir="$MIDI_DIRECTORY" \
--output_file="$SEQUENCES_TFRECORD" \
--recursive

Note: To build and run in separate commands, run:

bazel build //magenta/scripts:convert_midi_dir_to_note_sequences

./bazel-bin/magenta/scripts/convert_midi_dir_to_note_sequences \
--midi_dir="$MIDI_DIRECTORY" \
--output_file="$SEQUENCES_TFRECORD" \
--recursive
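
Because unparseable MIDI files are skipped rather than treated as errors, it can be useful to sanity-check the input directory and the output file around a conversion run. The following is a minimal sketch, not part of Magenta: the helper names count_midi_files and output_looks_ok are hypothetical, and the assumption that the files of interest end in .mid or .midi is ours, so adjust the patterns to match your data.

```shell
# Hypothetical helpers for checking a conversion run; not part of Magenta.

# Print the number of files under directory $1 (searched recursively) whose
# names end in .mid or .midi, ignoring case.
# Assumption: these extensions cover the MIDI files you want converted.
count_midi_files() {
  find "$1" -type f \( -iname '*.mid' -o -iname '*.midi' \) | wc -l | tr -d ' '
}

# Return success only if file $1 exists and is non-empty.
# This does not validate the TFRecord contents, only that something was written.
output_looks_ok() {
  [ -s "$1" ]
}
```

For example, count_midi_files "$MIDI_DIRECTORY" prints how many candidate files were found before conversion, and output_looks_ok "$SEQUENCES_TFRECORD" returns a non-zero status if the output file is missing or empty afterwards.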

Data processing APIs

If you are interested in adding your own model, please take a look at how we create our datasets under the hood: Data processing in Magenta