KIE.AI

Generate MIDI from Audio

POST /api/v1/midi/generate
Convert separated audio tracks into MIDI format with detailed note information for each instrument.

Usage Guide

  • Converts separated audio tracks into structured MIDI data containing pitch, timing, and velocity information.
  • Requires a completed vocal separation task ID (from the Vocal Removal API).
  • Generates MIDI note data for multiple detected instruments, including drums, bass, guitar, keyboards, and more.
  • Ideal for music transcription, notation, remixing, or educational analysis.
  • Works best on clean, well-separated audio tracks with clear instrument parts.

Prerequisites

Required: You must first use the Vocal & Instrument Stem Separation API to separate your audio before generating MIDI.

Parameter Reference

Name | Type | Description
taskId | string | Required. Task ID from a completed vocal separation.
callBackUrl | string | Required. URL to receive MIDI generation completion notifications.
audioId | string | Optional. Specifies which separated audio track to generate MIDI from. Obtain it from the originData array in the Get Vocal Separation Details endpoint response; each item in originData contains an id field that can be used here. If omitted, MIDI is generated from all separated tracks.
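
The parameters above can be assembled into a request as in the following minimal Python sketch. It uses only the standard library; the helper names `build_midi_payload` and `build_midi_request` are hypothetical and not part of the KIE API, and the endpoint URL is taken from the curl example on this page. The sketch prepares the request without sending it.

```python
import json
import urllib.request
from typing import Optional

# Endpoint as shown in the curl example on this page.
API_URL = "https://api.kie.ai/api/v1/midi/generate"


def build_midi_payload(task_id: str, callback_url: str,
                       audio_id: Optional[str] = None) -> dict:
    """Build the JSON body (hypothetical helper).

    taskId and callBackUrl are required; audioId is included only when
    given, so that omitting it requests MIDI for all separated tracks.
    """
    if not task_id:
        raise ValueError("taskId is required")
    if not callback_url:
        raise ValueError("callBackUrl is required")
    payload = {"taskId": task_id, "callBackUrl": callback_url}
    if audio_id is not None:
        payload["audioId"] = audio_id
    return payload


def build_midi_request(token: str, payload: dict) -> urllib.request.Request:
    """Prepare, but do not send, the POST request (hypothetical helper)."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send the request, pass the result to `urllib.request.urlopen`, for example `urllib.request.urlopen(build_midi_request(token, payload))`.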

Developer Notes

  • The callback contains detailed note data for each detected instrument.
  • Each note includes: pitch (MIDI note number), start (seconds), end (seconds), velocity (0-1).
  • Not every instrument is necessarily detected; detection depends on the audio content.
  • Pricing: Check current per-call credit costs at https://kie.ai/pricing.
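
As an illustration of working with the note fields listed above, the following Python sketch converts a single note into a more readable form. It assumes each note arrives as a dictionary keyed by `pitch`, `start`, `end`, and `velocity`; the exact callback payload layout is not shown on this page, so the helper names and that dictionary shape are assumptions. Note naming follows the common convention in which MIDI note 60 is C4.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_to_name(pitch: int) -> str:
    """Map a MIDI note number to a name, using the convention 60 = C4."""
    return f"{NOTE_NAMES[pitch % 12]}{pitch // 12 - 1}"


def to_midi_velocity(velocity: float) -> int:
    """Scale a 0-1 velocity to the 0-127 range used in MIDI files."""
    clamped = min(max(velocity, 0.0), 1.0)
    return round(clamped * 127)


def describe_note(note: dict) -> dict:
    """Summarize one note (assumed keys: pitch, start, end, velocity)."""
    return {
        "name": pitch_to_name(note["pitch"]),
        "duration": note["end"] - note["start"],  # seconds
        "midi_velocity": to_midi_velocity(note["velocity"]),
    }
```

The scaling step matters if the notes are later written to a standard MIDI file, where velocity is an integer from 0 to 127 rather than a fraction.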

Request

Authorization: Bearer Token
Provide your bearer token in the Authorization header when making requests to protected resources.
Example:
Authorization: Bearer ********************

Body Params: application/json (required)

Responses

200 (application/json): MIDI generation task created successfully
500: Error

Request Example (Shell)
curl --location --request POST 'https://api.kie.ai/api/v1/midi/generate' \
--header 'Authorization: Bearer <token>' \
--header 'Content-Type: application/json' \
--data-raw '{
  "taskId": "5c79****be8e",
  "callBackUrl": "https://example.callback",
  "audioId": "8ca376e7-******-08aaf2c6dd27"
}'
Response Example
200 - Success example
{
    "code": 200,
    "msg": "success",
    "data": {
        "taskId": "5c79****be8e"
    }
}
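
A minimal Python sketch for handling a response of the shape shown above. It assumes that error responses also carry `code` and `msg` fields, which this page does not confirm, and `extract_task_id` is a hypothetical helper name.

```python
import json


def extract_task_id(response_text: str) -> str:
    """Return data.taskId from a successful response body.

    Raises RuntimeError when the body's code field is not 200.
    Assumption: error bodies also contain code and msg fields.
    """
    body = json.loads(response_text)
    if body.get("code") != 200:
        raise RuntimeError(
            f"MIDI generation request failed: "
            f"{body.get('code')} {body.get('msg')}"
        )
    return body["data"]["taskId"]
```

The returned task ID can then be used with the Get MIDI Generation Details endpoint, or matched against the notification sent to callBackUrl.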