KIE.AI

MIDI Generation Callbacks

The system calls this callback when MIDI generation from separated audio is complete.
When you submit a MIDI generation task to the Suno API, you can set a callback URL with the callBackUrl parameter. The system automatically pushes the results to that address when the task finishes.

Callback Mechanism Overview#

The callback mechanism eliminates the need to poll the API for task status. The system will proactively push task completion results to your server.
Webhook Security
To ensure the authenticity and integrity of callback requests, we strongly recommend implementing webhook signature verification. See our Webhook Verification Guide for detailed implementation steps.

Callback Timing#

The system sends a callback notification in the following situations:
  • The MIDI generation task completed successfully
  • The MIDI generation task failed
  • An error occurred during task processing

Callback Method#

HTTP Method: POST
Content Type: application/json
Timeout Setting: 15 seconds

Callback Request Format#

When the task is completed, the system sends a POST request to your callBackUrl. The body of a success callback looks like this:
{
  "code": 200,
  "msg": "success",
  "data": {
    "taskId": "5c79****be8e",
    "state": "complete",
    "instruments": [
      {
        "name": "Drums",
        "notes": [
          {
            "pitch": 73,
            "start": "0.036458333333333336",
            "end": "0.18229166666666666",
            "velocity": 1
          },
          {
            "pitch": 61,
            "start": 0.046875,
            "end": "0.19270833333333334",
            "velocity": 1
          },
          {
            "pitch": 73,
            "start": 0.1875,
            "end": "0.4895833333333333",
            "velocity": 1
          }
        ]
      },
      {
        "name": "Electric Bass (finger)",
        "notes": [
          {
            "pitch": 44,
            "start": 7.6875,
            "end": "7.911458333333333",
            "velocity": 1
          },
          {
            "pitch": 56,
            "start": 7.6875,
            "end": "7.911458333333333",
            "velocity": 1
          },
          {
            "pitch": 51,
            "start": 7.6875,
            "end": "7.911458333333333",
            "velocity": 1
          }
        ]
      }
    ]
  }
}

Status Code Description#

code (integer, required)#

Callback status code indicating task processing result:
Status Code    Description
200            Success - MIDI generation completed successfully
500            Internal Error - Please try again or contact support

msg (string, required)#

Status message providing detailed status description

taskId (string, required)#

Task ID, consistent with the taskId returned when you submitted the task. In the example payload above it appears inside the data object.

data (object)#

MIDI generation result information, returned on success

Success Response Fields#

data.state (string)#

Processing state. The value is "complete" when the task succeeds.

data.instruments (array)#

Array of detected instruments with their MIDI note data.
Instrument Object Properties:
  • name (string): Instrument name (e.g., "Drums", "Electric Bass (finger)", "Acoustic Grand Piano")
  • notes (array): Array of MIDI notes for this instrument
Note Object Properties:
  • pitch (integer): MIDI note number (0-127). Middle C = 60.
  • start (number | string): Note start time, in seconds from the beginning of the audio
  • end (number | string): Note end time, in seconds from the beginning of the audio
  • velocity (number): Note velocity/intensity in the range 0-1, where 1 is the maximum velocity
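
Because start and end may arrive as either strings or numbers, it can help to normalize every note before further processing. The following is a minimal sketch, not an official client: the function names are hypothetical, and the linear scaling of the 0-1 velocity onto the 0-127 MIDI range is an assumption made for illustration.

```python
from typing import Any


def normalize_note(note: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a note with float times and an integer MIDI velocity."""
    start = float(note["start"])  # accepts both "0.0364..." and 0.046875
    end = float(note["end"])
    # Assumption: map the 0-1 velocity linearly onto 0-127 and clamp it.
    midi_velocity = max(0, min(127, round(float(note["velocity"]) * 127)))
    return {
        "pitch": int(note["pitch"]),
        "start": start,
        "end": end,
        "duration": end - start,
        "midi_velocity": midi_velocity,
    }


def normalize_instruments(data: dict[str, Any]) -> dict[str, list[dict[str, Any]]]:
    """Map instrument name to normalized notes, tolerating missing arrays."""
    result: dict[str, list[dict[str, Any]]] = {}
    for instrument in data.get("instruments") or []:
        notes = instrument.get("notes") or []
        result[instrument.get("name", "unknown")] = [normalize_note(n) for n in notes]
    return result
```

Passing the data object of a callback to normalize_instruments yields a dictionary keyed by instrument name. An instrument with no notes maps to an empty list rather than raising an error.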

Callback Reception Examples#

A callback receiver is an HTTP endpoint that accepts the POST request, parses the JSON body, and returns a 200 status code. It can be written in any language, for example Node.js, Python, or PHP.
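
As an illustration, a receiver could be sketched in Python with only the standard library. This is a minimal sketch under stated assumptions: the port, the names process_callback, CallbackHandler, and serve, and the acknowledgement body are hypothetical, and the shape of a failure payload is assumed to carry the same code and msg fields as the success example.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def process_callback(body: bytes) -> tuple[int, str]:
    """Parse a callback body and return (HTTP status to send, log message)."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return 400, "invalid JSON"
    if not isinstance(payload, dict):
        return 400, "unexpected payload shape"

    data = payload.get("data")
    if not isinstance(data, dict):
        data = {}
    task_id = data.get("taskId")
    if payload.get("code") == 200:
        instruments = data.get("instruments") or []
        return 200, f"task {task_id}: {len(instruments)} instrument(s) received"
    # Assumption: any other code signals a failure; acknowledge it with 200 too.
    return 200, f"task {task_id} failed: {payload.get('msg')}"


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", 0))
        status, message = process_callback(self.rfile.read(length))
        print(message)
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(json.dumps({"status": "received"}).encode())


def serve(port: int = 8000) -> None:
    """Start the listener; the port is a placeholder for illustration."""
    HTTPServer(("0.0.0.0", port), CallbackHandler).serve_forever()
```

Calling serve() starts the listener. Keeping the parsing in process_callback, separate from the HTTP handler, makes the logic easy to test without opening a socket.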

Best Practices#

Callback URL Configuration Recommendations
1. Use HTTPS: Ensure the callback URL uses the HTTPS protocol for secure data transmission
2. Verify Origin: Verify the legitimacy of the request source in callback processing
3. Idempotent Processing: The same taskId may receive multiple callbacks, so ensure the processing logic is idempotent
4. Quick Response: Callback processing should return a 200 status code quickly to avoid a timeout
5. Asynchronous Processing: Complex business logic (like MIDI file conversion) should be processed asynchronously
6. Handle Missing Instruments: Not all instruments may be detected, so handle empty or missing instrument arrays gracefully
7. Store Raw Data: Save the complete JSON response for future reference and reprocessing
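
The idempotent and asynchronous processing recommendations above can be combined in one small pattern: record each taskId once, place the payload on a queue, and let a background worker do the slow work. The sketch below is an assumption-laden illustration, not a prescribed design. The names accept_callback, worker, and work_queue are hypothetical, and the in-memory set would be replaced by a database or cache in a multi-process deployment.

```python
import queue
import threading
from typing import Any, Callable

_seen_task_ids: set[str] = set()
_seen_lock = threading.Lock()
work_queue: "queue.Queue[dict[str, Any]]" = queue.Queue()


def accept_callback(payload: dict[str, Any]) -> bool:
    """Enqueue a payload once per taskId; return False for duplicates."""
    data = payload.get("data") or {}
    task_id = data.get("taskId")
    if task_id is None:
        return False
    with _seen_lock:
        if task_id in _seen_task_ids:
            return False  # repeated callback for a task that was already accepted
        _seen_task_ids.add(task_id)
    work_queue.put(payload)
    return True


def worker(handle: Callable[[dict[str, Any]], None]) -> None:
    """Drain the queue forever; `handle` performs the slow processing."""
    while True:
        payload = work_queue.get()
        try:
            handle(payload)
        finally:
            work_queue.task_done()
```

A worker can be started with threading.Thread(target=worker, args=(my_handler,), daemon=True).start(), where my_handler is whatever function performs the conversion. The HTTP handler then only calls accept_callback and returns 200 immediately.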
Important Reminders
  • The callback URL must be publicly accessible
  • The server must respond within 15 seconds; otherwise the request is treated as a timeout
  • If 3 consecutive retry attempts fail, the system stops sending callbacks
  • Keep the callback processing logic stable so that exceptions do not cause callback failures
  • MIDI data is retained for 14 days, so download and save it promptly if you need it long-term
  • The number and types of instruments detected depend on the audio content
  • Note times (start/end) may be strings or numbers, so handle both types

Troubleshooting#

If you are not receiving callback notifications, please check the following:
Network Connection Issues
  • Confirm the callback URL is accessible from the public internet
  • Check firewall settings to ensure inbound requests are not blocked
  • Verify that domain name resolution is correct
Server Response Issues
  • Ensure the server returns an HTTP 200 status code within 15 seconds
  • Check the server logs for error messages
  • Verify that the endpoint path and HTTP method are correct
Content Format Issues
  • Confirm the received POST request body is in JSON format
  • Check that the Content-Type is application/json
  • Verify that JSON parsing is correct
  • Handle both string and number types for timing values
Data Processing Issues
  • Some instruments may have empty note arrays
  • Not every instrument type will be detected in every audio file
  • Verify that the original vocal separation used the split_stem type (not separate_vocal)
  • Check that the source taskId is from a successfully completed separation

Alternative Solutions#

If you cannot use the callback mechanism, you can also use polling:
Poll Query Results
Use the Get MIDI Generation Details endpoint to periodically query task status. We recommend querying every 10-30 seconds.
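
A polling loop along these lines could look like the sketch below. It deliberately leaves out the HTTP request itself: fetch_details and is_done are hypothetical callables supplied by the caller, because the request and response format of the Get MIDI Generation Details endpoint is documented on its own page. The 15-second default is simply a value inside the recommended 10-30 second range.

```python
import time
from typing import Any, Callable, Optional


def poll_until_done(
    fetch_details: Callable[[], dict[str, Any]],
    is_done: Callable[[dict[str, Any]], bool],
    interval_seconds: float = 15.0,
    max_attempts: int = 40,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict[str, Any]]:
    """Call fetch_details repeatedly until is_done accepts the result.

    Returns the final details, or None if max_attempts is exhausted.
    """
    for attempt in range(max_attempts):
        details = fetch_details()
        if is_done(details):
            return details
        if attempt < max_attempts - 1:
            sleep(interval_seconds)  # wait before the next query
    return None
```

The caller decides what counts as finished. For example, is_done might check for the "complete" state shown in the success callback, plus whatever failure states the details endpoint reports.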