Task
Detailed breakdown of individual task performance across different models.
Task Name (18 tasks)             gemini-3-flash
chat_step_slack_integration      139.3s
conditional_channel_routing      108.3s
count_based_digest_workflow      117.0s
custom_step_mock_crm             169.0s
delay_step_implementation        120.7s
dynamic_email_subject            305.9s
email_push_multi_channel         123.0s
express_bridge_setup             128.4s
hmac_security_express            154.2s
multi_tenant_workflow            153.9s
novu_payload_default             39.3s
override_provider_config         41.8s
remix_quickstart_setup           97.7s
sms_step_twilio_integration      334.2s
subscriber_attributes_filtering  123.7s
subscriber_preferences_config    332.4s
workflow_error_handling          186.7s
zod_schema_validation_advanced   63.0s