Bring Your Own Model to AI Runners
How Forkline separates model choice from runner execution so teams can use their own providers while keeping repo work reviewable.
Bring your own model is not only about model choice. For engineering teams, BYOM is a way to keep inference, runner execution, Git artifacts, and human review in separate, understandable layers.
Model choice has become an operational decision, not just a developer preference. Teams need to decide which provider fits a task, how model usage is paid for, and how AI-generated work becomes normal engineering output.
That matters because AI coding products often bundle several things together: the model, the user interface, the automation surface, and the billing model. Bundling can be convenient. It can also make it harder to answer basic operational questions: which model should this task use, who pays for model inference, what happens if a provider is unavailable, and how does the output enter the team’s normal review path?
Forkline takes a different approach. Forkline provides AI runners for engineering work while letting teams bring their own model provider. The model layer stays separate from the runner layer. The work still lands where engineering teams expect it: branches, commits, pull requests, CI checks where applicable, and human review.
This article explains what BYOM means in Forkline, why separating model choice from runner execution matters, and what teams should expect when they use their own models for repo automation.
What you should take away
- BYOM keeps model inference separate from runner execution.
- Forkline bills for runner hours, not model tokens or inference.
- Teams can use providers they already evaluate, approve, and pay for.
- Runner output should still land as branches, commits, pull requests, CI evidence, and human review.
- Provider availability depends on account, API, region, and configuration, so BYOM gives teams control over provider choice; it does not guarantee that every model works in every setup.
What BYOM means in Forkline
Bring Your Own Model (BYOM) in Forkline means the team configures its own model provider for runner work instead of buying a bundled model product from Forkline.
Provider configuration can include options such as GitHub Copilot, OpenAI, Anthropic, Google, OpenRouter, DeepSeek, Alibaba/Qwen, Moonshot AI/Kimi, Z.AI, Ollama or local models, and other OpenAI-compatible or API-supported providers where available. The exact provider choice belongs to the team using the system.
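As a rough illustration of what "provider configuration" can look like, here is a minimal sketch of a provider entry and a basic sanity check. The `ProviderConfig` shape, the field names, the `validate` helper, the gateway URL, and the model ids are all hypothetical and are not Forkline's actual configuration schema. The only real value is the local URL, which is the default address of Ollama's OpenAI-compatible endpoint.

```python
import os
from dataclasses import dataclass
from typing import List, Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class ProviderConfig:
    """Hypothetical shape for one provider entry; not Forkline's real schema."""
    name: str
    base_url: str
    model: str
    api_key_env: Optional[str] = None  # name of the env var, never the secret itself


def validate(config: ProviderConfig) -> List[str]:
    """Return a list of human-readable problems; an empty list means usable."""
    problems = []
    parsed = urlparse(config.base_url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append(f"{config.name}: base_url is not a valid http(s) URL")
    if not config.model:
        problems.append(f"{config.name}: model id is empty")
    if config.api_key_env and not os.environ.get(config.api_key_env):
        problems.append(f"{config.name}: env var {config.api_key_env} is not set")
    return problems


hosted = ProviderConfig(
    name="hosted-gateway",
    base_url="https://gateway.example.com/v1",  # placeholder URL
    model="example-model",                      # placeholder model id
    api_key_env="EXAMPLE_GATEWAY_API_KEY",      # placeholder env var name
)
local = ProviderConfig(
    name="local-ollama",
    base_url="http://localhost:11434/v1",  # Ollama's default local endpoint
    model="example-local-model",           # placeholder model id
)

for cfg in (hosted, local):
    print(cfg.name, validate(cfg) or "ok")
```

Keeping only the name of the environment variable in the entry, rather than the key itself, is one way to keep provider credentials with the team that owns the provider relationship.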
In this architecture, model inference and repo execution are separate concerns:
- Your model provider handles model inference.
- Forkline provides the runner execution layer around repository work.
- Your Git provider holds the engineering artifacts: branches, commits, pull requests, and CI runs.
- Your team keeps the human review decision.
That separation is the point. Forkline is not trying to become another model subscription. It is the runner layer that turns bounded engineering tasks into reviewable work.
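The four layers can be sketched as a small piece of code. This is an illustration of the separation only, not Forkline's implementation: `ModelProvider`, `EchoProvider`, `ProposedChange`, and `run_task` are hypothetical names, and the stand-in provider returns canned text instead of calling a real model.

```python
from dataclasses import dataclass
from typing import Protocol


class ModelProvider(Protocol):
    """Inference layer: anything that can turn a prompt into text."""
    name: str

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoProvider:
    """Stand-in provider so the sketch runs without network access."""
    name: str

    def complete(self, prompt: str) -> str:
        return f"[{self.name}] proposed change for: {prompt}"


@dataclass
class ProposedChange:
    """Artifact layer: what would be published to the Git provider."""
    branch: str
    summary: str
    model: str
    approved: bool = False  # decision layer: only a human reviewer flips this


def run_task(task: str, provider: ModelProvider) -> ProposedChange:
    """Execution layer: the same runner logic regardless of provider."""
    answer = provider.complete(task)
    branch = "runner/" + task.lower().replace(" ", "-")
    return ProposedChange(branch=branch, summary=answer, model=provider.name)


change_a = run_task("Fix CI", EchoProvider("provider-a"))
change_b = run_task("Fix CI", EchoProvider("provider-b"))
print(change_a.branch, change_a.model)  # runner/fix-ci provider-a
print(change_b.branch, change_b.model)  # runner/fix-ci provider-b
```

Swapping the provider changes the attribution and the model output, while the branch naming, the artifact shape, and the unapproved default stay the same.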
Why BYOM matters in practice
BYOM matters because model strategy changes faster than engineering workflow.
A team may already have internal preferences for a specific model provider, gateway, regional provider, or local model setup. Those preferences can come from cost, latency, quality, compliance, procurement, data handling, regional availability, or simple familiarity. They can also change over time.
If the execution layer is tightly bundled to one model vendor, changing model strategy means changing the workflow product too. With BYOM, the runner workflow can stay stable while the model provider can change where configuration supports it.
In day-to-day operation, that separation gives teams four practical advantages.
First, teams can keep using providers they already trust. If a company already has a provider relationship, it does not need to duplicate that relationship just to use AI runners.
Second, teams can separate model costs from runner execution costs. Forkline bills for runner execution hours, not model inference or tokens. Model usage and model costs remain with the team’s chosen provider.
Third, teams can reason more clearly about what they are buying. The model is the reasoning layer. The runner is the execution layer. The Git provider is the artifact layer. The human reviewer is still the decision layer.
Fourth, teams can improve availability without rebuilding the workflow. If one provider is degraded, unavailable in a region, blocked by procurement, or temporarily too expensive for a workload, the team can move to another configured provider while keeping the same repository workflow: the same runner surface, the same pull request review path, and the same validation habits.
That clarity becomes more important as AI billing moves toward usage-based models. For example, GitHub has announced AI Credits for Copilot usage. That is market context, not a criticism of Copilot. The distinction is simple: Forkline sells runner execution capacity, and model usage stays with the provider your team chooses.
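A short worked example makes the two bills concrete. Every number below (runner hours, hourly rate, token counts, per-token prices) is invented for illustration and does not reflect Forkline's pricing or any provider's price list. The helper names are hypothetical as well.

```python
from dataclasses import dataclass


@dataclass
class MonthlyUsage:
    runner_hours: float
    input_tokens: int
    output_tokens: int


def runner_cost(usage: MonthlyUsage, hourly_rate: float) -> float:
    """Runner bill: execution hours only."""
    return usage.runner_hours * hourly_rate


def model_cost(usage: MonthlyUsage,
               input_price_per_million: float,
               output_price_per_million: float) -> float:
    """Model provider bill: tokens only."""
    return (usage.input_tokens / 1_000_000 * input_price_per_million
            + usage.output_tokens / 1_000_000 * output_price_per_million)


# Illustrative month: 40 runner hours, 30M input tokens, 5M output tokens.
usage = MonthlyUsage(runner_hours=40, input_tokens=30_000_000, output_tokens=5_000_000)

runner_bill = runner_cost(usage, hourly_rate=2.00)     # 40 * 2.00 = 80.0
provider_a_bill = model_cost(usage, 3.00, 15.00)       # 90.0 + 75.0 = 165.0
provider_b_bill = model_cost(usage, 0.50, 2.00)        # 15.0 + 10.0 = 25.0

print(runner_bill, provider_a_bill, provider_b_bill)   # 80.0 165.0 25.0
```

Under these assumed numbers, moving the same workload from provider A to provider B changes the model bill from 165.0 to 25.0, while the runner bill stays at 80.0 because it depends only on execution hours.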
How to set up BYOM in Forkline
Setting up BYOM in Forkline follows the normal onboarding path for a runner workflow.
- Sign up for Forkline. Create your account and authenticate with your Git provider. Forkline currently supports GitHub, GitLab, Gitea, and Forgejo.
- Connect your repositories. Select the repositories where you want Forkline runners to work on bounded engineering tasks.
- Configure your model provider. Add the provider configuration for the model source your team wants to use. Direct provider availability depends on the provider, region, account type, API access, and Forkline configuration.
- Choose runner capacity. Pick a monthly subscription or hour package based on the runner capacity you need. Forkline pricing covers runner execution hours. Model inference costs remain with your chosen provider.
- Run a bounded task. Start with work that is easy to review: a CI fix, a small maintenance task, or a narrow repository change.
- Review the output. Read the runner summary and model attribution, then verify the branch, commits, pull request, and CI status where applicable. A human should still approve or reject the work.
What Forkline adds on top of BYOM
BYOM by itself only answers the model question. It does not answer the workflow question.
Forkline adds the runner layer around your chosen model. That means runners can operate against repos, prepare changes, and publish work back into Git workflows instead of leaving the result in a private chat or local session.
Each runner produces a runner summary: the user’s task input plus the agent’s final answer or summary of what it did. Model attribution records which configured model produced the work. Together, those artifacts tell the reviewer what was attempted and which model was used.
The actual engineering changes are tracked as Git artifacts in the user’s Git provider: branches, commits, pull requests, and CI runs. That keeps the review surface familiar. The reviewer does not need to trust the summary by itself; they can inspect the diff and validation result.
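One way to picture the reviewer's view is as a record that pairs the runner summary and model attribution with the Git artifacts, plus a checklist of what is still missing. This is a minimal sketch: `RunnerRecord`, its fields, the `review_gaps` helper, and the example URL are hypothetical and do not describe Forkline's data model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RunnerRecord:
    """Hypothetical bundle of what a reviewer looks at for one run."""
    task_input: str                       # what the user asked for
    final_summary: str                    # what the agent says it did
    model: str                            # model attribution
    branch: str
    commits: List[str] = field(default_factory=list)
    pull_request_url: Optional[str] = None
    ci_status: Optional[str] = None       # "passed", "failed", "pending", or None if no CI applies


def review_gaps(record: RunnerRecord) -> List[str]:
    """List what is missing before a human can meaningfully review the run."""
    gaps = []
    if not record.model:
        gaps.append("no model attribution")
    if not record.commits:
        gaps.append("no commits to inspect")
    if record.pull_request_url is None:
        gaps.append("no pull request opened")
    if record.ci_status is not None and record.ci_status != "passed":
        gaps.append(f"CI is {record.ci_status}")
    return gaps


record = RunnerRecord(
    task_input="Fix failing lint job",
    final_summary="Updated lint config and fixed two violations.",
    model="example-model",
    branch="runner/fix-lint",
    commits=["abc1234"],
    pull_request_url="https://git.example.com/team/repo/pulls/1",  # placeholder URL
    ci_status="passed",
)
print(review_gaps(record))  # []
```

An empty list here only means the evidence is in place. The approve or reject decision still belongs to the human reading the diff.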
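This is the operating model Forkline is built around: the configured model handles inference, the runner executes the task against the repository, the Git provider holds the resulting branches, commits, pull requests, and CI runs, and a human reviews the result.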
That is the difference between BYOM as a model setting and BYOM inside a team workflow.
What BYOM does not mean
BYOM is useful partly because it is specific. It should not be stretched into claims it does not make.
BYOM does not mean Forkline is a model provider. Your relationship with the selected model provider, gateway, or local model stack remains separate.
BYOM does not mean model inference is free. Forkline bills runner execution hours, but model usage is still governed by your selected provider’s pricing, rate limits, and terms.
BYOM also does not mean every provider is directly available in every setup. Account type, API access, regional availability, provider terms, and Forkline configuration all matter. If one direct provider path is not available, a team may still be able to use another configured provider, an OpenAI-compatible provider, or a routing layer such as OpenRouter where supported.
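The idea of falling back to another configured provider can be sketched as a preference-ordered lookup. This illustrates the reasoning only and is not a description of Forkline's behavior: the `Provider` type, the `pick_provider` helper, the provider names, and the availability checks are hypothetical, and a team might equally make this choice by hand.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Provider:
    name: str
    is_available: Callable[[], bool]  # e.g. a health check or account/region check


def pick_provider(providers: List[Provider]) -> Optional[Provider]:
    """Return the first provider, in preference order, whose check passes."""
    for provider in providers:
        try:
            if provider.is_available():
                return provider
        except Exception:
            # Treat a failing check as "unavailable" and try the next one.
            continue
    return None


def timed_out() -> bool:
    raise TimeoutError("health check timed out")


providers = [
    Provider("direct-provider", lambda: False),       # e.g. not offered in this region
    Provider("openai-compatible-gateway", timed_out), # e.g. temporarily unreachable
    Provider("local-model", lambda: True),
]

chosen = pick_provider(providers)
print(chosen.name if chosen else "no provider available")  # local-model
```

If no configured provider passes its check, the function returns `None`. That matches the point above: BYOM offers options, not a guarantee.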
BYOM also does not remove human review. Forkline runners should create reviewable work, not silently merge changes. For critical changes, architectural decisions, and security-sensitive work, the human gate remains part of the workflow.
When BYOM is the right fit
BYOM is most useful when the team already has opinions about models but still needs better execution around repository work.
It is a good fit when:
- the team already pays for one or more model providers
- provider choice may change over time
- model costs need to remain visible outside the workflow tool
- AI work should land as pull requests, not private chat output
- reviewers need runner summaries, model attribution, diffs, and CI evidence where applicable
- the team wants to separate model strategy from engineering automation strategy
It is less important if the team only needs individual coding assistance inside an editor, or if model choice does not matter yet. In those cases, a bundled coding assistant may be enough.
Honest limitations
Forkline is focused on Git-native workflows today: GitHub, GitLab, Gitea, and Forgejo. Jira and Linear are coming soon, but they should not be treated as current production integrations until they are publicly available.
Forkline also does not prove that every model will perform equally well on every task. BYOM gives teams provider control; it does not guarantee output quality. Reviewers still need to inspect the actual work.
The strongest public proof for Forkline today is not a BYOM benchmark. It is runner-style engineering work, such as the promrail CI recovery PR, where Forkline produced reviewable Git changes for a real CI failure. That proof supports the runner workflow model. BYOM is the model-choice layer that sits under that workflow.
Conclusion
Bring your own model is valuable because it separates concerns.
The model provider handles inference. Forkline provides runner execution. Git holds the engineering artifacts. Humans keep the final review gate.
That is the practical value of BYOM in Forkline: keep the models and providers your team already trusts, then add a runner layer built for reviewable repository work.
If you want to try that model, start small: sign in at app.forkline.dev, connect one repository, configure a provider your team already trusts, and give a runner a bounded task that can be reviewed like any other pull request.