Tweak prompt API discussion

domenic · domenic · commit 0094d88e5984 · 2025-05-15T13:40:18.000+09:00
diff --git a/README.md b/README.md
@@ -405,15 +405,15 @@ Although the APIs contain support for streaming output, they don't support strea
 
 However, we believe that streaming input would not be a good fit for these APIs. Attempting to summarize or rewrite input as more input streams in will likely result in multiple wasteful rounds of revision. The underlying language model technology does not support streaming input, so the implementation would be buffering the input stream anyway, then repeatedly feeding new versions of the buffered text to the language model. If a developer wants to achieve such results, they can do so themselves, at the cost of writing code which makes the wastefulness of the operation more obvious. Developers can also customize such code, e.g. by only asking for new summaries every 5 seconds (or whatever interval makes the most sense for their use case).
 
-### Directly exposing a "prompt API"
+### Directly exposing a prompt API
 
 The same team that is working on these APIs is also prototyping an experimental [prompt API](https://github.com/webmachinelearning/prompt-api/). A natural question is how these efforts are related. Couldn't one easily accomplish summarization/writing/rewriting by directly prompting a language model, thus making these higher-level APIs redundant?
 
-We currently believe higher-level APIs have a better chance of producing interoperability, as they make it more difficult to rely on the specifics of a model's capabilities, knowledge, or output formatting. [webmachinelearning/prompt-api#35](https://github.com/webmachinelearning/prompt-api/issues/35) contains specific illustrations of the potential interoperability problems with a raw prompt API. (It also contains a possible solution, which we are exploring!) When only specific use cases are targeted, implementations can more predictably produce similar output, that always works well enough to be usable by web developers regardless of which implementation is in play. This is similar to how other APIs backed by machine learning models work, such as the [shape detection API](https://wicg.github.io/shape-detection-api/) or the proposed [translator and language detector APIs](https://github.com/webmachinelearning/translation-api).
+We currently believe higher-level APIs have a better chance of guiding developers toward interoperability, as they make it more difficult to rely on the specifics of a model's capabilities, knowledge, or output formatting. [webmachinelearning/prompt-api#35](https://github.com/webmachinelearning/prompt-api/issues/35) contains specific illustrations of the potential interoperability problems with a raw prompt API. Although the [structured output](https://github.com/webmachinelearning/prompt-api/blob/main/README.md#structured-output-with-json-schema-or-regexp-constraints) feature can help mitigate these risks, it's not guaranteed that web developers will always use it. By contrast, when only specific use cases are targeted, implementations can more predictably produce similar output that works well enough to be usable by web developers regardless of which implementation is in play. This is similar to how other APIs backed by machine learning models work, such as the [shape detection API](https://wicg.github.io/shape-detection-api/) or the proposed [translator and language detector APIs](https://github.com/webmachinelearning/translation-api).
 
 Another reason to favor higher-level APIs is that it is possible to produce better results with them than with a raw prompt API, by fine-tuning the model on the specific tasks and configurations that are offered. They can also encapsulate the application of more advanced techniques, e.g. hierarchical summarization and prefix caching; see [this comment](https://github.com/WICG/proposals/issues/163#issuecomment-2297913033) from a web developer regarding their experience of the complexity of real-world summarization tasks.
 
-For the time being, the Chrome built-in AI team is moving forward more aggressively with the writing assistance APIs (as well as the translator and language detector APIs), with the next milestone being [origin trials](https://developer.chrome.com/docs/web-platform/origin-trials). The prompt API remains extra-experimental, with its next milestone being [experimentation only within Chrome Extensions](https://developer.chrome.com/blog/august2024-built-in-ai?hl=en#prompt_api_in_chrome_extensions). Nevertheless, we invite discussion of all of these APIs within the Web Machine Learning Community Group.
+For these reasons, the Chrome built-in AI team is moving forward with both approaches in parallel, with task-based APIs like the writing assistance APIs expected to reach stability faster. Nevertheless, we invite discussion of all of these APIs within the Web Machine Learning Community Group.
 
 ## Privacy considerations