This repository was archived by the owner on Jan 10, 2025. It is now read-only.

Would like support for a superfast model for gpt2 #6

@Peter-Devine

Description

I have installed the app, and it works great on my Nexus 5X, except that it is pretty slow.

Ideally, I would like the model to generate words as fast as, or faster than, I can read them.

Therefore, would you be able to add support for a model based on distilgpt2, but with FP16 quantization and a sequence length of, say, 32?

I realise that you have supplied the code to create one's own models for this app, but try as I might, the models I create using gpt2.py keep failing to work when I add them to the app (roughly along the lines of the sketch below).
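
For reference, this is roughly what I have been attempting. It is just a sketch assuming the standard TensorFlow Lite FP16 post-training quantization flow; the exact steps and names in your gpt2.py may well differ:

```python
# Rough sketch of my conversion attempt (my assumptions, not the repo's gpt2.py).
import tensorflow as tf
from transformers import TFGPT2LMHeadModel

SEQ_LEN = 32  # fixed sequence length for the exported model

# Load distilgpt2 and trace a concrete function with a fixed input shape.
model = TFGPT2LMHeadModel.from_pretrained("distilgpt2")
run = tf.function(lambda input_ids: model(input_ids))
concrete_fn = run.get_concrete_function(
    tf.TensorSpec([1, SEQ_LEN], tf.int32, name="input_ids")
)

# Convert to TFLite with FP16 post-training quantization.
converter = tf.lite.TFLiteConverter.from_concrete_functions([concrete_fn])
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_types = [tf.float16]
tflite_model = converter.convert()

with open("distilgpt2-fp16-seq32.tflite", "wb") as f:
    f.write(tflite_model)
```

The resulting .tflite file loads on my machine, but the app fails once I swap it in, so I may be missing something about the expected input/output signature.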

Is there any chance you could add an extra model to download.gradle, as described above, that is as fast as possible?

Thanks
