appium-llm-plugin

v0.0.5

An Appium 2 plugin that uses LLMs to find elements

An Appium plugin allowing natural language element descriptions as selectors, with any OpenAI-compatible API as the backend.

🚧 CONSTRUCTION WARNING 🚧

This plugin is highly experimental! Expect lots of rough edges and sudden plunges to your virtual doom.

How it works

Install this like any other Appium plugin (see below). When active, you have access to a new element locator strategy: ai. Depending on which AI query mode you choose, the plugin will send basic information about the app state (page source or screenshot) along with a prompt to an AI model, and attempt to find your target element based on the natural language description you have provided of it.

Requirements

Along with a recent version of Appium 2 (the requirement for any plugin), you need either a locally hosted model running via LM Studio or an OpenAI API key. If you use LM Studio, download one or more LLMs to try; they are hosted on Hugging Face. You could start with this one, which seemed to work well for the XML-based query modes.

Note that the screenshot query mode (see below) requires a model with a vision adapter for multimodal queries. I couldn't find any open source models that did well with this mode, but this one seemed to do the least poorly.

You should probably also have an understanding of what LLMs are and how they work at a basic level, because sadly I'm not going to go into that here.

Installation and activation

appium plugin install --source=npm appium-llm-plugin
appium --use-plugins=llm

Capabilities

While this plugin is active, it reads the following required and optional capabilities when you start a session.

|Capability Name|Description|
|--|--|
|appium:llmModel|[Required] The name of the model to use, e.g., gpt-4o|
|appium:llmApiKey|[Required] Your API key for OpenAI, or LM Studio (usually lm-studio for the latter)|
|appium:llmServerBaseUrl|[Optional] If omitted, defaults to OpenAI. If running locally, set it to LM Studio's base url, usually http://localhost:1234/v1|
|appium:llmQueryMode|[Optional] Whether you want to use the screenshot, xml, or xmlpos query mode (see below for descriptions). Defaults to screenshot|
|appium:llmTemperature|[Optional] The temperature to use in the query (defaults to 0.2). Lower values are probably better to avoid the AI getting too "creative"!|
|appium:llmOrganization|[Optional] The organization ID (if applicable and using OpenAI)|
|appium:llmProject|[Optional] The project ID (if applicable and using OpenAI)|
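As a rough illustration of how these capabilities fit together for a locally hosted model, here is a minimal sketch. The capability names and the LM Studio defaults come from the table above; the helper name `buildLocalLlmCaps`, the model name, and the `platformName` / `appium:automationName` values are assumptions for the example, not something this plugin prescribes.

```javascript
// Hypothetical helper: builds session capabilities for a model served by LM Studio.
// The llm* capability names are from the table above; everything else is assumed.
function buildLocalLlmCaps(modelName, overrides = {}) {
  return {
    platformName: 'Android',                     // assumption: an Android session
    'appium:automationName': 'UiAutomator2',     // assumption: the usual Android driver
    'appium:llmModel': modelName,                // required
    'appium:llmApiKey': 'lm-studio',             // required; LM Studio's usual placeholder key
    'appium:llmServerBaseUrl': 'http://localhost:1234/v1',
    'appium:llmQueryMode': 'xml',                // screenshot is the default if omitted
    'appium:llmTemperature': 0.2,
    ...overrides,                                // caller-supplied values win
  };
}

const caps = buildLocalLlmCaps('my-local-model');
console.log(caps);
```

To target OpenAI instead, you would presumably drop `appium:llmServerBaseUrl` and pass a real API key through `overrides`.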

Client usage

This plugin augments a single Appium command: findElement (and only the singular version! No findElements is available). It adds a new locator strategy ai. How to use this locator strategy will differ from client to client. In WebdriverIO, for example, you can do this:

const rawEl = await driver.findElement('ai', "Your element description here")
const el = await driver.$(rawEl) // turn this into a "first class" element, which takes two lines for some reason

When you make the call to find an element using the ai locator strategy, the selector is just a natural language description of the element. It might be the text of the element. It might be you talking about features of the element. I don't care. But the AI model does! So pick a description that gets you the element you want. Depending on which query mode you have selected, the plugin will ask the model different questions, so you should think about which mode works best for your case. Read on for more about them.
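If the two-step lookup gets repetitive, it can be wrapped in a small helper. This is only a sketch: `findByAi` is a hypothetical name, and `driver` is assumed to be a WebdriverIO session started with the plugin active.

```javascript
// Hypothetical convenience wrapper around the two WebdriverIO calls shown earlier.
// `driver` is assumed to be a WebdriverIO remote session with this plugin active.
async function findByAi(driver, description) {
  if (typeof description !== 'string' || description.trim() === '') {
    throw new Error('An element description is required');
  }
  // Ask Appium to locate the element using the "ai" locator strategy.
  const rawEl = await driver.findElement('ai', description);
  // Turn the raw element reference into a "first class" WebdriverIO element.
  return driver.$(rawEl);
}
```

A call would then look like `const el = await findByAi(driver, 'the blue Login button')`, followed by `await el.click()`.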

Query Modes

You can use this plugin in one of three modes:

screenshot mode

In this mode, when you make a find element request, a screenshot is sent to the model along with a prompt asking it to determine the bounding box of your described element in the image. Appium will then return to you an "image element", which is a lightweight element reference on which you can do little more than call element.click(). If the AI did its job right, when you do that, you will have tapped on the element! Big if.
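To make the bounding box idea concrete, here is a small sketch of the arithmetic involved. It assumes the box is expressed as `{ x, y, width, height }` in screen pixels and that a click lands at the center of that box; both are assumptions for illustration, and `rectCenter` and `isWithinScreen` are hypothetical helpers, not part of this plugin.

```javascript
// Returns the center point of a bounding box given as { x, y, width, height }.
function rectCenter(rect) {
  return {
    x: Math.round(rect.x + rect.width / 2),
    y: Math.round(rect.y + rect.height / 2),
  };
}

// Cheap sanity check against hallucinated boxes: the rect must have a positive
// size and lie fully inside the screen dimensions.
function isWithinScreen(rect, screen) {
  return (
    rect.width > 0 &&
    rect.height > 0 &&
    rect.x >= 0 &&
    rect.y >= 0 &&
    rect.x + rect.width <= screen.width &&
    rect.y + rect.height <= screen.height
  );
}

const box = { x: 100, y: 200, width: 50, height: 20 };
console.log(rectCenter(box)); // { x: 125, y: 210 }
console.log(isWithinScreen(box, { width: 1080, height: 1920 })); // true
```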

xml mode

In this mode, when you make a find element request, Appium will collect the page source XML and send it to the model, along with a request for an XPath selector to be derived from the XML based on your description of the element (so in this case, descriptions that include visual information not available in the XML source will be unhelpful). That XPath selector is then used internally to find you an element, using Appium's normal finding strategies. The element reference returned to you is a "normal" element you can do anything with. But is it the right element? Who knows!
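The round trip in this mode can be pictured roughly as follows. This is a sketch against the generic OpenAI-compatible chat completions endpoint, not the plugin's actual code: the prompt wording and the helper names `buildXPathRequest`, `extractXPath`, and `askModel` are all assumptions.

```javascript
// Builds a chat completions request body asking for an XPath selector.
// The prompt text is hypothetical; the plugin's real prompt is not shown in this README.
function buildXPathRequest({ model, temperature = 0.2, pageSource, description }) {
  return {
    model,
    temperature,
    messages: [
      {
        role: 'system',
        content:
          'You are given the XML page source of a mobile app. ' +
          'Reply with a single XPath selector and nothing else.',
      },
      {
        role: 'user',
        content: `Element description: ${description}\n\nPage source:\n${pageSource}`,
      },
    ],
  };
}

// Models often wrap answers in backticks or code fences, so strip those and
// keep the first line that looks like an XPath expression.
function extractXPath(reply) {
  const cleaned = reply.replace(/`{3}[a-z]*\n?/gi, '').replace(/`/g, '').trim();
  const line = cleaned
    .split('\n')
    .map((s) => s.trim())
    .find((s) => s.startsWith('/') || s.startsWith('('));
  if (!line) throw new Error('No XPath found in model reply');
  return line;
}

// Sends the request to an OpenAI-compatible server (uses the global fetch in Node 18+).
async function askModel(baseUrl, apiKey, body) {
  const res = await fetch(`${baseUrl}/chat/completions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json', Authorization: `Bearer ${apiKey}` },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Model request failed with status ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}
```

The selector returned by `extractXPath` would then be handed to a normal XPath lookup, which is why descriptions that rely on purely visual details tend not to help here.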

xmlpos mode

In this mode, when you make a find element request, the XML source is sent to the model, just like in the xml mode. But the model is not asked to find an XPath selector. Instead, the model is asked to look at the XML element that hopefully matches what you want, and to get the bounding box (location) information from the XML attributes. Appium then turns that bounding box rect into an "image element" and sends it back to you. This is a lightweight reference that doesn't actually point to a real UI element, so there isn't much you can do with it. All you can do is click it, hoping that the AI didn't hallucinate those coordinates!

This mode currently only works with Android, since I haven't gotten around to writing a prompt that works with iOS's page source XML structure yet (great first contribution for you!).
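For context on what the model is being asked to read: in Android (UiAutomator2) page source, each node carries a `bounds` attribute of the form `[left,top][right,bottom]`. The sketch below converts such a value into an `{ x, y, width, height }` rect, which can be handy for checking the model's answer by hand. `parseAndroidBounds` is a hypothetical helper for illustration; this README does not say whether the plugin does any parsing of its own.

```javascript
// Converts an Android bounds string such as "[42,210][1038,342]" into a rect.
function parseAndroidBounds(bounds) {
  const match = /^\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]$/.exec(bounds.trim());
  if (!match) {
    throw new Error(`Unrecognized bounds value: ${bounds}`);
  }
  const [x1, y1, x2, y2] = match.slice(1).map(Number);
  // bounds holds two corners; a rect needs the top-left corner plus a size.
  return { x: x1, y: y1, width: x2 - x1, height: y2 - y1 };
}

console.log(parseAndroidBounds('[42,210][1038,342]'));
// { x: 42, y: 210, width: 996, height: 132 }
```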