(Post-WIP) AI Agents for Browser Use
- People
- David
- Joe
- Idea
- Test available AI Browser Use tools - "can an LLM use my browser to order me Taco Bell?"
- Details
- OpenAI Operator
- Open-source: browser-use
People David Idea Part of our work in the Kumubot cluster involves being able to work on both text as well as audio recordings - the idea was the figure out the capabilities of Qwen3-Omni on our Blackwell hardware Details * Blackwell software support (at least for the RTX 6000 Pro)
People David Idea Joe wanted to know what the capabilities of https://github.com/HKUDS/AI-Trader look like in the context for local LLMs Details * Looks like the assumption of the repo is that you can test multiple AI in agentic fashion against historical stock market data over a period
People: David Idea: To setup our Kumubot cluster, I went down a rabbit hole benchmarking different quantization methods for Qwen3 8B and 32B models to see which ones actually deliver the best accuracy-to-speed tradeoff in real-world use Details: • ExLlamaV3-4bpw surprisingly beat even BF16 on LiveBench accuracy (60.0 vs 58.
People: David Idea: Based on conversations with Keao about machine-translation benchmarking, I ran a bunch of LLM models through MMLU-ProX (lite) French benchmark tests (biology section + full 14-topic suite) to see which ones actually deliver the best mix of speed and accuracy in French. This started because we hypothesized that