During development I encountered a caveat: Opus 4.5 can’t view or test terminal output, especially for an app with unusual functional requirements. But despite being blind, it knew enough about the ratatui terminal framework to implement whatever UI changes I asked for. Many UI bugs were likely caused by Opus’s inability to create test cases, most notably failures to account for scroll offsets, which resulted in incorrect click locations. As someone who spent 5 years as a black box Software QA Engineer, unable to review the underlying code, this situation was my specialty. I put my QA skills to work by messing around with miditui and reporting any errors to Opus, occasionally with a screenshot, and it was able to fix them easily. I don’t believe these bugs say anything inherent about LLM agents being better or worse than humans, since humans are most definitely capable of making the same mistakes. Even though I am adept at finding bugs and offering solutions, I don’t believe I would inherently avoid causing similar bugs were I to code such an interactive app without AI assistance: QA brain is different from software engineering brain.
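To make the scroll-offset class of bug concrete, here is a minimal sketch in plain Rust of how a click row might be mapped to a list item. It is not miditui's actual code: the function name `clicked_index` and all of its parameters are hypothetical, and it deliberately avoids ratatui types so it stays self-contained.

```rust
/// Maps a mouse click's terminal row to an index into a scrolled list.
///
/// Hypothetical parameters (not taken from miditui):
/// - `list_top`: first screen row of the list area
/// - `list_height`: number of visible rows in the list area
/// - `scroll_offset`: index of the item drawn on the first visible row
/// - `item_count`: total number of items in the list
fn clicked_index(
    click_row: u16,
    list_top: u16,
    list_height: u16,
    scroll_offset: usize,
    item_count: usize,
) -> Option<usize> {
    // Reject clicks above the list area.
    let visible_row = click_row.checked_sub(list_top)?;
    // Reject clicks below the list area.
    if visible_row >= list_height {
        return None;
    }
    // The bug described above amounts to using `visible_row` directly
    // and forgetting to add the scroll offset.
    let index = scroll_offset + usize::from(visible_row);
    // Reject clicks on empty rows past the last item.
    (index < item_count).then_some(index)
}
```

With these assumptions, a click on screen row 5 of a list that starts at row 3 selects item 2 when unscrolled, but item 22 once the list has been scrolled down by 20 rows. Omitting the offset would select item 2 in both cases, which is the kind of mismatch that only shows up when someone actually clicks around a scrolled view.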
The Recurse Center is a self-directed retreat for programmers where you can spend six or twelve weeks working on whatever interests you most, surrounded by curious and kind people doing the same. I did two back-to-back batches. I went in wanting to rediscover what I loved about programming, and I left having built something that brings joy to a community I care about and hope to be part of for the rest of my life. If that sounds interesting, you should apply.
https://www.jemarch.net/poke-elf.html
"I don't believe Zelensky. Not even that his nose is between his eyes." Europe moves to confront Zelensky over "Druzhba"