
gittech. site
for different kinds of informations and explorations.
I built an open source computer-use SDK enabling agents to authenticate
Published at
Jan 19, 2025
Main Article
cuse
An open-source framework for building AI agents that can interact with computers
Features
- Computer Control: Display, mouse, and keyboard interaction
- Authentication: Authenticate with credentials
- File Operations: View, create, and edit files
- Shell Access: Execute commands and manage processes
- App Framework: Build custom applications
- Linux Support: Run via Docker containers
Demo
Task: Log in to Gmail, check your inbox, and add new leads to the spreadsheet.
https://github.com/user-attachments/assets/6689d937-6d4c-4331-af8f-008486ad7f00
Quickstart
Install dependencies:
npm install @cusedev/core
Initialize and create a computer:
npx @cusedev/cli init
Create a Computer
instance
import { Computer } from '@cusedev/core';
const computer = new Computer();
Interact with the computer:
// Take a screenshot
const screenshot = await computer.system.display.getScreenshot();
// Type some text
await computer.system.keyboard.type({ text: 'Hello, World!' });
// Execute a command
const output = await computer.system.bash.execute({ command: 'ls -la' });
Documentation
Visit our documentation to learn more about:
- Getting started with the example project
- Adding cuse to your existing project
- Core concepts and API reference
- CLI commands and usage
Roadmap
- Support for other platforms
- Deployment
- Stateful Machines
- Reusable Workflows
Contributing
Contributions are welcome! Please check out our GitHub repository.
License
MIT License β see LICENSE file
Get in Touch
- Visit our website
- Join our Discord community